Macros¶
NOTE: Because scala macros require a separate stage of compilation, they've been broken out into their own package in GeoTrellis. Otherwise, the functionality to be found there fits most neatly into
geotrellis.raster
.
Why macros?
Macros are complex and harder to read than most code. As such, it is reasonable to demand justification when they are employed and to be suspicious of their necessity. Here are some reasons you'll find macros in GeoTrellis:
Boxing and Unboxing
The main purpose for all of the macros employed throughout GeoTrellis
(though mostly in geotrellis.raster
) is to avoid the JVM's so-called
'boxing' of primitive types. Boxing, in other contexts, is often called
'wrapping' and it involves passing around primitive values (which normally
are lightweight and which require no special setup to work with) inside
objects that are far heavier (a JVM double is 8 bytes while the boxed
variant requires 24 bytes!) and which require processing time to unwrap.
Readability and consistency of performant code
Above, it was pointed out that macros are harder to read. This is true, but there are some special circumstances in which their use can improve readability and help to ensure consistency. When writing performant code, it is often not possible to stay DRY (Don't Repeat Yourself). This adds significant burdens to future modifications of shared behavior (you have to change code all over the library) and it reduces readability by exploding the sheer amount of text which must be read to make sense of a given portion of code.
How Macros Are Used
NoData checks
Throughout geotrellis.raster
, there are lots of checks about whether or
not a given value is data or whether its value represents NoData
.
isData(Int.MinValue) // false
isNoData(Int.MinValue) // true
isData(Double.NaN) // false
isNoData(Double.NaN) // true
This macro provides inlined code which checks to see if a given value is the
GeoTrellis-internal notion of NoData
. Int.MinValue
and Double.NaN
are
the two NoData
values Geotrellis isData
and isNoData
check against.
Type conversion
Similar to the NoData
checks mentioned above, type conversion macros
inline functionality which converts NoData
values for different
CellType
s (see the documentation about celltypes
for more on the different NoData
values). This is a boon to performance
and it reduces the lines of code fairly significantly.
Instead of this:
val someValue: Int = ???
val asFloat =
if (someValue == Int.MinValue) Float.NaN
else someValue.toFloat
We can write:
val someValue: Int = ???
val asFloat = i2f(someValue)
Tile Macros
Unlike the above macros, tile macros don't appreciably improve readability. They've been introduced merely to overcome shortcomings in certain boxing-behaviors in the scala compiler and understanding their behavior isn't necessary to read/understand the GeoTrellis codebase.
Micro-Optimizations¶
Loops
In Scala "for
-loops" are more than just
loops. A commonly-seen
feature throughout the codebase is the use of cfor
- or while
-loops where
it would seem that an ordinary "for
-loop" would be sufficient; we avoid
using them in most cases because the flexibility of Scala's for
construct
can come at a cost.
For example, the following simple for
-loop
for(i <- 0 to 100; j <- 0 to 100) { println(i+j) }
does not just just put the value 0 into a couple of variables, execute the loop body, increment the variables and as appropriate, and either branch or fall-through as appropriate. Instead, the Scala compiler generates objects representing the ranges of the outer- and inner-loops, as well as closures representing the interior of each loop. That results in something like this:
(0 to 100).foreach({ x => (0 to 100).foreach({ y => println(x+y) }) })
which can lead to unnecessary allocation and garbage collection. In the case
of more complicated for
-loops, the translation rules can even result in
boxing of primitive loop variables.
The cfor
construct from the Spire library avoids this problem because it
is translated into the while
construct, which does not incur the same
potential performance penalties as the for
construct.
Specialization
Another strategy that we imply to avoid unnecessary boxing is use of the
@specialized
decorator.
An example is the Histogram[T]
type, which is used to compute either
integer- or double-valued histograms. The declaration of that type looks
something like this:
abstract trait Histogram[@specialized (Int, Double) T <: AnyVal] { ... }
The @specialized
decorator and its two arguments tell the compiler that it
should generate three versions of this trait instead of just one:
Histogram[Int]
, Histogram[Double]
and the customary generic version
Histogram[T]
. Although this multiplies the amount of bytecode associated
with this type by roughly a factor of three, it provides the great advantage
of preventing boxing of (most) arguments and variables of type T
. In
addition, specialization also opens up additional opportunities for
optimization in circumstances where the compiler knows that it is dealing
with a particular primitive type instead of a object.
Mutable Types
Although use of immutable data structures is preferred in Scala, there are
places in the codebase where mutable data structures have been used for
performance reasons. This pattern frequently manifests as use of foreach
on a collection rather than filter
and/or map
. This is helpful because
less allocation of intermediate objects reduces garbage collection pressure.
The Tile Hierarchy¶
One of the most broadly-visible performance-related architectural features
present in GeoTrellis is the tile hierarchy. Prompted by concerns similar to
those which motivated the use of the @specialized
decorator, this
hierarchy is designed to prevent unnecessary boxing. The hierarchy provides
a structure of relationships between tiles of conceptually similar types,
for example IntArrayTile
s and DoubleArrayTile
, but they are connected
via type-neutral traits rather than traits or base classes with a type
parameter.
As brief example of the advantage that is provided, the types IntArrayTile
and DoubleArrayTile
both share a common ancestor, ArrayTile
, which
guarantees that they provide an apply
method. That method is used to index
the underlying array. In the case of IntArrayTile
it directly indexes the
array and in the case of DoubleArrayTile
the array is indexed and then the
retrieved value is converted form a double
to an Int
and returned. A
reliable interface is provided, but without the risk of boxing that use of a
type parameter would have.
Along similar lines, the fact that IntArrayTile
and UByteGeoTiffTile
share a common ancestor Tile
gurantees that they both provide the method
foreach
, which allows a function to be applied to each pixel of a tile.
This is possible even though those two types are backed by very different
data structures: an array for the first one and complex TIFF structure for
the second.
Some of the tile-related code is partially-auto generated using Miles
Sabin's
Boilerplate
mechanism. In particular, this mechanism is used to generate the code
related to TileCombiner
s.
Spark¶
The two principal Spark-related performance optimizations used throughout the GeoTrellis codebase concern improved serialization performance and avoiding shuffles.
In order to improve serialization performance, we do two things: we use Kryo serialization instead of standard Java serialization and we preregister classes with Kryo.
Kryo serialization is faster and more compact than standard Java serialization. Preregistration of classes with Kryo provides a good performance boost because it reduces the amount of network traffic. When a class is not preregistered with Kryo, that class' entire name must be transmitted along with the a serialized representation of that type. However when a class is preregistered, an index into the list of preregistered classes can be sent instead of the full name.
In order to reduces shuffles, we prefer aggregateByKey
or reduceByKey
over groupByKey
as recommended by the
Spark documentations.