Settings and tuning
Adopting the TopoJSON format makes it possible to store geographical data as a topology, which yields smaller files than the GeoJSON counterpart. The process can be tuned with several options, which are described below:
- topology
- prequantize
- topoquantize
- presimplify
- toposimplify
- shared_coords
- prevent_oversimplify
- simplify_with
- simplify_algorithm
- winding_order
topology
boolean
Specifies whether the topology should be computed when deriving the TopoJSON. Default is True.
Example 🔧
Given the following two linestrings that share one side:
import topojson as tp
from shapely import geometry
data = geometry.MultiLineString([
[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]],
[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]
])
data
Setting topology=False creates a TopoJSON-structured object without considering shared segments (the setting prequantize=False avoids computing the delta-encoding):
tp.Topology(data, topology=False, prequantize=False)
Topology( {'arcs': [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]], [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0], [1.0, 0.0]]], 'bbox': (0.0, 0.0, 2.0, 1.0), 'coordinates': [], 'objects': {'data': {'geometries': [{'arcs': [[0], [1]], 'type': 'MultiLineString'}], 'type': 'GeometryCollection'}}, 'type': 'Topology'} )
As can be seen, the geometries are referenced by two segments ('arcs': [[0], [1]]), where each segment is a single linestring (see: arcs).
When doing the same with topology=True, there are three arcs, one of which is referenced twice: arc 2 (arc -3 is arc 2 reversed).
tp.Topology(data, topology=True, prequantize=False)
Topology( {'arcs': [[[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 0.0]]], 'bbox': (0.0, 0.0, 2.0, 1.0), 'coordinates': [], 'objects': {'data': {'geometries': [{'arcs': [[-3, 0], [1, 2]], 'type': 'MultiLineString'}], 'type': 'GeometryCollection'}}, 'type': 'Topology'} )
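To make the arc references concrete, here is a minimal sketch of how a geometry's arc indices can be resolved back into coordinates. It assumes the TopoJSON convention that a negative index i refers to arc ~i traversed in reverse (so -3 is arc 2 reversed) and reuses the arcs printed above. The helper stitch_arcs is a hypothetical name for illustration and is not part of the topojson package.

```python
# Arcs and arc references copied from the topology=True output above.
arcs = [
    [[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]],
    [[1.0, 1.0], [1.0, 0.0]],
]
geometry_arcs = [[-3, 0], [1, 2]]


def stitch_arcs(arc_indices, arcs):
    """Hypothetical helper: rebuild one linestring from its arc references."""
    coords = []
    for index in arc_indices:
        # A negative index i means arc ~i (one's complement), reversed.
        arc = arcs[index] if index >= 0 else arcs[~index][::-1]
        # Skip the first point of every following arc: it repeats the junction.
        coords.extend(arc if not coords else arc[1:])
    return coords


lines = [stitch_arcs(refs, arcs) for refs in geometry_arcs]
for line in lines:
    print(line)
```

Both results trace the same squares as the input. The first one merely starts at a different vertex, because the shared side is stored only once, as arc 2.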
prequantize
boolean, int
If the prequantization parameter is specified, the input geometry is quantized prior to computing the topology. The returned topology is quantized, and its arcs are delta-encoded.
Quantization is recommended to improve the quality of the topology if the input geometry is messy (i.e., small floating point errors mean that adjacent boundaries do not have identical values); typical values are powers of ten, such as 1e4, 1e5 or 1e6. Default is True (which corresponds to a quantize factor of 1e5).
Example 🔧
Quantization is a two-step process: normalization and delta-encoding. Consider the following two polygons, which share no sides because the left polygon has its x-max coordinate at 0.97 and the right polygon has its x-min coordinate at 1.03:
import numpy as np
import topojson as tp
from shapely import geometry
data = geometry.MultiLineString([
[[0, 0], [0.97, 0], [0.97, 1], [0, 1], [0, 0]],
[[1.03, 0], [2, 0], [2, 1], [1.03, 1], [1.03, 0]]
])
data
The prequantize option is defined as an integer. It is best understood as a value that defines the size of a rectangular grid, with the bottom-left coordinate at (0,0). The x-numbers and y-numbers of all coordinates are then independently scaled and shifted onto this rectangular grid (normalization on range). Here it is shown for the x-numbers only:
# get the x-numbers of all coordinates
# iterate over .geoms, since multi-part geometries are not iterable in Shapely 2.x
x = np.array([ls.xy[0] for ls in data.geoms])
print(f'x:\n{x}')
x:
[[0. 0.97 0.97 0. 0. ]
 [1.03 2. 2. 1.03 1.03]]
# compute the scaling factor (kx) given the quantize factor (qf)
qf = 33
kx = (x.max() - x.min()) / (qf - 1)
print(f'kx: {kx}')
kx: 0.0625
# shift and apply the scaling factor to map the x-numbers on the integer range
xnorm = np.round((x - x.min()) / kx).astype(int)
print(f'x-normalized:\n{xnorm}')
x-normalized:
[[ 0 16 16 0 0]
 [16 32 32 16 16]]
# denormalization happens as follows
print(f'x-denormalized:\n{xnorm * kx + x.min()}')
x-denormalized:
[[0. 1. 1. 0. 0.]
 [1. 2. 2. 1. 1.]]
Delta-encoding is applied to the normalized coordinates: starting from the first coordinate, only the delta towards the following coordinate is stored. This reduces the number of characters, since the delta between two points is normally shorter to write than both coordinates. Here an example is shown for the x-numbers only (1D); in practice it is a 2D process:
# delta encoding of normalized x-numbers
x_quant = np.insert(np.diff(xnorm), 0, xnorm[:,0], axis=1)
print(f'x-quantized (normalized-delta-encoded):\n{x_quant}')
x-quantized (normalized-delta-encoded):
[[ 0 16 0 -16 0]
 [ 16 16 0 -16 0]]
# dequantization of quantized x-numbers
x_dequant = x_quant.cumsum(axis=1) * kx + x.min()
print(f'x-dequantized:\n{x_dequant}')
x-dequantized:
[[0. 1. 1. 0. 0.]
 [1. 2. 2. 1. 1.]]
Applying this prequantize value to the two non-touching polygons therefore makes them touch:
topo = tp.Topology(data, prequantize=33)
topo.to_svg()
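The 1D walkthrough above extends to both axes in the same way. The following is a minimal pure-Python sketch, not the library's implementation: it normalizes x and y independently onto a grid of size qf, delta-encodes each linestring, and then reverses both steps. The function names quantize and dequantize are hypothetical, and the sketch assumes the data has a non-zero extent along both axes.

```python
def quantize(lines, qf):
    """Normalize all coordinates onto a qf x qf integer grid, then delta-encode."""
    xs = [x for line in lines for x, _ in line]
    ys = [y for line in lines for _, y in line]
    x0, y0 = min(xs), min(ys)
    # Scaling factor per axis, as in the 1D walkthrough (assumes non-zero extent).
    kx = (max(xs) - x0) / (qf - 1)
    ky = (max(ys) - y0) / (qf - 1)
    encoded = []
    for line in lines:
        grid = [(round((x - x0) / kx), round((y - y0) / ky)) for x, y in line]
        # Keep the first grid point, then store only the step to each next point.
        deltas = [grid[0]] + [
            (bx - ax, by - ay) for (ax, ay), (bx, by) in zip(grid, grid[1:])
        ]
        encoded.append(deltas)
    return encoded, (kx, ky), (x0, y0)


def dequantize(encoded, scale, translate):
    """Undo the delta-encoding with a running sum, then rescale and shift back."""
    (kx, ky), (x0, y0) = scale, translate
    lines = []
    for deltas in encoded:
        gx = gy = 0
        line = []
        for dx, dy in deltas:
            gx, gy = gx + dx, gy + dy
            line.append((gx * kx + x0, gy * ky + y0))
        lines.append(line)
    return lines


data = [
    [(0, 0), (0.97, 0), (0.97, 1), (0, 1), (0, 0)],
    [(1.03, 0), (2, 0), (2, 1), (1.03, 1), (1.03, 0)],
]
encoded, scale, translate = quantize(data, qf=33)
restored = dequantize(encoded, scale, translate)
print(encoded[0])
print(restored[1])
```

With qf=33, the right edge of the left ring and the left edge of the right ring both snap to x = 1.0, which is why the two shapes touch after quantization.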
topoquantize
boolean or int
If the topoquantization parameter is specified, the input geometry is quantized after the topology is constructed. If the topology is already quantized, this is resolved first before the topoquantization is applied. See the prequantize parameter for more details. Default is False.
See prequantize for an explained example.
Note: This is also supported by chaining, meaning you can first compute the Topology (which can be cost-intensive) and afterwards apply topoquantize to the computed Topology.
Example 🔧
import topojson as tp
data = tp.utils.example_data_africa()
topo = tp.Topology(data)
topo_tq = topo.topoquantize(75)
print(f'length with topoquantization: {len(topo_tq.to_json())}')
print(f'length without topoquantization: {len(topo.to_json())}')
length with topoquantization: 20391
length without topoquantization: 32549
presimplify
boolean, float
Apply presimplify to remove unnecessary points from linestrings before the topology is constructed. This will simplify the input geometries. Use with care. Default is False.
Example 🔧
Here we load continental Africa as the data file and apply presimplify in the input space. The plot shows the borders after linestring simplification, performed before the Topology is computed.
import topojson as tp
data = tp.utils.example_data_africa()
topo = tp.Topology(data, presimplify=4)
# since SVG rendering is too small for this example,
# we use the (optional!) package Altair for visualization.
topo.to_alt().properties(title='presimplify')
toposimplify
boolean, float
Apply toposimplify to remove unnecessary points from arcs after the topology is constructed. This will simplify the constructed arcs without altering the topological relations. Sensible values for coordinates stored in degrees are in the range of 0.0001 to 10. Defaults to False.
Note 1: The units of toposimplify correspond to the input space. The sensible values given above are for degrees (e.g. epsg:4326). When your data is projected in meters, you might need to test which value should be adopted.
Note 2: This is also supported by chaining, meaning you can first compute the Topology (which can be cost-intensive) and afterwards apply toposimplify to the computed Topology.
Example 🔧
Here we load continental Africa as the data file and apply toposimplify to the arcs. The plot shows the borders after linestring simplification, performed after the Topology is computed.
import topojson as tp
data = tp.utils.example_data_africa()
topo = tp.Topology(data, toposimplify=4)
# since SVG rendering is too small for this example,
# we use the (optional!) package Altair for visualization.
topo.to_alt().properties(title='toposimplify')
shared_coords
boolean
Sets the strategy to detect junctions. When set to False, a path is considered shared when its coordinates lie on the same path (path-connected). The path-connected strategy is more ‘correct’, but slightly slower. When set to True, a path is considered shared when all coordinates appear in both paths (coords-connected). Default is False.
Example 🔧
The example shows two LineStrings that partially overlap; only one of the overlapping stretches has its coordinates present in both lines.
import topojson as tp
from shapely import geometry
data = geometry.MultiLineString([
[(0, 0), (10, 0), (10, 5), (20, 5)],
[(5, 0), (20, 0), (20, 5), (10, 5), (0, 5)]
])
data
The setting shared_coords=True adopts the coords-connected strategy, meaning it will split only where coordinates are detected in both linestrings. Here we show it as SVG, with each derived segment/arc plotted separately.
tp.Topology(
data,
shared_coords=True,
prequantize=False
).to_svg(separate=True)
0 LINESTRING (0 0, 10 0, 10 5)
1 LINESTRING (5 0, 20 0, 20 5)
2 LINESTRING (20 5, 10 5)
3 LINESTRING (10 5, 0 5)
LineString 2 (20 5, 10 5) is the shared segment.
With the setting shared_coords=False, a path-connected strategy is adopted, meaning it will also split where paths overlap without having common coordinates.
tp.Topology(
data,
shared_coords=False,
prequantize=False
).to_svg(separate=True)
0 LINESTRING (0 0, 5 0)
1 LINESTRING (10 0, 10 5)
2 LINESTRING (5 0, 10 0)
3 LINESTRING (10 0, 20 0, 20 5)
4 LINESTRING (20 5, 10 5)
5 LINESTRING (10 5, 0 5)
LineString 2 (5 0, 10 0) and LineString 4 (20 5, 10 5) are shared segments.
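The difference between the two strategies can be illustrated without the library. The sketch below only inspects the vertices of the two example linestrings; it is an illustration of the idea under that simplification, not how topojson detects junctions internally.

```python
line_a = [(0, 0), (10, 0), (10, 5), (20, 5)]
line_b = [(5, 0), (20, 0), (20, 5), (10, 5), (0, 5)]

# Coordinates that occur in both linestrings (the coords-connected view).
shared = set(line_a) & set(line_b)
print(sorted(shared))

# Vertices on the x-axis, where both lines also overlap between x=5 and x=10.
on_axis_a = [p for p in line_a if p[1] == 0]
on_axis_b = [p for p in line_b if p[1] == 0]
print(on_axis_a, on_axis_b)
```

The only common coordinates are (10, 5) and (20, 5), which bound the segment found with shared_coords=True. The overlap along the x-axis has no vertex in common, so it is only found by the path-connected strategy.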
prevent_oversimplify
boolean
If this setting is set to True, the simplification is slower, but the likelihood of producing valid geometries is higher, as it prevents oversimplification. Simplification happens on paths separately, so this setting is especially relevant for rings with no partially shared paths. This is also known as a topology-preserving variant of simplification. Default is True.
Example 🔧
The example shows a circle that is simplified twice: first with prevent_oversimplify=False and then with prevent_oversimplify=True.
import topojson as tp
from shapely import geometry
circle = geometry.Point(0, 0).buffer(1)
circle
# force simplification
tp.Topology(
circle,
toposimplify=2,
prevent_oversimplify=False
).to_svg()
# avoid oversimplification
tp.Topology(
circle,
toposimplify=2,
prevent_oversimplify=True
).to_svg()
simplify_with
str
Sets the package to use for simplifying (both pre- and toposimplify). Choose between shapely or simplification. Shapely supports only Douglas-Peucker, whereas simplification supports both Douglas-Peucker and Visvalingam-Whyatt. The package simplification is known to be quicker than shapely. Default is shapely.
Example 🔧
import topojson as tp
data = tp.utils.example_data_africa()
topo = tp.Topology(data, prequantize=200)
# we visualize with the (optional!) package Altair,
# since SVG rendering is too small for this example
topo.toposimplify(
epsilon=1,
simplify_algorithm='dp',
simplify_with='simplification',
prevent_oversimplify=False
).to_alt().properties(title='Douglas-Peucker (package simplification)')
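For readers unfamiliar with Douglas-Peucker, the following is a minimal recursive sketch of the textbook algorithm in pure Python. It is not the code used by shapely or simplification, and the function names are illustrative only. The epsilon argument plays the same role as above: a distance tolerance in the units of the coordinates.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Keep the end points; recurse on the farthest point if it exceeds epsilon."""
    if len(points) < 3:
        return list(points)
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax > epsilon:
        left = douglas_peucker(points[: index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [points[0], points[-1]]


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
print(douglas_peucker(line, epsilon=0.5))
print(douglas_peucker(line, epsilon=3))
```

With a small tolerance only the near-collinear point (1, 0.1) is dropped, while a tolerance larger than every deviation collapses the line to its two end points.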
simplify_algorithm
str
Choose between dp and vw, for Douglas-Peucker or Visvalingam-Whyatt respectively. vw will only be selected if simplify_with is set to simplification. Default is dp, since it still “produces the most accurate generalization” (Shi & Cheung, 2006).
Example 🔧
Here we use the vw (Visvalingam-Whyatt) algorithm included in the package simplification.
import topojson as tp
data = tp.utils.example_data_africa()
topo = tp.Topology(data, prequantize=200)
# we visualize with the (optional!) package Altair,
# since SVG rendering is too small for this example
topo.toposimplify(
epsilon=1,
simplify_algorithm='vw',
simplify_with='simplification',
prevent_oversimplify=True
).to_alt().properties(title='Visvalingam-Whyatt (package simplification)')
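As a contrast to Douglas-Peucker, here is a minimal sketch of the textbook Visvalingam-Whyatt idea in pure Python: repeatedly remove the interior point whose triangle with its two neighbours has the smallest area, until every remaining triangle reaches a threshold. This naive version is for illustration only and is not the implementation in the simplification package; the function names and the min_area parameter are assumptions. Note that the threshold here is an area, not a distance.

```python
def triangle_area(a, b, c):
    """Area of the triangle spanned by three points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam_whyatt(points, min_area):
    """Drop the least significant interior point until all areas >= min_area."""
    pts = list(points)
    while len(pts) > 2:
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        smallest = min(areas)
        if smallest >= min_area:
            break
        # areas[k] belongs to pts[k + 1], because the end points are never removed.
        del pts[areas.index(smallest) + 1]
    return pts


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
print(visvalingam_whyatt(line, min_area=0.5))
print(visvalingam_whyatt(line, min_area=10))
```

The first call removes only (1, 0.1), whose triangle has an area of 0.1. The second call uses a threshold above every triangle area and reduces the line to its end points.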
winding_order
str
Determines the winding order of the features in the output geometry. Choose between CW_CCW, for clockwise outer rings and counter-clockwise interior rings, or CCW_CW, for counter-clockwise outer rings and clockwise interior rings. Default is CW_CCW.
Example 🔧
Start by creating a Polygon geometry:
import topojson as tp
from shapely import geometry
data = geometry.shape({
"type": "Polygon",
"coordinates": [[[0,0],[10,0],[10,10],[0,10],[0,0]]]
})
data
Continue by defining two Topology objects, one with the outer rings clockwise and the inner rings counter-clockwise (CW_CCW) and one with the outer rings counter-clockwise and the inner rings clockwise (CCW_CW).
CW_CCW = tp.Topology(data, winding_order='CW_CCW', prequantize=False)
CCW_CW = tp.Topology(data, winding_order='CCW_CW', prequantize=False)
print(CW_CCW)
print(CCW_CW)
Topology( {'arcs': [[[0.0, 0.0], [0.0, 10.0], [10.0, 10.0], [10.0, 0.0], [0.0, 0.0]]], 'bbox': (0.0, 0.0, 10.0, 10.0), 'coordinates': [], 'objects': {'data': {'geometries': [{'arcs': [[0]], 'type': 'Polygon'}], 'type': 'GeometryCollection'}}, 'type': 'Topology'} )
Topology( {'arcs': [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], 'bbox': (0.0, 0.0, 10.0, 10.0), 'coordinates': [], 'objects': {'data': {'geometries': [{'arcs': [[0]], 'type': 'Polygon'}], 'type': 'GeometryCollection'}}, 'type': 'Topology'} )
As you can see, the arcs for type Polygon are reversed. The effect may seem negligible, but it should be taken into account when using geographic projections, as it defines which part is ‘inside’ and which is ‘outside’ the Polygon:
alt_top = CW_CCW.to_alt(projection='equalEarth', color='type:N').properties(title='CW_CCW')
alt_bottom = CCW_CW.to_alt(projection='equalEarth', color='type:N').properties(title='CCW_CW')
alt_top & alt_bottom
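The orientation of the two printed rings can be checked independently with the shoelace formula. The sketch below is a minimal pure-Python check, assuming a Cartesian coordinate system with the y-axis pointing up, in which a negative signed area means a clockwise ring. The helper names signed_area and orientation are illustrative and not part of topojson.

```python
def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise, negative for clockwise."""
    return sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    ) / 2.0


def orientation(ring):
    return "clockwise" if signed_area(ring) < 0 else "counter-clockwise"


# Exterior rings copied from the two printed topologies.
ring_cw_ccw = [[0.0, 0.0], [0.0, 10.0], [10.0, 10.0], [10.0, 0.0], [0.0, 0.0]]
ring_ccw_cw = [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]
print(orientation(ring_cw_ccw))
print(orientation(ring_ccw_cw))
```

The outer ring produced with CW_CCW comes out as clockwise and the one produced with CCW_CW as counter-clockwise, matching the option names.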