Monday, September 15, 2014

OpenJUMP can color polygons with five colors

Larry Reeder made a plugin for coloring polygons according to the 5 color theorem  so that adjacent polygons do not have the same color. The Map Coloring plugin is pretty easy to use.

1) Open a polygon dataset


2) Find the tool from the menu and run.

3) A new layer which is colored with five colors will be created. Initial, half-arbitrary coloring is applied automatically but the color index is also saved as an "autoclr" attribute which makes it possible to adjust colors later with the standard classification tools.

How to install

Map coloring plugin can be downloaded from Installation is done by unzipping the plugin into /lib/ext folder of OpenJUMP. Restart is needed if OpenJUMP was running.

Known issues

Map coloring plugin is currently at version 0.40 and it may have issues if the polygon layer contains thousands of polygons. Using 64-bit jre helps a  lot with memory issues.

Posted on 3:03 PM | Categories:

Wednesday, March 19, 2014

OpenJUMP and GeoPackage Part 3: Make it fast

GeoPackage file is a full relational SQLite database which is stored into one physical data file. Knowing this reveals the two most effective ways for making GeoPackage faster:

  1. Create appropriate indexes. A database query without an index must check every row of the table. If the aim is to select only features which  have certain attribute values a search from index can cut down the the number of rows which need to be scanned dramatically.
  2. SQLite data file can get fragmented. Reading data from the data file is most efficient when the data are stored into contiguous chunks. VACUUM is a tool for optimizing the data file.

When to index?

There is no need for indexing if GeoPackage is used only as a substitute for shapefiles and layers are read as they are without any filtering. Indexes are good to have if:

  • SQL query contains a selective WHERE clause which leaves lots of features out of the result set.
  • The query is executed frequently. Ten seconds once in a lifetime is not much but a ten second lag in refreshing a map  after each panning or zooming may be intolerable.
  • Table is relatively big. SQLite is fast and if a table has less than 10000 rows index does not make much difference.
  • Size limit is lower if there are joins in the query. One thousand rows in one table tested against thousand rows in another table makes a million comparisons.

How to index with OpenJUMP

Let's check first the initial speed of an unindexed GeoPackage by executing some reasonable SQL SELECT statement. It is good to repeat the query couple of times and ensure that operating system is utilizing caching effectively or otherwise the result can be unrealistically slow. Following query selects a few hundred features from a table that contains 1.2 million rows and it seems to take about 3 seconds. That is not especially slow but because the table is biggish and most rows are discarded it should be possible to make it faster with indexes.

Query is using column "knro" and "pinta_ala" for filtering so we will index them both. Indexes can be created directly from OpenJUMP with the DB Query plugin.


 Now we can repeat the test. This time we do not want to get unrealistically fast result that may come directly from cache so we will use new unused values in the WHERE clause.

 DB Query plugin shows execution times in full seconds but reference measurement with another tool revealed that the initial query took 2.78 seconds. Indexes made the query to run about 30 times faster in only 0.08 seconds.

When and how to VACUUM

SQLite data file may get fragmented after subsequent deletes and inserts. Deletes leave empty spaces into the data file and new inserted records are first written into these holes. However, it is also possible to create a GeoPackage database which is suboptimal from the very beginning. This happens if data are appended into same database tables from, for example, very many shapefiles. New data are written into the end of the GeoPackage file and extented indexes must also be written into end of the file. Finally both the data and indexes of the table are stored into the GeoPackage data file into at least as many chunks as was the number of original shapefiles.

Vacuum is a tool that re-organizes SQLite data file and corrects fragmentation. Fragmented GeoPackage tables are a real concern only if the final table has been collected from thousands or tens of thousands appends and that alone does not usually make vacuuming necessary. However, a GeoPackage data file can be unnecessary big and vacuuming corrects also that issue. The reason for a possibly unnecessary big file size is that deleting data from SQLite does not shrink the file but unused space is only marked as free. On the other hand, vacuuming is forcing a rebuild of the database and that reduces the size of the data file.

So how to know if there is a need to vacuum? When is comes to unused space, it is possible to evaluate how much the file size could be reduced by comparing the total page count of the database and the count of free pages. Commands can be sent with DB Query and result can be read from attributes of the new layer that will be created.

pragma page_count;
pragma freelist_count;

However, the fragmentation rate of tables cannot be checked with any SQL or pragma query. As a rule of a thumb, consider vacuuming: 
  • When you receive a new GeoPackage which is at least a few gigabytes in size and you want to feel comfortable.
  • When you are going to send a GeoPackage to other users.
  • If you have done a massive number of inserts, updates and deletes and speed matters.
  • After large deletes if file size matters.

 Vacuuming itself is simple and it can be done with DB Query plugin. Notice that vacuuming needs  at least two times more free space in the data directory than the size of the database file. Vacuuming a multi-gigabyte data file will also take quite a long time.


Friday, February 21, 2014

OpenJUMP and GeoPackage Part 2: Spatial index with libgpkg

OpenJUMP DB Query Plugin adds native parsing of geometry BLOBs into standard SQLite database engine. Therefore no additional libraries are needed for reading geodata from OGC GeoPackages or Spatialite databases. Normal attribute queries with SQL are also supported out-of-the-box as long as they use the SQL language as understood by the standard SQLite engine. However, by installing libpgkg as an extension OpenJUMP users can create spatial indexes on GeoPackage tables and make BBOX queries much faster. A Drawback is that libgpkg is a platform dependent component which makes installation a little bit harder.

It is also possible to use GDAL for creating and dropping spatial indexes of the GeoPackage vector layers and recent GDAL version creates spatial indexes by default. However, this is supported only in GDAL versions beginning from 1.11 and 2.0. The following method works for existing GeoPackages which are created without spatial indexes and it can be used directly from the OpenJUMP DB Query plugin.

How to install, test and use libgpkg with OpenJUMP

Libgpkg is used as a SQLite Run-Time Loadable Extension. Windows users can get ready made binaries directly from libgpkg download page. Linux and Mac users must read the project home page and follow the installation instructions. OpenJUMP users must notice that we are leaving the platform independent zone now and also Windows users must select if they are going to run OpenJUMP on 32-bit or 64-bit jre and download the right binaries.

Place the libgpkg extention (on Windows "gpkg.dll") into directory [home-of-OpenJUMP-PLUS]\lib\ext\. If you plan to run both 32-bit and 64-bit jre sometimes you can rename the files into "gpkg_32.dll" and "gpkg_64.dll" and keep them both available in the directory. Now you are ready to start OpenJUMP but read first the following note:

Inportant note:
By now the DB Query Plugin comes with SQLite jdbc driver file "sqlite-jdbc-3.7.2.jar". The libgpkg extension will not work with it. Download a new driver "sqlite-jdbc4-3.8.2-SNAPSHOT.jar" from the Xerial downloads and place the new file into directory [home-of-OpenJUMP-PLUS]\lib\ext\jumpdbquery-1.0.0\.  Close OpenJUMP, delete the old jdbc driver and start again.

Now we are ready for a test.

The DB Query Plugin tries to load a SQLite extension if it finds an option "?spatialite=file_name" after the name of the SQLite database file. Extension loader was made for loading Spatialite extension but it works similarly for all valid extensions. If there is any error with loading the extension, all SQL queries will fail too.

This test proves that a special function "ST_MinX" from the libgpkg extension is available and obviously all the other libgpkg functions as well. See the list of functions at

Next something useful that we can't do with the standard SQLite functions. Let's build a spatial query for selecting features which are totally or partly inside a 10 km by 10 km selection box.  Or to be exact, whose bounding boxes are totally or partly inside the selection box.

Nine seconds to select 1567 polygons out of 1,2 million.  It is not hopeless, but thanks to libgpkg we can be make it a lot faster.

One of the libgpkg functions is "CreateSpatialIndex" which, not surprisingly, creates a spatial index.  Let's run it and modify our spatial query so that it utilizes this brand new  index.

Much better now. Accurate timing is obviously slightly more than zero seconds but for sure less than one second. It may be surprising, but BBOX queries from SQLite are often considerably faster than from the heavy weight spatial databases. What is interesting is that the query that hits the spatial index is using only standard SQLite functions. If GeoPackage database is delivered so that it contains spatial indexes OpenJUMP can do flaming fast spatial BBOX queries without platform dependent extensions. However, libgpkg is needed for creating the spatial indexes because it is the only open source library that can do it at the moment.

Template for making a spatial query from index

Spatial index in GeoPackage, as well as in Spatialite, does not work transparently as it does in PostGIS. For utilizing the spatial index a spatial query must always contain a subquery which selects the IDs of the candidate features from the index which is actually a virtual table that holds the minx, miny, maxx and maxy values for each feature.  The logic of the subquery may feel odd at the beginning but the image below hopefully clarifies why it makes sense to compare minimum bounds of features to maximum bounds of the BBOX and vice versa.

Here is a copy-paste template for selecting all features whose bounding boxes are partly or totally inside the BBOX:

select * from tablename
where rowid in
(select id from rtree_tablename_geom
minx<BBOX_maxx and
maxx>BBOX_minx and
miny<BBOX_maxy and


Probable reason:
Sqlite-jdbc driver must be updated.

Probable reasons:
  • Typo in the dll name
  • Dll file not on the OpenJUMP search path
  • Wrong version of dll-file (32-bit/64-bit) 

Obvious reason:
Missing parameter "?spatialite=" for loading the extension

Wednesday, February 19, 2014

OpenJUMP PLUS reads OGC Geopackages

OGC GeoPackage is a new international standard which defines an open, non-proprietary, platform-independent SQLite container for distribution and direct use of geospatial data, including vector features and tile matrix sets. It was adopted by Open Geospatial Consortium (OGC®) as an international standard on 13 February 2014.

Thanks to Larry Reeder, OpenJUMP PLUS has already a fast, native, platform-independent support for reading vector data from OGC GeoPackage data files.

Create GeoPackage with GDAL from a shapefile or OpenStreetMap data

First we need a GeoPackage with some vector data. An excellent tool for creating such is GDAL, which has had an GPKG driver since end of January, 2014. The GPKG driver will be included in GDAL version 1.11 but by now the GeoPackage support is available only in GDAL development versions.  The following instruction shows hot to create GeoPackeges on Windows but GDAL works in a similar way also on Linux and Mac. Just remember that before GDAL 1.11 is released a development version from February 2014 or later is needed.

GDAL-dev versions can be downloaded from A direct link to Windows 64-bit zipfile package is No installation is needed, just unzip the file to some directory on your disk. After unzipping find the batch file "SDKShell.bat" from the new GDAL main directory.  Double click it and GDAL command window opens.

A shapefile "test.shp" is converted into GeoPackage "test.gpkg" with the following command:

ogr2ogr -f GPKG test.gpkg test.shp

This command converts an OpenStreetMap data extract downloaded from Geofabrik into GeoPackage:

ogr2ogr -f GPKG --config OSM_COMPRESS_NODES YES germany.gpkg germany.osm.pbf -progress

Note!  Beginning from GDAL v. 1.11 and 2.0 the GeoPackage driver creates also spatial indexes for all created vector layers by default.

Read data from GeoPackage with OpenJUMP PLUS

The recent OpenJUMP PLUS snaphots (download from come with DB Query plugin which adds a new menu item into File menu.

 DB Query Plugin is extremely simple and robust. On the other hand, it lacks some fancy features. It does not have a file browser to search where your SQLite/Spatialite/GeoPackage file is. It does not read any metadata tables for building drop down menus. This is the feature that makes DB Query so robust. Missing or faulty metadata don't make it fail - the database query that is sent to the database is exactly what you see in the SQL window.

Without bells and whistles, the DB Query plugin does the job. It sends a query to the database and tries to find geometry BLOBs from the response. The main jar file which takes care of this is merely 47 kB in size and it recognizes geometry blobs in PostGIS, Oracle, MySQL, FDO, Spatialite and GeoPackage formats.

At the moment the GDAL GPKG driver does not support creating spatial or even attribute indexes. However, SQLite database, which is what GeoPackage essentially is, can filter quite fast even without indexes. In this example, walking through 1.2 million polygons and selecting a subset of 3121 features without index took 4 seconds.

DB Query supports almost all that can be done with SQL As Understood By SQLite.  Main exception is that query must not return more than one geometry.

If a record has no geometry at all the plugin creates a null geometry as
"GEOMETRYCOLLECTION EMPTY" and new layer with all the records is added to the OpenJUMP map anyhow.

 User with some knowledge of SQL can utilise free queries as workarounds. For example, run a query and check the table list of the GeoPackage DB from the attributes of a new map layer.


The DB Query Plugin is shipped with basic SQLite3 binaries and, as mentioned, that is enough for using SQL As Understood By SQLite. However, the OGC GeoPackage standard introduces some additional SQL functions and those cannot be used with standard DB Query installation. Using the additional functions with OpenJUMP is possible by using the libpgkg SQLite extension by Luciad but that makes installation and running a bit more difficult and platform dependent. Obviously that is worth another OpenJUMP blog post.

Remember also that OpenJUMP is a totally memory bound Java program. For reading big datasets a 64-bit jre and lots of RAM is needed. Even then, reading the 6.1 million OpenStreetMap highways of Germany takes about 5.1 GB of committed memory and makes OpenJUMP to react somewhat slow even on a powerful computer.

More reading:

OpenJUMP DB Query Plugin:
The page is not up to date
Source code:
OGC GeoPackage standard:
GDAL GeoPackage driver:
GDAL OpenStreetMap driver:
GDAL ogr2ogr utility:
Luciad libgpkg site: