Friday, February 21, 2014

OpenJUMP and GeoPackage Part 2: Spatial index with libgpkg

OpenJUMP DB Query Plugin adds native parsing of geometry BLOBs into standard SQLite database engine. Therefore no additional libraries are needed for reading geodata from OGC GeoPackages or Spatialite databases. Normal attribute queries with SQL are also supported out-of-the-box as long as they use the SQL language as understood by the standard SQLite engine. However, by installing libpgkg as an extension OpenJUMP users can create spatial indexes on GeoPackage tables and make BBOX queries much faster. A Drawback is that libgpkg is a platform dependent component which makes installation a little bit harder.

It is also possible to use GDAL for creating and dropping spatial indexes of the GeoPackage vector layers and recent GDAL version creates spatial indexes by default. However, this is supported only in GDAL versions beginning from 1.11 and 2.0. The following method works for existing GeoPackages which are created without spatial indexes and it can be used directly from the OpenJUMP DB Query plugin.

How to install, test and use libgpkg with OpenJUMP


Libgpkg is used as a SQLite Run-Time Loadable Extension. Windows users can get ready made binaries directly from libgpkg download page. Linux and Mac users must read the project home page and follow the installation instructions. OpenJUMP users must notice that we are leaving the platform independent zone now and also Windows users must select if they are going to run OpenJUMP on 32-bit or 64-bit jre and download the right binaries.

Place the libgpkg extention (on Windows "gpkg.dll") into directory [home-of-OpenJUMP-PLUS]\lib\ext\. If you plan to run both 32-bit and 64-bit jre sometimes you can rename the files into "gpkg_32.dll" and "gpkg_64.dll" and keep them both available in the directory. Now you are ready to start OpenJUMP but read first the following note:


Inportant note:
By now the DB Query Plugin comes with SQLite jdbc driver file "sqlite-jdbc-3.7.2.jar". The libgpkg extension will not work with it. Download a new driver "sqlite-jdbc4-3.8.2-SNAPSHOT.jar" from the Xerial downloads and place the new file into directory [home-of-OpenJUMP-PLUS]\lib\ext\jumpdbquery-1.0.0\.  Close OpenJUMP, delete the old jdbc driver and start again.

Now we are ready for a test.

The DB Query Plugin tries to load a SQLite extension if it finds an option "?spatialite=file_name" after the name of the SQLite database file. Extension loader was made for loading Spatialite extension but it works similarly for all valid extensions. If there is any error with loading the extension, all SQL queries will fail too.


This test proves that a special function "ST_MinX" from the libgpkg extension is available and obviously all the other libgpkg functions as well. See the list of functions at https://bitbucket.org/luciad/libgpkg/overview







  
Next something useful that we can't do with the standard SQLite functions. Let's build a spatial query for selecting features which are totally or partly inside a 10 km by 10 km selection box.  Or to be exact, whose bounding boxes are totally or partly inside the selection box.



Nine seconds to select 1567 polygons out of 1,2 million.  It is not hopeless, but thanks to libgpkg we can be make it a lot faster.


One of the libgpkg functions is "CreateSpatialIndex" which, not surprisingly, creates a spatial index.  Let's run it and modify our spatial query so that it utilizes this brand new  index.








Much better now. Accurate timing is obviously slightly more than zero seconds but for sure less than one second. It may be surprising, but BBOX queries from SQLite are often considerably faster than from the heavy weight spatial databases. What is interesting is that the query that hits the spatial index is using only standard SQLite functions. If GeoPackage database is delivered so that it contains spatial indexes OpenJUMP can do flaming fast spatial BBOX queries without platform dependent extensions. However, libgpkg is needed for creating the spatial indexes because it is the only open source library that can do it at the moment.

Template for making a spatial query from index


Spatial index in GeoPackage, as well as in Spatialite, does not work transparently as it does in PostGIS. For utilizing the spatial index a spatial query must always contain a subquery which selects the IDs of the candidate features from the index which is actually a virtual table that holds the minx, miny, maxx and maxy values for each feature.  The logic of the subquery may feel odd at the beginning but the image below hopefully clarifies why it makes sense to compare minimum bounds of features to maximum bounds of the BBOX and vice versa.


Here is a copy-paste template for selecting all features whose bounding boxes are partly or totally inside the BBOX:

select * from tablename
where rowid in
(select id from rtree_tablename_geom
where
minx<BBOX_maxx and
maxx>BBOX_minx and
miny<BBOX_maxy and
maxy>BBOX_miny);



Troubleshooting



Probable reason:
Sqlite-jdbc driver must be updated.





Probable reasons:
  • Typo in the dll name
  • Dll file not on the OpenJUMP search path
  • Wrong version of dll-file (32-bit/64-bit) 


Obvious reason:
Missing parameter "?spatialite=" for loading the extension

Wednesday, February 19, 2014

OpenJUMP PLUS reads OGC Geopackages

OGC GeoPackage is a new international standard which defines an open, non-proprietary, platform-independent SQLite container for distribution and direct use of geospatial data, including vector features and tile matrix sets. It was adopted by Open Geospatial Consortium (OGC®) as an international standard on 13 February 2014.

Thanks to Larry Reeder, OpenJUMP PLUS has already a fast, native, platform-independent support for reading vector data from OGC GeoPackage data files.


Create GeoPackage with GDAL from a shapefile or OpenStreetMap data


First we need a GeoPackage with some vector data. An excellent tool for creating such is GDAL, which has had an GPKG driver since end of January, 2014. The GPKG driver will be included in GDAL version 1.11 but by now the GeoPackage support is available only in GDAL development versions.  The following instruction shows hot to create GeoPackeges on Windows but GDAL works in a similar way also on Linux and Mac. Just remember that before GDAL 1.11 is released a development version from February 2014 or later is needed.

GDAL-dev versions can be downloaded from http://gisinternals.com/sdk/. A direct link to Windows 64-bit zipfile package is http://gisinternals.com/sdk/Download.aspx?file=release-1600-x64-gdal-mapserver.zip. No installation is needed, just unzip the file to some directory on your disk. After unzipping find the batch file "SDKShell.bat" from the new GDAL main directory.  Double click it and GDAL command window opens.

A shapefile "test.shp" is converted into GeoPackage "test.gpkg" with the following command:

ogr2ogr -f GPKG test.gpkg test.shp

This command converts an OpenStreetMap data extract downloaded from Geofabrik into GeoPackage:

ogr2ogr -f GPKG --config OSM_COMPRESS_NODES YES germany.gpkg germany.osm.pbf -progress

Note!  Beginning from GDAL v. 1.11 and 2.0 the GeoPackage driver creates also spatial indexes for all created vector layers by default.

Read data from GeoPackage with OpenJUMP PLUS



The recent OpenJUMP PLUS snaphots (download from http://sourceforge.net/projects/jump-pilot/files/OpenJUMP_snapshots/) come with DB Query plugin which adds a new menu item into File menu.


 DB Query Plugin is extremely simple and robust. On the other hand, it lacks some fancy features. It does not have a file browser to search where your SQLite/Spatialite/GeoPackage file is. It does not read any metadata tables for building drop down menus. This is the feature that makes DB Query so robust. Missing or faulty metadata don't make it fail - the database query that is sent to the database is exactly what you see in the SQL window.

Without bells and whistles, the DB Query plugin does the job. It sends a query to the database and tries to find geometry BLOBs from the response. The main jar file which takes care of this is merely 47 kB in size and it recognizes geometry blobs in PostGIS, Oracle, MySQL, FDO, Spatialite and GeoPackage formats.


At the moment the GDAL GPKG driver does not support creating spatial or even attribute indexes. However, SQLite database, which is what GeoPackage essentially is, can filter quite fast even without indexes. In this example, walking through 1.2 million polygons and selecting a subset of 3121 features without index took 4 seconds.



DB Query supports almost all that can be done with SQL As Understood By SQLite.  Main exception is that query must not return more than one geometry.

If a record has no geometry at all the plugin creates a null geometry as
"GEOMETRYCOLLECTION EMPTY" and new layer with all the records is added to the OpenJUMP map anyhow.

 User with some knowledge of SQL can utilise free queries as workarounds. For example, run a query and check the table list of the GeoPackage DB from the attributes of a new map layer.


Limitations


The DB Query Plugin is shipped with basic SQLite3 binaries and, as mentioned, that is enough for using SQL As Understood By SQLite. However, the OGC GeoPackage standard introduces some additional SQL functions and those cannot be used with standard DB Query installation. Using the additional functions with OpenJUMP is possible by using the libpgkg SQLite extension by Luciad but that makes installation and running a bit more difficult and platform dependent. Obviously that is worth another OpenJUMP blog post.

Remember also that OpenJUMP is a totally memory bound Java program. For reading big datasets a 64-bit jre and lots of RAM is needed. Even then, reading the 6.1 million OpenStreetMap highways of Germany takes about 5.1 GB of committed memory and makes OpenJUMP to react somewhat slow even on a powerful computer.

More reading:

OpenJUMP DB Query Plugin: http://sourceforge.net/p/jumpdbqplugin/wiki/Documentation/
The page is not up to date
Source code: http://sourceforge.net/projects/jumpdbqplugin/files/jumpdbqplugin/jumpdbqplugin-1.0.0/
OGC GeoPackage standard: http://www.opengeospatial.org/standards/geopackage
GDAL GeoPackage driver: http://www.gdal.org/ogr/drv_geopackage.html
GDAL OpenStreetMap driver: http://gdal.org/ogr/drv_osm.html
GDAL ogr2ogr utility: http://gdal.org/ogr2ogr.html
Luciad libgpkg site: https://bitbucket.org/luciad/libgpkg/overview