Friday, February 21, 2014

OpenJUMP and GeoPackage Part 2: Spatial index with libgpkg

OpenJUMP DB Query Plugin adds native parsing of geometry BLOBs into standard SQLite database engine. Therefore no additional libraries are needed for reading geodata from OGC GeoPackages or Spatialite databases. Normal attribute queries with SQL are also supported out-of-the-box as long as they use the SQL language as understood by the standard SQLite engine. However, by installing libpgkg as an extension OpenJUMP users can create spatial indexes on GeoPackage tables and make BBOX queries much faster. A Drawback is that libgpkg is a platform dependent component which makes installation a little bit harder.

It is also possible to use GDAL for creating and dropping spatial indexes of the GeoPackage vector layers and recent GDAL version creates spatial indexes by default. However, this is supported only in GDAL versions beginning from 1.11 and 2.0. The following method works for existing GeoPackages which are created without spatial indexes and it can be used directly from the OpenJUMP DB Query plugin.

How to install, test and use libgpkg with OpenJUMP


Libgpkg is used as a SQLite Run-Time Loadable Extension. Windows users can get ready made binaries directly from libgpkg download page. Linux and Mac users must read the project home page and follow the installation instructions. OpenJUMP users must notice that we are leaving the platform independent zone now and also Windows users must select if they are going to run OpenJUMP on 32-bit or 64-bit jre and download the right binaries.

Place the libgpkg extention (on Windows "gpkg.dll") into directory [home-of-OpenJUMP-PLUS]\lib\ext\. If you plan to run both 32-bit and 64-bit jre sometimes you can rename the files into "gpkg_32.dll" and "gpkg_64.dll" and keep them both available in the directory. Now you are ready to start OpenJUMP but read first the following note:


Inportant note:
By now the DB Query Plugin comes with SQLite jdbc driver file "sqlite-jdbc-3.7.2.jar". The libgpkg extension will not work with it. Download a new driver "sqlite-jdbc4-3.8.2-SNAPSHOT.jar" from the Xerial downloads and place the new file into directory [home-of-OpenJUMP-PLUS]\lib\ext\jumpdbquery-1.0.0\.  Close OpenJUMP, delete the old jdbc driver and start again.

Now we are ready for a test.

The DB Query Plugin tries to load a SQLite extension if it finds an option "?spatialite=file_name" after the name of the SQLite database file. Extension loader was made for loading Spatialite extension but it works similarly for all valid extensions. If there is any error with loading the extension, all SQL queries will fail too.


This test proves that a special function "ST_MinX" from the libgpkg extension is available and obviously all the other libgpkg functions as well. See the list of functions at https://bitbucket.org/luciad/libgpkg/overview







  
Next something useful that we can't do with the standard SQLite functions. Let's build a spatial query for selecting features which are totally or partly inside a 10 km by 10 km selection box.  Or to be exact, whose bounding boxes are totally or partly inside the selection box.



Nine seconds to select 1567 polygons out of 1,2 million.  It is not hopeless, but thanks to libgpkg we can be make it a lot faster.


One of the libgpkg functions is "CreateSpatialIndex" which, not surprisingly, creates a spatial index.  Let's run it and modify our spatial query so that it utilizes this brand new  index.








Much better now. Accurate timing is obviously slightly more than zero seconds but for sure less than one second. It may be surprising, but BBOX queries from SQLite are often considerably faster than from the heavy weight spatial databases. What is interesting is that the query that hits the spatial index is using only standard SQLite functions. If GeoPackage database is delivered so that it contains spatial indexes OpenJUMP can do flaming fast spatial BBOX queries without platform dependent extensions. However, libgpkg is needed for creating the spatial indexes because it is the only open source library that can do it at the moment.

Template for making a spatial query from index


Spatial index in GeoPackage, as well as in Spatialite, does not work transparently as it does in PostGIS. For utilizing the spatial index a spatial query must always contain a subquery which selects the IDs of the candidate features from the index which is actually a virtual table that holds the minx, miny, maxx and maxy values for each feature.  The logic of the subquery may feel odd at the beginning but the image below hopefully clarifies why it makes sense to compare minimum bounds of features to maximum bounds of the BBOX and vice versa.


Here is a copy-paste template for selecting all features whose bounding boxes are partly or totally inside the BBOX:

select * from tablename
where rowid in
(select id from rtree_tablename_geom
where
minx<BBOX_maxx and
maxx>BBOX_minx and
miny<BBOX_maxy and
maxy>BBOX_miny);



Troubleshooting



Probable reason:
Sqlite-jdbc driver must be updated.





Probable reasons:
  • Typo in the dll name
  • Dll file not on the OpenJUMP search path
  • Wrong version of dll-file (32-bit/64-bit) 


Obvious reason:
Missing parameter "?spatialite=" for loading the extension

0 comments: