CKAN extension to enable geospatial searches within the datastore
by U.K. Natural History Museum

ckanext-dataspatial

This extension is no longer maintained.

A CKAN extension that provides geospatial awareness of datastore data.

Overview

NB: This extension is unmaintained. There have been some syntax updates so it should work with CKAN 2.8+, but it hasn’t been tested and is not currently in use by the Museum.

This extension provides geospatial awareness of datastore data. This includes:

  • Geospatial searches within datasets;
  • Spatial extent of datastore searches;
  • Support for PostGIS;
  • Support for Solr via the ckanext-datasolr extension.

Installation

Path variables used below:

  • $INSTALL_FOLDER (i.e. where CKAN is installed), e.g. /usr/lib/ckan/default
  • $CONFIG_FILE, e.g. /etc/ckan/default/development.ini

  1. Clone the repository into the src folder:

    cd $INSTALL_FOLDER/src
    git clone https://github.com/NaturalHistoryMuseum/ckanext-dataspatial.git

  2. Activate the virtual env:

    . $INSTALL_FOLDER/bin/activate

  3. Install the requirements from requirements.txt:

    cd $INSTALL_FOLDER/src/ckanext-dataspatial
    pip install -r requirements.txt

  4. Run setup.py:

    cd $INSTALL_FOLDER/src/ckanext-dataspatial
    python setup.py develop

  5. Add ‘dataspatial’ to the list of plugins in your $CONFIG_FILE:

    ckan.plugins = ... dataspatial

Configuration

There are a number of options that can be specified in your .ini config file. They all have defaults set, so none are required.

| Name | Description | Default |
|------|-------------|---------|
| dataspatial.query_extent | Which backend to use for query_extent queries (either 'postgis' or 'solr') | postgis |
| dataspatial.postgis.field | WGS data field in the PostGIS database | _geom |
| dataspatial.postgis.mercator_field | Mercator field in the PostGIS database | _the_geom_webmercator |

The following options only apply if the ckanext-datasolr extension is also installed.

| Name | Description | Default |
|------|-------------|---------|
| dataspatial.solr.index_field | Spatial index in Solr | _geom |
| dataspatial.solr.latitude_field | Latitude index in Solr | latitude |
| dataspatial.solr.longitude_field | Longitude index in Solr | longitude |
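
To illustrate how the defaults behave, here is a minimal sketch that resolves the options listed above from a plain dict standing in for the parsed .ini file. The `DEFAULTS` mapping and the `resolve_options` helper are hypothetical names used for illustration; they are not part of the extension.

```python
# Defaults copied from the options listed above.
DEFAULTS = {
    'dataspatial.query_extent': 'postgis',
    'dataspatial.postgis.field': '_geom',
    'dataspatial.postgis.mercator_field': '_the_geom_webmercator',
    'dataspatial.solr.index_field': '_geom',
    'dataspatial.solr.latitude_field': 'latitude',
    'dataspatial.solr.longitude_field': 'longitude',
}


def resolve_options(config):
    """Return every dataspatial option, falling back to its default.

    `config` is any dict-like object holding values read from the .ini
    file. Hypothetical helper, for illustration only.
    """
    options = {key: config.get(key, default)
               for key, default in DEFAULTS.items()}
    backend = options['dataspatial.query_extent']
    if backend not in ('postgis', 'solr'):
        raise ValueError(
            "dataspatial.query_extent must be 'postgis' or 'solr', got %r"
            % backend)
    return options


options = resolve_options({'dataspatial.query_extent': 'solr'})
print(options['dataspatial.query_extent'])   # solr
print(options['dataspatial.postgis.field'])  # _geom
```

Only the keys you set in the config file need to appear; everything else falls back to the documented default.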

Further Setup

Geospatial searches and query extent work with both PostGIS and Solr, but each backend requires further setup before it can be used.

PostGIS

To use the PostGIS backend, your PostgreSQL database must have PostGIS support.

  1. Install the correct version of PostGIS for your version of PostgreSQL, e.g. for PostgreSQL 9.1 on Ubuntu:

    sudo apt-get install postgresql-9.1-postgis

  2. Run scripts to install the extension and change permissions ($DATASTORE_DB_NAME is the name of your PostgreSQL database that holds the datastore, and $DB_USER is your database user):

    sudo -u postgres psql -d $DATASTORE_DB_NAME -f /usr/share/postgresql/9.1/contrib/postgis-1.5/postgis.sql
    sudo -u postgres psql -d $DATASTORE_DB_NAME -c "ALTER TABLE geometry_columns OWNER TO $DB_USER"
    sudo -u postgres psql -d $DATASTORE_DB_NAME -c "ALTER TABLE spatial_ref_sys OWNER TO $DB_USER"
    sudo -u postgres psql -d $DATASTORE_DB_NAME -f /usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql

  3. You will then need to create PostGIS columns on your resources. Invoking the command below will create the two columns named above (dataspatial.postgis.field and dataspatial.postgis.mercator_field) on table $RESOURCE_ID. One represents the WGS (World Geodetic System) data, and one uses the web mercator projection, which is useful for generating maps.

    paster --plugin=ckanext-dataspatial dataspatial create-columns $RESOURCE_ID -c $CONFIG_FILE

Solr

When using Solr, you will need to make sure the spatial data is indexed; this extension does not provide any tools for doing this. To use Solr you will need to install and configure the ckanext-datasolr extension for the datasets you wish to use Solr on.

Usage

Actions

create_geom_columns

Creates the PostGIS columns on the $RESOURCE_ID table. Can also populate the columns (if populate is True) and create an index (if index is True).

from ckan.plugins import toolkit

toolkit.get_action('create_geom_columns')(
    context,
    {
        # The resource id, required.
        'resource_id': '...',

        # If True then populate the geom columns from lat/long
        # points after creating them. Optional, defaults to True
        'populate': True,

        # If True, then create an index of the geom columns. Optional, defaults to
        # True
        'index': True,

        # The dataset fields containing the latitude and longitude columns.
        # Required if (and only if) populate is True.
        'latitude_field': 'latitude',
        'longitude_field': 'longitude'
    }
)
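
Because the latitude and longitude fields are required only when populate is True, it can help to assemble the data dict in one place. The following is a minimal sketch of such a helper; `build_create_geom_params` is a hypothetical name and is not provided by the extension.

```python
def build_create_geom_params(resource_id, populate=True, index=True,
                             latitude_field=None, longitude_field=None):
    """Assemble the data dict for create_geom_columns.

    Hypothetical helper: it enforces the rule that the latitude and
    longitude fields are required if (and only if) populate is True.
    """
    has_fields = latitude_field is not None and longitude_field is not None
    if populate and not has_fields:
        raise ValueError(
            'latitude_field and longitude_field are required '
            'when populate is True')
    params = {
        'resource_id': resource_id,
        'populate': populate,
        'index': index,
    }
    if populate:
        # Only pass the field names when they will actually be used.
        params['latitude_field'] = latitude_field
        params['longitude_field'] = longitude_field
    return params


params = build_create_geom_params(
    'RESOURCE_ID', latitude_field='latitude', longitude_field='longitude')
# Inside CKAN the dict would then be passed to the action:
# toolkit.get_action('create_geom_columns')(context, params)
```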

update_geom_columns

Updates the geospatial columns after rows in the resource have changed. This is not done automatically, so it must be implemented in your own workflow. Equivalent to the populate-columns command.

from ckan.plugins import toolkit

toolkit.get_action('update_geom_columns')(
    context,
    {
        'resource_id': 'RESOURCE_ID',
        'latitude_field': 'LATITUDE_COLUMN',
        'longitude_field': 'LONGITUDE_COLUMN'
    }
)
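
Since the geom columns are not refreshed automatically, one option is to wrap the write and the refresh in a single function. The sketch below assumes rows are written with CKAN's datastore_upsert action; `upsert_and_refresh_geom` is a hypothetical helper, and `get_action` is passed in as a parameter so the function can be exercised without a CKAN install.

```python
def upsert_and_refresh_geom(get_action, context, resource_id, records,
                            latitude_field, longitude_field):
    """Upsert rows, then repopulate the geom columns for the resource.

    `get_action` is expected to be ckan.plugins.toolkit.get_action.
    Hypothetical helper, for illustration only.
    """
    # Note: the 'upsert' method requires a unique key on the resource.
    get_action('datastore_upsert')(context, {
        'resource_id': resource_id,
        'records': records,
        'method': 'upsert',
    })
    # The geom columns are only refreshed by this explicit call.
    return get_action('update_geom_columns')(context, {
        'resource_id': resource_id,
        'latitude_field': latitude_field,
        'longitude_field': longitude_field,
    })


# Inside CKAN this would be called as, for example:
# from ckan.plugins import toolkit
# upsert_and_refresh_geom(toolkit.get_action, context, 'RESOURCE_ID',
#                         records, 'LATITUDE_COLUMN', 'LONGITUDE_COLUMN')
```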

Searching by geospatial fields involves passing a custom filter to datastore_search. The filter _tmgeom contains a WKT (Well-Known Text) string representing the area to be searched; currently only the POLYGON and MULTIPOLYGON types work. For example:

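from ckan.plugins import toolkit

# `context` is the usual CKAN action context dict.
search_params = {
    'resource_id': 'RESOURCE_ID',
    # A WKT polygon wraps each ring in its own pair of parentheses.
    'filters': '_tmgeom:POLYGON((36 114, 36 115, 37 115, 37 114, 36 114))'
}
search = toolkit.get_action('datastore_search')(context, search_params)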
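
Writing WKT by hand is error-prone: in standard WKT each polygon ring sits inside its own parentheses and must end on the point it started from. The sketch below builds a rectangular POLYGON from a bounding box. `bbox_to_wkt_polygon` is a hypothetical helper, and the coordinates are written in WKT's `x y` order; PostGIS conventionally treats x as longitude, but which axis your data uses is an assumption to verify against your own resources.

```python
def bbox_to_wkt_polygon(min_x, min_y, max_x, max_y):
    """Return a closed rectangular WKT POLYGON for the given bounds.

    Hypothetical helper, for illustration only.
    """
    if min_x > max_x or min_y > max_y:
        raise ValueError('minimum bounds must not exceed maximum bounds')
    corners = [
        (min_x, min_y),
        (min_x, max_y),
        (max_x, max_y),
        (max_x, min_y),
        (min_x, min_y),  # repeat the first point to close the ring
    ]
    ring = ', '.join('%s %s' % (x, y) for x, y in corners)
    return 'POLYGON((%s))' % ring


wkt = bbox_to_wkt_polygon(36, 114, 37, 115)
print(wkt)  # POLYGON((36 114, 36 115, 37 115, 37 114, 36 114))

# The filter string in the format used by the example above.
search_filter = '_tmgeom:' + wkt
```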

datastore_query_extent

To see the geospatial extent of the query, the same parameters as above can be submitted to the action datastore_query_extent:

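from ckan.plugins import toolkit

# `context` is the usual CKAN action context dict.
search_params = {
    'resource_id': 'RESOURCE_ID',
    # A WKT polygon wraps each ring in its own pair of parentheses.
    'filters': '_tmgeom:POLYGON((36 114, 36 115, 37 115, 37 114, 36 114))'
}
extent = toolkit.get_action('datastore_query_extent')(context, search_params)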

This will return a dict:

| Key | Description |
|-----|-------------|
| total_count | Total number of rows matching the query |
| geom_count | Number of rows matching the query that have geospatial information |
| bounds | ((lat min, long min), (lat max, long max)) for the query's rows |
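
As a rough illustration of how the returned dict might be consumed, the sketch below computes the share of rows that carry geospatial information and the midpoint of the bounds. `summarise_extent` is a hypothetical helper, the sample values are invented, and treating a missing or empty `bounds` as "no extent" is an assumption rather than documented behaviour. The midpoint is a simple average and does not handle extents that cross the antimeridian.

```python
def summarise_extent(result):
    """Return the geom coverage ratio and centre point of an extent result.

    Hypothetical helper, for illustration only.
    """
    total = result['total_count']
    with_geom = result['geom_count']
    coverage = float(with_geom) / total if total else 0.0

    centre = None
    bounds = result.get('bounds')
    if bounds:
        # bounds is ((lat min, long min), (lat max, long max)).
        (lat_min, long_min), (lat_max, long_max) = bounds
        centre = ((lat_min + lat_max) / 2.0, (long_min + long_max) / 2.0)
    return {'coverage': coverage, 'centre': centre}


# Invented values, shaped like the return dict described above.
example = {
    'total_count': 200,
    'geom_count': 150,
    'bounds': ((36.0, 114.0), (37.0, 115.0)),
}
summary = summarise_extent(example)
print(summary['coverage'])  # 0.75
print(summary['centre'])    # (36.5, 114.5)
```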

Commands

dataspatial

  1. create-columns: create the PostGIS columns on the $RESOURCE_ID table.

    paster --plugin=ckanext-dataspatial dataspatial create-columns $RESOURCE_ID -c $CONFIG_FILE

  2. create-index: create index for PostGIS columns on the $RESOURCE_ID table.

    paster --plugin=ckanext-dataspatial dataspatial create-index $RESOURCE_ID -c $CONFIG_FILE

  3. populate-columns: populate the PostGIS columns from the given lat & long fields. Equivalent to the update_geom_columns() action.

    paster --plugin=ckanext-dataspatial dataspatial populate-columns $RESOURCE_ID -l $LATITUDE_COLUMN -g $LONGITUDE_COLUMN -c $CONFIG_FILE
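
When preparing several resources, the three commands can be scripted. The sketch below only builds the argument lists; `dataspatial_commands` is a hypothetical helper. Running populate-columns before create-index is a choice made here (indexing after a bulk load is a common practice), not an order the extension is known to require.

```python
def dataspatial_commands(resource_id, config_file,
                         latitude_column, longitude_column):
    """Build argument lists for the three dataspatial paster commands.

    Hypothetical helper, for illustration only.
    """
    base = ['paster', '--plugin=ckanext-dataspatial', 'dataspatial']
    tail = ['-c', config_file]
    return [
        base + ['create-columns', resource_id] + tail,
        base + ['populate-columns', resource_id,
                '-l', latitude_column, '-g', longitude_column] + tail,
        base + ['create-index', resource_id] + tail,
    ]


commands = dataspatial_commands(
    'RESOURCE_ID', '/etc/ckan/default/development.ini',
    'latitude', 'longitude')
for argv in commands:
    print(' '.join(argv))

# To actually run them inside the CKAN virtualenv, each list could be
# passed to subprocess.check_call(argv).
```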

Testing

Test coverage is currently extremely limited.

To run the tests, use nosetests inside your virtualenv. The --nocapture flag will allow you to see the debug statements.

nosetests --ckan --with-pylons=$TEST_CONFIG_FILE --where=$INSTALL_FOLDER/src/ckanext-dataspatial --nologcapture --nocapture
