Socrata harvester to allow CKAN to keep in sync with a Socrata store
by OpenDataMonitor


A harvester that allows CKAN directories to keep in sync with a catalogue that provides an API for fetching metadata.

To use this tool, you need to have the ODM CKAN harvester extension installed and loaded for your CKAN instance. Tested with CKAN v2.2.


This work is based on the socrata harvester extension. The socrata-harvester plugin adds support for using MongoDB as a metadata repository. The original code was also modified to comply with the ODM project’s requirements (see below).


Main modifications are:

  • add extra metadata fields (language, country, catalogue_url, platform) or use existing ones in a different way (metadata_created and metadata_updated are synchronised to our platform’s timings, overriding the client’s)
  • check whether a metadata record is already present in the MongoDB database, and create or update it accordingly
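
The create-or-update behaviour described above can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: the function name save_record, the lookup key (record id plus catalogue_url), the "socrata" platform value, and the InMemoryCollection stand-in are all assumptions made so the example runs without a MongoDB server. The stand-in mimics the find_one / insert_one / update_one methods of a pymongo collection.

```python
from datetime import datetime, timezone


class InMemoryCollection:
    """Hypothetical stand-in for a MongoDB collection (no server needed)."""

    def __init__(self):
        self.docs = []

    def find_one(self, query):
        # Return the first document whose fields match every key in the query.
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None

    def insert_one(self, doc):
        self.docs.append(dict(doc))

    def update_one(self, query, update):
        # Only the "$set" operator is supported in this sketch.
        doc = self.find_one(query)
        if doc is not None:
            doc.update(update.get("$set", {}))


def save_record(collection, record, catalogue_url, now=None):
    """Create the record if it is new, otherwise update it in place.

    Timestamps come from the harvesting platform, not from the client.
    """
    now = now or datetime.now(timezone.utc).isoformat()
    doc = dict(record, catalogue_url=catalogue_url,
               platform="socrata", metadata_updated=now)
    key = {"id": record["id"], "catalogue_url": catalogue_url}  # assumed key

    if collection.find_one(key) is None:
        doc["metadata_created"] = now
        collection.insert_one(doc)
        return "created"

    # Existing record: metadata_created is left untouched.
    collection.update_one(key, {"$set": doc})
    return "updated"
```

With a real database, the same function could in principle be passed a pymongo collection instead of the stand-in, since it only relies on those three methods.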


To build and use this plugin, simply:

git clone
cd socrata-harvester
pip install -r pip-requirements.txt
python setup.py develop

Then update your CKAN configuration to include the new harvester. This means adding the Socrata harvester to the ckan.plugins setting, e.g.:

ckan.plugins = harvestodm socrata_harvest
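
As a quick sanity check, the plugin list can be read back from the configuration file. The sketch below uses Python's standard configparser and assumes the setting lives in the usual [app:main] section of a CKAN .ini file; the helper name enabled_plugins is hypothetical. Interpolation is disabled because CKAN configuration files often contain %(here)s placeholders.

```python
import configparser


def enabled_plugins(ini_text, section="app:main"):
    """Return the plugin names listed under ckan.plugins as a list."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, "ckan.plugins", fallback="").split()


# Example configuration fragment, using the plugin names shown above.
sample = """
[app:main]
ckan.site_url = http://localhost:5000
ckan.plugins = harvestodm socrata_harvest
"""

plugins = enabled_plugins(sample)
missing = [p for p in ("harvestodm", "socrata_harvest") if p not in plugins]
print("missing plugins:", missing or "none")
```

To check a real file, read its contents (for example with open("development.ini").read()) and pass the text to the helper.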

You also need to add the odm_extension settings to the development.ini file in your CKAN folder.



After setting this up, you should be able to go to: http://localhost:5000/harvest

If you do not have the ckanext-htmlharvest extension installed, go to:


A new “Socrata” harvest type should then show up when creating sources.


This work implements the ckanext-harvest template and is thus licensed under the GNU Affero General Public License (AGPL) v3.0.
