A harvester that lets CKAN directories stay in sync with a catalogue that provides an API for fetching metadata.
In order to use this tool, you need to have the ODM CKAN harvester extension (https://github.com/opendatamonitor/ckanext-harvestodm) installed and loaded for your CKAN instance. Tested with CKAN v2.2 (http://docs.ckan.org/en/ckan-2.2/).
This work is based on the socrata harvester extension (https://github.com/socrata/socrata-harvester). This version of the socrata-harvester plugin adds support for using MongoDB as the metadata repository. The original code was also modified to comply with the requirements of the ODM project (www.opendatamonitor.eu) (see below).
Main modifications are:
To build and use this plugin, simply:
git clone https://github.com/opendatamonitor/socrata-harvester.git
cd socrata-harvester
pip install -r pip-requirements.txt
python setup.py develop
Then you will need to update your CKAN configuration to include the new harvester. This means adding the Socrata harvester plugin to the ckan.plugins setting, e.g.:
ckan.plugins = harvestodm socrata_harvest
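As a quick sanity check, the plugin line can be verified with a short script. This is a minimal sketch and not part of the extension: it assumes the setting lives in the `[app:main]` section of the ini file, as in a standard CKAN configuration, and the helper name `missing_plugins` is hypothetical.

```python
import configparser

# Plugin names taken from the example ckan.plugins line above.
REQUIRED_PLUGINS = {"harvestodm", "socrata_harvest"}


def missing_plugins(ini_text, section="app:main"):
    """Return the required plugins that are not listed in ckan.plugins."""
    # interpolation=None because CKAN ini files contain %(here)s placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    enabled = set(parser.get(section, "ckan.plugins", fallback="").split())
    return REQUIRED_PLUGINS - enabled


sample = """
[app:main]
ckan.plugins = harvestodm socrata_harvest
"""
print(sorted(missing_plugins(sample)))
```

An empty result means both plugins are listed. To check a real file, read its contents and pass them in as `ini_text`.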
You also need to add the [ckan:odm_extensions] settings to the development.ini file in your CKAN folder:
[ckan:odm_extensions]
mongoclient=localhost
mongoport=27017
log_path=/var/local/ckan/default/pyenv/src/
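The settings above can be read back with the Python standard library to confirm that they parse as expected. The sketch below is illustrative only and does not reflect how the extension itself loads its configuration. The function name `read_odm_settings` and the keys of the returned dictionary are assumptions.

```python
import configparser


def read_odm_settings(ini_text):
    """Parse the [ckan:odm_extensions] section into a small dictionary."""
    # interpolation=None because CKAN ini files contain %(here)s placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["ckan:odm_extensions"]  # KeyError if the section is missing
    return {
        "host": section["mongoclient"],
        "port": section.getint("mongoport"),  # ValueError if not an integer
        "log_path": section["log_path"],
    }


sample_ini = """
[ckan:odm_extensions]
mongoclient=localhost
mongoport=27017
log_path=/var/local/ckan/default/pyenv/src/
"""
settings = read_odm_settings(sample_ini)
print(settings["host"], settings["port"], settings["log_path"])
```

A missing section or a non-numeric port raises an exception here, so typos in the section are caught before the harvester is started.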
After setting this up, you should be able to go to:
http://localhost:5000/harvest
If you do not have the ckanext-htmlharvest extension (https://github.com/opendatamonitor/ckanext-htmlharvest) installed, go instead to:
http://localhost:5000/harvest/new
A new “Socrata” harvest type should then show up when creating sources.
This work implements the ckanext-harvest template (https://github.com/ckan/ckanext-harvest) and is thus licensed under the GNU Affero General Public License (AGPL) v3.0 (http://www.fsf.org/licensing/licenses/agpl-3.0.html).