Tagmanager offers tools and interfaces for cleaning tags in CKAN open data portals.
The main features are detecting similar tags and merging them. This is useful for portals with many contributors, where tags are not always kept consistent.
We offer three modes for detecting similar tags:
This extension is intended to fill the tag management gap in CKAN. In the future, we plan to support creating relationships between tags, as well as a tag recommendation structure.
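To give a feel for what detecting similar tags by edit distance can look like, here is a minimal sketch. It is not the extension's actual code: the function names `levenshtein` and `similar_tag_pairs`, the `max_distance` threshold, and the sample tags are all hypothetical, and the distance is implemented in pure Python so the example runs without the python-Levenshtein dependency.

```python
from itertools import combinations


def levenshtein(a, b):
    """Dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similar_tag_pairs(tags, max_distance=1):
    """Return pairs of distinct tags whose case-insensitive edit
    distance is at most max_distance (an assumed threshold)."""
    unique = sorted(set(tags))
    return [
        (t1, t2)
        for t1, t2 in combinations(unique, 2)
        if levenshtein(t1.lower(), t2.lower()) <= max_distance
    ]


# Hypothetical tags as they might appear on a portal with many contributors.
tags = ["health", "Health", "helth", "education", "transport"]
pairs = similar_tag_pairs(tags)
for t1, t2 in pairs:
    print(t1, "~", t2)
```

Each pair reported this way is a candidate for merging; a person would normally still review the pairs, since a small edit distance does not guarantee that two tags mean the same thing.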
Before installing tagmanager, make sure you have:
Levenshtein python library:
pip install python-Levenshtein
Unidecode python library:
pip install unidecode
NLTK library:
pip install nltk
NLTK data:
python -m nltk.downloader all
To install ckanext-tagmanager:
Activate your CKAN virtual environment, for example:
. /usr/lib/ckan/default/bin/activate
Install the ckanext-tagmanager Python package into your virtual environment:
pip install ckanext-tagmanager
Run the database migration:
paster tagmanager migrate -c /etc/ckan/default/production.ini
Add tagmanager to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/production.ini).
Restart CKAN. For example, if you’ve deployed CKAN with Apache on Ubuntu:
sudo service apache2 reload
To install ckanext-tagmanager for development, activate your CKAN virtualenv and run:
git clone https://github.com/alantygel/ckanext-tagmanager.git
cd ckanext-tagmanager
python setup.py develop
paster tagmanager migrate -c /etc/ckan/default/development.ini
pip install -r dev-requirements.txt
To use the extension, navigate to yoursite/tagmanager in your browser.
This work was carried out as part of the STODaP research project, developed at the Federal University of Rio de Janeiro (Brazil) and the University of Bonn (Germany).