- Adds error handling for the SearchIndexError exception
- Updates ckanext-dcat to version 1.7.0
- Updates ckanext-dcat to version 1.6.0
- The ckanext-harvest extension is no longer required to use the DCAT-AP.de RDF profile
- Updates and cleans up dependencies
- Standardization of the `test.ini` file
- Removes support for old CKAN versions prior to 2.9 and Python 2
- Updates ckanext-dcat to version 1.5.1 and removes support for all properties in class dcat:DataService except for `dcatde:licenseAttributionByText`, as these are now supported directly in ckanext-dcat.
- Updates ckanext-harvest to version 1.5.5
- Fixes SA warnings from sqlalchemy that occur since version 1.4
- Adds support for CKAN 2.10.0
- Improves duplicate detection: Adds support for setting a priority in the harvester configuration. If the modified dates of the remote and local dataset are equal, the remote dataset is imported when its harvester has a higher priority than the harvester of the local dataset.
- Fixes tests for Python 2
- Updates ckanext-harvest to version 1.4.2
- Adds support for Python 3.9
- Updates ckanext-dcat to version 1.4.0
- Remove pinning version for cryptography dependency
- Updates ckanext-harvest to version 1.4.1
- Adds support for property dcat:accessService in class dcat:Distribution
- Add last_modified as fallback value to resource modified date
- Add support for property dcatde:licenseAttributionByText in class dcat:DataService
- Fixes correlation when parsing additional contact point properties
- Updates pylint configuration to latest version and fixes several warnings
- Adds support for property dct:references in class dcat:Dataset
- Fixes string assertion error with Python 2
- Fixes test for invalid URIRef
- Internal changes: Switches Python environment from Python 3.6 to Python 3.8
- Improves the referencing of distribution when adding DCAT-AP.de properties
- Adds support for property dcatap:availability in class dcat:Dataset
- Updates ckanext-dcat to version 1.3.0
- Updates ckanext-harvest to version 1.4.0 and ckanext-dcat to version 1.2.0
- Updates deprecated contributor IDs in migration command
- Fixes saving resource extras in the migration functions and in the DCAT-AP.de profile
- Fixes clear harvest source history command call in cronjob shell script
- Fixes CKAN action context DCAT-AP.de migration command
- Support for Python 3
- Updates ckanext-harvest to version 1.3.4
- Updates deprecated contributor IDs in migration command
- Skip saving harvested datasets without distributions in the triple store when the harvest source config param `resources_required: true` is set
- Reads the ContributorID from the harvester config and adds it to the dataset graph if not already present
- Saves ContributorID in addition to the organizationID in the triple store with the validation results
- Adds ContributorID from the harvester config to CKAN dataset if not already present
- Ensure that tags keep minimum length after normalization
- Fixes harvest object state. Marks corresponding harvest objects as not current when deleting duplicate datasets and datasets which contain no resources anymore
- Deletes deprecated datasets in CKAN regardless of whether the dataset could previously be renamed or not
- Deletes deprecated datasets from triple store even if there is no owner org defined in the harvest source
- Adds the option `--keep-current=true` to the clear harvest jobs shell script.
- Fixes dev-requirements.txt: Broken version 1.7.0 of lazy-object-proxy was banned
- Fixes saving information about the harvested datasets saved in the triple store if there is more than one harvest source linked to the same organisation
- Explicitly disallow incorrect version of python-dateutil
- Updates requirement ckanext-harvest to internal version 1.3.3.dev1
- Updates requirement ckanext-dcat to official version 1.1.3
- Updates the README.md with information about the support of a triplestore and a SHACL validation
- Adds the option to delete specific datasets by their URIs from the triplestore to the ckan command "triplestore"
- Updates requirement ckanext-dcat to fix publisher URI handling
- Updates requirement ckanext-dcat to support URIRef values in "rights" and "accessRights"
- Changes the serialization format for metadata from "xml" to "turtle", because it is more strict and fails if URIRef elements contain invalid characters
- Limit size of harvesting fetch and gather consumer log files
- Updates requirement ckanext-dcat to version 1.1.2
- Adds the possibility to update the contributorID for manually maintained datasets when the contributorID has changed
- Make requirement for ckanext-harvest optional (#10)
- Adds new migration script option `contributor-id-migrate` to add the contributorID to existing manually maintained datasets
- Adds SHACL validation support to the triplestore ckan command
- Introduce the possibility to validate the dataset graph by SHACL when updating the dataset and save the validation result in a triplestore
- Handle requests exceptions if the triplestore endpoint is not reachable
- Also deletes data in triplestore when dataset is deleted in CKAN
- Adds the triplestore ckan command
- Improve logging messages in duplicate detection
- Improve logging when updating data in the triplestore
- Remove pinning version for cryptography dependency. Version >=3.3.1 is working again.
- Improve exception handling when updating data in the triplestore
- Pin version for cryptography dependency avoiding build errors with version >=3.3
- Implemented: Add harvested data into a triplestore
- Avoid crashes of the fetch consumer in case deletion harvest objects are corrupted
- Fixed problem with python dependency 'pycountry' that caused the build to fail.
- When remote datasets without resources/distributions are rejected (`resources_required`), any local version of the dataset is deleted if present.
- Fix line endings to match .gitattributes
- Fix harvester plugin docs (#11)
- Update requirement ckanext-dcat to version 1.1.0
- Catch exception if 'email-validator' is not available in older CKAN versions
- Remove patch disabling SSL verification for older Python 2.7 versions
- Adds support for the different VCARD representations for DCAT.contactPoint
- Update version for requirements ckanext-harvest and ckanext-dcat
- Remove the restriction to a specific version of CKAN
- Fix in RDF profile: Remove prefix "mailto:" from values in fields containing an email address in method parse_dataset
- Change in DCAT-AP.de RDF harvester: Remove validator 'email-validator' from create/update package schema
- Improve logic of the duplicate detection and add deletion of older duplicates within the duplicate detection
- Map older licenses in resources from DCAT-AP.de version v1.0 to the latest version v1.0.2
- Improve comparing dates with and without time zone information used by the duplicate detection
- Add different implementation for cleaning tags/keywords
- Add harvest source configuration `resources_required`, which logs and skips all datasets without distributions (CKAN resources)
- Fix possible error in logging message when setting default license
- Add support for class FOAF.Agent as rdf:type in dcatde:originator, dcatde:maintainer, dct:contributor and dct:creator
- Set default license (`http://dcat-ap.de/def/licenses/other-closed`) in the resources of a dataset if no license is provided and write a log entry with additional information about the harvest source, dataset and resource at the info level. Introduce configuration parameter `ckanext.dcatde.harvest.default_license` for defining the default license.
- Serialize dcatde:contributorID as type UriRef if the value is a URI, otherwise as Literal
- Rename environment names for internal ci/cd pipeline
- Update ckanext-dcat to v0.0.9
- Update ckanext-harvest to v1.1.4
- Remove patches (Fixes #6)
- Delete requirements subfolder which contained pre-built wheels
- Add supervisor config for harvesting `gather_consumer` and `fetch_consumer`
- Add cronjob scripts to run and clear harvest jobs. These scripts are used with GovData and were previously included in ckanext-govdatade.
- Add support for dct:type in dcatde:originator, dcatde:maintainer, dct:contributor and dct:creator
- The profile and examples now use the DCAT-AP.de v1.0.1 Namespace
- Renamed `legalbasisText` to `legalBasis` and `geocodingText` to `geocodingDescription`
- Added logic to parse older DCAT-AP.de Namespaces
- Improved dct:format and dcat:mediaType handling
- Improved selecting of the default language
- Fix problem with not deleting metadata without guid while harvesting
- Fix handling of downloadURL and accessURL
- Select title, description and names in the default language if available
- Fix error in graph_from_dataset() if no contactPoint exists in the graph
- Updated the examples for the licenses in CKAN and the license mapping to DCAT-AP.de v1.0.1
- Updated the example for the RDF endpoint to DCAT-AP.de v1.0.1
- Added patch for DCAT harvester so that it uses the default `_get_user_name` logic of ckanext-harvest
- Added patch for ckanext-harvest so that the default dataset name suffix is configurable
- OGD `metadata_original_id` is now mapped to `dct:identifier` instead of `adms:identifier`
- Added new migration script option `adms-id-migrate` to fix existing DCAT-AP.de datasets
- Correctly set `metadata_harvested_portal` for the custom RDF Harvester
- Avoiding an invalid rdf graph because of whitespaces in URIRef values by removing whitespaces before adding URIRef objects into the graph
- Added DCAT-AP.de specific RDF Harvester
- The dependency to ckanext-harvest was added
- Harmonized the version between the other CKAN-Plugins of GovData
- Initial version of the CKAN plugin
- Extends the output mapping with the DCAT-AP.de specific fields
- Contains a script (CKAN paster command) to create CKAN groups from the DCAT-AP categories
- Contains a migration script (CKAN paster command) to migrate the datasets in the CKAN database from OGD to DCAT-AP.de structure
- Contains a shell script to purge the CKAN groups representing the OGD categories