Promote "Metadata in Table Schema" recipe to the specs #899

Open
roll opened this issue Apr 4, 2024 · 6 comments · Fixed by #961

@roll
Member

roll commented Apr 4, 2024

Overview

The recipe is published here - https://datapackage.org/recipes/metadata-in-table-schema/

It's heavily used in one of the most prominent Data Package adoption cases - http://schema.data.gouv.fr/

I'm ccing @johanricher for more details here

@roll roll added this to the v2-final milestone Apr 4, 2024
@amelie-rondot

Context

In 2019, we introduced a pattern for schema metadata properties that describe a schema's name, description and other characteristics. This helps users understand schemas and makes them easier to share and reuse, for example as part of a cataloging use case.

These metadata properties have since been used by a significant number of schemas, most of which were created in France and cataloged on schema.data.gouv.fr.

Examples of adoption:

Some of the properties have also been implemented in frictionless-py:

  • descriptor (optional)
  • name (optional)
  • type (optional)
  • title (optional)
  • description (optional)

Proposition

In order to solidify the growing adoption of the metadata properties and increase the coherence between the spec and implementation, we propose to add a subset of those properties, those most frequently used, to the Table Schema specification and documentation, as part of the v2 Frictionless Data specs.

We will also propose via an issue on the frictionless-py repository to implement those properties in the library and document them.

All those properties would stay optional to preserve the backward compatibility of the spec and implementations with existing schemas.

  • name:
    • Description: An identifier string for this schema.
    • Format: lowercase string containing only alphanumeric characters, "_" or "-", with no spaces
    • Example: 'schema-static-ev-charger'
  • title:
    • Description: A human-readable title for this schema.
    • Format: string limited to 100 characters
    • Example: 'Static EV charger'
  • description:
    • Description: A text description for this schema.
    • Format: string
    • Example: "Specification of the exchange file for data concerning the geographical location and technical characteristics of electric vehicle charging stations and points."
  • homepage:
  • path:
    The direct path to the schema itself can make it easier to access (i.e. machine readability).
  • sources:
    • Description: A list of dictionaries containing the titles and URLs of documentation sources related to the schema
    • Format: JSON array of documentation sources, each described with the properties "title" and "path"
    • Example:
    [
            {
                "title": "Décret n° 2017-26 du 12 janvier 2017 relatif aux infrastructures de recharge pour véhicules électriques et portant diverses mesures de transposition de la directive 2014/94/UE du Parlement européen et du Conseil du 22 octobre 2014 sur le déploiement d’une infrastructure pour carburants alternatifs",
                "path": "https://www.legifrance.gouv.fr/jo_pdf.do?id=JORFTEXT000033860620"
            }
    ]
    
  • keywords:
    • Description: A list of short keywords related to the schema
    • Format: list of strings
    • Example:
    [
      "electric vehicle",
      "ev",
      "charging station",
      "mobility"
    ]
    
  • resources:
    Oftentimes, schemas are shared with example resources to illustrate them, with valid or even invalid files (e.g. with constraint errors).
    • Description: Example tabular data resource(s) validated or invalidated against this schema.
    • Format: JSON array of data file(s), each described with the properties "title" and "path"
    • Example:
      [
              {
                  "title": "Exemple de fichier IRVE valide",
                  "path": "https://raw.githubusercontent.com/etalab/schema-irve/v2.3.0/statique/exemple-valide-statique.csv"
              }
      ]
      
  • created:
    • Description: The date on which this schema was created.
    • Format: date
    • Example: "2018-06-29"
  • lastModified:
    • Description: The date on which this schema was last modified.
    • Format: date
    • Example: "2022-10-10"
  • version:
    • Description: A unique version number for this schema, in the semantic versioning format, possibly prefixed with the letter "v".
    • Format: string
    • Examples: "2.3.0" or "v2.3.0" or "2.3.0-beta"
  • contributors:
    • Description: The contributors to this schema.
    • Format: JSON array of contributors, each described with the properties "title", "email", "organisation" and "role"
    • Example:
      [
          {
              "title": "Alexandre Bulté",
              "email": "validation@data.gouv.fr",
              "organisation": "Etalab",
              "role": "author"
          },
          {
              "title": "Pierre Dittgen",
              "email": "pierre.dittgen@jailbreak.paris",
              "organisation": "Jailbreak",
              "role": "contributor"
          },
          ...
      ]
      

Adding other custom properties would still be allowed and tolerated by implementations such as frictionless-py.
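
As an illustration only, the format rules listed above could be checked with a few lines of Python. This is a minimal sketch, not part of the spec or of frictionless-py: the `check_metadata` helper, both regular expressions, the reading of "date" as an ISO 8601 calendar date, and the placeholder `fields` entry are assumptions made for the example. The other values are taken from the examples above.

```python
import re
from datetime import date

# Assumed patterns, derived from the "Format" lines of the proposal.
NAME_RE = re.compile(r"[a-z0-9_-]+")
# Simplified semantic version with an optional "v" prefix and pre-release tag.
VERSION_RE = re.compile(r"v?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?")


def check_metadata(schema: dict) -> list:
    """Return a list of problems; every property is optional, so absent keys pass."""
    problems = []

    name = schema.get("name")
    if name is not None and not (isinstance(name, str) and NAME_RE.fullmatch(name)):
        problems.append("name: expected lowercase alphanumerics, '_' or '-'")

    title = schema.get("title")
    if title is not None and not (isinstance(title, str) and len(title) <= 100):
        problems.append("title: expected a string of at most 100 characters")

    version = schema.get("version")
    if version is not None and not (
        isinstance(version, str) and VERSION_RE.fullmatch(version)
    ):
        problems.append("version: expected a semantic version such as 2.3.0 or v2.3.0")

    # Assumption: "date" means an ISO 8601 calendar date, as in the examples.
    for key in ("created", "lastModified"):
        value = schema.get(key)
        if value is not None:
            try:
                date.fromisoformat(value)
            except (TypeError, ValueError):
                problems.append(f"{key}: expected a date such as 2018-06-29")

    keywords = schema.get("keywords")
    if keywords is not None and not (
        isinstance(keywords, list) and all(isinstance(k, str) for k in keywords)
    ):
        problems.append("keywords: expected a list of strings")

    # "sources" and "resources" share one shape: objects with "title" and "path".
    for key in ("sources", "resources"):
        items = schema.get(key)
        if items is None:
            continue
        well_formed = isinstance(items, list) and all(
            isinstance(item, dict) and {"title", "path"}.issubset(item)
            for item in items
        )
        if not well_formed:
            problems.append(f"{key}: expected a list of objects with 'title' and 'path'")

    return problems


schema = {
    "name": "schema-static-ev-charger",
    "title": "Static EV charger",
    "version": "2.3.0",
    "created": "2018-06-29",
    "lastModified": "2022-10-10",
    "keywords": ["electric vehicle", "ev", "charging station", "mobility"],
    "fields": [{"name": "id", "type": "string"}],  # placeholder, not from the proposal
}

print(check_metadata(schema) or "metadata matches the proposed formats")
```

Because absent keys are skipped, a schema without any of these properties still passes, which mirrors the intent of keeping them optional. Unknown keys are ignored as well, in line with custom properties remaining allowed.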

Next

  • Collect feedback
  • Work on spec (PR)
  • Propose and work on implementation (issue+PR)

We propose to contribute to all or part of this work.

@roll
Member Author

roll commented Apr 8, 2024

Thanks a lot, @amelie-rondot!

Would you be interested in creating a PR for this change (please take a look at the v2 Contribution Guideline), or would you like me to work on it?

@amelie-rondot

Hello @roll,
From now on, I will not have enough time to continue working on the frictionless and Validata projects. My French colleague Pierre Camilleri is taking the lead on it and is interested in creating this PR to adopt this change.

@roll
Member Author

roll commented Apr 15, 2024

Hi @pierrecamilleri,

Amazing! Please let me know if you need any help

@roll
Member Author

roll commented Apr 25, 2024

Hi @pierrecamilleri,

Please take into account that the changes that will make up Data Package (v2) need to be accepted by the Working Group by the end of May, so we need to make a proposal on this one in the next few weeks; otherwise it will land in a later version.

@pierrecamilleri

Thanks for the reminder! I am currently working on it, so I should propose a PR in the coming days, if not today.

@roll roll modified the milestones: v2.0-final, v2.1, v2.0 Jun 24, 2024
@roll roll changed the title Promote "Metadata in Table Schema" recipe to the specs? Promote "Metadata in Table Schema" recipe to the specs Jun 26, 2024
@roll roll added feat and removed Table Schema labels Oct 21, 2024