Standard Metadata Model for Data Products — Open Data Product Specification

Data Product Business
Oct 9, 2022


The Open Data Product Specification aims to have the same impact on the Data Economy that the OpenAPI Specification had on the API Economy. It is one candidate to become part of the European Data Spaces building blocks. “Open” refers only to the openness of the standard, not to the data product content.

Data products and data-as-a-service solutions are spread across an increasing number of marketplaces, the tool stack for data product design, development and management is a wild west, and consumers have a hard time knowing what they are purchasing or how to compare data products to find the best possible fit for their situation. We need an international de facto standard to describe data products.

International standards are a vital tool for ensuring that products and services are interchangeable and compatible across borders. They remove barriers to trade, reduce production and supply chain costs, build confidence in business services, and protect consumers.

Open Data Product Specification (ODPS)

The Open Data Product Specification is a vendor-neutral, open-source, machine-readable metadata model for data products. It defines the objects and attributes as well as the structure of digital data products. The work is based on existing standards (schema.org), best practices and emerging concepts like Data Mesh. The reasoning is that we reuse and proudly copy instead of reinventing the wheel.

https://opendataproducts.org

The specification has been designed with four major aspects of the data product in mind: 1) technical (infrastructure & access), 2) business (pricing & plans), 3) legal (licensing & IPR), and 4) ethical (privacy & mydata). The four aspects are described in 5 elements, which contain attributes and other elements.
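To make the four-aspect idea concrete, here is a minimal sketch in Python. It is an illustration only: the keys and values below are hypothetical and are not the official ODPS element or attribute names, which are defined in the specification itself. The sketch simply checks that a product description says something about each of the four aspects.

```python
from typing import List

# Hypothetical aspect keys for illustration; NOT the official ODPS element names.
REQUIRED_ASPECTS = ("technical", "business", "legal", "ethical")


def missing_aspects(product: dict) -> List[str]:
    """Return the aspects that are absent or empty in a product description."""
    return [aspect for aspect in REQUIRED_ASPECTS if not product.get(aspect)]


# Made-up example product description (all values are assumptions).
example_product = {
    "technical": {"access": "REST API", "format": "JSON"},
    "business": {"pricing_plan": "monthly subscription"},
    "legal": {"license": "custom commercial license"},
    "ethical": {},  # privacy details not filled in yet
}

print(missing_aspects(example_product))  # prints ['ethical']
```

A machine-readable description makes this kind of completeness check trivial for a marketplace or design tool to automate.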

The specification enables frictionless value chain development operations between data product design tools, data marketplaces, data platforms, and data pipelines. The need for a shared open standard was discovered while consulting companies that develop data products.

CONNECT THE DATA VALUE CHAIN DOTS!

Benefits for the Data Economy

  • Enable interoperability between organizations, data platforms, marketplaces, and tools.
  • Reduce data product metadata conversions and errors between systems and organizations.
  • Increase the speed of designing, testing, and implementing data products.
  • Speed up tools development around data product design, development and management.
  • Enable creation of automated data product deployment with standard methods (DataOps).

The core has 4 built-in aspects

The order of the four aspects is not random. Too often we see data product development starting from drafting data flows and technical architecture for the solution. That should be the last step, taken only after deciding whether the product idea is good enough.

The first step is that you need to evaluate the business value of your possible data product. Does it make any sense? What kind of problem does it solve? Who are the customers? Is the market mature enough for it? These are just example questions and you should craft your business viability evaluation questions based on your needs.

The second step is to evaluate whether you are legally allowed to do it. You might have a great idea, but one of the data sources you need is licensed in a way that makes it impossible. Or local legislation prohibits you from using the data for that specific purpose. Here the lawyers often kill otherwise sound business plans and prevent you from making heavy losses in court. The solution might also be to find a workaround or negotiate better deals with data owners.

The third aspect is ethical: is the data product creating outcomes that are good for the related participants? The ethical side of doing business is becoming more and more important. Your idea might make sense in business numbers, and it might be legal to implement it, but it might not be ethical. In that case, you should not proceed but find an alternative way.

The final aspect is technical. This is rarely the problem nowadays. If all the above aspects have a green light, then the technical solution should be designed for implementation.
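The ordering described above can be sketched as a simple sequence of gates, where evaluation stops at the first aspect that fails. This is only an illustrative sketch: the function name, the check functions and the fields of the example idea are all hypothetical, and real evaluations are of course made by people, not lambdas.

```python
from typing import Callable, Dict, Optional

# Evaluation order taken from the text: business, legal, ethical, technical.
ASPECT_ORDER = ("business", "legal", "ethical", "technical")


def first_blocker(idea: dict, checks: Dict[str, Callable[[dict], bool]]) -> Optional[str]:
    """Return the first aspect whose check fails, or None if all pass."""
    for aspect in ASPECT_ORDER:
        if not checks[aspect](idea):
            return aspect
    return None


# Hypothetical checks and idea fields, invented for this example.
example_checks = {
    "business": lambda idea: idea["has_paying_customers"],
    "legal": lambda idea: idea["source_licenses_allow_use"],
    "ethical": lambda idea: idea["outcomes_good_for_participants"],
    "technical": lambda idea: True,  # assumed feasible once the rest is green
}

example_idea = {
    "has_paying_customers": True,
    "source_licenses_allow_use": False,
    "outcomes_good_for_participants": True,
}

print(first_blocker(example_idea, example_checks))  # prints legal
```

Because the technical check comes last, no architecture work is needed for an idea that is stopped at the business, legal or ethical gate.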

Open Data Product Initiative

The Open Data Product Initiative (“ODPI”) provides an open source community within which industry participants may easily contribute to building vendor-neutral, portable and open specifications for technical metadata for Data Products and Services, such as the “Open Data Product Specification”, and to supporting tooling for validating the integrity of the specifications or instantiations of them.

The ODPI is as such not intended to be a destination for community/consumer-focused tooling outside of the specification itself.

At the moment we have 2 permanent working groups

Development and maintenance are managed by the Technical Steering Committee (“TSC”), which is open to any participant (Chair: Jussi Niilahti). This group is responsible for managing pull requests (GitHub), any suggested improvements to the specification (channels: GitHub Issues / Typeform feedback), and releases of the specification. Organizations that purchase a membership in the Initiative get a vote and a position in decision making. Anyone can participate in the group, but participation does not automatically include the right to vote.

The second working group is the Strategy Group (Chair: Jarkko Moilanen). This group focuses on the business side of the Initiative, accepts new members and is the body that changes community rules and memberships. This group also manages the overall design and related design principles of the standard.

Temporary working groups

Any member company can suggest (to the Strategy Group) and lead a temporary working group for a specific task. The Strategy Group must approve the creation and possible budget of the group.

https://opencollective.com/odpi

How did it all get started?

Jarkko Moilanen was invited to join the ranks of a data platform development company in Dec 2018. By May 2019 Jarkko had worked with multiple data operating companies and discovered the need to build new tools: business tools for data owners. This resulted in the creation of the Data Product Toolkit.

The toolkit, however, was somewhat crippled without an underlying machine-readable data product standard, something that binds together the business, technology, legal and ethical aspects of the data product. The idea of the Open Data Product Specification was born around Sept 2019. However, the idea was buried under other development tasks and the dream almost died.

In Sept 2021 Jarkko resurrected the standardization idea and quickly finalized the first version of the specification in cooperation with Jussi Niilahti. The Open Data Product Specification version 1.0 was published on 5 Feb 2022.

Data Product Business

We help you to capture real value from your data by practical methods, tools and world-class knowledge