by IDISS team
The Challenge
The general motivation of our project is to provide tooling to confront the complex and difficult problem of data transformation.
From our perspective, this need for transformations arises from the existence of data silos. They occur naturally over time, mirroring organizational structures. As each island (department, company, industry etc.) collects and stores its own data for its own purposes, it creates its own data silo.
Bridging data silos is an important task not only for memory institutions (such as libraries and archives) and global businesses (which need to interface with external standards such as electronic invoices and messaging), but also for data warehousing projects that need, or want, to unify data in order to improve data quality.
Because these “silos” develop over time and tend to be built on an internal culture, they are often hard to see from within.
The Solution
The purpose of our tooling is to handle editing, maintenance, validation, and versioning of the core information of the data transformations. Our first objective has been to provide a Minimum Viable Product (MVP) to interested parties. As a first tool, we developed an editor (as an extension of Visual Studio Code).
Although we target arbitrary data transformations, regardless of their data encoding, we started with XML as the syntax language, given its wide use in some of our initial case studies, especially electronic invoices.
As the initial “real-world” usage scenario for our tooling, we chose the task of maintaining the syntax binding of the EU e-procurement (EU CEN Standard EN16931).
DAPSI support
DAPSI helped us the entire way with equity-free financial support, top-class seminars, mentoring, coaching and networking. Our participation in DAPSI helped open a number of doors. Through DAPSI we also came into contact with NGI Tetra, which provided us not only with extended training but also with an additional 20 hours of best-in-class business coaching.
DAPSI journey – Achievements from first phase of the DAPSI programme
By the end of the first phase, we were able to present our technical proof of concept.
It included insights that we gleaned from different sources. We reconfirmed and adjusted our initial requirements by interviewing several potential users. The most important phase for us was the evaluation of existing open-source tools and libraries to base our work upon. This let us avoid the “reinventing the wheel” syndrome and profit from other community efforts, achieving more and higher quality in less time. Initial choices were made, but we faced a steep learning curve. We were therefore proud to deliver our initial technical milestone in October.
DAPSI journey – Achievements from second phase of the DAPSI programme
Participating in the NGI DAPSI project allowed our team to focus on innovative problem solving and to deliver fully open-source technology. Only the financial support made it possible to raise sufficient manpower to tackle the high complexity of creating a toolset for a recurring, complex interoperability problem.
At the same time, the DAPSI mentorship prepared our team to stand on its own feet once the project funding ends and never to lose focus on customers’ needs, just like a start-up!
DAPSI proved that innovative ideas can be successfully pushed into the European economy through a bottom-up approach!
Lessons learnt
We learned how to build a company. As geeks, we knew the technology and had ideas about how to solve specific technical problems, but we had only limited exposure to building a functioning company. DAPSI provided us with the necessary mentoring and helped us to see beyond our infatuation with technical solutions, so that we could redraw our roadmap to better suit a sustainable business.
What’s next
At IDISS we follow an interlingual paradigm. Our tools are designed to assist in the creation, visualization, and maintenance of data models and, as a next step, in the automatic generation of transformations based on a central semantic model. In the future, we aim to combine existing code generators (such as OpenAPI Generator for APIs) with our semantics-centered approach to defining mappings. We will also explore new territory by formally mapping an API call to the data change that depends on it, based on the data’s grammar. By referring to the grammar (which represents all valid data instances), we will be able to define state changes in a consistent, generic way, allowing us to bring the automation of transformations (bridges) between data silos to a completely new level. In addition, we strive to provide machine learning tools to assist the mapping process in use cases that lack highly qualified domain experts.
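To give a feel for the interlingual idea, here is a minimal Python sketch, not our actual tooling. It assumes a tiny central semantic model with two terms and two made-up XML syntaxes (`syntax_a`, `syntax_b`); the term names, element paths, and function names are all hypothetical and much simpler than real EN16931 bindings. Each syntax is bound once to the semantic model, and a transformation between any two syntaxes is obtained by reading into the model and writing out again.

```python
import xml.etree.ElementTree as ET

# Hypothetical syntax bindings: semantic term -> element path per syntax.
# Paths are simplified, namespace-free placeholders for illustration only.
BINDINGS = {
    "syntax_a": {"invoice_number": "ID", "issue_date": "IssueDate"},
    "syntax_b": {"invoice_number": "Document/Number", "issue_date": "Document/Date"},
}


def read(xml_text, syntax):
    """Lift an XML document into the syntax-neutral semantic model (a dict)."""
    root = ET.fromstring(xml_text)
    return {term: root.findtext(path) for term, path in BINDINGS[syntax].items()}


def write(model, syntax, root_tag):
    """Serialize the semantic model into the given syntax."""
    root = ET.Element(root_tag)
    for term, path in BINDINGS[syntax].items():
        node = root
        # Walk the path, creating intermediate elements only when missing.
        for step in path.split("/"):
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = model[term]
    return ET.tostring(root, encoding="unicode")


def transform(xml_text, source, target, root_tag):
    """Bridge two syntaxes by going through the central semantic model."""
    return write(read(xml_text, source), target, root_tag)


source_doc = "<Invoice><ID>INV-1</ID><IssueDate>2021-10-01</IssueDate></Invoice>"
bridged = transform(source_doc, "syntax_a", "syntax_b", "Record")
print(bridged)
```

The point of the design is that adding a further syntax only requires one more entry in `BINDINGS`, rather than a separate hand-written transformation to every syntax already supported.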
More information
You can find more information and the source code at https://github.com/DAPSI-IDISS/