Research Motivation and Objectives

The German government has developed a high-tech strategy called “Industry 4.0”, which shifts the industrial technology paradigm by building on key technologies: cyber-physical systems, the internet of things, the internet of services, and people. This shift aims to interconnect humans and cyber-physical systems (i.e., physical objects, hardware, and software) and let them collaborate and exchange data via the internet of things and services. However, these cyber-physical systems are generally developed by different vendors and may therefore use different structures, formats, and vocabularies for exchanging data. Since the human stakeholders may also have interdisciplinary backgrounds, interoperability problems can arise both among the systems and among these stakeholders.

Each node in the internet of things generates data and is accessible via the internet. It is expected that there will be billions of such nodes worldwide. However, not all the data these nodes provide are useful for solving a given problem, which creates a new risk: drowning in data while starving for knowledge. These big data come from heterogeneous sources, so ensuring their quality, i.e., whether they are in a consistent form and able to fulfill the applications’ requirements, is another challenge.

Ontologies, Data FAIRification, Knowledge Graph and Linked Data

This approach employs a knowledge/information model to integrate heterogeneous schemas and data sources. The W3C standard OWL (Web Ontology Language) is used as the model representation, serving as a shared vocabulary and schema among the heterogeneous systems and humans. An ontology allows the conceptualization and formalization of elements of human knowledge, thus creating a model understandable to both humans and machines. OWL ontologies offer the flexibility to express logical statements through description logics and the possible integration of rules. Therefore, logical reasoning, including fuzzy reasoning, is also possible in the knowledge model.
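
The role of the ontology as a shared vocabulary can be illustrated with a minimal, plain-Python sketch (the class names are hypothetical, and the dictionary stands in for what an OWL reasoner infers from subclass axioms): two vendors describe the same device with different terms, and the shared class hierarchy reconciles them.

```python
# T-Box: subclass axioms of a shared ontology (all names illustrative).
subclass_of = {
    "Thermocouple": "TemperatureSensor",
    "TemperatureSensor": "Sensor",
    "Sensor": "CyberPhysicalSystem",
}

def superclasses(cls):
    """Return all transitive superclasses of a class -- a tiny piece of the
    reasoning an OWL reasoner performs automatically."""
    result = set()
    while cls in subclass_of:
        cls = subclass_of[cls]
        result.add(cls)
    return result

# Vendor A calls its device a "Thermocouple"; vendor B queries for "Sensor".
# The shared hierarchy lets both vocabularies refer to the same concept.
print("Sensor" in superclasses("Thermocouple"))
```

In a real deployment, the hierarchy would of course live in an OWL file and be queried through a reasoner rather than a dictionary; the sketch only shows why a shared T-Box resolves vocabulary mismatches.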

Data FAIRification aims to make data comply with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles by extracting the semantics of heterogeneous data sources, e.g., relational (SQL), semi-structured, and unstructured (natural language texts, images, videos), based on the concepts defined in the ontology. Depending on the data source, different methods have been developed for this semantic uplift, such as model-to-model transformation, similarity-based model matching, machine learning, natural language processing, and object detection from images.
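
For the relational case, semantic uplift can be sketched as follows: rows of a SQL table are lifted to subject–predicate–object triples whose predicates come from ontology concepts. The table, column-to-property mapping, and `ex:` vocabulary below are illustrative stand-ins, not taken from a specific project.

```python
import sqlite3

# A toy relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE machine (id INTEGER, name TEXT, power_kw REAL)")
conn.execute("INSERT INTO machine VALUES (1, 'Press-01', 7.5)")

# Mapping from relational columns to ontology properties (assumed vocabulary).
column_to_property = {"name": "ex:hasName", "power_kw": "ex:hasRatedPower"}

def uplift(row):
    """Lift one relational row to RDF-style triples."""
    subject = f"ex:Machine_{row[0]}"
    triples = [(subject, "rdf:type", "ex:Machine")]
    for col, value in zip(["name", "power_kw"], row[1:]):
        triples.append((subject, column_to_property[col], value))
    return triples

triples = []
for row in conn.execute("SELECT id, name, power_kw FROM machine"):
    triples.extend(uplift(row))

print(triples)
```

Production pipelines would typically use a mapping language such as R2RML for this step; the sketch only makes the column-to-concept mapping idea concrete.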

Machine learning and explainable AI (XAI)

Machine learning techniques, such as regression, association analysis, classification, artificial neural networks, and clustering, are artificial intelligence approaches for identifying interesting patterns and dependencies in data. The information or knowledge extracted through machine learning is annotated with ontology elements or converted to rules, which are then integrated into the knowledge graph. Because the rules gained from machine learning are not 100% valid, they are transformed into fuzzy rules that allow fuzzy assertions. The rules are integrated into the knowledge graph by expressing them in the Semantic Web Rule Language (SWRL).
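
The rule-to-SWRL step can be sketched as below. The rule, its confidence value, the predicate names, and the `[degree=...]` annotation syntax for the fuzzy extension are all illustrative; only the `swrlb:` built-in prefix follows the SWRL standard.

```python
def to_fuzzy_swrl(antecedents, consequent, confidence):
    """Render (attribute, operator, value) conditions plus a consequent as a
    SWRL-style rule string, attaching the learned rule's confidence as a
    fuzzy degree."""
    body = " ^ ".join(
        f"{attr}(?x, ?v{i}) ^ swrlb:{op}(?v{i}, {val})"
        for i, (attr, op, val) in enumerate(antecedents)
    )
    return f"{body} -> {consequent}(?x) [degree={confidence:.2f}]"

# A hypothetical rule mined from sensor data with 87% confidence.
rule = to_fuzzy_swrl(
    [("hasTemperature", "greaterThan", 80), ("hasVibration", "greaterThan", 5)],
    "AtRiskOfFailure",
    confidence=0.87,
)
print(rule)
```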


We developed an XAI approach that links machine learning models to the knowledge graph, which acts as an explainer. Nodes of the knowledge graph are used to annotate and describe the inputs, outputs, and hidden layers of black-box models. The knowledge graph explains the relationships between inputs, outputs, and training data, bringing the machine learning models to the right level of semantics and interoperability. Furthermore, my team and I are currently developing a metric to assess the explainability of machine learning models. The metric provides a holistic perspective on explainability with straightforward indicators in the form of indices. It covers multiple criteria, such as clarity, completeness, soundness, simplicity, and broadness, and measures different XAI approaches, such as model-based, attribute-based, and example-based explanations. Using the metric, the developed knowledge graph-based XAI will be compared to approaches such as SHAP, LIME, and DeepLIFT.
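
One simple way such criteria can be combined into a single index is a weighted aggregate of per-criterion scores. The criterion names follow the text, but the aggregation form, weights, and scores below are purely illustrative, not the metric under development.

```python
def explainability_index(scores, weights):
    """Weighted average of criterion scores, each in [0, 1]."""
    total_weight = sum(weights[c] for c in scores)
    return sum(scores[c] * weights[c] for c in scores) / total_weight

# Hypothetical per-criterion scores for one XAI method.
scores = {"clarity": 0.8, "completeness": 0.6, "soundness": 0.9,
          "simplicity": 0.7, "broadness": 0.5}
# Hypothetical weights emphasizing clarity and soundness.
weights = {"clarity": 2, "completeness": 1, "soundness": 2,
           "simplicity": 1, "broadness": 1}

index = explainability_index(scores, weights)
print(round(index, 3))  # a single index summarizing the criteria
```

Competing XAI methods could then be ranked by this one number while the per-criterion scores remain available for drill-down.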

We also developed causal machine learning approaches to extract causal relationships, rather than mere correlations, from data. This is essential because extracting actionable insights from data is only possible on the basis of causality. Causal machine learning offers better explainability than conventional XAI approaches such as post-hoc explanation (e.g., LIME, SHAP) because of its ante-hoc explainability: domain experts are involved in generating causal explanations (a causal graph) before the models are built. Our research annotates the causal graph with the knowledge graph that models domain expert knowledge.
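
Why causality rather than correlation matters can be shown with a toy example of backdoor adjustment on synthetic data: a confounder Z drives both treatment T and outcome Y, so the naive correlational effect of T on Y is inflated, while stratifying on Z recovers the true per-unit effect. The data and variable roles below are invented for illustration.

```python
# Synthetic records (z, t, y): within each stratum of z, t raises y by 1,
# but z also pushes both t and y up, confounding the naive comparison.
data = [(0, 0, 1), (0, 0, 1), (0, 1, 2),
        (1, 0, 4), (1, 1, 5), (1, 1, 5)]

def mean(xs):
    return sum(xs) / len(xs)

# Naive (correlational) effect: E[Y | T=1] - E[Y | T=0]
naive = mean([y for z, t, y in data if t == 1]) - \
        mean([y for z, t, y in data if t == 0])

# Backdoor-adjusted effect: average within-stratum effects, weighted by P(Z)
adjusted = 0.0
for z0 in {0, 1}:
    stratum = [(t, y) for z, t, y in data if z == z0]
    effect = mean([y for t, y in stratum if t == 1]) - \
             mean([y for t, y in stratum if t == 0])
    adjusted += effect * len(stratum) / len(data)

print(naive, adjusted)  # the adjusted effect is half the naive one here
```

The causal graph contributed by domain experts determines *which* variables (here Z) must be adjusted for; that is the knowledge the approach described above encodes in the knowledge graph.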

Data-driven simulation and optimization

The data-driven simulation and optimization approach has been used, for example, to improve production planning and scheduling, to optimize product configuration, and to optimize decisions on buying or selling energy. Existing exact optimization approaches cannot solve large problems within a reasonable time. Meta-heuristics, by contrast, work on general models that do not correspond to reality. For this reason, a hyper-heuristic optimization approach was developed, yielding a flexible yet still problem-relevant model. The hyper-heuristic approach allows for incorporating different user-defined heuristic strategies, optimization objectives, constraints, and other configurations from the ontology model. The model is also generated or adjusted using machine learning algorithms. We are also currently developing reinforcement learning approaches to solve optimization problems, for example, in the Delfine project.
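
The core idea of a selection hyper-heuristic can be sketched as follows: instead of one fixed meta-heuristic, the algorithm chooses among user-defined low-level heuristics according to their recent success. The toy objective and the two heuristics below are stand-ins; in the approach described above they would be configured from the ontology model.

```python
import random

random.seed(0)  # deterministic toy run

def shift(x):      # low-level heuristic 1: small move
    return x + random.choice([-1, 1])

def jump(x):       # low-level heuristic 2: large move
    return x + random.choice([-5, 5])

def objective(x):  # toy objective: minimize distance to a target value
    return abs(x - 17)

heuristics = [shift, jump]
scores = [1.0, 1.0]          # selection weights, updated on improvement
x = 0
for _ in range(200):
    i = random.choices(range(len(heuristics)), weights=scores)[0]
    candidate = heuristics[i](x)
    if objective(candidate) <= objective(x):   # accept non-worsening moves
        if objective(candidate) < objective(x):
            scores[i] += 1.0                   # reward the useful heuristic
        x = candidate

print(x, objective(x))
```

The reward mechanism (here a simple score increment) is where learned knowledge about which heuristic works well on which problem instance accumulates.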

Semantic Enrichment of Geometry Data to Improve Visualization and Interaction in VR/AR environment

We developed the OntoCAD tool (see Häfner, P.; Häfner, V.; Wicaksono, H.; Ovtcharova, J. (2013)), which extracts semantic information from CAD drawings in a semi-automatic way. Pattern matching and classification algorithms are applied to the drawing primitives from CAD files to extract the semantic information. The resulting semantic information is then mapped to the corresponding classes of a T-Box ontology. Finally, individuals of the corresponding classes are created to populate the ontology, and their geometric properties, such as world-coordinate position and bounding box, are set. This approach enables geometry data to be linked to, or annotated with, semantic information. Using this semantic information, the relationships between objects can be easily modeled for visualization, and complex interaction rules can be modeled using SWRL or ontology axioms.
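
The pipeline above can be sketched in simplified form (illustrative data structures and toy classification rules, not the OntoCAD implementation): drawing primitives are classified by pattern rules, matched to T-Box classes, and materialized as A-Box individuals carrying their geometric properties.

```python
# Drawing primitives extracted from a CAD file (hypothetical examples).
primitives = [
    {"id": "p1", "type": "circle", "radius": 0.4, "position": (2.0, 3.0)},
    {"id": "p2", "type": "rect", "width": 0.9, "height": 2.1,
     "position": (5.0, 0.0)},
]

def classify(p):
    """Toy pattern-matching rules standing in for the real classifiers."""
    if p["type"] == "circle" and p["radius"] < 0.5:
        return "Valve"          # T-Box class (illustrative)
    if p["type"] == "rect" and p["height"] > 2 * p["width"]:
        return "Door"
    return "UnclassifiedShape"

# Populate the A-Box: one individual per primitive, typed and georeferenced.
abox = []
for p in primitives:
    abox.append({
        "individual": f"{classify(p)}_{p['id']}",
        "rdf:type": classify(p),
        "hasPosition": p["position"],
    })

print([ind["individual"] for ind in abox])
```

Once primitives are individuals with positions, spatial relations such as adjacency or containment, and hence VR/AR interaction rules, can be stated over them rather than over raw geometry.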

Applications of the approach in an interdisciplinary setting

The approaches described above have been applied to different processes across value chains in multiple industry sectors, for example:

  • to facilitate data interoperability and improve the explainability of AI used in digital twin for the transportation sector (see Talenta project)
  • to measure the interpretability/explainability of predictive models for demand and yield prediction of agricultural products and to improve the interpretability/explainability of the models for non-expert users such as farmers and distributors (see xAgri project)
  • to extract the causality of external and internal processes in the supply chain of the automotive industry (see causalSky project)
  • to forecast the electricity demand, to predict the availability of green electricity, and to optimize the use of green electricity in manufacturing (see Delfine project)
  • to predict and find the product configuration that matches customers’ needs through machine learning and an ontology-based recommendation system (see DIALOG project)
  • to predict energy consumption behavior and to improve resource and energy efficiency in buildings (see KEHL, KnoHolEM, SERUM, and SWIMing projects)
  • to improve resource and energy efficiency in production (see WertProNET, wEnPro, and ecoBalance projects)
  • to improve energy efficiency and to create an optimized service-oriented business model in a smart city context (see DAREED project).