Project leader - main PI
Panagiotis Papapetrou, Professor, SU


  • Cristian Rojas, Associate Professor, KTH
  • Lars Asker, Associate Professor, DSV, SU
  • Rami Mochaourab, Researcher, RISE
  • Stanley Greenstein, Senior Lecturer, Dept. of Law, SU


  • Ioanna Miliou, Postdoctoral researcher, DSV, SU
  • Ioannis Pavlopoulos, Senior Lecturer (fixed-term), DSV, SU
  • Isak Samsten, Senior Lecturer, DSV, SU
  • Luis Quintero, DSV, SU
  • Sugandh Sinha, RISE
  • Vasiliki Kougia, Research Assistant, DSV, SU
  • Zhendong Wang, PhD student, DSV, SU


Project period 2020-01-01 - 2023-12-31
Funding Digital Futures




This project aims to build a novel data management and analytics framework resting on three pillars: (1) data integration and federated learning, (2) explainable machine learning, and (3) legal and ethical integrity of predictive models. The final product will comprise a set of methods and tools for integrating massive and heterogeneous medical data sources in a federated manner, and a set of predictive models for learning from these data sources. The models will emphasize interpretability and explainability of the rationale behind their predictions, while maintaining ethical integrity and fairness in the underlying decision-making mechanisms that govern machine learning. The project will focus on two critical application areas: adverse drug event detection and heart failure treatment. It is a collaborative effort between four research institutions: the Department of Computer and Systems Sciences at Stockholm University, the Department of Law at Stockholm University, RISE Research Institutes of Sweden, Division ICT, and the Department of Automatic Control, School of Electrical Engineering and Computer Science at KTH. This is a continuation of the EXTREME pilot project.

The main objectives of the project include:

Objective 1: Unified data representation and integration. We will define novel unifying space representations, similarity measures, and methods for searching and indexing large and complex data spaces. The central challenge is the temporal nature of the data spaces and the inherent temporal dependencies that may exist within the same data source and across different data sources in these spaces. Particular emphasis will be placed on providing theoretical guarantees on the performance of the proposed indexing techniques in terms of retrieval accuracy and efficiency.

Objective 2: Explainable predictive models. We will develop novel predictive modeling mechanisms for combining and enhancing the aggregate knowledge from heterogeneous data sources, with particular emphasis on the temporal properties of the data. The main challenge will be how to extract and fuse meaningful static and temporal features from multiple data sources, with a focus on sequential and temporal data [Xing 2011]. The constructed models will be made interpretable to domain experts by employing explainable features and rules.

Objective 3: Legal and ethical implications of machine learning models. We will focus on the legal and ethical risks, implications, and potential harms resulting from the development and use of predictive modelling in the analysis of healthcare data. To this end, we will embed legal and ethical considerations into existing predictive modelling schemes, thereby making them better aligned with regulatory and policy demands.

Implementation: The project is organized into five implementation work packages (WPs): one for each of the three objectives (WP1, WP2, WP3), one for validation on real data sources (WP4), and one for dissemination and exploitation (WP5). Project coordination (WP6) is handled by SU-DSV.

  • WP1: unifying representation of complex data spaces
  • WP2: explainable machine learning models
  • WP3: adherence to legal and ethical frameworks
  • WP4: validation in real data domains
  • WP5: dissemination and exploitation
  • WP6: project management

The project is scheduled to last four years, from January 2020 to December 2023, with a total budget of 8.4M SEK.