DSV project leader
Panagiotis Papapetrou, Professor

Participants / project team at DSV
Lars Asker, Associate Professor, and Luis Eduardo Quintero, Research Assistant

Project period 2019-06-01 - 2020-05-31
Funding ICT-TNG (pilot project)

Description

In this project, we will propose a novel data management and analytics framework built on three pillars: (1) data integration, (2) explainable machine learning methods, and (3) ethical integrity of predictive models. The final product will be a set of methods and tools for integrating massive and heterogeneous medical data sources, together with a set of predictive models for learning from these sources. The models will emphasize interpretability and explainability of the rationale behind their predictions, while maintaining ethical integrity and fairness in the underlying decision-making mechanisms that govern machine learning. The project will focus on two critical application areas: adverse drug event detection and heart failure treatment. The project is a collaborative effort between four research institutions: the Department of Computer and Systems Sciences at Stockholm University, the Department of Law at Stockholm University, RISE Research Institutes of Sweden, Division ICT, and the Department of Automatic Control, School of Electrical Engineering and Computer Science at KTH.

 

The main objectives of the project include:

  • Objective 1: Unified data representation and integration. We will define novel unifying space representations, similarity measures, and methods for searching and indexing large and complex data spaces. The basic challenge is the temporal nature of the data spaces and the inherent temporal dependencies that may exist both within a single data source and across different data sources in these spaces. Particular emphasis will be placed on providing theoretical guarantees on the performance of the proposed indexing techniques in terms of retrieval accuracy and efficiency.

  • Objective 2: Explainable predictive models. We will develop novel predictive modelling mechanisms for combining and enhancing the aggregate knowledge from heterogeneous data sources, with particular emphasis on the temporal properties of the data. The challenge is how to extract and fuse meaningful static and temporal features from multiple data sources. The models will be made interpretable to domain experts by employing explainable features and rules.

  • Objective 3: Legal and ethical implications of machine learning models. We will focus on the legal and ethical risks, implications, and potential harms resulting from the development and use of predictive modelling in the analysis of healthcare data. To this end, we will embed legal and ethical considerations into existing predictive modelling schemes, thereby making them more amenable to regulatory and policy demands.

Implementation: The project is organized into five implementation work packages (WPs): one for each of the three objectives (WP1, WP2, WP3), one for validation on real data sources (WP4), and one for dissemination and exploitation (WP5). Project coordination (WP6) is handled by SU-DSV.

  • WP1: unifying representation of complex data spaces
  • WP2: explainable machine learning models
  • WP3: adherence to legal and ethical frameworks
  • WP4: validation in real data domains
  • WP5: dissemination and exploitation
  • WP6: project management

The project is scheduled to run for one year, from June 2019 to May 2020, with a total budget of 3.85 MSEK.