A particular focus of our research within data science is on rich and complex data sources, such as sequential and temporal data, histogram data, text, and graphs. Specifically, we are interested in building indexing structures for efficient search in large, complex databases, predictive models for time series and sequence classification, and methods for subgroup and rule discovery in transactional and sequential data.

Another focus of our research within data science is on ensemble methods, i.e., techniques for generating sets of models that collectively form predictions by voting, and on methods for generating interpretable models, e.g., rule learning. Interpretability has gained more attention in recent years, as data science methods and models have come to be used to a larger extent in both industry and society as a whole. It can be addressed at the model level, i.e., by providing a description of the whole model to the human, or at the instance level, i.e., by explaining the reasons and motivation behind each individual decision. Many aspects of a model relate to interpretability: stability, size, dimensionality reduction, and visualization, to name a few.

Moreover, text mining is one of our focus areas. In particular, we work on efficient and resource-lean methods that use language technology for very large text collections, as well as on semantic analysis, e.g., of negation, speculation, and temporality, in order to extract situation-specific, accurate, and relevant information from texts.

One main application area for the research at the department is healthcare analytics, which aims to provide efficient and effective decision support for healthcare and pharmaceutical research. The research group has collaborated for several years with computational chemists in the pharmaceutical industry. This has resulted in new techniques and tools for building predictive models from observed biological activities, e.g., toxicity, of chemical compounds, which are currently being used in the industry.

Another application area that we are involved in is integrated vehicle health monitoring (IVHM) of heavy trucks. The focus has been to investigate how to predict a vehicle's health status by estimating the remaining life of its components, in order to support better decisions, for example: (1) using health status to schedule maintenance so that unplanned downtime is minimized, (2) creating a system that optimizes maintenance plans based in part on health status and customer preferences, and (3) creating a system that provides decision support to drivers and fleet planners on the utilization of vehicles.