Health Bank - Swedish Health Record Research Bank is a unique research infrastructure containing large sets of electronic patient records, for example the Stockholm EPR (Electronic Patient Record) Corpus. The corpus contains data from over 512 clinical units at Karolinska University Hospital, covering the years 2006‒2014 and over two million patients. The Stockholm EPR Corpus stems from the TakeCare electronic patient record system used at Karolinska University Hospital. All patient records are de-identified. This big data corpus contains both structured and unstructured information. The structured information comprises a serial number for each patient, age, gender, ICD-10 diagnosis codes, drugs, lab and blood values, and admission and discharge dates and times. The unstructured data consists of text written under different headings. The whole corpus contains over 500 million tokens.

The Stockholm EPR Corpus has been, and continues to be, used in a number of research projects carried out by the Clinical Text Mining Group. The research is approved by the Regional Ethical Review Board in Stockholm (Etikprövningsnämnden i Stockholm) under various research plans.

We have developed a number of clinical text mining tools based on the Stockholm EPR Corpus. A description (in Swedish) of all annotated data in Health Bank is available.

Contact: Hercules Dalianis, Professor, Director of Health Bank

To cite Health Bank, please use:

Dalianis, H., A. Henriksson, M. Kvist, S. Velupillai and R. Weegar. 2015. HEALTH BANK - A Workbench for Data Science Applications in Healthcare. Proceedings of the CAiSE-2015 Industry Track co-located with 27th Conference on Advanced Information Systems Engineering (CAiSE 2015), J. Krogstie, G. Juell-Skielse and V. Kabilan (Eds.), Stockholm, Sweden, June 11, 2015, CEUR, Vol-1381, urn:nbn:de:0074-1381-0, pp. 1-18.