HEALTH BANK, the Swedish Health Record Research Bank, is a unique research resource containing large sets of electronic patient records, for example the Stockholm EPR (Electronic Patient Record) Corpus. The corpus contains data from over 512 clinical units at Karolinska University Hospital, encompassing the years 2006‒2014 and over two million patients. The Stockholm EPR Corpus stems from the TakeCare electronic patient record system used at Karolinska University Hospital. All patient records are de-identified. This big data corpus contains both structured and unstructured information. The structured information comprises a serial number for each patient, age, gender, ICD-10 diagnosis codes, drugs, lab and blood values, and admission and discharge times and dates. The unstructured data consists of text written under different headings. The whole corpus contains over 500 million tokens.

The Stockholm EPR Corpus has been, and continues to be, used in a number of research projects carried out by the Clinical Text Mining Group. The research is approved by the Regional Ethical Review Board in Stockholm (Etikprövningsnämnden i Stockholm) under various research plans.

We have developed a number of clinical text mining tools based on the Stockholm EPR Corpus. Here is a description, in Swedish, of all annotated data in HEALTH BANK.