
Welcome to Epidemiological Text Mining!

Epidemiological studies are rich in information and are important sources for evidence-based medicine. However, as the number of published articles grows, epidemiologists find it increasingly difficult to recognise and aggregate key characteristics across related research.

The main aim of this project is to explore how text mining techniques can help epidemiologists identify information of interest, and detect and integrate key knowledge for further research.

We have developed a methodology for extracting and normalizing key epidemiological characteristics from all types of epidemiological research articles, in order to explore and aggregate concepts related to a health care problem.

More specifically:

  1. A generic rule-based approach was designed and implemented for the identification of six key characteristics: study design, population, exposure, outcome, covariate and effect size (see Characteristics).

  2. To facilitate knowledge integration and aggregation, the extracted characteristics are normalized and mapped to existing resources: an expanded version of the Ontology of Clinical Research (OCRe) and UMLS.
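
To give a feel for what a rule-based extraction step can look like, here is a minimal Python sketch. It is an illustration only, not the project's actual rule set: the patterns, the normalized labels and the function name `extract_characteristics` are all assumptions made for this example, and it covers just two of the six characteristics (study design and effect size). Mapping to OCRe or UMLS concepts is not shown.

```python
import re

# Hypothetical trigger patterns for study design. Each key is the normalized
# label that all matching surface forms are mapped to. These rules are
# illustrative and are not the ones used in the project.
DESIGN_PATTERNS = {
    "cohort study": re.compile(
        r"\b(?:prospective |retrospective )?cohort(?: study)?\b", re.I),
    "case-control study": re.compile(
        r"\bcase[- ]control(?: study)?\b", re.I),
    "cross-sectional study": re.compile(
        r"\bcross[- ]sectional(?: study| survey)?\b", re.I),
}

# Hypothetical pattern for an effect size mention such as "OR = 2.4".
# Case-sensitive on purpose, so the abbreviation "OR" is not confused
# with the word "or".
EFFECT_SIZE = re.compile(
    r"\b(?P<measure>odds ratio|relative risk|hazard ratio|OR|RR|HR)\b"
    r"\s*(?:=|of|:)?\s*(?P<value>\d+(?:\.\d+)?)"
)


def extract_characteristics(abstract):
    """Return normalized study designs and effect sizes found in an abstract."""
    designs = [label for label, pattern in DESIGN_PATTERNS.items()
               if pattern.search(abstract)]
    effects = [(m.group("measure"), float(m.group("value")))
               for m in EFFECT_SIZE.finditer(abstract)]
    return {"study_design": designs, "effect_size": effects}


# Made-up example sentence, used only to exercise the rules above.
example = ("In this prospective cohort study of adults, obesity was "
           "associated with type 2 diabetes (OR = 2.4).")
result = extract_characteristics(example)
```

In this sketch, normalization is simply the step of collapsing variants such as "prospective cohort" and "cohort study" onto one label. A fuller system would map each label to a concept identifier in a resource such as OCRe or UMLS.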


We focus on extracting these characteristics from epidemiological literature at the document level, since the aim of our method is to provide a summary of the key information presented in each study abstract.

In evaluation, our proposed methodology achieved an average micro F-score of 87% for the recognition of key characteristics and 91% accuracy for their normalization, suggesting reliable results.
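
For readers unfamiliar with the metric, the sketch below shows how a micro-averaged F-score is usually computed: true positives, false positives and false negatives are pooled across all characteristic types before precision and recall are calculated. It assumes the standard definition of micro-averaging. The function name `micro_f_score` and the counts are made up for illustration and are not the project's evaluation data.

```python
def micro_f_score(counts):
    """Micro-averaged F1 from per-category tp/fp/fn counts."""
    # Pool the counts over all categories first (this is what makes it "micro").
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only; these are NOT results from the project.
toy_counts = {
    "study design": {"tp": 8, "fp": 2, "fn": 2},
    "effect size": {"tp": 2, "fp": 0, "fn": 2},
}
score = micro_f_score(toy_counts)
```

Because the counts are pooled, categories with many mentions weigh more heavily in the final score than rare ones. A macro average would instead average the per-category F-scores.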

For more information see: Case study, Data, Extraction, Normalization, Evaluation, Contact.