Mining molecular interaction data and its context
The project involves extracting the context of molecular interaction data from the scientific literature. So far, little attempt has been made to capture the context of a molecular interaction, such as how reliable it is and what the nature of the interaction is. The project aims to study the way findings, experiments and knowledge about molecular interactions are presented in the literature, and in particular how the contextual information that details molecular interactions is encoded and presented. The project implements a text mining framework to extract contextual information from full-text articles and link it with data in other resources, to support informed decisions when interpreting the complexity of interactions. The project is a collaboration with Pfizer, and the focus is therefore placed on pharmaceutically relevant data sets, including those on various pathogens such as HIV, hepatitis viruses and malaria.
In previous work (with M. Gerner, S. Farzaneh and C. Bergman) we developed BioContext, a system for extracting and integrating information about molecular processes from biomedical articles. Using the data extracted by BioContext, it is possible to get an overview of a range of biomolecular processes relating to a particular gene or anatomical location. The current project is part of Dan’s PhD, with Prof D. Robertson (Bioinformatics) and Drs B. Sidders (Pfizer) and G. Nenadic as supervisors.