Mining term associations and events from bio-literature
This is a long-term project that aims to develop text mining methods that support efficient and sophisticated knowledge acquisition, offer plausible hypotheses for testing, prevent unnecessary repetition of previous work, and help with experimental design for specific research scenarios. We investigate text mining approaches to establishing literature-based associations and links among biological entities such as proteins, genes, species, cells, and experiments. The work was partially funded by BBSRC (“Mining term associations to support knowledge discovery in biology”) to explore suitable technologies for modelling user-elicited biological text mining scenarios in support of hypothesis generation. It builds on a previous BBSRC project (“Protein Functional Classification using Text Data-mining”), which developed automatic text-based classification of proteins into functional categories (based on the Gene Ontology) using machine learning techniques and a range of textual features.
We are specifically interested in the extraction of molecular events, including gene expression (see GETM), positive and negative regulation, and binding (see BioNLP as well as “Mining molecular events and their context” below).