DEPEND | gnTEAM

Mining free-text patient feedback comments

As part of the NIHR-funded project Developing and Enhancing the Usefulness of Patient Experience and Narrative Data (DEPEaND), we have developed text mining software to analyse the themes and sentiment expressed in free-text patient feedback comments on services (e.g. the Friends and Family Test).

Topic-specific opinion mining techniques have been applied to extract commonly mentioned themes from patient comments and to detect the polarity associated with each theme. Following an initial manual inspection of a small sample, we focused on four main themes (staff attitude, quality of care, waiting time and environment) and their associated sentiment (positive and negative/neutral). Two machine-learning methods have been developed (using Python and R); both first segment the patient comments and then predict the themes (and sentiment) using various machine learning algorithms. The system can also combine the outputs of the two methods using a probability-threshold technique, and show the top comments for each theme, i.e. those for which its prediction is most confident. The approach has been tested in two clinical settings: a general hospital and a mental health trust.
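The pipeline described above (segment, predict, combine, rank) can be illustrated with a minimal Python sketch. It is not the project's code: the function names `segment`, `combine` and `top_comments`, the splitting rule, the 0.6 threshold and the rule of keeping the more confident of the two predictions are all assumptions made for illustration, and the two classifiers are represented only by their (theme, probability) outputs.

```python
import re
from typing import List, Optional, Tuple

# A prediction is a (theme label, probability) pair, as a classifier might return.
Prediction = Tuple[str, float]


def segment(comment: str) -> List[str]:
    """Naive segmentation (assumed rule): split on sentence punctuation and 'but'."""
    parts = re.split(r"[.!?;]+|\bbut\b", comment, flags=re.IGNORECASE)
    return [p.strip(" ,") for p in parts if p.strip(" ,")]


def combine(pred_a: Prediction, pred_b: Prediction,
            threshold: float = 0.6) -> Optional[Prediction]:
    """Keep the more confident of two predictions if it reaches the threshold.

    Returns None when neither system is confident enough (assumed behaviour).
    """
    best = max(pred_a, pred_b, key=lambda p: p[1])
    return best if best[1] >= threshold else None


def top_comments(labelled: List[Tuple[str, Prediction]], theme: str,
                 k: int = 3) -> List[Tuple[str, float]]:
    """Return up to k segments predicted as `theme`, most confident first."""
    hits = [(text, prob) for text, (label, prob) in labelled if label == theme]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:k]


# Usage with made-up probabilities standing in for the two trained models.
segments = segment("The nurses were lovely but the wait was far too long.")
system_a = [("staff attitude", 0.91), ("waiting time", 0.55)]
system_b = [("staff attitude", 0.78), ("waiting time", 0.83)]

labelled = []
for text, a, b in zip(segments, system_a, system_b):
    merged = combine(a, b)
    if merged is not None:
        labelled.append((text, merged))

print(top_comments(labelled, "waiting time"))
```

Keeping the threshold as a parameter makes the trade-off explicit: a higher value shows fewer comments but with more confident predictions, whereas a lower value covers more comments at the cost of more errors.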

The methodology is explained in the paper “Segmentation-based mining of free-text patient feedback comments” (in preparation).

The code to train the software is available here. The user manual is available here. Note that an annotated corpus is needed to generate the models.

The output of the software can be further analysed (using Stata) and visualised (using LaTeX) to automatically generate reports containing the main themes, sentiment and most representative comments over a period of time. The software for this is available here (Stata code) and here (LaTeX). Note that Stata requires a licence.

Funded by: NIHR

Please contact Goran Nenadic.