Investigating Data Quality Aspects of Question and Answer Reports
As the quantity of available data increases, the level of “quality” varies significantly, and this becomes a critically important factor for the effectiveness of organisations and individuals. Most business and scientific data is represented in unstructured and semi-structured formats. However, most current data quality methodologies work solely on structured data, from a conceptual perspective. Furthermore, question and answer reports are gaining momentum as a way to collect responses that can be used by data brokers, for instance in business (customer satisfaction reports and FAQs). However, these reports suffer from many data quality issues that affect their performance and efficiency in use. Therefore, we have been working on developing a data quality methodology, with an associated data quality assistant tool, that can improve the data quality of these reports. It analyses the reports linguistically in order to track and identify data quality problems before the reports are deployed into a data store system or used for data analysis. This is Mona’s PhD work, with G. Nenadic and B. Theodoulidis as supervisors.