Data Analytics is the process of identifying structure in collections of data. The data can be of many types, each of which has its own demands.
The most basic division of data is into the measurement types: nominal (categorical), ordinal, interval and ratio. An analytics package needs to apply different functions to each of these types to produce a correct analysis. Clinical work is dominated by categorical and ratio variables, which require different operators both for retrieving data records from data sites and for aggregating data values.
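As a rough illustration of why the measurement type has to drive the choice of operator, the sketch below allows only those summaries that are meaningful for each type. The names MeasurementType, ALLOWED_SUMMARIES and summarise are hypothetical and are not part of HLA's software; ordinal values are assumed to be coded as ranks.

```python
from enum import Enum
from statistics import geometric_mean, mean, median_low, mode


class MeasurementType(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


# Summaries that are meaningful for each measurement type.
# Each type inherits the summaries of the types listed before it.
ALLOWED_SUMMARIES = {
    MeasurementType.NOMINAL: {"mode": mode},
    MeasurementType.ORDINAL: {"mode": mode, "median": median_low},
    MeasurementType.INTERVAL: {"mode": mode, "median": median_low, "mean": mean},
    MeasurementType.RATIO: {
        "mode": mode,
        "median": median_low,
        "mean": mean,
        "geometric_mean": geometric_mean,
    },
}


def summarise(values, measurement_type, summary):
    """Apply a summary only if it is meaningful for the measurement type."""
    allowed = ALLOWED_SUMMARIES[measurement_type]
    if summary not in allowed:
        raise ValueError(
            f"'{summary}' is not meaningful for {measurement_type.value} data"
        )
    return allowed[summary](values)


# Illustrative values only, not real patient data.
blood_groups = ["A", "O", "O", "B"]   # nominal
weights_kg = [60.0, 80.0]             # ratio

print(summarise(blood_groups, MeasurementType.NOMINAL, "mode"))  # O
print(summarise(weights_kg, MeasurementType.RATIO, "mean"))      # 70.0
```

Asking for the mean of the blood groups raises a ValueError instead of returning a meaningless number, which is the kind of guard an analytics package needs for categorical variables.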
Data semantics are equally important to effective data analytics, and relating the correct measurement type to the correct semantics and the correct operator requires careful design and decision making.
Understanding the complete taxonomy of data analytics for clinical work is a craft that receives little attention in today’s clinical practice. HLA’s Data Analytics taxonomy helps expose the true richness of the field and the need for software to be more comprehensive than it has been in the past. Data Analytics is a much more comprehensive function, one that enables the user to identify structure in the chaos of many data records of seemingly disparate content.
HLA’s DA provides for five types of analytics: ad hoc questioning, hypothesis testing, scientific test assessment, semantic text extraction and predictive modelling.
An ad hoc query meets the need to ask a question once without expecting to ask it again. The variables used in an ad hoc query need to be readily recognisable, and the clinician needs to be able to frame the query easily and to recognise it as semantically well formed. If a given ad hoc question turns out to be needed often enough in routine work, the user might develop an equivalent selective report.
Hypothesis Testing is the standard process of statistically evaluating a comparison between two groups. It is assumed that the two groups can be separated by a small number of non-confounding variables.
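A two-group comparison of this kind can be sketched with an exact permutation test on the difference in means. This is one of several possible tests and is chosen here only because it needs nothing beyond the Python standard library; the text does not say which tests HLA's software applies. The function name and the example values are illustrative assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Returns the fraction of all possible regroupings of the pooled values
    whose absolute mean difference is at least as large as the observed one.
    Only practical for small groups, because every regrouping is enumerated.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Illustrative values only, not real patient data.
treated = [1, 2, 3]
control = [4, 5, 6]
print(permutation_test(treated, control))  # 0.1
```

With three values per group there are 20 possible regroupings, and only the observed split and its mirror image are as extreme, giving a p-value of 2/20 = 0.1.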
Scientific Studies is the task of framing a set of questions that identify multiple cohorts in a systematic study, and of statistically testing how comparative variables vary between the groups.
Semantic Concept Identification uses categorical variables defined in SNOMED CT to identify semantic concepts in text. It adds to the other types of analytics by searching the free-text components of the clinical record to retrieve concepts that are then used as data in the evaluation. The recognition of the free-text concepts is a statistical NLP process that uses SNOMED CT descriptions as a representation of the semantics of the variables to be extracted. The accuracy of concept recognition is a function of the level of correspondence between the writing in the notes and the phrasal formations in SNOMED CT concept descriptions.
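The dependence of recognition accuracy on the match between the wording of the notes and the wording of the concept descriptions can be shown with a deliberately simplified sketch. It scores each concept by the fraction of description words found in a note. This is plain token overlap, not the statistical NLP process described above, and it ignores negation and word order. The concept identifiers are placeholders and not real SNOMED CT codes; the description strings and function names are assumptions made for the example.

```python
import re

# Hypothetical concept table. The keys are placeholders, NOT SNOMED CT
# identifiers; the strings stand in for concept descriptions and synonyms.
CONCEPT_DESCRIPTIONS = {
    "CONCEPT_A": ["myocardial infarction", "heart attack"],
    "CONCEPT_B": ["diabetes mellitus"],
}


def tokens(text):
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def match_concepts(note, threshold=1.0):
    """Return concepts whose best description overlap reaches the threshold.

    The score for a description is the fraction of its words that
    appear anywhere in the note.
    """
    note_tokens = tokens(note)
    hits = {}
    for concept_id, descriptions in CONCEPT_DESCRIPTIONS.items():
        best = max(
            len(tokens(d) & note_tokens) / len(tokens(d)) for d in descriptions
        )
        if best >= threshold:
            hits[concept_id] = best
    return hits


note = "Patient admitted after a heart attack; sister has diabetes."
print(match_concepts(note))                 # {'CONCEPT_A': 1.0}
print(match_concepts(note, threshold=0.5))  # {'CONCEPT_A': 1.0, 'CONCEPT_B': 0.5}
```

The note says "diabetes" where the description says "diabetes mellitus", so the second concept is found only when the threshold is relaxed. Lowering the threshold recovers more concepts but also admits weaker matches, which is the trade-off the last sentence of the paragraph above points to.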
Predictive Modelling is the construction of evidence-based models used for determining patient prognoses.