The US subsidiary of Australian firm Health Language Analytics (HLA) has signed an agreement with California’s Public Health Institute (PHI) under which HLA’s Horizon machine learning platform will be used to extract and codify cancer data from pathology text reports for the California Cancer Registry (CCR) program.
The subsidiary is the US arm of HLA, which was set up in Sydney several years ago by well-known computer scientist and information technologist Jon Patrick to commercialise his clinical natural language processing technology.
HLA has since developed the Horizon platform, which is able to read unstructured clinical text in documents such as pathology reports, clinicians’ notes, radiology reports and discharge summaries and convert it into highly structured and coded clinical information.
The platform focuses on what Professor Patrick calls the “dark content”: the 80 per cent of clinical information held as unstructured text, which existing big data analytics can’t read and analyse.
It is particularly useful for disease registries such as the CCR, which has been the cornerstone of a substantial amount of research on cancer in the California population.
To date, the CCR has collected detailed information on over seven million cases of cancer among Californians diagnosed from 1988 forward, and more than 175,000 new cases are added annually.
Professor Patrick said the deal will be Horizon’s biggest single project to date and its first US contract. The technology is used for language processing services by a number of Australian hospitals and cancer registries.
He cited a 2015 research study showing that the natural language processing market will have numerous opportunities for growth, driven by factors such as the growing need for personalised medicine, rising disease awareness and healthcare investment.
“The volume of content in the medical record is 80 per cent text and so far no one has been able to mine it for useful purposes,” Professor Patrick said in a statement.
“Big data has actually only focused on 20 per cent of the medical record. Our work in ‘big text’ will enable the capability of big data to be effectively supercharged overnight to 100 per cent and enable a vastly greater range of research projects along with massively increased scale.”