Mining text unveils patterns in the brain
Spotting patterns in health across a population is a complex task, but sophisticated language technology programmes are delivering increasingly impressive results.
The WARBLER project is Scotland’s first nation-wide text mining project aiming to analyse radiology reports from across the country.
Led by neurologist Prof. William Whiteley, it is supported by Dr Beatrice Alex, Senior Lecturer and DDI Chancellor’s Fellow at the Edinburgh Futures Institute, and members of the Edinburgh Clinical Natural Language Processing Group.
Dr Alex uses natural language processing (NLP) to spot patterns in the text accompanying brain scans, leading to the identification of groups of patients with similar phenotypes.
“It means we analyse text automatically, and detect and extract information and patterns related to conditions like strokes and brain tumours,” Dr Alex explained.
“The project is to analyse radiology reports written by radiologists which describe what’s seen in the image. Our goal is to extract information and report on different types of brain disease, where they are in the brain, when they happened in time, and where they happened in the brain.”
The team began with a pilot to develop a text processing pipeline. They called it EdIE-R and it is designed to extract and label phenotypes from the reports.
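As a rough illustration of what "extracting and labelling" can mean in practice, the sketch below tags each sentence of a report with a finding type, a brain location, a coarse time label, and whether the finding is negated. It is a toy keyword matcher written for this example, not EdIE-R itself: the article does not describe how EdIE-R works internally, and the term lists, function names and output format here are all assumptions.

```python
import re

# Hypothetical keyword lists for illustration; they are not taken from EdIE-R.
FINDING_TERMS = {
    "ischaemic stroke": ["infarct", "ischaemic stroke"],
    "haemorrhage": ["haemorrhage", "haematoma"],
    "tumour": ["tumour", "glioma", "meningioma"],
}
LOCATION_TERMS = ["frontal", "parietal", "temporal", "occipital", "cerebellar"]
TIME_TERMS = {
    "recent": ["acute", "recent"],
    "old": ["old", "chronic", "established"],
}
NEGATION = re.compile(r"\b(no evidence of|no|without)\b")


def label_sentence(sentence):
    """Return one labelled finding per finding type mentioned in the sentence."""
    text = sentence.lower()
    locations = [loc for loc in LOCATION_TERMS if re.search(rf"\b{loc}\b", text)]
    time = next(
        (label for label, terms in TIME_TERMS.items()
         if any(re.search(rf"\b{t}\b", text) for t in terms)),
        None,
    )
    findings = []
    for label, terms in FINDING_TERMS.items():
        for term in terms:
            # Prefix match so "infarct" also covers "infarcts" and "infarction".
            match = re.search(rf"\b{re.escape(term)}", text)
            if match:
                # Crude negation check: a negation cue anywhere before the term.
                negated = bool(NEGATION.search(text[:match.start()]))
                findings.append({"finding": label, "negated": negated,
                                 "locations": locations, "time": time})
                break
    return findings


def label_report(report):
    """Split a report into sentences and collect the findings from each one."""
    sentences = re.split(r"(?<=[.!?])\s+", report.strip())
    return [finding for s in sentences for finding in label_sentence(s)]


if __name__ == "__main__":
    example = ("There is an acute infarct in the left frontal lobe. "
               "No evidence of haemorrhage.")
    for finding in label_report(example):
        print(finding)
```

A production system would need far more than keyword lists, for example handling of spelling variants, abbreviations and uncertain language, which is why pipelines of this kind are developed and tested against real datasets.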
This pipeline was originally developed on the Edinburgh Stroke Study data (a stroke register in Lothian). It was then further developed and tested on data from Generation Scotland (Lothian, Tayside, Fife, Greater Glasgow and Clyde, as well as Grampian), a consented cohort of more than 24,000 volunteers who have agreed to their data being used for research.
“We couldn’t have done this work without such datasets being available, to build and test an NLP system which is able to analyse electronic text,” Dr Alex said.
Recently, the team obtained access to the Scottish Medical Imaging dataset, which contains over 1.3 million radiology brain imaging reports for the Scottish population collected over the last 10 years. “We are now processing this data using the EdIE-R pipeline,” Dr Alex explained.
“Text mining of electronic healthcare records is notoriously challenging due to the data privacy issues involved, so we use Public Health Scotland’s national safe haven to process this data securely,” Dr Alex continued.
The Electronic Data Research and Innovation Service (eDRIS) is part of Public Health Scotland and is a secure environment for research on NHS Scotland patient data. Accessing this platform requires ethics approval, Patient and Public Involvement and Engagement (PPIE), and approval from NHS Scotland’s Public Benefit and Privacy Panel. Safe and appropriate use of data is vital for conducting research on electronic health records, so all this oversight is necessary.
The wider goal of the project is not just to extract the information in reports on brain images, but to link that information to other data recorded for patients. This could be blood pressure, BMI or even mental health data, when it is available. Linking the data in this way will allow medical staff and researchers to see the connection between mental health issues, like dementia, and strokes.
“This work is very much a team effort achieved by several members of the Edinburgh Clinical NLP Group, which I co-founded and lead,” emphasised Dr Alex. “It’s fantastic that we are now able to apply NLP to large datasets for the whole population of Scotland.”