Metaphors frame our thinking, and not always in obvious ways. According to speechwriter Simon Lancaster, “We use metaphors on average six times a minute or once every sixteen words” (You Are Not Human: How Words Kill, Biteback Publishing, 2018). Metaphors in the data-driven world often have analog origins. The use of ‘cc’ in email harks back to the ‘carbon copies’ produced by manual typewriters. And mining used to be done mostly underground, with shovels and pickaxes.
Today, text mining involves using digital tools to find patterns in text-based resources and to convert unstructured text into structured data. No hard hat required (usually). Beatrice Alex is a Chancellor’s Fellow at the Edinburgh Futures Institute, hosted by the School of Literatures, Languages and Cultures, and holds a PhD in Computational Linguistics funded by the ESRC and the Edinburgh Stanford Link. Dr Alex leads the Edinburgh Language Technology Group, a research and development group with longstanding expertise in text mining and natural language engineering. She is also a Research Fellow at the Institute for Language, Cognition and Computation at the University’s School of Informatics, and a Fellow at The Alan Turing Institute.
Dr Alex worked on Palimpsest, an AHRC-funded project geoparsing Edinburgh’s literature. In other words, Palimpsest mined literary texts for geographical references, down to street and building level, as a means of visualising Edinburgh’s literary places. Visualisation (another metaphor) is important for Dr Alex’s work. “It is vital to combine text mining with visualisation to be able to tell a story, and provide better access to and navigation through data,” she says. “In a lot of my previous work, the combination of text mining and visualisation has been a winning formula.” Her current work includes collaborating with medical historian Lukas Engelmann on mining historical reports about the third global plague epidemic, which killed 10 million people in India alone between 1855 and 1960. She also leads a Turing Institute project uncovering new insights from brain image reports for disease observations specific to stroke.
The transdisciplinary scope of the Futures Institute was the main attraction for Dr Alex in taking up her Chancellor’s Fellowship position. “I work with scholars from diverse backgrounds and with a wide range of types of textual data,” she notes, adding, “This is what I love about my job.” She sees Data-Driven Innovation (DDI) as an exciting prospect. “Data can be used in so many different and innovative ways. DDI can help to fund interesting research projects and initiatives that make use of data across disciplines and are relevant to the City of Edinburgh, the region and beyond.” While Palimpsest only looked at literary data, it’s easy to imagine the wealth of data generated by Edinburgh’s diverse festival activities, museums and galleries, gardens and other attractions. “It is also important to think of historical information as data,” Dr Alex points out, citing books, newspapers, library archives, oral history archives, and even herbarium collections. “Text mining can help to make them accessible to future researchers and to the general public,” she explains. To give just one example, “DDI can be used to teach school children about Edinburgh’s literary landscape” (now made possible by the Palimpsest project, using the LitLong.org interface) “but also to drive different types of tourism in the area.”