Text and Data Mining

  • DDI Funding: £410k proposed (Phase 1 funding released: £133k)
  • Timescale: 2019 - 2022
  • Lead Hub: Edinburgh Futures Institute

The ability to search and analyse texts and data at scale is fundamental to delivering Data-Driven Innovation, because text is our main method of knowledge transfer. Over 500 million tweets were sent every day in 2014, and 269 billion emails were sent per day in 2017. The total volume of published academic literature is doubling at least every nine years.

To help students and academics, as well as businesses and the third and public sectors, gain the data skills needed to analyse the wealth of textual data we create every day, EDINA is developing a Text and Data Mining service.

This service will improve data literacy and drive innovation and research by providing an entry point for the analysis of unstructured text, to support data-based decision-making in academia, business and the wider public.

Critically, the project will develop an intuitive, easy-to-use text and data mining pilot service for all, co-created with users. Its simplicity will open up text mining techniques to many users who would otherwise not interact with data in this way, delivering a service that is truly inclusive.

Working with researchers, students and external audiences, EDINA will build on the Defoe tool created by Professor Melissa Terras and Dr Rosa Filgueira Vicente from EPCC, taking forward the Digital Spring work funded by JISC at UCL and the British Library.

Stakeholders across the region will be involved in developing the new service. They include academic and professional services colleagues across the University of Edinburgh (among them all five of the new Institutions), the National Library of Scotland, the Data Lab, the Scottish Government, Project Jupyter, small and medium-sized businesses, and charities.

The service will contribute to all of the City Region outcomes. As well as increasing data literacy and research, improving services and supporting new businesses, the service will act as a focal point for attracting datasets, thanks to new and increased capacity to mine text, and is anticipated to attract new partnerships to the region.


Unlocking text and data mining capability across the City Deal region