Frequently Asked Questions
The Big Data Network Phase 2 (BDN2)
The enormous volume and complexity of data that are being collected by government departments, businesses and other organisations represent a significant resource within the UK, which can be used to the mutual benefit of academic research, organisations and society as a whole.
The ESRC invested in the Big Data Network Phase 2 (BDN2) in order to help optimise this resource. The BDN2 is composed of three organisations working towards fulfilment of this objective: the Business and Local Government Data Research Centre, the Consumer Data Research Centre, and the Urban Big Data Centre. Each one comprises a range of staff with expertise in sourcing, managing, linking and analysing data.
There is also the Big Data Network Phase 1, known as the Administrative Data Research Network (ADRN), and a Big Data Network Phase 3 focussing on civil society and social media data.
What does BDN2 do?
Centres in the Big Data Network Phase 2 make data that is routinely collected by business and local government organisations available for social science research purposes. This is research that makes a difference: it shapes public policies and improves business and service planning; it makes voluntary bodies and other organisations more effective; and it helps inform wider society about what is really happening in modern Britain.
Further details are available on the ESRC website.
Where does BDN2 funding come from?
Centres in the Big Data Network Phase 2 are funded by the Economic and Social Research Council (ESRC). The ESRC is the UK's largest organisation for funding research on economic and social issues, supporting independent, high-quality research that has an impact on business, the public sector and civil society. The ESRC is an executive non-departmental public body, sponsored by the Department for Business, Energy & Industrial Strategy but working independently from government and private sector companies.
What are the benefits of using big data in research?
Big data can help researchers to understand social and economic patterns and trends across the UK, at scales from the neighbourhood to the national. It can help improve the provision of services such as social housing or public transport, and help policymakers to be more effective in how they look after people’s health and wellbeing, by understanding where health needs lie and what treatments work best for which types of people. Linking and analysing multiple sources of big data can also add to information that has already been collected and thereby enrich it.
What do you mean by ‘big data’?
‘Big data’ is a term used to describe data sets that are so large or complex that traditional data processing applications are inadequate. Analysis issues include: high volumes of data; data forms (e.g. images, text); mixed data origins (e.g. social media, sensors, GPS); and data accessibility or security (e.g. personal data in administrative datasets; commercially sensitive data).
What information is being collected and linked?
The information content of linked data will be different for each project, depending on what the researcher is investigating. Researchers will work with ‘de-identified’ data, which means that all directly identifying information is removed. Researchers need to be able to build up consistent pictures so they can identify trends and patterns in the general population. This in turn shapes the strategies and policies that promote social well-being.
During data acquisition, datasets are fully described in metadata and their quality is evaluated by reference to, among other things, their physical characteristics, information content and methodology of collection. Descriptions are periodically revised to incorporate emerging information about the data and reflect usage that has taken place.
What is data linking?
Data linking is the joining together of multiple datasets from the same or different organisations (where permissions allow) to create a more comprehensive data collection that can be used for detailed research and analysis.
What are the benefits of linking data?
Researchers are often interested in bigger pictures than can be developed from a single dataset. They need to bring together multiple datasets from different organisations or sources to inform the research that in turn informs better decisions. A more comprehensive and relevant data assemblage is thus created for detailed research and analysis. For example, linking data on road and pavement repairs with health and social care data on elderly citizens falling in the street can build a picture of how well citizens’ health and safety needs are being catered for, and enable better planning.
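As an illustration only, the road-repairs example above can be sketched as a simple join of two datasets on a shared key. The records, the field names (`area_id`, `pavement_repairs`, `reported_falls`) and the `link_records` helper are all hypothetical; this is not the linkage method used by the Centres, which rely on trusted third parties and privacy-preserving protocols.

```python
def link_records(left, right, key):
    """Join two lists of dict records on a shared key (inner join)."""
    # Index the right-hand dataset by the linking key.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    linked = []
    for row in left:
        # Keep only records whose key appears in both datasets.
        for match in right_by_key.get(row[key], []):
            merged = dict(row)
            merged.update({k: v for k, v in match.items() if k != key})
            linked.append(merged)
    return linked


# Hypothetical area-level records from two different organisations.
repairs = [
    {"area_id": "A1", "pavement_repairs": 12},
    {"area_id": "A2", "pavement_repairs": 3},
]
falls = [
    {"area_id": "A1", "reported_falls": 4},
    {"area_id": "A3", "reported_falls": 7},
]

linked = link_records(repairs, falls, "area_id")
print(linked)
```

Only area A1 appears in both hypothetical datasets, so the linked result holds a single combined record. Real linkage often has no clean shared key, which is where probabilistic matching and clerical review come in.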
Will data be used for commercial purposes?
The primary use of data is to conduct research that generates outputs offering a better understanding of society. We operate in a mixed economy, and goals such as better targeting of energy efficiency measures or improving public transport services may often be most effectively achieved in partnership with business or commercial interests. Any commercial use of data will depend on the licence terms associated with that dataset. This is consistent with the Economic and Social Research Council’s 2015 Strategic Plan for facilitating partnerships and realising impact.
Who will be able to see my data?
We offer researchers who are affiliated with an academic institution, public sector researchers and private sector analysts secure access to data from business, local government and any other sources in full accordance with legislative requirements and the data owner’s specifications.
What are personal data?
The Information Commissioner’s Office defines personal data as data which relate to a living individual who can be identified:
(a) from the data, or
(b) from the data used alongside other information which is in the possession of, or is likely to come into the possession of, the data controller. This includes any expression of opinion about the individual and any indication of the intentions of the data controller or any other person in respect of the individual.
How will you protect data about me?
Information security management is of utmost importance to the Centre. Information held by the Centre is subject to a range of information security controls based on the ISO 27001 series of standards, and the third-party services providing linkage and access for controlled data are also ISO 27001 compliant. These controls include:
- secure data provisioning and backup delivered through state-of-the-art secure data technology
- robust information governance procedures based on approved researchers, users and organisations
- secure File Transfer Protocols to support the transmission of data
- separation of roles and robust indexing procedures designed to minimise risks of breaching individuals' privacy when using confidential, sensitive data
- data linkages created and maintained using rigorous, internationally accepted privacy preserving protocols, direct or probabilistic matching, with clerical review available where required to increase matching accuracy
- secure environments for researchers to analyse anonymised individual level or summarised records
- restricted access to securely stored confidential data including statistical disclosure control methods on statistical outputs, such as graphics, tables or regression analyses
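One common form of statistical disclosure control on tabular outputs is small-cell suppression, where counts below a minimum size are withheld. The sketch below is illustrative only: the threshold of 10, the `suppressed_counts` helper and the sample data are assumptions, not the rules applied by the Centres, and real disclosure checking usually involves further steps such as secondary suppression.

```python
from collections import Counter


def suppressed_counts(values, threshold=10):
    """Tabulate category counts, withholding any count below the threshold."""
    counts = Counter(values)
    # None marks a suppressed cell; the threshold value is an assumption.
    return {
        category: (n if n >= threshold else None)
        for category, n in counts.items()
    }


# Hypothetical individual-level records, reduced to an area label each.
areas = ["North"] * 25 + ["South"] * 12 + ["East"] * 3

table = suppressed_counts(areas, threshold=10)
print(table)
```

Here the hypothetical "East" cell contains only three records, so its count is withheld from the output table.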
How do I know that the people using my personal data will be responsible?
- Every researcher will be accredited
- In addition to civil and criminal penalties for data breach and misuse, further sanctions are in place where applicable, which may include exclusion from use of BDN2 data services and loss of eligibility for future ESRC funding
- Every research project is assessed by a research approvals process, undertaken by independent experts and sometimes in association with the data provider
- Researchers must take a compulsory training programme in administrative data management and security standards or, in the case of CDRC, a safe researcher training course
- The researcher is required to sign a declaration to confirm that they understand their personal responsibilities and obligations
What do you mean by an accredited researcher?
Accredited researchers are those who have undergone specific training to ensure that they understand how to access controlled datasets safely and securely. Training covers: data security and personal responsibility, including legal background; security models; breaches and penalties; and statistical disclosure control to ensure that all outputs are safe to use and do not identify individuals.
Will my personal data be for sale?
No. The BDN2 has been set up for social benefit and is not a commercial enterprise or marketing organisation. We only provide access to data for accredited researchers who are carrying out research with a clear potential public benefit.
How does BDN2 oversee the safe use of data?
BDN2 is managed day to day by its management committee, which consists of the principal investigators or directors of the three Centres. This committee reports to the Economic and Social Research Council, which itself is accountable to the Department for Business, Energy and Industrial Strategy (BEIS).
The UBDC is also governed by an independent Advisory Group, chaired by Prof. David Bannister, University of Oxford. There is also an independent Research Approvals Committee, chaired by Prof. Cecilia Wong, University of Manchester, to approve projects wanting to work with the UBDC and data purchases. This Committee has a permanent Lay Member representing the public interest.
How do I know that the data will be used for public good?
All research projects must have the potential to benefit society and improve quality of life. All new individual research projects wanting to work with personal data undergo an approvals process, in which each project must show that:
- It addresses a legitimate research question and has a clear potential public benefit.
- The results of the project will be made public.
- It needs to use the BDN2, and would not be more appropriately served by other research council investments (for example the Farr Institute, the UK Data Service, or one of the longitudinal studies support services).
Do you ask for consent before you use the data?
We only use data that we have a right to access legally. We work with the data owner to obtain permissions where necessary.
What is the difference between ‘de-identified’ and ‘anonymised’ data?
‘De-identified’ data refers to data where any element that directly identifies any individual is completely removed from the dataset. This includes data attributes such as name, address, tax reference number, or National Insurance number.
The Information Commissioner's Office defines anonymisation of data as the process of turning data into a form which does not identify individuals and where identification is not likely to take place, allowing for a much wider use of the source.
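As a minimal sketch of de-identification as described above, the code below drops directly identifying fields from a record. The field names, the sample record and the `de_identify` helper are hypothetical. Note that the result is de-identified rather than anonymised: the remaining fields could still allow identification when combined with other information, which is why access to such data remains controlled.

```python
# Hypothetical field names for the direct identifiers listed above.
DIRECT_IDENTIFIERS = {
    "name",
    "address",
    "tax_reference",
    "national_insurance_number",
}


def de_identify(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without directly identifying fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in identifiers
    }


# Entirely fictional example record.
record = {
    "name": "Jane Example",
    "address": "1 Example Street",
    "national_insurance_number": "XX000000X",
    "age_band": "65-74",
    "area_id": "A1",
}

clean = de_identify(record)
print(clean)
```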
Where are data stored?
The UBDC offers a three-tier data service (open, safeguarded or controlled), and the tier determines how the data is stored. Open and safeguarded data held by the Centre is stored in a secure location at the University of Glasgow. Controlled data is held with the electronic Data Research and Innovation Service (eDRIS), a highly secure computing environment where it is possible to closely monitor who works on the data and to ensure no personal data leaves the system. For controlled data, datasets constructed for each project are destroyed on completion of the research.
Who can access the data?
Researchers affiliated to an academic institution, public sector researchers and private sector analysts can all apply to access data. All requests are considered on an individual basis.
How do researchers get access to the data?
We train researchers to use data safely, lawfully and responsibly. It can be a long process as we have a number of safeguards in place to protect people’s privacy and ensure the data are secure at all times.
Can I see the results of the research?
Research from the Centre is published in academic journals and on the different Centre websites.
Is the BDN2’s service free for all users?
The BDN2’s service for accessing data is free at the point of use. We also have data scientists who can assist with proposal development, data sourcing and data management, but this resource is limited. If you need research staff or other resources, you must provide these or the funds to support them. Additionally, each Centre provides training courses and workshops which may be supported by a registration fee.
Non-UK researchers, and private individuals such as campaigners, community activists, and civic hackers, may access the open data from the BDN2 data collections and, in the case of UBDC, some of the Safeguarded Data Collection (on signing a user access agreement). However, the Controlled Data Service, which supports access to administrative data, is only for data from UK organisations and is subject to rigorous review of applicants; in general only UK-based researchers or policymakers would be considered. If you are interested in any other type of international research collaboration, please contact us.