UBDC Researcher Wins Prize at ONS Data Dive
The Urban Big Data Centre partnered with the Office for National Statistics (ONS) Data Science Campus and The Alan Turing Institute to put on a Data Dive in July 2017.
We supported this hackathon-style event by providing Strava, Zoopla, and other data from our collections for participants to hack with. We also supplied our resident Geospatial Data Scientist, Rod Walpole, to help with the data and support the participants. The event was for UK-based PhD students, postdoctoral fellows and early career researchers in data science, and we were ably represented by our UBDC Research Associate from the University of Glasgow’s School of Mathematics and Statistics, Dr Francesca Pannullo. So ably, in fact, that Francesca’s team won second prize! More on that below, including her prize-winning entry on building more homes for the UK.
Francesca on the event and the prize-winning entry:
I was invited to The Alan Turing Institute in London to take part in a two-day, policy-focused data dive, co-hosted by the ONS Data Science Campus, to design data science solutions to some of the real-world challenges facing our societies today. This data dive focused on issues that urban environments face, such as widening health inequalities between rich and poor, decreasing green space, and a decreasing number of locations where new houses can be built. Each of these three issues was assigned three teams of researchers with a wide range of skills, using data sources from the Satellite Applications Catapult, the Urban Big Data Centre and ONS.
I took part in the housing issue to answer: where can we build more houses? My team developed a procedure to locate areas that could be used for building houses. This involved Zoopla, planning permission and green space data being analysed through GIS and mapping software as well as analysing satellite data at different wavelengths to locate specific sites, such as brownfield land (previously developed land that is not currently in use), that could be used to build houses.
This data dive allowed networking between people with many different skills, and highlighted that combining these skills and working together can help generate concise and novel ideas. My team was awarded second prize for our novel approach to combining numerous different data sets and developing a clear methodology for locating new building areas that government bodies can adopt, while helping shape policy decisions with regard to the housing sector. I have summarised our prize-winning entry below.
The problem
There are numerous pressures on the UK housing market, leading to much discussion of how it can be improved to benefit the population as a whole. One of the main issues is finding available and suitable locations that can be built on and are free of restrictions, such as green space land, or areas that have to conform to wildlife protection or land-use rules. These restrictions make it costly and time-consuming to identify and locate available areas, which in turn delays building contractors from actually starting to build.
Our potential solution
Our aim was to develop a methodology or procedure to identify potential areas where houses can be built. Numerous data sources were available as part of the data dive, such as Sentinel satellite data, local planning application data from ONS, green space data from UBDC, and Zoopla data from UBDC covering both private rents and housing sales. These data sets focused on the cities of Glasgow and Manchester, but my team and I decided to focus on Manchester because the satellite data for Glasgow was too cloudy! Typical Scottish weather!
One of the researchers in the team was a physics PhD student and was used to analysing satellite data using Python, so he took on the role of assessing Manchester through different wavelengths in order to locate potential areas, such as brownfield land.
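As an illustration only, comparing two wavelengths often comes down to a band ratio. The sketch below computes the normalised difference vegetation index (NDVI) from red and near-infrared reflectance with NumPy. The article does not say which bands or index the team actually used, and the toy values and the 0.2 threshold are invented for the example.

```python
import numpy as np

def ndvi(red, nir):
    """Normalised difference vegetation index: (NIR - red) / (NIR + red)."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    # Leave the index at 0 where both bands are empty (e.g. no-data pixels)
    # instead of dividing by zero.
    return np.divide(nir - red, total, out=np.zeros_like(total), where=total != 0)

# Toy 2x2 reflectance grids, invented for illustration (not real Sentinel data).
red = np.array([[0.10, 0.30], [0.05, 0.00]])
nir = np.array([[0.50, 0.30], [0.45, 0.00]])

index = ndvi(red, nir)
# Hypothetical threshold: pixels with little vegetation signal are flagged
# for a closer look as possible bare or previously developed land.
low_vegetation = index < 0.2
```

A low index alone does not identify brownfield land, since roads and rooftops also score low. It only narrows down where to look.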
The rest of the team and I worked on using the planning application data as a way of locating potential areas that are already being monitored for various land uses. The planning application data only covered the city of Manchester, so this became our new area of focus.
We then made use of a classification system developed for the satellite data, which essentially classifies areas in terms of what they are being used for. This meant we could filter out all areas that were already developed, leaving the areas that could potentially be available for house building.
By combining these classes with the underlying map we were able to pick out potential areas for re-development.
Furthermore, it was beneficial to also filter out areas of green space, since these cannot be built on. This then gave us an overview of areas that could potentially be used for housing development (image below). With more time to work on this, one could zoom into a potential location and use the Zoopla data to identify local housing sales and rental prices in order to ascertain whether it would be worth building in that particular area. The planning application data could further support an area's potential for house building if planning applications in the surrounding areas have been successful. Both the Zoopla and the planning application data are useful for showing activity and supply/demand levels in an area, and so could aid contractors' decisions on whether it is worthwhile to build. They could also provide market signals to help judge the quality and validity of a potential location.
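The filtering steps described above can be sketched with boolean masks. This is a minimal illustration, not the team's actual code: it assumes the land-use classification and the green space layer have already been rasterised onto the same grid, and the class codes and toy grids are hypothetical.

```python
import numpy as np

# Hypothetical class codes; the article does not give the real classification scheme.
DEVELOPED_CLASSES = [1, 2]  # e.g. 1 = residential, 2 = industrial

def candidate_mask(land_cover, green_space):
    """Return True for cells that are neither already developed nor green space."""
    land_cover = np.asarray(land_cover)
    green_space = np.asarray(green_space, dtype=bool)
    developed = np.isin(land_cover, DEVELOPED_CLASSES)
    return ~developed & ~green_space

# Toy 3x3 grids, invented for illustration.
land_cover = np.array([[1, 3, 3],
                       [2, 4, 3],
                       [1, 1, 4]])
green_space = np.array([[0, 0, 1],
                        [0, 1, 0],
                        [0, 0, 0]], dtype=bool)

candidates = candidate_mask(land_cover, green_space)
print(int(candidates.sum()), "candidate cells")
```

Each remaining cell is only a starting point. Price and planning data would still be needed to judge whether building there is worthwhile.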
Future ideas
As a team we wanted to think outside the box and brainstorm other ways of locating potential areas to be built on. This led to the popular idea of crowdsourcing, which is a fun way of involving the public in helping shape the UK’s housing market. Participants could be asked to tweet geotagged photos of potential areas or sites (e.g., unused retail land) in their neighbourhood using a specific hashtag (we tried to be funny and came up with our own, #rustyrusty). These photos could then be used as a training set to validate and locate potential areas and sites.
Conclusion
Overall this was a very positive experience and I absolutely loved my two days at The Alan Turing Institute. I got to work with a host of researchers with different skills and, of course, meet a lot of wonderful people. It was an intense couple of days trying to get everything finished on time, but everyone who was there to help made it a very enjoyable experience and provided sound advice the entire way through. The food at the event was absolutely delicious! The constant supply of tea, coffee and food was much needed to fuel everyone’s brains working in overdrive. I would like to say a big thank you to The Alan Turing Institute, ONS Data Science Campus and UBDC for organising this event and for providing the data and expertise that made this an excellent experience.