The Research Computing team provides bespoke consulting services for research projects at the University of Leeds. This brings computational expertise to a wide range of research problems and helps ensure we remain a cutting-edge, research-driven institution. Below is a case study of one project, commissioned by researchers at the University and developed and delivered by one of our research software engineers.
The project aimed to build on initial proof-of-concept work on modelling police demand and supply. Its goal was to add microsimulation functionality to an existing toolkit for forecasting daily crime rates within a specific geography. These forecasts could then act as an input to an agent-based model of basic police responses to crime-related demand.
This project built upon an existing Python package, the crime_sim_toolkit, and work was undertaken over 8 weeks to add functionality to this package: namely, a function that forecasts crimes at an individual level using synthetic populations and probabilities of victimisation. Synthetic populations are simulated examples of populations (often forecast from census data); we used synthetic populations derived from the ukpopulation package from the SPENSER project. All the work was performed on data for the West Yorkshire Police force area (which includes the local authorities of Leeds, Bradford, Wakefield, Calderdale and Kirklees).
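The individual-level forecasting idea can be illustrated with a minimal sketch. It is not the crime_sim_toolkit API: the function names, the profile labels and the probability values below are all hypothetical, chosen only to show how one random draw per person, per crime type, per day turns a synthetic population and a table of victimisation probabilities into daily crime counts.

```python
import random
from collections import Counter

# Hypothetical daily probabilities of victimisation, keyed by
# (crime type, victim profile). The values are illustrative only.
DAILY_PROBABILITIES = {
    ("burglary", "female 16-34"): 0.0002,
    ("burglary", "male 16-34"): 0.0003,
    ("vehicle crime", "female 16-34"): 0.0004,
    ("vehicle crime", "male 16-34"): 0.0005,
}


def simulate_day(population, probabilities, rng):
    """Return simulated crime counts by crime type for a single day."""
    counts = Counter()
    for profile in population:
        for (crime, group), probability in probabilities.items():
            # One independent draw per person and crime type.
            if group == profile and rng.random() < probability:
                counts[crime] += 1
    return counts


def simulate_period(population, probabilities, days, seed=0):
    """Return one Counter of crime counts per simulated day."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [simulate_day(population, probabilities, rng) for _ in range(days)]


# Hypothetical synthetic population: one profile label per person.
population = ["female 16-34"] * 5000 + ["male 16-34"] * 5000
forecast = simulate_period(population, DAILY_PROBABILITIES, days=7)
```

Here `forecast` holds one set of crime counts per day, which is the kind of daily demand signal that a downstream agent-based model could consume.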
The main challenge of this project was determining these probabilities of victimisation. Deriving them requires an initial dataset of crime events and victim characteristics, from which a table of victimisation probabilities can be created. This project relied on open data, using the data.police.uk site as its main source. This data has a number of caveats, including aggregation into 15 very broad crime categories and, crucially, no victim data. The long-term plan is that data sharing agreements will eventually allow crime data that includes victim information to be used; for this work, however, dummy data was created to develop the workflow.
In this simplistic example, victimisation data was generated by evenly distributing characteristics from the synthetic population across a year's worth of crime events. This yielded a dummy dataset of crime-victim data, from which a victimisation probability table was built for use by the microsimulation.
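The dummy-data workflow described above might look something like the following sketch. It makes several assumptions that are not taken from the project: victim profiles are (sex, age band) pairs, "evenly distributing" is implemented as a round-robin over the distinct profiles, and the annual probability for a group is the number of victims in that group divided by the group's size in the synthetic population. All names and figures are hypothetical.

```python
from collections import Counter
from itertools import cycle

# Hypothetical synthetic population: one (sex, age band) profile per person.
synthetic_population = (
    [("female", "16-34")] * 300
    + [("female", "35+")] * 200
    + [("male", "16-34")] * 250
    + [("male", "35+")] * 250
)

# Hypothetical year of recorded crime events: one crime type per event,
# with no victim information attached.
crime_events = ["burglary"] * 40 + ["vehicle crime"] * 60


def assign_victims_evenly(events, population):
    """Pair each event with a victim profile, cycling through the profiles."""
    profiles = sorted(set(population))
    return list(zip(events, cycle(profiles)))


def victimisation_table(crime_victims, population):
    """Annual probability of victimisation per (crime type, profile)."""
    group_sizes = Counter(population)
    victim_counts = Counter(crime_victims)
    return {
        (crime, profile): count / group_sizes[profile]
        for (crime, profile), count in victim_counts.items()
    }


crime_victims = assign_victims_evenly(crime_events, synthetic_population)
table = victimisation_table(crime_victims, synthetic_population)
```

Because the dummy victims are spread evenly across profiles, any differences between groups in `table` come only from differing group sizes. With real victim data, the same table-building step would instead reflect genuine differences in risk.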
The work from this project was merged into the open source project repository and included in the latest release of the crime_sim_toolkit package. The project was previously hosted on GitHub, and a number of code quality and continuous integration practices were incorporated into it. These included an automated build schedule using Travis CI, code quality checks via Codacy and SonarCloud, and test coverage reporting via Codacy. The project is also released via the Python Package Index (PyPI).
This project also made use of GitHub's built-in wiki feature to document the new Microsimulator class. This was a required outcome of the project.
Dr. Dan Birks, who commissioned the work, said:
As research increasingly engages with computational methods alongside more traditional quantitative approaches, it’s increasingly important to ensure that professional software development practices become an integral part of the research enterprise. Working with the RSE team means that we can draw on expertise and skills to ensure that the software tools we develop to solve real research problems are robust, scalable and reproducible. It’s certainly something I’ll be doing again.