
Data science

Celebrating the power of community

A mug next to a laptop featuring a group of people on a virtual call

The Cross-Government and Public Sector Data Science Community provides regular opportunities for people in the public sector who have an interest in data science to connect with their peers, learn new skills and collaborate. This autumn, we are hosting the first …

Collaborative learning: closer ties with academia

Some people stood in a semi circle looking at a window onto which post-it notes have been added

GDS x Imperial University Collaboration 2022

Collaboration and innovation are some of the key tenets of the Digital, Data and Technology (DDaT) profession. The Cabinet Office offers many avenues for productive collaboration, enabling internal and external partners to develop both …

Using Data Science for Next-Gen Statistics

Rap sticker on a laptop

As the 21st century progresses, using data effectively has become a priority for many organisations, including the Office for National Statistics (ONS). The ONS's unique focus, however, goes beyond just utilising data effectively. The organisation's ultimate goal is to create …

Splink: Fast, accurate and scalable record linkage

Categories: Data Engineering, Data science, Python
Some of the graphical outputs of Splink

A common data quality problem is having multiple records that refer to the same entity but no unique identifier that ties them together. For example, customer data may have been entered multiple times by accident, or …