When the findings of the Human Genome Project were published in 2001 after nearly 15 years of international collaboration, the director of the National Human Genome Research Institute in the US described the monumental undertaking as a “transformative” history book with an “incredibly detailed blueprint for building every human cell”, writes Dr Robert Wardrop, director and co-founder of the Cambridge Centre for Alternative Finance.
The Regulatory Genome Project (RGP), formally launched in December 2020 at the University of Cambridge, also seeks to be transformative for the financial services industry and beyond, providing open access to a repository of the ever-changing regulations around the world that affect payments and many other areas of finance. The Regulatory Genome provides a detailed blueprint of regulatory obligations in “machine-readable” form. The aim is to level the AI playing field in the global regulatory ecosystem, so that the poorest country has the same access as the richest nation to the emerging form of these vital rules underpinning our digital society.
Why does this matter? Because national regulators and the companies they regulate need to know what’s happening around the world with regard to evolving regulatory obligations on cybersecurity, crypto assets, ESG (environmental, social and governance) and other aspects that govern this evolving digital world. For regulators, a repository of global regulations enables them to more rapidly identify and analyse “best practice” implementation of regulation elsewhere in the world that can be applied in their country. For regulated firms, aligning their risk controls with regulatory requirements is a growing problem as they digitally transform their activities while contending with steadily increasing regulation and regulatory change. Machine-based applications can improve these processes, but a lack of standards for machine-readable regulation is limiting their adoption.
Here’s how it works: the RGP uses machine learning and natural language processing to “sequence” the vast and growing quantity of regulatory documents produced by regulators around the world. In a nutshell, this sequencing entails constantly crawling the internet to capture these documents, breaking them into sections of text, and then using trained statistical models to analyse, classify and apply tags describing the text content.
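To make the sequencing steps concrete, here is a minimal sketch in Python of the split, classify and tag stages. It is an illustration under stated assumptions, not the RGP's actual method: the tag names, the keyword sets and every function name are hypothetical, the crawling step is left out, and a simple keyword match stands in for the trained statistical models described above.

```python
import re
from dataclasses import dataclass, field

# Hypothetical tag vocabulary for illustration only; the real
# Regulatory Genome classification structure is not described here.
TAG_KEYWORDS = {
    "cybersecurity": {"cyber", "breach", "encryption", "incident"},
    "crypto_assets": {"crypto", "token", "blockchain", "wallet"},
    "esg": {"environmental", "climate", "governance", "sustainability"},
}


@dataclass
class TaggedSection:
    """One section of a regulatory document plus the tags applied to it."""
    text: str
    tags: list[str] = field(default_factory=list)


def split_into_sections(document: str) -> list[str]:
    """Break a document into sections wherever a blank line appears."""
    return [s.strip() for s in re.split(r"\n\s*\n", document) if s.strip()]


def classify(section: str) -> list[str]:
    """Stand-in for a trained model: tag a section if it mentions a keyword."""
    words = set(re.findall(r"[a-z]+", section.lower()))
    return sorted(tag for tag, keywords in TAG_KEYWORDS.items() if words & keywords)


def sequence(document: str) -> list[TaggedSection]:
    """Split a captured document and attach tags to each section."""
    return [TaggedSection(s, classify(s)) for s in split_into_sections(document)]


# Made-up sample text, not taken from any real regulation.
sample = (
    "Firms must report any cyber incident within 72 hours.\n\n"
    "Issuers of a crypto token must publish a white paper.\n\n"
    "Annual reports are due in March."
)

for section in sequence(sample):
    print(section.tags, "|", section.text)
```

In a real system the `classify` step would be replaced by models trained on labelled regulatory text; the point of the sketch is only that each section ends up carrying machine-readable tags that downstream applications can search and compare.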
The classification structures that the statistical models use to classify the regulatory documents will be made freely available via an application programming interface (API), in the hope that others will adopt the Regulatory Genome structure and that a de facto standard for the representation of machine-readable regulation will thereby evolve.
The project draws on research from the Cambridge Centre for Alternative Finance at Cambridge Judge Business School as well as the Department of Computer Science and Technology of the University of Cambridge. A new university-affiliated company called Regulatory Genome Development Ltd is supporting third parties building applications using Regulatory Genome content. Such third-party engagement is essential to establishing the Regulatory Genome as a reference standard for machine-readable regulatory information.
The first cohort of collaborators, announced in March, includes Mastercard, global law firms CMS and Macfarlanes, and professional services firm Grant Thornton UK LLP. We are in discussions with many other organisations and expect to announce new collaborators in the project soon.
As the National Human Genome Research Institute says about the Human Genome Project: “Information is only as good as the ability to use it”. The RGP’s mission is to make it easy to access and apply regulatory information for the benefit of people, organisations and societies around the globe.