Since the Second World War, mathematics and warfare have become increasingly intertwined. British codebreakers at Bletchley Park, most notably Alan Turing, famously shortened the war by breaking the Nazis’ Enigma code. These academics went on to form the heart of the Government Communications Headquarters (GCHQ), bolstering the UK’s national security through a combination of tradecraft and advanced mathematics, skills that were rare enough to provide a distinct advantage internationally.

Over 70 years later, warfare takes place increasingly online. Nation states and guns for hire still launch attacks to disable infrastructure or seize strategic assets using sophisticated technology. But now the targets are information and intellectual property rather than bridges or ammo dumps. And this weaponry is now being enhanced by approaches such as machine learning and Artificial Intelligence (AI), a concept Turing explored in his famous paper “Computing Machinery and Intelligence”, where he proposed the “imitation game”: a test in which a machine tries to pass for human when answering a series of questions.

Data science as a weapon

On the modern battlefield, corporate data is a valuable asset, which has caused the line between public and private actors in cyberwarfare to become increasingly blurred. This makes all businesses vulnerable to breaches enabled by AI-enhanced attacks. The increasing sophistication of these incidents has meant that data theft has become the most prevalent form of corporate fraud for the first time, according to Kroll. The weapons are changing as well as the targets, with research from Webroot showing that some 86% of security execs now worry that hackers will use AI against them.

In one example, a security professional opened an unimportant port and tracked over 100,000 unauthorised login attempts, the first after just five minutes. Bad guys have used automated attacks for many years to access data first and determine its value after the fact. These actors can now apply AI-style principles to improve these attacks over time by gathering data, learning and iterating. A rules-based defence that tries to define problems in advance will be useless in the face of these varied and autonomous attacks.

This new firepower in the hands of malicious actors means that the only way to successfully protect against such attacks is with an equally advanced defence: Security Data Science. While the security benefits are clear, applied data science is not easy. Gartner found that CIOs ranked AI as the hardest technology to implement, followed by digital security; applying both without disrupting security operations is a considerable challenge.

Ranking AI and security as separate technologies is incongruous; security is an ongoing process and data science is a tool. Automation is an essential element of Security Data Science that must be grounded in security tradecraft, and is not an end in itself. Security professionals determine the problem that needs solving, then work with data scientists to see if existing models can be applied or whether a new one is necessary.

Mathematical and technical expertise is required to assess algorithms for detection rates and false positive rates, real-time application and resource cost. Data scientists developing supervised learning models will need to consider scalability from training data and decay rates; those using unsupervised models will typically have to assess and interpret results.
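To make the first of those measures concrete, here is a minimal sketch of how a detection rate and a false positive rate might be computed once analysts have labelled a sample of events. The function name `evaluate`, the `Rates` container and the ten-event sample are all hypothetical and invented for illustration; they are not taken from any particular product or dataset.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    detection_rate: float       # flagged attacks / all real attacks
    false_positive_rate: float  # flagged benign events / all benign events


def evaluate(labels: list[bool], alerts: list[bool]) -> Rates:
    """Compare analyst labels with a model's alerts (illustrative helper).

    labels[i] is True if event i was really malicious;
    alerts[i] is True if the model flagged event i.
    """
    if len(labels) != len(alerts):
        raise ValueError("labels and alerts must have the same length")
    pairs = list(zip(labels, alerts))
    tp = sum(1 for y, a in pairs if y and a)          # attacks caught
    fn = sum(1 for y, a in pairs if y and not a)      # attacks missed
    fp = sum(1 for y, a in pairs if not y and a)      # false alarms
    tn = sum(1 for y, a in pairs if not y and not a)  # benign, ignored
    detection = tp / (tp + fn) if (tp + fn) else 0.0
    false_pos = fp / (fp + tn) if (fp + tn) else 0.0
    return Rates(detection, false_pos)


# Made-up sample: 4 real attacks followed by 6 benign events.
labels = [True, True, True, True, False, False, False, False, False, False]
alerts = [True, True, True, False, True, False, False, False, False, False]
rates = evaluate(labels, alerts)
print(rates)
```

In practice the two numbers pull against each other: a model tuned to catch more attacks will usually raise more false alarms, which is why the trade-off needs to be judged by security professionals as well as data scientists.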

These models begin to be of value when security experts apply them according to a strategy that addresses existing threat vectors, while taking an open-ended outlook towards unknown vulnerabilities. Security teams need to put a symbiotic relationship between domain expertise and data science at the heart of their companies’ defences, to deal with a growing and evolving cybercrime sector.

Data science talent

Research from McKinsey shows that AI uptake is highest among large organisations with the resources to attract top talent. What the consultancy does not state outright is that this includes malicious actors backed by powerful international interests. It is an arms race where first movers can reap exponential rewards.

Data science remains an emerging field that requires deep technical expertise, and high demand has created a severe talent shortage. One recent report put the global talent pool for AI at just 22,000 individuals, and the number of doctoral students studying the discipline does not indicate an upcoming explosion in supply. Add to this the cost of salaries and resources, as well as the managerial challenges of integration and inevitable churn, and it is clear that data science often needs to be outsourced to domain experts.

Data science is an agnostic tool, not a silver bullet or a superweapon. In cybersecurity, where we call it Security Data Science, it can be used to free up analysts’ time, making the security function less operational and more strategic for companies. But it also bolsters threats faced by organisations. AI represents a new paradigm that has the potential to revolutionise the knowledge economy in security and beyond, and security teams need to harness it before the bad guys do.

Breakthroughs in defence and information technology have always gone hand in hand, from radio to the internet. We are living through an age where Turing’s two intellectual breakthroughs are coming together, except this time it is the machines that are asking questions of humans. Whether building an in-house function or partnering with a domain expert, companies serious about security need to incorporate automation as part of their Security Data Science defence.