The AI agent era heightens the attack surface for critical infrastructure. Three kinds of cybersecurity startups can help.
Read my overview of recent cyber attacks, regulatory developments, and emerging tech trends, plus three startup proposals: machine-to-machine (M2M) authentication; digital twins; & AI theorem-proving.
Abstract: The rise of AI agents—software tools that autonomously execute complex workflows—introduces unprecedented cybersecurity risks to critical infrastructure, from power grids to financial markets. Through analysis of recent cyber attacks, regulatory developments, and emerging technology trends, this paper examines how AI agents create novel attack surfaces while complicating existing vulnerabilities in critical systems. To address these challenges, I propose startup innovation in three emerging research areas: machine-to-machine authentication for managing AI agent identities and permissions, predictive security using digital twins to simulate potential threats, and AI-enabled formal verification to mathematically prove system security properties. As policymakers work to understand and regulate frontier AI, entrepreneurs have an immediate opportunity to develop technical solutions that secure critical infrastructure in the AI agent era, ensuring this powerful new software architecture can safely integrate into our most essential systems.
Table of contents
1. Critical infrastructure is at risk more than ever before in the AI agent era
2. The broader AI safety and governance context
Recent cyber attacks from foreign adversaries
AI’s novel dangers in cyber warfare
The state of AI in critical infrastructure
4. AI agents are the future of enterprise software—and they complicate the cybersecurity picture
The rise of AI agents in enterprise software
Cybersecurity risks of AI agents
Attack surfaces involving a single AI agent
Attack surfaces involving untrusted external entities
Three types of AI cybersecurity risks to critical infrastructure
5. Three startup solutions for securing critical infrastructure in the AI agent era
I. Machine-to-machine (M2M) authentication
II. Predictive security and digital twins
III. AI-enabled formal verification and provable cybersecurity
6. The importance of proactive leadership from the startup community
1. Critical infrastructure is at risk more than ever before in the AI agent era
U.S. critical infrastructure is a hot target for cybersecurity attacks from foreign adversaries—and it’s only going to get more difficult to manage in the transformative artificial intelligence (AI) era.
Take the cases of the Chinese government’s latest U.S. hacking campaigns, Salt Typhoon and Volt Typhoon—in the past several years, hackers linked to the Chinese government have burrowed themselves deep into American critical infrastructure networks, from water treatment systems and airports to oil and gas pipelines and financial institutions. Or take the case of Russia’s sophisticated 2020 breach of the U.S. Treasury and Commerce Departments’ email systems, or cybercriminal group DarkSide’s 2021 Colonial Pipeline ransomware attack, leading to fuel shortages throughout the American Southeast.
It's only a matter of time before cybersecurity attacks on critical infrastructure converge with the AI boom in new ways for which we are not yet prepared.
At the 2024 Black Hat security convention, researcher Michael Bargury demonstrated how to hack Microsoft’s Copilot AI chatbot in order to steal funds, retrieve sensitive data, and direct users to phishing attack websites. Prompting attacks like these don't just make chatbots (e.g. Copilot, ChatGPT) vulnerable––they introduce new concerns for the autonomous applications built atop them.
AI agents—software tools built on deep learning algorithms that execute multi-step workflows on behalf of humans—are taking Silicon Valley by storm, promising to reduce busywork and skyrocket productivity.
From major software developments like Salesforce’s Agentforce and OpenAI’s Operator to startups like Harvey, Sierra, and LangChain, agentic AI products and their supportive infrastructure represent the future of enterprise software.
As businesses and government agencies increasingly adopt AI and grant permissions to agentic software, the cybersecurity threat landscape heightens—and technologists need to prepare for what that means for critical infrastructure.
This white paper overviews how securing critical infrastructure is core to ensuring a safe and beneficial transition to transformative AI, how the latest wave of enterprise software advancements poses new threat vectors, and three kinds of cybersecurity startups that could protect critical infrastructure in the agentic AI era.
2. The broader AI safety and governance context
The promise of AI is not without its perils.
There is no denying AI will dramatically impact how we live our lives and generate economic productivity gains. The bull case, as OpenAI co-founder and CEO Sam Altman explains, is utopian abundance: “Imagine a world where, for decades, everything–housing, education, food, clothing, etc.–became half as expensive every two years,” Altman posits. Even Anthropic co-founder and CEO Dario Amodei suggests that “most people are underestimating just how radical the upside of AI could be”––he argues that powerful AI has the potential to 10x the rate of discovery in biology, prevent and treat nearly all natural infectious disease, eliminate most cancer, improve mental illness interventions, optimize the distribution of health interventions and spur economic growth in in developing countries, enhance research on climate change solutions, and help level the playing field between humans.
Needless to say, AI has the potential to usher in a production revolution: in the same way that “advances in a single critical technology” in the agricultural and industrial revolutions brought about logarithmic-scale changes to economic productivity—via the domestication of plants and steam power, respectively—AI may similarly introduce irreversible changes to life and society. Scholars and practitioners differ, however, on what kinds of timeframes we can expect, what the sequence of changes will look like, and whether the impact of transformative AI will be more analogous to the advent of nuclear weapons, electricity, or something even greater.
Yet there are risks at every stage of AI development and transformation—even the earlier ones, like near-term AI use cases in enterprise software and government.
First, it is important to understand macro-level risks of transformative AI. For Allan Dafoe, director of frontier safety at Google DeepMind and founder of Oxford’s Centre for the Governance of AI (GovAI), there are four categories of AI-related extreme risks:
Inequality, turbulence, and authoritarianism: Democracy could erode in a winner-take-most labor market and/or totalitarian approach to digital surveillance, social manipulation, and autonomous warfare.
Great power war: AI could foster faster international crisis and warfare escalations than humans could manage.
Loss of control: As AI systems grow dramatically in their capabilities—and in the permissions humans grant them—experts worry that their intentions may eventually misalign with those of their human engineers and users. In the same vein, Dafoe worries about how increasingly powerful AI could create compounding problems with misaligned powerful institutions, like “corporations, military actors, or political parties.”
Value erosion from competition: Heightened competition in AI R&D could precipitate negative externalities, in the same way global economic competition has posed material threats to privacy, equality, and environmental sustainability.
While these risks can feel like alarming yet distant hypotheticals, they reflect long-term, compounded impacts of more immediate threat vectors across the AI supply chain and research and development (R&D) lifecycle. AI and machine learning (ML) algorithms “use computing power to execute algorithms that learn from data,” wrote Ben Buchanan, former scholar at Georgetown’s Center for Security and Emerging Technology (CSET) and White House AI advisor. This single sentence conveys the holy trinity of AI—algorithms, data, and computing power—at each stage of which there are public policy and national security implications, from algorithmic harms to data leaks and geopolitical tensions over chip export controls.
Above all, frontier AI R&D has far outpaced policymaker understanding, necessitating market-based interventions to ensure safe AI development and usage. Though the Organization for Economic Cooperation and Development’s (OECD) May 2019 AI Principles laid the groundwork for advanced economies to build national- and state-level AI regulation, with meaningful traction from the United Kingdom and European Union, the U.S. remains in regulatory limbo as Silicon Valley and Washington negotiate, replace the previous administration’s landmark executive order on AI governance, and leave California’s innovation economy without state-level guidance on AI risk mitigation. All the while, the pace of frontier AI and ML research publications has steeply accelerated since the 2017 breakout paper on the Transformer deep learning architecture, and the data universe—core to training generative models, which in turn power data infrastructure software—is projected to reach 660 zettabytes by 2030, equivalent to 610 iPhones (128-gigabyte) per person.
3. Recent cyber attacks highlight critical infrastructure vulnerabilities—which AI advances will compound
Recent cyber attacks from foreign adversaries
Over the past several years, state-sponsored and adversarial foreign cybercriminals have targeted U.S. critical infrastructure systems, from power and water supplies to financial institutions, information technology and communications.
Some of these cyber attacks are aggressive up front—by injecting an infected file via a dynamic link library (DLL) hijack, China-linked Salt Typhoon hackers have been able to spy on U.S. internet service providers (ISPs) and telecommunications networks, exfiltrate sensitive information and data, and stealthily remove their footsteps from memory. It’s the same method Russian intelligence hackers used to breach email systems across the U.S. Treasury and Commerce Departments in late 2020.
In other cyber attack scenarios, hackers are “lying in” or “living off the land” (LOTL): Volt Typhoon cybercriminals exploited weak administrator passwords, factory default logins, and devices overdue for key security updates, to stealthily embed fileless malware deep inside networks controlling U.S. power grids and water supplies. Poor cybersecurity hygiene, including a leaked password and lack of multifactor authentication (MFA), similarly made the Colonial Pipeline vulnerable to attack in 2021 by the ransomware criminal group DarkSide—leaving the southeastern U.S. with gasoline, diesel, and jet fuel shortages for five days.
AI’s novel dangers in cyber warfare
These examples, while notorious in recent memory, only scratch the surface. According to a January 2025 report on adversarial misuses of Google’s Gemini, government-backed threat actors in Iran, China, North Korea, and Russia have attempted to use the tech giant’s AI chatbot to support multiple stages of the cyber attack lifecycle—from malicious coding and scripting tasks, to researching publicly reported vulnerabilities, conducting reconnaissance on targets, and even developing webcam recording code, rewriting malware into another language, and automating processes around logging into compromised accounts and extracting emails from Gmail.
As AI systems increasingly become integrated into critical infrastructure—and, as the Gemini report shows, in everyday workflows, like chatbot searches—cybersecurity attacks will only grow in sophistication, and critical infrastructure will become more vulnerable in the absence of robust policy and cybersecurity advances.
The state of AI in critical infrastructure
AI is, after all, already in use in critical infrastructure sectors—and a future with applied generative AI in operational technology (OT), like critical infrastructure hardware, is not far off.
When CSET assessed AI use cases across reports and AI inventories from the U.S. Department of the Treasury, Department of Energy, and the U.S. Environmental Protection Agency, as well as critical infrastructure industry literature reviews, its researchers found that generative AI is, and has long been, in use in information technology (IT) across critical infrastructure sectors. In critical infrastructure IT, operators use AI to scan security logs for anomalies, detect malicious events in real-time, and mitigate code vulnerabilities. While the researchers found no explicit examples of active generative AI use cases in OT just yet, “future use cases are being actively considered, such as real-time control of energy infrastructure with humans in the loop” (p.8-9).
The October 2024 CSET report also notes that AI agents in particular present a double-edged sword for critical infrastructure use cases: they have the potential to automate either routine work streams or cyberattacks, which “deserves close watching” (p.8)––for instance, AI agents that can turn “human instructions into executable subtasks may soon be used for cyber offense” (p.11). And in corroboration with Google’s Gemini misuse report, CSET also found that AI-enabled or -enhanced cyberattacks to critical infrastructure include “scripting, reconnaissance, translation, and social engineering,” as well as large language model (LLM)-enhanced spear phishing (p.11).
LLM-enhanced spear phishing involves using AI to study an individual’s digital footprint, send the target a personalized email, and encourage the target to click a link that would secretly install and execute malware on their computer. In November 2024, a group of Harvard Kennedy School researchers found that AI-automated spear phishing attacks performed on par with human experts and 350% better than their control group, enabling attackers to target more individuals at lower cost and increase profitability by up to 50 times for larger audiences.
This is all to say: incorporating AI could increase critical infrastructure operators’ attack surface in new or unknown ways, particularly when AI is used for IT, OT, or one day, both (p. 12).
4. AI agents are the future of enterprise software—and they complicate the cybersecurity picture
Defining AI agents
An AI agent is an application that attempts to achieve a goal by observing the world and acting upon it using the tools that it has at its disposal, explained three Google AI agent product leaders in September 2024.
AI agent architectures solve problems via reasoning, planning, and tool calling: they think like large language models (LLMs), but go further by interacting with external data sources and software APIs. The latest wave of agentic technology is focused on memory: agents can now save and recall information. Once a user instructs an agent in natural language, it breaks the prompt down into tasks and subtasks, drawing on domain knowledge and data to complete its assigned task. These emerging capabilities introduce important cybersecurity threat vectors beyond those for foundation models alone—ones that impact large, dynamic vertical industries, the enterprise and government customers they serve, and societies at large.
This paper assumes that the enterprise software platforms, infrastructure, and language models that critical infrastructure operators use will increasingly incorporate agentic applications, introducing the same heightened cybersecurity attack surface for AI agents in general to U.S. critical infrastructure in particular.
The rise of AI agents in enterprise software
From mid-2024 to early 2025, enterprise software-as-a-service (SaaS) and LLM providers large and small have launched AI agent products and infrastructure—and many of these companies serve government agencies and critical infrastructure operators.
OpenAI and Anthropic, two leading foundation model providers, have introduced research previews of agentic products (Operator and Claude computer use), and both are doubling down on their government contract sales (see ChatGPT Gov and Anthropic’s growing public sector sales team). Salesforce, which has long sold to government agencies and has a product line for energy and utilities clients, has overhauled its go-to-market strategy to promote its Agentforce line of customer service agents that autonomously make decisions, solve problems, and carry out business tasks, and facilitate real-time data exchange with customer records. Oracle, too, is embedding generative AI agents into its product lineup, connecting natural language input and retrieval-augmented generation (RAG) with customers’ enterprise data. And ServiceNow’s AI agent product can provision software for new employees, or diagnose and resolve IT outages.
These SaaS and LLM providers represent the potential for AI agents to make their way into some of the largest commercial contracts and workflows in the U.S.-––but it is not just market leaders who are building and selling agentic AI.
Leading venture capital investors have identified AI agents and their supportive infrastructure as the next major scalable startup opportunity. Case in point: in the same way that React, the widely used Javascript library for web development, simplified the process of building websites, investors have observed the rise of an entire ecosystem of AI orchestration vendors simplifying the process of building agent-based applications.
So, how do agents specifically introduce cybersecurity concerns?
Cybersecurity risks of AI agents
The risks of AI misuse would compound if agents were in the wrong hands. AI agents could reasonably execute an offensive cyber operation end-to-end, explains the Institute for Security and Technology (IST). For instance, agents could execute the LLM cyber threats discussed earlier—target reconnaissance, malicious scripting, spear phishing attacks (malware installation), among others.
In addition to these common generative AI threats, one of the most plausible and concerning AI agent risks is hijacking.
Hijacking an agent has much farther-reaching consequences than hijacking an LLM chatbot alone. Hackers and red-teamers—their ethical counterparts working to uncover user safety and cybersecurity vulnerabilities in LLMs—have attempted to manipulate AI systems to produce unauthorized and harmful outputs (prompt injection), reveal their inner workings (prompt leaking), bias their training data and negatively influence their behavior (data training poisoning), and leak sensitive or confidential information (data extraction). Imagine, then, what could go wrong if an agent acted on the wrong information or malicious instructions, making a series of decisions with sensitive data. This is what losing control of bots could lead to, and it has been a concern even before the AI era.
As Jonathan Zittrain, director of Harvard’s Berkman Klein Center for Internet and Society, pointed out in The Atlantic, in the 2010 flash crash, “an army of bots briefly wiped out $1 trillion of value across the NASDAQ and other stock exchanges” due to high-frequency trading algorithms that briefly went haywire, mindlessly buying and selling contracts.
Bots making harmful decisions with material implications for humans is a key concern of the think tank RAND, which wrote that the greatest threat to financial systems may not be a sudden, catastrophic event—a “financial 9/11"—but rather a gradual erosion akin to “financial climate change.” In addition to flash crashes and other immediate disruptions, the agentic AI era poses more insidious dangers to critical infrastructure systems, including, but not limited to, financial markets.
These scenarios illustrate the point that just one “infected” step along the AI agent lifecycle poses downstream dangers, be it a harmful prompt, a poisoned piece of data, or bad actors’ ability to manipulate agents by exploiting login credentials or installing malware on critical infrastructure networks.
As agents become the norm in enterprise software, a world with multi-agent scenarios and multimodal intelligence (e.g. AI that can interpret images, audio, video, etc.), on offense and defense, is not far off. In 2024, National University of Singapore researchers found that feeding a single “adversarial image into the memory of any randomly chosen agent is sufficient to achieve infectious jailbreak”––meaning that jailbreaking a single agent leads all other agents in a multi-agent environment to “become infected exponentially fast and exhibit harmful behaviors.” In a worst case scenario, as IST posits, a multi-agent malicious environment could materialize where agents train each other on harmful use cases, such as producing malware, breaching networks, and obfuscating code.
This is to say: safeguarding LLMs at the model layer offers insufficient protection from cybersecurity vulnerabilities and misuse in the AI agent era. Attackers can exploit agentic systems’ autonomous behaviors, and openness to external software and data, to “manipulate agents, misuse tools, or disrupt workflows,” Lakera research scientists wrote. Case in point: one of the chief “hijacking” attacks to generative AI systems, indirect prompt injection, makes vulnerable far more than just a single LLM when agents are involved: by calling external APIs, LLM-integrated applications and autonomous agents enable cyber attackers to manipulate enterprise data and tools outside the immediate AI system, harm the users of those data and tools, and/or compromise the AI system in such a way that it harms users, too (Greshake et al., 2023).
Finally, it is important to distinguish AI agent cybersecurity vulnerabilities by attack surface, illustrating the reality that indirect prompt injection, while more researched than others, does not necessarily represent the single most important AI agent threat vector. To that end, Deng and Guo et al. (2024) highlight four critical knowledge gaps in AI agent security, and their corresponding attack types:
Attack surfaces involving a single AI agent
User inputs can be unpredictable in AI agent interactions, especially when insufficient or malicious instructions trigger an agent to initiate a cascade of downstream reactions. Compare this attack surface to the agent’s perception. Attack types:
Prompt injection attack (e.g. when a user begins their input with "ignore the above prompt" to override the AI agent's original security instructions);
Jailbreak (e.g. when a user tricks an AI agent into ignoring its ethical guidelines).
Internal executions refer to threats involving the relationship between prompts and tools, or the agent’s chain-loop structure. Agents’ internal execution states are often implicit and hard to observe. Compare this attack surface to the agent’s brain, and occasionally, behavior. Attack types:
Backdoor attack (e.g. an LLM that operates normally with benign inputs, but produces malicious inputs when prompted with a “backdoor trigger”);
Hallucination (e.g. when LLMs generate incorrect or meaningless statements);
Misalignment (e.g. discrepancies between the agent’s human-intended function and intermediate executed state);
Tools use threat (e.g. when an AI agent requests excessive permissions to execute high-risk system commands without user approval); etc.
Attack surfaces involving untrusted external entities
Operational environments vary, which can lead to inconsistent behavioral outcomes—consider if an agent operates on a remote server or on a secure network. This attack surface includes “agent-to-environment” vulnerabilities. Attack types:
Physical environment threat [e.g. sensor vulnerabilities in hardware and internet-of-things (IoT) devices; Trojans (harmful inputs) hidden in information collected by hardware devices; etc.]
Simulated and sandbox environment threat (e.g. anthropomorphic attachment threats for users, misinformation and tailored persuasion threats, etc.);
Development and testing threat (e.g. flawed guardrails and model evaluation protocols); and
Computing resource management environment threat [e.g. inadequate isolation between agents in a shared environment, resource exhaustion attacks leading to denial of service (DoS) for other users, etc.].
Untrusted external entities: This attack surface includes “agent-to-agent” and “agent-to-memory” vulnerabilities. Attack types:
Indirect prompt injection attack (e.g. when an attacker embeds malicious instructions in a webpage that, when accessed by an AI agent, tricks it into extracting and revealing the user's conversation history);
Cooperative threat (e.g. infecting a single agent in a multi-agent environment);
Competitive interaction threat (e.g. agents that generate adversarial inputs to mislead other agents and degrade their performance);
Long-term memory threat [e.g. retrieval-augmented generation (RAG) applications that rely on vector databases with poisoned data samples];
Short-term memory threat (e.g. delays in memory synchronization, leading to asynchronization of memory, in multi-agent systems).
As AI agents grow in their reach in business and government, and surely begin to touch critical infrastructure use cases, it is not hard to imagine how AI agent misuse could lead to unauthorized access, data breaches, and disruption of essential services. AI agent and multi-agent software environments have the potential to manipulate markets; disrupt energy grids; and intercept, block, or exfiltrate sensitive communications.
Three types of AI cybersecurity risks to critical infrastructure
In April 2024, the U.S. Department of Homeland Security (DHS) developed safety and security guidelines for three types of cybersecurity risks to critical infrastructure [first identified by the Cybersecurity and Infrastructure Security Agency (CISA) in January 2024]. DHS cautions critical infrastructure owners and operators about:
Attacks using AI refers to AI’s role in automating, enhancing, planning, or scaling physical attacks on or cyber compromises of critical infrastructure, said DHS. Possible attack vectors include AI-enabled cyber compromises, automated physical attacks, supply chain disruptions, intellectual property theft and reverse engineering, weapon development, and AI-enabled social engineering (e.g. deepfakes and phishing attempts).
Attacks targeting AI systems refers to targeted attacks on AI systems supporting critical infrastructure, said DHS. Possible attack vectors include theft of critical infrastructure data, adversarial manipulation of AI algorithms, evasion attacks, and interruption of service attacks.
Failures in AI design and implementation stem from deficiencies or inadequacies in the planning, structure, implementation, execution, or maintenance of an AI tool or system leading to malfunctions or other unintended consequences that affect critical infrastructure operations, said DHS. Possible methods of design and implementation failure include excessive permissions or poorly defined operational parameters for AI systems; unintended AI system failure; limited interpretability; inconsistent system maintenance; and training on outdated, statistically biased, or incorrect datasets.
Given the breadth of cybersecurity risks to critical infrastructure in the agentic AI era, how can operators protect themselves and their customers?
5. Three startup solutions for securing critical infrastructure in the AI agent era
While U.S. and international policymakers race to understand frontier AI—and debate the degree to which they will regulate the industry—entrepreneurs have an opportunity to meet cutting-edge innovation and a novel threat landscape with commensurate cybersecurity solutions. In this section, I propose entrepreneurship in three promising cybersecurity startup categories as a means of safeguarding critical infrastructure in the agentic AI era.
I. Machine-to-machine (M2M) authentication
As AI agents become core to how work gets done, we need new ways to manage digital identities and access, ensuring that only authorized entities and agents can interact with systems and data, and that multi-agent workflows are secure and reliable.
The promise of machine-to-machine (M2M) authentication software is core to the next generation of identity and access management (IAM) infrastructure.
In “IDs for AI Systems” (Centre for the Governance of AI, 2024), Chan et al. argue that the best preparation for a world with ubiquitous AI interactions is to ascribe a unique string and attributes (e.g. the name of the deployer) to instances of AI systems (e.g. a particular chatbot session). Watermarks and CAPTCHAs, they argue, verify origin information and whether a human is involved. IDs for AIs, on the other hand, would contain information about particular AI systems—whom they belong to and who is using them, for instance. Think of it like assigning an aircraft’s tail number to each AI interaction, so its ownership, usage, incident, and maintenance history are all readily attributable.
A flexible framework could enable developers to assign IDs at different levels of granularity depending on the AI use case and risk profile. For example, assigning a user ID might make sense for each instance that a critical infrastructure provider uses AI to interface with widely impactful systems.
Already, as of January 2025, South et al. have proposed a framework for translating flexible, natural language permissions into auditable access control configurations for AI agents. They suggest operationalizing ideas in “IDs for AI Systems” by building a token-based authentication framework leveraging OpenID Connect and OAuth 2.0, two respected protocols for authentication and delegation. In multi-agent communication scenarios, the framework suggests mutual authentication as a means of preventing impersonation and other fraudulent behavior and misuse. Authenticated delegation, central to the proposal in South et al. (2025), ensures that it is not just AI systems who have IDs—third parties must also be able to verify that agents are acting on behalf of specific human users, and that they have been granted the necessary permissions to proceed. Finally, in response to existing challenges in limiting AI agents’ scope, the proposed framework would enable users to provide third parties with relevant metadata about an AI agent’s goals and scope limitations, ensuring a human-in-the-loop as a check and balance on agent behavior and performance.
In assigning IDs to AI systems and managing what data and systems machine identities have access to, M2M authentication software can prevent misuse in single and multi-agent scenarios. Broader directions for M2M could include a new kind of identity governance platform for the AI agent era—one that can track and control not just human users, but also AI agents that may spin up, connect to multiple services, and shut down in minutes. Security teams need a clear picture of who and what has access to their systems, whether they are employees, AI agents, or automated processes. By making these complex relationships visible and manageable, organizations can embrace AI automation while keeping their systems secure.
II. Predictive security and digital twins
Just as it is challenging to predict potential security failures or breaches in critical infrastructure systems, so too is it to test their technology without risking a harmful incident in itself. This is where predictive risk analyses can help.
In his 2024 PhD dissertation toward the University of Rome Tor Vergata’s program in computer science, control, and geoinformation, Matteo Esposito demonstrated LLMs’ capability to augment traditional security risk analysis, particularly in mission-critical sectors such as healthcare, aerospace, and finance. Esposito compared a fine-tuned LLM against seven human experts in conducting an effective preliminary security risk analysis (PSRA), where the objective was to identify, evaluate, and propose remediation across 141 representative samples from previously finalized risk analyses for an Italian civil and military security R&D company. Esposito’s fine-tuned model outperformed six out of seven human experts in accuracy metrics, number of errors, and evaluation time, and the errors the LLM made were less severe than those made by human counterparts, suggesting a fine-tuned model functions as an effective “copilot” and cost and time savings tool in critical infrastructure PSRA decision-making.
The implication for entrepreneurship, then, is that cybersecurity practitioners ought to consider developing commercial-grade predictive security tools for critical infrastructure sectors. LLMs clearly have the ability to study and recommend maintenance courses of action based on past case studies, just as they can be applied to help detect anomalies in real-time. Software developers should integrate Esposito’s findings with the growing requirements of AI agent threat surfaces and multi-agent environments.
In this vein, digital twins—accurate digital representations of sophisticated cyber-cyber and cyber-physical systems—present an opportunity to anticipate critical infrastructure system failures, simulate attacks and irregularities caused by malicious activities, and test defenses without risking actual assets. One such commercial application of LLM-enabled predictive security for critical infrastructure, then, is to build digital twins to simulate agentic AI threats in a controlled environment, supporting systems’ security resilience in the face of a rapidly evolving attack landscape.
III. AI-enabled formal verification and provable cybersecurity
As AI systems become more generally intelligent and capable, it becomes harder to ensure they operate within defined parameters, and to prevent unintended behaviors as a consequence. In addition to the default assumption of searching for bugs and fixing them, what if there were a methodology to demonstrate the absence of bugs? This is the goal of automated theorem-proving, or AI-enabled formal verification.
In “Provably Safe Systems,” Max Tegmark and Steve Omohundro (2023) argue that “mathematical proof is humanity’s most powerful tool for controlling AGIs,” and that it will soon be tenable to use advanced AI for formal verification—to prove the integrity of an AI system’s guardrails. Their rationale: it is not possible to prove an incorrect theorem, and “theorems can be cheaply and reliably checked,” either in advance or in real-time.
In practice, provable security depends on proof-carrying code (PCC)—generating code that meets a specification, and generating a proof that the code meets that specification. In the case of cybersecurity, for instance, Tegmark and Omohundro suggest writing a formal specification stating it is impossible to access a computer without valid credentials.
To build provable cybersecurity, the authors suggest leveraging ML to create both the code and the proof—almost like tapping into AI’s advanced capabilities to produce a digital lock.
Formal methods are already in use: they are considered the de facto standard for demonstrating safety of FAA-certified, flight-critical electronic technology. Historically, cost has been an obstacle to deploying formal verification methods—but AI is poised to facilitate widespread automated theorem-proving in years, not decades, wrote Evan Miyazono, founder and CEO of Atlas Computing, a provable safety AI architecture nonprofit.
Startups that build and commercialize AI-powered formal verification tools for securing critical infrastructure will transform natural language into software specifications, synthesize programs from those specifications, and prove that the synthesized programs meet those specifications.
6. The importance of proactive leadership from the startup community
From the birth of the internet to the smartphone revolution, entrepreneurs have consistently built and commercialized technologies that seemed like moonshots just years before. Securing critical infrastructure in the AI agent era represents a similar inflection point—one that demands the same level of innovative spirit and determined execution that has defined previous technological transformations.
As AI agents become increasingly embedded in enterprise software and critical infrastructure operations, we cannot afford to wait for policy frameworks to catch up with technological reality. The cybersecurity challenges outlined in this paper—from unauthorized access and data breaches to potential systemic failures—require immediate, innovative solutions from the private sector.
The three startup categories proposed here offer complementary approaches to securing critical infrastructure in the AI agent era. Machine-to-machine authentication provides the fundamental building blocks for trusted agent interactions, ensuring that only authorized entities can access sensitive systems and data. Digital twins enable organizations to simulate and prepare for emerging threats without risking actual assets, while offering a sandbox for testing new security measures. And formal verification, through proof-carrying code and automated theorem-proving, promises mathematical guarantees about AI systems' behavior and defenses—a crucial capability as these systems grow more complex and autonomous.
Startups are uniquely positioned to drive innovation in these areas. Unlike larger enterprises bound by legacy systems and lengthy decision-making processes, startups can move quickly, iterate rapidly, and take fresh approaches to emerging challenges.
However, success in this space requires more than just technical innovation. Startups pursuing these opportunities must navigate complex stakeholder relationships, from critical infrastructure operators and government agencies to AI platform providers and regulators. They need to build solutions that not only advance the technical state of the art but also fit within existing operational contexts and compliance frameworks.
The stakes could not be higher. As the January 2025 report on adversarial misuses of Google's Gemini illustrates, bad actors are already probing AI systems' vulnerabilities. The convergence of AI capabilities with critical infrastructure creates both unprecedented opportunities and risks. Entrepreneurs who can successfully address these challenges will not only build viable businesses but help ensure that the transformative potential of AI benefits society while securing our most essential systems.