Avivah Litan, Distinguished VP Analyst at Gartner, explains that AI agents, operating autonomously or semi-autonomously, expand the threat surface beyond traditional AI models, requiring robust controls to mitigate risks like data exposure, resource consumption, and unauthorized activities.
AI agents operate autonomously, semi-autonomously, or within multiagent systems, leveraging artificial intelligence to perceive, decide, act, and achieve goals in both digital and physical environments across a range of use cases (see Innovation Insight: AI Agents). Enterprises are already integrating or customizing products with AI agent capabilities, such as Microsoft Copilot Studio, Azure AI Studio, AWS Bedrock, and Google NotebookLM. Many existing products are also poised to enhance their generative AI features with increased automation.
However, AI agents bring new risks in addition to those associated with AI models and applications. Traditional AI models and applications typically have a limited threat surface, encompassing inputs, model processing and outputs, software vulnerabilities in the orchestration layer, and the hosting environments. In contrast, AI agents expand the threat surface to include the chain of events and interactions they initiate, which are inherently invisible to and uncontrollable by human or system operators.
Some of the risks and security threats introduced by AI agents include:
- Data Exposure or Exfiltration: Sensitive data can be exposed or exfiltrated anywhere along the chain of agent events.
- System Resource Consumption: Uncontrolled agent executions and interactions, whether benign or malicious, can lead to denial-of-service or denial-of-wallet scenarios that overload system resources.
- Unauthorized or Malicious Activities: Autonomous agents may perform unintended actions, including “agent hijacking” by malicious processes or humans.
- Coding Logic Errors: Coding errors introduced by AI agents, whether unauthorized, unintended, or malicious, can result in data breaches or other threats.
- Supply Chain Risk: Using libraries or code from third-party sites can introduce malware designed to target both non-AI and AI environments.
- Access Management Abuse: Embedding developer credentials into an agent’s logic, especially in low- or no-code development, can create significant access management risks (a brief sketch of this anti-pattern follows this list).
- Propagation of Malicious Code: Automated agent processing and retrieval-augmented generation (RAG) poisoning can trigger malicious actions.
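To make the access management risk above concrete, here is a minimal Python sketch contrasting the hardcoded-credential anti-pattern with runtime resolution of a scoped credential. The key literal and the AGENT_SCOPED_TOKEN variable name are hypothetical; a production setup would typically use a secrets manager rather than a bare environment variable.

```python
import os

# Anti-pattern: a developer credential baked into the agent's logic,
# as often happens in low- or no-code agent builders. The literal below
# is a hypothetical placeholder, not a real key format.
HARDCODED_KEY = "sk-live-example-do-not-ship"

# Safer pattern: resolve a short-lived, narrowly scoped credential at
# runtime, so the agent never carries a developer's standing access.
# AGENT_SCOPED_TOKEN is an illustrative environment variable name.
def get_agent_credential() -> str:
    token = os.environ.get("AGENT_SCOPED_TOKEN")
    if not token:
        raise RuntimeError("no scoped credential provisioned for this agent")
    return token
```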
Three Controls to Safeguard Against AI Agent Threats
Use homegrown tools, or third-party tools such as those from Apex or Zenity, to manage AI agent risks and fulfill the three main requirements: view and map all AI agent activity and information flows, detect and flag anomalous AI agent activity, and apply automatic real-time remediation where possible.
- View and Map All AI Agent Activity and Information Flows: Provide a comprehensive view and map of agent activities, processes, connections, data exposure, information flows, outputs, and responses generated by agents to detect anomalies and violations. Additionally, ensure support for an immutable audit trail of agent interactions (a minimal audit-trail sketch follows this list).
- Detect and Flag Anomalous AI Agent Activity: Flag AI agent activities that are anomalous or that violate preset enterprise policies. Once agent maps are populated and baselines of expected and intended activity are established, outlier transactions and activities should be detected. Rogue transactions should be auto-remediated where possible, because humans cannot scale the oversight and remediation needed given the speed and volume of agent interactions. Outliers that cannot be auto-remediated should be suspended and forwarded to a queue for human review and manual remediation (see the triage sketch after this list).
- Apply Automatic Real-Time Remediation: Remediation actions should include appropriate containment and mitigation measures. This involves redacting sensitive data as defined by the enterprise, such as personally identifiable information or confidential material like board memos, as it passes through agent subsystems. It is also crucial to enforce least-privilege access: block access if a violation is detected and cannot be remediated, and forward the issue for human review and resolution. Additionally, support deny lists of specific agent threat indicators gathered from threat intelligence and correlated with enterprise data, as well as deny and accept lists of the files and file types that coding assistants are disallowed or allowed to access and use, including files used in RAG to support agent workflows. Finally, implement a monitoring and feedback loop to identify unwanted actions resulting from inaccuracies (a redaction and deny-list sketch follows this list).
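As a companion to the first control, here is a minimal sketch of what an immutable audit trail of agent interactions could look like: an append-only log in which each record is hash-chained to its predecessor, so any after-the-fact edit is detectable. The class and field names are illustrative, not any vendor's API.

```python
import hashlib
import json
import time

class AgentAuditTrail:
    """Append-only, hash-chained record of agent events."""

    def __init__(self) -> None:
        self._records: list[dict] = []
        self._prev_hash = "0" * 64  # genesis value

    def append(self, agent_id: str, action: str, detail: dict) -> None:
        record = {
            "ts": time.time(),
            "agent_id": agent_id,
            "action": action,
            "detail": detail,
            "prev_hash": self._prev_hash,  # links record to its predecessor
        }
        record["hash"] = self._digest(record)
        self._records.append(record)
        self._prev_hash = record["hash"]

    def verify(self) -> bool:
        """Recompute the chain; False means the trail was altered."""
        prev = "0" * 64
        for rec in self._records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            if rec["prev_hash"] != prev or self._digest(body) != rec["hash"]:
                return False
            prev = rec["hash"]
        return True

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

trail = AgentAuditTrail()
trail.append("invoice-agent", "tool_call", {"tool": "send_email"})
assert trail.verify()
```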
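For the second control, the following triage sketch assumes a hypothetical per-agent baseline of allowed actions and a simple rate threshold, derived from the activity map above. Outliers are routed to automatic remediation where possible and to a human-review queue otherwise; a real baseline would be learned from observed behavior and enterprise policy rather than hardcoded.

```python
from dataclasses import dataclass

# Hypothetical baseline: expected actions and a rate limit per agent.
BASELINE = {
    "invoice-agent": {
        "allowed_actions": {"read_invoice", "send_email"},
        "max_calls_per_minute": 30,
    },
}

@dataclass
class AgentEvent:
    agent_id: str
    action: str
    calls_last_minute: int

def classify(event: AgentEvent) -> str:
    """Triage an event as 'ok', 'auto_remediate', or 'human_review'."""
    profile = BASELINE.get(event.agent_id)
    if profile is None:
        return "human_review"    # unknown agent: suspend and queue for review
    if event.action not in profile["allowed_actions"]:
        return "human_review"    # policy violation that needs a person
    if event.calls_last_minute > profile["max_calls_per_minute"]:
        return "auto_remediate"  # rate outlier: throttle without human help
    return "ok"

print(classify(AgentEvent("invoice-agent", "delete_records", 3)))  # human_review
```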
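For the third control, here is a sketch of two of the remediation primitives described above: redaction of sensitive data in transit between agent subsystems, and a deny list of file types that coding assistants may not touch. The regex (US Social Security numbers) and the extensions shown are placeholders for enterprise-defined policy and correlated threat intelligence.

```python
import re

# Hypothetical policy inputs.
DENIED_EXTENSIONS = {".pem", ".env", ".key"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def enforce_file_policy(path: str) -> None:
    """Block agent access to deny-listed files (least-privilege enforcement)."""
    if any(path.endswith(ext) for ext in DENIED_EXTENSIONS):
        raise PermissionError(f"agent access to {path} is denied by policy")

def redact(text: str) -> str:
    """Redact sensitive data as it passes through agent subsystems."""
    return SSN_PATTERN.sub("[REDACTED-SSN]", text)

print(redact("Customer SSN is 123-45-6789."))  # Customer SSN is [REDACTED-SSN].
```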