The rise of agents in AWS and their impact on cloud operations

CLOUD & DEVOPS

13.10.2025


Contributors

Rosina Garagorry

Cloud & DevOps Studio Leader

Marcelo Torterolo

DevOps Analyst

Mariano Perin

DevOps Analyst


Amazon Web Services (AWS) has introduced various types of agents to automate tasks, improve observability and enable the next generation of artificial‑intelligence applications. These agents range from traditional components such as the AWS Systems Manager Agent and the unified CloudWatch Agent to the new family of Amazon Bedrock Agents designed for generative AI. This article surveys the different types of agents available in AWS, their capabilities and the precautions necessary to use them safely.


What do we mean by “agent” in AWS?


In the AWS context, an agent is a piece of software that acts on behalf of a user or service to perform specific tasks. Several categories can be distinguished:


Management and observability agents. These processes run on EC2 instances, on‑premises servers or virtual machines to enable remote administration, metric collection and application monitoring. Examples include the SSM Agent of AWS Systems Manager and the Unified CloudWatch Agent, which sends system metrics, logs and custom data to CloudWatch.


Data transfer agents. They run in on‑premises environments or other clouds to move large volumes of information to and from AWS. AWS DataSync uses an agent packaged as a virtual machine to transfer data quickly between on‑premises storage and services such as Amazon S3, EFS or FSx; tutorials show how to connect an NFS server on EC2 and move data to S3.


Generative‑AI agents. The new family of Amazon Bedrock Agents relies on foundation models to automate high‑level tasks, from interacting with internal applications to coordinating complex workflows. These capabilities have been expanded with Amazon Bedrock AgentCore, a platform that abstracts the agent’s runtime, memory and identity to make it easier to build and operate agents at scale.


Amazon Bedrock AgentCore: deploying AI agents at scale


The most notable leap in the use of agents in AWS occurred during the AWS Summit New York 2025, when Amazon unveiled Amazon Bedrock AgentCore.

Its key services include:

Runtime with session isolation. The AgentCore Runtime supports interactive, low-latency workloads as well as asynchronous flows of up to eight hours, with full isolation between sessions.
Memory and context. To behave coherently, an agent needs a memory that combines short- and long-term context. AgentCore Memory provides both, so developers can build agents that keep track of task state.
Identity and authentication. Agents must access tools and data on behalf of the user. AgentCore Identity integrates the agent with identity providers like Amazon Cognito, Microsoft Entra ID or Okta, simplifying authentication and permission handling.
Gateway and tools. The AgentCore Gateway exposes API functions, Lambda functions and existing services as tools for agents. The Browser Tool offers a secure cloud browser to interact with websites at scale, while the Code Interpreter runs code in isolated environments.
Observability. AgentCore Observability uses Amazon CloudWatch to trace every action of the agent and provide real-time telemetry. This is essential to detect deviations and optimize agent behavior in production.


Amazon Bedrock also supports multi-agent collaboration. This capability lets multiple specialized agents coordinate under a supervisor agent, which suits complex tasks such as financial analysis. For example, a financial institution can create agents specializing in macroeconomics, industry trends and risk evaluation; the supervisor agent splits the work, routes each part to the right specialist and synthesizes the results into a final report. Companies like Moody's already use this feature to deliver faster, more accurate risk assessments.
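The supervisor pattern described above can be sketched in plain Python. This is only an illustration of the routing idea, not the Bedrock API: the specialist functions, the `SPECIALISTS` registry and `supervise` are hypothetical names, and each function stands in for a separately configured agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Finding:
    specialist: str
    summary: str


# Hypothetical specialists: each one stands in for a separately configured agent.
def macro_agent(question: str) -> str:
    return f"macro view on '{question}'"


def industry_agent(question: str) -> str:
    return f"industry trends for '{question}'"


def risk_agent(question: str) -> str:
    return f"risk evaluation of '{question}'"


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "macro": macro_agent,
    "industry": industry_agent,
    "risk": risk_agent,
}


def supervise(question: str, specialists: Dict[str, Callable[[str], str]] = SPECIALISTS) -> str:
    """Route the question to every specialist, then merge the findings into one report."""
    findings = [Finding(name, agent(question)) for name, agent in specialists.items()]
    lines = [f"- [{f.specialist}] {f.summary}" for f in findings]
    return "Report:\n" + "\n".join(lines)


if __name__ == "__main__":
    print(supervise("ACME Corp credit outlook"))
```

In a real deployment the supervisor would typically decide which specialists to call and in what order; here it simply fans out to all of them to keep the sketch short.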


Reducing hallucinations and model distillation


The adoption of AI agents also presents challenges. One of the most critical is hallucination: answers that sound plausible but are fabricated. At AWS re:Invent 2024, AWS announced Automated Reasoning checks for Bedrock, a safeguard that uses logic-based verification to check model responses against rules the customer defines. These checks help prevent factual errors and make generative AI more viable in regulated sectors such as healthcare or finance. Amazon Bedrock Guardrails integrates these checks and offers auditing of responses so that customers can require a model to comply with specific rules and policies.


Another innovation presented was Model Distillation, which transfers knowledge from a large model to a smaller, more efficient one, reducing latency and cost. AWS reports that distilled models can be up to 500% faster and up to 75% cheaper to run, with less than 2% loss of accuracy for use cases such as retrieval-augmented generation (RAG). Companies like Robin AI use this technique to generate legal answers with high precision and lower cost.
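To make these headline figures concrete, the arithmetic can be sketched as follows. The function name and the baseline numbers are hypothetical, and the sketch assumes that "X% faster" means throughput rises by X%, so latency is divided by (1 + X/100); the figures quoted above do not say which reading is intended, and they are upper bounds rather than guarantees.

```python
def distilled_estimate(base_cost, base_latency_ms, cost_reduction=0.75, speedup_pct=500.0):
    """Best-case cost and latency of a distilled model under the stated assumptions.

    cost_reduction: fraction of the original cost saved (0.75 means 75% cheaper).
    speedup_pct:    throughput increase in percent (assumed reading of "500% faster").
    """
    cost = base_cost * (1.0 - cost_reduction)
    latency_ms = base_latency_ms / (1.0 + speedup_pct / 100.0)
    return cost, latency_ms


if __name__ == "__main__":
    # Hypothetical baseline: 100 cost units per month and 600 ms per request.
    cost, latency_ms = distilled_estimate(100.0, 600.0)
    print(f"cost={cost}, latency_ms={latency_ms}")
```

Under the other common reading, where "500% faster" means five times the speed, the latency divisor would be 5 instead of 6, so the estimate is sensitive to how the claim is interpreted.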


The role of traditional agents: SSM Agent and CloudWatch Agent


Although AI agents attract attention, traditional agents remain pillars of AWS infrastructure. The AWS Systems Manager Agent (SSM Agent) runs on EC2 instances, on-premises servers or virtual machines to allow Systems Manager to perform operations such as executing commands and collecting inventory.

However, this power can be exploited. Security researchers have shown that an attacker who gains access to an instance with the agent installed could repurpose it as a remote access Trojan. The SSM Agent is open source, digitally signed by Amazon and often pre-installed on Amazon Linux, SUSE Linux, macOS and Windows Server images. Because many security solutions trust it, its activity often goes unflagged by antivirus and EDR tools. An attacker who compromises such an instance could register the agent with an AWS account they control and run commands clandestinely. Researchers therefore recommend using VPC endpoints to restrict which accounts the agent can receive commands from, and monitoring agent logs.
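One way to express the VPC endpoint restriction is an endpoint policy that only allows principals from your own account. The sketch below builds such a policy document as a Python dictionary; `ssm_endpoint_policy` and the account ID are hypothetical, and the broad `"Action": "*"` is an assumption made for brevity that you would likely narrow in practice. `aws:PrincipalAccount` is a standard IAM global condition key.

```python
import json


def ssm_endpoint_policy(account_id: str) -> dict:
    """Build a VPC endpoint policy that admits only principals from one AWS account."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "*",  # assumption: kept broad for brevity
                "Resource": "*",
                # Requests made with credentials from any other account are not allowed.
                "Condition": {"StringEquals": {"aws:PrincipalAccount": account_id}},
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID.
    print(json.dumps(ssm_endpoint_policy("123456789012"), indent=2))
```

The resulting JSON would be attached to the interface endpoints the agent uses to reach Systems Manager, so that an agent re-registered to an attacker's account cannot talk to the service through your VPC.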


The Unified CloudWatch Agent is another key component. Beyond sending system logs, it collects metrics such as memory and disk usage from Windows and Linux operating systems on EC2 instances and on-premises servers, as well as custom metrics via StatsD and collectd. This flexibility makes it central to observability, capacity management and automation via alarms and events.
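As an illustration, the sketch below generates a small agent configuration file for a Linux host that collects memory and disk usage, listens for StatsD metrics and ships one log file. The function name, log path and log group are hypothetical; the keys follow the agent's JSON configuration format as I understand it, so check them against the current schema before deploying.

```python
import json


def build_agent_config(log_path: str, log_group: str, interval: int = 60) -> dict:
    """Return a minimal unified CloudWatch agent configuration for a Linux host."""
    return {
        "agent": {"metrics_collection_interval": interval},
        "metrics": {
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["*"]},
                # Custom application metrics sent over the StatsD protocol.
                "statsd": {"service_address": ":8125"},
            }
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": log_path,
                            "log_group_name": log_group,
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical log path and log group name.
    print(json.dumps(build_agent_config("/var/log/messages", "app-logs"), indent=2))
```

Generating the file from code makes it easier to keep one template under version control and vary only the log paths and intervals per fleet.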


Data transfer agents: AWS DataSync


For migrations and data exchange across hybrid environments, the key service is AWS DataSync. It is designed to move large volumes of data quickly and securely, allowing online data transfer between on-premises storage and AWS services such as Amazon S3, EFS or FSx.
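A transfer from an on-premises NFS share to S3 could be scripted roughly as below. This is a sketch under assumptions: `plan_nfs_to_s3` and `start_transfer` are hypothetical helpers, all ARNs and host names are placeholders, and `client` is expected to behave like `boto3.client("datasync")`, whose `create_location_nfs`, `create_location_s3`, `create_task` and `start_task_execution` operations are used here.

```python
def plan_nfs_to_s3(agent_arn, nfs_host, nfs_path, bucket_arn, role_arn):
    """Collect the request parameters for the source and destination locations."""
    return {
        "source": {
            "ServerHostname": nfs_host,
            "Subdirectory": nfs_path,
            # The DataSync agent deployed next to the NFS server reads the share.
            "OnPremConfig": {"AgentArns": [agent_arn]},
        },
        "destination": {
            "S3BucketArn": bucket_arn,
            # IAM role that DataSync assumes to write to the bucket.
            "S3Config": {"BucketAccessRoleArn": role_arn},
        },
    }


def start_transfer(client, plan, name="nfs-to-s3"):
    """Create both locations and a task, then start one execution of the task."""
    src = client.create_location_nfs(**plan["source"])["LocationArn"]
    dst = client.create_location_s3(**plan["destination"])["LocationArn"]
    task = client.create_task(
        SourceLocationArn=src, DestinationLocationArn=dst, Name=name
    )["TaskArn"]
    return client.start_task_execution(TaskArn=task)["TaskExecutionArn"]
```

Keeping the parameter planning separate from the API calls means the plan can be reviewed or unit-tested with a stub client before anything is created in an AWS account.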

Security and governance best practices


Using agents requires considering both the operational benefits and security risks. Key recommendations include:

Limit agent privileges and access. Configure IAM policies with minimal privileges and use specific roles so that agents can only execute necessary actions. In the case of the SSM Agent, restricting communication through VPC endpoints prevents a hijacked agent from receiving commands from unauthorized accounts.


Control installation and life cycle. Regularly check which instances have agents installed and update them to supported versions. The unified CloudWatch Agent simplifies management because it consolidates metric and log collection.


Enable observability and auditing. Use CloudWatch Logs and Amazon EventBridge (formerly CloudWatch Events) to raise alerts when an agent changes state or connects to external accounts. For AI agents, take advantage of AgentCore Observability to trace each action of the agent.


Implement quality controls in generative AI. When creating agents on Bedrock, apply Guardrails and Automated Reasoning checks to reduce hallucinations. Also consider model distillation to reduce costs and latency without sacrificing accuracy.


Conclusions


The AWS agent ecosystem is evolving rapidly. Traditional agents such as the SSM Agent or CloudWatch Agent remain essential for managing and observing hybrid infrastructures. However, the emergence of Amazon Bedrock Agents and the AgentCore platform takes the concept of agents to a new level, where generative AI can reason, plan and act on behalf of users. These innovations allow complex workflows to be coordinated by multiple agents, and mitigate hallucinations through formal reasoning techniques.


To fully leverage these agents, it is crucial to combine their power with sound practices in security, auditing and privilege management. Only then can we benefit from automation and intelligence in AWS while minimizing risks and maintaining control over our applications and data.


