Learn how to effectively use AWS Systems Manager to automate and manage your AWS infrastructure. Part 1 covers the fundamentals, key concepts, and core features of AWS Systems Manager.
In this section, we will explore AWS Systems Manager (SSM), what it does, its key components, and some of its common use cases. This guide will help you understand the importance of SSM in managing your AWS infrastructure, whether you’re a beginner or an experienced AWS user.
AWS Systems Manager (SSM) is a fully managed service designed to automate and manage tasks on your EC2 instances, on-premises servers, and other infrastructure resources. It helps streamline operational tasks such as patching, configuration management, automation, and more, all from a single interface. This service provides you with the tools to securely manage resources at scale.
SSM simplifies management and operation by providing centralized control over your infrastructure. Whether you’re managing just a few EC2 instances or thousands, SSM can automate tasks like patching, configuration changes, and compliance checks. It integrates seamlessly with other AWS services, enabling you to monitor and manage resources efficiently.
Example: Imagine you’re responsible for keeping multiple EC2 instances up-to-date with the latest security patches. Instead of manually logging into each instance and updating them, you can use SSM to automate the process across all instances, saving you time and ensuring consistency.
AWS Systems Manager consists of several key components that help you automate and manage different aspects of your infrastructure. Here’s an overview of the most important components:
Managed Instances: These are the EC2 instances (or on-premises servers) that are managed through SSM. To be managed, they need the SSM Agent installed, IAM permissions that let the agent call Systems Manager, and network connectivity to the service.
(How do we know if an instance is managed?)
Managed instances are those that have the SSM Agent installed and can communicate with the Systems Manager service. If the instance is registered with SSM, you can manage it through the AWS Management Console or AWS CLI.
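One way to answer that question from the command line is to ask Systems Manager whether the instance has registered. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials and a region; the function names node_ping_status and is_managed are hypothetical helpers, not part of the CLI.

```shell
# Sketch: check whether an instance is registered with Systems Manager.
# node_ping_status and is_managed are hypothetical helper names.

node_ping_status() {
  local instance_id="$1"
  # PingStatus is "Online" when the SSM Agent has checked in recently.
  # With --output text, an unregistered instance yields "None".
  aws ssm describe-instance-information \
    --filters "Key=InstanceIds,Values=${instance_id}" \
    --query "InstanceInformationList[0].PingStatus" \
    --output text
}

is_managed() {
  # Succeeds (exit status 0) only when the agent reports Online.
  [ "$(node_ping_status "$1")" = "Online" ]
}
```

For example, is_managed i-0123456789abcdef0 && echo "managed" prints "managed" only when the agent on that instance is online.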
SSM Agent: The SSM Agent is installed on EC2 instances and on-premises servers to enable them to communicate with AWS Systems Manager. The agent helps execute commands and automate tasks remotely.
Parameter Store: A secure storage for configuration data and secrets, like API keys, passwords, and other sensitive information. It ensures that your configurations and secrets are stored securely and can be accessed programmatically.
Run Command: Allows you to remotely execute commands on managed instances without needing SSH or RDP access. You can run scripts, check logs, or install software across multiple instances.
Automation: Provides pre-built runbooks and the ability to create custom automation workflows. These workflows can help you automate tasks like patching, backups, and disaster recovery processes.
State Manager: Automates the process of applying configurations to managed instances and ensuring they remain compliant. For example, you can ensure all instances have a specific security setting applied.
Session Manager: Allows you to securely connect to your EC2 instances without needing SSH or RDP, providing a secure, auditable way to access instances.
AWS Systems Manager is widely used across various operational tasks. Below are some common use cases where you can apply SSM effectively:
Automating Instance Configuration: Automatically configure EC2 instances or on-premises servers with your preferred settings. For example, you can ensure that all instances have a specific set of software installed or that a security patch is applied.
Patching EC2 Instances: Automatically apply patches to EC2 instances and on-premises servers, helping you maintain security compliance without manually logging into each instance.
Managing Secrets and Configuration Data: Store sensitive information like database credentials and API keys securely in the Parameter Store, and access them dynamically when needed by applications or other services.
Remote Command Execution: Use the Run Command feature to execute commands remotely across many instances. This is especially useful when you need to install software or run maintenance tasks across a fleet of instances.
Centralized Management: With Systems Manager, you can manage your infrastructure from a single dashboard, making it easier to monitor and automate tasks at scale.
When we talk about automating tasks, we mean using Systems Manager features like Automation, Run Command, and State Manager to perform repetitive tasks automatically, without requiring manual intervention. For example, you can schedule an automation runbook to check if your EC2 instances are up-to-date with the latest patches and install the missing ones automatically. This saves you time and reduces human error.
To run a simple command on an EC2 instance using SSM, you would use the aws ssm send-command command. Here’s an example of how to run a command that checks the disk space on a Linux EC2 instance:
aws ssm send-command \
--instance-ids i-0123456789abcdef0 \
--document-name "AWS-RunShellScript" \
--parameters 'commands=["df -h"]'
Explanation:
send-command: Sends a command to run on one or more managed instances.
--instance-ids: The ID of the instance you want to run the command on.
--document-name: Specifies which SSM document (pre-defined or custom) to use. In this case, AWS-RunShellScript runs a shell script.
--parameters: The script or command to run. In this example, it runs df -h, which shows the disk usage on a Linux instance.
Outcome: The command is executed on the specified EC2 instance. The CLI call itself returns a command ID rather than the output; you can view the output (disk space usage) in the Systems Manager console or retrieve it with aws ssm get-command-invocation. This allows you to automate tasks without logging into each instance manually.
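Because send-command is asynchronous, a script usually has to wait for the invocation and then fetch its output. Here is a rough sketch of that flow, assuming a configured AWS CLI and a simple command string without double quotes; run_and_fetch is a hypothetical helper name.

```shell
# Sketch: send a shell command to one instance, wait, and print its stdout.
# run_and_fetch is a hypothetical helper; shell_cmd must not contain
# double quotes because it is embedded in a JSON string below.

run_and_fetch() {
  local instance_id="$1" shell_cmd="$2" command_id

  # send-command returns immediately; keep only the command ID.
  command_id="$(aws ssm send-command \
    --instance-ids "$instance_id" \
    --document-name "AWS-RunShellScript" \
    --parameters "{\"commands\":[\"${shell_cmd}\"]}" \
    --query "Command.CommandId" \
    --output text)" || return 1

  # The waiter polls until the invocation succeeds; it exits non-zero
  # if the command fails or the waiter times out.
  aws ssm wait command-executed \
    --command-id "$command_id" --instance-id "$instance_id" || return 1

  # Print the captured standard output of the command.
  aws ssm get-command-invocation \
    --command-id "$command_id" --instance-id "$instance_id" \
    --query "StandardOutputContent" --output text
}
```

Calling run_and_fetch i-0123456789abcdef0 "df -h" would print the disk usage report from that instance.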
In this section, we will dive deeper into key concepts within AWS Systems Manager (SSM). Understanding these foundational concepts will help you use SSM more effectively to manage your AWS resources and automate common tasks.
AWS Systems Manager (SSM) consists of several important resources that help you automate tasks and manage your infrastructure. Let’s take a look at the most commonly used resources in SSM:
Managed Instances: These are EC2 instances (or on-premises servers) that are running the SSM Agent and are registered with AWS Systems Manager. Managed instances are the primary resources that SSM interacts with.
(What does it mean for an instance to be “managed” by SSM?) Managed instances are those that are set up to allow AWS Systems Manager to interact with them. This means that the SSM Agent is installed on the instance, and it can execute commands, apply configurations, and automate tasks. A key benefit of having managed instances is that you don’t need to manually log in to each instance to perform administrative tasks.
Documents (SSM Documents): These are JSON or YAML files that define the actions or steps that AWS Systems Manager should take on a managed instance. Common examples include AWS-RunShellScript, which runs a shell script on an instance, and AWS-RunPatchBaseline, which scans for or installs patches on instances.
(What is an SSM document and why is it important?)
An SSM document is like a recipe that tells AWS Systems Manager how to execute specific tasks. For example, if you want to run a shell script on your instances, you’d use the AWS-RunShellScript document. These documents help automate and standardize the management of instances.
Parameter Store: This is a service within Systems Manager that securely stores configuration values, secrets, and other sensitive information. You can store things like database passwords, API keys, and other credentials here. These parameters can be encrypted for additional security.
(Why should you use Parameter Store?) Parameter Store helps you securely manage sensitive information like passwords or configuration settings that your applications might need. It is integrated with AWS services and allows you to easily reference these values within your scripts and automations.
Run Command: This feature allows you to remotely execute commands on managed instances without needing to log into them directly via SSH or RDP. This is particularly useful for tasks like applying patches or running maintenance scripts across multiple instances at once.
(How is Run Command useful?) Run Command allows you to perform administrative tasks on EC2 instances or on-premises servers remotely. You can execute scripts to check system health, install software, or perform diagnostics—all without needing to access each server individually.
The SSM Agent is a crucial component of AWS Systems Manager. It runs on your EC2 instances and on-premises servers and allows them to communicate with AWS Systems Manager.
It is an agent that is installed on every instance or server that you want to manage with AWS Systems Manager. The agent allows Systems Manager to execute commands on the instance and retrieve information about the instance’s configuration, health, and more.
Example: If you need to execute a command like installing a package or running a script, the SSM Agent communicates with AWS Systems Manager to pass the commands to the instance and send back the results.
EC2 Instances (Amazon Linux 2 or Ubuntu): The SSM Agent is pre-installed on many AWS-provided AMIs, including Amazon Linux 2 and Ubuntu Server. However, if it’s missing or needs to be updated, you can install it using the following commands:
For Amazon Linux 2:
sudo yum install -y amazon-ssm-agent
sudo systemctl start amazon-ssm-agent
sudo systemctl enable amazon-ssm-agent
For Ubuntu:
sudo snap install amazon-ssm-agent --classic
sudo systemctl start snap.amazon-ssm-agent.amazon-ssm-agent.service
sudo systemctl enable snap.amazon-ssm-agent.amazon-ssm-agent.service
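Verify SSM Agent is Running: After installing, you can verify that the agent is running with the following command (on Ubuntu snap installs, use the service name snap.amazon-ssm-agent.amazon-ssm-agent.service instead):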
sudo systemctl status amazon-ssm-agent
Outcome:
If the agent is running correctly, it will show as “active (running)”. The instance can then communicate with AWS Systems Manager, provided it also has the required IAM permissions and network access to the service.
In AWS, IAM roles and policies control access to your resources. For AWS Systems Manager to interact with EC2 instances, the instances need an IAM role that grants them the necessary permissions.
To allow EC2 instances to use AWS Systems Manager, you must create an IAM role and attach it to your instance. The role should include the necessary permissions to interact with Systems Manager.
Create the Role:
Attach the Role to an EC2 Instance:
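The two steps above can be sketched with the AWS CLI as follows. This is an illustration under stated assumptions rather than the only way to do it: it assumes a configured AWS CLI with IAM and EC2 permissions, uses the AWS managed policy AmazonSSMManagedInstanceCore, and reuses the role name as the instance profile name. The function name create_ssm_instance_role is hypothetical.

```shell
# Sketch: create an instance role for Systems Manager and attach it.
# create_ssm_instance_role is a hypothetical helper name.

create_ssm_instance_role() {
  local role_name="$1" instance_id="$2"
  # Trust policy: lets the EC2 service assume the role for the instance.
  local trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  # Create the role and grant the permissions the SSM Agent needs.
  aws iam create-role --role-name "$role_name" \
    --assume-role-policy-document "$trust_policy" || return 1
  aws iam attach-role-policy --role-name "$role_name" \
    --policy-arn "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" || return 1

  # EC2 attaches roles through an instance profile that wraps the role.
  aws iam create-instance-profile \
    --instance-profile-name "$role_name" || return 1
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$role_name" --role-name "$role_name" || return 1

  # Attach the profile to the running instance.
  aws ec2 associate-iam-instance-profile --instance-id "$instance_id" \
    --iam-instance-profile "Name=${role_name}"
}
```

A newly created instance profile can take a few seconds to become visible to EC2, so the final call may need a short retry in practice.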
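To perform actions using AWS Systems Manager (like sending commands or automating tasks), the IAM identity making the call must also have specific permissions. For the instance itself, the AWS managed policy AmazonSSMManagedInstanceCore covers the permissions the SSM Agent needs. Actions such as ssm:SendCommand are granted to the user or role that manages the instances.
Example IAM Policy for a User or Role That Manages Instances with SSM: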
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:SendCommand",
        "ssm:DescribeInstanceInformation",
        "ssm:GetParameters"
      ],
      "Resource": "*"
    }
  ]
}
(Why do IAM roles matter in Systems Manager?)
IAM roles define what AWS resources can interact with Systems Manager and what actions they can perform. Without the proper IAM roles and permissions, your instances won’t be able to communicate with Systems Manager or perform automated tasks.
AWS Systems Manager (SSM) offers a suite of powerful tools that simplify the management and automation of your AWS infrastructure. These tools are essential for improving operational efficiency, ensuring compliance, and automating common tasks. In this section, we’ll explore some of the core features and functions of AWS Systems Manager, including Parameter Store, Run Command, State Manager, and Automation.
AWS SSM Parameter Store is a secure storage solution within AWS Systems Manager. It allows you to store configuration data, secrets, and other sensitive information that your applications and services need to function securely.
AWS SSM Parameter Store is part of AWS Systems Manager that provides a centralized service to store and retrieve key-value pairs. These values can include configuration settings, passwords, and other sensitive data. It allows you to manage these parameters across different environments in a secure manner.
(Why should you use Parameter Store?) Parameter Store is used because it simplifies the management of sensitive data. For example, instead of hardcoding passwords into your application code, you can store them in Parameter Store and access them securely at runtime. This reduces security risks and ensures that your credentials are not exposed.
String Parameters: These are simple key-value pairs, used for storing non-sensitive information.
Example: You might use String parameters to store configuration values like database hostnames, API endpoints, etc.
aws ssm put-parameter --name "DatabaseHost" --value "db.example.com" --type "String"
This stores a simple configuration parameter called DatabaseHost.
SecureString Parameters: These are used for storing sensitive information like passwords or API keys. The values are encrypted at rest using AWS KMS (Key Management Service).
Example: You might store an API key as a SecureString parameter:
aws ssm put-parameter --name "ApiKey" --value "your-api-key-here" --type "SecureString" --key-id "alias/aws/ssm"
This stores an API key securely, ensuring it is encrypted and access-controlled.
You can retrieve parameters from Parameter Store using AWS CLI or SDK.
Example (CLI): To retrieve the parameter we stored earlier, use this command:
aws ssm get-parameter --name "DatabaseHost"
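Outcome: This command outputs a JSON description of the parameter, including its value, allowing your application to dynamically load configuration settings without needing to hardcode them. For SecureString parameters, add --with-decryption to get the plaintext value instead of the encrypted one.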
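In a script you usually want only the value, not the surrounding JSON. A small sketch of that, assuming a configured AWS CLI; get_param is a hypothetical helper name.

```shell
# Sketch: print just the value of a Parameter Store parameter.
# get_param is a hypothetical helper name.

get_param() {
  local name="$1"
  # --with-decryption returns plaintext for SecureString parameters
  # and is ignored for String parameters.
  aws ssm get-parameter --name "$name" --with-decryption \
    --query "Parameter.Value" --output text
}
```

For example, DB_HOST="$(get_param DatabaseHost)" would load the hostname stored earlier into a shell variable.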
Run Command is a powerful feature in AWS Systems Manager that allows you to run commands remotely on your managed instances (EC2, on-premises, etc.) without needing SSH or RDP access.
Run Command lets you run scripts or commands on multiple instances at once, making it much easier to perform maintenance tasks or automate repetitive activities across your environment.
(What is the benefit of using Run Command?) The main benefit of Run Command is that it eliminates the need to manually SSH into each server to run commands. You can execute commands on hundreds of instances simultaneously, which saves time and reduces human error.
You can execute a wide variety of commands using Run Command, such as shell scripts, PowerShell commands, or AWS CLI commands.
Example: If you want to check disk space on an EC2 instance, you can use the following command. To reach many instances at once, list several IDs in Values or target a tag instead:
aws ssm send-command --document-name "AWS-RunShellScript" --targets "Key=instanceIds,Values=i-1234567890abcdef0" --parameters 'commands=["df -h"]'
Outcome:
This command runs the df -h command on the specified EC2 instance, which shows disk usage in a human-readable format. You can see the output of the command in the SSM console or via CLI.
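To run the same command across a fleet, targets can select instances by tag rather than by ID. The sketch below assumes a configured AWS CLI and instances tagged with a key and value of your choosing; run_on_tagged is a hypothetical helper name, and the concurrency and error limits are illustrative values.

```shell
# Sketch: run a shell command on every managed instance with a given tag.
# run_on_tagged is a hypothetical helper; shell_cmd must not contain
# double quotes because it is embedded in a JSON string below.

run_on_tagged() {
  local tag_key="$1" tag_value="$2" shell_cmd="$3"
  # --max-concurrency limits how many instances run the command at once;
  # --max-errors stops sending to further instances after that many failures.
  aws ssm send-command \
    --document-name "AWS-RunShellScript" \
    --targets "Key=tag:${tag_key},Values=${tag_value}" \
    --parameters "{\"commands\":[\"${shell_cmd}\"]}" \
    --max-concurrency "10" \
    --max-errors "1" \
    --query "Command.CommandId" --output text
}
```

For example, run_on_tagged Environment Production "df -h" prints the command ID for a run against all instances tagged Environment=Production.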
State Manager allows you to define and enforce the configuration state of your instances. It helps you automate the process of ensuring your EC2 instances are always in the desired configuration state.
State Manager is useful for automating tasks like patch management, software installation, or security updates on your instances.
(Why use State Manager?) State Manager ensures that your instances are always in the right configuration without needing manual intervention. It reapplies each association on the schedule you define and reports whether every targeted instance is compliant.
You can configure State Manager to automatically apply patches to your EC2 instances using predefined SSM documents.
Example Command:
To apply patches automatically, you could set up a State Manager association with the AWS-RunPatchBaseline document (AWS-ApplyPatchBaseline is a legacy document that supports only Windows Server).
aws ssm create-association --name "AWS-RunPatchBaseline" --targets "Key=InstanceIds,Values=i-1234567890abcdef0" --parameters "Operation=Install" --schedule-expression "cron(0 2 ? * SUN *)"
Outcome: This association installs missing patches on the EC2 instance every Sunday at 2:00 AM UTC, according to the patch baseline that applies to it, so your instances stay up-to-date without manual intervention.
State Manager can also be used to check compliance by ensuring that configurations or patches are applied according to organizational policies.
Example: You can create an association that ensures specific software or settings (like antivirus) are installed and running on all your managed instances.
Automation in AWS Systems Manager allows you to automate complex workflows and tasks. This is especially useful for automating recurring maintenance tasks and handling operational procedures in a structured and repeatable manner.
Automation allows you to create and execute runbooks, which are a series of predefined steps that can be executed on-demand or on a schedule. This can simplify routine tasks like EC2 instance recovery or patching.
(What is a runbook in Automation?) A runbook is essentially a set of instructions that describe how to automate a task. These instructions can include a series of AWS Systems Manager actions, like invoking Lambda functions, running shell scripts, or interacting with other AWS services.
Creating a Runbook:
Running the Runbook: Once your runbook is created, you can execute it immediately or set it to run on a schedule.
Example Use Case: Automating EC2 Instance Recovery
Let’s say you want to automatically recover an EC2 instance if it becomes unresponsive. You can create a runbook that checks the health of an EC2 instance, and if the instance is unhealthy, it can automatically reboot or replace it.
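Example Command: Here’s an example that starts AWS-RestartEC2Instance, an AWS-managed runbook that stops and then starts an EC2 instance:
aws ssm start-automation-execution --document-name "AWS-RestartEC2Instance" --parameters "InstanceId=i-1234567890abcdef0"
Outcome: This command starts the runbook and returns an automation execution ID. The runbook restarts the instance every time it runs, so pair it with a health check if you only want to recover instances that are stuck or unresponsive.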
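An automation runs asynchronously, so a script that depends on the result needs to poll its status. This is a minimal sketch, assuming a configured AWS CLI and an execution ID returned by start-automation-execution; wait_for_automation and its default interval and poll count are hypothetical choices.

```shell
# Sketch: poll an automation execution until it finishes.
# wait_for_automation is a hypothetical helper name.
# Arguments: execution ID, seconds between polls (default 5),
# maximum number of polls (default 60).

wait_for_automation() {
  local execution_id="$1" interval="${2:-5}" max_polls="${3:-60}"
  local status i=0
  while [ "$i" -lt "$max_polls" ]; do
    status="$(aws ssm get-automation-execution \
      --automation-execution-id "$execution_id" \
      --query "AutomationExecution.AutomationExecutionStatus" \
      --output text)" || return 1
    case "$status" in
      Success) echo "$status"; return 0 ;;
      Failed|TimedOut|Cancelled) echo "$status"; return 1 ;;
    esac
    # Any other status (for example Pending or InProgress): keep polling.
    sleep "$interval"
    i=$((i + 1))
  done
  echo "StillRunning"
  return 1
}
```

The function prints the final status and returns zero only when the execution succeeded, which makes it easy to chain with && in a larger script.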
While AWS Systems Manager offers powerful core features for automation and management, it also provides advanced capabilities that further enhance operational efficiency, security, and scalability. This section covers the advanced features of AWS Systems Manager, such as Session Manager, OpsCenter, Patch Manager, Fleet Manager, and Explorer.
Session Manager is a secure and auditable way to connect to your EC2 instances without needing traditional SSH or RDP access. This is particularly beneficial when working in environments that prioritize security and need to minimize the surface area for potential attacks.
Session Manager is part of AWS Systems Manager and allows you to establish secure, encrypted, and auditable shell or remote desktop sessions to EC2 instances and on-premises servers. It works through the AWS Management Console or CLI, eliminating the need for direct inbound SSH or RDP access.
(Why use Session Manager instead of SSH or RDP?) By using Session Manager, you don’t need to open ports like 22 (SSH) or 3389 (RDP), which are common targets for attacks. Session Manager provides a more secure method for accessing instances without exposing them to the internet.
With Session Manager, you can securely access EC2 instances and on-premises servers without the hassle of managing SSH keys or maintaining RDP access.
Example: Suppose you want to connect to an EC2 instance without opening SSH. You can use Session Manager by running this command:
aws ssm start-session --target i-0abcdef1234567890
Outcome:
This command opens a secure shell session to the EC2 instance with ID i-0abcdef1234567890 in your terminal (the AWS CLI needs the Session Manager plugin installed for this). You do not need to open SSH ports or manage key pairs, making your instance more secure.
To use Session Manager, you must ensure the proper IAM permissions are in place. You’ll need to attach the AmazonSSMManagedInstanceCore policy to the EC2 instance’s role. Additionally, enabling session logging provides an audit trail of the session.
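Example: Session logging is part of the Session Manager preferences, which are stored in an SSM document named SSM-SessionManagerRunShell. To enable logging of sessions to Amazon S3, set s3BucketName in a local copy of that document (here SessionManagerRunShell.json) and upload it:
aws ssm update-document --name "SSM-SessionManagerRunShell" --content "file://SessionManagerRunShell.json" --document-version '$LATEST'
Outcome: This command configures Session Manager to store session logs in the S3 bucket named in the preferences file, allowing you to review and audit remote access later. The instance role also needs permission to write to that bucket.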
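Session Manager keeps its logging settings in a JSON preferences document named SSM-SessionManagerRunShell. The sketch below writes a minimal version of that file; the field names follow the Session document schema as best understood here, the s3KeyPrefix value is an illustrative assumption, and write_session_prefs is a hypothetical helper name.

```shell
# Sketch: write a Session Manager preferences file that enables S3 logging.
# write_session_prefs is a hypothetical helper; "session-logs" is an
# illustrative key prefix, not a required value.

write_session_prefs() {
  local bucket="$1" out_file="${2:-SessionManagerRunShell.json}"
  cat > "$out_file" <<EOF
{
  "schemaVersion": "1.0",
  "description": "Session Manager preferences",
  "sessionType": "Standard_Stream",
  "inputs": {
    "s3BucketName": "${bucket}",
    "s3KeyPrefix": "session-logs",
    "s3EncryptionEnabled": true
  }
}
EOF
}
```

After running write_session_prefs my-logs-bucket, the resulting file can be uploaded with aws ssm update-document against the SSM-SessionManagerRunShell document, or with aws ssm create-document and --document-type "Session" if the preferences document does not exist yet in that region.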
OpsCenter helps streamline incident management by providing a centralized hub to track and resolve incidents within your AWS environment.
OpsCenter is part of AWS Systems Manager and is used for managing operational incidents. It allows you to create, manage, and resolve operational work items, called OpsItems, for issues such as application failures, performance issues, or security breaches. The goal is to consolidate incident data, streamline the investigation process, and accelerate resolution.
(How is OpsCenter different from other incident management tools?) Unlike traditional tools that only log incidents, OpsCenter provides a centralized workspace for managing and resolving incidents. It integrates with AWS CloudWatch, AWS Config, and other AWS services to automatically create incident records, ensuring you can respond quickly and effectively.
OpsCenter can automatically create incident records when specific events (e.g., alarm triggers, system health issues) occur in your AWS environment. It integrates with tools like AWS CloudTrail, CloudWatch, and others to provide comprehensive incident data.
Example: If a CloudWatch alarm triggers due to high CPU usage on an EC2 instance, OpsCenter can be configured to create an OpsItem for tracking automatically.
OpsCenter enables effective incident resolution by providing a clear view of the issue, its context, and its impact on other resources. For example, when a patch fails, OpsCenter can automatically create an incident and notify the team responsible for remediation.
Patch Manager is an automation tool that helps you maintain the security and compliance of your EC2 instances and on-premises servers by automatically applying patches.
Patch Manager enables automated patching of EC2 instances and on-premises servers, helping you maintain security and compliance across your fleet of resources. It can be used to automatically apply critical security patches, bug fixes, and other updates.
(Why automate patch management?) Automating patch management ensures that critical security vulnerabilities are patched as soon as patches are available. It eliminates the need for manual intervention, reducing operational overhead and the risk of human error.
Patch Manager applies patches based on a patch baseline, which defines which patches are approved for installation. Scheduling is handled separately, for example by a maintenance window.
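Example: You can create a patch baseline that approves security patches seven days after release, then register it for a patch group:
aws ssm create-patch-baseline --name "MyPatchBaseline" --operating-system "AMAZON_LINUX_2" --approval-rules "PatchRules=[{PatchFilterGroup={PatchFilters=[{Key=CLASSIFICATION,Values=[Security]}]},ApproveAfterDays=7}]"
aws ssm register-patch-baseline-for-patch-group --baseline-id "pb-0123456789abcdef0" --patch-group "AllInstances"
Outcome:
The first command creates a patch baseline and returns its ID (pb-0123456789abcdef0 is a placeholder). The second associates the baseline with the AllInstances patch group, so instances in that group are patched against it whenever a patching operation runs. The baseline decides which patches are approved; it does not schedule patching by itself.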
Patch baselines can be configured to specify which patches are approved, which are rejected, and how long to wait after release before a patch is approved automatically.
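Example: To create a weekly maintenance window for patching:
aws ssm create-maintenance-window --name "WeeklyPatchingWindow" --schedule "cron(0 2 ? * SUN *)" --duration 3 --cutoff 1 --allow-unassociated-targets
Outcome: This creates a maintenance window that opens every Sunday at 2:00 AM UTC for three hours. To apply patches in that window, register your instances as targets and add a Run Command task that runs AWS-RunPatchBaseline.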
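Once a patching operation has run, it helps to confirm what was actually installed. The following sketch summarizes patch state per instance, assuming a configured AWS CLI; patch_summary is a hypothetical helper name.

```shell
# Sketch: print installed, missing, and failed patch counts per instance.
# patch_summary is a hypothetical helper; pass one or more instance IDs.

patch_summary() {
  # Each output row: instance ID, installed count, missing count, failed count.
  aws ssm describe-instance-patch-states \
    --instance-ids "$@" \
    --query "InstancePatchStates[].[InstanceId,InstalledCount,MissingCount,FailedCount]" \
    --output text
}
```

For example, patch_summary i-0123456789abcdef0 prints one tab-separated row for that instance; a non-zero missing or failed count suggests the instance needs attention.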
Fleet Manager is a centralized service within AWS Systems Manager that allows you to manage and monitor large fleets of EC2 instances at scale.
Fleet Manager enables you to manage multiple EC2 instances and on-premises servers from a single dashboard. It simplifies operations such as instance configuration, software management, and troubleshooting.
(Why use Fleet Manager instead of managing EC2 instances individually?) Fleet Manager allows you to centralize management tasks like configuration, patching, and monitoring, which is especially beneficial when managing a large number of EC2 instances.
Fleet Manager is particularly useful for large-scale environments where managing individual instances manually becomes impractical. You can manage thousands of instances from a single console.
AWS Systems Manager Explorer is a feature that provides unified insights into your AWS resources and operations, helping you monitor and optimize your environment.
Explorer offers a dashboard that aggregates operational data from across your AWS environment. It helps you quickly identify trends, issues, and opportunities for optimization, making it easier to manage resources at scale.
(How does Explorer help in day-to-day operations?) Explorer provides an at-a-glance view of resource health, performance metrics, and operational insights. It can help teams quickly spot anomalies, resource usage patterns, or security incidents.
You can use Explorer to visualize key metrics such as instance health, patch compliance, and configuration drift. It aggregates data from across your AWS accounts, providing an intuitive interface for monitoring.
Example: You can access the Explorer dashboard to view the overall patch compliance of your EC2 instances and identify instances that are out-of-compliance with the latest patch baseline.
Outcome: This gives you a comprehensive view of your AWS environment’s health, enabling proactive management and quicker response times.
In Part 2 of this guide, we’ll dive into security and compliance aspects of AWS Systems Manager, best practices, integration with other AWS services, and real-world use cases. Stay tuned to learn how to leverage AWS Systems Manager to its full potential!