Incident Response in the Cloud: 4 Ways to Improve Your Investigation and Containment Capabilities

Dealing with the aftermath of an incident in a cloud environment can be a daunting scenario given the challenges that cloud infrastructure security presents. Depending on how many systems and applications you host in various cloud environments (including through your third-party connections), a single incident can have far-reaching consequences.

Moreover, because you lack physical access to your systems, you may worry about losing control when investigating and containing an incident in the cloud. In practice, incident response (IR) teams can retain control and even gain unique benefits from the cloud.

Here are four cloud computing features IR teams can take advantage of to improve their incident response processes.

Out-of-Band Logging

All major cloud service providers offer logging capabilities (operational metrics or log files that give insight into service operations) for their environments. Some are paid services, while others are free and only need to be activated in your environment. These logs range from basic access logs to full-blown audit and configuration logs. Offerings vary by provider, but most maintain these logs outside your environment or give you the option of storing them in private clouds.

Amazon Web Services (AWS) offers multiple logging capabilities, including CloudTrail for audit logging, CloudWatch for application monitoring, and GuardDuty for security monitoring. CloudTrail and GuardDuty are largely paid services (although CloudTrail’s event history gives a limited free view of recent management events), while CloudWatch has some basic monitoring features that are free.

How can you use logs at the onset of an incident response investigation? Your logs are one of the few sources of information that can remain off limits to attackers. As long as the logs are stored outside the compromised environment and access to them is tightly restricted, an attacker who compromises your cloud systems or services should not be able to delete or modify them. These protected logs can help determine the attacker’s IP address, the timeline of the attack, and the systems targeted. With out-of-band logging, your IR team can block the malicious IPs and have a reliable starting point for their investigation.
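As a rough illustration of that starting point, the sketch below builds a timeline and a per-IP count from CloudTrail-style records. The field names (eventTime, sourceIPAddress, eventName) follow the CloudTrail record format; the function names, the default region, and the idea of ranking IPs by event count are assumptions made for this example, not part of any AWS tooling. The optional fetch_records helper assumes boto3 is installed and credentials are configured.

```python
import json
from collections import Counter
from datetime import datetime


def summarize_events(records):
    """Return (timeline, ip_counts) built from CloudTrail-style record dicts."""
    timeline = sorted(
        (
            datetime.strptime(r["eventTime"], "%Y-%m-%dT%H:%M:%SZ"),
            r.get("sourceIPAddress", "unknown"),
            r.get("eventName", "unknown"),
        )
        for r in records
    )
    # Count how many events each source address generated.
    ip_counts = Counter(ip for _, ip, _ in timeline)
    return timeline, ip_counts


def fetch_records(start, end, region="us-east-1"):
    """Pull events from CloudTrail between two datetimes (hypothetical helper)."""
    import boto3  # imported lazily so summarize_events works without boto3

    client = boto3.client("cloudtrail", region_name=region)
    pages = client.get_paginator("lookup_events").paginate(StartTime=start, EndTime=end)
    # Each returned event carries the full record as a JSON string.
    return [json.loads(e["CloudTrailEvent"]) for page in pages for e in page["Events"]]
```

Calling summarize_events(fetch_records(start, end)) would give the IR team an ordered list of API calls and a quick view of which addresses were most active, which can then feed a block list.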

Hypervisor-Level Control

When responding to an incident in your cloud environment, remember that you’re dealing with a collection of virtual machines (VMs). You have a hypervisor-level or “God-mode” account that’s specific to your cloud environment, so you can build, suspend, or delete systems in your production environment at any time. Once you confirm an incident, you can create snapshots of the compromised instances in your environment. In AWS, the Elastic Block Store (EBS) “Create Snapshot” action makes it easy to create a snapshot for analysis. These snapshots can serve as evidence for the ongoing investigation.
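The snapshot step can also be scripted. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as ec2; the function name, the ir-case tag key, and the description format are hypothetical choices for this example.

```python
def snapshot_instance_volumes(ec2, instance_id, case_id):
    """Snapshot every EBS volume attached to instance_id; return snapshot IDs."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    snapshot_ids = []
    for reservation in reservations:
        for instance in reservation["Instances"]:
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs")
                if not ebs:
                    continue  # skip anything that is not an EBS volume
                response = ec2.create_snapshot(
                    VolumeId=ebs["VolumeId"],
                    Description=f"IR evidence {case_id}: {instance_id} {mapping['DeviceName']}",
                    # Tag the snapshot so evidence can be found by case later.
                    TagSpecifications=[{
                        "ResourceType": "snapshot",
                        "Tags": [{"Key": "ir-case", "Value": case_id}],
                    }],
                )
                snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

In practice this would be called as snapshot_instance_volumes(boto3.client("ec2"), "i-...", "CASE-001"). Passing the client in, rather than creating it inside the function, keeps the sketch easy to test against a stub.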

Once you have snapshots of the affected systems, you can turn your focus to short-term containment. You can suspend or segregate the affected systems in your production environment. In addition, your network team can rapidly rebuild those systems and restore data from backups, or restore the systems from known-good snapshots, shortening any service outage.
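One way to segregate a system in AWS is to move it into a security group that allows no traffic. The sketch below assumes a boto3 EC2 client is passed in and that a quarantine security group already exists; the function name and the quarantine_sg_id parameter are hypothetical. Note that stopping an instance discards its volatile memory, so the stop option defaults to off.

```python
def quarantine_instance(ec2, instance_id, quarantine_sg_id, stop=False):
    """Move an instance into an isolation security group; optionally stop it.

    Returns the security group IDs the instance had before, so they can be
    recorded and restored once the investigation is over.
    """
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    previous = [g["GroupId"] for g in instance.get("SecurityGroups", [])]

    # Replace all attached security groups with the quarantine group.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])

    if stop:
        # Stopping loses memory contents; do this only after evidence is collected.
        ec2.stop_instances(InstanceIds=[instance_id])
    return previous
```

Keeping the previous group IDs gives the team a record of the original network posture and a simple path back if the instance turns out to be clean.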

Virtual Forensics Workspace

Many organizations maintain separate development and production environments in the cloud. They can also maintain a dedicated incident response environment in the cloud. This environment can be as simple as a single VM with your incident response or forensic toolkit loaded onto it. If maintaining a running IR environment isn’t appropriate for your organization, you can maintain a snapshot of your IR systems that’s ready to be deployed when needed.

When an incident occurs, your IR team can spin up the snapshots of the compromised systems in this environment alongside their toolkit, without waiting for a bit-by-bit image of each compromised system. This lets your team analyze disk contents and, where the platform’s snapshots capture memory state, volatile memory, making it easier to gain information about the attacker or malware on your network.
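In AWS terms, bringing evidence into the IR environment can be as simple as creating a volume from a snapshot and attaching it to the forensic workstation. This is a minimal sketch, assuming a boto3 EC2 client is passed in and that the workstation instance already exists in the given availability zone; the function name and the default device name are assumptions for this example.

```python
def mount_evidence(ec2, snapshot_id, workstation_id, availability_zone, device="/dev/sdf"):
    """Create a volume from an evidence snapshot and attach it to the workstation."""
    # The new volume must live in the same availability zone as the workstation.
    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )["VolumeId"]

    # Wait until the volume is ready before attaching it.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2.attach_volume(VolumeId=volume_id, InstanceId=workstation_id, Device=device)
    return volume_id
```

Because the original snapshot is never modified, the team can create as many working copies as it needs. Inside the workstation, the attached volume should still be mounted read-only to avoid altering the copy under analysis.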

Virtual Cyber Security Training Range

Many organizations depend on tabletop exercises to train and test their cyber security and incident response capabilities. Cloud environments now let you spin up training environments that are identical to your production network. This permanent or temporary training range gives your cyber security team an opportunity to respond to real malware and attacks in a realistic copy of your network. Tools like AWS CloudFormation let you design and quickly deploy identical or partial networks to train on. Limiting the duration of the exercises keeps costs down while ensuring your team is ready and capable of responding to a real-world attack.
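One way to keep exercises time-boxed is to tag each training stack with an expiry time and delete stacks that have passed it. The sketch below only builds the request and checks the tag; the ir-range- name prefix, the purpose and expires-at tag keys, and both function names are assumptions for this example. The exercise name is assumed to be a valid CloudFormation stack name fragment.

```python
from datetime import datetime, timedelta, timezone

STAMP = "%Y-%m-%dT%H:%M:%SZ"


def range_stack_request(template_body, exercise_name, hours, now=None):
    """Build keyword arguments for CloudFormation create_stack with an expiry tag."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(hours=hours)
    return {
        "StackName": f"ir-range-{exercise_name}",
        "TemplateBody": template_body,
        "Tags": [
            {"Key": "purpose", "Value": "ir-training"},
            {"Key": "expires-at", "Value": expires.strftime(STAMP)},
        ],
    }


def is_expired(stack, now=None):
    """Return True if a stack description carries an expires-at tag in the past."""
    now = now or datetime.now(timezone.utc)
    tags = {t["Key"]: t["Value"] for t in stack.get("Tags", [])}
    stamp = tags.get("expires-at")
    if stamp is None:
        return False  # untagged stacks are never treated as training ranges
    return datetime.strptime(stamp, STAMP).replace(tzinfo=timezone.utc) <= now
```

With a boto3 CloudFormation client, the request would be passed as create_stack(**range_stack_request(...)), and a scheduled cleanup job could call delete_stack for every stack returned by describe_stacks for which is_expired is true.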

Key Recommendations

Use out-of-band logging to help your incident response team block malicious IPs
Create snapshots of the compromised instances in your environment
Spin up snapshots of the compromised systems in a dedicated IR environment loaded with your toolkit
Use tools like AWS CloudFormation to design and deploy identical networks for training

Summary

Maintaining an effective incident response program for the cloud begins with understanding how to make better use of the cloud tools you already have in place, especially when it comes to maintaining log files. These logs can make all the difference when you’re investigating a suspected breach. For AWS environments, we discuss the importance of logging all API activity in our previous blog about AWS CloudTrail.

Delta Risk has hands-on experience with incident response in cloud environments and provides dedicated cloud infrastructure security services. We also offer AWS Security Architecture Assessments to help ensure AWS controls are set up properly. Contact us to learn more.
