Rapid detection and noise reduction using automation (SUP311)

Challenges Faced by Security Teams

  • Security teams at many AWS customers are small, often just 2-3 engineers
  • They struggle to keep up with the high volume of security alerts and findings, which can take 3-5 hours to triage manually
  • By the time they investigate an alert, it may already be 12-36 hours old, which limits their ability to respond quickly
  • The dynamic nature of cloud environments, with resources constantly being created and destroyed, leads to many false positive alerts
  • Manual triage of security alerts lacks precision, with multiple teams responding to the same event

Potential Solutions Considered

The team considered several potential solutions, but each had drawbacks:

  1. Hiring more security engineers: This was not feasible due to cost and scalability concerns.
  2. Asking the existing team to work 24/7: This was not sustainable and faced pushback from the team.
  3. Centralizing logs from all customers: This would be expensive and raise privacy/sovereignty concerns.
  4. Magic unicorns: Unfortunately, they couldn't find any.

The Automation Approach

The team ultimately decided to pursue an automation-based approach, with the following key elements:

  1. Automating Tier 1 Security Analysis: Repetitive, undifferentiated tasks such as checking log files and threat intelligence databases were automated, which freed up time for the security engineers.
  2. Leveraging Broad Knowledge Bases: The automation could access a much broader set of data sources and knowledge bases than a human analyst, enabling it to identify known good behavior.
  3. Empowering Security Engineers: The team worked with security analysts to capture their domain knowledge and investigation techniques in code, allowing the automation to apply their expertise at scale.
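
The idea of capturing analyst knowledge as code can be illustrated with a minimal Python sketch. It is not the team's implementation: the `Finding` fields, the rule function, the status strings, and the placeholder account ID are all assumptions made for illustration. Each rule encodes one piece of "known good" knowledge, and a finding is escalated to a human only when no rule matches.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical, simplified shape of a security finding."""
    finding_type: str
    account_id: str
    resource_id: str


# A rule returns True when the finding matches known good behavior.
Rule = Callable[[Finding], bool]

# Assumption: accounts whose owners have declared blockchain workloads.
# The ID below is a placeholder, not a real account.
DECLARED_BLOCKCHAIN_ACCOUNTS = {"111122223333"}


def declared_blockchain_workload(finding: Finding) -> bool:
    """Analyst knowledge as code: crypto activity is expected in these accounts."""
    return (
        finding.finding_type == "crypto_activity"
        and finding.account_id in DECLARED_BLOCKCHAIN_ACCOUNTS
    )


def make_triage(rules: List[Rule]) -> Callable[[Finding], str]:
    """Build a triage function from a list of known-good rules."""
    def triage(finding: Finding) -> str:
        # Close the finding as known good if any rule matches;
        # otherwise hand it to a security engineer.
        if any(rule(finding) for rule in rules):
            return "known_good"
        return "escalate"
    return triage


triage = make_triage([declared_blockchain_workload])
```

Under these assumptions, adding a new piece of analyst knowledge means appending one more rule function to the list, without changing the triage logic itself.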

Challenges and Overcoming Resistance

The team faced several challenges in implementing the automation approach, including:

  1. Resistance to Change: Security engineers were skeptical of the "known good behavior" approach, as it was not the industry standard.
  2. Fear of Missing Threats: There was concern that the automation might miss important malicious activity.
  3. Job Security Concerns: Security engineers worried that the automation would replace them.
  4. Transparency: The team needed to ensure customers could still see the actions taken by the automation.

The team addressed these challenges through data-driven analysis, transparent communication, and empowering the security engineers to contribute to the automation.

Benefits and Outcomes

The automation-based approach has yielded several key benefits:

  1. Noise Reduction: A 100:1 reduction in the noise-to-signal ratio, allowing the team to focus on the most critical alerts.
  2. Time Savings: Freeing up thousands of hours per year that were previously spent on manual triage.
  3. Proactive Security: The ability to shift left and improve the security posture of customer environments.
  4. Continuous Learning: Leveraging the scale of the platform to learn from each security incident and build new automations.

Scenario Examples

The team provided several real-world examples of how the automation has helped, including:

  1. Port Scanning and Reconnaissance: Automating the identification of known good reconnaissance activity to reduce noise.
  2. Cryptocurrency Mining: Accurately differentiating legitimate blockchain-related activity from malicious crypto-mining.
  3. Web Crawling: Recognizing and accommodating the security needs of customers performing large-scale web crawling.
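
As a rough illustration of the port scanning scenario, the sketch below drops alerts whose source address falls inside a list of known good scanner networks. The network ranges are reserved documentation prefixes used as placeholders, and the alert dictionary shape and function names are hypothetical; the talk summary does not say how the team actually identifies known good reconnaissance.

```python
import ipaddress
from typing import Dict, List

# Assumption: CIDR ranges of scanners treated as known good.
# These are reserved documentation ranges, not real scanner networks.
KNOWN_GOOD_SCANNER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_known_good_scanner(source_ip: str) -> bool:
    """Return True if the address belongs to a known good scanner network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in KNOWN_GOOD_SCANNER_NETWORKS)


def filter_scan_alerts(alerts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the port scan alerts that still need human attention."""
    return [a for a in alerts if not is_known_good_scanner(a["source_ip"])]
```

A list of alerts passed through `filter_scan_alerts` keeps only those from sources outside the known good ranges, which is the noise reduction effect described above on a small scale.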

AWS Security Incident Response Service

The team announced the launch of the AWS Security Incident Response service, which brings the automation-based approach to all AWS customers. Key features include:

  1. Automated Monitoring and Investigation: Triage of security alerts from various sources, determining known good behavior.
  2. Security Incident Management Console: Centralized visibility and control over security incidents.
  3. 24/7 Security Expert Support: Access to specialized security engineers for escalated incidents or additional assistance.

The service is available to customers who want to leverage the team's security capabilities without the full AWS Managed Services (AMS) operational support.
