AWS re:Invent 2025 - Detection Engineering at Scale: Building High-Fidelity Security Operations (SEC327)

Detection Engineering at Scale: Building High-Fidelity Security Operations

Challenges in the Modern Security Landscape

  • Organizations face an ever-growing volume of logs from many sources, including AWS CloudTrail, endpoint security tools, VPC Flow Logs (for network context), and AI platforms.
  • The goal is to detect threats and misconfigurations in a contextual, actionable way, but the sheer volume of data makes this difficult.
  • Manual detection engineering suffers from a lack of version control, siloed development, inconsistent quality, and difficulty scaling.

Applying SDLC Principles to Detection Engineering

  • Security engineers have begun applying software development lifecycle (SDLC) principles to detection engineering, including:
    • Version control and peer review
    • Automated testing and validation
    • CI/CD deployment
    • Modularization and reusability
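
As a rough illustration of the principles above, a detection can be kept in a repository as a small, reviewable function with tests that run on every change. The sketch below is not from the talk: the `Detection` class, the rule ID, and the function names are hypothetical, and the event shape assumes CloudTrail's `eventName` and `userIdentity.type` fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for rule metadata kept under version control."""
    rule_id: str
    severity: str
    description: str


# Hypothetical rule definition; the ID and severity are illustrative only.
ROOT_CONSOLE_LOGIN = Detection(
    rule_id="aws.root_console_login",
    severity="high",
    description="Console sign-in by the AWS account root user",
)


def matches_root_console_login(event: dict) -> bool:
    """Return True when a CloudTrail-style event is a root console sign-in."""
    return (
        event.get("eventName") == "ConsoleLogin"
        and event.get("userIdentity", {}).get("type") == "Root"
    )


# Tests live next to the rule so peer review and CI can exercise them.
def test_fires_on_root_login() -> None:
    event = {"eventName": "ConsoleLogin", "userIdentity": {"type": "Root"}}
    assert matches_root_console_login(event)


def test_ignores_iam_user_login() -> None:
    event = {"eventName": "ConsoleLogin", "userIdentity": {"type": "IAMUser"}}
    assert not matches_root_console_login(event)
```

Keeping the matching logic as a pure function over a dictionary makes it easy to reuse across rules and to test without access to a live log source.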

Balancing Recall and Precision in Detections

  • The ideal is a "perfectly accurate" detection, one that catches every malicious event and never fires on benign activity, but this is rarely achievable in practice.
  • There is a trade-off between maximizing recall (detecting all malicious events) and maximizing precision (minimizing false positives).
  • Organizations must understand their tolerance for false positives and continuously monitor and tune detections to find the right balance.
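
To make the trade-off concrete, precision and recall can be computed from triage outcomes for a single rule. This is a minimal sketch; the counts are invented for illustration and do not come from the talk.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Compute (precision, recall) from alert triage counts.

    precision = TP / (TP + FP): share of fired alerts that were truly malicious.
    recall    = TP / (TP + FN): share of malicious events that the rule caught.
    """
    fired = true_pos + false_pos
    malicious = true_pos + false_neg
    precision = true_pos / fired if fired else 0.0
    recall = true_pos / malicious if malicious else 0.0
    return precision, recall


# Hypothetical counts for one rule over a review window:
# 40 alerts fired, 8 were real, and 2 real events were missed.
precision, recall = precision_recall(true_pos=8, false_pos=32, false_neg=2)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.20 recall=0.80
```

A rule like this one catches most malicious events but costs analysts four false alarms for every real one, which is the kind of balance each organization has to judge against its own tolerance.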

Riot Games' Detection as Code Platform

Optimizing Log Ingestion and Storage

  • Riot built a platform on AWS services to:
    • Filter logs early to save on costs
    • Adapt and scale ingestion on-the-fly
    • Normalize and enrich logs
    • Enforce data hygiene with tagging
  • They use the open-source tool Vector for log transformation and routing
  • They use Datadog's log archive and rehydration features to balance hot and cold storage
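
The filter, normalize, enrich, and tag stages can be sketched as a single pass over incoming events. This is an illustration in Python only, not Vector configuration and not Riot's pipeline: the drop list, the owner lookup table, and the normalized field names are all assumptions, and the input shape assumes CloudTrail's `eventName`, `sourceIPAddress`, and `userIdentity.arn` fields.

```python
from typing import Iterable, Iterator

# Hypothetical values; real ones would come from the team's own policy and inventory.
DROPPED_EVENT_NAMES = {"DescribeInstances", "ListBuckets"}
ASSET_OWNERS = {"10.0.0.5": "payments-team"}


def process(events: Iterable[dict], source: str = "cloudtrail") -> Iterator[dict]:
    """Filter, normalize, enrich, and tag raw events, in that order."""
    for raw in events:
        # 1. Filter early: drop high-volume, low-value events before paying to store them.
        if raw.get("eventName") in DROPPED_EVENT_NAMES:
            continue
        # 2. Normalize: map source-specific fields onto shared names (illustrative, not OCSF).
        record = {
            "action": raw.get("eventName"),
            "src_ip": raw.get("sourceIPAddress"),
            "actor": raw.get("userIdentity", {}).get("arn"),
        }
        # 3. Enrich: attach context that a detection or an analyst will need later.
        record["owner"] = ASSET_OWNERS.get(record["src_ip"], "unknown")
        # 4. Tag: record where the event came from so it stays searchable and attributable.
        record["tags"] = [f"source:{source}"]
        yield record
```

Filtering first means the later stages, and the storage bill, only see events that a detection could plausibly use.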

Implementing Detection as Code

  • Riot applies SDLC principles to detection development:
    • Version control and peer review in GitHub
    • Automated validation and testing
    • CI/CD deployment of detections
  • They have a "break glass" process for quickly deploying detections during incidents
  • They carefully manage and monitor out-of-the-box detections to avoid alert fatigue
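
One way automated validation might look in a CI job is a check that every rule definition carries the metadata reviewers expect before it deploys. The required fields and severity levels below are assumptions made for this sketch, not Riot's actual schema.

```python
# Hypothetical schema; a real team would define its own required metadata.
REQUIRED_FIELDS = {"rule_id", "owner", "severity", "query", "tests"}
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems; an empty list means the rule passes validation."""
    missing = sorted(REQUIRED_FIELDS - set(rule))
    errors = [f"missing field: {name}" for name in missing]
    if "severity" in rule and rule["severity"] not in ALLOWED_SEVERITIES:
        errors.append(f"unknown severity: {rule['severity']}")
    if "tests" in rule and not rule["tests"]:
        errors.append("rule has no test cases")
    return errors


# Hypothetical draft rule with two problems a reviewer would want caught automatically.
draft = {
    "rule_id": "okta.suspicious_login",
    "owner": "secops",
    "severity": "urgent",
    "query": "placeholder query",
    "tests": [],
}
for problem in validate_rule(draft):
    print(problem)
```

A CI pipeline could fail the build whenever the returned list is non-empty, while a break-glass path might skip the peer-review step but still run a check like this one.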

Leveraging Behavioral Detections and AI

  • Riot uses behavioral detections from tools like Okta and Amazon GuardDuty, but these can be noisy
  • They correlate behavioral alerts with other signals (user agents, IPs, usernames) to triage and tune detections
  • They see potential for AI-powered alert correlation and investigation, but emphasize the need to provide clear guidance to the AI system
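
Correlation of this kind can be sketched as grouping alerts by a shared signal and surfacing only the entities flagged by more than one source. The alert shape (`username`, `source`) and the threshold are assumptions for illustration, not the format that Okta or GuardDuty emit.

```python
from collections import defaultdict


def correlate(alerts: list[dict], min_sources: int = 2) -> dict[str, list[dict]]:
    """Group alerts by username and keep users flagged by several distinct sources."""
    by_user: dict[str, list[dict]] = defaultdict(list)
    for alert in alerts:
        by_user[alert["username"]].append(alert)
    return {
        user: items
        for user, items in by_user.items()
        if len({a["source"] for a in items}) >= min_sources
    }


# Hypothetical, already-normalized alerts from two behavioral sources.
alerts = [
    {"username": "alice", "source": "okta", "ip": "203.0.113.7"},
    {"username": "alice", "source": "guardduty", "ip": "203.0.113.7"},
    {"username": "bob", "source": "okta", "ip": "198.51.100.4"},
]
suspicious = correlate(alerts)
print(sorted(suspicious))  # ['alice']
```

The same grouping could key on IP address or user agent instead of username; a single noisy alert stays in the background until a second, independent signal points at the same entity.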

Key Takeaways

  • Detection engineering requires a robust process, not just more rules
  • Optimizing the data pipeline is crucial for high-fidelity detections
  • Applying SDLC principles like version control, testing, and automation is essential
  • Balancing recall and precision is an ongoing challenge that requires continuous tuning
  • Behavioral detections and AI can augment human analysts, but require careful implementation

Resources

  • Datadog for Startups program: up to $100,000 in credits for early-stage companies
  • OCSF (Open Cybersecurity Schema Framework) for log normalization
  • Datadog Observability Pipelines with OCSF processor
  • Bits AI Security Analyst for alert management and investigation automation
