AWS re:Invent 2025 - Why is Reliability So Hard? (DVT227)

Summary of AWS re:Invent 2025 Presentation: "Why is Reliability So Hard?"

Introduction

Presenter: Hannes Lenke, CEO and co-founder of Checkly

Checkly helps organizations detect, communicate, and resolve software reliability issues faster

The goal is to help engineers "own reliability from pull request to postmortem"

The Evolution of Software Reliability

A decade ago, software was built and shipped far less often (yearly, quarterly, or monthly releases)

Today, software is built and shipped almost instantly, but reliability has not kept pace

Earlier applications were simpler, with few dependencies, so issues were easier to identify

Modern applications are highly complex, with many dependencies that introduce potential failure points

The Reliability Challenge

Increased complexity and dependencies make it harder to ensure reliability

Traditional approaches of more people, processes, and testing have not solved the problem

Both development and operations teams try to validate application functionality, but in siloed ways

Key Principles for High-Performing Teams

Predictability: Ability to predict how an application will behave when released to production

Accountability: Knowing what changed, who changed it, why, and when, so the root cause of an issue can be identified

Resiliency: Building applications that can be quickly rolled back in the event of problems
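As a rough illustration of how accountability and resiliency fit together, the sketch below keeps a small change log (what, who, why, when) and uses it to pick a suspect change and a rollback target after a failure. Everything here is hypothetical and was not shown in the talk: the `ChangeRecord` fields, the function names, and the simplifying assumption that the most recent deployment before a failure is the first one to investigate. It is a starting point for triage, not proof of root cause.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical record of one deployment: what, who, why, and when."""
    version: str           # what changed
    author: str            # who changed it
    reason: str            # why it changed
    deployed_at: datetime  # when it reached production


def last_change_before(history: List[ChangeRecord],
                       failure_time: datetime) -> Optional[ChangeRecord]:
    """Return the most recent change deployed at or before the failure."""
    candidates = [c for c in history if c.deployed_at <= failure_time]
    return max(candidates, key=lambda c: c.deployed_at) if candidates else None


def rollback_target(history: List[ChangeRecord],
                    suspect: ChangeRecord) -> Optional[ChangeRecord]:
    """Return the change deployed immediately before the suspect, if any."""
    earlier = [c for c in history if c.deployed_at < suspect.deployed_at]
    return max(earlier, key=lambda c: c.deployed_at) if earlier else None


if __name__ == "__main__":
    def at(hour: int) -> datetime:
        return datetime(2025, 12, 1, hour, 0, tzinfo=timezone.utc)

    history = [
        ChangeRecord("v1", "ana", "initial release", at(9)),
        ChangeRecord("v2", "ben", "new checkout flow", at(11)),
        ChangeRecord("v3", "ana", "copy tweaks", at(14)),
    ]
    suspect = last_change_before(history, failure_time=at(12))
    if suspect is not None:
        target = rollback_target(history, suspect)
        print("suspect:", suspect.version, "by", suspect.author)
        print("roll back to:", target.version if target else "nothing earlier")
```

The record is only useful if every deployment writes one, which is why the accountability principle asks for all four fields on every change rather than on a best-effort basis.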

Checkly's Approach

Unifies testing and monitoring into a single, version-controlled workflow

Allows teams to build tests (UI, API, uptime) as code and deploy them for continuous monitoring

Integrates the reliability pipeline with the CI/CD pipeline, enabling a common language and visibility
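The idea of one check definition serving both as a pre-release test and as a production monitor can be sketched as follows. This is a minimal, vendor-neutral illustration and not the product's actual API: the `Check`, `Response`, `run_suite`, and `ci_gate` names are invented for this example, and the HTTP call is injected as a plain function so the same definitions can run against a staging environment, production, or a stub.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type)."""
    status: int
    body: str


@dataclass(frozen=True)
class Check:
    """A check defined as code, so it can live in version control."""
    name: str
    url: str
    expected_status: int = 200
    must_contain: str = ""


# The fetcher is injected so one definition works in CI and in monitoring.
Fetcher = Callable[[str], Response]


def run_check(check: Check, fetch: Fetcher) -> bool:
    """Return True if the response matches the check's expectations."""
    resp = fetch(check.url)
    return (resp.status == check.expected_status
            and check.must_contain in resp.body)


def run_suite(checks: List[Check], fetch: Fetcher) -> Dict[str, bool]:
    """Run every check and map its name to a pass/fail result."""
    return {c.name: run_check(c, fetch) for c in checks}


def ci_gate(results: Dict[str, bool]) -> int:
    """Exit code for a CI step: 0 if all checks passed, 1 otherwise."""
    return 0 if all(results.values()) else 1


if __name__ == "__main__":
    checks = [
        Check("homepage", "https://example.com/", must_contain="Welcome"),
        Check("health", "https://example.com/health"),
    ]

    def stub_fetch(url: str) -> Response:
        # Stub used here instead of a real HTTP client.
        pages = {
            "https://example.com/": Response(200, "Welcome back"),
            "https://example.com/health": Response(503, "down"),
        }
        return pages.get(url, Response(404, ""))

    results = run_suite(checks, stub_fetch)
    print(results)
    print("CI exit code:", ci_gate(results))
```

In a CI job the result of `ci_gate` would block the release; on a schedule, the same `run_suite` output would feed alerting instead. Sharing the definitions is what gives development and operations a common view of what "working" means.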

The Evolving "You" in Software Reliability

Traditionally, "you" referred to the developer or engineer responsible for the code

Today, "you" encompasses anyone or anything that touches the user experience, including AI agents, Claude Code, and other tools

In the future, agents may be capable of building, testing, monitoring, and owning more of the software lifecycle

Key Takeaways

Reliability has not kept pace with the rapid evolution of software development

Increased complexity and dependencies make it harder to ensure reliability using traditional approaches

High-performing teams focus on predictability, accountability, and resiliency to improve reliability

Checkly's approach unifies testing and monitoring, integrating the reliability pipeline with CI/CD

The concept of "you" in software reliability is expanding to include a wider range of stakeholders and tools
