AWS re:Invent 2025 - Code completion to agents: The evolution of development (DVT405)

The Evolution of Development: From Code Completion to Autonomous Agents

Introduction

  • Presenters: Giovanni Zappella and Laurent Callot, principal scientists working on the Kiro autonomous agent
  • Overview of the evolution of coding agents, from simple code completion to advanced autonomous agents

The Spectrum of Agent Experiences

  • Two main families of agents:
    1. Synchronous agents: Interactive companions that accelerate developer tasks
    2. Asynchronous agents: Autonomous agents that can be delegated tasks to complete independently
  • Synchronous agents operate in a sequential manner, similar to how developers work
  • Asynchronous agents allow for parallelization and delegation of tasks, with touch points for task definition, code review, and iteration
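
The difference between the two families can be sketched with Python's asyncio. This is only an illustration under stated assumptions: run_task is a hypothetical stand-in for an agent working on one task, and the task names and durations are invented.

```python
import asyncio
import time

async def run_task(name: str, seconds: float) -> str:
    """Hypothetical stand-in for an agent working on one task."""
    await asyncio.sleep(seconds)
    return f"{name}: done"

async def synchronous_style(tasks):
    # One task at a time: the developer waits on each result before moving on.
    return [await run_task(name, secs) for name, secs in tasks]

async def asynchronous_style(tasks):
    # All tasks are delegated at once and reviewed when they complete.
    return await asyncio.gather(*(run_task(name, secs) for name, secs in tasks))

def timed(coro_fn, tasks):
    """Run one of the two styles and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(tasks))
    return list(results), time.perf_counter() - start

# Invented example tasks: (name, simulated duration in seconds).
TASKS = [("fix-bug", 0.05), ("add-tests", 0.05), ("update-docs", 0.05)]
```

Calling timed(synchronous_style, TASKS) and timed(asynchronous_style, TASKS) returns the same results, but the delegated version takes roughly as long as the slowest task instead of the sum of all of them.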

Early Approaches and Limitations

  • Initial attempts using large language models (LLMs) on simple benchmarks like HumanEval
  • Shift to more realistic benchmarks like SWE-bench revealed issues:
    • Poor code quality and test failures
    • Low recall in retrieving relevant files for code changes
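
Retrieval recall here means the fraction of files that actually need to change that the system managed to retrieve. A minimal sketch of that metric follows; the function name and the file paths are illustrative assumptions, not taken from the talk.

```python
from __future__ import annotations

def file_recall(retrieved: set[str], relevant: set[str]) -> float:
    """Fraction of the relevant files that appear in the retrieved set."""
    if not relevant:
        return 1.0  # nothing needed to be found, so nothing was missed
    return len(retrieved & relevant) / len(relevant)

# Invented example: three files must change, but only one was retrieved.
relevant_files = {"app/auth.py", "app/models.py", "tests/test_auth.py"}
retrieved_files = {"app/auth.py", "README.md"}
recall = file_recall(retrieved_files, relevant_files)
```

With low recall, the later patch-generation step never sees some of the files it would have to edit, no matter how capable the model is.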

The Fixed Workflow Approach

  • A four-step workflow:
    1. Identify potentially relevant files with high recall
    2. Enrich file content with additional metadata
    3. Refine file selection based on new information
    4. Select and rewrite code chunks to generate the final patch
  • Provided a 10x improvement over previous approaches, but lacked flexibility
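
The four steps can be sketched as a fixed pipeline. This is a toy sketch, not the presenters' implementation: keyword matching stands in for retrieval, function names stand in for metadata, and the rewrite callable stands in for the model that edits code. All function names and the sample repository are hypothetical.

```python
import re

def tokens(text):
    """Lowercased word-like tokens, used for crude keyword matching."""
    return set(re.findall(r"[a-z_]+", text.lower()))

def find_candidates(repo, issue):
    # Step 1: high recall. Keep any file sharing at least one word with the issue.
    issue_words = tokens(issue)
    return [path for path, src in repo.items() if issue_words & tokens(src)]

def enrich(repo, paths):
    # Step 2: attach metadata (here, only the names of defined functions).
    return {p: {"functions": re.findall(r"def (\w+)", repo[p])} for p in paths}

def refine(metadata, issue):
    # Step 3: keep files whose defined functions are named in the issue.
    issue_words = tokens(issue)
    return [p for p, meta in metadata.items() if issue_words & set(meta["functions"])]

def generate_patch(repo, paths, rewrite):
    # Step 4: rewrite the selected code; `rewrite` stands in for the model.
    return {p: rewrite(repo[p]) for p in paths}

def run_workflow(repo, issue, rewrite):
    candidates = find_candidates(repo, issue)
    metadata = enrich(repo, candidates)
    selected = refine(metadata, issue)
    return generate_patch(repo, selected, rewrite)

# Invented three-file repository with an off-by-one bug in cart.py.
REPO = {
    "cart.py": "def total(items):\n    return sum(items) + 1\n",
    "user.py": "def greet(name):\n    return 'hi ' + name  # total users\n",
    "util.py": "def clamp(x):\n    return max(0, x)\n",
}
```

For the issue text "total is off by one", step 1 keeps cart.py and user.py, step 3 narrows the selection to cart.py, and step 4 rewrites only that file. The order of steps is hard-coded, which is the inflexibility the summary mentions.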

The Text-Code Agent

  • Introduced an "agentic loop" where the agent selects and uses various tools to interact with the codebase
  • Significant performance boost, reaching 38% on SWE-bench Verified
  • Demonstrated flexibility to support additional use cases, such as documentation generation
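
An agentic loop of this kind can be sketched as follows. This is a minimal illustration, assuming a policy function that picks the next tool from the history so far. In a real agent that choice would come from a model; here a scripted policy and two toy tools stand in, and every name is hypothetical.

```python
def agent_loop(policy, tools, max_steps=10):
    """Repeatedly ask the policy for a tool call until it decides to finish."""
    history = []
    for _ in range(max_steps):
        tool_name, arg = policy(history)
        if tool_name == "finish":
            return arg, history
        observation = tools[tool_name](arg)
        history.append((tool_name, arg, observation))
    return None, history  # step budget exhausted without an answer

# Toy codebase and tools (illustrative only).
FILES = {"auth.py": "def login(): ...", "views.py": "def index(): ..."}
TOOLS = {
    "list_files": lambda _: sorted(FILES),
    "read_file": lambda path: FILES[path],
}

def scripted_policy(history):
    # Stand-in for the model: list files, read the first one, then stop.
    if not history:
        return ("list_files", "")
    last_tool, _, observation = history[-1]
    if last_tool == "list_files":
        return ("read_file", observation[0])
    return ("finish", f"read {len(observation)} characters")

answer, trace = agent_loop(scripted_policy, TOOLS)
```

Unlike a fixed workflow, the sequence of steps is not hard-coded. Supporting a new use case mostly means registering new tools in the dictionary.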

Limitations and the Need for Semantic Understanding

  • Agents struggled to self-correct and improve code quality due to lack of semantic understanding
  • Shift towards code execution and test generation to leverage more compute and feedback signals

The Logos Agent

  • Architecture with a code writer agent and a separate component for verifying and selecting the best patch
  • Reached 51% on the full SWE-bench dataset, with patch selection contributing an additional 4 percentage points
  • Enabled scaling by generating multiple candidate patches and selecting the best one
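
The generate-then-select idea can be sketched as best-of-N selection driven by test execution. The sketch assumes each candidate patch is a small Python source string defining a function named add, and that the selector simply counts passing tests. The candidates, tests, and function names are invented, and the actual selection component may work differently.

```python
def score_patch(source, tests, func_name="add"):
    """Execute a candidate and count how many tests it passes."""
    namespace = {}
    try:
        exec(source, namespace)  # a real system would sandbox this step
        fn = namespace[func_name]
    except Exception:
        return 0  # the candidate does not even load
    passed = 0
    for args, expected in tests:
        try:
            if fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed

def select_best(candidates, tests):
    """Return the candidate that passes the most tests."""
    return max(candidates, key=lambda src: score_patch(src, tests))

CANDIDATES = [
    "def add(a, b): return a - b",   # wrong operator
    "def add(a, b): return a + b",   # correct
    "def add(a, b) return a + b",    # syntax error
]
TESTS = [((1, 2), 3), ((0, 0), 0), ((2, 2), 4)]
best = select_best(CANDIDATES, TESTS)
```

Because candidates are scored independently, spending more compute on generating candidates translates directly into more chances for the selector to find a passing patch.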

Reasoning and Planning with the Hong Agent

  • Introduced a supervisor agent that can create and manage sub-agents to accomplish complex, multi-step tasks
  • Sub-agents receive detailed instructions from the supervisor to perform specific subtasks
  • Demonstrated the ability to implement new features in a Flask-based website, including authentication and voting functionality
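
The supervisor and sub-agent structure can be sketched as below. This is an illustrative skeleton only: the planner is a hypothetical function that would be model-driven in practice, and each sub-agent just reports back instead of editing code.

```python
from dataclasses import dataclass

@dataclass
class SubTask:
    name: str
    instructions: str  # detailed instructions written by the supervisor

class SubAgent:
    def __init__(self, task: SubTask):
        self.task = task

    def run(self) -> str:
        # Placeholder for the work a real sub-agent would do on the codebase.
        return f"[{self.task.name}] completed: {self.task.instructions}"

class Supervisor:
    def __init__(self, planner):
        self.planner = planner  # turns a goal into a list of SubTask objects

    def run(self, goal: str) -> list:
        subtasks = self.planner(goal)
        # One sub-agent per subtask, each seeing only its own instructions.
        return [SubAgent(task).run() for task in subtasks]

def example_planner(goal: str) -> list:
    # Hypothetical decomposition echoing the Flask example from the talk.
    return [
        SubTask("authentication", f"Add login and logout routes for: {goal}"),
        SubTask("voting", f"Add an upvote endpoint and counter for: {goal}"),
    ]

reports = Supervisor(example_planner).run("Flask-based website")
```

Keeping the instructions on the SubTask makes each hand-off explicit, so it is easy to inspect what each sub-agent was asked to do.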

Key Lessons

  1. Optimize for specific use cases: Simple workflows for repetitive tasks, interactive agents for low-latency interaction, and complex agents for ambitious goals
  2. Create reliable systems: Handle stochastic LLM outputs, enable observability and failure attribution, and leverage tools like the Strands Agents SDK and Amazon Bedrock AgentCore
  3. Be ready to evolve: Adapt to new tools, models, and customer requirements; be prepared to change agent architectures and capabilities over time
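
One common way to handle stochastic model outputs is to validate each response and retry on failure, while recording why each attempt failed. The sketch below assumes the model is asked for a JSON object with a "patch" field. The generate callable is a hypothetical stand-in for a model call, and none of this uses the Strands Agents SDK or Amazon Bedrock AgentCore APIs.

```python
import json

def generate_validated(generate, max_attempts=3):
    """Call `generate` until it returns valid JSON containing a 'patch' field.

    Returns (data, errors), where errors records why earlier attempts failed.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        raw = generate(attempt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict) or "patch" not in data:
                raise ValueError("missing 'patch' field")
            return data, errors
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            errors.append(f"attempt {attempt}: {exc}")
    raise RuntimeError("all attempts failed: " + "; ".join(errors))

def flaky_model(attempt):
    # Stand-in for a model that returns malformed output on its first try.
    return "not json" if attempt == 1 else '{"patch": "diff"}'

result, failures = generate_validated(flaky_model)
```

The recorded failures give a simple form of failure attribution: when a run goes wrong, the log shows which attempt failed and for what reason.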

Conclusion

  • The field of coding agents has evolved rapidly, driven by advancements in AI and the need to support increasingly complex developer tasks
  • The presented approaches demonstrate the progression from simple code completion to sophisticated autonomous agents capable of tackling ambitious software engineering challenges
  • Key lessons include optimizing for specific use cases, building reliable systems, and being prepared to evolve agent architectures and capabilities over time
