AWS re:Invent 2025 - Code completion to agents: The evolution of development (DVT405)
The Evolution of Development: From Code Completion to Autonomous Agents
Introduction
Presenters: Joavanni Zapella and Loranka Lo, principal scientists working on the Ko autonomous agent
Overview of the evolution of coding agents, from simple code completion to advanced autonomous agents
The Spectrum of Agent Experiences
Two main families of agents:
Synchronous agents: Interactive companions that accelerate developer tasks
Asynchronous agents: Autonomous agents that accept delegated tasks and complete them independently
Synchronous agents operate in a sequential manner, similar to how developers work
Asynchronous agents allow for parallelization and delegation of tasks, with touch points for task definition, code review, and iteration
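The delegation pattern above can be sketched as follows. This is a minimal illustration, not the actual system: `run_agent` and `review` are hypothetical stand-ins for an asynchronous coding agent and the human review touch point.

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Hypothetical stand-in for an asynchronous coding agent:
    it would normally plan, edit code, and propose a patch."""
    return f"patch for: {task}"

def review(patch: str) -> bool:
    """Touch point: a human (or automated checker) reviews the result."""
    return patch.startswith("patch for:")

tasks = ["fix flaky test", "add logging", "update README"]

# Delegation with parallelization: each task runs independently, and the
# developer only re-engages at the review touch point.
with ThreadPoolExecutor() as pool:
    patches = list(pool.map(run_agent, tasks))

approved = [p for p in patches if review(p)]
```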
Early Approaches and Limitations
Initial attempts using large language models (LLMs) on simple benchmarks like HumanEval
Shift to more realistic benchmarks like SWE-bench revealed issues:
Poor code quality and test failures
Low recall in retrieving relevant files for code changes
The Fixed Workflow Approach
A four-step workflow:
Identify potentially relevant files with high recall
Enrich file content with additional metadata
Refine file selection based on new information
Select and rewrite code chunks to generate the final patch
Provided a 10x improvement over previous approaches, but lacked flexibility
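The four steps above can be sketched as a simple pipeline. All function bodies are illustrative placeholders under assumed inputs (a dict of filename to contents), not the presenters' implementation.

```python
# Step 1: cast a wide net (high recall) over possibly relevant files.
def identify_candidates(issue, repo):
    return [f for f in repo if any(word in repo[f] for word in issue.split())]

# Step 2: enrich each file with additional metadata (here, a length signal).
def enrich(files, repo):
    return {f: {"text": repo[f], "length": len(repo[f])} for f in files}

# Step 3: refine the selection using the new information.
def refine(enriched):
    return [f for f, meta in enriched.items() if meta["length"] > 0]

# Step 4: select chunks and emit the final patch (placeholder diff).
def rewrite(files, repo, issue):
    return "\n".join(f"--- {f}: addresses '{issue}'" for f in files)

repo = {"auth.py": "def login(): ...", "views.py": "def vote(): ..."}
issue = "login fails"

candidates = identify_candidates(issue, repo)
patch = rewrite(refine(enrich(candidates, repo)), repo, issue)
```

The fixed ordering of steps is exactly what made this approach reliable but inflexible: every task, however different, passes through the same four stages.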
The Text-Code Agent
Introduced an "agentic loop" where the agent selects and uses various tools to interact with the codebase
Significant performance boost, reaching 38% on the SWE-bench Verified subset
Demonstrated flexibility to support additional use cases, such as documentation generation
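A minimal sketch of such an agentic loop: the model repeatedly picks a tool, observes the result, and stops when it decides it is done. The `policy` function is a hard-coded stand-in for an LLM's tool choice; the tools and the buggy file are illustrative assumptions.

```python
def read_file(state):
    # Tool: load the relevant code into the agent's context.
    state["context"] = "def add(a, b): return a - b  # bug"

def edit_file(state):
    # Tool: apply an edit based on what was read.
    state["patch"] = state["context"].replace("a - b", "a + b")

def finish(state):
    # Tool: signal that the task is complete.
    state["done"] = True

TOOLS = {"read_file": read_file, "edit_file": edit_file, "finish": finish}

def policy(state):
    # Stand-in for the LLM's tool selection given current observations.
    if "context" not in state:
        return "read_file"
    if "patch" not in state:
        return "edit_file"
    return "finish"

state = {}
while not state.get("done"):
    TOOLS[policy(state)](state)
```

Because the loop is open-ended, swapping in different tools (e.g. a docs writer instead of a code editor) supports new use cases without changing the architecture.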
Limitations and the Need for Semantic Understanding
Agents struggled to self-correct and improve code quality due to lack of semantic understanding
Shift towards code execution and test generation to leverage more compute and feedback signals
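The execution-feedback idea can be sketched as a retry loop: propose a fix, run the tests, and feed the failure signal back into the next attempt. `propose_fix` is a hypothetical stand-in for an LLM; a real version would condition on the feedback string.

```python
def run_tests(module):
    # Feedback signal: execute a check against the candidate code.
    try:
        assert module["add"](2, 3) == 5
        return True, ""
    except AssertionError:
        return False, "add(2, 3) != 5"

def propose_fix(attempt, feedback):
    # Stand-in for an LLM: the first draft is buggy; the retry (which
    # would normally condition on `feedback`) is correct.
    if attempt == 0:
        return {"add": lambda a, b: a - b}
    return {"add": lambda a, b: a + b}

feedback = ""
for attempt in range(3):
    candidate = propose_fix(attempt, feedback)
    ok, feedback = run_tests(candidate)
    if ok:
        break
```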
The Logos Agent
Architecture with a code writer agent and a separate component for verifying and selecting the best patch
Achieved 51% performance on the full SWE-bench dataset, with an additional 4 percentage point improvement from patch selection
Enabled scaling by generating multiple candidate patches and selecting the best one
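The generate-then-select pattern can be sketched as best-of-n sampling: produce several candidate patches, score each with a verifier, and keep the highest-scoring one. The candidates and the string-matching verifier here are illustrative placeholders; a real verifier might execute tests or use a critic model.

```python
def verifier_score(patch):
    # Placeholder verifier: count how many required behaviors the patch
    # claims to satisfy. A real one would run the test suite.
    checks = ["fixes bug", "keeps tests green", "no regression"]
    return sum(check in patch for check in checks)

candidates = [
    "patch A: fixes bug",
    "patch B: fixes bug, keeps tests green",
    "patch C: fixes bug, keeps tests green, no regression",
]

# Spend more compute by sampling more candidates; selection quality,
# not generation quality alone, determines the final result.
best = max(candidates, key=verifier_score)
```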
Reasoning and Planning with the Hong Agent
Introduced a supervisor agent that can create and manage sub-agents to accomplish complex, multi-step tasks
Sub-agents receive detailed instructions from the supervisor to perform specific subtasks
Demonstrated the ability to implement new features in a Flask-based website, including authentication and voting functionality
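The supervisor/sub-agent split can be sketched as follows. Everything here is a placeholder: `plan` stands in for the supervisor's decomposition of a goal into detailed instructions, and `sub_agent` for a worker that would run its own agentic loop on one subtask.

```python
def plan(feature_parts):
    # Supervisor step: break the high-level goal into ordered subtasks,
    # each with its own instruction.
    return [f"implement {part}" for part in feature_parts]

def sub_agent(instruction):
    # Each sub-agent executes exactly one subtask and reports back.
    return f"done: {instruction}"

# Hypothetical feature request mirroring the Flask demo described above.
feature = ["authentication", "voting"]
results = [sub_agent(task) for task in plan(feature)]
```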
Key Lessons
Optimize for specific use cases: Simple workflows for repetitive tasks, interactive agents for low-latency, and complex agents for ambitious goals
Create reliable systems: Handle stochastic LLM outputs, enable observability and failure attribution, and leverage tools like the Strands Agents SDK and Amazon Bedrock AgentCore
Be ready to evolve: Adapt to new tools, models, and customer requirements; be prepared to change agent architectures and capabilities over time
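One concrete reliability tactic implied by the lessons above is to validate stochastic model output and retry on failure, logging every attempt for later failure attribution. This is a generic sketch, not tied to any specific SDK; `flaky_model` is a hypothetical stand-in for an LLM call.

```python
import json

attempt_log = []  # observability: record each attempt and its outcome

def flaky_model(attempt):
    # Stand-in for an LLM that sometimes returns malformed JSON.
    return "not json" if attempt == 0 else '{"file": "auth.py"}'

def call_with_retries(max_attempts=3):
    for attempt in range(max_attempts):
        raw = flaky_model(attempt)
        try:
            parsed = json.loads(raw)  # validate against the expected format
            attempt_log.append((attempt, "ok"))
            return parsed
        except json.JSONDecodeError:
            attempt_log.append((attempt, "invalid json"))
    raise RuntimeError("model never produced valid output")

result = call_with_retries()
```

The attempt log makes it possible to attribute failures to a specific step rather than debugging an opaque end-to-end run.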
Conclusion
The field of coding agents has evolved rapidly, driven by advancements in AI and the need to support increasingly complex developer tasks
The presented approaches demonstrate the progression from simple code completion to sophisticated autonomous agents capable of tackling ambitious software engineering challenges
Key lessons include optimizing for specific use cases, building reliable systems, and being prepared to evolve agent architectures and capabilities over time