AWS re:Invent 2025 - Long-Horizon Coding Agents: Complex Software Projects with Claude (AIM3316)

Key Takeaways

  • Leveraging AI/LLMs to handle "boring" or repetitive development tasks, allowing developers to focus on high-level decision making and problem-solving
  • Maintaining context and continuity across multiple coding sessions through a structured environment and workflow
  • Balancing the strengths and limitations of AI agents compared to human developers
  • Practical implementation details and architecture for deploying this approach using AWS services

The Developer Productivity Challenge

  • Developers often get bogged down in repetitive tasks and context switching, disrupting flow and productivity
  • Traditional development workflows involve:
    • Writing code and debugging manually
    • Documenting work as an afterthought
    • Constantly shifting between different tasks and priorities
  • The goal is to leverage AI/LLMs to handle these mechanical, repetitive tasks while keeping humans in control of high-level decision making

Limitations of LLMs and the Forgetting Problem

  • LLMs do not retain memory between coding sessions: each new session starts without the context built up in the previous one
  • While this "forgetting" can be seen as a drawback, the presenters found it can also help, because a fresh session does not carry over bad habits or incorrect assumptions
  • The key is to externalize the agent's memory and context through structured environment components
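
As a rough illustration of externalized memory, the sketch below rebuilds a session preamble from files on disk and appends a note for the next session. The file names (`features.json`, `STANDARDS.md`, `PROGRESS.md`) and both function names are assumptions made for this example; the talk summary does not specify them.

```python
from pathlib import Path

# Hypothetical file names; the talk summary does not give the real ones.
MEMORY_FILES = ["features.json", "STANDARDS.md", "PROGRESS.md"]


def build_session_context(workdir: Path) -> str:
    """Concatenate the persisted files into one preamble for a fresh session."""
    sections = []
    for name in MEMORY_FILES:
        path = workdir / name
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
        else:
            body = "(missing)"
        sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)


def append_progress(workdir: Path, note: str) -> None:
    """Record what this session did so the next session can read it."""
    with (workdir / "PROGRESS.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {note}\n")
```

Because everything the agent needs is re-read from disk, a session that starts with no memory can still pick up where the last one stopped.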

The Structured Environment Approach

  • Feature list (JSON): Holds the specification in a format chosen to keep the agent from modifying it
  • Standards file (Markdown): Ensures the agent follows the same coding standards across sessions
  • Claude progress file (Markdown): Tracks the agent's progress and links to Git commits
  • Initialization file: Sets up the development environment deterministically at the start of each session
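
One way to picture the feature list is sketched below. The schema (`id`, `description`, `passes`) and the `spec_unchanged` check are assumptions for illustration, not something the talk summary describes: the check simply confirms that only the status flags differ between two versions of the list.

```python
import json

# Hypothetical schema: each feature has an id, a description, and a "passes" flag.
FEATURES_BEFORE = json.loads("""
[
  {"id": "F1", "description": "User can create a project", "passes": false},
  {"id": "F2", "description": "User can add a task", "passes": false}
]
""")


def spec_unchanged(before: list[dict], after: list[dict]) -> bool:
    """Return True if only the 'passes' flags differ between two feature lists."""
    def strip_status(features: list[dict]) -> list[dict]:
        # Drop the status flag so only the specification fields are compared.
        return [{k: v for k, v in f.items() if k != "passes"} for f in features]

    return strip_status(before) == strip_status(after)
```

A harness could run a check like this after each session and reject any commit where a description was edited or a feature was dropped.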

The Continuous Coding Loop

  1. Agent examines the environment and tests the existing codebase
  2. Agent selects the next available feature to work on
  3. Agent implements the feature, continuously running tests
  4. Agent commits its work to the Git repository
  5. The loop repeats, with the agent picking up where it left off in the next session
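
The steps above can be sketched as a single function that a harness calls once per session. All names here (`Feature`, `next_feature`, `run_session`) are hypothetical, and the test runner, implementation step, and commit step are passed in as plain callables so the sketch stays independent of any particular agent SDK or Git tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Feature:
    id: str
    description: str
    passes: bool = False


def next_feature(features: list[Feature]) -> Optional[Feature]:
    """Step 2: pick the first feature that is not yet passing."""
    return next((f for f in features if not f.passes), None)


def run_session(
    features: list[Feature],
    run_tests: Callable[[], bool],
    implement: Callable[[Feature], None],
    commit: Callable[[str], None],
) -> Optional[str]:
    """Run one pass of the loop; return the id of the completed feature, if any."""
    if not run_tests():                 # Step 1: check the existing codebase
        return None
    feature = next_feature(features)    # Step 2: select the next feature
    if feature is None:
        return None
    implement(feature)                  # Step 3: implement the feature
    if not run_tests():                 # Step 3: re-run the tests afterwards
        return None
    feature.passes = True
    commit(f"Implement {feature.id}: {feature.description}")  # Step 4
    return feature.id
```

Step 5 is the caller's job: invoke `run_session` again in a new session until it returns `None`, which happens when no features remain or the tests fail.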

Transitioning the Developer Workflow

  • Developers focus on defining requirements, setting standards, and reviewing the agent's work
  • The agent is responsible for implementing features, writing tests, and maintaining documentation
  • This approach can be faster for well-specified projects, but requires careful oversight and review by human developers

Technical Architecture

  • GitHub issues as the backlog of features to be implemented
  • GitHub Actions for CI/CD and issue management
  • Amazon Bedrock AgentCore Runtime to host and orchestrate the coding sessions
  • Anthropic API running on AWS to leverage the Claude language model
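
To make the "issues as backlog" idea concrete, the sketch below turns a list of issue objects into feature entries. It assumes the input is shaped like the GitHub REST API's issue objects (`number`, `title`, `state`, `labels`, and a `pull_request` key on pull requests); the `feature` label and the function name are hypothetical, and fetching the issues is left out.

```python
def backlog_from_issues(issues: list[dict], label: str = "feature") -> list[dict]:
    """Convert open, labeled issues into backlog entries ordered by issue number."""
    backlog = []
    for issue in issues:
        if "pull_request" in issue:         # the issues endpoint also returns PRs
            continue
        if issue.get("state") != "open":    # closed issues count as done
            continue
        names = {entry["name"] for entry in issue.get("labels", [])}
        if label not in names:              # hypothetical label marking features
            continue
        backlog.append({"id": issue["number"], "description": issue["title"]})
    return sorted(backlog, key=lambda item: item["id"])
```

Under this assumption, closing an issue is what marks a feature as finished, so the backlog shrinks as the agent's commits are reviewed and merged.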

Demonstration and Insights

  • Showed a live example of the agent building a project management tool called "Canopy"
  • Highlighted the agent's ability to take screenshots, write tests, and maintain progress through Git commits
  • Discussed potential areas for improvement, such as using specialized agents vs. a single general-purpose agent

Conclusion and Next Steps

  • The structured environment and workflow approach aims to augment developers, not replace them
  • Developers maintain control over the standards and vision, while the agent handles the mechanical implementation
  • Opportunities for further research and experimentation, including exploring different domains beyond web development
