AWS re:Invent 2025 - Transforming Cable Network Reliability with Agentic AI & Graphs (IND3332)

The Journey to Autonomous Networks

Telco Autonomous Network Journey

  • TM Forum structures the telco journey to autonomous networks in phases, moving from:
    • Manual operation: Managing thresholds, rules, and hard-coded scripts
    • Closed-loop operation: The network senses data, interprets it, reasons over it, and takes action
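
The sense, understand, reason, act cycle of closed-loop operation can be illustrated with a minimal sketch. The KPI (signal-to-noise ratio), the thresholds, and the action names are all hypothetical and were not given in the talk.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    node_id: str
    snr_db: float  # hypothetical KPI: signal-to-noise ratio in dB


def sense(raw: dict) -> list[Reading]:
    """Sense: turn raw telemetry into typed readings."""
    return [Reading(node_id, value) for node_id, value in raw.items()]


def understand(readings: list[Reading], min_snr_db: float = 30.0) -> list[Reading]:
    """Understand: keep only readings below the (assumed) SNR floor."""
    return [r for r in readings if r.snr_db < min_snr_db]


def reason(degraded: list[Reading]) -> dict[str, str]:
    """Reason: choose an action per degraded node (illustrative policy)."""
    return {
        r.node_id: "dispatch_technician" if r.snr_db < 20.0 else "reset_port"
        for r in degraded
    }


def act(plan: dict[str, str]) -> list[str]:
    """Act: record the actions instead of touching a real network."""
    return [f"{action}:{node_id}" for node_id, action in sorted(plan.items())]


def closed_loop(raw: dict) -> list[str]:
    return act(reason(understand(sense(raw))))


actions = closed_loop({"node-a": 35.2, "node-b": 27.9, "node-c": 12.4})
print(actions)
```

In the manual phase, each of these four steps would be a person reading a dashboard and running a script; the closed loop chains them without a hand-off.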

Requirements for Autonomous Networks

  1. Data: Extracting and preparing network data (KPIs, telemetry, configurations, etc.)
  2. Knowledge Fabric: Transforming raw data into usable formats (time series, dependencies, embeddings, etc.)
  3. Predictive Intelligence: Applying analytics, machine learning, deep learning, and foundation models
  4. Autonomous Execution: Deploying agents to take actions based on insights

AWS Services and Frameworks

  • Execution and Agentic AI:
    • Strands Agents: Open-source framework for creating and managing agents
    • Amazon Bedrock AgentCore: AWS's agentic platform for deploying agents at scale
  • Automated Reasoning:
    • Amazon Bedrock: AWS's managed service for foundation models, used for reasoning with large language models
  • Purpose-Built Databases:
    • Amazon S3 Vectors: Cost-effective vector storage for unstructured network documentation
    • Amazon Neptune: Graph database for storing and querying network topology
    • Graph Analytics: Performing graph algorithms for network analysis
    • Graph Deep Learning: Applying deep learning techniques on network graph data

Autonomous Network Use Cases

  1. Orchestration, Governance, and Planning:
    • Agents triggered by network events to understand and respond to issues
    • Guardrails and evaluation to ensure secure and authorized agent access
  2. Observability and Proactive Operations:
    • On-demand observability to troubleshoot, recommend, and assess network health
    • Proactive event analysis, prioritization, and recommendations
  3. Root Cause Analysis and Service Impact:
    • Leveraging network dependencies for change management and service impact assessment
  4. New Use Cases:
    • Automating data engineering tasks like topology discovery and feature engineering
    • Accelerating the adoption of AI in network operations
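
The service impact use case relies on walking network dependencies. As a rough sketch, assuming a tree-like topology held in a plain dictionary (the device names and customer counts are invented, and a production system would query a graph database instead), a breadth-first traversal can estimate how many customers sit downstream of a failing device:

```python
from collections import deque

# Hypothetical topology: each key feeds the devices listed as its value.
TOPOLOGY = {
    "hub-1": ["node-1", "node-2"],
    "node-1": ["amp-1"],
    "node-2": ["tap-3"],
    "amp-1": ["tap-1", "tap-2"],
}

# Hypothetical number of customers attached to each tap.
CUSTOMERS_PER_TAP = {"tap-1": 12, "tap-2": 8, "tap-3": 20}


def downstream(device: str) -> set[str]:
    """Breadth-first walk over everything fed by `device`."""
    seen: set[str] = set()
    queue = deque([device])
    while queue:
        current = queue.popleft()
        for child in TOPOLOGY.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def customers_impacted(device: str) -> int:
    """Sum customers on every tap at or below the given device."""
    affected = downstream(device) | {device}
    return sum(CUSTOMERS_PER_TAP.get(d, 0) for d in affected)


print(customers_impacted("amp-1"))
```

The same traversal supports change management: running it before a planned maintenance window shows which customers a change to a given device could touch.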

Cox's Journey to Autonomous Networks

Cox Communications Overview

  • Multiple system operator (MSO) providing data, video, and voice services to over 6 million customers
  • 180,000 miles of hybrid fiber-coaxial (HFC) and fiber-to-the-home networks
  • 20,000 employees, with 100 in the Network Analytics and Reliability Enablement team

Transitioning from Reactive to Proactive Network Operations

  • Historically relied on customer calls to diagnose and troubleshoot issues
  • Shifted focus to harnessing data to understand systems, processes, and policies
  • Aimed to reduce customer calls and truck rolls by addressing root causes

Service Health Ecosystem

  1. Network Health: Aggregating and correlating SNMP probe data with network topology
  2. Node Health: Combining geospatial information, time-series telemetry, and online/offline traps
    • Classifying events as urgent, critical, or impaired
    • Prioritizing labor based on customer healthy minutes
  3. Premise Health: Differentiating between outside plant and in-home issues
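
To make the node health step concrete, here is a small sketch of classifying events and ranking them by lost customer healthy minutes. The thresholds, the field names, and the loss formula are assumptions for illustration; the talk did not describe how Cox computes either the severity labels or the priority.

```python
from dataclasses import dataclass


@dataclass
class NodeEvent:
    node_id: str
    customers: int           # customers served by the node
    offline_fraction: float  # share of modems reporting offline (0.0 to 1.0)
    minutes_open: int        # how long the event has been open


def classify(event: NodeEvent) -> str:
    """Map an event to a severity label using assumed thresholds."""
    if event.offline_fraction >= 0.5:
        return "urgent"
    if event.offline_fraction >= 0.1:
        return "critical"
    return "impaired"


def lost_customer_minutes(event: NodeEvent) -> float:
    """Estimate healthy minutes lost: affected customers times minutes open."""
    return event.customers * event.offline_fraction * event.minutes_open


def prioritize(events: list[NodeEvent]) -> list[str]:
    """Order node IDs so the largest loss of healthy minutes comes first."""
    ranked = sorted(events, key=lost_customer_minutes, reverse=True)
    return [e.node_id for e in ranked]


events = [
    NodeEvent("node-a", customers=400, offline_fraction=0.05, minutes_open=120),
    NodeEvent("node-b", customers=150, offline_fraction=0.60, minutes_open=30),
    NodeEvent("node-c", customers=300, offline_fraction=0.20, minutes_open=90),
]
print(prioritize(events))
```

Note that the ranking and the label can disagree: in this toy data the "urgent" node is not first in the queue, because a longer-running "critical" event has already cost more customer minutes.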

Results and Impact

  • 22% reduction in call volumes
  • 10% reduction in truck rolls
  • 48% reduction in impaired customer minutes

Digital Twin and the Service Health Platform

Constructing the Digital Twin

  1. Discovering Network Assets: Automating the collection and integration of asset data from multiple sources
  2. Data Quality Checks: Ensuring the accuracy and completeness of the network topology
  3. Federated Telemetry: Aggregating alarms, KPIs, and customer transactions into a real-time representation of the network
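
The data quality step can be sketched as a few structural checks on the discovered topology. This example assumes the plant is modeled as a tree of parent-to-child links rooted at a hub; the specific checks, device names, and function name are hypothetical rather than Cox's actual rules.

```python
def check_topology(devices: set[str], links: list[tuple[str, str]], root: str) -> dict[str, list]:
    """Return data quality findings for a parent -> child link list."""
    findings = {"unknown_endpoints": [], "multiple_parents": [], "unreachable": []}

    parents: dict[str, list[str]] = {}
    children: dict[str, list[str]] = {}
    for parent, child in links:
        # A link that names a device missing from the inventory is suspect.
        if parent not in devices or child not in devices:
            findings["unknown_endpoints"].append((parent, child))
            continue
        parents.setdefault(child, []).append(parent)
        children.setdefault(parent, []).append(child)

    # In a tree, every device should have at most one parent.
    findings["multiple_parents"] = sorted(d for d, p in parents.items() if len(p) > 1)

    # Walk from the root to find devices with no path back to it.
    reachable, stack = {root}, [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in reachable:
                reachable.add(child)
                stack.append(child)
    findings["unreachable"] = sorted(devices - reachable)
    return findings


devices = {"hub-1", "node-1", "amp-1", "tap-1", "tap-2"}
links = [
    ("hub-1", "node-1"),
    ("node-1", "amp-1"),
    ("amp-1", "tap-1"),
    ("node-1", "tap-1"),  # second parent for tap-1
    ("amp-9", "tap-2"),   # amp-9 is missing from the inventory
]
report = check_topology(devices, links, root="hub-1")
print(report)
```

Checks like these matter because every downstream analysis (impact, root cause, prioritization) inherits whatever errors the topology contains.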

Leveraging the Digital Twin

  1. Analytics and Insights: Applying machine learning and graph algorithms to the digital twin data
  2. Agentic AI Integration: Connecting the digital twin to agent-based systems for automated reasoning and actions

Strands Agent Architecture

  • A routing Lambda manages agent requests and responses
  • Agents scale out to process tasks from SQS queues
  • OpenSearch serves as a vector store for caching
  • Strands and Amazon Bedrock form the core components
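
The request flow above can be mimicked locally to show how the pieces relate. This is only a toy model under stated assumptions: `queue.Queue` stands in for SQS, a dictionary stands in for the vector store, a plain function stands in for an agent, and the queue names and keyword-based routing rule are invented. It does not call any AWS or Strands API.

```python
import queue

# In-memory stand-ins for one SQS queue per agent type (hypothetical names).
QUEUES = {"rca": queue.Queue(), "observability": queue.Queue()}

# Exact-match cache standing in for the vector store; a real vector store
# would match on embedding similarity rather than identical text.
CACHE: dict[str, str] = {}


def route(request: str) -> str:
    """Routing step: return a cached answer or enqueue for an agent."""
    if request in CACHE:
        return f"cached: {CACHE[request]}"
    target = "rca" if "root cause" in request.lower() else "observability"
    QUEUES[target].put(request)
    return f"queued: {target}"


def run_agent(name: str) -> list[str]:
    """Worker step: drain one queue, answer each task, and fill the cache."""
    answers = []
    q = QUEUES[name]
    while not q.empty():
        task = q.get()
        answer = f"[{name} agent] handled '{task}'"  # placeholder for a model call
        CACHE[task] = answer
        answers.append(answer)
    return answers


first = route("Find the root cause of the outage on node-1")
answers = run_agent("rca")
second = route("Find the root cause of the outage on node-1")
print(first, answers, second, sep="\n")
```

Decoupling the router from the workers through queues is what lets the agents scale out independently, and the cache keeps repeated questions from triggering another model call.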

AgentCore Enhancements

  1. AgentCore Gateway: Provides a standardized communication platform for agents to access various tools and capabilities
  2. AgentCore Memory:
    • Short-term memory for current incident context
    • Long-term memory for institutional knowledge and continuous improvement
  3. AgentCore Observability:
    • Visibility into agent performance and cost
    • Tracing agent workflows and correlating with business outcomes
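
The split between short-term and long-term memory can be pictured with a toy class. This is a conceptual sketch only; the class, its methods, and the sample incident are hypothetical and do not reflect the actual memory API of the platform.

```python
from collections import defaultdict
from typing import Optional


class AgentMemory:
    """Toy memory with per-incident context and cross-incident knowledge."""

    def __init__(self) -> None:
        self.short_term: dict[str, list[str]] = defaultdict(list)
        self.long_term: dict[str, str] = {}

    def note(self, incident_id: str, observation: str) -> None:
        """Short-term: append an observation to the open incident."""
        self.short_term[incident_id].append(observation)

    def close_incident(self, incident_id: str, symptom: str, fix: str) -> None:
        """Promote the lesson to long-term memory and drop the working context."""
        self.long_term[symptom] = fix
        self.short_term.pop(incident_id, None)

    def recall(self, symptom: str) -> Optional[str]:
        """Long-term: look up a fix learned from an earlier incident."""
        return self.long_term.get(symptom)


memory = AgentMemory()
memory.note("INC-1", "upstream SNR dropped on node-1")
memory.close_incident("INC-1", "low upstream SNR", "replace corroded connector")
print(memory.recall("low upstream SNR"))
```

Short-term entries live only as long as the incident, while the promoted lesson remains available to later incidents with the same symptom.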

Key Takeaways

  1. Data is Everything: There are no shortcuts; building high-quality, high-velocity datasets is crucial.
  2. Intentional Innovation: Fostering a startup mentality with cross-functional collaboration.
  3. Adopt and Go: Being willing to forge new ground and adapt as needed.
  4. Building for Change: Designing systems and processes with adaptability in mind.

Technical Details and Business Impact

  • Cox Communications operates a 180,000-mile hybrid fiber-coaxial (HFC) and fiber-to-the-home network
  • Employs 20,000 people, with 100 in the Network Analytics and Reliability Enablement team
  • Achieved a 22% reduction in call volumes, a 10% reduction in truck rolls, and a 48% reduction in impaired customer minutes
  • Uses AWS services and frameworks, including:
    • Amazon Neptune as the graph database
    • Amazon OpenSearch for event storage and analytics
    • Strands and Amazon Bedrock for agent-based systems
  • Developed a digital twin of the network, combining topology and telemetry data
  • Integrated agentic AI capabilities, including:
    • Automated root cause analysis and recommendations
    • Proactive event detection and prioritization
    • Observability and cost optimization for agent-based systems

The key business impact is the transformation from a reactive, customer-driven network operations model to a proactive, data-driven approach that prioritizes reliability and customer experience. By harnessing the power of data, analytics, and agentic AI, Cox has been able to significantly improve operational efficiency and customer satisfaction.
