This session from AWS re:Invent 2025 shows how to use the AWS Neuron SDK and the Neuron Kernel Interface (NKI) to optimize the performance of large language models (LLMs) running on AWS Trainium, the latest generation of AWS's custom ML accelerator chips.