Allianz Germany: Accelerating legacy migration with generative AI (FSI323)

Legacy System Migration Challenges

  • Traditional data migration from legacy systems is a risky, expensive, and time-consuming operation.
  • Legacy systems often use old programming languages, and few people remain who understand their processes and data.
  • Mapping old data to a new system can be challenging if the new system is not well understood.

Leveraging Contract Data to Streamline Migration

  • The project aims to migrate data from a legacy system to a new system. The data to be migrated is also contained in the contracts themselves, which are available as PDF documents.
  • Migrating 20,000 contracts manually would not have met the timeline, so a new approach was needed.
  • The key idea is to use the contracts as the ground truth, instead of relying on the data in the legacy system.

Automated Data Extraction and Human Validation

  • The project aims to extract up to 150 attributes from the contract documents.
  • Data quality varies, because some contracts originate from the company itself while others come from third parties.
  • The team follows a "human-in-the-loop" principle, where the business users can accept or decline the extracted data attributes to ensure high data quality.

Serverless Pipeline Architecture

  1. Documents are uploaded to an S3 bucket.
  2. A Step Functions workflow processes the batch of documents.
  3. A Distributed Map state processes each document in parallel, performing tasks like OCR, data cleaning, and data attribute extraction using Amazon Comprehend.
  4. The extracted data attributes are aggregated, prioritizing more recent data.
  5. The results are written back to S3, and the document state is updated in an Aurora database.
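
The aggregation in step 4 could be sketched as follows. This is a minimal illustration, not the team's implementation: the `Extraction` record, its field names, and the use of a document date to decide which value is "more recent" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Extraction:
    """One attribute value extracted from one document (hypothetical shape)."""
    attribute: str       # e.g. "premium"; attribute names are made up here
    value: str
    document_date: date  # assumed recency signal: date of the source document
    page: int


def aggregate_latest(extractions):
    """Keep, for each attribute, the value from the most recent document."""
    latest = {}
    for item in extractions:
        current = latest.get(item.attribute)
        # Replace the stored value only if this one comes from a newer document.
        if current is None or item.document_date > current.document_date:
            latest[item.attribute] = item
    return latest
```

With this rule, an attribute found in both an original contract and a later amendment resolves to the amendment's value, while attributes that appear only once pass through unchanged.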

Front-end Demonstration

  • The front-end displays the extracted data attributes, along with the source (AI or legacy system), the extracted value, the page where the information was found, and the reasoning behind the extraction.
  • The business user can accept or decline the data attributes, allowing for human validation.
  • The front-end also displays alternative values from the legacy system, enabling the user to choose the most up-to-date and accurate data.
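
A record behind such a review screen might look like the sketch below. The class and field names are hypothetical, chosen to mirror the columns described above. The rule that only accepted values move on to migration is an assumption for illustration, since the summary does not say what happens to declined attributes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    DECLINED = "declined"


@dataclass
class ReviewItem:
    """One extracted attribute as shown to the business user (hypothetical)."""
    attribute: str
    source: str                   # "ai" or "legacy"
    value: str
    page: Optional[int]           # page where the value was found, if any
    reasoning: str                # explanation given for the extraction
    legacy_alternative: Optional[str] = None  # value held by the legacy system
    decision: Decision = Decision.PENDING

    def accept(self):
        self.decision = Decision.ACCEPTED

    def decline(self):
        self.decision = Decision.DECLINED


def values_for_migration(items):
    """Assumed rule: only attributes a user has accepted are migrated."""
    return {i.attribute: i.value for i in items if i.decision is Decision.ACCEPTED}
```

Keeping the page number and reasoning on each record lets a reviewer check a value against the contract without searching the whole document.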

Key Takeaways

  1. Involve business users actively in the migration process, instead of relying on external experts.
  2. Combine automated data extraction with human validation to achieve high data quality.
  3. Establish an evaluation framework to quickly test and improve data extraction models.
  4. Leverage the expertise of your team to understand the old and new data structures and formats.
  5. Address data quality challenges, such as document classification, by incorporating business user insights.
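
As an illustration of the evaluation framework in takeaway 3, the sketch below scores extracted values against a small hand-labeled set and reports accuracy per attribute. The function names, the dictionary layout, and the whitespace and case normalization are assumptions for the example, not details from the talk.

```python
from collections import defaultdict


def normalize(value):
    """Compare values ignoring case and repeated whitespace (assumed rule)."""
    return " ".join(str(value).split()).casefold()


def accuracy_per_attribute(predictions, ground_truth):
    """Both arguments map document id -> {attribute name -> value}.

    Returns the share of labeled values that were extracted correctly,
    per attribute. A missing prediction counts as wrong.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for doc_id, expected in ground_truth.items():
        predicted = predictions.get(doc_id, {})
        for attribute, expected_value in expected.items():
            total[attribute] += 1
            got = predicted.get(attribute)
            if got is not None and normalize(got) == normalize(expected_value):
                correct[attribute] += 1
    return {attribute: correct[attribute] / total[attribute] for attribute in total}
```

Re-running such a check after each prompt or model change shows which of the attributes improved and which regressed.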
