Coding Cost-Efficient Multi-Tenant Knowledge Bases with Time-to-Live (TTL)
Overview
This presentation covers the design and implementation of cost-efficient, multi-tenant knowledge bases that use time-to-live (TTL) to expire data. It focuses on the challenges software companies face when building and operating knowledge bases that serve multiple tenants while keeping cost and performance under control.
Challenges in Multi-Tenant Knowledge Bases
The presenters outline the main design decisions and challenges in building multi-tenant knowledge bases:
- Data Segmentation: Keeping each tenant's data fully isolated, so that one tenant's documents never appear in another tenant's results.
- Hybrid Search: Enabling both semantic and keyword-based search capabilities, integrated with AI-powered chatbots.
- Tiered Performance: Providing different performance tiers (latency, cost) to cater to the needs of various customer segments.
- Data Lifecycle Management: Implementing effective data retention policies and aging out outdated information, tailored to each tenant's requirements.
Vector Store Options
The presenters compare three popular vector store options for powering the knowledge bases:
- OpenSearch: Offers low latency and hybrid search capabilities, with the ability to provision resources to meet demand.
- PostgreSQL Vector: Provides low latency and sits alongside structured data, so a single query can combine vector search with filters on relational columns.
- Amazon S3 Vector Bucket: Serverless with a pay-per-query model, but with higher latency than the other two options.
The key is to abstract the vector store behind a common interface, so that the application can switch between these options based on the needs of each tenant without changing its own code.
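The abstraction described above can be illustrated with a minimal sketch. It is not the presenters' code: the `VectorStore` interface, the `InMemoryStore` stand-in, the tier names, and the tier-to-backend mapping are all hypothetical, and a real adapter would call OpenSearch, PostgreSQL, or an S3 vector bucket instead of keeping vectors in a dictionary.

```python
from dataclasses import dataclass, field
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical interface the application codes against."""

    def upsert(self, doc_id: str, vector: list[float], metadata: dict) -> None: ...

    def search(self, vector: list[float], top_k: int, tenant_id: str) -> list[str]: ...


@dataclass
class InMemoryStore:
    """Stand-in backend for illustration; real adapters would wrap a vector store client."""

    name: str
    rows: dict = field(default_factory=dict)

    def upsert(self, doc_id, vector, metadata):
        self.rows[doc_id] = (vector, metadata)

    def search(self, vector, top_k, tenant_id):
        # Only consider documents that belong to the requesting tenant.
        candidates = [
            (doc_id, stored)
            for doc_id, (stored, meta) in self.rows.items()
            if meta["tenant_id"] == tenant_id
        ]
        # Rank by dot product, highest first.
        ranked = sorted(
            candidates,
            key=lambda item: sum(a * b for a, b in zip(vector, item[1])),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:top_k]]


# Assumed tier names and mapping; the presentation does not specify them.
TIER_TO_BACKEND = {
    "premium": "opensearch",
    "standard": "postgresql",
    "evaluation": "s3-vectors",
}


def store_for_tenant(tier: str, backends: dict[str, VectorStore]) -> VectorStore:
    """Pick the backend for a tenant's tier; callers only see the interface."""
    return backends[TIER_TO_BACKEND[tier]]
```

Because callers depend only on `upsert` and `search`, moving a tenant from an evaluation tier to a low-latency tier becomes a change to the mapping plus a re-ingestion, not a change to application code.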
Architecture and Implementation
The presenters walk through the architecture and implementation of the multi-tenant knowledge base system:
- Infrastructure Setup: The presenters use AWS CDK to set up the necessary infrastructure, including the knowledge base and its associated vector stores.
- Knowledge Base Configuration: The configuration includes setting up the vector type, embedding models, and storage configurations for different vector stores (OpenSearch, PostgreSQL).
- Data Source and Ingestion: The presenters demonstrate a custom data source approach, where metadata (tenant ID, user ID, file identifiers) is used to control the ingestion process and enable selective indexing of documents.
- Querying and Filtering: The query process leverages the metadata to filter the results based on tenant and user, ensuring data isolation.
- Data Expiration and Cleanup: The presenters use DynamoDB's time-to-live (TTL) feature to expire documents automatically and remove them from the knowledge base, which lowers cost and keeps the data fresh.
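The ingestion, filtering, and expiration steps above all rest on per-document metadata. The sketch below shows one way the pieces could fit together in plain Python. The field names (`tenant_id`, `user_id`, `file_id`, `expires_at`) and the helper functions are assumptions for illustration, not the presenters' schema or any AWS API.

```python
import time


def build_metadata(tenant_id, user_id, file_id, ttl_seconds, now=None):
    """Attach tenant, user, and expiry information to a document at ingestion time.

    expires_at is a Unix epoch timestamp in seconds (hypothetical field name).
    """
    now = int(time.time()) if now is None else now
    return {
        "tenant_id": tenant_id,
        "user_id": user_id,
        "file_id": file_id,
        "expires_at": now + ttl_seconds,
    }


def is_visible(metadata, tenant_id, user_id, now):
    """Query-time filter: same tenant, same user, and not yet expired."""
    return (
        metadata["tenant_id"] == tenant_id
        and metadata["user_id"] == user_id
        and metadata["expires_at"] > now
    )


def expired_file_ids(records, now):
    """Cleanup step: list the files whose expiry time has passed."""
    return [r["file_id"] for r in records if r["expires_at"] <= now]
```

Two details matter if the expiry attribute lives in DynamoDB. The TTL attribute must be a Number holding Unix epoch time in seconds, and expired items are deleted in the background rather than at the exact expiry instant. Checking `expires_at` at query time, as `is_visible` does, keeps stale documents out of results until the cleanup runs.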
Business Impact and Use Cases
The multi-tenant knowledge base architecture presented offers several benefits:
- Cost Optimization: The ability to leverage different vector store options (S3 Vector Bucket, OpenSearch, PostgreSQL) based on tenant requirements allows for cost-effective scaling.
- Performance Tuning: The tiered performance model lets the knowledge base be tailored to each tenant, whether that means low latency for critical use cases or low cost for evaluation workloads.
- Data Lifecycle Management: The TTL-based expiration and cleanup process ensures that only relevant and up-to-date information is maintained in the knowledge base, reducing storage costs and improving search accuracy.
The presenters highlight use cases in industries like real estate and healthcare, where the ability to manage data retention and performance requirements on a per-tenant basis is crucial.
Key Takeaways
- Implementing robust data segmentation and isolation is essential for multi-tenant knowledge bases.
- Offering tiered performance options (latency, cost) can cater to the diverse needs of customers.
- Leveraging TTL and metadata-driven ingestion enables effective data lifecycle management and cost optimization.
- Abstracting the vector store implementation allows for flexibility in choosing the most appropriate option for each tenant's requirements.
- Taken together, these choices give a scalable and cost-efficient architecture for knowledge bases that serve multiple tenants.