Gain insights from Web3 data with AWS Public Blockchain datasets (BLC307)
Blockchain Data Sets and Queries: Exploring the AWS Public Blockchain Data and Amazon Managed Blockchain Query
Amazon Managed Blockchain Query (AMB Query)
AMB Query is a fully managed service that helps with the complexities of indexing blockchain data.
It provides simple APIs to directly query structured data from your blockchain workloads.
It handles indexing and loading, and provides low-latency access to the data.
Supports Bitcoin and Ethereum blockchains, allowing you to query past balances, transaction histories, and more.
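A past-balance lookup of the kind described above might look like the following minimal sketch. It assumes the boto3 client name "managedblockchain-query" and its get_token_balance operation; the address, the region, and the helper names build_balance_request and fetch_balance are illustrative and do not come from the session.

```python
from datetime import datetime, timezone


def build_balance_request(address: str, at: datetime) -> dict:
    """Build GetTokenBalance parameters for a native BTC balance at a past instant."""
    return {
        "tokenIdentifier": {"network": "BITCOIN_MAINNET", "tokenId": "btc"},
        "ownerIdentifier": {"address": address},
        "atBlockchainInstant": {"time": at},
    }


def fetch_balance(address: str, at: datetime) -> str:
    """Call AMB Query. Needs boto3 and AWS credentials, so it is not invoked here."""
    import boto3  # imported lazily so the request builder works without boto3

    # The region is an assumption; use one where AMB Query is available to you.
    client = boto3.client("managedblockchain-query", region_name="us-east-1")
    response = client.get_token_balance(**build_balance_request(address, at))
    return response["balance"]


# Hypothetical address, used only to show the request shape.
request = build_balance_request(
    "bc1qexampleaddress", datetime(2023, 1, 1, tzinfo=timezone.utc)
)
print(request["tokenIdentifier"]["network"])
```

Keeping the request builder separate from the network call makes the payload easy to inspect and test without credentials.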
AWS Public Blockchain Data Sets
The data sets launched in 2022 as an open-source project, initially covering the Bitcoin and Ethereum blockchains.
AWS built an ETL process that extracts the data through RPC requests, transforms it, and persists it in a public S3 bucket.
For Ethereum, the data sets include tables for blocks, transactions, logs, token transfers, traces, and contracts.
For Bitcoin, there are tables for blocks and transactions.
These data sets can be used to perform SQL queries, from basic to advanced, such as finding the largest Bitcoin transaction or creating a heat map of money flow.
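As a rough illustration of the "largest Bitcoin transaction" query, the sketch below runs the SQL against a tiny in-memory SQLite table instead of Athena. The column names (hash, output_value, date) and the sample rows are assumptions made for the example; check the actual table schema before reusing the query.

```python
import sqlite3

# Assumed column names; the real transactions table may differ.
LARGEST_TX_SQL = """
SELECT hash, output_value, date
FROM transactions
ORDER BY output_value DESC
LIMIT 1
"""


def largest_transaction(conn: sqlite3.Connection) -> tuple:
    """Return the single row with the highest output value."""
    return conn.execute(LARGEST_TX_SQL).fetchone()


# Toy stand-in for the Bitcoin transactions table (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (hash TEXT, output_value REAL, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("tx_a", 1.5, "2023-01-01"),
        ("tx_b", 420.0, "2023-01-02"),
        ("tx_c", 12.25, "2023-01-03"),
    ],
)
print(largest_transaction(conn))
```

On the real data set, adding a filter on the date column would limit how much data the query has to scan.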
Expanding the Blockchain Support
Based on customer feedback, AWS has added support for five new blockchains: Optimism, Arbitrum, BNB Chain, Polygon, and XRP Ledger (XRPL).
These new data sets are provided by the indexing partner SonarX and are loaded daily into the public S3 buckets.
Customers can access the data sets for free, either by downloading them or by integrating with AWS services like Amazon Athena, Amazon Redshift, or Amazon SageMaker.
This approach aims to lower the cost and effort of indexing blockchain data for research and experimentation.
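For direct downloads, it helps to know which S3 prefixes hold a given day's data. The sketch below builds per-day prefixes under the assumption that the bucket is named aws-public-blockchain, that data sits under a v1.0 prefix, and that tables are partitioned by a date=YYYY-MM-DD key. None of these details were stated in the session, so verify them against the data set documentation.

```python
from datetime import date, timedelta

BUCKET = "aws-public-blockchain"  # assumed bucket name
VERSION = "v1.0"                  # assumed version prefix


def partition_uri(chain: str, table: str, day: date) -> str:
    """Return the S3 URI of one daily partition (assumed layout)."""
    return f"s3://{BUCKET}/{VERSION}/{chain}/{table}/date={day.isoformat()}/"


def partition_uris(chain: str, table: str, start: date, end: date) -> list[str]:
    """Return one URI per day from start to end, inclusive."""
    days = (end - start).days
    return [
        partition_uri(chain, table, start + timedelta(days=offset))
        for offset in range(days + 1)
    ]


for uri in partition_uris("btc", "transactions", date(2023, 1, 1), date(2023, 1, 3)):
    print(uri)
```

The resulting prefixes can be passed to any S3 client, or used as partition filters when the data is queried in place.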
Demo: Text-to-SQL Queries using Bedrock Agent
The presenters showcased a demo that combines generative AI and the AWS public blockchain data sets.
The Amazon Bedrock agent, which uses an Anthropic Claude model, interprets natural-language questions and generates the corresponding SQL queries.
The generated queries are executed with Amazon Athena against the AWS public blockchain data sets.
When a query fails, the agent analyzes the error message and generates a corrected query.
This demo highlights the synergies between AMB Query and the AWS public blockchain data sets, where AMB Query provides low-latency access to structured data, while the public data sets offer a broader range of information.
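The error-recovery behavior in the demo can be sketched as a retry loop that feeds the last error message back into the query generator. This is not the presenters' implementation: stub_model stands in for the Bedrock agent, SQLite stands in for Athena, and every name here is hypothetical.

```python
import sqlite3
from typing import Callable, Optional

# A generator takes the question plus the previous error (if any) and returns SQL.
SqlGenerator = Callable[[str, Optional[str]], str]


def run_with_retries(
    question: str,
    generate_sql: SqlGenerator,
    conn: sqlite3.Connection,
    max_attempts: int = 3,
) -> list:
    """Generate SQL, run it, and regenerate with the error message on failure."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate_sql(question, error)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # passed to the next generation attempt
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {error}")


def stub_model(question: str, error: Optional[str]) -> str:
    """Stand-in for the model: wrong column first, corrected once it sees an error."""
    if error is None:
        return "SELECT COUNT(*) FROM blocks WHERE block_date = '2023-01-01'"
    return "SELECT COUNT(*) FROM blocks WHERE date = '2023-01-01'"


# Toy blocks table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocks (number INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO blocks VALUES (?, ?)",
    [(1, "2023-01-01"), (2, "2023-01-01"), (3, "2023-01-02")],
)
rows = run_with_retries("How many blocks were mined on 2023-01-01?", stub_model, conn)
print(rows)
```

Capping the number of attempts keeps a persistently failing question from generating an unbounded number of paid queries.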
Comparison and Complementary Use Cases
Querying a single Bitcoin balance using Amazon Athena can take around 75 seconds and cost about $6, because the query scans a large volume of data (1.15 TB).
AMB Query, on the other hand, offers millisecond latency at a fixed cost of around $6-$7 per million requests, making it much more cost-effective for this kind of lookup.
The presenters mentioned exploring new storage formats, such as delta tables, to further improve the performance and cost-effectiveness of the AWS public blockchain data sets.
The two offerings, AMB Query and the AWS public blockchain data sets, can be used in a complementary manner, where AMB Query provides low-latency access to specific data points, while the public data sets offer a more comprehensive view of the blockchain ecosystem.
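The cost figures above can be checked with a short calculation. The sketch assumes an Athena on-demand price of $5 per TB scanned (not stated in the session, and it varies by region) and takes $6.50 as the midpoint of the quoted $6-$7 per million AMB Query requests.

```python
ATHENA_USD_PER_TB = 5.0    # assumed on-demand price per TB scanned
SCANNED_TB = 1.15          # data scanned by the balance query (from the session)
AMB_USD_PER_MILLION = 6.5  # midpoint of the quoted $6-$7 range


def athena_query_cost(tb_scanned: float, usd_per_tb: float = ATHENA_USD_PER_TB) -> float:
    """Cost of one Athena query, which is billed by data scanned."""
    return tb_scanned * usd_per_tb


def amb_request_cost(usd_per_million: float = AMB_USD_PER_MILLION) -> float:
    """Cost of one AMB Query request, which is billed per request."""
    return usd_per_million / 1_000_000


athena = athena_query_cost(SCANNED_TB)
amb = amb_request_cost()
print(f"Athena: ${athena:.2f} per query")
print(f"AMB Query: ${amb:.7f} per request")
print(f"One Athena balance query costs about as much as {athena / amb:,.0f} AMB Query requests")
```

Under these assumptions the Athena query comes to roughly $5.75, consistent with the "about $6" quoted in the session, and a single such query costs as much as hundreds of thousands of AMB Query requests.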
References and Resources
AWS Public Blockchain Data website: [link]
Guide on Analyzing Blockchain Data with Natural Language using Bedrock: [link]