EOSIO™ History Tools: Scaling Access to Blockchain History Data
September 13th, 2019
By nature, blockchain databases are always growing. With the confirmation of each new block, the ledger of sequential transactions becomes ever longer. An active blockchain’s database can easily grow beyond terabytes in size within years and its entire history must always be preserved.
While flexible and scalable access to historical blockchain data is a basic necessity for many blockchain applications, it remains one of the more difficult capabilities for blockchain platforms to deliver efficiently. Because a blockchain’s history is so large, applications that require access to historical data are often resource-intensive, and developers need special tooling to create performant solutions.
In the ongoing effort to improve tools available to EOSIO developers, a stable version of the State History Plugin was released as part of EOSIO Version 1.8 at the end of June. This new solution for history replaced the prior history_plugin and MongoDB Plugin, enabling more efficient and scalable access to on-chain data. Additionally, today we are introducing alpha support for a new and more robust blockchain history tool based on RocksDB, which will replace the alpha LMDB support that was initially included in History Tools. Support for PostgreSQL within History Tools will continue.
This article provides an overview of blockchain history solutions on EOSIO and a simplified architecture diagram explaining tools available for application developers.
We’ve had multiple breakthroughs during our ongoing work to optimize history solutions for EOSIO blockchains. Today’s history tools, written in C++, are built to facilitate highly performant searches that can efficiently and scalably sift through terabytes of data from the full history of EOSIO blockchains. As more efficient tools have become available, we’ve deprecated older and now less effective history solutions. These include the history_plugin, which was deprecated with the announcement of EOSIO Version 1.2, and the recently deprecated MongoDB Plugin.
The History of State History Tools on EOSIO
The original history_plugin solution stored data in memory, or RAM, which meant that as the chain grew, the amount of RAM needed also grew. Using RAM to service history became an expensive solution, so a more scalable approach took the form of the MongoDB Plugin.
By converting raw binary data from the EOSIO blockchain into JSON format and storing it on disk, the MongoDB Plugin moved history out of RAM and onto a more cost-effective medium. However, this approach also quickly presented scalability issues: as the database grew, performance degraded because the data was stored as JSON plain text. The MongoDB Plugin was also very tightly integrated with Nodeos, which led to overall stability concerns as the solution evolved.
For future reference, a full list of past and future intended deprecations in EOSIO is available here on GitHub.
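To make the cost of text storage concrete, here is a minimal sketch that encodes the same record once as JSON text and once as packed binary. The record, its field names, and the byte layout are hypothetical and chosen only for illustration; they are not the actual EOSIO serialization or the MongoDB Plugin's schema.

```python
import json
import struct

# Hypothetical record; field names and widths are assumptions for this sketch.
RECORD = {
    "block_num": 75_000_000,          # fits in an unsigned 32-bit integer
    "sender": 6138663577826885632,    # 64-bit account identifier
    "receiver": 6138663591592764928,  # 64-bit account identifier
    "amount": 12_500,                 # signed 64-bit quantity
}

# Little-endian: uint32, uint64, uint64, int64 (no padding).
LAYOUT = struct.Struct("<IQQq")
FIELDS = ("block_num", "sender", "receiver", "amount")


def encode_json(rec: dict) -> bytes:
    """Serialize the record as UTF-8 JSON text."""
    return json.dumps(rec).encode("utf-8")


def encode_binary(rec: dict) -> bytes:
    """Serialize the record as fixed-width packed binary."""
    return LAYOUT.pack(*(rec[name] for name in FIELDS))


def decode_binary(data: bytes) -> dict:
    """Recover the record from its packed binary form."""
    return dict(zip(FIELDS, LAYOUT.unpack(data)))


if __name__ == "__main__":
    print("JSON bytes:  ", len(encode_json(RECORD)))
    print("binary bytes:", len(encode_binary(RECORD)))
```

The JSON form repeats every field name and spells each number out as decimal digits, so it is several times larger than the packed form and must be parsed before any value can be compared. Multiplied across a full chain history, that overhead is one plausible reason text storage scales poorly.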
A Scalable Solution: The State History Plugin
The State History Plugin is the latest history solution supported for Nodeos and has been well received across the EOSIO developer community. It acts as a firehose, piping EOSIO blockchain data through an application developer’s preferred history tool for use in their application database. With the State History Plugin, it is possible for any external process to track on-chain data including tables, transaction history, and blocks. This data can, in turn, be stored, transformed, or indexed to suit a variety of needs. The plugin also stores data in portable files, which block producers may distribute alongside snapshots to reduce the time it takes to bring up a new Nodeos instance with complete state history.
The State History Plugin allows developers to request that only irreversible data be provided to the consumer, regardless of the mode Nodeos is operating in. With this option, developers don’t need to implement logic in their applications to roll back work that was based on data that had not yet been finalized. This provides relative confidence to developers as to the validity of data from the State History Plugin. It is useful, for example, for exchange operators or any other platform or service whose operation hinges on transaction finality.
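To illustrate the kind of rollback logic an application would otherwise have to carry itself, here is a minimal sketch of a buffer that holds reversible blocks and releases them only once they are final. The `Block` and `FinalityBuffer` names are hypothetical and model the idea only; they are not part of the State History Plugin or its protocol.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    num: int      # block height
    payload: str  # stand-in for the block's contents


class FinalityBuffer:
    """Holds reversible blocks until they become irreversible."""

    def __init__(self) -> None:
        self._pending: Dict[int, Block] = {}

    def push(self, block: Block) -> None:
        # A block at an already-seen height signals a fork: discard that
        # height and everything above it before storing the new block.
        for num in [n for n in self._pending if n >= block.num]:
            del self._pending[num]
        self._pending[block.num] = block

    def advance(self, last_irreversible: int) -> List[Block]:
        # Release, in height order, every block that is now final.
        final = sorted(n for n in self._pending if n <= last_irreversible)
        return [self._pending.pop(n) for n in final]


if __name__ == "__main__":
    buf = FinalityBuffer()
    for blk in (Block(1, "a"), Block(2, "b"), Block(3, "c"), Block(2, "b2")):
        buf.push(blk)
    print(buf.advance(2))  # blocks 1 and the replacement block 2
```

A consumer that receives only irreversible data can skip this bookkeeping entirely and act on each block as it arrives.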
Decoupling Nodeos from Access to State History
The State History Plugin meshes with Nodeos without being so tightly integrated that it causes instability. It provides the data via a WebSocket, allowing various history tools to home in on particular data of interest and populate databases of a developer’s choice accordingly.
Using the State History Plugin, developers no longer need to write a plugin tightly integrated into Nodeos to reliably get access to historical data. The plugin becomes the mechanism through which finalized blockchain events are published. Developers can now write applications and infrastructure that rely only on consuming data from the State History Plugin, and they can fully control the filtering and transformation of the data their applications require.
The State History Plugin overcomes limitations present in the MongoDB model and offers developers a new architecture that supports PostgreSQL and RocksDB.
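As a rough illustration of consumer-side filtering and transformation, the sketch below keeps only the table rows an application cares about and reshapes them for its own database. It assumes the deltas have already been decoded into dictionaries; the dictionary shape, `select_rows`, and `to_balance` are all hypothetical and do not reflect the plugin's wire format.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical decoded table delta: contract, table, scope, and a text value.
Delta = Dict[str, str]

SAMPLE_DELTAS = [
    {"code": "eosio.token", "table": "accounts", "scope": "alice", "value": "12.5000 EOS"},
    {"code": "eosio.token", "table": "stat", "scope": "EOS", "value": "1000.0000 EOS"},
    {"code": "othercontract", "table": "accounts", "scope": "bob", "value": "3.0000 ABC"},
]


def select_rows(deltas: Iterable[Delta], code: str, table: str) -> Iterator[Delta]:
    """Filter: keep only the deltas for one table of one contract."""
    return (d for d in deltas if d["code"] == code and d["table"] == table)


def to_balance(delta: Delta) -> Tuple[str, float, str]:
    """Transform: turn a delta into an (owner, amount, symbol) row."""
    amount, symbol = delta["value"].split(" ")
    return delta["scope"], float(amount), symbol


if __name__ == "__main__":
    for row in map(to_balance, select_rows(SAMPLE_DELTAS, "eosio.token", "accounts")):
        print(row)
```

Because both steps run in the consuming process, each application can choose its own filters and row shapes without any change to Nodeos.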
History Tools Integrating with the State History Plugin
The State History Plugin works in conjunction with history tools designed to populate developer-specified databases by managing full history from the EOSIO blockchain. Once a database is populated, a highly performant search tool, WASM-Query Language (WASM-QL), assists with querying the binary contract data in a scalable way. WASM-QL ships data so that developers can access and query what they need from the database.
RocksDB Alpha in History Tools
With the addition of alpha support for RocksDB in History Tools, we are moving away from supporting LMDB, which was introduced in the initial alpha of History Tools. LMDB gave developers a way to quickly instantiate a database with less maintenance overhead than a PostgreSQL database; however, it was best suited for testing and not ideal for production scale.
RocksDB represents a more robust state history database solution for EOSIO blockchain application developers. RocksDB is a versatile embedded database that is not only more efficient but also simpler to administer than a database server.
Applications such as combo-rocksdb, fill-rocksdb, and wasm-ql-rocksdb offer the same functionality as the tools used to store and retrieve data from a PostgreSQL database:
- combo-rocksdb: Fills the database and answers queries of a live database
- fill-rocksdb: Used to initially populate the database but cannot answer queries; once caught up with the chain, switch to combo-rocksdb
- wasm-ql-rocksdb: Queries a database that isn’t being filled
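The division of labor among the three tools listed above can be summarized as a small decision function. This is only a sketch of the selection rule as described in the list; `choose_tool` and its parameters are hypothetical, and it does not invoke the tools or assume any of their command-line options.

```python
def choose_tool(needs_fill: bool, caught_up: bool) -> str:
    """Pick a RocksDB history tool for the current stage of a deployment.

    needs_fill: the database should receive data from the chain.
    caught_up:  the database has already caught up with the chain.
    """
    if not needs_fill:
        # Query-only access to a database that is not being filled.
        return "wasm-ql-rocksdb"
    if not caught_up:
        # Initial population; queries are not answered at this stage.
        return "fill-rocksdb"
    # Live database: keep filling and answer queries at the same time.
    return "combo-rocksdb"


if __name__ == "__main__":
    print(choose_tool(needs_fill=True, caught_up=False))
    print(choose_tool(needs_fill=True, caught_up=True))
    print(choose_tool(needs_fill=False, caught_up=True))
```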
Currently, RocksDB outperforms LMDB and PostgreSQL databases by considerable margins for both partial and full blockchain history on EOSIO blockchains. RocksDB also stores the full history much more efficiently than PostgreSQL.
We are continuing to examine open source solutions for querying state history, and additional solutions will be released under the history tools repository. As the EOSIO ecosystem continues to expand, we are dedicated to pioneering new and effective methods of managing state history for EOSIO blockchains.
Our work on components of the EOSIO ecosystem is strengthened by support from the EOSIO developer community. We expect to iterate on and evolve history tools as they are used and as we receive input. If you would like to offer feedback and work more closely with our team to improve EOSIO for developers, you can send our developer relations team an email at email@example.com.
To keep up-to-date with future announcements, you can also subscribe to our mailing list on the EOSIO website. We are excited to be regularly improving the usability of the software for EOSIO developers as we continue to lay a foundation for the mass adoption of blockchain technology.