Considerations & tradeoffs
Tableland offers a new solution to suboptimal web3 data storage standards.
The chain you deploy tables on has a significant impact on how those tables can be used afterward. But before selecting a chain and using Tableland's capabilities, it may help to consider the alternatives to Tableland in the first place.
Be sure to read the introduction on what Tableland is and how it works.
The three most common places people will store data in web3 include:
- Smart contract storage.
- Distributed / decentralized file storage.
- Centralized service provider.
Each of these decisions comes with a price (literally & figuratively). On-chain "everything" data is a nice idea but isn't optimal for many use cases. If you're storing something off-chain in a centralized database or decentralized file storage, there is no "linkage" to the host blockchain. That's where Tableland comes into play. Since all table creates and data writes are on-chain operations, you take advantage of blockchain-native features while also offloading the data's storage and accessibility to an off-chain decentralized database network. A win-win for unstoppable web3 data.
| Decentralized file storage | Pretty cheap | Complex | Low | None |
| Centralized database | Great, cheap | Yes, cheap | None | High |
Smart contract storage
One pattern developers often turn to is writing all data to smart contract storage (on-chain maxis, rejoice!). Here, developers put all they possibly can on-chain. For example, an image could be written on-chain as an SVG string, like data:image/svg+xml;base64,PHN2..., instead of being stored as a standard image file (JPEG, PNG, or SVG) on IPFS. Or, maybe some data is part of a mapping combination that can be read by other contracts.
This is quite cool! The data is interoperable by default since it's all on-chain. It is fully composable, so other projects can build on top of it, such as making a derivative NFT that, for example, simply toggles traits on/off after reading the on-chain SVG data. Plus, the data is guaranteed to exist as long as the chain itself lives. With the IPFS or centralized storage options, there is no such guarantee. Persisted file solutions (like Filecoin and Arweave) at least provide persistence guarantees within their own network, but again, there are some clear drawbacks (lack of mutability, queryability) and no "linkage" to the chain itself. They're isolated networks.
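To make the on-chain SVG pattern more concrete, here is a minimal Python sketch of how an SVG string becomes a base64 data URI and how a reader could decode it again. The SVG markup and the helper names (`svg_to_data_uri`, `data_uri_to_svg`) are hypothetical and chosen only for illustration; an actual project would typically do this encoding inside or alongside its smart contract.

```python
import base64

# Hypothetical, minimal SVG used only for illustration.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
    '<rect width="10" height="10" fill="red"/></svg>'
)

PREFIX = "data:image/svg+xml;base64,"


def svg_to_data_uri(svg_text: str) -> str:
    """Encode SVG markup as a base64 data URI string."""
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return PREFIX + encoded


def data_uri_to_svg(uri: str) -> str:
    """Recover the SVG markup from a base64 SVG data URI."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a base64 SVG data URI")
    return base64.b64decode(uri[len(PREFIX):]).decode("utf-8")


uri = svg_to_data_uri(svg)
print(uri[:40])
print(len(svg), "bytes of SVG become", len(uri), "bytes of data URI")
```

Base64 encoding grows the payload by roughly one third, which is worth keeping in mind given the storage costs discussed next.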
One key consideration is cost; you must store all data in the contract. This does allow information to be mutable, but that means you’d have to overwrite values in contract storage, which is costly (due to blockchains storing global state—a summary of every contract and account interaction that happens on the network). Also, you don't have a way to access that data efficiently (select data, aggregate it, join together, etc.) without indexing networks that involve entirely decoupled development workflows.
Decentralized file storage
The de facto web3 solution for large data and media is IPFS. Keep in mind that IPFS is a distributed file system, and decentralizing this storage requires a network like Filecoin for incentivized persistence. IPFS is great for storing files using content addressing, which uniquely identifies a file by its content rather than its location (using something called a "CID"), but it's imperfect for data storage.
Think of data in this context as "small" and a file as "large" media. File storage solutions are amazing and built for, well, storing large files and media. However, they lack query features, metadata, and dynamism, all of which are essential for small data. Imagine you upload a JSON file on IPFS, so you get its unique CID in return. Now, determine the value at some key in the JSON object, and do so without downloading the file itself and reading / parsing it. Or, programmatically take two different JSON files stored on IPFS and query / compose their metadata together, joined by matching some key. You, quite simply, can't without additional infrastructure.
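The sketch below illustrates what that "additional infrastructure" amounts to in the simplest case: the client has to hold the full contents of both files, parse them, and perform the join itself. The two JSON strings are hypothetical stand-ins for files already downloaded by CID, and `join_on_key` is an illustrative helper, not part of any IPFS tooling.

```python
import json

# Hypothetical file contents standing in for two JSON files fetched by CID.
traits_file = '[{"id": 1, "trait": "hat"}, {"id": 2, "trait": "cape"}]'
owners_file = '[{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]'


def join_on_key(left_raw: str, right_raw: str, key: str) -> list:
    """Parse both documents in full, then join their rows on a shared key."""
    left_rows = json.loads(left_raw)
    right_by_key = {row[key]: row for row in json.loads(right_raw)}
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two rows; keys from the right row win on conflict.
            joined.append({**row, **match})
    return joined


rows = join_on_key(traits_file, owners_file, "id")
for row in rows:
    print(row)
```

Every step here happens on the reader's side after a full download, which is the work a query layer would otherwise do.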
Tableland works great to augment IPFS. Without a solution like Tableland, data embedded in some JSON file on IPFS is not easily queryable nor described for discoverability purposes. It is a perfect setup if you pair these solutions together, where large files (like images, website frontends, or large datasets) are stored on IPFS / Filecoin, and pointers (CIDs) are stored in Tableland tables. Another common use case is large datasets stored on these networks but without any easily queryable metadata about what is stored at the CID itself. It's another great combination where Tableland tables can describe this information and also offer access control-based mutability.
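As a rough illustration of the pointer pattern, the sketch below uses Python's built-in `sqlite3` module as a local stand-in for a SQL table; it does not use the Tableland SDK or network. The table name, column names, and CID values are all hypothetical placeholders (the CIDs are not real content identifiers).

```python
import sqlite3

# Local in-memory database used only as a stand-in for a SQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, cid TEXT)"
)

# Large files live elsewhere; the table stores descriptive metadata and pointers.
conn.executemany(
    "INSERT INTO media (id, name, kind, cid) VALUES (?, ?, ?, ?)",
    [
        (1, "cover art", "image", "placeholder-cid-1"),
        (2, "frontend bundle", "website", "placeholder-cid-2"),
        (3, "banner", "image", "placeholder-cid-3"),
    ],
)

# Query the metadata without touching any of the files themselves.
image_cids = [
    row[0]
    for row in conn.execute(
        "SELECT cid FROM media WHERE kind = ? ORDER BY id", ("image",)
    )
]
print(image_cids)

# The file behind a CID is immutable, but the pointer in the table can change.
conn.execute("UPDATE media SET cid = ? WHERE id = ?", ("placeholder-cid-4", 1))
updated = conn.execute("SELECT cid FROM media WHERE id = 1").fetchone()[0]
print(updated)
```

The split keeps content-addressed files immutable while the row that points at them stays queryable and updatable.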
Centralized service provider
With centralized solutions (AWS, Google Cloud, etc.), developers can easily store and mutate data. You could even use a variation of the "augmentation" example above, with images stored on IPFS and the CIDs stored on the centralized server. It's a partial web2-web3 approach. The primary benefit here is that by managing your own database (i.e., outsourcing to centralized service providers), you have full control over storing, mutating, and querying the data at a relatively low cost with high efficiency.
But all of your data sits behind a wall that the service provider owns and controls, and each service might offer an entirely different way to interface with that relational data. The data is not interoperable, which limits how web3 data from different sources can interact. It also detracts entirely from the web3 value proposition. When you are storing data in a centralized server, how does another developer openly read your database's data? If you want to collaborate on data with some external entity, how do you provision access and allow them to mutate data?
That is, using a centralized database silos the data and prevents true composability and interoperability. What's needed is a global, shared database. Blockchain-driven permissionless access controls offer a much simpler way of enabling collaboration, and a web3 database provides a single data layer with a unified interface.
Data on Tableland
Tableland offers the best of all of these approaches in an on-chain to off-chain hybrid.
- Creating and mutating data are both handled on-chain (SQL instructions written to logs) with account-based access control and table ownership.
- Store large media (images, files) on IPFS, Filecoin, or similar, and include pointers to the CIDs within tables.
- Query the data off-chain in a decentralized SQL database network and across any chain for true interoperability.
Hence, data is cheaply secured on-chain and natively composable using off-chain queries.
But not every relational database or use case makes sense as web3 relational Tableland tables. The main considerations are write frequency and the chain on which the table has been deployed, since both contribute to table usability and feasibility (costs). Recall that each chain has different block finalization times, so the base chain greatly impacts data throughput.
Only consider use cases that, at their fastest, need to write data no faster than the base chain's block times (how quickly a chain includes new blocks determines the write velocity) and that can justify the chain's transaction costs that come along with it.
For reference, at the time of writing, Ethereum and the "rollup" scaling solutions all have a max block gas limit of 30 million gas. Polygon has a maximum of 20 million gas per block. Knowing this, along with the gas used for create / write transactions, can be helpful when designing and understanding if a use case makes sense for Tableland.
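The arithmetic behind that sizing exercise can be sketched as follows. The block gas limits come from the paragraph above; the gas per write (50,000) and the block time (12 seconds) are assumptions picked purely for illustration, so substitute measured values for your own statements and chain.

```python
# Block gas limits as stated in the text above.
BLOCK_GAS_LIMITS = {
    "ethereum_or_rollup": 30_000_000,
    "polygon": 20_000_000,
}

# Assumptions for illustration only; measure real transactions instead.
ASSUMED_GAS_PER_WRITE = 50_000
ASSUMED_BLOCK_TIME_SECONDS = 12.0


def max_writes_per_block(block_gas_limit: int, gas_per_write: int) -> int:
    """Upper bound on writes per block if they filled the entire block."""
    if gas_per_write <= 0:
        raise ValueError("gas_per_write must be positive")
    return block_gas_limit // gas_per_write


def max_writes_per_second(
    block_gas_limit: int, gas_per_write: int, block_time_seconds: float
) -> float:
    """Upper bound on sustained write rate for a given block time."""
    if block_time_seconds <= 0:
        raise ValueError("block_time_seconds must be positive")
    per_block = max_writes_per_block(block_gas_limit, gas_per_write)
    return per_block / block_time_seconds


for chain, limit in BLOCK_GAS_LIMITS.items():
    per_block = max_writes_per_block(limit, ASSUMED_GAS_PER_WRITE)
    per_second = max_writes_per_second(
        limit, ASSUMED_GAS_PER_WRITE, ASSUMED_BLOCK_TIME_SECONDS
    )
    print(chain, per_block, "writes/block,", per_second, "writes/second")
```

These figures are ceilings that assume your writes occupy every block completely; since blocks are shared with all other traffic on the chain, realistic throughput is far lower.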
Tableland is, obviously, the best choice for decentralized data storage, but there are others tackling this issue. You should already be familiar with web3 data choices developers have had to make around on-chain storage, decentralized file solutions (IPFS / similar), and centralized service providers. These aren't really competitive solutions, per se, but alternatives developers have had to use due to lack of web3-native options.
In the web3-specific database space, there are a few protocols developers are using—each with a different approach:
- Ceramic: Rather complex concepts, acts as a standalone network (no interaction with any host chains), and uses its own identity system (rather than EVM accounts).
- Polybase: Not intended for smart contract devs, and somewhat decoupled from the host chain (it's a zk-rollup). You lose the ability to create on-chain data & rules to drive table changes, which complicates application development since you can't, for example, have a smart contract write to the database.
- Space and Time: Primarily focused on being a general web3 data warehouse with on- and off-chain data flowing into its indexing layer.
- The Graph: Data indexing solution, not really a competitor. Think of The Graph as a way to query EVM data (i.e., a historical event store), whereas Tableland is intended for chain-driven data liveliness: a cache-like approach that stores only the latest state. (Although, you could build an index with tables!)
Check out our curated Awesome Decentralized Database list for information about various solutions and general learning resources.