Over the past few days, we gave you a general overview of how Ethereum 2.0, or ETH 2.0, works and then showed you ETH 2.0 Staking and the Casper Protocol’s nuances. In this one, we are going to look into another massive feature of ETH 2.0 – Sharding.
One common criticism of cryptocurrency and altcoin systems is their lack of scalability. Put simply, if cryptocurrency and blockchain technology are going to drive the DeFi world of tomorrow, they need to be able to support billions of people. Our comprehensive DeFi guide covers this in depth, but there are already many proposed solutions. Scalability techniques mainly fall into two categories: layer 2 and layer 1.
Layer-2 scalability
These are off-chain scalability solutions built on top of the blockchain. The idea here is to leave the base layer alone and add extra architecture on top of it. This extra layer handles complex computations, which eases the architectural bottlenecks of the base layer. Raiden and Plasma are examples of layer-2 scalability, which we will explore in future articles.
Layer-1 scalability
Scalability techniques that are executed within the blockchain are called layer-1. Increasing the block size and Sharding are the two most well-known layer-1 scalability techniques.
Sharding was initially a technique used to horizontally partition bulky databases into more manageable chunks, or shards. Look at this table:
So, do you see what happened here?
There is a large database with 6 rows. By breaking it down, we convert it into three smaller shards of manageable size. Sharding relies specifically on horizontal partitioning. To understand why, consider the following example.
Consider this table:
Let's partition this table vertically:
See that? Because of the vertical partition, the table turns into two completely different tables with different columns. Horizontal partitioning only splits the table into smaller tables that keep the same columns.
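The difference between the two kinds of partitioning can be sketched in a few lines of Python. The six-row table, its column names, and both helper functions are hypothetical and only serve as an illustration; they are not taken from any database library.

```python
# Hypothetical six-row table; the column names are illustrative only.
rows = [
    {"id": 1, "name": "Alice", "balance": 50},
    {"id": 2, "name": "Bob", "balance": 20},
    {"id": 3, "name": "Carol", "balance": 75},
    {"id": 4, "name": "Dave", "balance": 10},
    {"id": 5, "name": "Erin", "balance": 35},
    {"id": 6, "name": "Frank", "balance": 60},
]


def horizontal_partition(table, shard_count):
    """Split the rows into contiguous chunks; every chunk keeps all columns."""
    size = -(-len(table) // shard_count)  # ceiling division
    return [table[i:i + size] for i in range(0, len(table), size)]


def vertical_partition(table, key, left_columns):
    """Split the columns into two tables that share only the key column."""
    left = [{c: row[c] for c in [key, *left_columns]} for row in table]
    right = [
        {c: v for c, v in row.items() if c == key or c not in left_columns}
        for row in table
    ]
    return left, right


# Horizontal: three shards of two rows each, all with the same columns.
shards = horizontal_partition(rows, 3)

# Vertical: two tables with different columns, linked only by "id".
names, balances = vertical_partition(rows, "id", ["name"])
```

Each horizontal shard is still a complete, smaller version of the original table, which is why this is the form of partitioning that carries over to a blockchain's state.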
The same concept can be extended to the blockchain, wherein the chain's state gets fragmented into smaller and more manageable chunks, called shards.
One of the biggest problems with cryptocurrencies, and the core reason behind the creation of Ethereum 2.0, is scalability. Ethereum can currently process fewer than 25 transactions per second, which is pretty abysmal. The reasons behind this slow speed are the proof-of-work (POW) consensus protocol and the inherent architectural design of these cryptocurrencies.
The majority of the transactional operations that take place in cryptocurrencies are sequential in nature. Think about how a transaction works:
As you can see, the whole process is strictly sequential. Every step depends on the proper fulfillment of the previous step. This problem is compounded further as the network increases in size.
This is why choosing a parallelized process can be a more viable alternative. Breaking up a blockchain state into several shards and processing them in parallel can, in essence, allow you to divide and conquer.
Imagine a network with three nodes: A, B, and C. In a sequential format, each of them would have to verify the whole dataset D individually. With Sharding, however, D would be broken down into three shards: D1, D2, and D3. Each node can take one shard, and all three shards can be processed at the same time. Even with just three shards, parallelizing can speed up proceedings quite dramatically.
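A minimal sketch of this three-node example follows. The `verify` function is a placeholder (it just hashes each item), and the dataset, the `split` helper, and the round-robin assignment are all assumptions made for illustration; a thread pool stands in for three independent nodes.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def verify(shard):
    """Stand-in for real verification work: hash every item in the shard."""
    return [hashlib.sha256(item.encode()).hexdigest() for item in shard]


def split(dataset, shard_count):
    """Deal the items round-robin into shard_count shards."""
    return [dataset[i::shard_count] for i in range(shard_count)]


dataset = [f"tx-{i}" for i in range(9)]  # the dataset D
shards = split(dataset, 3)               # D1, D2, D3

# Sequential model: every node verifies all of D on its own.
sequential_work_per_node = len(dataset)

# Sharded model: nodes A, B, and C each verify one shard concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(verify, shards))

sharded_work_per_node = max(len(shard) for shard in shards)
```

In this toy setup each node handles three items instead of nine, so the per-node workload drops by the number of shards.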
However, let’s scale things up to Ethereum’s size, which currently has more than 6,970 nodes. If optimally executed, the improvement in overall throughput will be immense. ETH 2.0 will eventually be divided into 1,024 shards, and it’s hoped that this will theoretically increase network throughput by more than 1,000x.
Up next, let’s look at another aspect of scalability. As you may already know, Ethereum is a peer-to-peer network. There are no centralized data centers. The whole network depends on its nodes doing their jobs. Each individual node has the same power and privilege as its peers, and a node can be either a light client or a full node.
Light clients are nodes that download only a portion of the blockchain. This allows them to verify transaction execution without having to download and maintain the full blockchain.
A full node is any system connected to the main network that has fully downloaded and is regularly maintaining the blockchain. They are pretty much the backbone of the Ethereum network and fulfill the following roles:
The catch is that Ethereum full nodes must download and maintain the whole blockchain at all times. The problem here is that the Ethereum blockchain is enormous. It is fast approaching 1 TB in size, so it’s becoming increasingly difficult for regular nodes to store all of the data.
So, how is Sharding going to help here? As per the official Sharding FAQ on GitHub, the key idea is to allow Ethereum to process 10,000+ transactions per second without forcing every node to spend thousands of dollars on hardware. This is why Sharding is such a brilliant solution to this problem: the workload per node decreases significantly.
Finally, let’s look at how Sharding works in ETH 2.0. The entire state of the Ethereum blockchain is called the “global state.” This state gets broken down into shards, and each of these shards has its own state. These states, shard roots, and the global root form a Merkle tree.
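So, let’s see what’s happening here. Every node in the tree is linked to a node in the level above: the global state root sits at the top and commits to each shard root, and each shard root in turn commits to that shard’s own state.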
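To make the tree structure concrete, here is a simplified sketch of how per-shard roots can be combined into a single global root. It is not the exact tree layout ETH 2.0 uses: the account strings, the use of SHA-256, and the rule of duplicating the last node on odd levels are all assumptions chosen to keep the example short.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used for every node in this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Hash pairs of nodes upward until a single root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical state: shard id -> serialized account records.
shard_states = {
    0: [b"acct-a:50", b"acct-b:20"],
    1: [b"acct-c:75"],
    2: [b"acct-d:10", b"acct-e:35", b"acct-f:60"],
}

# Each shard gets its own root; the global root commits to all of them.
shard_roots = [merkle_root(accounts) for _, accounts in sorted(shard_states.items())]
global_root = merkle_root(shard_roots)
```

Changing a single account in any shard changes that shard's root and therefore the global root, which is what lets one small hash stand in for the entire state.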
When Sharding is activated, the following happens:
To visualize how this works, let’s take Vitalik Buterin’s example from Devcon.
Imagine that Ethereum has been split into thousands of islands. Each island can do its own thing. Each island has its own unique features, and everyone belonging to that island, i.e. the accounts, can interact with each other AND freely indulge in all its features. If they want to contact other islands, they will have to use some sort of protocol.
Vitalik Buterin
Ethereum 2.0 executes this by creating two levels of interaction.
The first level in the shard interaction is the transaction group. Every shard will have its own unique transaction group.
This group is further subdivided into a transaction group header and body.
Transaction Group Header
The group header has a distinct left and right part.
The Left Part has the following components:
The Right Part of the header is a group of randomly chosen validators who verify the transactions inside the shard itself.
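The idea of randomly assigning validators to a shard can be illustrated with a small sketch. The validator names, the seed string, the committee size, and the `pick_committee` helper are all hypothetical, and Python's `random` module is only a stand-in for whatever source of randomness the real protocol uses.

```python
import random


def pick_committee(validators, shard_id, seed, size):
    """Pick `size` distinct validators for a shard, reproducibly from a seed."""
    rng = random.Random(f"{seed}:{shard_id}")  # one generator per shard
    return rng.sample(validators, size)


validators = [f"validator-{i}" for i in range(20)]

# Every shard gets its own committee; the same seed always gives the same result.
committee_0 = pick_committee(validators, shard_id=0, seed="epoch-1", size=4)
committee_1 = pick_committee(validators, shard_id=1, seed="epoch-1", size=4)
```

Because the choice depends on a seed that validators cannot pick for themselves, no group can decide in advance which shard it will end up verifying.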
Level One Features
So far, we have seen the components that belong in level one. Let’s see how everything comes together:
Now, let’s look at the second level of ETH 2.0’s Sharding. What you see in the image above is a standard blockchain, but it has two roots instead of one:
Level Two Features
Alright, so now you know how the shards work individually and what they are made up of. However, the last thing Ethereum needs is for these shards to become individual silos. There must be a method by which these shards can effectively talk to one another.
To paint a clearer picture, let’s bring back Buterin’s island analogy. If the islands have to thrive, they need to interact effectively with each other using a particular protocol. Plus, to reduce communication overload and expenses, the islands have to figure out a way to communicate only when needed.
The same principle is true for shard communication. Ethereum developers needed to answer certain questions to ensure effective cross-shard communication:
ETH 2.0’s cross-shard communication
ETH 2.0’s cross-shard communication protocol of choice is the “receipt paradigm.”
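As a rough illustration of the receipt idea, the sketch below debits the sender on one shard, records a receipt, and lets the other shard credit the recipient once it has seen that receipt. The class names, fields, and starting balances are assumptions. In particular, a real shard would check a proof against the source shard's root instead of reading the source shard's receipt list directly, as this simplified version does.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Receipt:
    receipt_id: int
    source_shard: str
    target_shard: str
    recipient: str
    amount: int


@dataclass
class Shard:
    name: str
    balances: dict = field(default_factory=dict)
    issued: list = field(default_factory=list)  # receipts created on this shard
    spent: set = field(default_factory=set)     # receipts already claimed here

    def send(self, sender, target_shard, recipient, amount):
        """Debit the sender locally and record a receipt for the target shard."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] -= amount
        receipt = Receipt(len(self.issued), self.name, target_shard, recipient, amount)
        self.issued.append(receipt)
        return receipt

    def claim(self, receipt, source):
        """Credit the recipient once, if the source shard really issued the receipt."""
        if receipt.target_shard != self.name:
            raise ValueError("receipt is for another shard")
        if receipt not in source.issued:  # simplification: no proof is checked
            raise ValueError("unknown receipt")
        key = (receipt.source_shard, receipt.receipt_id)
        if key in self.spent:
            raise ValueError("receipt already claimed")
        self.spent.add(key)
        self.balances[receipt.recipient] = (
            self.balances.get(receipt.recipient, 0) + receipt.amount
        )


shard_a = Shard("A", balances={"alice": 100})
shard_b = Shard("B", balances={"bob": 50})

receipt = shard_a.send("alice", "B", "bob", 20)  # step 1: debit on shard A
shard_b.claim(receipt, shard_a)                  # step 2: credit on shard B
```

The `spent` set is what stops the same receipt from being redeemed twice, and the gap between the two steps is where cross-shard latency comes from.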
The two biggest problems with cross-shard communications are operational complexities and latency. Let’s see how ETH 2.0 mitigates these road bumps.
Vitalik Buterin has announced two proposals to create a fully-sharded Ethereum, with a “relatively minimal consensus-layer framework,” that provides sufficient support to develop complex smart contract frameworks.
Complexity kills
The proposals will:
Looking into the new transaction types
To understand how this delay happens, let’s look at how the cross-shard communication works:
As you can imagine, this process causes a lot of delay, which damages the user experience and goes against the whole scalability ethos of ETH 2.0. Vitalik explained the solution to this problem with the following example:
“…if Bob has 50 coins on shard B, and Alice sends 20 coins to Bob from shard A, but shard B does not yet know the state of shard A and so cannot fully authenticate the transfer, Bob’s account state temporarily becomes ‘70 coins if the transfer from Alice is genuine, else 50 coins.’ Clients that have the ability to authenticate shard A and shard B can be sure of the “finality” of the transfer (ie. the fact that Bob’s account state will eventually resolve to 70 coins once the transfer can be verified inside the chain) almost immediately, and so their wallets can simply act like Bob already has the 70 coins.”
This solution proposal is called “Fast Cross-Shard Transfers Via Optimistic Receipt Root.”
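The conditional balance from Vitalik's example can be modelled in a few lines. This is only a sketch of the bookkeeping, assuming a hypothetical `OptimisticAccount` class and transfer ID; it says nothing about how the proposal is actually implemented at the protocol level.

```python
from dataclasses import dataclass, field


@dataclass
class OptimisticAccount:
    confirmed: int                               # balance the shard has verified
    pending: dict = field(default_factory=dict)  # transfer id -> unverified amount

    def receive_unverified(self, transfer_id, amount):
        """Record an incoming transfer the shard cannot authenticate yet."""
        self.pending[transfer_id] = amount

    def optimistic_balance(self):
        """Balance if every pending transfer turns out to be genuine."""
        return self.confirmed + sum(self.pending.values())

    def resolve(self, transfer_id, genuine):
        """Settle a pending transfer once the source shard's state is known."""
        amount = self.pending.pop(transfer_id)
        if genuine:
            self.confirmed += amount


bob = OptimisticAccount(confirmed=50)
bob.receive_unverified("alice->bob", 20)

# "70 coins if the transfer from Alice is genuine, else 50 coins"
balance_if_genuine = bob.optimistic_balance()
balance_otherwise = bob.confirmed

bob.resolve("alice->bob", genuine=True)
```

A wallet that can authenticate both shards can display `balance_if_genuine` right away, while the shard itself only updates `confirmed` after the transfer is resolved.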
As soon as a transfer has been verified, the transaction:
One of the most crucial technological requirements for executing Sharding properly is the proof-of-stake consensus algorithm. The reason is that, under proof-of-work, each individual shard would have only a fraction of the original Ethereum chain’s hashrate. If a powerful mining pool ended up validating such a shard, it could completely take over that shard and centralize its operations.
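A quick back-of-the-envelope calculation shows why splitting hashrate across shards is risky. It assumes, purely for illustration, that the network's hashrate is divided evenly among the shards and that controlling more than half of one shard's hashrate is enough to dominate it; the function name is hypothetical.

```python
def attack_share_needed(shard_count):
    """Fraction of TOTAL hashrate that equals half of one shard's hashrate."""
    per_shard = 1 / shard_count  # each shard's slice of the total hashrate
    return per_shard / 2


share_64 = attack_share_needed(64)      # 0.0078125, i.e. about 0.78%
share_1024 = attack_share_needed(1024)  # 0.00048828125, i.e. about 0.05%
```

Under these assumptions, a miner with less than 1% of the network's total hashrate could overpower a single shard in a 64-shard proof-of-work system, which is why randomly assigned proof-of-stake validators are used instead.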
Ethereum 2.0 is launching in several phases: Phase 0, Phase 1, Phase 1.5, and Phase 2.
Phase 0
Phase 0 kickstarts the POS implementation by launching the beacon chain. This chain launches its first (genesis) block once the following conditions are met:
Phase 1
This phase can be thought of as the “Sharding phase” and will take place in 2021. The blockchain gets partitioned into 64 shard chains, which run in parallel and continually communicate with each other. By the end of this phase, Ethereum should be able to process transactions in 64 blocks simultaneously. Since the shards distribute the workload, main-chain bloat is reduced by a considerable amount.
Phase 1.5
In this phase, the beacon chain gets integrated with the main proof-of-work (POW) chain to create a new POS chain. As defined in the previous phase, the POW chain will end up existing as one of the 64 shard chains.
Keep in mind that the original POW chain’s history will exist, but it will simply exist as any other shard and won’t run the POW consensus mechanism anymore.
Phase 2
The finer details of this phase have still not been fleshed out. However, it is widely believed that this phase will fine-tune features like ether accounts, transactions, transfers, and seamless smart contract execution on the new chain.
As you can imagine, fine-tuning the different features needed for seamless Sharding is extremely difficult, which is why it's a good thing that Ethereum 2.0 is launching in stages. Once executed, however, this is going to take Ethereum to unprecedented heights. The biggest knock against cryptocurrencies has been their lack of scalability, which has forced newer protocols to opt for more centralized methods and mechanisms. With Sharding, Ethereum will be able to scale up significantly without compromising on decentralization.
Do you want to know more about Ethereum and smart contract coding? Want to prepare yourself before Ethereum 2.0 launches? Check out Ivan on Tech Academy's blockchain courses to find a repository of highly valuable educational blockchain material that will give you a significant edge in the job market!