A hash function takes in an input string of any length and gives an output hash of a fixed length. Upon hashing, you get a string of pseudo random alphabets and numbers as an output which will look like this -> 242485AB6BFD3502BCB3442EA2E211687B8E4D89.
Hash functions are the cornerstone of the blockchain’s architecture. So, if you want to understand the blockchain and cryptocurrencies better, you must have complete clarity on what hashing and is and what it can potentially do. We will also look into various sidenotes such as hashing vs encryption to help us understand the main differences between them.
Let’s look at some of the unique properties and features of these hash functions.
First and foremost, cryptographic hash functions are deterministic in nature. This means that no matter how many times you parse one input through a specific hash function, you will always get the same output. This is extremely important for multiple reasons, the least of it being operational accuracy.
Hash functions need to be very fast and efficient. The hash calculation has to be as fast as possible. Otherwise, it will be impossible to use them in real-time calculation scenarios.
First, let’s understand what pre-image resistance means. If H(A) is the hash and H() is the hash function, A will be the pre-image. Simply put, its the value that gets hashed by the hash function. So, the pre-image resistance states that if given H(A), it is infeasible to figure out the value of A.
So, why do we say infeasible rather than impossible?
The reason is that it is not impossible to find the pre-image of a hash. All that it requires is a brute-force attack and a whole lot of time and patience to crack a hash. But the real question is, how long will you need to break this hash? Suppose you are using a 128-bit hash, and you want to figure out what the pre-image could be. In this situation, there are three possible scenarios that we are dealing with:
Either way, it’s a huge number, and it will take a long time and a vast amount of calculations to even attempt to break it. In other words, it’s simply not worth it.
To understand this property, let’s look at a simple example. Look at the following table. Note, we are using a SHA-256 hashing algorithm.
Do you see what happened here? We just changed the alphabet of the input from the upper case to lower case, which triggered a huge change in the output hash. This is a valuable property, so keep it in mind for the future.
Given two hashes H(A) and H(B), with pre-images A and B, H(A) can't be equal to H(B). Basically, each hash, for the most part, is unique. The keyword here is “for the most part.” To understand what we mean, let’s look into the birthday paradox.
If you go out on the streets, then the chances are very low that you’ll have the same birthday as some random stranger you encounter. However, if you gather up 20-30 people in one room, there is as much as a 50-50 chance for two people to share the same birthday.
Why does this happen?
Imagine that you have N different possibilities of an event happening. In this case, you need a square root of N to have a 50% chance of a collision. When it comes to birthdays, there are 365 distinct possibilities of that event happening. Upon square rooting that, we get sqrt(365) = ~23. This is why, in a room with 20-30 people, there is a chance of a collision.
How is this applicable in hashing? Well...if you have a 128-bit hash with 2^128 possibilities. By square rooting, we have 2^64 options. What this means is that we have a collision instance at the 2^64th instance. By this logic, no hash function in the world is collision-free.
The final property is puzzle-friendliness. For every output Y, if we choose a random variable “k” from a high min-entropy distribution, it is infeasible to find a random number x, such that:
H(k|x) = Y.
Let’s look at the terms being described here:
In computer science, there are are two very well-known data structures that play a pivotal role in blockchain architectures:
In traditional programming, pointers are variables that store the address of another variable. These variables are pointing towards another variable. A hash pointer is a special class of pointers that contain the hash of the value of the variable that it’s pointing towards. This is one of the most well-known cases of hashing in computer science.
This is what a linked list looks like:
A linked list is a sequence of blocks, each of which contains some data. A pointer links the blocks to each other and contains the address of the next block in it. You are probably wondering that this is all well and good, however, this isn’t an example of hashing in computer science, right?
Well, let’s take things to a whole other level.
What if, instead of a pointer, we use a hash pointer that contains the hash of the preceding block?
Does it sound familiar?
Well, it should, because that’s what a blockchain technically is.
Before we go any further, there is another concept that should be made clear when it comes to hashing in computer science – hashing vs encryption. While both are techniques used in cryptography, they are still vastly different.
We hope that this has clarified the whole encryption vs hashing debate for you. Alright, let’s move on.
The following is a simplified version of the bitcoin blockchain.
As you can see, it kind of looks like a simple linked list with hash pointers. Now you are probably wondering, why are we even using a hash pointer in the first place? Well, this is how we impart one of the blockchain’s most important features – immutability.
Immutability means that once you input data into the blockchain, it should be completely non-tamperable. There is nothing that you can do to remove the data. To understand how this works, keep in mind the “snowball effect” property of hash functions. Now imagine that someone has attempted to remove data from block #2 of our blockchain.
This creates a whole chain reaction, which should theoretically change the whole blockchain, but since that’s an impossibility, the blockchain becomes immutable.
Now that we have a basic understanding of how cryptographic hashing works let’s answer the million-dollar question – what hashing algorithm does bitcoin use to hash blocks? In previous sections, we have repeatedly used this hashing algorithm, called “SHA-256.” Over here, SHA stands for Secure Hashing Algorithms.
SHA is a family of cryptographic hash algorithms that have been published by NIST or the National Institute of Standards and Technology. The SHA is a U.S. Federal Information Processing Standard (FIPS). The different types of SHA algorithms are as follows:
Hashing is primarily used in two Bitcoin operations:
Specialized Bitcoin nodes called “miners” continuously mine new Bitcoin blocks and ensure that the blockchain keeps on growing in length. To understand POW mining, we need to understand a metric called “difficulty.” Remember, there are only 21 million bitcoins that will ever be mined. If the miners aren’t restricted, nothing will stop them from thoroughly mining out all the coins in existence.
Along with this, you should also keep in mind that Bitcoin must have a consistent block time of 10 mins. If this block time isn’t maintained, it results in:
To honor these restrictions, the Bitcoin protocol has a metric called “difficulty.” As is pretty self-evident, the difficulty metric defines how cryptographically hard the puzzle that the miners will need to solve has to be. The difficulty target is a 64-character string and begins with a bunch of zeroes when it comes to bitcoins. Depending on the ease of mining, the difficulty level changes every 2016th block.
A fascinating hash property that we saw during our discussion above is puzzle-friendliness. To give you a quick recap, for every output “Y,” it is infeasible to find an input “x” such that H(k|x) = Y. In this case:
The design principles of the whole mining process is as follows:
Another term that gets tossed around a lot when it comes to Bitcoin and POW cryptocurrencies, in general, is “hash rate.” It is a metric that calculates the speed with which hash operations are taking place within the network.
The second major hashing use case in cryptocurrencies is the creation of public addresses. So, how do you convert your public key to public addresses?
Hashing is one of the components that form the heart and soul of the cryptocurrencies and blockchain technology. We hope that this guide was able to clearly demonstrate to you what is hashing. Plus, we also looked into what hashing algorithm does bitcoin use to hash blocks. If you are passionate about this space, then having a general idea of how cryptographic hash functions work is an absolute necessity.
Speaking of which, if you are genuinely passionate about this space and want to forge your career here, then come on over! Enroll for one of our world-famous blockchain education courses at Ivan on Tech Academy. The courses have been created by industry experts and our very own in-house trainers.
Get to work in a fast growing industry. Start learning blockchain together with our 20,000+ students today.