Create your own blockchain using Python (pt. 1)

The basics

Jul 2, 2021

In this first part of the tutorial, we will get our hands dirty by writing our first blockchain. But before diving into the code, I think it is important for us to share a common understanding of what a blockchain is. Here, I’ll use Wikipedia’s definition.

A blockchain is a growing list of records, called blocks, that are linked together using cryptography. Each block contains a cryptographic hash of the previous block, a timestamp, and transaction data.

You can read more about blockchain here.

Creating our first Blockchain in Python

A block

Let’s start by creating a block using Object Oriented Programming. Here we create a class called Block:

# node/block.py
class Block:
    def __init__(self):
        pass

A user would create instances of Block like so:

block_1 = Block()
block_2 = Block()
block_3 = Block()

Ain’t this beautiful?

Block

A chain of blocks

Now, let’s chain them together, very much like a linked list.

# node/block.py
class Block:
    def __init__(self, previous_block=None):
        self.previous_block = previous_block


# Example
block_0 = Block()
block_1 = Block(previous_block=block_0)
block_2 = Block(previous_block=block_1)

Chain of blocks

While this is an incredibly beautiful piece of code, this chain of blocks is not a blockchain because our blocks are not linked together using cryptography.
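To make the linked-list comparison concrete, here is a minimal sketch that walks the chain backwards from the newest block. The helper name chain_length is hypothetical and not part of the tutorial's code.

```python
class Block:
    def __init__(self, previous_block=None):
        self.previous_block = previous_block


def chain_length(last_block: Block) -> int:
    """Count blocks by following previous_block links back to the first block."""
    count = 0
    current = last_block
    while current is not None:
        count += 1
        current = current.previous_block
    return count


block_0 = Block()
block_1 = Block(previous_block=block_0)
block_2 = Block(previous_block=block_1)

print(chain_length(block_2))  # 3
```

The links here are plain Python references, so nothing would reveal it if someone swapped out or edited a block in the middle of the chain.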

A blockchain

Looking at our Wikipedia definition, we see that to be a blockchain, each block needs to contain 3 things:

a cryptographic hash of the previous block
a timestamp
transaction data

We’re lazy so let’s start with the easiest task: timestamps.

# node/block.py
from datetime import datetime


class Block(object):
    def __init__(self, timestamp: float, previous_block=None):
        self.timestamp = timestamp
        self.previous_block = previous_block

Chain of blocks with timestamps

Now let’s address transaction data. Blockchains are typically meant for use as a ledger, so that’s also how we’re going to build ours. Let’s imagine a village of 3 people where transactions are dealt with in cheese and we want to store those transactions. For this example, we’ll have those 3 transactions:

Albert pays 30 cheese to Bertrand
Albert pays 10 cheese to Camille
Bertrand pays 5 cheese to Camille

Let’s add the transaction data to our block’s attributes.

# node/block.py

class Block:
    def __init__(self, timestamp: float, transaction_data: str, previous_block=None):
        self.transaction_data = transaction_data
        self.timestamp = timestamp
        self.previous_block = previous_block

Chain of blocks with transaction data and timestamps

Here, our transaction data will contain the sender’s and receiver’s names as well as the amount of cheese that was sent. We’ll store those 3 pieces of information as plain text in our block object.

# Example
from datetime import datetime

timestamp_0 = datetime.timestamp(datetime.fromisoformat('2011-11-04 00:05:23.111'))
transaction_data_0 = "Albert,Bertrand,30"
block_0 = Block(
    transaction_data=transaction_data_0,
    timestamp=timestamp_0
)

timestamp_1 = datetime.timestamp(datetime.fromisoformat('2011-11-07 00:05:13.222'))
transaction_data_1 = "Albert,Camille,10"
block_1 = Block(
    transaction_data=transaction_data_1,
    timestamp=timestamp_1,
    previous_block=block_0
)

timestamp_2 = datetime.timestamp(datetime.fromisoformat('2011-11-09 00:11:13.333'))
transaction_data_2 = "Bertrand,Camille,5"
block_2 = Block(
    transaction_data=transaction_data_2,
    timestamp=timestamp_2,
    previous_block=block_1
)
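Since the transaction data is just a comma-separated string, reading it back means splitting it again. Here is one possible sketch; the Transaction type and the parse_transaction_data helper are hypothetical names introduced for illustration only.

```python
from typing import NamedTuple


class Transaction(NamedTuple):
    sender: str
    receiver: str
    amount: int


def parse_transaction_data(transaction_data: str) -> Transaction:
    """Split a 'sender,receiver,amount' string into its three fields."""
    sender, receiver, amount = transaction_data.split(",")
    return Transaction(sender=sender, receiver=receiver, amount=int(amount))


transaction = parse_transaction_data("Albert,Bertrand,30")
print(transaction.sender, transaction.receiver, transaction.amount)  # Albert Bertrand 30
```

This only works as long as names never contain a comma, which is fine for our village of 3 people.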

Now, the most interesting piece: each block should contain a cryptographic hash of the previous block. I strongly recommend reading up on cryptographic hashes and hash functions on their Wikipedia page (yes, Wikipedia is my only source of information). But for now, here are some properties of cryptographic hash functions:

Hash functions map an input x to an output of fixed size called the hash.
Hashes are deterministic: for a given hash function, an input x will always produce the same hash.
Knowing the hash, it is infeasible to determine x (preimage resistant).
It is infeasible to find two different inputs x and y where hash(x) = hash(y) (collision resistant).
Change x slightly and hash(x) changes significantly (avalanche effect).
Knowing hash(x) and part of x, it is infeasible to find the rest of x other than by trial and error (puzzle friendliness).
Hashes are easy and fast to compute on modern computers.
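A few of these properties are easy to check for yourself. The sketch below uses SHA-256 from Python's built-in hashlib module; that choice is an assumption made so the example runs without installing anything, and the sha256_hex helper is a hypothetical name.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_hash = sha256_hex("cheese")
long_hash = sha256_hex("cheese" * 1000)

# Fixed size: 256 bits is always 64 hexadecimal characters,
# whether the input is 6 or 6000 characters long.
print(len(short_hash), len(long_hash))  # 64 64

# Deterministic: hashing the same input again gives the same hash.
print(sha256_hex("cheese") == short_hash)  # True

# Avalanche effect: change one letter and count how many of the
# 64 hexadecimal characters differ between the two hashes.
changed_hash = sha256_hex("cheesf")
differing = sum(a != b for a, b in zip(short_hash, changed_hash))
print(differing)
```

The other properties (preimage and collision resistance) can't be shown in a few lines, since they are claims about what is infeasible to compute.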

Those properties mean that if an attacker tampered with the blockchain by modifying any value inside it, the hash of every later block, including the last one, would change as well. Many hash functions exist; Bitcoin, along with a number of other cryptocurrencies, uses one called SHA-256. This hash function transforms our initial data into a 256-bit hash. In Python, hashing can easily be handled using the pycryptodome library. This package can be installed via pip:

pip3 install pycryptodome

Let’s create a function calculate_hash whose role is to calculate the hash of some data passed in as bytes.

# node/utils.py
from Crypto.Hash import SHA256


def calculate_hash(data: bytes) -> str:
    h = SHA256.new()
    h.update(data)
    return h.hexdigest()

Now, inside our Block class, we create two new properties:

cryptographic_hash: returns the hash of the block content
previous_block_cryptographic_hash: returns the previous block’s cryptographic hash

# node/block.py
import json

from utils import calculate_hash


class Block:
    def __init__(
            self,
            timestamp: float,
            transaction_data: str,
            previous_block=None,
    ):
        self.transaction_data = transaction_data
        self.timestamp = timestamp
        self.previous_block = previous_block

    @property
    def previous_block_cryptographic_hash(self):
        previous_block_cryptographic_hash = ""
        if self.previous_block:
            previous_block_cryptographic_hash = self.previous_block.cryptographic_hash
        return previous_block_cryptographic_hash

    @property
    def cryptographic_hash(self) -> str:
        block_content = {
            "transaction_data": self.transaction_data,
            "timestamp": self.timestamp,
            "previous_block_cryptographic_hash": self.previous_block_cryptographic_hash
        }
        block_content_bytes = json.dumps(block_content, indent=2).encode('utf-8')
        return calculate_hash(block_content_bytes)

Now let’s create our 3 blocks:

from datetime import datetime


timestamp_0 = datetime.timestamp(datetime.fromisoformat('2011-11-04 00:05:23.111'))
transaction_data_0 = "Albert,Bertrand,30"
block_0 = Block(
    transaction_data=transaction_data_0,
    timestamp=timestamp_0
)

timestamp_1 = datetime.timestamp(datetime.fromisoformat('2011-11-07 00:05:13.222'))
transaction_data_1 = "Albert,Camille,10"
block_1 = Block(
    transaction_data=transaction_data_1,
    timestamp=timestamp_1,
    previous_block=block_0
)

timestamp_2 = datetime.timestamp(datetime.fromisoformat('2011-11-09 00:11:13.333'))
transaction_data_2 = "Bertrand,Camille,5"
block_2 = Block(
    transaction_data=transaction_data_2,
    timestamp=timestamp_2,
    previous_block=block_1
)

And that’s it, you’ve created your first blockchain. If you want, you can try to tamper with the blockchain by changing any piece of data inside it, and you will see that the last block’s hash changes too.
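Here is a self-contained sketch of that tampering experiment. To keep it runnable without extra packages, calculate_hash is rewritten here with the standard hashlib module instead of pycryptodome, and the timestamps are arbitrary placeholder numbers; both are assumptions of this example rather than the tutorial's code.

```python
import hashlib
import json


def calculate_hash(data: bytes) -> str:
    # Stand-in for node/utils.py, using hashlib instead of pycryptodome.
    return hashlib.sha256(data).hexdigest()


class Block:
    def __init__(self, timestamp: float, transaction_data: str, previous_block=None):
        self.transaction_data = transaction_data
        self.timestamp = timestamp
        self.previous_block = previous_block

    @property
    def previous_block_cryptographic_hash(self) -> str:
        if self.previous_block:
            return self.previous_block.cryptographic_hash
        return ""

    @property
    def cryptographic_hash(self) -> str:
        block_content = {
            "transaction_data": self.transaction_data,
            "timestamp": self.timestamp,
            "previous_block_cryptographic_hash": self.previous_block_cryptographic_hash,
        }
        return calculate_hash(json.dumps(block_content, indent=2).encode("utf-8"))


# Arbitrary placeholder timestamps, chosen only for this example.
block_0 = Block(timestamp=1.0, transaction_data="Albert,Bertrand,30")
block_1 = Block(timestamp=2.0, transaction_data="Albert,Camille,10", previous_block=block_0)
block_2 = Block(timestamp=3.0, transaction_data="Bertrand,Camille,5", previous_block=block_1)

original_hash = block_2.cryptographic_hash

# Tamper with the very first block: Albert now "pays" 3000 cheese.
block_0.transaction_data = "Albert,Bertrand,3000"
tampered_hash = block_2.cryptographic_hash

print(original_hash == tampered_hash)  # False
```

Because each hash is recomputed from the previous block's hash every time it is read, a change in block_0 ripples all the way up to block_2.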

A Blockchain

A more efficient use of disk space: Merkle Trees

While the blockchain we created is awesome, in practice there is typically more than 1 transaction per block. For example, in Bitcoin there are around 2,759 transactions in each block on average. In order to save space, each block stores a summary of its content that we call its header. Inside this header, only a single hash representing all of the block’s transactions is stored. In pt. 2 of this tutorial, we will take a deep dive into Merkle Trees and how they make this possible.

Code repository

https://github.com/gruyaume/my-blockchain/tree/basics
