Merkle Tree contract

A Merkle tree, also known as a hash tree, is a data structure used in cryptography and computer science to verify data integrity and consistency. It is a binary tree where each leaf node represents the cryptographic hash of some data (a transaction for example), and each non-leaf node represents the cryptographic hash of its child nodes. This hierarchical structure allows efficient and secure verification of the data integrity.

Here's a quick summary of how it operates and what functionalities it supports:

How it works:

  1. Leaves Creation:
    • Some data is hashed to create a leaf node.
  2. Intermediate Nodes Creation:
    • Pairwise hashes of the leaf nodes are combined and hashed again to create parent nodes.
    • This process continues until only one hash remains, known as the Merkle root.
  3. Merkle Root:
    • The final hash at the top of the tree, representing the entire dataset.
    • Changing any single data block will change its corresponding leaf node, which will propagate up the tree, altering the Merkle root.

Key Features:

  1. Efficient Verification:

    • Only a small subset of the tree (the Merkle proof) is needed to verify the inclusion of a particular data block, reducing the amount of data that must be processed.
  2. Data Integrity:

    • The Merkle root ensures the integrity of all the underlying data blocks.
    • Any alteration in the data will result in a different root hash.

Examples of use cases:

  1. Fundamental use case: Ethereum blockchain integrity

    • Cryptocurrencies like Ethereum use Merkle trees to efficiently verify and maintain transaction integrity within blocks.
    • Each transaction in a block is hashed to form leaf nodes, and these hashes are recursively combined to form a single Merkle root, summarizing all transactions.
    • The Merkle root is stored in the block header, which is hashed to generate the block's unique identifier.
    • Guaranteed Integrity: Any change to a transaction alters the Merkle root, block header, and block hash, making it easy for nodes to detect tampering.
    • Transaction verification: Nodes can verify specific transactions via Merkle proofs without downloading the entire block.
  2. Whitelist inclusion

    • Merkle trees allow efficient whitelist verification without storing the full list on-chain, reducing storage costs.
    • The Merkle root of the whitelist is stored on-chain, while the full list remains off-chain.
    • To verify if an address is on the whitelist, a user provides a Merkle proof and the address. The Merkle root is recalculated using the provided data and compared to the stored on-chain root. If they match, the address is included; if not, it's excluded.
  3. Decentralized Identity Verification

    • Merkle trees can be used in decentralized identity systems to verify credentials.
    • Off-chain data: a user's credentials.
    • On-chain data: the Merkle root representing the credentials.

Visual example

Diagram of the Merkle Tree

The above diagram represents a merkle tree.
Each leaf node is the hash of some data.
Each other node is the hash of the combination of both children nodes.

If we were to verify the hash 6, the merkle proof would need to contain the hash 5, hash 12and hash 13:

  1. The hash 5 would be combined with the hash 6 to re-compute the hash 11.
  2. The newly computed hash 11 in step 1 would be combined with hash 12 to re-compute hash 14.
  3. The hash 13 would be combined with the newly computed hash 14 in step 2 to re-compute the merkle root.
  4. We can then compare the computed resultant merkle root with the one provided to the verify function.

Code

The following implementation is the Cairo adaptation of the Solidity by Example - Merkle Tree contract.

#[generate_trait]
pub impl ByteArrayHashTraitImpl of ByteArrayHashTrait {
    fn hash(self: @ByteArray) -> felt252 {
        let mut serialized_byte_arr: Array<felt252> = ArrayTrait::new();
        self.serialize(ref serialized_byte_arr);

        core::poseidon::poseidon_hash_span(serialized_byte_arr.span())
    }
}

#[starknet::interface]
pub trait IMerkleTree<TContractState> {
    fn build_tree(ref self: TContractState, data: Array<ByteArray>) -> Array<felt252>;
    fn get_root(self: @TContractState) -> felt252;
    // function to verify if leaf node exists in the merkle tree
    fn verify(
        self: @TContractState, proof: Array<felt252>, root: felt252, leaf: felt252, index: usize
    ) -> bool;
}

mod errors {
    pub const NOT_POW_2: felt252 = 'Data length is not a power of 2';
    pub const NOT_PRESENT: felt252 = 'No element in merkle tree';
}

#[starknet::contract]
pub mod MerkleTree {
    use core::poseidon::PoseidonTrait;
    use core::hash::{HashStateTrait, HashStateExTrait};
    use starknet::storage::{
        StoragePointerWriteAccess, StoragePointerReadAccess, Vec, MutableVecTrait, VecTrait
    };
    use super::ByteArrayHashTrait;

    #[storage]
    struct Storage {
        pub hashes: Vec<felt252>
    }

    #[derive(Drop, Serde, Copy)]
    struct Vec2 {
        x: u32,
        y: u32
    }

    #[abi(embed_v0)]
    impl IMerkleTreeImpl of super::IMerkleTree<ContractState> {
        fn build_tree(ref self: ContractState, mut data: Array<ByteArray>) -> Array<felt252> {
            let data_len = data.len();
            assert(data_len > 0 && (data_len & (data_len - 1)) == 0, super::errors::NOT_POW_2);

            let mut _hashes: Array<felt252> = ArrayTrait::new();

            // first, hash every leaf
            for value in data {
                _hashes.append(value.hash());
            };

            // then, hash all levels above leaves
            let mut current_nodes_lvl_len = data_len;
            let mut hashes_offset = 0;

            while current_nodes_lvl_len > 0 {
                let mut i = 0;
                while i < current_nodes_lvl_len - 1 {
                    let left_elem = *_hashes.at(hashes_offset + i);
                    let right_elem = *_hashes.at(hashes_offset + i + 1);

                    let hash = PoseidonTrait::new().update_with((left_elem, right_elem)).finalize();
                    _hashes.append(hash);

                    i += 2;
                };

                hashes_offset += current_nodes_lvl_len;
                current_nodes_lvl_len /= 2;
            };

            // write to the contract state (useful for the get_root function)
            for hash in _hashes.span() {
                self.hashes.append().write(*hash);
            };

            _hashes
        }

        fn get_root(self: @ContractState) -> felt252 {
            let merkle_tree_length = self.hashes.len();
            assert(merkle_tree_length > 0, super::errors::NOT_PRESENT);

            self.hashes.at(merkle_tree_length - 1).read()
        }

        fn verify(
            self: @ContractState,
            mut proof: Array<felt252>,
            root: felt252,
            leaf: felt252,
            mut index: usize
        ) -> bool {
            let mut current_hash = leaf;

            while let Option::Some(value) = proof.pop_front() {
                current_hash =
                    if index % 2 == 0 {
                        PoseidonTrait::new().update_with((current_hash, value)).finalize()
                    } else {
                        PoseidonTrait::new().update_with((value, current_hash)).finalize()
                    };

                index /= 2;
            };

            current_hash == root
        }
    }
}
Last change: 2024-10-01, commit: 3698499