A Hash function is any function that can be used to map data of arbitrary length to fixed-size length. A trivial hash function is as follows:
where and are integers and is the modular (remainder after division by ) operations. can be of any arbitrary length integer, but is within the range . However, this does not qualify for cryptographically secured hash functions and these are mainly used in Blockchain Technology to verify the integrity of data.
According to the pigeonhole principle, the major problem with a hash function is to avoid collision as we are mapping something from the arbitrary length to the fixed length.
Suppose N items are put into M containers, where N < M than at least one container must contain more than one item. Thus, the collision must be there, but it can be minimized if the cryptographer well designs the hash function.
Let us first understand what a collision is in the hash function. Given the two distinct messages, M1 & M2 to the hash function H(*), which produce the hash value H(M1) & H(M2) where H(M1) == H(M2) are the same.
Everything is done online in the digital world era: Suppose there is a contract between two parties, and signing is done on the hash value of the agreement paper. If collision exists then, the malicious party can modify the contract agreement paper and force the other party to accept it in the future as the signature is valid for a forged agreement too.
Cryptographical Hash Functions:
A cryptographic hash function is a mathematical algorithm that takes an arbitrary amount of data as input and produces a fixed-size output called Hash or Digest or Checksum. A cryptographic hash function can assure data integrity.
Example of Cryptographically Secured Hash Functions
- SHA1 (SHA160), SHA256, SHA512
- RIPEMD160, RIPEMD256
Here the number indicates the security level or steps requires to break the hashing scheme using the brute force method: For example, SHA160 or SHA1 means to find collision it requests steps. However, because of the birthday paradox, it reduces to steps.
Important note: SHA1 collision is reported in 2017 in this paper published by Google researchers.
Security requirements for Cryptographically Secure Hash Functions
- Preimage resistance – Onewayness – It means given a , we can computer , but given a , no deterministic algorithm can computer .
- Second preimage resistance – (Weak collision resistance property) – It means given a and respective , an adversary need to come up another and such that . This is assumed to be a weak collision resistance property as there is a restriction imposed to generate same hash value.
- Collision resistance – (Strong collision resistance property) – It means adversary can choose any & such that and come up with the same hash value. If any cryptographic algorithm adheres to a strong collision resistance property, it means the adversary takes more time and computing power to find a collision.
- Avalanche effect – A small change in the input data results in a significant change in the output hash value. The example is illustrated in the below diagram.
References
- NPTEL lecture series on Blockchains Architecture, Design and Use Cases by Prof. Sandip Chakraborty, IITK.
304 total views, 1 views today