How to Calculate SHA256 Hash of a File in Python

Hey! Have you ever wanted to check whether a file has been tampered with or changed in any way? Calculating the SHA256 hash of the file in Python can help you do just that!

The SHA256 hash of a file can be calculated with the built-in hashlib module. This involves opening the file in binary mode, creating a SHA256 hash object, updating the object with the file’s contents, and obtaining the hash value by calling the hexdigest() method.

Introduction

In today’s world of digital security and cryptography, it is essential to know that a file transferred over a network is the same file that is received at the destination. Hashing plays an important role in making this possible. A hash function is a one-way function that takes an input (a message or data) and produces a fixed-size output, known as a hash value or digest. SHA256, which always produces a 256-bit (32-byte) digest, is one of the most widely used hash functions and is commonly used in file integrity checks, password handling, and digital signatures.
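To make the "fixed-size output" property concrete, here is a minimal sketch. The two input values are made up for illustration and are not part of the article's example; the point is only that inputs of very different lengths produce digests of the same length.

```python
import hashlib

# Two illustrative inputs of very different sizes (assumed values)
short_message = b"a"
long_message = b"a" * 1_000_000

short_digest = hashlib.sha256(short_message).hexdigest()
long_digest = hashlib.sha256(long_message).hexdigest()

# A SHA256 digest is 256 bits, i.e. 64 hexadecimal characters,
# regardless of how long the input was
print(len(short_digest), len(long_digest))

# Different inputs give different digests
print(short_digest != long_digest)
```

Both lengths print as 64, since each hexadecimal character encodes 4 of the 256 bits.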

In this article, we are going to learn how Python makes it easy to calculate the SHA256 hash of a file using the built-in hashlib module.

Prerequisites

To follow the examples and explanations below, you’ll need:

  • Python 3.6 or higher installed on your system
  • A text editor or IDE of your choice

Calculating the SHA256 Hash of a File

Here’s the step-by-step process for calculating the SHA256 hash of a file in Python:

Step 1: Import the hashlib module

The hashlib module provides a range of hash functions, including SHA256. To use SHA256, we need to import the hashlib module. As it is a built-in module, we don’t need to install it separately.

import hashlib
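As a quick check that SHA256 is available, the sketch below inspects hashlib.algorithms_guaranteed, the set of algorithm names that hashlib supports on every platform. The variable names are chosen here for illustration only.

```python
import hashlib

# Names of the hash algorithms guaranteed to exist on all platforms
available = hashlib.algorithms_guaranteed

# SHA256 is one of them, so no extra installation is needed
has_sha256 = "sha256" in available
print(has_sha256)

# Show the full list in a stable order
print(sorted(available))
```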

Step 2: Create a function to calculate the SHA256 hash

Next, we are going to create a function that takes the path of a file as input and returns the SHA256 hash value of the file as output.

Function to calculate the SHA256 hash

def calculate_sha256(filepath):
    # Initialize the SHA256 hash object
    sha256_hash = hashlib.sha256()

    # Open the file in binary mode
    with open(filepath, 'rb') as f:
        # Read the file in chunks of 4096 bytes
        for chunk in iter(lambda: f.read(4096), b''):
            # Update the hash object with the chunk
            sha256_hash.update(chunk)

    # Get the hexadecimal digest of the hash value
    sha256_hex = sha256_hash.hexdigest()

    return sha256_hex

Let’s break down this function for better learning:

  • First, we create a new SHA256 hash object by calling hashlib.sha256().
  • Next, we open the file in binary mode using the built-in open() function and the ‘rb’ mode specifier. Binary mode is important because hash functions operate on raw bytes, not decoded text.
    • NOTE: ‘rb’ means opening the file for reading in binary format.
  • Then, we read the file in chunks of 4096 bytes. The two-argument form iter(callable, sentinel) calls the lambda repeatedly until f.read() returns the empty bytes object b'', which signals the end of the file. Reading in chunks keeps memory usage low even for large files, because the whole file is never loaded at once.
  • For each chunk, we update the hash object using the update() method.
  • Finally, we get the hexadecimal digest of the hash value using the hexdigest() method and return it.
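Feeding the data to update() in chunks gives the same result as hashing everything in one call, which is why the chunked loop is safe to use. The sketch below checks this on in-memory bytes; the sample data is an assumption made for illustration.

```python
import hashlib

# Illustrative data, larger than one 4096-byte chunk (assumed value)
data = b"example data " * 1000

# Hash everything in a single call
one_shot = hashlib.sha256(data).hexdigest()

# Hash the same bytes in 4096-byte pieces, as the function above does
chunked = hashlib.sha256()
for start in range(0, len(data), 4096):
    chunked.update(data[start:start + 4096])

print(chunked.hexdigest() == one_shot)  # True
```

The chunk size affects only how much memory is used per read, not the resulting hash.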

Step 3: Calculate the SHA256 hash of a file using the custom function

In the previous step, we created our custom calculate_sha256() function. Now, we are going to use it to calculate the hash value of a file.

Code to call the custom function:

filepath = '/path/to/file'
sha256_hex = calculate_sha256(filepath)
print(sha256_hex)

In the above code, we store the path of the file whose hash we want to calculate in the variable “filepath”. Then, we call our calculate_sha256() function with filepath as input and store the returned hash value in the sha256_hex variable. Finally, we print the hash value to the console using the print() function.

Final Code:

The code below can be run by simply copying and pasting it, but make sure to change the file path to match your local filesystem 🙂

import hashlib


def calculate_sha256(filepath):
    # Initialize the SHA256 hash object
    sha256_hash = hashlib.sha256()

    # Open the file in binary mode
    with open(filepath, 'rb') as f:
        # Read the file in chunks of 4096 bytes
        for chunk in iter(lambda: f.read(4096), b''):
            # Update the hash object with the chunk
            sha256_hash.update(chunk)

    # Get the hexadecimal digest of the hash value
    sha256_hex = sha256_hash.hexdigest()

    return sha256_hex


filepath = '/Users/user1/PycharmProjects/pythonProject/testing'
sha256_hex = calculate_sha256(filepath)
print(sha256_hex)

Output:

12f2d261aa9a3d2fd320b378f5b69ea00077e964a77e57e6573f7549bcbbeb7b

Process finished with exit code 0

In the above code, we got the hash value for the file “testing”. Running this code multiple times (with no change in the file content) will give the same result, so we can compare the hash of the file before and after it is transmitted over the network and tell whether the file has changed.

Let’s say the content of the file “testing” changes. The hash value will then change, as shown below.

Example:

e28e2d0242876ef5c601670f78d980adf01a305db35b369d2dd325648825159b

Process finished with exit code 0

As we can see, the hash value has changed because the file content has changed. Using this method, we can detect whether a received file has been tampered with or corrupted by comparing its hash with the hash of the original file.
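The comparison described above can be sketched as a small helper. Note that verify_sha256() is a hypothetical name introduced here, not part of the article's code, and the temporary file and its contents are made up so the example can run anywhere. The helper uses hmac.compare_digest() from the standard library for the comparison; a plain == would also work for a simple integrity check.

```python
import hashlib
import hmac
import os
import tempfile


def calculate_sha256(filepath):
    """Same chunked approach as the function in the article."""
    sha256_hash = hashlib.sha256()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            sha256_hash.update(chunk)
    return sha256_hash.hexdigest()


def verify_sha256(filepath, expected_hex):
    """Hypothetical helper: return True if the file matches expected_hex."""
    actual_hex = calculate_sha256(filepath)
    # lower() tolerates an expected hash written in uppercase hex
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Create a throwaway file to stand in for the transferred file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"original content")
    demo_path = tmp.name

try:
    # Hash recorded before the "transfer"
    expected = calculate_sha256(demo_path)
    unchanged_ok = verify_sha256(demo_path, expected)

    # Simulate a modification by appending one byte
    with open(demo_path, 'ab') as f:
        f.write(b"!")
    changed_ok = verify_sha256(demo_path, expected)
finally:
    os.remove(demo_path)

print(unchanged_ok)  # True
print(changed_ok)    # False
```

In practice, the expected hash would come from the sender or the download page rather than being computed locally.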

Conclusion

In this article, we learned how to calculate the SHA256 hash of a file in Python using the built-in hashlib module. We have created a function that takes the path of a file as input and returns the SHA256 hash value of the file, and we used this function to calculate the hash value of a file.

Knowing how to calculate the SHA256 hash of a file is a useful skill for anyone who works with digital files and wants to verify their authenticity and integrity. This can be particularly useful in fields such as cybersecurity, software development, and digital forensics.

Good luck with your learning!

Jerry Richard R