How to Calculate MD5 Hash of a File in Python
MD5 (Message Digest Algorithm 5) is a widely used cryptographic hash function that generates a fixed-length 128-bit hash value. In this article, we’ll learn how to calculate the MD5 hash of a file in Python.
The MD5 hash of a file can be computed in Python with the hashlib module. Calling hashlib.md5() creates an MD5 hash object, and its hexdigest() method returns the hash value as a hexadecimal string. hashlib is part of the Python standard library and supports various hash algorithms, including MD5.
Why Compute an MD5 Hash?
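MD5 is a popular hash function used for data integrity checks. It generates a fixed-size output from the input file, so it can be used to verify that the data has not been altered or corrupted. Note that MD5 is no longer considered collision-resistant, so it is better suited to detecting accidental corruption than to guarding against deliberate tampering.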
Data integrity can be checked by computing the MD5 hash of a file before transmission and comparing it with the hash computed at the recipient's end. If the values match, the file is considered unaltered and free from corruption (as shown in the image below).
![Calculate MD5 Hash of a File](https://learnerkb.com/wp-content/uploads/2023/05/Screenshot-2023-05-03-at-10.17.55-PM-1024x442.png)
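As a minimal sketch of that comparison, the snippet below hashes a "sent" file and a "received" copy and checks whether the digests match. The helper name `md5_of_file` and the temporary file names are made up for illustration; in practice, the sender's digest would usually be shared alongside the file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the MD5 hex digest of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


# Simulate a "sent" file and a "received" copy in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sent = os.path.join(tmp, "sent.bin")
    received = os.path.join(tmp, "received.bin")

    with open(sent, "wb") as f:
        f.write(b"payload")
    with open(received, "wb") as f:
        f.write(b"payload")

    # Identical contents produce identical digests.
    intact = md5_of_file(sent) == md5_of_file(received)

    # Change one byte in the received copy and compare again.
    with open(received, "wb") as f:
        f.write(b"pay1oad")
    corrupted_matches = md5_of_file(sent) == md5_of_file(received)

print("Copies match:", intact)
print("Corrupted copy matches:", corrupted_matches)
```

Even a one-byte difference between the two copies yields a different digest, which is what makes the comparison useful as a corruption check.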
What is the hashlib module?
hashlib is a module in the Python standard library that provides a variety of hash functions. It implements algorithms such as SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and MD5.
We have already discussed the SHA-256 algorithm in this article: How to Calculate SHA256 Hash of a File in Python
Each algorithm takes in an input (such as a string or file) and produces a fixed-size string of bytes as an output, which is known as a hash digest or hash value.
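To illustrate the "fixed-size" point, here is a small sketch that hashes the same input with several algorithms and prints each digest size. The sample input and the choice of algorithms are arbitrary; `hashlib.new()` is used only so the algorithm can be selected by name.

```python
import hashlib

data = b"hello world"  # arbitrary sample input
algorithms = ("md5", "sha1", "sha256", "sha512")

for name in algorithms:
    h = hashlib.new(name, data)
    # digest_size is the length of the raw digest in bytes.
    print(f"{name:>6}: {h.digest_size} bytes -> {h.hexdigest()}")

# Collect the digest sizes: they depend on the algorithm, not on the input.
sizes = {name: hashlib.new(name).digest_size for name in algorithms}
```

MD5 always produces 16 bytes (128 bits), no matter how large the input is.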
How to calculate the MD5 hash of a file in Python using hashlib?
Below is a step-by-step guide for calculating the MD5 hash of a file in Python using the hashlib module:
Step 1: Import the hashlib module
To use the MD5 hash function, we need to import the hashlib module. As it is a built-in module, we don’t need to install it separately.
import hashlib
Step 2: Open the File in binary mode
In this step, we open the file in binary mode using the open() function.
f = open(filename, "rb")
NOTE: ‘rb’ opens the file for reading in binary mode.
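Binary mode matters because hashlib works on bytes, not on text. The short sketch below (the sample string is arbitrary) shows that passing a `str` to `hashlib.md5()` raises a `TypeError`, while bytes are accepted.

```python
import hashlib

text = "hello"  # arbitrary sample string

try:
    hashlib.md5(text)  # str is rejected: hash functions operate on bytes
    str_accepted = True
except TypeError as exc:
    str_accepted = False
    print("TypeError:", exc)

# Encoding the string to bytes first works.
digest = hashlib.md5(text.encode("utf-8")).hexdigest()
print(digest)
```

Reading a file opened with "rb" returns bytes directly, so no encoding step is needed and the hash reflects the exact bytes stored on disk.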
Step 3: Read the File contents
Once the file has been opened, we can read its contents using the read() method.
file_contents = f.read()
Step 4: Calculate the MD5 hash
With the file contents stored in the variable file_contents, we can calculate the MD5 hash using the hashlib.md5() function.
hashlib.md5(file_contents)
Step 5: Convert to a hexadecimal string
Finally, we can convert the hash value to a hexadecimal string using the hexdigest() method.
hashlib.md5(file_contents).hexdigest()
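For context, the hash object exposes the same value in two forms: `digest()` returns the raw bytes and `hexdigest()` returns their hexadecimal representation. A quick sketch, using an arbitrary sample input:

```python
import hashlib

h = hashlib.md5(b"example data")  # arbitrary sample input

raw = h.digest()         # 16 raw bytes (128 bits)
hex_str = h.hexdigest()  # the same value as 32 hexadecimal characters

print(len(raw), len(hex_str))  # 16 32
print(raw.hex() == hex_str)    # True
```

The hexadecimal form is the one usually published next to downloads, which is why the steps here end with hexdigest().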
Final Code:
import hashlib

filename = "example_file.txt"

with open(filename, "rb") as f:
    # Read the contents of the file in binary mode
    file_contents = f.read()

# Calculate the MD5 hash of the file contents
md5_hash = hashlib.md5(file_contents).hexdigest()

print(f"The MD5 hash of {filename} is: {md5_hash}")
Output:
The MD5 hash of /Users/user1/PycharmProjects/pythonProject/testing.py is: 8491f54e6ce1ceca6b949ba37e393a12
Process finished with exit code 0
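The final code above reads the whole file into memory with f.read(), which can be costly for very large files. A common alternative is to feed the hash object fixed-size chunks through its update() method. The sketch below assumes a hypothetical helper name, `md5_chunked`, and a 64 KiB chunk size chosen only for illustration.

```python
import hashlib
import os
import tempfile


def md5_chunked(path, chunk_size=65536):
    """Hash a file piece by piece so memory use stays bounded.

    `md5_chunked` and the 64 KiB default are illustrative choices.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


# Quick self-check against the read-everything approach.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(200_000))
    tmp_path = tmp.name

try:
    chunked = md5_chunked(tmp_path)
    with open(tmp_path, "rb") as f:
        whole = hashlib.md5(f.read()).hexdigest()
    print(chunked == whole)
finally:
    os.remove(tmp_path)
```

Calling update() repeatedly is equivalent to hashing the concatenation of all the chunks, so both approaches give the same digest.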
Practical Examples:
Calculating the MD5 hash of a text file
Create a text file, example.txt, that contains the following sentence:
Codingspell is a platform to share knowledge in which others can able to learn from your mistake and experience. Visit codingspell.com
Once created, we can calculate the MD5 hash of this file using the code below.
NOTE: I used PyCharm as the IDE to run all the code provided in this article.
import hashlib

filename = "example.txt"

with open(filename, "rb") as f:
    # Read the contents of the file
    file_contents = f.read()

# Calculate the MD5 hash
md5_hash = hashlib.md5(file_contents).hexdigest()

print(f"MD5 hash value for the file: {filename} is: {md5_hash}")
Output:
MD5 hash value for the /Users/User1/PycharmProjects/pythonProject/example.txt is: e1bda1f0014334423d80f7f4082d3fa6
Calculating the MD5 hash of a binary file
This works the same way as calculating the MD5 hash of a text file. Before proceeding, we need to create a binary file that contains some random binary data. Once it is created, we can calculate its MD5 hash by providing the binary file as input, as shown below (I used the PyCharm IDE to run the Python code below).
For a quick test: you can download sample binary files from this link and either rename them or update the filename in the code below.
import hashlib

filename = "example.bin"

with open(filename, "rb") as f:
    # Read the contents of the file
    file_contents = f.read()

# Calculate the MD5 hash
md5_hash = hashlib.md5(file_contents).hexdigest()

print(f"MD5 hash value for the file: {filename} is: {md5_hash}")
Output:
MD5 hash value for the file: /Users/User1/PycharmProjects/pythonProject/example.bin is: e1bda1f0014334423d80f7f4082d3fa6
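On Python 3.11 and later, hashlib also offers file_digest(), which takes an open binary file object and handles the reading itself. The sketch below is one possible way to use it with a fallback for older versions; the helper name `md5_of_fileobj` is hypothetical, and io.BytesIO stands in for a file opened with open(filename, "rb").

```python
import hashlib
import io


def md5_of_fileobj(fileobj):
    """Return the MD5 hex digest of a binary file object (hypothetical helper)."""
    if hasattr(hashlib, "file_digest"):
        # Available on Python 3.11+.
        return hashlib.file_digest(fileobj, "md5").hexdigest()
    # Fallback for older versions: feed chunks to update() manually.
    md5 = hashlib.md5()
    for chunk in iter(lambda: fileobj.read(65536), b""):
        md5.update(chunk)
    return md5.hexdigest()


# io.BytesIO stands in for a real binary file here.
payload = b"some binary \x00\x01\x02 data"
sample = io.BytesIO(payload)
result = md5_of_fileobj(sample)

# The result matches hashing the same bytes directly.
print(result == hashlib.md5(payload).hexdigest())
```

With a real file, the call would look like `with open("example.bin", "rb") as f: md5_of_fileobj(f)`.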
Conclusion:
In summary, we’ve learned to calculate the MD5 hash of a file in Python using the built-in `hashlib` module. We discussed each step in detail: how to open a file in binary mode, read its contents, and calculate the MD5 hash of the file contents using the `hashlib.md5()` function.
We also shared a few examples that calculate the MD5 hash of a custom text file and a binary file. Calculating the MD5 hash of a file is a useful technique for verifying data integrity, detecting changes in files, and checking that files haven’t been altered or corrupted. With the `hashlib` module in Python, calculating the MD5 hash of a file is straightforward.
Good luck with your learning!