Check if Two Files are the Same in Bash [7 Ways]

Table of Contents

Using cmp Command
Using Hash Values
Using -ef Operator
Using diff Command
Using vimdiff Command
Using cksum Command
Using rdfind Command

Throughout this article, we will use the following three text files to practice the provided solutions. File1.txt and File2.txt have identical content, while File3.txt is different.

File1.txt:

This is Line number 1.
This is Line Number 2.
It is Line number 3.
I am Line 4.

File2.txt:

This is Line number 1.
This is Line Number 2.
It is Line number 3.
I am Line 4.

File3.txt:

Greetings! Welcome to Java2Blog!

Using `cmp` Command

Use the cmp command to check if two files are the same in Bash.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
if cmp -s "$file1" "$file2"; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.

In the above example, we initialized the file1 and file2 variables with the paths of two text files. Then, we used the if statement with the cmp command to check whether the content of $file1 and $file2 was the same.

If it was the same, the echo from the if block was executed; otherwise, the echo from the else block was executed.

How did this command work? The cmp command compared $file1 and $file2 byte by byte and reported through its exit status whether the two files were identical: 0 means identical, 1 means different, and a higher value means an error occurred. Here, the -s option (also available as --quiet or --silent) suppressed cmp's normal output.

If the files are identical, cmp prints nothing. If they differ, cmp without -s prints the byte offset and the line number of the first difference.
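To see what -s hides, the following sketch runs cmp with and without the option on two throwaway files. The temporary file names are assumptions made for illustration, and the exact wording of cmp's message can vary between implementations.

```bash
#!/bin/bash
# Throwaway files (hypothetical names) that differ in the first letter of line 2.
tmpdir=$(mktemp -d)
printf 'This is Line number 1.\nThis is Line Number 2.\n' > "$tmpdir/a.txt"
printf 'This is Line number 1.\nthis is Line Number 2.\n' > "$tmpdir/b.txt"

# Without -s, cmp reports where the first difference is.
# cmp exits with 1 when the files differ, so "|| true" keeps the script going.
report=$(cmp "$tmpdir/a.txt" "$tmpdir/b.txt" || true)
echo "cmp reported: $report"

# With -s, nothing is printed; only the exit status carries the result.
silent=$(cmp -s "$tmpdir/a.txt" "$tmpdir/b.txt" || true)
echo "cmp -s printed ${#silent} characters"
```

Line 1 plus its newline takes 23 bytes, so the first difference is reported at byte 24 on line 2.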

Let’s take another example where the files are different.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File3.txt"
if cmp -s "$file1" "$file2"; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are different.

Using Hash Values

Use sha1sum to calculate hash values to check if two files are the same in Bash.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
hash1=$(sha1sum "$file1" | cut -d ' ' -f 1)
hash2=$(sha1sum "$file2" | cut -d ' ' -f 1)
if [ "$hash1" == "$hash2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.

Here, we used sha1sum to generate a SHA-1 hash for $file1 and piped the output to the cut command to grab only the hash, because sha1sum prints its output as the hash followed by the file path.

With the cut command, we used the -d option to specify a space as the delimiter for splitting the input into fields, while -f 1 selected the first field, the hash, which we stored in the hash1 variable.

We repeated the process for $file2 and stored its hash in the hash2 variable. Then, we used the if statement with the == operator to check whether both hashes were the same. If so, the echo inside the if block displayed a message saying both files are the same; otherwise, the echo in the else block ran.
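As a variation on the cut step, the hash can also be separated from the path with shell parameter expansion. This is only a sketch: file_hash is a hypothetical helper name, and the temporary files stand in for File1.txt and File2.txt.

```bash
#!/bin/bash
# file_hash is a hypothetical helper: it prints only the hash of one file.
file_hash() {
    local line
    line=$(sha1sum "$1") || return 1
    # sha1sum prints "<hash>  <path>"; drop everything from the first space onward.
    echo "${line%% *}"
}

# Throwaway files with identical content.
tmpdir=$(mktemp -d)
printf 'I am Line 4.' > "$tmpdir/one.txt"
printf 'I am Line 4.' > "$tmpdir/two.txt"

if [ "$(file_hash "$tmpdir/one.txt")" = "$(file_hash "$tmpdir/two.txt")" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi
```

Wrapping the pipeline in a function keeps the comparison line short and makes it easy to swap sha1sum for another hashing command.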

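If launching the script fails with an error similar to bash: ...: Permission denied, the script file is not executable; run chmod u+x followed by the script's name to fix that. If sha1sum itself reports Permission denied for /home/user/File1.txt, you lack read permission on that file; chmod u+r /home/user/File1.txt grants it, provided you own the file.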

Let’s take another example below:


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File3.txt"
hash1=$(sha1sum "$file1" | cut -d ' ' -f 1)
hash2=$(sha1sum "$file2" | cut -d ' ' -f 1)
if [ "$hash1" == "$hash2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are different.

Similarly, you can use md5sum and sha256sum to calculate hash values to check if two files are the same in Bash.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
hash1=$(md5sum "$file1" | cut -d ' ' -f 1)
hash2=$(md5sum "$file2" | cut -d ' ' -f 1)
if [ "$hash1" == "$hash2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
hash1=$(sha256sum "$file1" | cut -d ' ' -f 1)
hash2=$(sha256sum "$file2" | cut -d ' ' -f 1)
if [ "$hash1" == "$hash2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.

Using `-ef` Operator

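Use the -ef operator to check if two paths refer to the same file in Bash. The operator is true when both paths point to the same device and inode, as with hard links or a symbolic link and its target. It does not compare content, so two separate files with identical content are reported as different. The first example below therefore assumes that File2.txt is a hard link (or symbolic link) to File1.txt.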


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
if [ "$file1" -ef "$file2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.

Let’s take another example below:


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File3.txt"
if [ "$file1" -ef "$file2" ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are different.
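Because -ef tests whether two paths point to the same underlying file (same device and inode) rather than comparing content, a copy behaves differently from a link. The sketch below illustrates this with throwaway files; same_file and all file names are hypothetical, and it assumes the temporary directory's filesystem supports hard links.

```bash
#!/bin/bash
tmpdir=$(mktemp -d)
printf 'same content\n' > "$tmpdir/original.txt"
cp "$tmpdir/original.txt" "$tmpdir/copy.txt"        # separate file, same bytes
ln "$tmpdir/original.txt" "$tmpdir/hardlink.txt"    # second name for the same inode
ln -s "$tmpdir/original.txt" "$tmpdir/symlink.txt"  # symbolic link to the original

# same_file is a hypothetical helper wrapping the -ef test.
same_file() {
    if [ "$1" -ef "$2" ]; then
        echo "same file"
    else
        echo "different files"
    fi
}

same_file "$tmpdir/original.txt" "$tmpdir/hardlink.txt"
same_file "$tmpdir/original.txt" "$tmpdir/symlink.txt"
same_file "$tmpdir/original.txt" "$tmpdir/copy.txt"
```

The copy has identical content but its own inode, so only the two links count as the same file. To compare content, use cmp or one of the hash-based solutions instead.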

Using `diff` Command

Use the diff command to check if two files are the same in Bash.


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File2.txt"
diff "$file1" "$file2" > /dev/null
if [ $? -eq 0 ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are the same.

In the above example, we used the diff command to compare $file1 and $file2 line by line. We used the > operator to redirect the standard output of diff to the /dev/null device, which discards it. In other words, this line performs the comparison silently, without displaying anything on the console.

Next, we used the if statement with the -eq operator to check the exit status of the previously run diff command. If the exit status was 0, the specified files were the same; otherwise, they were different or could not be compared.

In Bash, the exit status of the most recently executed command is stored in the $? variable. By convention, a 0 exit code indicates success, while a non-zero code indicates a failure of some kind. For diff specifically, 0 means the files are the same, 1 means they differ, and 2 means trouble, such as a missing file.
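The three exit codes of diff can be told apart with a case statement. The following is a minimal sketch, not the only way to do it; diff_status and the temporary file names are hypothetical.

```bash
#!/bin/bash
# diff_status is a hypothetical helper that turns diff's exit status into a message.
diff_status() {
    local status=0
    # Capture the exit status right away, before another command overwrites $?.
    diff "$1" "$2" > /dev/null 2>&1 || status=$?
    case "$status" in
        0) echo "Both files are the same." ;;
        1) echo "Both files are different." ;;
        *) echo "diff could not compare the files." ;;
    esac
}

tmpdir=$(mktemp -d)
printf 'I am Line 4.\n' > "$tmpdir/one.txt"
printf 'I am Line 4.\n' > "$tmpdir/two.txt"
printf 'Greetings!\n' > "$tmpdir/three.txt"

diff_status "$tmpdir/one.txt" "$tmpdir/two.txt"
diff_status "$tmpdir/one.txt" "$tmpdir/three.txt"
diff_status "$tmpdir/one.txt" "$tmpdir/missing.txt"
```

Handling the error case separately avoids reporting "different" when one of the paths is simply wrong.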

Let’s have a look at another example below:


#!/bin/bash
file1="/home/user/File1.txt"
file2="/home/user/File3.txt"
diff "$file1" "$file2" > /dev/null
if [ $? -eq 0 ]; then
    echo "Both files are the same."
else
    echo "Both files are different."
fi

OUTPUT:

Both files are different.

Use the diff command with the -q option to check if two files are the same without an if-else statement. It prints a one-line message if the files differ and nothing if they are the same.


#!/bin/bash
diff -q /home/user/File1.txt /home/user/File3.txt

OUTPUT:

Files /home/user/File1.txt and /home/user/File3.txt differ

Use the diff command with the -c option to retrieve a comparison of two files in context mode.


#!/bin/bash
diff -c /home/user/File1.txt /home/user/File3.txt

OUTPUT:

*** /home/user/File1.txt        2023-07-14 05:20:08.852722832 +0000
--- /home/user/File3.txt        2023-07-20 09:58:58.717339962 +0000
***************
*** 1,4 ****
! This is Line number 1.
! This is Line Number 2.
! It is Line number 3.
! I am Line 4.
\ No newline at end of file
--- 1 ----
! Greetings! Welcome to Java2Blog!
\ No newline at end of file

In the above output, lines marked with * refer to File1.txt, while lines marked with - refer to File3.txt. The first two lines of the output show the file paths with their modification dates and times.

The *** 1,4 **** line gives the range of lines shown from File1.txt, and the --- 1 ---- line gives the range shown from File3.txt. Here it is a single number because File3.txt has only one line; a multi-line file would show a range as well.

How do we read the differences in the output?

The + means the line is present in the second file but not in the first. Remove it from the second file or insert it in the first file to match.
The - is the opposite of +. It means the line exists in the first file but not in the second. Remove it from the first file or insert it in the second file to match.
The ! means the line was changed between the two files and needs modification to match.
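All three markers can be seen in one run with a pair of small throwaway files. This is an illustrative sketch; the file names and contents are assumptions and are unrelated to File1.txt and File3.txt.

```bash
#!/bin/bash
tmpdir=$(mktemp -d)
# old.txt has an extra first line; new.txt changes "beta" and appends "delta".
printf 'zero\nalpha\nbeta\ngamma\n' > "$tmpdir/old.txt"
printf 'alpha\nBETA\ngamma\ndelta\n' > "$tmpdir/new.txt"

# diff exits with 1 when the files differ, so "|| true" keeps the script going.
context=$(diff -c "$tmpdir/old.txt" "$tmpdir/new.txt" || true)

echo "$context" | grep '^- '    # lines only in the first file
echo "$context" | grep '^! '    # changed lines, listed once per file
echo "$context" | grep '^+ '    # lines only in the second file
```

Each marker is followed by a space in context mode, which is why the patterns include one; that also keeps the --- header lines out of the results.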

Use the diff command with the -u option for unified mode; it is similar to context mode but more compact, because the lines of both files are shown in a single block.


#!/bin/bash
diff -u /home/user/File1.txt /home/user/File3.txt

OUTPUT:

--- /home/user/File1.txt        2023-07-14 05:20:08.852722832 +0000
+++ /home/user/File3.txt        2023-07-20 09:58:58.717339962 +0000
@@ -1,4 +1 @@
-This is Line number 1.
-This is Line Number 2.
-It is Line number 3.
-I am Line 4.
\ No newline at end of file
+Greetings! Welcome to Java2Blog!
\ No newline at end of file

Let’s change the case of a letter in one of the files. We modified File1.txt by replacing This with this in line 2. Now, if we run the diff command with the -q option, it reports that the files differ; see the following example.


#!/bin/bash
diff -q /home/user/File1.txt /home/user/File2.txt

OUTPUT:

Files /home/user/File1.txt and /home/user/File2.txt differ

What happened here? By default, the diff command does case-sensitive comparison. As we modified the letter’s case, the files became different.

Use the diff command with the -i option for case-insensitive comparison.


#!/bin/bash
diff -i /home/user/File1.txt /home/user/File2.txt

This time, diff printed nothing because File1.txt and File2.txt were identical apart from letter case.
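The -i and -q options can be combined when only a yes/no answer is needed. The sketch below assumes a hypothetical helper named same_ignoring_case and two throwaway files that differ only in the case of one letter.

```bash
#!/bin/bash
# same_ignoring_case is a hypothetical helper: it succeeds when the files
# match once differences in letter case are ignored.
same_ignoring_case() {
    diff -i -q "$1" "$2" > /dev/null
}

tmpdir=$(mktemp -d)
printf 'This is Line Number 2.\n' > "$tmpdir/upper.txt"
printf 'this is Line Number 2.\n' > "$tmpdir/lower.txt"

if diff -q "$tmpdir/upper.txt" "$tmpdir/lower.txt" > /dev/null; then
    echo "Case-sensitive: same."
else
    echo "Case-sensitive: different."
fi

if same_ignoring_case "$tmpdir/upper.txt" "$tmpdir/lower.txt"; then
    echo "Case-insensitive: same."
else
    echo "Case-insensitive: different."
fi
```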

If you want coloured output, replace diff with colordiff, as in colordiff FirstFilePath SecondFilePath. Install it first using the sudo apt install colordiff command. Like diff, colordiff prints output only if the specified files differ.

Is there a solution that opens an editor so we can fix unwanted differences directly? Yes, see the following section.

Using `vimdiff` Command

Use the vimdiff command to compare two files in Bash. You must install Vim Editor using sudo apt install vim to use this solution.


#!/bin/bash
vimdiff /home/user/File1.txt /home/user/File3.txt

vimdiff opens both files side by side in Vim and highlights the differences.

Type :qa and hit Enter to close both files and exit; use :wqa instead to save your changes first.

Using `cksum` Command

Use the cksum command to check if the two files are the same.


cksum /home/user/*.txt

OUTPUT:

3232966274 79 /home/user/File1.txt
3232966274 79 /home/user/File2.txt
1972114799 66 /home/user/File3.txt

The cksum command produces a CRC (Cyclic Redundancy Check) checksum for each file given as an argument and prints the checksum, the byte count, and the file name/path. If no file is specified, it reads from standard input.

We used cksum to get checksum values for all the .txt files in the specified directory. If two files have the same checksum and byte count, they are very likely the same; a CRC is short, so a byte-by-byte check with cmp is the way to be certain. Different values mean the files differ.
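Instead of comparing the values by eye, a script can compare them directly. The following is a rough sketch; same_checksum and the temporary file names are hypothetical.

```bash
#!/bin/bash
# same_checksum is a hypothetical helper: it succeeds when both the CRC
# and the byte count of the two files match.
same_checksum() {
    # Reading from standard input makes cksum print the checksum and size without a path.
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

tmpdir=$(mktemp -d)
printf 'I am Line 4.' > "$tmpdir/one.txt"
printf 'I am Line 4.' > "$tmpdir/two.txt"
printf 'Greetings!' > "$tmpdir/three.txt"

if same_checksum "$tmpdir/one.txt" "$tmpdir/two.txt"; then
    echo "one.txt and two.txt have the same checksum."
fi

if ! same_checksum "$tmpdir/one.txt" "$tmpdir/three.txt"; then
    echo "one.txt and three.txt have different checksums."
fi
```

Feeding the files through standard input keeps the paths out of the output, so the two results can be compared as plain strings.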

Is there a solution that removes duplicate files once they are found? Can we preview what would be deleted first? Yes, let’s go to the following section.

Using `rdfind` Command

Use the rdfind command with the -deleteduplicates option to find duplicate files in the given directory and subdirectories, and delete them.


rdfind -deleteduplicates true /home/user

OUTPUT:

Now scanning "/home/user", found 3 files
Now have 3 files in total.
Removed 0 files due to nonunique device and inode
Total size is 193 bytes or 193 B
Removed 1 files due to unique sizes from list. 2 files left.
Now eliminating candidates based on first bytes:removed 0 files from list. 2 files left.
Now eliminating candidates based on last bytes:removed 0 files from list. 2 files left.
Now eliminating candidates based on sha1 checksum:removed 0 files from list. 2 files left.
It seems like you have two files that are not unique
Totally, 80 B can be reduced.
Now, making results file results.txt
Now deleting duplicates:
Deleted 1 files.

Running rdfind with the -dryrun true option first, as in rdfind -dryrun true -deleteduplicates true /home/user, lets you preview what would be deleted without actually removing anything.

That’s all about how to check if two files are the same in Bash.
