Sort CSV by Column in Bash

Bash Sort CSV by Column

Using sort Command

Use the sort command to sort a CSV file by column.
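The command itself is not reproduced in this section, so the following is a minimal sketch of the kind of invocation described next. The rows in myFile.csv are hypothetical; only the Name and Age column names come from the article.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

# -t, sets the field separator to a comma; -k2 sorts on the second field onward.
sort -t, -k2 myFile.csv
```

With this sample data, the rows come out in Age order and the header line lands last.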

In this example, the sort command sorts the myFile.csv file by column. The -t option specifies the field separator, which is a comma (,) here, and the -k option selects the column to sort by.

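In this case, -k2 selects the second column (Age, which contains numeric values) as the sort key. After the command runs, the Age column is sorted in ascending order. Note that -k2 on its own defines a key that runs from the second field to the end of the line, and sort compares keys as text unless -n is given; to compare only the second field, and numerically, use -k2,2n.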

Note that the header row is sorted along with the data and ends up at the end of the output, because its second field (Age) is text, which sorts after digits. Now, see the example below to understand how to sort the file while keeping the header in place.
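A minimal sketch of sorting while keeping the header in place, assuming the same hypothetical sample rows. It uses -k2,2n so that only the Age field is compared, and compared as a number.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

# Print the header first, then sort only the data rows (line 2 onward).
# -k2,2n limits the key to the second field and compares it numerically.
head -n 1 myFile.csv && tail -n +2 myFile.csv | sort -t, -k2,2n
```

The pipe binds more tightly than &&, so only the output of tail is passed to sort.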

In the above example, the head command reads the myFile.csv file and prints only its first line (the header); the -n option sets the number of lines to print, which is 1 in this case. The tail command then skips the first line (the header) so that it is excluded from sorting, and its output is piped to the sort command, which sorts by the second column of the file.

To sort a CSV column in descending order, add the -r option.
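A minimal sketch of a descending sort, again assuming hypothetical sample rows. Here -n and -r are passed as global options with a plain -k2,2 key: sort applies global ordering options only to keys that carry no ordering letters of their own, so writing -k2,2n -r would leave the key in ascending order.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

# Keep the header first, then sort the data rows by Age, largest first.
# The plain -k2,2 key inherits both -n (numeric) and -r (reverse).
head -n 1 myFile.csv && tail -n +2 myFile.csv | sort -t, -n -r -k2,2
```

An equivalent spelling attaches both letters to the key itself: -k2,2nr.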

Here is another example, this time sorting a non-numeric column:
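A minimal sketch of sorting by the Name column, assuming hypothetical sample rows. The key is written as -k1,1 so that only the first field takes part in the comparison.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

# Keep the header first, then sort the data rows alphabetically by Name.
head -n 1 myFile.csv && tail -n +2 myFile.csv | sort -t, -k1,1
```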

This example is the same as above; we only replaced the key -k2 with -k1 because we wanted to sort the non-numeric Name column alphabetically.

Note that the original file remains unchanged. The sorted output is printed to the console by default. If you want to save the sorted output to a file, see the example below:

This command is the same as above, but it redirects the output to the sortedFile.csv file using the > operator. If the file does not exist, it is created. If the file already exists, it is overwritten with the new sorted output.

To verify, let’s check the content of sortedFile.csv.
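The redirect and the check described above can be sketched as follows, assuming hypothetical sample rows. The braces group the two commands so that both the header and the sorted rows go to the file.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

# Write the header plus the rows sorted by Name to sortedFile.csv.
{ head -n 1 myFile.csv; tail -n +2 myFile.csv | sort -t, -k1,1; } > sortedFile.csv

# Print the new file to confirm its content; myFile.csv itself is unchanged.
cat sortedFile.csv
```

Do not redirect to the input file itself: the shell truncates the target before the commands read it.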

Using csvsort Utility

Use the csvsort utility in bash to sort the CSV file by column.

This example uses the csvsort utility with the -c option to sort the myFile.csv file by column. The -c option specifies the column to sort by, which is column 2 in this case. The sorted output is then redirected to the sortedFile.csv file.

Let’s check the sortedFile.csv content to see the sorted column.
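A minimal sketch of the csvsort call and the follow-up check, assuming hypothetical sample rows and assuming csvkit is installed. The guard around the call is an addition for illustration, so that the snippet does nothing harmful when csvsort is missing.

```shell
# Hypothetical sample data; only the Name and Age column names come from the article.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'Bob,35' > myFile.csv

if command -v csvsort >/dev/null 2>&1; then
    # -c 2 sorts by the second column (Age); the header row stays first.
    csvsort -c 2 myFile.csv > sortedFile.csv
    cat sortedFile.csv
else
    echo "csvsort not found; install csvkit first" >&2
fi
```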

We can observe that column number 2 (Age) is sorted.

Please note that csvsort treats the first row as the header row by default and preserves it in the output. If your CSV file doesn’t have a header row, pass the --no-header-row option so that csvsort treats the first row as data.

Here’s an example command to sort a CSV file without a header row:
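A minimal sketch, assuming csvkit is installed. The file name noHeader.csv and its three-column rows are hypothetical and chosen only for illustration.

```shell
# Hypothetical three-column file with no header row.
printf '%s\n' 'John,30,Paris' 'Alice,25,Rome' 'Bob,35,Oslo' > noHeader.csv

if command -v csvsort >/dev/null 2>&1; then
    # Treat every row as data and sort by the second column.
    csvsort --no-header-row -c 2 noHeader.csv
else
    echo "csvsort not found; install csvkit first" >&2
fi
```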

The above command sorts the file by column 2, treating the first row as data instead of a header row. In the output, a, b and c in the first row are placeholder column names that csvsort generates in place of a header.

To use csvsort, you must install csvkit. If it is not installed on your system, run pip install csvkit to install it via pip. Once it is installed, the csvsort utility is available.

Bash Sort CSV by Multiple Columns

To sort a CSV file by more than one column, use the sort command with the -k option and give it a start and end column separated by a comma.

Let’s change the content of the file as shown below:
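The changed file content is not reproduced in this section, so the rows below are hypothetical, chosen so that some names repeat. The sketch runs the -k1,2 form and then an alternative with two independent keys.

```shell
# Hypothetical content with repeated names.
printf '%s\n' 'Name,Age' 'John,30' 'Alice,25' 'John,22' 'Bob,35' 'Alice,31' > myFile.csv

# One key spanning fields 1 through 2, compared as text.
head -n 1 myFile.csv && tail -n +2 myFile.csv | sort -t, -k1,2

# Alternative: two independent keys, Name as text and then Age as a number.
head -n 1 myFile.csv && tail -n +2 myFile.csv | sort -t, -k1,1 -k2,2n
```

Both commands give the same order for this data. They can differ when ages have different numbers of digits, because only the second form compares Age numerically.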

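As you can see, because we passed the -k1,2 option, the rows are sorted by name (1st column) first, and rows with the same name are then ordered by age (2nd column). Strictly speaking, -k1,2 defines a single key that spans fields 1 through 2 and is compared as text; to sort by name and then by age numerically, pass two separate keys, such as -k1,1 -k2,2n.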

That’s all about how to sort a CSV file by column in Bash.
