Read text file in Pandas

A dataset arranges data in rows and columns. The pandas module in Python lets us load such data from external files into DataFrames and work with it. The data can be stored in several file formats, including plain text files.

In this tutorial, we will read a text file into a DataFrame using the pandas module in Python.

Using the read_csv() function to read text files in Pandas

The read_csv() function is traditionally used to load data from CSV files as DataFrames in Python. A CSV file is simply a delimited text file whose values are separated by commas, so we can use the same function to read other delimited text files as well.

We can specify various parameters with this function.

  • The header parameter specifies which row, if any, supplies the column names. By default, the first row is treated as the header of the DataFrame; pass header=None if the file has no header row.
  • The index_col parameter specifies which column (or columns) to use as the row index.
  • The names parameter takes a list of column names to use.
  • The delimiter parameter (an alias for sep) sets the separator between values, which is a comma by default.
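As a rough illustration of the parameters listed above, here is a minimal sketch. The semicolon-separated sample data and the column names are made up for this example, and io.StringIO stands in for a real file on disk.

```python
import io
import pandas as pd

# Hypothetical file content: no header row, values separated by semicolons.
text = "1;Alice;30\n2;Bob;25\n"

df = pd.read_csv(
    io.StringIO(text),            # a file path could be passed here instead
    delimiter=";",                # override the default comma separator
    header=None,                  # the data has no header row
    names=["id", "name", "age"],  # supply the column names ourselves
    index_col="id",               # use the id column as the row index
)
print(df)
```

With a real file, the first argument would simply be the path to that file.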

Apart from these, many more parameters are available. They are listed in the documentation for the function on the pandas website.

See the following example.

File Content:

Code:

Output:

Notice that in the above example, the final shape of the DataFrame is (4,1). This means that the function read each line of the file into a single column. This happened because the default delimiter for the read_csv() function is a comma, and the file contains none. To fix this, we need to specify the delimiter as a space.

See the code below.

File Content:

Code:

Output:
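The single-column problem and its fix can be reproduced with a small sketch. The space-separated sample below is invented for illustration and is not the original file from this tutorial; io.StringIO is used in place of a file path.

```python
import io
import pandas as pd

# Made-up space-separated content: one header line and four data lines.
text = "a b c\n1 2 3\n4 5 6\n7 8 9\n10 11 12\n"

# Default delimiter is a comma, so every line lands in one column.
df_default = pd.read_csv(io.StringIO(text))
print(df_default.shape)  # (4, 1)

# Telling pandas that values are separated by a space splits the columns.
df_space = pd.read_csv(io.StringIO(text), delimiter=" ")
print(df_space.shape)  # (4, 3)
```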

Similarly, we can also have a file where the delimiter is a comma.

See the code below.

File Content:

Code:

Output:
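For a comma-delimited text file, a sketch along these lines should work without any delimiter argument. The sample content is hypothetical.

```python
import io
import pandas as pd

# Made-up comma-separated content standing in for a text file on disk.
text = "name,age,city\nAlice,30,London\nBob,25,Paris\n"

# No delimiter is passed because a comma is already the default separator.
df = pd.read_csv(io.StringIO(text))
print(df.shape)  # (2, 3)
```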

Using the read_table() function to read text files in Pandas

We can read tables from different files using the read_table() function in pandas. This function reads a general delimited file to a DataFrame object.

This function is essentially read_csv() with a tab ('\t') as the default delimiter instead of a comma.
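A quick way to see this relationship is to read the same tab-separated text with both functions. The sample data below is made up for the sketch.

```python
import io
import pandas as pd

# Hypothetical tab-separated content.
text = "name\tage\nAlice\t30\nBob\t25\n"

# read_table() splits on tabs by default.
df_table = pd.read_table(io.StringIO(text))

# read_csv() gives the same result once the separator is set to a tab.
df_csv = pd.read_csv(io.StringIO(text), sep="\t")

print(df_table.equals(df_csv))
```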

For example,

File Content:

Code:

Output:

Note that here also we specified the delimiter as a space. Otherwise, each line would have been read into a single column, as before. This function also supports parameters such as header, index_col, names, and more. It was deprecated in pandas 0.24 but reinstated in pandas 0.25, so it remains available in current versions.
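The following sketch shows read_table() on space-separated text, with and without the delimiter argument. The sample content and column names are assumptions made for this example.

```python
import io
import pandas as pd

# Made-up space-separated content with a header row.
text = "id name age\n1 Alice 30\n2 Bob 25\n"

# Without a delimiter, read_table() looks for tabs and finds none,
# so each line ends up in a single column.
df_single = pd.read_table(io.StringIO(text))
print(df_single.shape)  # (2, 1)

# With the delimiter set to a space, the columns are split as expected.
# index_col works the same way as it does in read_csv().
df = pd.read_table(io.StringIO(text), delimiter=" ", index_col="id")
print(df.shape)  # (2, 2)
```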

Using the read_fwf() function to read text files in Pandas

The fwf in the read_fwf() function stands for fixed-width formatted. We can use this function to load DataFrames from text files whose lines follow a fixed-width layout.

Fixed-width formatted files are not delimited by commas or tabs. Instead, each column occupies a fixed range of character positions on every line.

For example,

File Content:

Code:

Output:

Since each column in the file occupies a fixed width, this function read the contents correctly into separate columns.

That’s all about reading a text file in Pandas.
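As a minimal sketch of reading fixed-width data, the example below uses made-up content in which the columns are 8, 4, and 10 characters wide. By default read_fwf() tries to infer the column boundaries from the data; the widths parameter can be used to state them explicitly.

```python
import io
import pandas as pd

# Hypothetical fixed-width content: columns are 8, 4, and 10 characters wide.
text = (
    "name    age city      \n"
    "Alice   30  London    \n"
    "Bob     25  Paris     \n"
)

# Let pandas infer the column boundaries from the whitespace layout.
df_inferred = pd.read_fwf(io.StringIO(text))
print(df_inferred)

# Alternatively, state the width of each column explicitly.
df_widths = pd.read_fwf(io.StringIO(text), widths=[8, 4, 10])
print(df_widths.shape)  # (2, 3)
```

Explicit widths are the safer choice when some fields are blank or contain spaces, since inference relies on consistent gaps between columns.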
