Read text file in Pandas

A dataset arranges data neatly in rows and columns. The pandas module in Python lets us load such data from external files into DataFrames and work on it. The data can be stored in different types of files.

In this tutorial, we will read a text file using the pandas module in Python.

Using the read_csv() function to read text files in Pandas

The read_csv() function is traditionally used to load data from CSV files as DataFrames in Python. A CSV file is simply a delimited text file whose values are separated by commas, so we can use this function to read other delimited text files as well.

We can specify various parameters with this function.

  • The header parameter controls which row, if any, is used for the column names. By default, the first row is treated as the header of the DataFrame; pass header=None if the file has no header row.
  • The index_col parameter specifies which column to use as the row index.
  • The names of the columns can be supplied with the names parameter.
  • The delimiter parameter (an alias for sep) sets the separator between values, which is a comma by default.

Apart from these, there are many more parameters available. They are listed in the documentation for the function on the pandas website.
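As a minimal sketch of how these parameters fit together, the snippet below reads a small headerless, semicolon-separated sample. The sample contents and the column names are assumptions made for illustration, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical file contents: no header row, values separated by semicolons.
raw = "1;Alice;30\n2;Bob;25\n3;Carol;41\n"

df = pd.read_csv(
    io.StringIO(raw),             # a file path such as "data.txt" works the same way
    delimiter=";",                # separator between values
    header=None,                  # the file has no header row
    names=["id", "name", "age"],  # column names supplied by hand
    index_col="id",               # use the "id" column as the row index
)

print(df)
```

With a real file, the first argument would be the path to the file instead of the StringIO object.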

See the following example.

File Content:

Code:

Output:

Notice that in the above example, the final shape of the DataFrame is (4,1). This means that the function read each line of the file into a single column. This happened because the default delimiter of the read_csv() function is a comma, and the values in the file are separated by spaces. To rectify this, we need to specify the delimiter as a space.
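The single-column result described above can be reproduced with a short sketch. The sample contents here are hypothetical (a header line plus four space-separated rows), and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical space-separated contents: one header line and four data rows.
raw = "a b c\n1 2 3\n4 5 6\n7 8 9\n10 11 12\n"

# No delimiter given, so read_csv() splits on commas and finds none.
df = pd.read_csv(io.StringIO(raw))

print(df.shape)             # (4, 1)
print(df.columns.tolist())  # ['a b c']
```

Each whole line ends up as one value, and the header line becomes the name of the only column.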

See the code below.

File Content:

Code:

Output:
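A minimal sketch of reading a space-separated file, assuming hypothetical sample contents and using io.StringIO in place of a real file path:

```python
import io

import pandas as pd

# Hypothetical space-separated contents: one header line and four data rows.
raw = "a b c\n1 2 3\n4 5 6\n7 8 9\n10 11 12\n"

# Passing a space as the delimiter splits every line into three values.
df = pd.read_csv(io.StringIO(raw), delimiter=" ")

print(df.shape)             # (4, 3)
print(df.columns.tolist())  # ['a', 'b', 'c']
```

If the values in a file are separated by a variable number of spaces, sep=r"\s+" is a common alternative to a single space.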

Similarly, we can also have a file where the delimiter is a comma.

See the code below.

File Content:

Code:

Output:
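For a comma-separated text file, the defaults are enough. The sketch below assumes hypothetical sample contents, with io.StringIO standing in for a file path.

```python
import io

import pandas as pd

# Hypothetical comma-separated contents: one header line and four data rows.
raw = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n10,11,12\n"

# The default delimiter is a comma, so no extra arguments are needed.
df = pd.read_csv(io.StringIO(raw))

print(df.shape)  # (4, 3)
```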

Using the read_table() function to read text files in Pandas

We can read tables from different files using the read_table() function in pandas. This function reads a general delimited file to a DataFrame object.

This function is essentially read_csv() with a tab ('\t') as the default delimiter instead of a comma.
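The relationship between the two functions can be checked with a small sketch. The tab-separated sample is hypothetical, and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical tab-separated contents: one header line and two data rows.
raw = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"

df_table = pd.read_table(io.StringIO(raw))          # tab is the default separator
df_csv = pd.read_csv(io.StringIO(raw), sep="\t")    # same result with an explicit tab

print(df_table.equals(df_csv))  # True
```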

For example,

File Content:

Code:

Output:

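Note that here, too, we specified the delimiter as a space. Otherwise, the whole file would again have been read into a single column, because read_table() splits on tabs by default. This function also supports parameters such as header, index_col, names, and more. It was deprecated in pandas 0.24, but the deprecation was reverted in pandas 0.25, so it remains available in current versions.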
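A minimal sketch of read_table() on a space-separated file, assuming hypothetical sample contents and using io.StringIO in place of a real file path:

```python
import io

import pandas as pd

# Hypothetical space-separated contents: one header line and three data rows.
raw = "x y\n1 2\n3 4\n5 6\n"

# Default call: the separator is a tab, so each line stays in one column.
df_default = pd.read_table(io.StringIO(raw))

# Explicit space delimiter: each line is split into two columns.
df_space = pd.read_table(io.StringIO(raw), delimiter=" ")

print(df_default.shape)  # (3, 1)
print(df_space.shape)    # (3, 2)
```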

Using the read_fwf() function to read text files in Pandas

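The fwf in the read_fwf() function stands for fixed-width formatted. We can use this function to load DataFrames from text files whose lines are laid out in fixed-width columns.

Fixed-width formatted files are not delimited by commas or tabs. Instead, each column occupies a fixed range of character positions in every line, usually padded with spaces.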

For example,

File Content:

Code:

Output:

Since the columns in the file were laid out with fixed widths, this function read the contents correctly into separate columns.

That’s all about reading a text file in Pandas.
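As a minimal sketch, the snippet below reads a small fixed-width sample with read_fwf(). The sample contents and column names are assumptions for illustration, and io.StringIO stands in for a file path.

```python
import io

import pandas as pd

# Hypothetical fixed-width contents: each column starts at the same
# character position in every line and is padded with spaces.
raw = (
    "name    age city\n"
    "Alice   30  Paris\n"
    "Bob     25  Rome\n"
    "Carol   41  Oslo\n"
)

# By default, read_fwf() infers the column boundaries from the data.
df = pd.read_fwf(io.StringIO(raw))

print(df.shape)  # (3, 3)
print(df)
```

When the boundaries cannot be inferred reliably, for example because a column contains blank-padded gaps, they can be given explicitly through the widths or colspecs parameters.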
