Read text file in Pandas

A dataset arranges data neatly in rows and columns. The pandas module in Python lets us load such data from external files into DataFrames and work on them. The data can be stored in different types of files.

In this tutorial, we will read text files using the pandas module in Python.

Using the read_csv() function to read text files in Pandas

The read_csv() function is traditionally used to load data from CSV files as DataFrames in Python. A CSV file, however, is simply a delimited text file whose values are separated by commas. Hence, we can use this function to read other delimited text files as well.

We can specify various parameters with this function.

  • The header parameter specifies which row, if any, holds the column names. By default, the first row is used as the header of the DataFrame.
  • The index_col parameter specifies which column to use as the row index.
  • The names parameter lets us supply the column names ourselves.
  • The delimiter parameter (an alias of sep) sets the separator, which is a comma by default.
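The parameters listed above can be sketched as follows. This is a minimal illustration, not the original example: the semicolon-separated contents and the column names id, name, and score are hypothetical, and the data is read from an in-memory string through io.StringIO so that the snippet runs without a file on disk.

```python
import io
import pandas as pd

# Hypothetical file contents, kept in memory so the sketch is self-contained.
raw = "id;name;score\n1;Ann;90\n2;Bob;85\n"

# delimiter sets the separator, header=0 marks the first row as the header,
# and index_col="id" uses that column as the row index.
df = pd.read_csv(io.StringIO(raw), delimiter=";", header=0, index_col="id")
print(df)

# For a file without a header row, pass header=None and supply names.
raw_no_header = "1;Ann;90\n2;Bob;85\n"
df2 = pd.read_csv(
    io.StringIO(raw_no_header),
    delimiter=";",
    header=None,
    names=["id", "name", "score"],
)
print(df2)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.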

Apart from these, there are many more parameters available. They are listed in the documentation for the function on the pandas website.

See the following example.

File Content:

Code:

Output:

Notice that in the above example, the final shape of the DataFrame is (4,1). This means that the function read the whole file into a single column. This happened because the default delimiter in the case of read_csv() function is a comma. If we wish to rectify this, we need to specify the delimiter as a space.
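The behaviour described here can be reproduced with a small sketch. The file contents below are hypothetical (a header line plus four space-separated rows, chosen so that the default call yields the (4, 1) shape mentioned above), and they are read from an in-memory string rather than a file on disk.

```python
import io
import pandas as pd

# Hypothetical space-separated contents: one header line and four data rows.
text = "a b c\n1 2 3\n4 5 6\n7 8 9\n10 11 12\n"

# Default separator is a comma, so each line becomes a single value.
df_default = pd.read_csv(io.StringIO(text))
print(df_default.shape)  # (4, 1)

# Specifying a space as the separator splits each line into three columns.
df_space = pd.read_csv(io.StringIO(text), sep=" ")
print(df_space.shape)  # (4, 3)
```

If the columns may be separated by a varying number of spaces, a regular expression such as sep=r"\s+" is a common alternative to a single space.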

See the code below.

File Content:

Code:

Output:

Similarly, we can also have a file where the delimiter is a comma.
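As a minimal sketch of this case, assuming hypothetical comma-separated contents held in an in-memory string, no separator needs to be specified at all:

```python
import io
import pandas as pd

# Hypothetical comma-separated contents of a text file.
text = "a,b,c\n1,2,3\n4,5,6\n"

# The default separator already matches, so the columns are split correctly.
df = pd.read_csv(io.StringIO(text))
print(df.shape)  # (2, 3)
```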

See the code below.

File Content:

Code:

Output:

Using the read_table() function to read text files in Pandas

We can read tables from different files using the read_table() function in pandas. This function reads a general delimited file to a DataFrame object.

This function is essentially read_csv() with a different default: the separator is a tab ('\t') instead of a comma.
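The relationship between the two functions can be checked with a short sketch. The tab-separated contents below are hypothetical and are read from an in-memory string.

```python
import io
import pandas as pd

# Hypothetical tab-separated contents.
text = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"

# read_table() splits on tabs by default.
df_table = pd.read_table(io.StringIO(text))

# read_csv() gives the same result once the separator is set to a tab.
df_csv = pd.read_csv(io.StringIO(text), sep="\t")

print(df_table.equals(df_csv))  # True
```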

For example,

File Content:

Code:

Output:

Note that here, too, we specified the delimiter as a space. Otherwise, the whole file would have been read into a single column, as before. This function also supports parameters such as header, index_col, and names. It was briefly deprecated in pandas 0.24 but reinstated in pandas 0.25, so it remains available in current versions.
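A minimal sketch of this point, assuming hypothetical space-separated contents held in an in-memory string:

```python
import io
import pandas as pd

# Hypothetical space-separated contents.
text = "a b c\n1 2 3\n4 5 6\n"

# Without a separator, read_table() looks for tabs and finds none,
# so every line ends up in a single column.
df_default = pd.read_table(io.StringIO(text))
print(df_default.shape)  # (2, 1)

# With the delimiter set to a space, the three columns are recovered.
df_space = pd.read_table(io.StringIO(text), delimiter=" ")
print(df_space.shape)  # (2, 3)
```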

Using the read_fwf() function to read text files in Pandas

The fwf in the read_fwf() function stands for fixed-width formatted. We can use this function to load text files whose columns occupy fixed character positions into a DataFrame.

Fixed-width formatted files are not delimited by commas or tabs. Instead, each column occupies a fixed number of characters on every line.

For example,

File Content:

Code:

Output:

Since the columns in the file were laid out in fixed-width fields, this function read the contents correctly into separate columns.
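A small sketch of reading fixed-width data follows. The contents and the column names name, age, and city are hypothetical, and the data is read from an in-memory string. The first call relies on the default behaviour of read_fwf(), which infers the column boundaries from the data; the second passes the boundaries explicitly through the colspecs parameter.

```python
import io
import pandas as pd

# Hypothetical fixed-width contents: name in characters 0-6,
# age in characters 7-9, city in characters 11-14.
text = (
    "name   age city\n"
    "Ann     31 Oslo\n"
    "Robert  45 Lima\n"
)

# Column boundaries are inferred from the whitespace in the first rows.
df_inferred = pd.read_fwf(io.StringIO(text))
print(df_inferred)

# The boundaries can also be given explicitly as half-open (start, end) pairs.
df_explicit = pd.read_fwf(
    io.StringIO(text),
    colspecs=[(0, 7), (7, 10), (11, 15)],
)
print(df_explicit)
```

Explicit colspecs (or the widths parameter) are the safer choice when some fields are blank or contain spaces, since inference depends on the whitespace pattern in the sampled rows.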

That’s all about reading text files in Pandas.
