How to Filter Pandas Dataframe by column value

Previous
Next

In this post, we will see how to filter Pandas by column value.

You can slice and dice Pandas Dataframe in multiple ways.

Table of Contents

Sometimes, you may want to find a subset of data based on certain column values. You can filter rows by one or more columns value to remove non-essential data.

Pandas DataFrame sample data

Here is sample Employee data which will be used in below examples:

Name Age Gender
Ravi 28 Male
Michelle 21 Female
Mary 37 Female
Sunita 17 Female
Sam 21 Male

Here are some examples to filter data based on columns value.

Filter rows on the basis of single column data

You can use boolean expression to filter rows on the basis of column value. You can create boolean expression based on column of interest and use this variable to filter data.

For example:
Let’s say you want to filter all the employees whose age is 21.

Output:

As you can see, we got the data filtered by Age = 21
PandasDataFrameFilter
In case, you want to reset index of the filtered dataframe, you can call reset_index() function of the DataFrame.
You can add below line to preceding code and you will get below output:

Output:

💡 Did you know?

You can also use Python chaining to filter rows based on the condition. Python chaining makes it easier to mix one command with another. You can access DataFrame column by DataFrame.columnName.
For example:
You can access DataFrame column Age by emp_df.Age and can apply condition on this.

Filter rows on the basis of multiple columns data

You can filter rows using multiple columns data.
Here is an example:
Let’s say you want find employees whose age is greater than 21 and also Male.

Output:

Here is the diagram to illustrate the filter conditions.
PandasDataFrameFilterGender

Filter rows on the basis of list of values

You can also filter DataFrames by putting condition on the list of values.
Let’s say you want to filter employees DataFrame based on list of Names. If Name is in the list, then include that row.

Output:

As you can see, we have rows for which Name column is matched with value in the Name list.

Filter rows on the basis of values not in the list

You can also filter DataFrames by putting condition on the values not in the list. You can use tilda(~) to denote negation.
Let’s say you want to filter employees DataFrame based Names not present in the list. If Name is not in the list, then include that row.

Output:

As you can see, we have rows for which Name column is not matched with value in the Name list.

That’s all about Filter Pandas Dataframe by column value.

Previous
Next

Add Comment