In this post, we will see how to find rows with NaN in pandas.
What are NaN values in pandas?
A pandas DataFrame can contain a large number of rows and columns, and some of its cells may hold NaN values. NaN ("Not a Number") is a special floating-point value that pandas uses as its default marker for missing data. It is not the same object as Python's None, although pandas treats both as missing values.
Missing values can complicate data processing, so we often remove rows that contain NaN values from a DataFrame.
In this article, we will find the rows from a DataFrame that contain such NaN values.
We will work with the following DataFrame where the second row has a NaN value.
    import pandas as pd
    import math

    df = pd.DataFrame([['Jay', 18, 'BBA'],
                       ['Ram', math.nan, 'BTech'],
                       ['Mason', 20, 'BSc']],
                      columns=['Name', 'Age', 'Course'])
    print(df)
Output:
        Name   Age Course
    0    Jay  18.0    BBA
    1    Ram   NaN  BTech
    2  Mason  20.0    BSc
Find rows with NaN in pandas
We can detect NaN values in Python using the isnan() function. This function is available in two modules: math and numpy.
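As a quick illustration of the two isnan() functions, here is a minimal sketch. The variable names (value, ages, nan_mask) are made up for this example and do not come from the DataFrame used in this post.

```python
import math
import numpy as np

value = float("nan")

# math.isnan() checks a single number.
print(math.isnan(value))   # True
print(math.isnan(18.0))    # False

# numpy.isnan() also works element-wise on a whole array.
ages = np.array([18.0, np.nan, 20.0])
nan_mask = np.isnan(ages)
print(nan_mask)            # True only for the middle element
```

Both functions expect numeric input, which is one reason the pandas functions are more convenient for a DataFrame that mixes numbers and strings.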
Since we are looking to find rows from a DataFrame, we will use the pandas.isna() function, which is also available as the DataFrame.isna() method.
It returns an object of the same shape as its input, with True wherever it encounters a NaN or None value in a DataFrame or a Series object.
For example,
    import pandas as pd
    import math

    df = pd.DataFrame([['Jay', 18, 'BBA'],
                       ['Ram', math.nan, 'BTech'],
                       ['Mason', 20, 'BSc']],
                      columns=['Name', 'Age', 'Course'])
    print(df.isna())
Output:
        Name    Age  Course
    0  False  False   False
    1  False   True   False
    2  False  False   False
Note that isna() returns a full DataFrame of True and False values. We can combine it with the any() function to get the rows that contain NaN values.
The any() function returns a boolean Series that reports whether each row or column contains at least one True value. Since we want to find the rows, we set the axis parameter to 1 and use the result to filter the DataFrame.
See the following example.
    import pandas as pd
    import math

    df = pd.DataFrame([['Jay', 18, 'BBA'],
                       ['Ram', math.nan, 'BTech'],
                       ['Mason', 20, 'BSc']],
                      columns=['Name', 'Age', 'Course'])
    print(df[df.isna().any(axis=1)])
Output:
      Name  Age Course
    1  Ram  NaN  BTech
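To make the filtering step easier to follow, the sketch below splits it into parts: first the boolean mask produced by any(axis=1), then the index labels of the matching rows. The names row_mask and nan_row_labels are chosen only for this illustration.

```python
import math
import pandas as pd

df = pd.DataFrame([['Jay', 18, 'BBA'],
                   ['Ram', math.nan, 'BTech'],
                   ['Mason', 20, 'BSc']],
                  columns=['Name', 'Age', 'Course'])

# One boolean per row: True if any cell in that row is missing.
row_mask = df.isna().any(axis=1)
print(row_mask)

# Index labels of the rows that contain at least one NaN.
nan_row_labels = df[row_mask].index.tolist()
print(nan_row_labels)  # [1]
```

Keeping the mask in its own variable can be handy when the same condition is needed more than once, for example to count the affected rows with row_mask.sum().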
Find columns with NaN in pandas
We were able to find the rows containing NaN values in a DataFrame. To find the columns, we can set the axis parameter to 0 in the any() function and select the matching columns with loc.
Here is an example:
    import pandas as pd
    import math

    df = pd.DataFrame([['Jay', 18, 'BBA'],
                       ['Ram', math.nan, 'BTech'],
                       ['Mason', 20, 'BSc']],
                      columns=['Name', 'Age', 'Course'])
    print(df.loc[:, df.isna().any(axis=0)])
Output:
        Age
    0  18.0
    1   NaN
    2  20.0
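If only the column names are needed rather than the column data, something along these lines may help. It is a small sketch on the same DataFrame; col_mask, nan_columns and nan_counts are illustrative names.

```python
import math
import pandas as pd

df = pd.DataFrame([['Jay', 18, 'BBA'],
                   ['Ram', math.nan, 'BTech'],
                   ['Mason', 20, 'BSc']],
                  columns=['Name', 'Age', 'Course'])

# One boolean per column: True if the column holds at least one NaN.
col_mask = df.isna().any(axis=0)

# Keep only the True entries and read off their labels.
nan_columns = col_mask[col_mask].index.tolist()
print(nan_columns)  # ['Age']

# Number of NaN values in each column.
nan_counts = df.isna().sum()
print(nan_counts)
```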
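Find rows with NaN in pandas using isna() and iloc
We can also use iloc to select the rows with NaN values. iloc is an indexer that extracts rows and columns from the DataFrame by integer position. The code below counts the NaN values in each row with sum(axis=1), keeps the rows where the count is at least 1, and passes their index to iloc. This works here because the DataFrame has the default integer index, so labels and positions coincide.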
See the code below.
    import pandas as pd
    import math

    df = pd.DataFrame([['Jay', 18, 'BBA'],
                       ['Ram', math.nan, 'BTech'],
                       ['Mason', 20, 'BSc']],
                      columns=['Name', 'Age', 'Course'])
    print(df.iloc[df[(df.isna().sum(axis=1) >= 1)].index])
Output:
      Name  Age Course
    1  Ram  NaN  BTech
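Because iloc works with integer positions, passing index labels to it only behaves as expected with the default 0, 1, 2, ... index. The sketch below shows one possible way to get true positions instead, using numpy.flatnonzero(). The string index ['a', 'b', 'c'] is an assumption added for this illustration and is not part of the original example.

```python
import math
import numpy as np
import pandas as pd

# Same data as above, but with hypothetical string labels as the index.
df = pd.DataFrame([['Jay', 18, 'BBA'],
                   ['Ram', math.nan, 'BTech'],
                   ['Mason', 20, 'BSc']],
                  columns=['Name', 'Age', 'Course'],
                  index=['a', 'b', 'c'])

# Integer positions of the rows that contain at least one NaN.
positions = np.flatnonzero(df.isna().any(axis=1).to_numpy())
print(positions)

# iloc selects by position, so this works regardless of the index labels.
print(df.iloc[positions])
```

With this index, passing the labels ('b') directly to iloc would raise an error, while the position-based version still returns the row for Ram.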
The isnull() function is an alias of isna(), so replacing isna() with isnull() in the examples above gives the same results. Both flag None as well as NaN.
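As a quick check that isna() and isnull() behave the same way, here is a minimal sketch. The small DataFrame with a None entry is hypothetical and was made up for this comparison.

```python
import pandas as pd

# Hypothetical data: one missing name (None) and one missing age (NaN).
df = pd.DataFrame({'Name': ['Jay', None, 'Mason'],
                   'Age': [18, 19, float('nan')]})

# Both calls produce identical boolean DataFrames.
same = df.isna().equals(df.isnull())
print(same)  # True

# Rows with a missing value, whether it is None or NaN.
rows = df[df.isnull().any(axis=1)]
print(rows)
```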
That’s all about how to find rows with NaN in pandas.