In this article, we will discuss how to replace values in a column of a pandas DataFrame.
Table of Contents
- Using the loc indexer to replace values in a column of a pandas DataFrame
- Using the iloc indexer to replace values in a column of a pandas DataFrame
- Using the map() function to replace values of a column in a pandas DataFrame
- Using the replace() function to replace values in a column of a pandas DataFrame
- Using the where() function to replace values in a column of a pandas DataFrame
- Using the numpy.where() function to replace values in a column of a pandas DataFrame
Using the loc indexer to replace values in a column of a pandas DataFrame
The loc indexer accesses values by their row and column labels. It is used with square brackets rather than called like a function. We can use it to select the required value and assign a new value with the = operator.
For example,
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df.loc[0,'a'] = 'Jack'
    print(df)

Output:

          a   b   c
    0  Jack  75  18
    1  Mark  92  17
    2  Neil  74  18
In the above example, we changed Jay to Jack in column a.
We can also use loc to replace only the values that meet a condition.
See the following example.
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df.loc[df.b<80,'b'] = 60
    print(df)

Output:

          a   b   c
    0   Jay  60  18
    1  Mark  92  17
    2  Neil  60  18
In the above example, we found all the values below 80 in column b and replaced them with 60.
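Several conditions can also be combined in the row selector. The following is a minimal sketch that reuses the sample DataFrame from above; the particular conditions are only illustrative assumptions, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
condition = (df['b'] < 80) & (df['a'] != 'Neil')

# Only the rows where both conditions hold are changed.
df.loc[condition, 'b'] = 60
print(df)
```

Here only the row for Jay satisfies both conditions, so Neil keeps the original value in column b.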
Using the iloc indexer to replace values in a column of a pandas DataFrame
The iloc indexer is similar to loc and can be used to access the rows and columns of a DataFrame. It uses the integer positions of the rows and columns instead of their labels.
For example,
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df.iloc[0,0] = 'Jack'
    print(df)

Output:

          a   b   c
    0  Jack  75  18
    1  Mark  92  17
    2  Neil  74  18
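Because iloc works with positions, it also accepts slices of rows. The sketch below assumes we know the column by name and look up its position with `df.columns.get_loc`; the replacement value 0 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# Look up the integer position of column 'b'.
col_pos = df.columns.get_loc('b')

# Replace the first two rows of that column.
# As with normal Python slices, the end position (2) is excluded.
df.iloc[0:2, col_pos] = 0
print(df)
```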
Using the map() function to replace values of a column in a pandas DataFrame
The map() function applies a function or a mapping, such as a dictionary, to every element of a Series. We can use it to replace values of a column in a DataFrame by passing a dictionary that maps the old values to the new values.
For example,
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df['a'] = df['a'].map({'Jay':'Jack', 'Mark':'Mac', 'Neil':'Neil'})
    print(df)

Output:

          a   b   c
    0  Jack  75  18
    1   Mac  92  17
    2  Neil  74  18
Note that we also need to include the elements we do not wish to replace; otherwise, they will be replaced with NaN.
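One possible way to avoid listing every unchanged value is to fill the resulting NaN entries with the original column. This is a sketch of that idea, not the only approach; the variable names `mapping` and `mapped` are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# 'Neil' is deliberately left out of the mapping.
mapping = {'Jay': 'Jack', 'Mark': 'Mac'}

# Values without a key in the mapping become NaN.
mapped = df['a'].map(mapping)
print(mapped)

# Fall back to the original value wherever the mapping produced NaN.
df['a'] = mapped.fillna(df['a'])
print(df)
```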
Using the replace() function to replace values in a column of a pandas DataFrame
This is probably the most straightforward method to replace the values of a column. We can use the replace() function to replace one or more values in a DataFrame.
In the following example, we will replace multiple values with one value.
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df['a'] = df['a'].replace(['Jay','Mark'],'Unknown')
    print(df)

Output:

             a   b   c
    0  Unknown  75  18
    1  Unknown  92  17
    2     Neil  74  18
Similarly, we can replace multiple values with different values for each.
For example,
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df['a'] = df['a'].replace(['Jay','Mark'],['Jack','Mac'])
    print(df)

Output:

          a   b   c
    0  Jack  75  18
    1   Mac  92  17
    2  Neil  74  18
The replace() function can also be called on the whole DataFrame instead of a single column.
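As a rough sketch of the DataFrame-level call, a nested dictionary can restrict each replacement to one column: the outer keys are column names and the inner dictionaries map old values to new ones. The replacement values below (21 for 18) are arbitrary and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# In column 'a' replace 'Jay' with 'Jack'; in column 'c' replace 18 with 21.
# Column 'b' is not mentioned, so it is left as it is.
result = df.replace({'a': {'Jay': 'Jack'}, 'c': {18: 21}})
print(result)
```

The call returns a new DataFrame, so the original df is unchanged unless the result is assigned back to it.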
Using the where() function to replace values in a column of a pandas DataFrame
The where() function keeps the values for which a given condition is True and replaces all the other values with a new value. To replace the values that satisfy a condition, we therefore pass the negated condition.
See the following example.
    import pandas as pd

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    # Keep values where the condition is True, use 60 elsewhere,
    # and assign the result back to the column.
    df['b'] = df['b'].where(~(df.b < 80), other=60)
    print(df)

Output:

          a   b   c
    0   Jay  60  18
    1  Mark  92  17
    2  Neil  60  18
The above example replaces all values less than 80 with 60.
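pandas also provides mask(), which is the counterpart of where(): it replaces the values where the condition is True, so the condition does not need to be negated. The following sketch shows the same replacement written that way.

```python
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# mask() replaces values where the condition is True
# and keeps the remaining values unchanged.
df['b'] = df['b'].mask(df['b'] < 80, 60)
print(df)
```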
Using the numpy.where() function to replace values in a column of a pandas DataFrame
The where() function from the numpy module is generally used with arrays. However, since a column of a DataFrame behaves like an array, we can use this function to replace the values of a column as well.
This method works similarly to the previous one, except that the condition is not negated: np.where(condition, x, y) returns x where the condition is True and y elsewhere.
For example,
    import pandas as pd
    import numpy as np

    df = pd.DataFrame([['Jay',75,18],['Mark',92,17],['Neil',74,18]],
                      columns = ['a','b','c'])
    df['b'] = np.where((df.b < 80), 60, df.b)
    print(df)

Output:

          a   b   c
    0   Jay  60  18
    1  Mark  92  17
    2  Neil  60  18
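Calls to np.where() can be nested when more than two outcomes are needed. The thresholds and replacement values in this sketch (75 and 80, 60 and 70) are assumptions made up for the illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['Jay', 75, 18], ['Mark', 92, 17], ['Neil', 74, 18]],
                  columns=['a', 'b', 'c'])

# Below 75 -> 60, from 75 up to (but not including) 80 -> 70,
# everything else keeps its original value.
df['b'] = np.where(df['b'] < 75, 60,
                   np.where(df['b'] < 80, 70, df['b']))
print(df)
```

The inner call is evaluated first and supplies the values used wherever the outer condition is False.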
That’s all about how to replace values in a column of a pandas DataFrame.