We use the pandas `DataFrame` to store data in an organized, tabular form. Sometimes there is a need to apply a function to a specific column or to the whole table.

This tutorial demonstrates the different methods available to apply a function to a column of a pandas `DataFrame` in Python.

## How do I apply a function to a column in pandas?

Here are several ways to apply a function to a column in pandas.

### Using `dataframe.apply()` function

The `dataframe.apply()` function applies a specified function along an axis of the given pandas `DataFrame`.

The syntax for the `dataframe.apply()` function is:

```python
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
```

The `dataframe.apply()` function takes several parameters. The two most important ones are:

- **func:** It specifies the function that needs to be applied.
- **axis:** It specifies the axis along which the function is applied. The value `0` applies the function to each column, while `1` applies it to each row. By default, its value is `0`.

These two parameters are essential to understanding how this method works. Further information on the other optional parameters is available in the official pandas documentation.
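To illustrate the `axis` parameter, here is a minimal sketch that calls `apply()` on a whole `DataFrame`. The sample data mirrors the table used in this tutorial; the names `col_totals` and `row_totals` are illustrative only.

```python
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

# axis=0 (the default): the function receives each column as a Series
col_totals = dfa.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series
row_totals = dfa.apply(lambda row: row.sum(), axis=1)

print(col_totals.tolist())  # [12, 12, 12]
print(row_totals.tolist())  # [9, 12, 15]
```

With `axis=0` there is one result per column, and with `axis=1` there is one result per row.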

The following code uses the `dataframe.apply()` function to apply a function to a specific column in pandas:

```python
import pandas as pd
import numpy as np

dfa = pd.DataFrame([[3,3,3], [4,4,4], [5,5,5]], columns=['X','Y','Z'])

# Function to apply to every value of the column
def a_2(x):
    return x+2

# Selecting a single column gives a Series, so this calls Series.apply()
dfa['Y'] = dfa['Y'].apply(a_2)
print(dfa)
```

The above code provides the following output:

```
   X  Y  Z
0  3  5  3
1  4  6  4
2  5  7  5
```

##### Explanation:

- The `numpy` and `pandas` libraries are imported first.
- A pandas `DataFrame` named `dfa` is created and initialized.
- A function to be applied to a column of the `DataFrame` is defined.
- The function is then applied to the second column, `Y`, of the `DataFrame` using the `apply()` function.
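The syntax shown earlier lists the `args` and `**kwargs` parameters. As a rough sketch of how they might be used on a single column, the hypothetical helper `add_n` below takes an extra argument that `apply()` forwards to it. The helper and its parameter name `n` are assumptions made for illustration.

```python
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

# Hypothetical helper: adds n to a single value
def add_n(x, n):
    return x + n

# Extra positional arguments are passed as a tuple through args
via_args = dfa['Y'].apply(add_n, args=(2,))

# Extra keyword arguments are forwarded to the function as-is
via_kwargs = dfa['Y'].apply(add_n, n=2)

print(via_args.tolist())    # [5, 6, 7]
print(via_kwargs.tolist())  # [5, 6, 7]
```

This avoids writing a separate function for every constant that needs to be added.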

### Using `lambda` function along with the `apply()` function

A `lambda` function is an unnamed function that consists of a single expression and can take any number of arguments.

A `lambda` function can be passed directly to the `apply()` function to apply it to a specific column. This makes the code shorter than the method above.

The following code uses the `lambda` function along with the `apply()` function:

```python
import pandas as pd
import numpy as np

dfa = pd.DataFrame([[3,3,3], [4,4,4], [5,5,5]], columns=['X','Y','Z'])

# The lambda replaces the separately defined function
dfa['Y'] = dfa['Y'].apply(lambda x: x+2)
print(dfa)
```

The above code provides the following output:

```
   X  Y  Z
0  3  5  3
1  4  6  4
2  5  7  5
```

This method works like the plain `dataframe.apply()` method described above. The only difference is that we do not have to define a separate function and can pass a `lambda` function directly to the `apply()` function.
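A `lambda` can also hold a conditional expression. The sketch below is illustrative only: the threshold `3` and the name `doubled` are arbitrary choices and do not come from the example above.

```python
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

# Double only the values greater than 3 and leave the rest unchanged
doubled = dfa['Y'].apply(lambda x: x * 2 if x > 3 else x)

print(doubled.tolist())  # [3, 8, 10]
```

Anything longer than a single expression is usually easier to read as a named function.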

### Using `dataframe.transform()` function

The `dataframe.transform()` function calls a given function `func` on the `DataFrame` and returns a `DataFrame` that contains the transformed values. The result must have the same length as the original object.

The syntax for the `dataframe.transform()` function is:

```python
DataFrame.transform(func, axis=0, *args, **kwargs)
```

Similar to the `apply()` function, the `transform()` function takes several parameters:

- **func:** It specifies the function that needs to be applied.
- **axis:** It specifies the axis along which the function is applied. The value `0` applies the function to each column, while `1` applies it to each row. By default, its value is `0`.

The `dataframe.transform()` function takes other parameters as well, which are not needed for this simple use. More details on all the parameters of the `transform()` function can be found in the official pandas documentation.

The following code uses the `dataframe.transform()` function to apply a function to a specific column in pandas:

```python
import pandas as pd
import numpy as np

dfa = pd.DataFrame([[3,3,3], [4,4,4], [5,5,5]], columns=['X','Y','Z'])

# Function to apply to every value of the column
def a_2(x):
    return x+2

# The result has the same length as the original column
dfa['Y'] = dfa['Y'].transform(a_2)
print(dfa)
```

The above code provides the following output:

```
   X  Y  Z
0  3  5  3
1  4  6  4
2  5  7  5
```

##### Explanation:

- The `numpy` and `pandas` libraries are imported first.
- A pandas `DataFrame` named `dfa` is created and initialized.
- A function to be applied to a column of the `DataFrame` is defined.
- The function is then applied to the second column, `Y`, of the `DataFrame` using the `dataframe.transform()` function.
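The same-length requirement of `transform()` can be sketched as follows. This is a minimal illustration, assuming a reasonably recent pandas release in which an aggregating function passed to `transform()` raises a `ValueError`. The flag `aggregation_rejected` is a name made up for this example.

```python
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

# An element-wise function keeps the original index, so transform() accepts it
shifted = dfa['Y'].transform(lambda x: x + 2)
print(shifted.tolist())  # [5, 6, 7]

# An aggregation collapses the column to one value, so transform() should reject it
try:
    dfa['Y'].transform('sum')
    aggregation_rejected = False
except ValueError:
    aggregation_rejected = True

print(aggregation_rejected)
```

When a reduced result is wanted, `apply()` or a direct call such as `dfa['Y'].sum()` is the more suitable choice.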

### Using `map()` function

The pandas `Series.map()` method applies a given function to every element of a column and returns a new `Series`. It works much like Python's built-in `map()` function, which applies a function to all the elements of an iterable and returns an iterator.

The `map()` method can be used in place of the `apply()` function.
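To make the difference between Python's built-in `map()` and the `map()` method of a pandas column concrete, here is a small sketch. The variable names `builtin_result` and `series_result` are illustrative only.

```python
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

# Built-in map(): returns a lazy iterator, so wrap it in list() to see the values
builtin_result = list(map(lambda x: x + 2, dfa['Y']))

# Series.map(): returns a new Series that keeps the original index
series_result = dfa['Y'].map(lambda x: x + 2)

print(builtin_result)
print(series_result.tolist())  # [5, 6, 7]
```

Because the method version returns a `Series`, its result can be assigned straight back to the column.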

The following code uses the `map()` function to apply a function to a specific column in pandas:

```python
import pandas as pd
import numpy as np

dfa = pd.DataFrame([[3,3,3], [4,4,4], [5,5,5]], columns=['X','Y','Z'])

# Series.map() applies the lambda to each value of the column
dfa['Y'] = dfa['Y'].map(lambda x: x+2)
print(dfa)
```

The above code provides the following output:

```
   X  Y  Z
0  3  5  3
1  4  6  4
2  5  7  5
```

##### Explanation:

- The `numpy` and `pandas` libraries are imported first.
- A pandas `DataFrame` named `dfa` is created and initialized.
- A `lambda` function is used in this case to specify the change.
- This `lambda` function is then applied to the second column, `Y`, of the `DataFrame` using the `map()` function.

### Using `numpy.square()` function

`NumPy` is short for `Numerical Python`. It is a third-party Python library for working with arrays and numerical data. The `numpy.square()` function is a simple mathematical function that returns the element-wise square of its input, so passing it a column returns the squared values of that column.

The following code uses the `numpy.square()` function to apply a function to a specific column in pandas:

```python
import pandas as pd
import numpy as np

dfa = pd.DataFrame([[3,3,3], [4,4,4], [5,5,5]], columns=['X','Y','Z'])

# Square every value of column Y in one vectorized call
dfa['Y'] = np.square(dfa['Y'])
print(dfa)
```

The above code provides the following output:

```
   X   Y  Z
0  3   9  3
1  4  16  4
2  5  25  5
```

##### Explanation:

- The `numpy` and `pandas` libraries are imported first.
- A pandas `DataFrame` named `dfa` is created and initialized.
- The `numpy.square()` function is then applied to the second column, `Y`, of the `DataFrame`.

Note that the `numpy.square()` function only squares the elements of a column. To apply any other operation, use a different function or one of the methods described above.
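Other element-wise operations have their own NumPy functions. As a sketch, assuming the same sample data, `np.sqrt()` and `np.add()` (neither appears in the original example) can be applied to a column in the same way as `np.square()`:

```python
import numpy as np
import pandas as pd

dfa = pd.DataFrame([[3, 3, 3], [4, 4, 4], [5, 5, 5]], columns=['X', 'Y', 'Z'])

squared = np.square(dfa['Y'])   # element-wise square
roots = np.sqrt(squared)        # element-wise square root, back to the original values
plus_two = np.add(dfa['Y'], 2)  # same result as dfa['Y'] + 2

print(squared.tolist())   # [9, 16, 25]
print(roots.tolist())     # [3.0, 4.0, 5.0]
print(plus_two.tolist())  # [5, 6, 7]
```

Each call operates on the whole column at once and returns a new `Series`, so no Python-level function is called per element.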