Remove unicode characters in Python

In this post, we will see how to remove Unicode characters (here meaning non-ASCII characters) from a string in Python.

There are many situations in which you may want to remove Unicode characters in Python.
For example, you might be reading tweets with tweepy, which returns the full tweet data, including non-ASCII Unicode characters, and you want to strip those characters from the string.

Remove Unicode characters from a string in Python

There are many ways to remove Unicode characters from a string in Python.

Using encode() and decode() method to remove unicode characters in Python

You can call the string's encode() method with the encoding set to ascii and the errors argument set to ignore, which drops every character that cannot be represented in ASCII, and then call decode() to convert the resulting bytes back to a string.
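A minimal sketch of this approach, assuming a sample string that contains a zero-width non-joiner (`\u200c`); the variable names are illustrative:

```python
# Sample input (an assumption): contains U+200C, a zero-width non-joiner.
text = "This is Python \u200ctutorial"

# encode() with errors="ignore" drops every character that has no ASCII
# representation; decode() turns the resulting bytes back into a str.
ascii_only = text.encode("ascii", "ignore").decode("ascii")

print(ascii_only)
```

Keep in mind that this discards accented letters and any other non-ASCII text as well; it does not transliterate them.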

Output:

This is Python tutorial

Using replace() method to remove unicode characters in Python

If you only want to remove one specific Unicode character from a string, you can use the string's replace() method.
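A short sketch, assuming the unwanted character is the zero-width non-joiner (`\u200c`); substitute whichever character you need to remove:

```python
# Sample input (an assumption): contains U+200C, a zero-width non-joiner.
text = "This is Python \u200ctutorial"

# Replace every occurrence of that one character with an empty string.
without_zwnj = text.replace("\u200c", "")

print(without_zwnj)
```

Unlike the encode()/decode() approach, this leaves all other non-ASCII characters in place.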

Output:

This is Python tutorial

Using character.isalnum() method to remove special characters from String

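If you want to remove special characters such as whitespace or slashes from a string, you can keep only the characters for which isalnum() returns True. Note that isalnum() also returns True for non-ASCII letters and digits, so this filter does not remove them.
Here is an example: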
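A minimal sketch, assuming a sample input with a slash and spaces mixed in; the input string and variable names are illustrative:

```python
# Sample input (an assumption): letters and digits mixed with "/" and spaces.
raw = "abc/ i203 21"

# Keep only the characters that are letters or digits.
alnum_only = "".join(ch for ch in raw if ch.isalnum())

print(alnum_only)
```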

Output:

abci20321

As you can see, all the special characters are removed from the string.

How to remove Unicode "u" from string in Python

There are multiple ways to remove the Unicode "u" prefix in Python.

Using replace() method

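You can use the string's replace() method to remove the "u" prefix when it is literally part of the text, for example in data that was saved from the printed representation of a Python 2 unicode string.
Here is an example: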
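A minimal sketch, assuming the text itself starts with the characters `u'`; the sample string is illustrative:

```python
# Sample input (an assumption): the u prefix and quotes are part of the text.
repr_text = "u'This is Python tutorial'"

# Replace only the first "u'" so that a later word ending in "u" followed
# by a quote is left untouched.
no_prefix = repr_text.replace("u'", "'", 1)

print(no_prefix)
```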

Output:

'This is Python tutorial'

Using encode() and decode() method

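You can use the string's encode() method with the encoding set to ascii and errors set to ignore, followed by decode(), to get a plain string without the "u" prefix. In Python 3 the u prefix on a string literal has no effect, so this mainly matters for code that must also run on Python 2.
Here is an example: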
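A minimal sketch written for Python 3, where the `u` prefix is accepted but changes nothing; the variable names are illustrative:

```python
# The u prefix marks a Unicode literal; in Python 3 every str is Unicode.
prefixed = u"This is Python tutorial"

# Round-trip through ASCII bytes, dropping anything that is not ASCII.
plain = prefixed.encode("ascii", "ignore").decode("ascii")

print(plain)
```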

Output:

This is Python tutorial

That’s all about how to remove unicode characters from String in Python.
