nvarchar vs varchar in SQL Server

nvarchar vs varchar

This article will look at the difference between the data types NVARCHAR and VARCHAR in SQL. These data types are commonly used for storing textual data or strings for various design purposes. We will look at the description of each along with the best-case scenario to use each data type with practical examples. We will also have look at the differences between them in detail.

Introduction

Let’s have a quick look into VARCHAR and NVARCHAR in SQL.

VARCHAR: This is a variable-length datatype that uses 1 byte (8 bit) of space to store each character. It uses ASCII format to store each character. The length of a varchar type data ranges between 1 – 8000 (Maximum). It is suitable for storing data when the length is unknown and it adjusts its space according to the input provided.

Declaration Syntax : Variable_Name VARCHAR (Size) . 1<= Size <=8000.

NVARCHAR: This is also a variable-length datatype. It uses 2 bytes of space to store each character. NVARCHAR can store UNICODE characters as well. Unicode is a 16-bit encoding scheme that contains characters and expressions of multiple languages. This is the reason for the extra 1-byte space which allows NVARCHAR datatype to store UNICODE Characters. The length of an NVARCHAR type data range between 1 – 4000(Maximum). So, NVARCHAR data of size 4000 uses the same space as a VARCHAR of size 8000.

Declaration Syntax : Variable_Name NVARCHAR (Size) . 1<= Size <=4000.

Note: The type NVARCHAR is a keyword used only in MS SQL Server. Oracle uses NVARCHAR2 to implement the NVARCHAR datatype. In MYSQL Server we have to use a Character Set. To avoid confusion we demonstrate examples for only SQL Server.

Examples to Demonstrate Difference between VARCHAR and NVARCHAR

Firstly, to demonstrate the difference concerning their Size we assume this query in SQL Server.

We declare a Variable and we print the space occupied by it along with the length of actual data. The DATALENGTH function returns the space occupied by the data type of string provided to it as the parameter and the LEN function returns the length of the input. 

The output will be:

So, we can see the name James in our output even though we declare the variable string of size 20 the Total Space is 5. This is because VARCHAR adjusts its space according to input provided i.e. ‘James’ has 5 characters so only 5 bytes of space are used.

Hence, let us write the same query using the NVARCHAR data type.

The output of this query will be:

So, now for Input String 'James' Total Space will be 10 bytes as with NVARCHAR each character occupies 2 bytes. Moreover, it also adjusts space according to input provided so only 10 gets printed instead of 40.

Now, we differentiate according to their Use or Storage Property. We know Varchar stores only textual data or Non-Unicode Characters whereas NVARCHAR can store both. So, let us consider an example of a table Language_Details where we store a language name and some text in the respective Language as shown below:

Language_NameTextScript
EnglishText In English
Chinese中文文本
GreekΚείμενο στα ελληνικά

We can see the text in the TextScript column for some values contains non-English Alphabets. These are the Unicode characters that require NVARCHAR data type storage. The schema for this table is:

Let us insert the records as shown above. For values containing Unicode Characters we need to make slight change in INSERT operation.

We can see an N appended before each TextScript value in our Insert statement. This indicates the values contain Unicode Characters. Now, suppose if we insert a record with Unicode Characters in our Language_Name Column like this:

In this case, the data will get added but not in Unicode Format. Hence, on printing it will give uncertain output

We can see in Language_Name Column the last entry gives uncertain output whereas TextScript gives the TextScript ‘Persian’ precisely. This also proves that NVARCHAR can hold both types of characters. This was the difference according to their use.

VARCHAR V/S NVARCHAR Comparison Chart:

Now, we summarize the key differences between them.

VARCHAR NVARCHAR
1. It is a variable-length datatype that stores only non-Unicode Characters. 1. It is also a variable-length datatype that can store Textual Characters and Unicode Characters as well.
2. Each Character occupies 1 byte of space. 2. Each Character occupies 2 bytes of space.
3. The maximum length of VARCHAR type data can be 8000 occupying 8000 bytes of space. 3. The maximum length of NVARCHAR data can be 4000.
4. Quering is faster when using VARCHAR if we know the data has only non-Unicode Characters. 4. Query operations are slower using NVARCHAR.
5. It is recommended to use VARCHAR for storing data for applications using only ASCII. Otherwise, encoding conversions can be a huge overhead. 5. All modern operating systems use Unicode internally. So using this we can avoid encoding conversions and can withstand any textual data.

So, that’s all about the article you can try the queries shown above in SQL Server for clear idea.

 

Was this post helpful?

Leave a Reply

Your email address will not be published. Required fields are marked *