Table of Contents
This article will look at the difference between the data types NVARCHAR and VARCHAR
in SQL. These data types are commonly used for storing textual data or strings for various design purposes. We will look at the description of each along with the best-case scenario to use each data type with practical examples. We will also have look at the differences between them in detail.
Introduction
Let’s have a quick look into VARCHAR and NVARCHAR in SQL.
VARCHAR:
This is a variable-length datatype that uses 1 byte (8 bit) of space to store each character. It uses ASCII format to store each character. The length of a varchar type data ranges between 1 – 8000 (Maximum). It is suitable for storing data when the length is unknown and it adjusts its space according to the input provided.
Declaration Syntax :
Variable_Name VARCHAR (Size) . 1<= Size <=8000.
NVARCHAR:
This is also a variable-length datatype. It uses 2 bytes of space to store each character. NVARCHAR can store UNICODE characters as well. Unicode is a 16-bit encoding scheme that contains characters and expressions of multiple languages. This is the reason for the extra 1-byte space which allows NVARCHAR datatype to store UNICODE Characters. The length of an NVARCHAR type data range between 1 – 4000(Maximum). So, NVARCHAR data of size 4000 uses the same space as a VARCHAR of size 8000.
Declaration Syntax :
Variable_Name NVARCHAR (Size) . 1<= Size <=4000.
Note:
The type NVARCHAR is a keyword used only in MS SQL Server. Oracle uses NVARCHAR2 to implement the NVARCHAR datatype. In MYSQL Server we have to use a Character Set. To avoid confusion we demonstrate examples for only SQL Server.
Examples to Demonstrate Difference between VARCHAR and NVARCHAR
Firstly, to demonstrate the difference concerning their Size we assume this query in SQL Server.
1 2 3 4 5 |
DECLARE @string VARCHAR(20) SET @string = 'James' SELECT @string AS 'NAME',DATALENGTH(@string) AS 'Total Space', LEN(@string) AS 'Length' |
We declare a Variable and we print the space occupied by it along with the length of actual data. The DATALENGTH function returns the space occupied by the data type of string provided to it as the parameter and the LEN function returns the length of the input.
The output will be:
So, we can see the name James in our output even though we declare the variable string
of size 20 the Total Space
is 5. This is because VARCHAR adjusts its space according to input provided i.e. ‘James’ has 5 characters so only 5 bytes of space are used.
Hence, let us write the same query using the NVARCHAR data type.
1 2 3 4 5 |
DECLARE @string NVARCHAR(20) SET @string = 'James' SELECT @string AS 'NAME', DATALENGTH(@string) AS 'Total Space',LEN(@string) AS 'Actual Length' |
The output of this query will be:
So, now for Input String 'James'
Total Space will be 10 bytes as with NVARCHAR each character occupies 2 bytes. Moreover, it also adjusts space according to input provided so only 10 gets printed instead of 40.
Now, we differentiate according to their Use or Storage Property. We know Varchar stores only textual data or Non-Unicode Characters whereas NVARCHAR can store both. So, let us consider an example of a table Language_Details
where we store a language name and some text in the respective Language as shown below:
Language_Name | TextScript |
---|---|
English | Text In English |
Chinese | 中文文本 |
Greek | Κείμενο στα ελληνικά |
We can see the text in the TextScript column for some values contains non-English Alphabets. These are the Unicode characters that require NVARCHAR data type storage. The schema for this table is:
1 2 3 4 5 6 7 |
CREATE TABLE Language_Details ( Language_Name VARCHAR(20), TextScript NVARCHAR(20) ); |
Let us insert the records as shown above. For values containing Unicode Characters we need to make slight change in INSERT operation.
1 2 3 4 5 |
INSERT INTO Language_Details VALUES('English',N'Text in English'); INSERT INTO Language_Details VALUES('Chinese',N'中文文本'); INSERT INTO Language_Details VALUES('Greek',N'Κείμενο στα ελληνικά'); |
We can see an N
appended before each TextScript value in our Insert statement. This indicates the values contain Unicode Characters. Now, suppose if we insert a record with Unicode Characters in our Language_Name Column like this:
1 2 3 4 |
INSERT INTO Language_Details VALUES('متن به فارسی','Persian'); |
In this case, the data will get added but not in Unicode Format. Hence, on printing it will give uncertain output
We can see in Language_Name Column the last entry gives uncertain output whereas TextScript gives the TextScript ‘Persian’ precisely. This also proves that NVARCHAR can hold both types of characters. This was the difference according to their use.
VARCHAR V/S NVARCHAR Comparison Chart:
Now, we summarize the key differences between them.
VARCHAR | NVARCHAR |
---|---|
1. It is a variable-length datatype that stores only non-Unicode Characters. | 1. It is also a variable-length datatype that can store Textual Characters and Unicode Characters as well. |
2. Each Character occupies 1 byte of space. | 2. Each Character occupies 2 bytes of space. |
3. The maximum length of VARCHAR type data can be 8000 occupying 8000 bytes of space. | 3. The maximum length of NVARCHAR data can be 4000. |
4. Quering is faster when using VARCHAR if we know the data has only non-Unicode Characters. | 4. Query operations are slower using NVARCHAR. |
5. It is recommended to use VARCHAR for storing data for applications using only ASCII. Otherwise, encoding conversions can be a huge overhead. | 5. All modern operating systems use Unicode internally. So using this we can avoid encoding conversions and can withstand any textual data. |
So, that’s all about the article you can try the queries shown above in SQL Server for clear idea.