Unicode / Non-Unicode in SQL Server


Traditional non-Unicode data types in Microsoft® SQL Server™ allow the use of characters that are defined by a particular character set. A character set is chosen during SQL Server Setup and cannot be changed. Using Unicode data types, a column can store any character defined by the Unicode Standard, which includes all of the characters defined in the various character sets. Unicode data types take twice as much storage space as non-Unicode data types.
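
The storage difference can be illustrated outside SQL Server. The Python sketch below is only an approximation under stated assumptions: the non-Unicode column is assumed to use a single-byte code page (Windows-1252 is picked purely for illustration), the Unicode column is assumed to store two bytes per character (UTF-16 little-endian), and the helper name storage_bytes is hypothetical.

```python
from typing import Tuple


def storage_bytes(text: str) -> Tuple[int, int]:
    """Return (non-Unicode bytes, Unicode bytes) needed for `text`.

    Assumptions: the non-Unicode column uses the single-byte
    Windows-1252 code page; the Unicode column uses UTF-16LE.
    """
    non_unicode = len(text.encode("cp1252"))     # one byte per character
    unicode_size = len(text.encode("utf-16-le"))  # two bytes per character
    return non_unicode, unicode_size


narrow, wide = storage_bytes("Café")
print(narrow, wide)
```

For text that fits the code page, the second number is twice the first, which matches the "twice as much storage space" statement above.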

Unicode data is stored using the nchar, nvarchar, and ntext data types in SQL Server. Use these data types for columns that store characters from more than one character set.

The SQL Server Unicode data types are based on the National Character data types in the SQL-92 standard. SQL-92 uses the prefix character "n" to identify these data types and values.

Use of nchar, nvarchar, and ntext is the same as char, varchar, and text, respectively, except that:
  • Unicode supports a wider range of characters.
  • More space is needed to store Unicode characters.
  • The maximum size of nchar and nvarchar columns is 4,000 characters, not 8,000 characters like char and varchar.
  • Unicode constants are specified with a leading N: N'A Unicode string'.
  • All Unicode data uses the same Unicode code page. Collations do not control the code page used for Unicode columns, only attributes such as comparison rules and case sensitivity.
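
The first point in the list, the wider range of characters, can be sketched in Python. This is an analogy rather than SQL Server's own conversion logic: it assumes a Windows-1252 code page for the non-Unicode side and UTF-16 little-endian for the Unicode side, and the helper name to_code_page is hypothetical. Characters that the code page cannot represent are replaced with a question mark, which is the kind of loss a Unicode column avoids.

```python
def to_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page.

    Characters that the code page cannot represent are replaced
    with '?', so the original text is lost.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


def to_unicode(text: str) -> str:
    """Round-trip text through UTF-16LE; no character is lost."""
    return text.encode("utf-16-le").decode("utf-16-le")


print(to_code_page("Café"))    # fits the assumed code page
print(to_code_page("Москва"))  # Cyrillic is outside Windows-1252
print(to_unicode("Москва"))    # survives the Unicode round trip
```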

Unicode is best suited for systems that need to support at least one foreign language: "The Unicode specification defines a single encoding scheme for most characters widely used in businesses around the world. All computers consistently translate the bit patterns in Unicode data into characters using the single Unicode specification. This ensures that the same bit pattern is always converted to the same character on all computers. Data can be freely transferred from one database or computer to another without concern that the receiving system will translate the bit patterns into characters incorrectly."
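
The point about bit patterns can be shown with a short Python sketch. The two code pages used here (Windows-1252 and Windows-1251) are chosen only as examples, and UTF-16 little-endian is assumed as the Unicode encoding. The same single byte decodes to different characters depending on the code page, while the Unicode bytes always decode to the same character.

```python
# One byte, interpreted under two different single-byte code pages.
raw = bytes([0xE9])
as_western = raw.decode("cp1252")   # Western European code page
as_cyrillic = raw.decode("cp1251")  # Cyrillic code page

print(as_western, as_cyrillic)      # two different characters

# The Unicode encoding of the same character is unambiguous:
# every system decodes these two bytes to the same character.
unicode_bytes = as_western.encode("utf-16-le")
round_trip = unicode_bytes.decode("utf-16-le")

print(unicode_bytes.hex(), round_trip)
```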


