Difference between NVARCHAR in Oracle and SQL Server?

Zenil picture Zenil · Aug 20, 2013 · Viewed 51.1k times · Source

We are migrating some data from sql server to oracle. For columns defined as NVARCHAR in SQL server we started creating NVARCHAR columns in Oracle thinking them to be similar..But it looks like they are not.

I have read couple of posts on stackoverflow and want to confirm my findings.

Oracle VARCHAR2 already supports unicode if the database character set is say AL32UTF8 (which is true for our case).

SQLServer VARCHAR does not support unicode. SQLServer explicitly requires columns to be in NCHAR/NVARCHAR type to store data in unicode (specifically in the 2 byte UCS-2 format)..

Hence would it be correct to say that SQL Server NVARCHAR columns can/should be migrated as Oracle VARCHAR2 columns ?

Answer

Justin Cave picture Justin Cave · Aug 20, 2013

Yes, if your Oracle database is created using a Unicode character set, an NVARCHAR in SQL Server should be migrated to a VARCHAR2 in Oracle. In Oracle, the NVARCHAR data type exists to allow applications to store data using a Unicode character set when the database character set does not support Unicode.

One thing to be aware of in migrating, however, is character length semantics. In SQL Server, a NVARCHAR(20) allocates space for 20 characters which requires up to 40 bytes in UCS-2. In Oracle, by default, a VARCHAR2(20) allocates 20 bytes of storage. In the AL32UTF8 character set, that is potentially only enough space for 6 characters though most likely it will handle much more (a single character in AL32UTF8 requires between 1 and 3 bytes. You probably want to declare your Oracle types as VARCHAR2(20 CHAR) which indicates that you want to allocate space for 20 characters regardless of how many bytes that requires. That tends to be much easier to communicate than trying to explain why some 20 character strings are allowed while other 10 character strings are rejected.

You can change the default length semantics at the session level so that any tables you create without specifying any length semantics will use character rather than byte semantics

ALTER SESSION SET nls_length_semantics=CHAR;

That lets you avoid typing CHAR every time you define a new column. It is also possible to set that at a system level but doing so is discouraged by the NLS team-- apparently, not all the scripts Oracle provides have been thoroughly tested against databases where the NLS_LENGTH_SEMANTICS has been changed. And probably very few third-party scripts have been.