Monday, March 12, 2012

Non localized unicode collating

I noticed that SQL Server 7 had a specific ID for collating unicode in binary ordering. I could not find find a collation in SQL Server 2000 that was clearly a non localized binary ordering.
Does anyone know what collation I want for this purpose?
Michael,
Not sure, I ran the below query and came up with this:
SQL_Latin1_General_CP437_BIN
SQL_Latin1_General_CP850_BIN
select *
from ::fn_helpcollations()
where [description] like '%binary%'
Dinesh
SQL Server MVP
--
SQL Server FAQ at
http://www.tkdinesh.com
"Michael Bauers" <anonymous@.discussions.microsoft.com> wrote in message
news:BA9AFE0F-4DEA-4EDA-A538-80FFDD11945A@.microsoft.com...
> I noticed that SQL Server 7 had a specific ID for collating unicode in
binary ordering. I could not find find a collation in SQL Server 2000 that
was clearly a non localized binary ordering.
> Does anyone know what collation I want for this purpose?
>
|||I believe I also saw those. But it was not clear to me if that's what we are looking for. I do appreciate you looking.
|||The "*_BIN" collations like Latin1_General_BIN do provide non-localized
sorting and comparisons of Unicode data. For example, the collation
Greek_BIN behaves has the following characteristics:
- Unicode data is sorted and compared using a binary collation that only
takes into account the code point of Unicode characters (for example an
"x" is not assigned any linguistic meaning but is simply sorted as
U+0078). No two Unicode characters are considered equal.
- Non-Unicode data is stored using Greek code page 1253.
- Non-Unicode data is also sorted using a binary sort order that only
considers the character's code point value. No two non-Unicode
characters are considered equal.
In other words, the "Greek_" in "Greek_BIN" simply refers to the code
page that is used to store non-Unicode data. It doesn't change how the
binary collation sorts or compares Unicode data.
Binary sorting for non-Unicode characters is fairly intuitive; for
example, code point 0x49 sorts before 0x4a, which sorts before 0x4b, etc.
Binary Unicode sorting, however, also depends only on the character code
points, but due to an implementation quirk does not order characters in
strict code point order. The actual binary Unicode collation behavior is
discussed briefly in 816039 FIX: Code Point Comparison Semantics for
SQL_Latin1_General_Cp850_BIN (http://support.microsoft.com/?id=816039)
and is also demonstrated by the script below. Note that this behavior is
also how SQL 7.0 worked in binary Unicode string comparisons.
drop table #t1
declare @.c1 nchar(1)
declare @.c2 nchar(1)
declare @.c3 nchar(1)
declare @.c4 nchar(1)
declare @.c5 nchar(1)
set @.c1 = nchar (0x03C0)
set @.c2 = nchar (0x00F0)
set @.c3 = nchar (0x01E0)
set @.c4 = nchar (0x01E1)
set @.c5 = nchar (0xABCD)
create table #t1 (c1 nchar(2) collate Latin1_General_BIN)
insert into #t1 values (@.c1 + @.c1)
insert into #t1 values (@.c1 + @.c2)
insert into #t1 values (@.c2 + @.c1)
insert into #t1 values (@.c2 + @.c2)
insert into #t1 values (@.c3 + @.c4)
insert into #t1 values (@.c4 + @.c3)
insert into #t1 values (@.c5 + @.c5)
select c1, convert (varbinary (4), c1) as bin_string,
convert (varbinary(1), (convert (int, unicode (substring (c1, 1, 1)))
& 0xFF00) / 256) as byte2_bin,
convert (varbinary(1), (convert (int, unicode (substring (c1, 1, 1)))
& 0x00FF)) as byte1_bin,
convert (varbinary(2), substring (c1, 2, 100)) as remaining_bytes_bin
from #t1 order by c1
HTH,
Bart
Bart Duncan
Microsoft SQL Server Support
Please reply to the newsgroup only - thanks.
This posting is provided "AS IS" with no warranties, and confers no
rights.
Thread-Topic: Non localized unicode collating
thread-index: AcQ4brcfkye8UvuwSXaGffp7bpZtog==
X-WN-Post: microsoft.public.sqlserver.server
From: "=?Utf-8?B?TWljaGFlbCBCYXVlcnM=?="
<anonymous@.discussions.microsoft.com>
References: <BA9AFE0F-4DEA-4EDA-A538-80FFDD11945A@.microsoft.com>
Subject: RE: Non localized unicode collating
Date: Wed, 12 May 2004 15:16:04 -0700
Lines: 3
Message-ID: <3FDBAF61-39D9-4BD7-BFDF-A5E330B354F1@.microsoft.com>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.sqlserver.server
Path: cpmsftngxa10.phx.gbl
Xref: cpmsftngxa10.phx.gbl microsoft.public.sqlserver.server:341014
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.sqlserver.server
I believe I also saw those. But it was not clear to me if that's what we
are looking for. I do appreciate you looking.
From: "Dinesh T.K" <tkdinesh@.nospam.mail.tkdinesh.com>
Subject: Re: Non localized unicode collating
Date: Thu, 13 May 2004 03:24:28 +0530
Newsgroups: microsoft.public.sqlserver.server
Michael,
Not sure, I ran the below query and came up with this:
SQL_Latin1_General_CP437_BIN
SQL_Latin1_General_CP850_BIN
select *
from ::fn_helpcollations()
where [description] like '%binary%'
Dinesh
SQL Server MVP
--
SQL Server FAQ at
http://www.tkdinesh.com
"Michael Bauers" <anonymous@.discussions.microsoft.com> wrote in message
news:BA9AFE0F-4DEA-4EDA-A538-80FFDD11945A@.microsoft.com...
> I noticed that SQL Server 7 had a specific ID for collating unicode in
binary ordering. I could not find find a collation in SQL Server 2000
that
was clearly a non localized binary ordering.
> Does anyone know what collation I want for this purpose?
>

No comments:

Post a Comment