Monday, March 12, 2012

Non unique Clustered index

I have 1,000,000 records there are unique by member(int),vin(20),stock(18).
there are 1200 unique members all updates are done by the primary key
(member,vin,stock). My customer does not want to add an identity column.
There is only a non clustered unique PK on the table, no clustered index.
I am wondering which would be better
1. Put a unique Clustered PK constraint on the 40 byte fields
member(int),vin(20),stock(18) (indexes would be large)
or
2 Put a non Clustered index on member id (4 bytes)(let sql add the
identifier(4 bytes))and put the Primary Key on member(int),vin(20),stock(18)
as a unique non
clustered constraint.
This table has heavy updates(no Pkey fields updated ) and inserts at night
in batch (15,000 updates 5,000 inserts approx per night).
There is currently no clustered index and there is no way to control
fragmentation.
Thanks,
Jon A
If you have only one index on a table, it's usually best to be clustered.
The downside to a wide clustered index is that the clustered index keys are
stored in non-clustered indexes as well. This is not an issue when you
don't have non-clustered indexes, though.
Hope this helps.
Dan Guzman
SQL Server MVP
"Jon A" <JonA@.discussions.microsoft.com> wrote in message
news:E1406DAF-36A7-46AB-9EFD-27F942A59B51@.microsoft.com...
>I have 1,000,000 records there are unique by member(int),vin(20),stock(18).
> there are 1200 unique members all updates are done by the primary key
> (member,vin,stock). My customer does not want to add an identity column.
> There is only a non clustered unique PK on the table, no clustered index.
> I am wondering which would be better
> 1. Put a unique Clustered PK constraint on the 40 byte fields
> member(int),vin(20),stock(18) (indexes would be large)
> or
> 2 Put a non Clustered index on member id (4 bytes)(let sql add the
> identifier(4 bytes))and put the Primary Key on
> member(int),vin(20),stock(18)
> as a unique non
> clustered constraint.
> This table has heavy updates(no Pkey fields updated ) and inserts at night
> in batch (15,000 updates 5,000 inserts approx per night).
> There is currently no clustered index and there is no way to control
> fragmentation.
> --
> Thanks,
> Jon A
|||Jon A wrote:
> I have 1,000,000 records there are unique by
> member(int),vin(20),stock(18). there are 1200 unique members all
> updates are done by the primary key (member,vin,stock). My customer
> does not want to add an identity column. There is only a non
> clustered unique PK on the table, no clustered index.
> I am wondering which would be better
> 1. Put a unique Clustered PK constraint on the 40 byte fields
> member(int),vin(20),stock(18) (indexes would be large)
> or
> 2 Put a non Clustered index on member id (4 bytes)(let sql add the
> identifier(4 bytes))and put the Primary Key on
> member(int),vin(20),stock(18) as a unique non
> clustered constraint.
> This table has heavy updates(no Pkey fields updated ) and inserts at
> night in batch (15,000 updates 5,000 inserts approx per night).
> There is currently no clustered index and there is no way to control
> fragmentation.
I would argue that because you have a natural key as your PK, you're
likely to see page splitting if you use a clustered index. Since this is
a nightly batch, it may not matter. But then again, having a clustered
index on this table may not matter either, depnding on how the SELECTS
and UPDATES look.
I do agree with Dan. That is, it's best for most, if not all, tables to
have a clustered index. But to add one without a careful investigation
of the table and how it's used is necessary. Just as you would consider
what columns would best make use of a clustered index during database
design, you should perform the same due diligence now.
Look at your queries and table access. See how the data is updated. Is
it more than one row at a time? Is it ever changing a column value that
could be in the clustered index? What do the inserts look like? Are they
adding rows with column values that will most likely cause spage
splitting and slower insert performance at night? Look at the SELECTS on
the table. Do you ever return more than one row at a time? If so, what
criteria determine the rows returned? Do you have ORDER BY statements in
your queries? Do they really need to be there?
If you can post more information about how the table is used, we may be
able to offer more advice.
David Gugick
Imceda Software
www.imceda.com
|||Hi Jon,
I am not really sure what problem you are trying to solve. Is there a
problem?
If the potential problem is fragmentation control, then you could simply
add and drop a clustered index (on any column) during a service window.
If you do that periodically, fragmentation should be under control.
The rest depends on the queries you are using. A 40-byte index in itself
doesn't cause problems. If Insert and Delete performance during the day
is not an issue, then you could safely make the Primary Key index
clustered. And even with the proper fillfactors, Insert performance
shouldn't be a problem.
Having a clustered index can help Select performance on ranges a lot
(for example, the range member = <somevalue>). For high selectivity
Selects, a clustered index does not add much value.
HTH,
Gert-Jan
Jon A wrote:
> I have 1,000,000 records there are unique by member(int),vin(20),stock(18).
> there are 1200 unique members all updates are done by the primary key
> (member,vin,stock). My customer does not want to add an identity column.
> There is only a non clustered unique PK on the table, no clustered index.
> I am wondering which would be better
> 1. Put a unique Clustered PK constraint on the 40 byte fields
> member(int),vin(20),stock(18) (indexes would be large)
> or
> 2 Put a non Clustered index on member id (4 bytes)(let sql add the
> identifier(4 bytes))and put the Primary Key on member(int),vin(20),stock(18)
> as a unique non
> clustered constraint.
> This table has heavy updates(no Pkey fields updated ) and inserts at night
> in batch (15,000 updates 5,000 inserts approx per night).
> There is currently no clustered index and there is no way to control
> fragmentation.
> --
> Thanks,
> Jon A
|||An IDENTITY is a lousy candidate for a Clustered Index, almost as horrible
as allowing a table without a Clustered Index at all (a heap).
Heaps are large and, as you've noticed, do not allow you to as easily
control your index rebuild (defragmentaiton) as easily. First of all, the
Clustered Index itself adds no space to the table; its only the use of the
key as a pointer in the other indexes that can grow your non-clustered
indexes. However, consider how large the ROWID is as the alternative to the
clustered index key.
As far as uniqueness, if the Clustered Index is not unique, SQL Server will
make it so by appending a GUID to the key to force it to be unique. Weigh
that against the composite index length, not to mention the size of the heap
alternative.
As to the IDENTITY, if you use one, NEVER make it a clustered index, unless
there are absolutely no other candidates. When will you EVER query an
IDENTITY by range? Also, if you use an IDENTITY, this is normally used as a
surrogate, as in your case, which does not remove the uniqueness requirement
of the business key you would be replacing as the Primary Key. So, better
add an UNIQUE non-Clustered Constraint to the original candidate(s).
Sincerely,
Anthony Thomas

"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:422B03AC.344C5898@.toomuchspamalready.nl...
Hi Jon,
I am not really sure what problem you are trying to solve. Is there a
problem?
If the potential problem is fragmentation control, then you could simply
add and drop a clustered index (on any column) during a service window.
If you do that periodically, fragmentation should be under control.
The rest depends on the queries you are using. A 40-byte index in itself
doesn't cause problems. If Insert and Delete performance during the day
is not an issue, then you could safely make the Primary Key index
clustered. And even with the proper fillfactors, Insert performance
shouldn't be a problem.
Having a clustered index can help Select performance on ranges a lot
(for example, the range member = <somevalue>). For high selectivity
Selects, a clustered index does not add much value.
HTH,
Gert-Jan
Jon A wrote:
> I have 1,000,000 records there are unique by
member(int),vin(20),stock(18).
> there are 1200 unique members all updates are done by the primary key
> (member,vin,stock). My customer does not want to add an identity column.
> There is only a non clustered unique PK on the table, no clustered index.
> I am wondering which would be better
> 1. Put a unique Clustered PK constraint on the 40 byte fields
> member(int),vin(20),stock(18) (indexes would be large)
> or
> 2 Put a non Clustered index on member id (4 bytes)(let sql add the
> identifier(4 bytes))and put the Primary Key on
member(int),vin(20),stock(18)
> as a unique non
> clustered constraint.
> This table has heavy updates(no Pkey fields updated ) and inserts at night
> in batch (15,000 updates 5,000 inserts approx per night).
> There is currently no clustered index and there is no way to control
> fragmentation.
> --
> Thanks,
> Jon A
|||I added the clustered PK index as (member(int),vin(20),stock(18)). And with
adjustment the page splitting is minimal. But the Table is now 3x the size it
was previously.
My question is this in general terms. What is the problems / overhead of a
non unique clustered index? Is this a bad thing? I have never had a case
where I would do that. But as a result of this problem I am now curious.
"David Gugick" wrote:
|||I wouldn't expect changing the PK from non-clustered to clustered to
increase space requirements. In fact, I would think the space would
decrease. Are you certain there are no non-clustered indexes on the table?
You can double check with sp_helpindex.
Hope this helps.
Dan Guzman
SQL Server MVP
"Jon A" <JonA@.discussions.microsoft.com> wrote in message
news:E551392D-6A3D-4176-B1EB-7847AB685B51@.microsoft.com...
>I added the clustered PK index as (member(int),vin(20),stock(18)). And with
> adjustment the page splitting is minimal. But the Table is now 3x the size
> it
> was previously.
> My question is this in general terms. What is the problems / overhead of a
> non unique clustered index? Is this a bad thing? I have never had a case
> where I would do that. But as a result of this problem I am now curious.
> "David Gugick" wrote:
>
|||My sources: http://www.sql-server-performance.co...ed_indexes.asp
say that the uniqueifier used in a non-unique clustered index is a 4 byte value, as opposed to a (16 byte) GUID. Is there any other support either way? I can't find any in BOL.
jg

Quote:

...Originally posted by Anthony Thomas
As far as uniqueness, if the Clustered Index is not unique, SQL Server will
make it so by appending a GUID to the key to force it to be unique...

No comments:

Post a Comment