Showing posts with label growing. Show all posts
Showing posts with label growing. Show all posts

Tuesday, March 20, 2012

Non-clustered indexes on 1 separate fileGroup?

Hi Guys,

I am reviewing the storage structure of my growing db.

I am going to create multiple FileGroups and db files.
Most of the filegroups will be used to allow table partitioning and put the heavily used partitions on different storage than the read-only archives. TempDB will also have its own filegroup/storage.

For Indexes, I have a doubt. I am sure I can have all my non-clustered indexes on a separate fileGroup and storage but what about primary keys and/or clustered indexes?

Should I move them to the same filegroup than the non-clustered indexes or should I leave them with the data? What would be best?

I am running a 100GB reporting datamart on SQL2005 enterprise, There are almost no transactions, just a daily and weekly refresh. I will soon have more storage, SAN or NetApp, not sure yet.

Thanks,

Philippe

Hi

Moving your indexes to a seperate filegroup can improve the parallism of some querries that need to work with indexes and then do a bookmark lookup afterwards. If you have a VLDB like you said you should consider doing it, since the indexes cant be kept in memory all the time. So when you access the indexes it will not block the retrieval of your data from the other partition.

Monday, March 19, 2012

Nonclustered index question

I have a table that has around 100,000 records in it, growing at a rate
of 200 records/day:
CREATE TABLE [Main] (
[ID] [int] IDENTITY (1, 1) NOT NULL ,
[Code] [char] (17) NOT NULL ,
[AgentID] [int] NULL ,
[DueDate] [smalldatetime] NOT NULL ,
[Status] [varchar] (18) NOT NULL
)
(This table also has about 30 more fields, but I have left them out for
brevity.)
I am looking to speed up a few specific search queries. I already have
a clustered index on my primary key field [ID], and a non-clustered
index on field [Code]. These indexes are fine performance wise.
I am looking to speed up 2 searches:
1) A combination of [AgentID] and [DueDate]: "... WHERE AgentID = n
AND DueDate BETWEEN (...)
2) [Status]: WHERE [Status]=xxx. Actually, this search is already
fast, and [Status] will really only have about 7 distinct values, so I
didn't index this field, as I figured a table scan would be faster in
this case. IS THIS A CORRECT ASSUMPTION. The result set for this
query is almost aways under 25 rows
My main question is: Should I put nonclustered indexes on [DueDate]
and [AgentID]. AgentID will have maybe 25 distinct values throughout
the table. The result set will usually be under 50 rows.
I just want to make sure I am not overindexing the table, and that if I
index these fields it will be beneficial.
Any help would be appreciated.for query 1, an index on ([AgentID], [DueDate]) could be an option, but
it looks like AgentID is not too sleective:

> AgentID will have maybe 25 distinct values throughout
so I'd also consider indexes on ([DueDate], [AgentID]) and just on
([DueDate])|||> 2) [Status]: WHERE [Status]=xxx. Actually, this search is already
> fast, and [Status] will really only have about 7 distinct values, so I
> didn't index this field, as I figured a table scan would be faster in
> this case. IS THIS A CORRECT ASSUMPTION. The result set for this
> query is almost aways under 25 rows
How exactly are you using this column? (i.e. how many rows per each value)
Searching for individual status values might be improved through the use of
indexed views - one per each status value. But may have a negative impact on
inserts/updates.
ML
http://milambda.blogspot.com/|||With 7 distinct values for Status, 100,000 rows in the table, and
WHERE [Status]=xxx returning under 25 rows, I have to assume that the
values for Status are severely skewed, and the status being tested is
one of the low frequency ones. Even with so few values, a query on a
specific low frequency value may in fact benefit from an index. I
would add an index on Status and see if it was used and if it helped.
With 100,000 rows, 25 distinct values for AgentID, and a result set of
around 50 rows, I have to assume that the DueDate BETWEEN test is
quite restrictive. I would try an index on (AgentID,DueDate).
Roy
On 23 Feb 2006 08:29:24 -0800, kaczmar2@.hotmail.com wrote:

>I have a table that has around 100,000 records in it, growing at a rate
>of 200 records/day:
>CREATE TABLE [Main] (
> [ID] [int] IDENTITY (1, 1) NOT NULL ,
> [Code] [char] (17) NOT NULL ,
> [AgentID] [int] NULL ,
> [DueDate] [smalldatetime] NOT NULL ,
> [Status] [varchar] (18) NOT NULL
> )
>(This table also has about 30 more fields, but I have left them out for
>brevity.)
>I am looking to speed up a few specific search queries. I already have
>a clustered index on my primary key field [ID], and a non-clustered
>index on field [Code]. These indexes are fine performance wise.
>I am looking to speed up 2 searches:
>1) A combination of [AgentID] and [DueDate]: "... WHERE AgentID = n
>AND DueDate BETWEEN (...)
>2) [Status]: WHERE [Status]=xxx. Actually, this search is already
>fast, and [Status] will really only have about 7 distinct values, so I
>didn't index this field, as I figured a table scan would be faster in
>this case. IS THIS A CORRECT ASSUMPTION. The result set for this
>query is almost aways under 25 rows
>My main question is: Should I put nonclustered indexes on [DueDate]
>and [AgentID]. AgentID will have maybe 25 distinct values throughout
>the table. The result set will usually be under 50 rows.
>I just want to make sure I am not overindexing the table, and that if I
>index these fields it will be beneficial.
>Any help would be appreciated.|||kaczmar2@.hotmail.com wrote:
> I have a table that has around 100,000 records in it, growing at a
> rate of 200 records/day:
> CREATE TABLE [Main] (
> [ID] [int] IDENTITY (1, 1) NOT NULL ,
> [Code] [char] (17) NOT NULL ,
> [AgentID] [int] NULL ,
> [DueDate] [smalldatetime] NOT NULL ,
> [Status] [varchar] (18) NOT NULL
> )
> (This table also has about 30 more fields, but I have left them out
> for brevity.)
> I am looking to speed up a few specific search queries. I already
> have a clustered index on my primary key field [ID], and a
> non-clustered index on field [Code]. These indexes are fine
> performance wise.
> I am looking to speed up 2 searches:
> 1) A combination of [AgentID] and [DueDate]: "... WHERE AgentID = n
> AND DueDate BETWEEN (...)
Depending on the selectiveness of your query criteria and distribuition of
values it might in fact be better to change the PK to non clustered and
have a clustered index on (DueDate, AgentID). If inserted DueDates are
increasing then inserts may also benefit from this index layout. On the
other hand, ID will be increasing, too. But for a range query a clustered
index is usually favourable.

> 2) [Status]: WHERE [Status]=xxx. Actually, this search is already
> fast, and [Status] will really only have about 7 distinct values, so I
> didn't index this field, as I figured a table scan would be faster in
> this case. IS THIS A CORRECT ASSUMPTION. The result set for this
> query is almost aways under 25 rows
Assuming equal distribution of values you'll get only 14% of the rows with
this criterion. I would assume that an index pays off here as this seems
pretty selective. Since you seem to be mostly searching for low freq
values it's highly likely that and index would help, especially as the
table grows. The more often you search for low freq values the more the
index pays off.

> My main question is: Should I put nonclustered indexes on [DueDate]
> and [AgentID]. AgentID will have maybe 25 distinct values throughout
> the table. The result set will usually be under 50 rows.
> I just want to make sure I am not overindexing the table, and that if
> I index these fields it will be beneficial.
You'll have to do some testing to get definitive answers. And you
probably should verify your findings from time to time as the table grows.
You might as well create a test table with the same layout and much more
data (but with typical distribution) if you want to see the behavior for
large tables beforehand.
Kind regards
robert

Saturday, February 25, 2012

No truncate of transaction log during database backup

Hello,
A transaction log on one of my sql 2000 server is growing
bigger and bigger. During the backup with backupexec 9.1
the log isn't truncated. I use the option full in the
option tab of the dbase.
cheers!
KeesFull backups do not truncate inactive log entries. Transaction log backups
in full recovery model will remove inactive portions. Removing inactive
entries will not shrink the log file. There is a separate command for that.
I suggest reading the entire backup and recovery section in BOL (Books
On-Line) as your choice of recovery models has several implications for
database maintenance.
--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com
"Kees" <anonymous@.discussions.microsoft.com> wrote in message
news:000c01c3dc64$0821f7b0$a001280a@.phx.gbl...
> Hello,
> A transaction log on one of my sql 2000 server is growing
> bigger and bigger. During the backup with backupexec 9.1
> the log isn't truncated. I use the option full in the
> option tab of the dbase.
> cheers!
> Kees|||Are you backing up the transaction log? That is the only
way to control the log size when using the Full recovery
model. If you don't need the transaction log, just
switch to the simple recovery model.
Sincerely,
Invotion Engineering Team
Advanced Microsoft Hosting Solutions
http://www.Invotion.com
>--Original Message--
>Hello,
>A transaction log on one of my sql 2000 server is growing
>bigger and bigger. During the backup with backupexec 9.1
>the log isn't truncated. I use the option full in the
>option tab of the dbase.
>cheers!
>Kees
>.
>