Showing posts with label troble. Show all posts
Showing posts with label troble. Show all posts

Friday, March 23, 2012

Noob having troble with group by etc.

Well i guess it is a bit more than just group by but here we go...

General description :( background info on the problem; may or may not be needed)

I am trying to get a list of active card holders for a particular month. A card holder is considered active if their status was active any time during that month. I have a card holder status in my CardHolder table but that will only give me their current status. I also have a CardHolderHistory table that saves any changes to a card holder including their status. Using the CardHolderHistory table i can find out who was active for any month in question by using the following algorithm:

A card holder is active if they meet one or both of the following conditions;

1) The last entry before the month in question has them as active in CardHolderHistory or

2) They were set to active any time during the month in question

The problem:

I create a temporary table and copy the card holders into the table with separate select into statements for each of the above conditions. For the first condition, I first gather all the CardHolderHistory entries before the month in question with the following select

select EmployerName, cardholderhistory.FirstName, cardholderhistory.Initial, cardholderhistory.LastName, ChangeDate, CardHolderHistory.CardState, FKEmployerID

into #MonthlyCards from cardholderhistory left join Employer on PKEmployerID = cardholderhistory.FKEmployerID

where changedate<@.datespecified

This gives me ALL of the CardHolderHistory before the month in question (@.dateSpecified).

It would look something like this

EmployerName First Initial Last Status ChangeDate EmployerID

-

.

.

.

Some Fake Company John P Doe A 2006-05-02 11:34:26.360 109

Some Fake Company John P Doe A 2006-03-28 11:14:10.520 109

Some Fake Company Jane T Doe S 2006-05-30 15:34:14.900 109

Some Fake Company Jane T Doe A 2006-03-28 12:20:03.670 109

Some Fake Company Jane T Doe A 2006-03-01 10:12:45.320 109

.

. (S = suspended; A = Active)

.

Now I need to make sure people are only listed once and remove everyone whose status = suspended for their last change date. That is my problem; I can’t figure out how to remove these people. The next step i took was to do a group by with EmployerName, FirstName, Initial, LastName, Status, Max(ChangeDate), FKEmployerID. This removes everyone listed multiple times except if their status changes. After the group by the list would be

EmployerName First Initial Last Status ChangeDate EmployerID

-

Some Fake Company John P Doe A 2006-05-02 11:34:26.360 109

Some Fake Company Jane T Doe S 2006-05-30 15:34:14.900 109

Some Fake Company Jane T Doe A 2006-03-28 12:20:03.670 109

Now I need to delete Jane completely, since her latest status is S. I could delete where Status = ‘S’ but that leaves her as active from 3/28. I can’t figure out how to completely remove Jane. The rest I can figure out I think. Please help me get rid of Jane for good, I don’t like her.

Thanks for any help,

Rico

Instead of grabbing too many records, how about grabbing only the records you want. In this case, construct a query that uses a full outer join of the two conditions you have stated. Now you will have only one record for the active members and you can sort out the values of the columns.

Hopefully, this code gives you something to work with. I have assumed that there is a CardHolderID that is unique to each CardHolder record.

Select

From(

select c1.CardHolderID,

c1.FirstName,

c1.Initial,

c1.LastName,

c1.ChangeDate,

C1.CardState,

FKEmployerID

From cardholderhistory c1

Where ChangeDate = (

Select Top 1

changeDate

From cardholderhistory c2

Where c1.CardHolderID = c2.CardHolderID

And c2.ChangeDate < @.DateSpecified

Order by changeDate desc )

And CardState = 'A'

) t1

Full Outer Join

(

select c3.CardHolderID,

c3.FirstName,

c3.Initial,

c3.LastName,

c3.ChangeDate,

C3.CardState,

FKEmployerID

From cardholderhistory c3

Where ChangeDate between @.monthStart and @.monthEnd

) t2

On t1.CardHolderID = t2.CardHolderID

Join Employer

On t1.FKEmployerID = Employer.PKEmployerID

|||

Anomolus,

Thank you very much for your reply. I have wasted an embarrassing number of hours trying to figure this out but with your help I was able to get what I needed. The code i used was

declare @.dateSpecified datetime
--@.employerID int
declare @.endDate datetime

set @.dateSpecified = '8/1/2006'
set @.endDate = dateadd(m, 1, @.dateSpecified)


select EmployerName, chh.FirstName, chh.Initial, chh.LastName, FKEmployerID
into #MonthlyCards
From CardHolderHistory chh left join
Employer on PKEmployerID = FKemployerID
Where ChangeDate = ( Select Top 1 ChangeDate
From CardHolderHistory chh2
Where chh.PKCardHolderID = chh2.PKCardHolderID
And chh2.ChangeDate < @.DateSpecified
Order by changeDate desc )
And (CardState = 'A' or CardState = '')
Order By EmployerName, LastName, FirstName, Initial

insert into #MonthlyCards([EmployerName],[FirstName],[Initial],[LastName],[FKEmployerID])
(select EmployerName, chh.FirstName, chh.Initial, chh.LastName, FKEmployerID
From CardHolderHistory chh left join
Employer on PKEmployerID = FKemployerID
where (changedate between dateadd(ms, (-5),@.dateSpecified) and @.enddate)
And (CardState = 'A' or CardState = ''))

Select * from #monthlycards
group by EmployerName, FirstName, Initial, LastName, FKEmployerID
--order by pkcardholderid

drop table #MonthlyCards

I was having a problem getting that first condition in the form of a select. I have few follow-up questions, now that i got it to work.

1) Why use the Full outer join? It seems to get a bunch of info that i don't need and has a bunch of nulls. Wouldn't the full outer join join the two conditions into the same record?

2) What if there are too ChangeDates with the same value? Will this cause errors or is it generally acceptable to rely on the millisecond precision of the datetime data type?

3) Is there anything i should do to optimize this code? My CardHolderHistory table has thousands of entries already and it will continue to get bigger.

Again, thanks Anomolus, and thank you to anyone that can help with these questions,

Rico

|||

1) The Full outer join allows you to get the list of customers in a single select. The customers will match at least one of the two conditions and if it matches both, then they are combined into a single record. You will get nulls when the customer only satisfies one condition. It occurs to me that all you really need out of this is the CustomerID. Isn't there a risk of getting old information if you grab the name from history as it could have been what was changed last?

2) This is really a business question since your rule says the last change. If it is possible for two changes to happen at the same time, then you ought to have another field that designates the version order in the history table.

|||

"Isn't there a risk of getting old information if you grab the name from history as it could have been what was changed last?"

Yes there is, thanks for catching that! I thouught that i would save some resources by using one less join but you are right... i had thought of that in my earlier failed attemps but looked over it in this version.

"If it is possible for two changes to happen at the same time"

I dont think this is possible, maybe you could help me find out. The ChangeDate value is filled in automatically (the default value is getdate())when a new CardHolderHistory record is created. It is not changes beyond that. If 2 different administrators used the website and tried to update card holders at the same and my SaveCardHolder stored procedure is called at the same time im guessing this could generate the same ChangeDate. Besides the fact that it is unlikly, can more than one record be saved at the same time? I insert into the CardHolderHistory table from Save CardHolder using transactions (roll back on error otherwise commit) if this makes a difference.

Thank you for your help! You have really saved me!

sql