Thursday, March 8, 2012

A simple group by query seems Impossible

I've been fighting with this all day there has to be an easy way to do this in a single query:

FOR EACH fac_id, get the INSPECTION_ID of the earliest start_date:

INSPECTION_ID FAC_ID START_DATE
300720040628004 566 2004-07-07
300720030618002 566 2003-06-25
200820021219011 1436 2002-12-19
300720030206002 1458 2003-02-03
300720030206003 1458 2003-02-05
300720030121002 1480 2003-03-25
300720030121003 1480 2005-02-02
200820031230001 1436 2003-12-30
300720040616006 1566 2004-08-26
300720040616001 1566 2001-08-26
300720040616005 1566 2002-08-26
300720040616002 1566 2003-08-26
300720040616003 1566 2004-08-26
300720040616004 1566 2006-08-26
300720040504004 1569 2004-09-10

DESIRED OUTPUT
INSPECTION_ID FAC_ID START_DATE
300720030618002 566 2003-06-25
200820021219011 1436 2002-12-19
300720030206002 1458 2003-02-03
300720030121002 1480 2003-03-25
200820031230001 1436 2003-12-30
300720040616001 1566 2001-08-26
300720040504004 1569 2004-09-10you have FAC_ID 1436 listed twice with two start dates ... this appears to voilate the ground rules, since there are only 6 distinct FAC_ID values.

Given that, this should work:


-- create the table
create table #temp (
INSPECTION_ID bigint,
FAC_ID int,
START_DATE datetime)

-- populate the table
insert into #temp
select 300720040628004, 566 , '2004-07-07'
UNION
select 300720030618002, 566 , '2003-06-25'
UNION
select 200820021219011, 1436, '2002-12-19'
UNION
select 300720030206002, 1458, '2003-02-03'
UNION
select 300720030206003, 1458, '2003-02-05'
UNION
select 300720030121002, 1480, '2003-03-25'
UNION
select 300720030121003, 1480, '2005-02-02'
UNION
select 200820031230001, 1436, '2003-12-30'
UNION
select 300720040616006, 1566, '2004-08-26'
UNION
select 300720040616001, 1566, '2001-08-26'
UNION
select 300720040616005, 1566, '2002-08-26'
UNION
select 300720040616002, 1566, '2003-08-26'
UNION
select 300720040616003, 1566, '2004-08-26'
UNION
select 300720040616004, 1566, '2006-08-26'
UNION
select 300720040504004, 1569, '2004-09-10'

-- code to select FAC_ID with earliest start date
select t.INSPECTION_ID, t.FAC_ID, t.START_DATE
from #temp t
inner join
(select t2.fac_id, min(t2.start_date) start_date
from #temp t2
group by t2.FAC_ID) t2 on t2.start_date = t.start_date
order by t.fac_id

-- only 6 distinct FAC_ID
select distinct (fac_id)
from #temp

drop table #temp|||I'm simply going to assume that listing TWO 1436 rows in the output was a mistake, if not you're going to have to give a better explaination of what you want - if you want the earliest START_DATE for each FAC_ID, then shouldn't be getting two rows for a single FAC_ID, should you?

Seems to me the problem is how to reterive the INSPECTION_ID from the minimum row without messing up the grouping. I've long wished SQL had the ability to do this...

SELECT ... FROM ... WHERE MIN_OF(column)

but alas, it does not exist natively. Here's a way to do it with a nested query.

SELECT i.*
FROM inspection_table AS i
JOIN (SELECT f=fac_id, s=MIN(start_date)
FROM inspection_table
GROUP BY fac_id) AS q ON i.fac_id = q.f AND i.start_date = q.s

No comments:

Post a Comment