问题描述
我有一个关于 T-SQL 和 SQL Server 的问题.
I have a question related to T-SQL and SQL Server.
假设我有一个包含 2 列的 Orders 表:
Let's say I have a table Orders with 2 columns:
- ProductId int
- CustomerId int
- 日期日期时间
我想要每个产品的第一个订单的日期,所以我执行这种类型的查询:
I want the date of the first order for every product, so I perform this type of query:
SELECT ProductId, MIN(Date) AS FirstOrder
FROM Orders
GROUP BY ProductId
我在 ProductId
上有一个索引,包括列 CustomerId
和 Date
以加快查询速度 (IX_Orders
>).查询计划看起来像是对 IX_Orders
的非聚集索引扫描,然后是流聚合(由于索引没有排序).
I have an index on ProductId
, including the columns CustomerId
and Date
to speed up the query (IX_Orders
). The query plan looks like a non-clustered index scan on IX_Orders
, followed by a stream aggregate (no sort thanks to the index).
现在我的问题是我还想检索与每个产品的第一个订单相关联的 CustomerId
(产品 26 于 25 日星期二由客户 12 首次订购).棘手的部分是我不希望在执行计划中有任何内部循环,因为这意味着表中每个 ProductId
的额外读取,这是非常低效的.
Now my problem is that I also want to retrieve the CustomerId
associated with the first order for each product (Product 26 was first ordered on Tuesday 25, by customer 12). The tricky part is that I don't want any inner loop in the execution plan, because it would mean an additional read per ProductId
in the table, which is highly inefficient.
这应该可以使用相同的非聚集索引扫描,然后是流聚合,但是我似乎找不到可以做到这一点的查询.有什么想法吗?
This should just be possible using the same non-clustered index scan, followed by stream aggregates, however I can't seem to find a query that would do that. Any idea?
谢谢
推荐答案
这将处理具有重复日期的产品:
this will handle products that have duplicate dates:
DECLARE @Orders table (ProductId int
,CustomerId int
,Date datetime
)
INSERT INTO @Orders VALUES (1,1,'20090701')
INSERT INTO @Orders VALUES (2,1,'20090703')
INSERT INTO @Orders VALUES (3,1,'20090702')
INSERT INTO @Orders VALUES (1,2,'20090704')
INSERT INTO @Orders VALUES (4,2,'20090701')
INSERT INTO @Orders VALUES (1,3,'20090706')
INSERT INTO @Orders VALUES (2,3,'20090704')
INSERT INTO @Orders VALUES (4,3,'20090702')
INSERT INTO @Orders VALUES (5,5,'20090703') --duplicate dates for product #5
INSERT INTO @Orders VALUES (5,1,'20090703') --duplicate dates for product #5
INSERT INTO @Orders VALUES (5,5,'20090703') --duplicate dates for product #5
;WITH MinOrders AS
(SELECT
o.ProductId, o.CustomerId, o.Date
,row_number() over(partition by o.ProductId order by o.ProductId,o.CustomerId) AS RankValue
FROM @Orders o
INNER JOIN (SELECT
ProductId
,MIN(Date) MinDate
FROM @Orders
GROUP BY ProductId
) dt ON o.ProductId=dt.ProductId AND o.Date=dt.MinDate
)
SELECT
m.ProductId, m.CustomerId, m.Date
FROM MinOrders m
WHERE m.RankValue=1
ORDER BY m.ProductId, m.CustomerId
这将返回相同的结果,只需使用与上述代码相同的声明和插入:
this will return the same results, just use the same declare and inserts as the above code:
;WITH MinOrders AS
(SELECT
o.ProductId, o.CustomerId, o.Date
,row_number() over(partition by o.ProductId order by o.ProductId,o.CustomerId) AS RankValue
FROM @Orders o
)
SELECT
m.ProductId, m.CustomerId, m.Date
FROM MinOrders m
WHERE m.RankValue=1
ORDER BY m.ProductId, m.CustomerId
您可以尝试每个版本,看看哪个版本运行得更快...
You can try out each version to see which will run faster...
这篇关于查找与 Min/Max 关联的行,没有内部循环的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!