Problem description
I am trying to get my head around this because it involves comparing consecutive rows. I want to group values that differ by a certain number. For instance, say I have this table:
CREATE TABLE #TEMP (A int, B int)
-- Sample table
INSERT INTO #TEMP VALUES
(3,1),
(3,2),
(3,3),
(3,4),
(5,1),
(6,1),
(7,2),
(8,3),
(8,4),
(8,5),
(8,6)
SELECT * FROM #TEMP
DROP TABLE #TEMP
Say I have to group all rows whose B values differ by 1 and that share the same value of A. I am trying to get output like this:
A B GroupNo
3 1 1
3 2 1
3 3 1
3 4 1
5 1 2
6 1 3
7 2 4
8 3 5
8 4 5
8 5 5
8 6 5
(3,1) (3,2) (3,3) (3,4)
and (8,3) (8,4) (8,5) (8,6)
have each been put into a single group because, within each set, A is the same and consecutive B values differ by 1. I will first show my attempt:
CREATE TABLE #TEMP (A int, B int)
-- Sample table
INSERT INTO #TEMP VALUES
(3,1), (3,2), (3,3), (3,4), (5,1), (6,1), (7,2),
(8,3), (8,4), (8,5), (8,6)
-- Assign row numbers and perform a left join
-- so that we can compare consecutive rows
SELECT ROW_NUMBER() OVER (ORDER BY A ASC) ID, *
INTO #TEMP2
FROM #TEMP
;WITH CTE AS
(
SELECT X.A XA, X.B XB, Y.A YA, Y.B YB
FROM #TEMP2 X
LEFT JOIN #TEMP2 Y
ON X.ID = Y.ID - 1
WHERE X.A = Y.A AND
X.B = Y.B - 1
)
SELECT XA, XB
INTO #GROUPS
FROM CTE
UNION
SELECT YA, YB
FROM CTE
ORDER BY XA ASC
-- Finally assign group numbers
SELECT X.XA, X.XB, Y.GID
FROM #GROUPS X
INNER JOIN
(SELECT XA, ROW_NUMBER() OVER (ORDER BY XA ASC) GID
FROM #GROUPS Y
GROUP BY XA
) Y
ON X.XA = Y.XA
DROP TABLE #TEMP
DROP TABLE #TEMP2
DROP TABLE #GROUPS
I will be doing this on a large table (about 30 million rows), so I was hoping there is a better way that also works for arbitrary differences (not just 1, but possibly 2 or 3, which I will later incorporate into a procedure). Is my approach bug-free, and can it be improved?
Recommended answer
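-- Assumes #TEMP from the question is populated and has not been dropped yet.
-- Maximum allowed gap between consecutive B values within a group.
declare @Diff int = 1
;with C as
(
-- Number the rows within each A, ordered by B.
select A,
B,
row_number() over(partition by A order by B) as rn
from #TEMP
),
R as
(
-- Anchor: the first row of each A starts group 1.
select C.A,
C.B,
1 as G,
C.rn
from C
where C.rn = 1
union all
-- Recursive step: walk to the next row of the same A and
-- start a new group when the gap to the previous B exceeds @Diff.
select C.A,
C.B,
G + case when C.B-R.B <= @Diff
then 0
else 1
end,
C.rn
from C
inner join R
on R.rn + 1 = C.rn and
R.A = C.A
)
-- G restarts at 1 for every A, so rank (A, G) to get global group numbers.
select A,
B,
dense_rank() over(order by A, G) as G
from R
order by A, G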
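As a possible alternative for large tables, the same grouping can be sketched without recursion, using the common "gaps and islands" pattern: flag each row that starts a new run, then take a running sum of the flags. This is not part of the answer above, only a minimal sketch under some assumptions: it needs LAG, which is available in SQL Server 2012 and later; it reads the same #TEMP table; and the names Flagged, IsStart and GroupNo are illustrative. Because it does not recurse, it is not subject to the MAXRECURSION limit (100 levels by default) that the recursive CTE would hit when a single value of A has more rows than that.

```sql
-- Sketch only: assumes SQL Server 2012+ (for LAG) and the #TEMP table from the question.
DECLARE @Diff int = 1;  -- maximum allowed gap between consecutive B values

WITH Flagged AS
(
    SELECT A,
           B,
           -- 0 when the gap to the previous B (same A) is within @Diff.
           -- For the first row of each A, LAG returns NULL, the comparison
           -- is unknown, and the ELSE branch marks the row as a group start.
           CASE WHEN B - LAG(B) OVER (PARTITION BY A ORDER BY B) <= @Diff
                THEN 0
                ELSE 1
           END AS IsStart
    FROM #TEMP
)
SELECT A,
       B,
       -- Running count of group starts gives a global group number.
       SUM(IsStart) OVER (ORDER BY A, B ROWS UNBOUNDED PRECEDING) AS GroupNo
FROM Flagged
ORDER BY A, B;
```

On the sample data this should assign groups 1 through 5 exactly as in the expected output from the question. Changing @Diff to 2 or 3 covers the arbitrary-difference case, and the variable could become a procedure parameter.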