Problem description
I'm looking for a way to aggregate strings from different rows into a single row. I need to do this in many different places, so having a function to facilitate it would be nice. I've tried solutions using COALESCE and FOR XML, but they just don't cut it for me.
String aggregation would do something like this:
id | Name                    Result: id | Names
-- - ----                            -- - -----
1  | Matt                            1  | Matt, Rocks
1  | Rocks                           2  | Stylus
2  | Stylus
I've taken a look at CLR-defined aggregate functions as a replacement for COALESCE and FOR XML, but apparently SQL Azure does not support CLR-defined objects. That is a pain, because I know that being able to use them would solve a whole lot of problems for me.
Is there any possible workaround, or a similarly optimal method (it might not be as optimal as CLR, but I'll take what I can get), that I can use to aggregate my strings?
Recommended answer
Solution
The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact-SQL, which should work fine in Azure.
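;WITH Partitioned AS
(
    -- Number the names within each ID and record how many names each ID has
    SELECT
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM dbo.SourceTable
),
Concatenated AS
(
    -- Anchor member: the first name of each ID.
    -- nvarchar(max) is explicit because a bare nvarchar in CAST means nvarchar(30),
    -- which would silently truncate longer lists.
    SELECT
        ID,
        CAST(Name AS nvarchar(max)) AS FullName,
        Name,
        NameNumber,
        NameCount
    FROM Partitioned
    WHERE NameNumber = 1

    UNION ALL

    -- Recursive member: append the next name of the same ID
    SELECT
        P.ID,
        CAST(C.FullName + ', ' + P.Name AS nvarchar(max)),
        P.Name,
        P.NameNumber,
        P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C
            ON P.ID = C.ID
            AND P.NameNumber = C.NameNumber + 1
)
-- Keep only the last row of each ID, which holds the complete list.
-- The default recursion limit is 100 levels; for IDs with more names than that,
-- append an OPTION (MAXRECURSION n) hint to this statement.
SELECT
    ID,
    FullName
FROM Concatenated
WHERE NameNumber = NameCount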
Explanation
The approach boils down to three steps:
1. Number the rows using OVER and PARTITION, grouping and ordering them as needed for the concatenation. The result is the Partitioned CTE. We keep the count of rows in each partition to filter the results later.
2. Using the recursive CTE (Concatenated), iterate through the row numbers (the NameNumber column), appending Name values to the FullName column.
3. Filter out all results except the ones with the highest NameNumber.
Please keep in mind that to make this query predictable, one has to define both the grouping (in your scenario, rows with the same ID are concatenated) and the sorting (I assumed that you simply sort the strings alphabetically before concatenation).
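On SQL Server 2017 and later, and on Azure SQL Database, the built-in STRING_AGG function lets the same grouping and sorting be stated directly. This is a different technique from the recursive CTE above and is not part of the original answer (it did not exist on SQL Server 2012), so treat it as a sketch against the same dbo.SourceTable:

```sql
-- Alternative to the recursive CTE, assuming STRING_AGG is available
-- (SQL Server 2017+ or Azure SQL Database).
SELECT
    ID,
    -- WITHIN GROUP defines the sort order of the names inside each list
    STRING_AGG(Name, ', ') WITHIN GROUP (ORDER BY Name) AS FullName
FROM dbo.SourceTable
GROUP BY ID;  -- GROUP BY defines which rows are concatenated together
```

Here GROUP BY plays the role of PARTITION BY ID, and WITHIN GROUP (ORDER BY Name) plays the role of the ORDER BY inside ROW_NUMBER().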
I've quickly tested the solution on SQL Server 2012 with the following data:
INSERT dbo.SourceTable (ID, Name)
VALUES
    (1, 'Matt'),
    (1, 'Rocks'),
    (2, 'Stylus'),
    (3, 'Foo'),
    (3, 'Bar'),
    (3, 'Baz')
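The INSERT above assumes that dbo.SourceTable already exists. The original post does not give its definition, so the column types below are assumptions; a minimal sketch to run before the INSERT might be:

```sql
-- Hypothetical table definition; the column types are assumed, not taken from the post.
CREATE TABLE dbo.SourceTable
(
    ID   int           NOT NULL,  -- grouping key
    Name nvarchar(100) NOT NULL   -- value to concatenate
);
```

With the table in place and the rows inserted, the solution query can be run as-is.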
The query result:
ID          FullName
----------- ------------------------------
2           Stylus
3           Bar, Baz, Foo
1           Matt, Rocks
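To make step 1 concrete, the body of the Partitioned CTE can be run on its own against the sample data. The ORDER BY at the end is added here only to make the output easy to read; it is not part of the original query.

```sql
-- Step 1 in isolation: number the names within each ID and count them.
SELECT
    ID,
    Name,
    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
    COUNT(*) OVER (PARTITION BY ID) AS NameCount
FROM dbo.SourceTable
ORDER BY ID, NameNumber;

-- Rows for the sample data above:
-- ID  Name    NameNumber  NameCount
-- 1   Matt    1           2
-- 1   Rocks   2           2
-- 2   Stylus  1           1
-- 3   Bar     1           3
-- 3   Baz     2           3
-- 3   Foo     3           3
```

The recursive step then starts from the rows with NameNumber = 1 and walks upward, and the final filter keeps the rows where NameNumber equals NameCount.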