Problem description
The problem splits into two parts: how do I check which working days are missing from my database, and, if some are missing, how do I add them and fill each new row with the values from the closest date?
First part: check for and find the missing days. Should I use a gap approach like in the example below?
SELECT t1.col1 AS startOfGap, MIN(t2.col1) AS endOfGap
FROM
(SELECT col1 = theDate + 1 FROM sampleDates tbl1
WHERE NOT EXISTS(SELECT * FROM sampleDates tbl2
WHERE tbl2.theDate = tbl1.theDate + 1)
AND theDate <> (SELECT MAX(theDate) FROM sampleDates)) t1
INNER JOIN
(SELECT col1 = theDate - 1 FROM sampleDates tbl1
WHERE NOT EXISTS(SELECT * FROM sampleDates tbl2
WHERE tbl1.theDate = tbl2.theDate + 1)
AND theDate <> (SELECT MIN(theDate) FROM sampleDates)) t2
ON t1.col1 <= t2.col1
GROUP BY t1.col1;
Then I need to see which existing date is closest to the missing one and fill the newly inserted date (the one that was missing) with the values from that closest date. Some time ago I came up with something to get the closest value from a row, but this time I need to adapt it to check both downwards and upwards.
SELECT
t,A, C,Y,
COALESCE(Y,
(SELECT TOP (1) Y
FROM tableT AS p2
WHERE
p2.Y IS NOT NULL
AND p2.[t] <= p.[t] and p.C = p2.C
ORDER BY p2.[t] DESC)) as 'YNew'
FROM tableT AS p
order by c, t
How can I combine the two into one?
Thanks
Expected results
Date 1mA
20.12.2012 0.152
21.12.2012 0.181
22 weekend so it's skipped (they are skipped automatically)
23 weekend -,-
24 missing
25 missing
26 missing
27.12.2012 0.173
28.12.2012 0.342
Date 1mA
20.12.2012 0.152
21.12.2012 0.181
22 weekend so it's skipped (they are skipped automatically)
23 weekend 0.181
24 missing 0.181
25 missing 0.181
26 missing 0.173
27.12.2012 0.173
28.12.2012 0.342
So 24, 25 and 26 are not even there with NULL values. They are simply not there.
EDIT 2: For taking the closest value, let's consider the scenario in which I'm always looking above, i.e. always going back to the previous available date when a day is missing.
Date 1mA
20.12.2012 0.152
21.12.2012 0.181
22 weekend so it's skipped (they are skipped automatically)
23 weekend 0.181
24 missing 0.181
25 missing 0.181
26 missing 0.181
27.12.2012 0.173
28.12.2012 0.342
Recommended answer
For these types of query you gain significant performance benefits from creating a calendar table containing every date you'll ever need to test. (If you're familiar with the term "dimension table", this is just one such table, used to enumerate every date of interest.)
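A calendar table like this could be built in several ways. The following is only a minimal sketch, assuming SQL Server, a table named calendar with a single calendar_date column (the names the queries in this answer rely on), and an arbitrary date range chosen for illustration. It keeps Monday to Friday only and does not account for public holidays.

```sql
-- Hypothetical calendar table: one row per working day (Monday to Friday).
CREATE TABLE calendar (
    calendar_date DATE NOT NULL PRIMARY KEY
);

-- Assumed range; adjust to cover every date you will ever need to test.
DECLARE @start DATE = '2012-01-01';
DECLARE @end   DATE = '2015-12-31';

-- Recursive CTE that generates one row per day between @start and @end.
WITH days AS (
    SELECT @start AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d) FROM days WHERE d < @end
)
INSERT INTO calendar (calendar_date)
SELECT d
FROM days
-- 1900-01-01 was a Monday, so the remainder is 0 = Monday ... 6 = Sunday,
-- independent of the session's DATEFIRST setting.
WHERE DATEDIFF(DAY, '19000101', d) % 7 < 5
-- Lift the default recursion limit of 100 levels.
OPTION (MAXRECURSION 0);
```

Public holidays, if they should be skipped too, would have to be deleted from (or flagged in) this table separately.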
Also, the query as a whole can become significantly simpler.
SELECT
cal.calendar_date AS data_date,
CASE WHEN prev_data.gap <= next_data.gap
THEN prev_data.data_value
ELSE COALESCE(next_data.data_value, prev_data.data_value)
END
AS data_value
FROM
calendar AS cal
OUTER APPLY
(
SELECT TOP(1)
data_date,
data_value,
DATEDIFF(DAY, data_date, cal.calendar_date) AS gap
FROM
data_table
WHERE
data_date <= cal.calendar_date
ORDER BY
data_date DESC
)
prev_data
OUTER APPLY
(
SELECT TOP(1)
data_date,
data_value,
DATEDIFF(DAY, cal.calendar_date, data_date) AS gap
FROM
data_table
WHERE
data_date > cal.calendar_date
ORDER BY
data_date ASC
)
next_data
WHERE
cal.calendar_date BETWEEN '2015-01-01' AND '2015-12-31'
;
EDIT: Reply to your comment with a different requirement
Always getting "the value above" is easier, and inserting those values into a table is easy enough...
INSERT INTO
data_table
SELECT
cal.calendar_date,
prev_data.data_value
FROM
calendar AS cal
CROSS APPLY
(
SELECT TOP(1)
data_date,
data_value
FROM
data_table
WHERE
data_date <= cal.calendar_date
ORDER BY
data_date DESC
)
prev_data
WHERE
cal.calendar_date BETWEEN '2015-01-01' AND '2015-12-31'
AND cal.calendar_date <> prev_data.data_date
;
Note: You could add the condition prev_data.gap > 0 to the WHERE clause of the bigger query above (as an additional AND condition) to only get dates that don't already have data.
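To see how the INSERT behaves, it can be tried against the sample data from the question. The script below is a sketch only: the temporary tables #data_table and #calendar, their column types, and the hand-written list of working days are assumptions made for illustration, standing in for the real data_table and calendar.

```sql
-- Sample data from the question: 24, 25 and 26 December 2012 are missing.
CREATE TABLE #data_table (
    data_date  DATE PRIMARY KEY,
    data_value DECIMAL(9, 3)
);
INSERT INTO #data_table (data_date, data_value) VALUES
    ('2012-12-20', 0.152), ('2012-12-21', 0.181),
    ('2012-12-27', 0.173), ('2012-12-28', 0.342);

-- Working days only; the weekend (22 and 23 December) is left out.
CREATE TABLE #calendar (calendar_date DATE PRIMARY KEY);
INSERT INTO #calendar (calendar_date) VALUES
    ('2012-12-20'), ('2012-12-21'), ('2012-12-24'), ('2012-12-25'),
    ('2012-12-26'), ('2012-12-27'), ('2012-12-28');

-- Insert every calendar date that has no row yet,
-- copying the value from the latest earlier date.
INSERT INTO #data_table (data_date, data_value)
SELECT
    cal.calendar_date,
    prev_data.data_value
FROM #calendar AS cal
CROSS APPLY (
    SELECT TOP (1) data_date, data_value
    FROM #data_table
    WHERE data_date <= cal.calendar_date
    ORDER BY data_date DESC
) AS prev_data
WHERE cal.calendar_date BETWEEN '2012-12-20' AND '2012-12-28'
  AND cal.calendar_date <> prev_data.data_date;

-- Inspect the result.
SELECT data_date, data_value
FROM #data_table
ORDER BY data_date;
```

With this data, 24, 25 and 26 December should each be filled with 0.181, the value from 21 December, which corresponds to the scenario described in EDIT 2 of the question.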