        Get envelope, i.e. overlapping time spans

                  Problem description

                  I have a table of online sessions like this (the empty rows are only there for readability):

                  ip_address  | start_time       | stop_time
                  ------------|------------------|------------------
                  10.10.10.10 | 2016-04-02 08:00 | 2016-04-02 08:12
                  10.10.10.10 | 2016-04-02 08:11 | 2016-04-02 08:20
                  
                  10.10.10.10 | 2016-04-02 09:00 | 2016-04-02 09:10
                  10.10.10.10 | 2016-04-02 09:05 | 2016-04-02 09:08
                  10.10.10.10 | 2016-04-02 09:05 | 2016-04-02 09:11
                  10.10.10.10 | 2016-04-02 09:02 | 2016-04-02 09:15
                  10.10.10.10 | 2016-04-02 09:10 | 2016-04-02 09:12
                  
                  10.66.44.22 | 2016-04-02 08:05 | 2016-04-02 08:07
                  10.66.44.22 | 2016-04-02 08:03 | 2016-04-02 08:11
                  

                  And I need the "envelope" online time spans:

                  ip_address  | full_start_time  | full_stop_time
                  ------------|------------------|------------------
                  10.10.10.10 | 2016-04-02 08:00 | 2016-04-02 08:20
                  10.10.10.10 | 2016-04-02 09:00 | 2016-04-02 09:15
                  10.66.44.22 | 2016-04-02 08:03 | 2016-04-02 08:11
                  

                  I have this query, which returns the desired result:

                  WITH t AS 
                      -- Determine full time-range of each IP
                      (SELECT ip_address, MIN(start_time) AS min_start_time, MAX(stop_time) AS max_stop_time FROM IP_SESSIONS GROUP BY ip_address),
                  t2 AS
                      -- compose ticks
                      (SELECT DISTINCT ip_address, min_start_time + (LEVEL-1) * INTERVAL '1' MINUTE AS ts
                      FROM t
                      CONNECT BY min_start_time + (LEVEL-1) * INTERVAL '1' MINUTE <= max_stop_time),
                  t3 AS 
                      -- get all "online" ticks
                      (SELECT DISTINCT ip_address, ts
                      FROM t2
                          JOIN IP_SESSIONS USING (ip_address)
                      WHERE ts BETWEEN start_time AND stop_time),
                  t4 AS
                      (SELECT ip_address, ts,
                          LAG(ts) OVER (PARTITION BY ip_address ORDER BY ts) AS previous_ts
                      FROM t3),
                  t5 AS 
                      (SELECT ip_address, ts, 
                          SUM(DECODE(previous_ts,NULL,1,0 + (CASE WHEN previous_ts + INTERVAL '1' MINUTE <> ts THEN 1 ELSE 0 END))) 
                              OVER (PARTITION BY ip_address ORDER BY ts ROWS UNBOUNDED PRECEDING) session_no
                      FROM t4)
                  SELECT ip_address, MIN(ts) AS full_start_time, MAX(ts) AS full_stop_time
                  FROM t5
                  GROUP BY ip_address, session_no
                  ORDER BY 1,2;
                  

                  However, I am concerned about performance. The table has hundreds of millions of rows and the time resolution is one millisecond (not one minute, as in the example), so CTE t3 is going to be huge. Does anybody have a solution that avoids the self-join and the "CONNECT BY"?

                  A single smart analytic function would be great.

                  Recommended answer

                  Try this one, too. I tested it as best I could, and I believe it covers all the possibilities, including coalescing adjacent intervals (10:15 to 10:30 and 10:30 to 10:40 are combined into a single interval, 10:15 to 10:40). It should also be quite fast, since it needs neither a self-join nor CONNECT BY.

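                  with m as
                          (
                           -- m_time: latest stop_time among all earlier sessions of the same IP
                           select ip_address, start_time,
                                     max(stop_time) over (partition by ip_address order by start_time
                                               rows between unbounded preceding and 1 preceding) as m_time
                           from ip_sessions
                           union all
                           -- sentinel row per IP carrying the overall latest stop_time
                           select ip_address, NULL, max(stop_time) from ip_sessions group by ip_address
                          ),
                       n as
                          (
                           -- keep only rows that open a new envelope, plus the sentinel rows
                           select ip_address, start_time, m_time
                           from m
                           where start_time > m_time or start_time is null or m_time is null
                          ),
                       f as
                          (
                           -- an envelope ends at the m_time of the next kept row (NULL start_time sorts last)
                           select ip_address, start_time,
                              lead(m_time) over (partition by ip_address order by start_time) as stop_time
                           from n
                          )
                  select * from f where start_time is not null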
                  /
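
                  To make the logic of the query above easier to follow, here is a minimal Python sketch of the same idea: walk the sessions in start-time order per IP, keep a running maximum of the stop times seen so far, and open a new envelope whenever a start time is strictly later than that maximum. The names `merge_envelopes` and `SESSIONS` are hypothetical, and timestamps are kept as ISO-style strings on the assumption that they compare chronologically. It illustrates the technique only and is not a replacement for the SQL.

```python
# Sample rows from the question: (ip_address, start_time, stop_time).
# Assumption: timestamps in 'YYYY-MM-DD HH:MM' form sort chronologically as strings.
SESSIONS = [
    ("10.10.10.10", "2016-04-02 08:00", "2016-04-02 08:12"),
    ("10.10.10.10", "2016-04-02 08:11", "2016-04-02 08:20"),
    ("10.10.10.10", "2016-04-02 09:00", "2016-04-02 09:10"),
    ("10.10.10.10", "2016-04-02 09:05", "2016-04-02 09:08"),
    ("10.10.10.10", "2016-04-02 09:05", "2016-04-02 09:11"),
    ("10.10.10.10", "2016-04-02 09:02", "2016-04-02 09:15"),
    ("10.10.10.10", "2016-04-02 09:10", "2016-04-02 09:12"),
    ("10.66.44.22", "2016-04-02 08:05", "2016-04-02 08:07"),
    ("10.66.44.22", "2016-04-02 08:03", "2016-04-02 08:11"),
]


def merge_envelopes(sessions):
    """Return one (ip, full_start, full_stop) tuple per run of overlapping sessions."""
    merged = []
    # Sorting by (ip, start, stop) plays the role of PARTITION BY ip ORDER BY start_time.
    for ip, start, stop in sorted(sessions):
        last = merged[-1] if merged else None
        if last is not None and last[0] == ip and start <= last[2]:
            # Overlaps or touches the current envelope: extend the running maximum.
            last[2] = max(last[2], stop)
        else:
            # New IP, or start is strictly after every earlier stop: open a new envelope.
            merged.append([ip, start, stop])
    return [tuple(row) for row in merged]


if __name__ == "__main__":
    for row in merge_envelopes(SESSIONS):
        print(" | ".join(row))
```

                  Because the comparison is `start <= running maximum`, two sessions that merely touch (one stops at 10:30 and the next starts at 10:30) end up in the same envelope, which matches the `start_time > m_time` test in the query.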
                  
