SQL: counting time-series events when some start or stop entries are missing

2020-02-22 sql sql-server

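I have a series of start/stop events and need to count the total number of events, but sometimes a start or stop event is missing, for example: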

Time   Event
10:50   START
10:52   STOP
10:59   START
11:01   STOP
11:45   STOP

Count(Event) where Event = 'START' returns 2, but I also need to count the missing START, so the result should be 3. Any ideas on how to achieve this? Thanks!

Answers

Two constraints must hold for the events to be countable.

  1. Two START-STOP periods cannot overlap.
  2. A START followed directly (in time order) by a STOP cannot come from two different events, i.e. START + (missing STOP) + (missing START) + STOP cannot occur.

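If these conditions hold, a simple state machine can detect the "missing" events. This kind of row-by-row logic can (almost always) be implemented with a cursor.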

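Note: to illustrate how general the cursor approach is, see also my other answers A (updating a column) and B (a tedious algorithm). The code structure is very similar.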

Test dataset

use [testdb];
if OBJECT_ID('testdb..test') is not null
    drop table testdb..test;

create table test (
    [time] varchar(50),
    [event] varchar(50)
);

insert into test ([time], [event])
values ('10:50', 'START'),('10:52', 'STOP'),('10:59', 'START'),
       ('11:01', 'STOP'),('11:45', 'STOP'),('11:50', 'STOP'),('11:55', 'START');

select * from test;
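
One thing worth keeping in mind: [time] is stored as varchar here, so order by [time] compares strings. That matches chronological order only while every value is a zero-padded HH:MM within a single day. A minimal sketch of making the ordering explicit, assuming every stored value can be converted to the time type:

```sql
-- Sketch only: sort by the parsed time instead of the raw string.
-- Assumes every [time] value converts cleanly with cast(... as time).
select [time], [event]
from test
order by cast([time] as time);
```

The same order by expression could be used in the cursor declaration if the data is not guaranteed to be zero-padded.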

/* cursor variables */

-- storage for each row
declare @time varchar(50),
        @event varchar(50),
        @state int = 0,  -- state variable
        @count int = 0;  -- event count

-- open a cursor ordered by [time]
declare cur CURSOR local
for select [time], [event]
    from test
    order by [time]
open cur;

/* main loop */

while 1=1 BEGIN

    /* fetch next row and check termination condition */
    fetch next from cur 
        into @time, @event;

    -- termination condition
    if @@FETCH_STATUS <> 0 begin
        -- check unfinished START before exit
        if @state = 1
            set @count += 1;
        -- exit loop
        break;
    end

    /* program body */

    -- case 1. state = 0 (clear state)
    if @state = 0 begin
        -- 1-1. normal case -> go to state 1
        if @event = 'START'
            set @state = 1;
        -- 1-2. a STOP without START -> keep state 0 and count++
        else if @event = 'STOP'
            set @count += 1;
        -- guard
        else 
            print '[Error] Bad event name: ' + @event
    end
    -- case 2. state = 1 (start is found)
    else if @state = 1 begin
        -- 2-1. normal case -> go to state 0 and count++
        if @event = 'STOP' begin
            set @count += 1;
            set @state = 0;            
        end
        -- 2-2. a START without STOP -> keep state 1 and count++
        else if @event = 'START'
            set @count += 1;
        -- guard
        else 
            print '[Error] Bad event name: ' + @event
    end
END

-- cleanup
close cur;
deallocate cur;

Result

print @count;  -- correct answer: 5

Tested on SQL Server 2017 (Linux Docker image, latest version).
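
For comparison, the state machine's counting rule can also be sketched without a cursor. In the loop above every STOP adds one to the count, and a START adds one only when it is left unfinished (followed by another START, or by the end of the data). Under the same two constraints, and assuming the events are only ever 'START' or 'STOP', that can be written with lead(); the alias next_event is introduced here purely for illustration:

```sql
-- Set-based sketch of the same counting rule (not the cursor code itself).
-- Every STOP closes one event, whether or not its START is present;
-- a START that is not followed by a STOP is an event whose STOP is missing.
select count(*) as event_count
from (select [event],
             lead([event]) over (order by [time]) as next_event
      from test
     ) t
where [event] = 'STOP'
   or ([event] = 'START' and (next_event is null or next_event <> 'STOP'));
```

On the test dataset above this counts the four STOP rows plus the trailing START at 11:55, which matches the cursor's result of 5.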

Well, you can count every start, and then count every "stop" whose preceding event is not a start:

select count(*)
from (select t.*,
             lag(event) over (order by time) as prev_event
      from t
     ) t
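where event = 'START' or
      -- a STOP with no START before it: previous row is a STOP, or there is no previous row
      (event = 'STOP' and (prev_event is null or prev_event = 'STOP'));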
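
To see which rows such a query picks up, it can help to look at the lag() column on its own. The sketch below assumes a sample table named t holding the five rows from the question; the column sizes are arbitrary choices for illustration:

```sql
-- Hypothetical sample table t, filled with the rows from the question.
create table t ([time] varchar(5), [event] varchar(10));

insert into t ([time], [event])
values ('10:50', 'START'), ('10:52', 'STOP'), ('10:59', 'START'),
       ('11:01', 'STOP'), ('11:45', 'STOP');

-- Show each row next to the event that precedes it in time order.
select [time], [event],
       lag([event]) over (order by [time]) as prev_event
from t
order by [time];
```

In this data the two START rows are counted, and the STOP at 11:45 is counted because its prev_event is also a STOP, which gives the 3 the question asks for. The first row has a NULL prev_event, since nothing precedes it.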
