Gregg Rosenstein
Gregg Rosenstein

Reputation: 111

Identifying Specific Time Segments in Excel

I have 10,000 rows of data in Excel and the first column is the time of a data recording (hh:mm:ss). I want to filter/extract (in the front end) only the rows where the readings are at least six consecutive seconds - these are signal strength recordings and I only consider readings of at least six consecutive seconds to be valid for my purposes. I would only want to extract the rows in green in the attached image. Separate ask - what if I wanted segments of only six to eight seconds? Thanks!

enter image description here

Upvotes: 1

Views: 160

Answers (1)

David Leal
David Leal

Reputation: 6769

With O365 (assuming no excel version constraints per tags used in the question) you can try the following in cell D1:

=LET(rng, A1:B32, ref, SCAN("", INDEX(rng,,1), LAMBDA(ac,a, IF(ac="", 1,
  IF((a - TIME(0,0,1)) = OFFSET(a,-1,0), 0, 1)))), size, ROWS(ref),
  seq, SEQUENCE(size), start, FILTER(seq, ref=1),
  end, VSTACK(DROP(start-1, 1), size), DROP(REDUCE("", start, LAMBDA(ac,s,
   LET(e, XLOOKUP(s, start, end), f, FILTER(rng, (seq >= s) * (seq <= e)),
    IF(ROWS(f) > 5, VSTACK(ac, f), ac)))),1))

Note: The same result can be achieved to avoid using the volatile function OFFSET, not using SCAN, and instead calculating the ref value using comparison as follow, where A represents the first column of rng (we keep the same logic as previous formula, but it can be simplified changing the logic to check 0 instead 1 to remove 1-N()):

=LET(rng,A1:B32,A,INDEX(rng,,1), ref,VSTACK(1,1-N(DROP(A,1)
     =(DROP(A,-1)+TIME(0,0,1)))), size, ROWS(ref), seq, SEQUENCE(size), 
   start, FILTER(seq, ref=1), end, VSTACK(DROP(start-1, 1), size), 
   DROP(REDUCE("", start, LAMBDA(ac,s, LET(e, XLOOKUP(s, start, end), 
   f, FILTER(rng, (seq >= s) * (seq <= e)),
    IF(ROWS(f) > 5, VSTACK(ac, f), ac)))),1))

Here is the output: excel output

Then adjust the input range (rng) to your real case.

Explanation

Using LET for easy reading and composition. The name ref identifies with 1, the start of each group of values with consecutive seconds of the first column from rng. It has the same number of rows as the input data.

We use DROP/REDUCE/VSTACK pattern to generate iteratively the output with the data that satisfies the conditions the group has more than 5 consecutive elements. Check the following answer to this question: how to transform a table in Excel from vertical to horizontal but with different length

Via REDUCE we iterate over all start group positions (start). For each start group position (s), finds the corresponding end group position (e) via XLOOKUP. Filter the range (rng) for rows (seq) between start (s) and end (e) rows via FILTER. Append the filter result (f) only if the number of rows is bigger than 5 via the VSTACK function.

Conditional Format

The formula provided is too large to be used for a conditional format formula (255 maximum number of characters). A possible workaround could be to use a helper column (you can hide it). It returns TRUE if the row belongs to a valid group, otherwise returns FALSE. If no valid group were found it returns FALSE too (we use this trick: NOT(SEQUENCE(size,,1,0)=1) to generate a constant column with FALSE values). Then you can highlight the column A values that correspond with TRUE values in the helper column.

=LET(A, TOCOL(A:A,1), ref, VSTACK(1,N(DROP(A,1) = (DROP(A,-1)+TIME(0,0,1)))),
  size, ROWS(ref), seq, SEQUENCE(size), start, FILTER(seq, ref=0), 
  end, VSTACK(DROP(start-1, 1), size), gr, FILTER(HSTACK(start, end), 
   (end-start +1) > 5, -1), sgr, INDEX(gr,,1), egr, INDEX(gr,,2),
  IF(@gr=-1, NOT(SEQUENCE(size,,1,0)=1), MAP(seq, LAMBDA(x, 
   LET(overlaps, SUM((sgr <= x) * (egr >= x)), IF(overlaps = 1, TRUE, FALSE))))))

as input, we use the entire column and filter by nonblank values via the TOCOL function (named A) using the second input argument of this function. In case of more than one overlapping interval, it returns FALSE too, just for testing purposes, it indicates some error calculating start and end names because per design it should never happen.

Tip: The previous formula can be used for the original purpose, using its output as a condition for a FILTER function to select only the input range where the value is TRUE. It is a matter of taste which route you want to go. For example, as follow:

=LET(rng,A1:B32,A,INDEX(rng,,1), ref, VSTACK(1,N(DROP(A,1) 
    = (DROP(A,-1)+TIME(0,0,1)))), size, ROWS(ref), seq, SEQUENCE(size), 
  start, FILTER(seq, ref=0), end, VSTACK(DROP(start-1, 1), size), 
  gr, FILTER(HSTACK(start, end), (end-start +1) > 5, -1), 
  sgr, INDEX(gr,,1), egr, INDEX(gr,,2),
  incl, IF(@gr=-1, NOT(SEQUENCE(size,,1,0)=1), MAP(seq, LAMBDA(x, 
   LET(overlaps, SUM((sgr <= x) * (egr >= x)), IF(overlaps = 1, TRUE, FALSE))))),
  FILTER(rng, incl=TRUE, "No Group found"))

Upvotes: 1

Related Questions