RuZ
RuZ

Reputation: 77

Big Query / SQL finding "new" data in a date range

I have a pretty big event log with columns:

id, timestamp, text, user_id

The text field contains a variety of things, like:

Road: This is the road name
City: This is the city name
Type: This is a type
etc..

I would like to get the result to the following:

Given a start and end date, how many **new** users used a road (that haven't before) grouped by road.

I've got various parts of this working fine (like the total amount of users, the grouping by, date range and so on. The SQL for getting the new users is alluding me though, having tried solutions like SELECT AS STRUCT on sub queries amongst other things.

Ultimately, I'd love to see a result like:

road, total_users, new_users

Any help would be much appreciated.

Upvotes: 1

Views: 78

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270503

If I understand correctly, you want something like this:

select road, counif(seqnum = 1) as new_users, count(distinct user_id) as num_users
from (select l.*,
             row_number() over (partition by l.user_id, l.text order by l.timestamp) as seqnum
      from log l
      where l.type = 'Road'
     ) l
where timestamp >= @timestamp1 and timestamp < @timestamp2
group by road;

This assumes that you have a column that specifies the type (i.e. "road") and another column with the name of the road (i.e. "Champs-Elysees").

Upvotes: 1

Related Questions