Reputation: 14581
I'm having trouble figuring out how to convert a PySpark window statement into the equivalent SQL.
Example:
from pyspark.sql.functions import window

eventsDF \
    .groupBy(
        "deviceId",
        window("eventTime", "10 minutes", "5 minutes")) \
    .count()
It should be something like:
select window, deviceId, count(deviceId)
from events
group by window eventTime 10 minutes 5 minutes, deviceId
Upvotes: 1
Views: 377
Reputation: 42332
You need to fix some syntax errors: window is a function, so its arguments must be wrapped in parentheses, and the interval strings must be quoted.
select
    window(eventTime, '10 minutes', '5 minutes'),
    deviceId,
    count(deviceId)
from events
group by
    window(eventTime, '10 minutes', '5 minutes'),
    deviceId
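A minimal way to run this from PySpark (a sketch only: it assumes a SparkSession available as spark, and the view name events and variable name windowedCounts are illustrative):

# Register the DataFrame as a temporary view so SQL can query it
eventsDF.createOrReplaceTempView("events")

windowedCounts = spark.sql("""
    select
        window(eventTime, '10 minutes', '5 minutes'),
        deviceId,
        count(deviceId)
    from events
    group by
        window(eventTime, '10 minutes', '5 minutes'),
        deviceId
""")

windowedCounts.show(truncate=False)

This should return the same rows as the groupBy version in the question.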
Upvotes: 1
Reputation: 419
I believe the following query gives the result you expected:
select deviceId, count(deviceId)
from events
group by window(eventTime, '10 minutes', '5 minutes'), deviceId
The group by will group the rows into all the windows you expect.
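If you want to see the windows themselves, one option (a sketch; the spark session, the events view, and the aliases w and cnt are assumptions) is to select the window struct as well and expand its start and end fields:

# Select the window struct alongside the aggregate
result = spark.sql("""
    select window(eventTime, '10 minutes', '5 minutes') as w,
           deviceId,
           count(deviceId) as cnt
    from events
    group by window(eventTime, '10 minutes', '5 minutes'), deviceId
""")

# Each 10-minute window, sliding every 5 minutes, appears as its own row
result.select("w.start", "w.end", "deviceId", "cnt") \
      .orderBy("w.start") \
      .show(truncate=False)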
Upvotes: 0