Reputation: 7
I have a huge file in this format:
{
"Name": "Alex",
"Id": 1,
"time": "2022-01-01T04:29:18.9099882Z"
}
{
"Name": "Homer",
"Id": 2,
"time": "2022-01-02T04:29:18.9099882Z"
}
{
"Name": "Bart",
"Id": 3,
"time": "2022-01-03T04:29:18.9099882Z"
}
How can I filter this file by date range and specific period of time? I was trying to solve this using jq, but was not able to do so. If there another way how to filter it without jq, please let me know.
Upvotes: 0
Views: 213
Reputation: 27443
A powershell example. Taking the input as an array:
[{
"Name": "Alex",
"Id": 1,
"time": "2022-01-01T04:29:18.9099882Z"
},
{
"Name": "Homer",
"Id": 2,
"time": "2022-01-02T04:29:18.9099882Z"
},
{
"Name": "Bart",
"Id": 3,
"time": "2022-01-03T04:29:18.9099882Z"
}]
For some reason in powershell 5, I have use parentheses to complete the conversion first (could use a lot of memory). The time fields are converted to the .Net [datetime] type, and then it's easy to compare them. The right side of a comparision automatically gets converted to a datetime if it can be.
(get-content file.json | convertfrom-json) |
foreach-object { $_.time = [datetime]$_.time; $_ } |
where-object { $_.time -ge '1/1/22' -and $_.time -le '1/2/22' }
Name Id time
---- -- ----
Homer 2 1/1/2022 11:29:18 PM
Here's a workaround for the original file of consecutive json objects:
get-content file.json -ReadCount 5 |
foreach-object { $_ | convertfrom-json } |
foreach-object { $_.time = [datetime]$_.time; $_ } |
where-object { $_.time -ge '1/1/22' -and $_.time -le '1/2/22' }
Upvotes: 0
Reputation: 36151
Doesn't a simple select
do what you want? For example using plain string comparison makes easy date ranges inclusive on the left and exclusive on the right:
jq 'select(.time > "2022-01-01" and .time < "2022-01-03")'
{
"Name": "Alex",
"Id": 1,
"time": "2022-01-01T04:29:18.9099882Z"
}
{
"Name": "Homer",
"Id": 2,
"time": "2022-01-02T04:29:18.9099882Z"
}
You could also convert the date string into e.g. Unix time ("seconds since the Epoch") using fromdate
and operate on that as a number (after employing sub
to cut off second fractions in order to obtain an ISO 8601 compatible date):
jq 'select(.time | sub("\\.\\d+"; "") | fromdate < 1641100000)'
{
"Name": "Alex",
"Id": 1,
"time": "2022-01-01T04:29:18.9099882Z"
}
{
"Name": "Homer",
"Id": 2,
"time": "2022-01-02T04:29:18.9099882Z"
}
Or, for the sake of another example, using strptime
to convert the date into a broken-down array (containing in this order: year, (zero-based) month, day, hour, minute, second, day of week, and day of year) and then filtering e.g. for Mondays (represented as 1
in that array at the (zero-based) position 6
):
jq 'select(.time | sub("\\.\\d+"; "") | strptime("%FT%TZ") | .[6] == 1)'
{
"Name": "Bart",
"Id": 3,
"time": "2022-01-03T04:29:18.9099882Z"
}
Upvotes: 2