Alexander Mills

Reputation: 100010

Python/JS regex to match any number that is less than X

I have some JSON data in a text file, something like:

..."priority":1,...
..."priority":3,...
..."priority":5,...

What I want to do is match all lines where priority is less than or equal to 3.

So the matches would be:

..."priority":1,...
..."priority":3,...

So my basic regex might look like:

/"priority":[0-9],/

And a simple script to test this:

const regex = /"priority":[0-9],/;

const str = '{"dateCreated":"2016-12-31T06:41:32.298Z","pid":15154,"count":1,"uid":"77d55631-36ab-4805-aa17-1c984c8d4d04","priority":1,"isRead":false,"line":"foo bar baz"}';

console.log(regex.test(str));   // true

This matches any single digit 0-9, but how can I match only X and below, where X might be a two-digit number?

Is there a concise way to do that?

The raw data in the file looks like this:

{"dateCreated":"2016-12-31T20:31:42.928Z","pid":5502,"count":0,"uid":"b21ac5eb-e25f-411c-80e9-955ff6eab964","priority":2,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:43.058Z","pid":5502,"count":1,"uid":"19f69863-7bf7-45f1-99b8-30fe332e88b9","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:43.262Z","pid":5502,"count":2,"uid":"ee8ca050-d1b5-4787-8d9f-78b17a9c4e9e","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:43.462Z","pid":5502,"count":3,"uid":"40fdbf37-0caf-4dde-b6fa-8522cd333b6e","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:43.666Z","pid":5502,"count":4,"uid":"d1928b11-e93c-413a-8e4e-49ac8a7b38b5","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:43.865Z","pid":5502,"count":5,"uid":"9fdff4bc-5126-43f9-85c6-d1212bac777b","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:44.071Z","pid":5502,"count":6,"uid":"3058cc78-eea6-4de9-96de-69c107b73724","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:44.271Z","pid":5502,"count":7,"uid":"52899ff3-2c17-4392-9409-a24152d4d15c","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:44.468Z","pid":5502,"count":8,"uid":"a775cd96-be9e-4713-83d3-92441d73443d","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:44.675Z","pid":5502,"count":9,"uid":"4fedb82b-1fb9-4049-870e-ff3bdf32fd5e","priority":2,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:44.877Z","pid":5502,"count":10,"uid":"c3fd144c-0182-4777-80a8-ce5db0da6525","priority":2,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:45.075Z","pid":5502,"count":11,"uid":"70fde6ea-f57c-4d2c-9eb0-d2b04fae23f2","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:45.277Z","pid":5502,"count":12,"uid":"540db3c7-bfe0-4d09-b4b6-d8922455c8d5","priority":3,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:45.483Z","pid":5502,"count":13,"uid":"1eca7c3b-be01-4457-885f-b7cb807952aa","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:45.683Z","pid":5502,"count":14,"uid":"d7f3cf02-e302-49d5-9977-fac592349335","priority":1,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:45.881Z","pid":5502,"count":15,"uid":"8760aa12-5c10-456f-8ae1-b570bc939ff7","priority":2,"isRead":false,"line":"foo bar baz"}
{"dateCreated":"2016-12-31T20:31:46.087Z","pid":5502,"count":16,"uid":"152a51da-d253-4245-999d-6518318c305d","priority":1,"isRead":false,"line":"foo bar baz"}

(It's newline-separated JSON data.)

Upvotes: 0

Views: 128

Answers (2)

Chris Larson

Reputation: 1724

This isn't a good case for regex. Instead, you should be loading the file using the json module, and then simply testing for the value of "priority".

import json

with open('./sample_json.json', 'r') as json_data_file:
    json_data = json_data_file.readlines()
# The with block closes the file automatically; no explicit close is needed.

Using a list comprehension, we create a list of dicts, one for each line in json_data:

json_array = [json.loads(line) for line in json_data]

Finally, we run your less-than test for each dict in json_array and print each result:

for line in json_array:
    if line['priority'] < 20:
        print("Success!")

Note that we could accomplish this by using a for loop instead of the list comprehension, with the same result:

json_array = []
for line in json_data:
    json_array.append(json.loads(line))

To learn more about the json module, read up on it here:

https://docs.python.org/3/library/json.html


For easier copy/pasting, here's the uninterrupted code:


import json

with open('./sample_json.json', 'r') as json_data_file:
    json_data = json_data_file.readlines()
# The with block closes the file automatically; no explicit close is needed.

# Using list comprehension to populate json_array
json_array = [json.loads(line) for line in json_data]

# Equivalent alternative: a for loop instead of the list comprehension
# (use one or the other, not both)
# json_array = []
# for line in json_data:
#     json_array.append(json.loads(line))

for line in json_array:
    if line['priority'] < 20:
        print("Success!")

Note that with the parsed JSON as a dict, you have access to any of the keys for each line of your JSON:

for item in json_array:
    for key in item.keys():
        print("KEY:", key, "VALUE:", item[key])
    print()
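
The steps above can be combined into one small filter. This is only a sketch: the function name filter_by_priority and the shortened sample lines are hypothetical and not part of the original answer. It uses the "less than or equal to X" test from the question and skips blank lines.

```python
import json

def filter_by_priority(lines, max_priority):
    """Yield parsed records whose "priority" is <= max_priority."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Records without a "priority" key are treated as non-matching.
        if record.get("priority", float("inf")) <= max_priority:
            yield record

# Hypothetical, shortened sample lines in the same shape as the question's data.
sample = [
    '{"count":0,"priority":2,"line":"foo bar baz"}',
    '{"count":1,"priority":5,"line":"foo bar baz"}',
    '{"count":2,"priority":3,"line":"foo bar baz"}',
]
matches = list(filter_by_priority(sample, 3))
print([r["count"] for r in matches])  # [0, 2]
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, which avoids reading the whole file into memory first.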

Upvotes: 2

Alexander Mills

Reputation: 100010

Using JSON load/parse is definitely the right way to do this, but in terms of solving it with regex, there is the brute-force method, like so:

"\"priority\":(1|2|3|4),"

This should work fine for small integers (the pattern above matches priorities 1 through 4) and is unlikely to be error-prone.
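
For a two-digit X, writing the alternation by hand gets tedious. Below is a rough Python sketch of two ways around that, neither taken from the original answers: generating the alternation for a given threshold, or capturing the digits with a regex and comparing them as an integer. The helper names priority_regex and priority_at_most are made up for illustration, and both assume the value is a non-negative integer followed by a comma, as in the question's pattern.

```python
import re

def priority_regex(max_priority):
    """Build a pattern matching "priority":N, for every integer 0 <= N <= max_priority."""
    alternatives = "|".join(str(n) for n in range(max_priority + 1))
    # The trailing comma keeps "1" from matching the start of "13".
    return re.compile(r'"priority":(?:%s),' % alternatives)

def priority_at_most(line, max_priority):
    """Capture the number with a regex, then compare it numerically."""
    m = re.search(r'"priority":(\d+),', line)
    return m is not None and int(m.group(1)) <= max_priority

pattern = priority_regex(12)
print(bool(pattern.search('..."priority":12,...')))  # True
print(bool(pattern.search('..."priority":13,...')))  # False
print(priority_at_most('..."priority":7,...', 12))   # True
```

The second helper is usually the simpler choice, since the regex only locates the number and the comparison stays ordinary integer arithmetic. Neither helper matches when "priority" is the last key in the object, because the value would then be followed by a closing brace rather than a comma.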

Upvotes: 1
