Newbie

Reputation: 29

Escaping in regex expression python

I'm looking to extract the id tag from the following field of data:

{"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,"country_codes":["ABC"],"scope":"poss"},"id":"8745485"}

The regex I'm using, '"id":\s*"(.*?)"', breaks when it encounters this record.

That is because only some records have the extra onhold field. Others look like this:

{"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"All clear 2019 \n ","id":"7462764"}

The whole file is of the form:

{"info":[{"purchased_at":"","product_desc":"","id":""}{..}]}

Upvotes: -2

Views: 41

Answers (2)

Liju

Reputation: 2313

Use the findall function from the re module to extract the value.

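import re
# One record from the question, including the nested "onhold" object
line = '{"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,"country_codes":["ABC"],"scope":"poss"},"id":"8745485"}'
# Raw string, so \s reaches the regex engine without an invalid-escape warning
print(re.findall(r'"id":\s*"(.*?)"', line))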

Output

['8745485']
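
Since the question mentions records both with and without the onhold field, here is a minimal sketch of the same pattern applied to a text containing both kinds. The shortened records and the names `text` and `ID_PATTERN` are illustrative assumptions, not part of the original data.

```python
import re

# Two shortened records modelled on the question's samples (illustrative only):
# one with the nested "onhold" object, one without it.
text = (
    '{"product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,'
    '"country_codes":["ABC"],"scope":"poss"},"id":"8745485"}'
    r'{"product_desc":"All clear 2019 \n ","id":"7462764"}'
)

# Compile once; the raw string keeps \s intact for the regex engine.
ID_PATTERN = re.compile(r'"id":\s*"(.*?)"')

# findall returns the captured group for every non-overlapping match.
ids = ID_PATTERN.findall(text)
print(ids)  # ['8745485', '7462764']
```

The nested braces do not matter here because the pattern only looks for the literal key followed by a quoted value.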

Upvotes: 0

Barbaros Özhan

Reputation: 65408

You can use the json library to extract the value for the key (id), rather than a regular expression:

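import json
# Named record rather than str, which would shadow the built-in type
record = '{"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,"country_codes":["ABC"],"scope":"poss"},"id":"8745485"}'

js = json.loads(record)

# json.loads returns a dict here, so the key can be looked up directly
print(js['id'])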

>>>
8745485   
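
The question says the whole file has the form {"info":[...]}. Assuming the file is valid JSON with comma-separated records (the question elides that part), a sketch of collecting every id could look like the following. The variable `raw` stands in for the file contents and is a made-up example.

```python
import json

# Hypothetical file contents following the {"info": [...]} shape from the question.
# A raw string keeps the JSON escape \n as two characters instead of a real
# newline, which json.loads would reject inside a string value.
raw = r'''{"info":[
  {"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,"country_codes":["ABC"],"scope":"poss"},"id":"8745485"},
  {"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"All clear 2019 \n ","id":"7462764"}
]}'''

data = json.loads(raw)

# Each element of data["info"] is a dict; the nested "onhold" object is simply
# another key and does not affect the lookup of "id".
ids = [record["id"] for record in data["info"]]
print(ids)  # ['8745485', '7462764']
```

When reading from disk, json.load(file_object) can replace json.loads(raw).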

Update: If you do need a regular expression, the search function of the re module with a suitable pattern might help:

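import re
# Named record rather than str, which would shadow the built-in type
record = '{"purchased_at":"2020-04-21T05:55:30.000Z","product_desc":"Garnier 2019 Shampoo","onhold":{"copyright":true,"country_codes":["ABC"],"scope":"poss"},"id":"8745485"}'

# search returns the first match, or None if nothing matches
s = re.search(r'id":"(.+?)"', record)

if s:
    print(s.group(1))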

>>>
8745485 
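
One caveat with the pattern id":"(.+?)" is that it has no opening quote before id, so it can also match a key that merely ends in "id". The sketch below uses a made-up record with an `order_id` key (not present in the question's data) to show the difference when the opening quote is included.

```python
import re

# Hypothetical record with an extra key ending in "id" (assumption for illustration).
record = '{"order_id":"A-1","id":"8745485"}'

# Without the leading quote, the first hit is inside "order_id".
loose = re.search(r'id":"(.+?)"', record)

# With the leading quote (and optional whitespace after the colon),
# only the exact key "id" matches.
strict = re.search(r'"id":\s*"(.+?)"', record)

print(loose.group(1))   # A-1
print(strict.group(1))  # 8745485
```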

Upvotes: 1
