Saad Rehman Shah

Reputation: 946

How do I extract only certain email addresses from a body of text that has many?

A number of email addresses are sent, along with their roles, in URL-encoded JSON, like so:

hl=en_US&token=AFNOsBbXXvng6zJmmPyIlya1dT48RKqmaQ%3A1441100947178&foreignService=explorer&shareService=explorer&authuser=0&locale=en_US&requestType=aclChange&itemIds=0B-i4kCZeNb05Y3FrVXFLYU41N0U&confirmed=false&modelChanges=%7B%22aclEntries%22%3A%5B%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22babar.memon%40gmail.com%22%2C%22id%22%3A%22112542596153041291285%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22babar.memon%40gmail.co%22%7D%2C%22role%22%3A30%7D%2C%7B%22scope%22%3A%7B%22iconUrl%22%3A%22%2Fc%2Fu%2F0%2Fphotos%2Fpublic%2FAIbEiAIAAABDCL_k77OCsqvJPSILdmNhcmRfcGhvdG8qKGM0MmEwMjBkZWQ0MDAzMzMwYjI2MjczZmNlZWVlMDA3NDUxMGI2N2MwAdau5OHbez_zFcRyTELkBcRF-Lv9%22%2C%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Saad%20Rehman%22%2C%22id%22%3A%22104436799417545912895%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A20%7D%2C%7B%22scope%22%3A%7B%22iconUrl%22%3A%22%2Fc%2Fu%2F0%2Fphotos%2Fpublic%2FAIbEiAIAAABDCL_k77OCsqvJPSILdmNhcmRfcGhvdG8qKGM0MmEwMjBkZWQ0MDAzMzMwYjI2MjczZmNlZWVlMDA3NDUxMGI2N2MwAdau5OHbez_zFcRyTELkBcRF-Lv9%22%2C%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Saad%20Rehman%22%2C%22id%22%3A%22104436799417545912895%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A60%7D%2C%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Asim%20Kazmi%22%2C%22id%22%3A%22118161687853857289891%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A20%7D%2C%7B%22scope%22%3A%7B%22scopeType%22%3A%22user%22%2C%22name%22%3A%22Asim%20Kazmi%22%2C%22id%22%3A%22118161687853857289891%22%2C%22me%22%3Afalse%2C%22requiresKey%22%3Afalse%2C%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A60%7D%5D%7D

What if I only want the email addresses that have role 30 next to them or, in another rule, all the email addresses that have role 20 next to them?

This is what I have done so far:

.*?email.22.3A.22([a-zA-Z0-9_.+%-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.%-]+?).22[^r]+role.22.3A30.7D

This is supposed to give me all the email addresses that have role 30 next to them, i.e. babar.memon%40gmail.co in the sample above. If I put .0 in place of 30, I get all the email addresses, which is what I want, except that I want them separately: first all the ones that have role 30, then the ones that have role 20, and so on.

This regex can be seen in action at https://regex101.com/r/rW0qO9/1

Upvotes: 0

Views: 74

Answers (2)

Salman Hasni

Reputation: 194

Try this regular expression:

.*?email.22.3A.22([a-zA-Z0-9_.+-]+?%40[a-zA-Z0-9_%-]+?[.][a-zA-Z0-9_.-]+).22[^r]+role.22.3A(20).7D

I have slightly changed the capturing group: an email address should not contain a special character like %, so % will only appear where '@' is encoded as %40.

Upvotes: 0

NeverHopeless

Reputation: 11233

A regex can be used to extract a pattern from a string, but it cannot return the matches in a specific order, since each match is a substring of the sample string and is found in the order it occurs there. You have to collect the matches and order them at a later stage.
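As a rough illustration of collecting first and ordering later, here is a minimal Python sketch. It is not the asker's regex: the pattern is a simplified variant that captures the role as a second group, and it assumes "role" directly follows the "scope" object, as in the question's data. `SAMPLE`, `PAIR` and `group_by_role` are hypothetical names, and `SAMPLE` is an abbreviated stand-in for the full request body.

```python
import re
from collections import defaultdict

# Abbreviated stand-in for the request body in the question (assumption:
# only the "email" and "role" parts of each ACL entry are kept).
SAMPLE = (
    "%22email%22%3A%22babar.memon%40gmail.co%22%7D%2C%22role%22%3A30%7D%2C"
    "%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A20%7D%2C"
    "%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A60%7D%2C"
    "%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A20%7D%2C"
    "%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A60%7D"
)

# Group 1: the encoded email address, group 2: the numeric role after it.
PAIR = re.compile(
    r"email%22%3A%22([A-Za-z0-9_.+-]+%40[A-Za-z0-9_.-]+)"
    r"%22%7D%2C%22role%22%3A(\d+)%7D"
)

def group_by_role(body):
    """Collect every (email, role) match, then bucket the emails by role."""
    grouped = defaultdict(list)
    for email, role in PAIR.findall(body):
        grouped[int(role)].append(email)
    return dict(grouped)

grouped = group_by_role(SAMPLE)
# Emit the roles in whatever order is wanted, e.g. 30 first, then 20.
for role in (30, 20):
    print(role, grouped.get(role, []))
```

The single pass keeps the matching simple, and the ordering decision moves into ordinary code, where it is easy to change.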

Alternatively, you can construct your regex dynamically by parameterizing the (.0) part of your regex with a variable set to 20, 30, 40 and so on, and extract the addresses for each role one at a time.
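One way this parameterization might look in Python is sketched below. The function name `emails_for_role` and the abbreviated `SAMPLE` string are assumptions made for illustration, and the pattern is a simplified variant of the regex in the question that assumes "role" directly follows the "scope" object.

```python
import re
from urllib.parse import unquote

# Abbreviated stand-in for the request body in the question (assumption:
# only the "email" and "role" parts of each ACL entry are kept).
SAMPLE = (
    "%22email%22%3A%22babar.memon%40gmail.co%22%7D%2C%22role%22%3A30%7D%2C"
    "%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A20%7D%2C"
    "%22email%22%3A%22this.saad%40gmail.com%22%7D%2C%22role%22%3A60%7D%2C"
    "%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A20%7D%2C"
    "%22email%22%3A%22asim.kazmi%40elasticaqa.info%22%7D%2C%22role%22%3A60%7D"
)

def emails_for_role(body, role):
    """Return the decoded email addresses immediately followed by `role`."""
    pattern = re.compile(
        r"email%22%3A%22([A-Za-z0-9_.+-]+%40[A-Za-z0-9_.-]+)"  # "email":"<address>
        r"%22%7D%2C%22role%22%3A"                              # "},"role":
        + re.escape(str(role))                                 # the requested role
        + r"%7D"                                               # closing brace
    )
    # unquote turns %40 back into '@' in each captured address.
    return [unquote(match) for match in pattern.findall(body)]

# Run the same pattern once per role, in the order the results are wanted.
for role in (30, 20):
    print(role, emails_for_role(SAMPLE, role))
# 30 ['babar.memon@gmail.co']
# 20 ['this.saad@gmail.com', 'asim.kazmi@elasticaqa.info']
```

Because the trailing `%7D` is part of the pattern, asking for role 3 does not accidentally match role 30.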

Upvotes: 1
