tomcam

Reputation: 5295

Google Apps Script regex parsing address: Unable to match embedded newline

I am parsing PayPal emails to extract the shipping address. The portion I am interested in looks like this:

Shipping address - confirmed
Mr Example
4692 E. Willow ave
Possible 2nd street address
My Nice City, CA 95337
United States

I am working backwards from "United States". This works:

// Returns "United States" as expected:
var matched = messageText.match(/United States/m)

As does this:

// Returns "\nUnited States" as expected:
var matched = messageText.match(/\nUnited States/m)

This pattern works for zip code:

// Returns "95337-4423" as well as "95337"
var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?/m);

Combining the two to match the zip code followed by the country fails, no matter how I write the newline:

var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?\nUnited States/m)

Similar variations also fail:

var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?[\s\S]United States/m)
var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?[\r\n]United States/m)

What am I doing wrong to match a line preceded by another line?
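One way to narrow this down is to look at which characters actually sit between the zip code and the country. The following is a minimal sketch, not a confirmed diagnosis: the sample string is hypothetical and assumes the email body uses CRLF ("\r\n") line endings, which is common in email but not verified for these messages. The names probe and separator are made up for illustration.

```javascript
// Hypothetical sample: ASSUMES the email body uses CRLF ("\r\n") line endings.
var messageText = "My Nice City, CA 95337\r\nUnited States";

// Capture whatever lies between the zip code and the country, lazily.
var probe = messageText.match(/\d{5}(?:-\d{4})?([\s\S]*?)United States/);

// JSON.stringify turns invisible characters into visible escape sequences.
var separator = JSON.stringify(probe[1]);
console.log(separator); // prints "\r\n" for this sample (quotes included)
```

If the real message prints something other than a lone "\n", a pattern that expects exactly one newline character cannot match.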

Upvotes: 2

Views: 283

Answers (2)

spoonscen

Reputation: 117

/(.*(\s)){4}\d{5}(?:[-\s]\d{4})?[\s\S]?United States/m

The part (.*(\s)){4} matches any run of characters other than a newline, followed by one whitespace character, and repeats that four times. With single-character line endings, this picks up the three lines above the zip code line plus the city and state that precede the zip code, all of which the patterns above left out. If you also want the line that says "Shipping address - confirmed", just change that 4 to a 5!
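To see what the pattern returns, here is a small sketch run against the address from the question. The sample text is rebuilt by hand and assumes plain "\n" line endings; the variable names are illustrative only.

```javascript
// Hypothetical sample text; ASSUMES plain "\n" line endings.
var sampleText = [
  "Shipping address - confirmed",
  "Mr Example",
  "4692 E. Willow ave",
  "Possible 2nd street address",
  "My Nice City, CA 95337",
  "United States"
].join("\n");

var addressPattern = /(.*(\s)){4}\d{5}(?:[-\s]\d{4})?[\s\S]?United States/m;
var addressMatch = sampleText.match(addressPattern);

// Index 0 holds the whole match; split it back into lines.
var addressLines = addressMatch[0].split("\n");
console.log(addressLines[0]);     // prints "Mr Example"
console.log(addressLines.length); // prints 5
```

Note that [\s\S]? allows at most one character between the zip code and the country, so this sketch would not match if the lines were separated by "\r\n".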

Hope this helps!

Check it out here: http://regexr.com/3e4jt

Upvotes: 1

gotnull

Reputation: 27224

The first variation matches okay:

var messageText = "Shipping address - confirmed Mr Example 4692 E. Willow ave Possible 2nd street address My Nice City, CA 95337 United States";

var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?[\s\S]United States/m);

console.log(matched);

var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?[\r\n]United States/m);

console.log(matched);

You're better off going with \s+United States, which allows any run of whitespace characters between the zip code and the country.

var messageText = "Shipping address - confirmed Mr Example 4692 E. Willow ave Possible 2nd street address My Nice City, CA         95337 United States";

var matched = messageText.match(/\d{5}(?:[-\s]\d{4})?\s+United States/m);

console.log(matched);
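A likely reason \s+ helps, offered as an assumption rather than a confirmed diagnosis: if the email body uses CRLF ("\r\n") line endings, a "\r" sits between the zip code and the "\n", so any pattern that allows only one character there fails. The sketch below uses a hypothetical CRLF sample and illustrative variable names. The m flag is omitted because it only changes how ^ and $ behave, and none of these patterns use them.

```javascript
// Hypothetical sample: ASSUMES the email body uses CRLF ("\r\n") line endings.
var crlfText = "My Nice City, CA 95337\r\nUnited States";

// A single "\n" cannot bridge the gap: "\r" comes right after the zip code.
var singleNewline = crlfText.match(/\d{5}(?:[-\s]\d{4})?\nUnited States/);
console.log(singleNewline); // prints null

// "\s+" consumes both "\r" and "\n".
var anyWhitespace = crlfText.match(/\d{5}(?:[-\s]\d{4})?\s+United States/);
console.log(anyWhitespace !== null); // prints true

// "\r?\n" is stricter: exactly one line break, either LF or CRLF.
var oneLineBreak = crlfText.match(/\d{5}(?:[-\s]\d{4})?\r?\nUnited States/);
console.log(oneLineBreak !== null); // prints true
```

The \r?\n form is the tighter choice when the country must be on the very next line; \s+ also tolerates blank lines and stray spaces.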

Upvotes: 1
