Reputation: 22580
I'm trying to pull data of Shipping info from a string containing line breaks. The shipping info maybe between 3 and 5 lines. Here's a few examples:
/* Just need from here */
Ship NEXT DAY PM to:
John Doe
Address Street
Sweet, NC 55555
/* to here */
Email: [email protected]
Phone: 555-555-555
Ship NEXT DAY AM to:
John Doe
Address Street
Apt. 555
Sweet, NC 55555
Email: [email protected]
Phone: 555-555-555
Ship NEXT DAY PM to:
John Doe
c/o Jane Doe
Address Street
Apt. 555
Sweet, NC 55555
Email: [email protected]
Phone: 555-555-555
UPDATE
I finally figured out a solution for the regex. Also one was mentioned in comments that seems to work well. Not sure which would be better based on any "why" but here they are:
Ship (.)+ to\:((\n)(.)+){1,5}
Ship (.)+ to\:[\s\S]*?(?=\nEmail)
Now I've got a whole new issue. Turns out, MySQL REGEXP does not function as I expected (which I would have known had I bothered to read rather than skim API). So Now I'm trying to pull what I just regexed out of a substring IF it exist. The IF part I got, I simply use the REGEX to check if Ship To is found and if not NULL then start on substring extraction. The Substring extraction is where I'm stuck now as it's proving rather tedious in using LOCATE
in multiple combinations to try and get exact character position. Anyone know a quick and easy way? I'll post what I'm working on as soon as I don't get an ERROR, lol.
So far working the following into If statement:
MID(`Notes`, (LOCATE(' to:\n', `Notes`) + 5), (LOCATE('\n\nEmail:', `Notes`) - LOCATE(' to:\n', `Notes`) - 5)) as ShipAddress
Upvotes: 2
Views: 373
Reputation: 22580
While the suggestion of the extra lib was probably a good way to go, I'd rather just go with what's available. In the end I made use of the Regex and a simple CASE
Statement as follows:
CASE
WHEN `Notes` REGEXP 'Ship (.)* to\:((\n)(.)+){1,5}'
THEN MID(`Notes`, (LOCATE(' to:\n', `Notes`) + 5), (LOCATE('\n\nEmail:', `Notes`) - LOCATE(' to:\n', `Notes`) - 5))
END as 'ShipAddress',
Of course this requires knowing the exact layout of each note being parsed. Luckily I wrote that code to and I do know the exact layout saved to the DB every time. :)
Upvotes: 0
Reputation: 32561
The following pattern should work:
Ship (.)+ to\:[\s\S]*?(?=\nEmail)
Here's explained what it does.
For the MySQL part, maybe you could use the lib_mysqludf_preg UDF which is also referenced here: https://stackoverflow.com/a/2742764/674700.
Upvotes: 1