mchangun

Reputation: 10322

Value.match() Regex in Google Refine

I am trying to extract a sequence of numbers from a column in Google Refine. Here is my code for doing it:

value.match(/[\d]+/)[0]

The data in my column is in the format of

abcababcabc 1234566 abcabcbacdf

The result is "null", and I have no idea why. It is also null if I try \w instead of \d.

Upvotes: 7

Views: 5142

Answers (1)

Tom Morris

Reputation: 10540

OpenRefine's match() has to match the entire string: it doesn't add implicit wildcards to the ends of the pattern as some systems do (and as one might expect). Try this pattern instead:

value.match(/.*?(\d+).*?/)[0]

You need the lazy/non-greedy qualifier (i.e. the question mark) on the wildcards, the leading one in particular, so that they don't gobble up some of your digits too. If you just use /.*(\d+).*/, you'll capture only a single digit (the last one), because the rest of them will be taken by the greedy leading .* pattern.

Full documentation for the implementation can be seen in Java's Pattern class docs.
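
To see the difference outside OpenRefine, here is a minimal Java sketch. It assumes that match() behaves like Java's Matcher.matches(), i.e. the whole string has to match, and that it returns the capture groups. The class name WholeStringMatch and the helper firstGroup are made up for this illustration and are not part of OpenRefine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeStringMatch {
    // Hypothetical helper: returns capture group 1 if the regex matches the
    // ENTIRE input (Matcher.matches()), otherwise null.
    public static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches() && m.groupCount() >= 1) {
            return m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        String value = "abcababcabc 1234566 abcabcbacdf";

        // No wildcards: the pattern cannot cover the whole string.
        System.out.println(firstGroup("(\\d+)", value));        // null

        // Lazy leading wildcard: stops as soon as the digits begin.
        System.out.println(firstGroup(".*?(\\d+).*?", value));  // 1234566

        // Greedy leading wildcard: leaves only the last digit for \d+.
        System.out.println(firstGroup(".*(\\d+).*", value));    // 6
    }
}
```

The first call mirrors the original attempt and shows why it comes back null; the last one shows the greedy case described above.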

Upvotes: 8
