Kate Topham

Reputation: 11

OpenRefine: regex returns null with match() but true with contains()

I'm trying to extract dates from a column of string values in OpenRefine. All dates are formatted with either periods or dashes between values, e.g. "a_string_12-2-15" or "3.12.99_another_string".

I tried value.contains(/[0-9]+[.-][0-9]+[.-][0-9]+/) and they all returned true. However, value.match(/[0-9]+[.-][0-9]+[.-][0-9]+/) returns null. I've also tried replacing [0-9] with \d and that hasn't fixed it. What am I doing wrong?

Upvotes: 1

Views: 423

Answers (1)

Ettore Rizza

Reputation: 2830

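The match function is counter-intuitive and doesn't work the way you think: it tests the regex against the entire string and returns an array of its capture groups, so a pattern that covers only part of the value returns null. You can do without it. Since OpenRefine 3 there is a find function that does exactly what you want. :)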

So try this instead:

value.find(/[0-9]+[.-][0-9]+[.-][0-9]+/).join(',')

The .join(',') part is just there in case you have several dates in a single string. If each string holds only one date, you can take the first element instead:

value.find(/[0-9]+[.-][0-9]+[.-][0-9]+/)[0]

Just for the record, you can get the same result with match using this horror, which still won't do what you want if there are multiple dates in the same string, because it captures only the first one:

value.match(/.*?([0-9]+[.-][0-9]+[.-][0-9]+).*/).join(',')
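If it helps to see the difference outside OpenRefine, here is a rough Python analogy. It is only a sketch: GREL uses Java regular expressions, Python's re module is a stand-in, and the helper names (contains, match_whole, find_all, match_wrapped) are made up for illustration and are not GREL functions. The last sample string with two dates is also invented for the example.

```python
import re

# Same date pattern as in the question: digits separated by "." or "-".
DATE = r"[0-9]+[.-][0-9]+[.-][0-9]+"


def contains(value):
    # Like GREL contains(): true if the pattern occurs anywhere in the string.
    return re.search(DATE, value) is not None


def match_whole(value):
    # Like GREL match(): the pattern must cover the whole string;
    # the result is the list of capture groups, or None on failure.
    m = re.fullmatch(DATE, value)
    return list(m.groups()) if m else None


def find_all(value):
    # Like GREL find(): every substring that matches the pattern.
    return re.findall(DATE, value)


def match_wrapped(value):
    # The "horror": pad the pattern with .*? and .* so it spans the whole
    # string, and capture the date in a group. Only one date is captured.
    m = re.fullmatch(r".*?(" + DATE + r").*", value)
    return list(m.groups()) if m else None


for s in ["a_string_12-2-15", "3.12.99_another_string", "1-2-03_and_4.5.06"]:
    print(s, contains(s), match_whole(s), find_all(s), match_wrapped(s))
```

On every sample, contains is true and match_whole is None, which mirrors the behaviour in the question. For the string with two dates, find_all returns both while match_wrapped returns only the first.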

Upvotes: 2
