Lara M.

Reputation: 855

OpenRefine custom text faceting

I have a column of names like:

I want to create a custom text facet in OpenRefine that marks names with exactly one comma as "true" and all others as "false", so that I can work on the latter (".E., Calvin F." is not a problem; I'll deal with that later).

I'm trying to use "Custom text facet" with this expression:

if(value.match(/([^,]+),([^,]+)/), "true", "false")

But every row comes out as false. What am I doing wrong?

Upvotes: 3

Views: 686

Answers (3)

Owen Stephens

Reputation: 1560

The expression you are using:

if(value.match(/([^,]+),([^,]+)/), "true", "false")

will always evaluate to false because the 'match' function returns either an array or null. When evaluated by 'if', neither an array nor null evaluates to true.

You can wrap the match function in 'isNonBlank' or similar to get a boolean true/false, which then makes the 'if' function work as you want. However, once you have a boolean result the 'if' becomes redundant: its only function is to turn the boolean true/false into the string "true" or "false", which makes no difference to the values shown in the custom text facet.

So:

isNonBlank(value.match(/([^,]+),([^,]+)/))

should give you the desired result using match
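For readers who want to see the same pitfall outside OpenRefine, here is a rough Python analogue (not GREL). It assumes Python's re.fullmatch is a fair stand-in for a whole-string match that yields either a result or nothing; the function name and the sample names are made up for illustration.

```python
import re

# Same pattern as in the question: some non-comma text, one comma, more non-comma text.
ONE_COMMA = re.compile(r"([^,]+),([^,]+)")

def has_one_comma(value: str) -> bool:
    # fullmatch returns a Match object or None, never True/False,
    # so the result is converted to a boolean explicitly.
    return ONE_COMMA.fullmatch(value) is not None

# Hypothetical sample values, not taken from the original data.
samples = ["Smith, John", "John Smith", "Smith, John, Jr."]
results = [has_one_comma(s) for s in samples]
print(results)
```

The explicit "is not None" plays the role that isNonBlank plays in the expression above.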

Upvotes: 3

Owen Stephens

Reputation: 1560

Instead of using 'match' you could use 'split' to split the string into an array, using the comma as the separator. The length of the resulting array then tells you the number of commas in the string (number of commas = length - 1).

So your custom text facet expression becomes:

value.split(",").length()==2

This will give you true/false.

If you want to break down the data based on the number of commas that appear, you could leave off the '==2' to get a facet which just gives you the length of the resulting array.
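The counting idea can be sketched in Python as well (again an analogue, not GREL, whose split may treat edge cases differently). The function name and sample values are hypothetical.

```python
def comma_count(value: str) -> int:
    # Splitting on "," yields one more piece than there are commas.
    return len(value.split(",")) - 1

# Hypothetical sample values, not taken from the original data.
samples = ["Smith, John", "John Smith", "Smith, John, Jr."]
counts = [comma_count(s) for s in samples]          # facet by number of commas
exactly_one = [comma_count(s) == 1 for s in samples]  # true/false facet
print(counts, exactly_one)
```

One difference worth knowing: in this Python version a value such as ",Smith" counts as having one comma, whereas the regular expression from the question requires text on both sides of the comma.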

Upvotes: 1

zolo

Reputation: 469

I would use a lookahead assertion to check that exactly one "," can be found between the beginning and the end of the line.

^(?=[^\,]+,[^\,]+$).*

Demo: https://regex101.com/r/iG4hX6/2
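As a quick check of the lookahead pattern, here is a small Python sketch. It assumes Python's re module as the regex engine, which may differ in details from the one OpenRefine uses; the function name and sample values are hypothetical.

```python
import re

# The lookahead requires: non-comma text, exactly one comma, non-comma text, end of line.
LOOKAHEAD = re.compile(r"^(?=[^\,]+,[^\,]+$).*")

def matches_one_comma(value: str) -> bool:
    # re.match anchors at the start; the lookahead checks the rest of the line.
    return LOOKAHEAD.match(value) is not None

# Hypothetical sample values, not taken from the original data.
samples = ["Smith, John", "John Smith", "Smith, John, Jr."]
flags = [matches_one_comma(s) for s in samples]
print(flags)
```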

Upvotes: 0
