Extracting multiple values with RegEx in a Google Sheet formula

Question

I have a Google spreadsheet with 2 columns.

Each cell of the first one contains JSON data, like this:

{
    "name":"Love",
    "age":56
},
{
    "name":"You",
    "age":42
}

Then I want a second column that would, using a formula, extract every value of name and string it like this:

Love,You

Right now I am using this formula:

=REGEXEXTRACT(A1, CONCATENER(CHAR(34),"name",CHAR(34),":",CHAR(34),"([^",CHAR(34),"]+)",CHAR(34),","))

The RegEx expresion being "name":"([^"]+)",

The problem being that it currently only returns the first occurence, like this:

Love

(Also, I don't know how many occurences of "name" there are. Could be anywhere from 0 to around 20.)

Is it even possible to achieve what I want?

Thank you so much for reading!

EDIT:

My JSON data starts with:

{
   "time":4,
   "annotations":[
      {

Then in the middle, something like this:

{
    "name":"Love",
    "age":56
},
{
    "name":"You",
    "age":42
}

and ends with:

],
   "topEntities":[
      {
         "id":247120,
         "score":0.12561166,
         "uri":"http://en.wikipedia.org/wiki/Revenue"

},
      {
         "id":31512491,
         "score":0.12504959,
         "uri":"http://en.wikipedia.org/wiki/Wii_U"

}

],
   "lang":"en",
   "langConfidence":1.0,
   "timestamp":"2020-05-22T12:17:47.380"
}

Wiktor Stribiżew · Accepted Answer

Since your text is basically a JSON string, you may parse all name fields from it using the following custom function:

function ExtractNamesFromJSON(input) {
  var obj = JSON.parse("[" + input + "]");
  var results = obj.map((x) => x["name"])
  return results.join(",")
}

Then use it as =ExtractNamesFromJSON(C1).

If you need a regex, use a similar approach:

function ExtractAllRegex(input, pattern,groupId,separator) {
  return Array.from(input.matchAll(new RegExp(pattern,'g')), x=>x[groupId]).join(separator);
}

Then use it as =ExtractAllRegex(C1, """name"":""([^""]+)""",1,",").

Note:

input - current cell value
pattern - regex pattern
groupId - Capturing group ID you want to extract
separator - text used to join the matched results.

Extracting multiple values with RegEx in a Google Sheet formula

Answers (1)

Related Questions