Cleveburgian

Reputation: 51

Regex to parse words from a cell in Google Sheets

Hoping somebody can help me with the syntax.

Cell A1 contains the following value:

Content Type: Pitch Decks; Region: North America; Content Type: Data Sheet;

I'm trying to create a formula that parses the values between the words "Content Type: " and the semicolon that follows.

So in B1, the expected result of the formula should be:

Pitch Decks
Data Sheet

(Yes, with char(10) separating the values in B1)

I've tried a number of different combinations of REGEXEXTRACT, for example:

=REGEXEXTRACT(A1,"Content Type: ([\w\/\s]*);?")

...but it only returns the first match:

Pitch Decks

I've read that global modifiers are not supported in Google Sheets. That formula works when I add a global modifier in a regex test harness, but not when used within Google Sheets itself.

Upvotes: 1

Views: 243

Answers (2)

player0

Reputation: 1

try:

=ARRAYFORMULA(SUBSTITUTE(SUBSTITUTE(TRIM(FLATTEN(QUERY(TRANSPOSE(SUBSTITUTE(
 IF(IFERROR(REGEXMATCH(SPLIT(A1:A, ";"), "Content Type")), 
 REGEXEXTRACT(SPLIT(A1:A, ";"), ": (.+)"), ), " ", "♥"))
 ,,9^9))), " ", CHAR(10)), "♥", " "))
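
For anyone who finds the nested formula hard to follow, its steps can be sketched outside Sheets. The Python below only illustrates the same idea and is not a translation of the formula: split the cell on semicolons, keep the segments that mention "Content Type", take the text after ": ", and join the results with a line feed. The function name `extract_by_split` and its `label` parameter are hypothetical, and the sketch handles a single cell rather than the whole A1:A range.

```python
import re


def extract_by_split(cell: str, label: str = "Content Type") -> str:
    """Return every value tagged with `label`, one per line."""
    values = []
    # Equivalent of SPLIT(A1, ";"): break the cell into "Key: Value" segments.
    for segment in cell.split(";"):
        # Equivalent of REGEXMATCH(..., "Content Type"): keep matching segments only.
        if label in segment:
            # Equivalent of REGEXEXTRACT(..., ": (.+)"): take the text after ": ".
            found = re.search(r": (.+)", segment)
            if found:
                values.append(found.group(1).strip())
    # Equivalent of joining with CHAR(10).
    return "\n".join(values)


cell_a1 = "Content Type: Pitch Decks; Region: North America; Content Type: Data Sheet;"
print(extract_by_split(cell_a1))
```

The ♥ substitution in the formula has no counterpart here. It appears to be there only to protect spaces inside values while QUERY joins the columns with spaces.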


Upvotes: 0

41686d6564

Reputation: 19641

I thought this would be easy with substitution and a lookahead, but it turns out Google Sheets doesn't support lookaheads either. After some research, I came up with the following solution:

=join(CHAR(10),REGEXEXTRACT(A1,REGEXREPLACE(A1,"(Content Type: )([^;]+)","$1($2)")))

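The inner REGEXREPLACE wraps each value that follows "Content Type: " (up to the next semicolon or the end of the string) in parentheses, which turns the cell text itself into a pattern with one capture group per value. REGEXEXTRACT then matches that pattern against A1 and returns all the captured groups, and JOIN combines them with a line feed. To join with a different character, replace CHAR(10) with, for example, "," to join with a comma.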

Output:

Pitch Decks
Data Sheet
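
To make the trick easier to see, here is a rough Python sketch of the same two steps. It is an illustration under assumptions, not what Sheets does internally: Python's `re` module stands in for the Sheets regex functions, and the function name `extract_all` is hypothetical.

```python
import re


def extract_all(cell: str) -> str:
    """Emulate JOIN(CHAR(10), REGEXEXTRACT(A1, REGEXREPLACE(A1, ...)))."""
    # Step 1 (REGEXREPLACE): wrap each value in parentheses so the cell text
    # becomes a pattern with one capture group per "Content Type" value.
    pattern = re.sub(r"(Content Type: )([^;]+)", r"\1(\2)", cell)
    # Step 2 (REGEXEXTRACT): match the generated pattern against the original
    # text; every capture group comes back as a separate value.
    found = re.search(pattern, cell)
    if found is None:
        return ""
    # Step 3 (JOIN with CHAR(10)): one value per line.
    return "\n".join(found.groups())


cell_a1 = "Content Type: Pitch Decks; Region: North America; Content Type: Data Sheet;"
print(extract_all(cell_a1))
```

Because the cell text is reused as a pattern, characters that are special in a regex (parentheses, "+", "?", and so on) inside the cell could make the generated pattern fail to match. The sketch does not handle that case.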

Upvotes: 1
