Reputation: 1105
I'm trying to extract all occurrences matching a regex from a Google Sheets cell.
Example of a cell content in D5:
Genauer zu reden, sie sind gegeben sozusagen als Einbettungen in konkreten psychischen Erlebnissen, die in der Funktion der Bedeutungsintention oder Bedeutungserfüllung (in letzterer Hinsicht als illustrierende oder evidentmachende Anschauung) zu gewissen sprachlichen Ausdrücken gehören und mit ihnen eine phänomenologische Einheit bilden
REGEXEXTRACT(D5; "\w+edeut\w+")
only returns the first match, i.e. Bedeutungsintention.
I need all of them. In this case, Bedeutungsintention and Bedeutungserfüllung
Thanks !
Upvotes: 1
Views: 897
Reputation: 9928
All of these examples can be seen in this shared file
=Filter(split(textjoin(" ",true,D5:D)," "),find("edeut",split(textjoin(" ",true,D5:D)," "),1)>0)
This is probably the easiest as it uses standard sheet functionality. It splits (converts to an array) the text based on a space ' '
separator. It then filters on any words that contain the specified text. From there you can use other sheet functionality to manipulate the text (transpose
, unique
, substitute
out other characters such as periods)
Using the textjoin
function will allow for this to be applied to multiple cells.
This is basically doing the same as the first option but using app scripts. It might provide more flexibility.
/**
* @OnlyCurrentDoc
*/
/**
* Finds instances of value
*
* @param {range} theCells the Cell(s) with a value.
* @param {string} theTextToSearch Text to find.
* @param {string} theSplitter Text to split values
* @return the results split by the thesplitter.
* @customfunction
*/
function returnSomeText(theCells, theTextToSearch, theSplitter) {
var allValues = theCells;
if (Array.isArray(allValues)){
allValues = allValues.join(" ");
}
var theArray = allValues.split(" ");
var allResults =[];
for(var i= 0;i<theArray.length;i++){
if(theArray[i].search(theTextToSearch)>0){
allResults.push(theArray[i]);
}
}
return allResults.join(theSplitter);
}
Upvotes: 2
Reputation: 9355
To process D2:D with an array formula, try the following formula in, say, E2 (or the second cell of any otherwise empty range of columns wide enough to accommodate the maximum number of extractions that any one row may produce):
=ArrayFormula(IF(D2:D="",,SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(REGEXEXTRACT(SPLIT(D2:D," "),"[A-Za-zäÄöÖüÜß]+edeut[A-Za-zäÄöÖüÜß]+"))),,COLUMNS(SPLIT(D2:D," "))))," ",0,1)))
This takes advantage of a quirk with QUERY
whereby any number of header columns can be selected, not just 0 or 1. If we TRANSPOSE
all data, then run a QUERY
for nothing but headers, requesting as many headers as max results per transposed row, and then TRANSPOSE
back to the original orientation, we get all matching results in one string. Then we can SPLIT
that composite header string into separate cells by row.
Upvotes: 1