Basch
Basch

Reputation: 431

Extracting strings between specific characters in excel and separating them with a comma

Is there a way to extract multiple strings between specific characters in excel and separating them with a comma.

For example:

enter image description here

I am thankful for every help!

Upvotes: 1

Views: 185

Answers (4)

Harun24hr
Harun24hr

Reputation: 36870

You can't do it with only formulas in Excel 2007. You need VBA UDF then. With Excel O365 having dynamic formula access, this can be done like:

=TEXTJOIN(", ",TRUE,TRIM(LEFT(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(A1,"]",REPT(" ",100)),"[","</s><s> ")&"</s></t>","//s[starts-with(., ' ')]"),100)))

enter image description here

Upvotes: 2

Axuary
Axuary

Reputation: 1507

The following requires a version of Excel O365 that supports dynamic arrays and the LET function.

If I understand correctly you are looking for something like this.

target output

This formula will list all the positions of all [ in the string using dynamic array functions.

=LET(x,$B$2,
    y, SEQUENCE(LEN(x)),
    raw, IF(MID(x,y,1)="[",y,""),
    filtered, FILTER(raw,raw<>"",""),
    filtered)

LET allows you to set names within a formula using parameter pairs. The first is the name; the second is the value of the name. The last parameter is the value returned. In this case, x is set to cell B2. y is set to the array listing the numbers from 1 to the length of x using the dynamic array function SEQUENCE. raw is a list that shows the value of y if that position is [, otherwise it is blank. filtered uses the FILTER function to remove all of the blank rows. filtered is the last argument of the LET function so that is what is returned. If this formula is entered into cell A4 then A4 will show 1, A5 will show 14 and A6 will show 28.

If you then enter a similar formula in B4 replacing [ with ] then the result is {7,21,35} in cells B4, B5 and B6.

Finally, in cell B2 you can enter =TEXTJOIN(", ",TRUE,MID(A2,A4#+1,B4#-A4#-1)). This will return the result you are looking for.

enter image description here

If you are unfamiliar with dynamic arrays, A4# indicates an entire dynamic array starting in cell A4. In this case, it is the same as A4:A6.

Upvotes: 2

AnilGoyal
AnilGoyal

Reputation: 26218

Considering that you are using too old version of excel which lacks dynamic array functions and besides too many other useful functions, You may probably have to do a long workaround here.

First using substitute and len You have to find out number of such square parenthesis in each of the row. Thereafter you have to work out the formula of max of such numbers.

Do it like this

Assuming your text-values in A2 for a max of 4 occurrences enter the following formula in B2

=SUBSTITUTE(TRIM(REPLACE(LEFT(A2,FIND("]",A2&"]")-1),1,FIND("[",A2&"["),"")&" "&REPLACE(LEFT(A2,FIND("#",SUBSTITUTE(A2&REPT("]",2),"]","#",2))-1),1,FIND("#",SUBSTITUTE(A2&REPT("[",2),"[","#",2)),"")&" "&REPLACE(LEFT(A2,FIND("#",SUBSTITUTE(A2&REPT("]",3),"]","#",3))-1),1,FIND("#",SUBSTITUTE(A2&REPT("[",3),"[","#",3)),"")&" "&REPLACE(LEFT(A2,FIND("#",SUBSTITUTE(A2&REPT("]",4),"]","#",4))-1),1,FIND("#",SUBSTITUTE(A2&REPT("[",4),"[","#",4)),"")), " ", ", ")

Let's say this is the text in A2

I have a text [123] and some more [4523] and also [552222] how to extract [22]?

This will create a output of 123,4523,552222,22 in B2

enter image description here

Upvotes: 1

Variatus
Variatus

Reputation: 14383

There is no "extraction" in your question, just removal. The formula below replaces the unwanted characters with "", thereby removing them. Please try it.

=SUBSTITUTE(SUBSTITUTE(A2,"[",""),"]","")

Upvotes: 1

Related Questions