Manz
Manz

Reputation: 605

Select Specific character from a column and paste the related value to another Column

Working on a 2016 version excel file where i need to copy only the specific characters which contains _ or - from a column and paste the related data into another column.

If any of the row contains "|" then remove everything after this symbol.

Input Data:

col1                                                  col2
Disconnected on B_NKLA_TUV_0022 As Per 
Roll DCN-BVLR-OCR-NKLT | Flapping
VAAP CLR_ NLTK_BKC_ASSO CBR Thing
VCR-DCW- NKL-NKJ-MLP | CBR-XDL-BHG
Disconnected on B_NFA_SUV_0025 As 

Steps: First we need to remove all the character after "|" from col1 Then Exctract characters containing "" OR "-", If there is any space between 2 sentance containing "" OR "-", We need to combibne it and paste into col2

Output Data:

col1                                                  col2
Disconnected on B_NKLA_TUV_0022 As Per             B_NKLA_TUV_0022
Roll DCN-BVLR-OCR-NKLT | Flapping                  DCN-BVLR-OCR-NKLT
VAAP CLR_ NLTK_BKC_ASSO CBR Thing                  CLR_ NLTK_BKC_ASSO
Disconnected on 23 Dec 2022
VCR-DCW- NKL-NKJ-MLP | CBR-XDL-BHG                 VCR-DCW- NKL-NKJ-MLP
Disconnected on B_NFA_SUV_0025 As                  B_NFA_SUV_0025

Need Suggestion on it.

Upvotes: 1

Views: 413

Answers (2)

JvdV
JvdV

Reputation: 76000

Excel 2016 - Formula:

One could try FILTERXML() and use a simple predicate in the xpath expression to filter the correct substring:

enter image description here

Formula in B2:

=IFERROR(SUBSTITUTE(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(LEFT(A1,FIND("|",A1&"|")),"- ","-|"),"_ ","_|")," ","</s><s>")&"</s></t>","//s[translate(.,'-_','')!=.]"),"|"," "),"")

You can be much more restrictive if need be with xpath. This is done through anding more predicates. More information here.


PowerQuery:

Alternatively, you could load your table into PQ. Follow the steps described here to add a custom function based on a regular expression and invoke this using the following regular expression:

^[^|]*?(\\b[A-Z0-9]+(?:[-_] ?[A-Z0-9]+)+)\\b.*$|.

Replace this with a backreference to the 1st capture group $1. Now you can, as per given link, simply replace error values with a NULL if so desired.

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Col1", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "fnRegexExtr", each fnRegexExtr([Col1], "^[^|]*?(\\b[A-Z0-9]+(?:[-_] ?[A-Z0-9]+)+)\\b.*$|.", "$1")),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Invoked Custom Function", {{"fnRegexExtr", null}})
in
    #"Replaced Errors"

VBA:

Another alternative is creating your own custom User Defined Function (UDF) through VBA, and invoke this like a formula on your sheet. For example:

Public Function RegexReplace(s As String, pat As String, rep As String, Optional cse As Boolean = False) As String

Static RE As Object: If RE Is Nothing Then Set RE = CreateObject("vbscript.regexp")

RE.Pattern = pat
RE.IgnoreCase = cse
RegexReplace = RE.Replace(s, rep)

End Function

Now invoke this through using a slightly different pattern:

=RegexReplace(A2,"^[^|]*?(\b[A-Z0-9]+(?:[-_] ?[A-Z0-9]+)+)\b.*$|.+","$1")

Upvotes: 2

Harun24hr
Harun24hr

Reputation: 37155

Try the following formula-

=TEXTJOIN(" ",1,
IFERROR(FILTERXML("<t><s>"&SUBSTITUTE(TEXTBEFORE(A2,"|",,,,A2)," ","</s><s>")&"</s></t>",
"//s[contains(., '_') or contains(., '-')]"),""))

enter image description here

Upvotes: 3

Related Questions