Ringo Blancke
Ringo Blancke

Reputation: 2444

Google Spreadsheets: Remove substring if string contains one of several strings

I have a spreadsheet that has the following column:

Alice Aikens Engineer
Bob Lob Law Engineer
Carrie Crenshaw Executive
Don Draper Dapper Executive
Elizabeth Elias
Fatrick Fitzgerald

I would like to have a second column where I would basically be able to remove that last word (Engineer, Executive) from the cells that contain those words. Not all cells end with those words though.

Any idea on how I can do this?

Upvotes: 2

Views: 2089

Answers (2)

Lod
Lod

Reputation: 779

I needed the same solution but working on Capitalized / with 1st uppercase letter words substrings only also appearing any where in the string and got this solution:

With Formula Below In B1:

=ArrayFormula(IF(REGEXMATCH(trim(regexreplace(regexreplace(A:A,"(?:([A-Z]([a-z]){1,}))|.", " $1"), "(\s)([A-Z])","$1 $2")),"(?i)\b(Law Engineer|Engineer|Executive|Law)\b"), 
trim(regexreplace(regexreplace(regexreplace(A:A,"(?:([A-Z]([a-z]){1,}))|.", " $1"), "(\s)([A-Z])","$1 $2"),"(?i)\b(Law Engineer|Engineer|Executive|Law)\b", " ")),
trim(regexreplace(regexreplace(A:A,"(?:([A-Z]([a-z]){1,}))|.", " $1"), "(\s)([A-Z])","$1 $2"))))

or same Formula Beautified from this tool / this source:

=ArrayFormula(
  IF(
    REGEXMATCH(
      trim(
        regexreplace(
          regexreplace(
            A:A,
            "(?:([A-Z]([a-z]){1,}))|.",
            " $1"
          ),
          "(\s)([A-Z])",
          "$1 $2"
        )
      ),
      "(?i)\b(Law Engineer|Engineer|Executive|Law)\b"
    ),
     trim(
      regexreplace(
        regexreplace(
          regexreplace(
            A:A,
            "(?:([A-Z]([a-z]){1,}))|.",
            " $1"
          ),
          "(\s)([A-Z])",
          "$1 $2"
        ),
        "(?i)\b(Law Engineer|Engineer|Executive|Law)\b",
        " "
      )
    ),
     trim(
      regexreplace(
        regexreplace(
          A:A,
          "(?:([A-Z]([a-z]){1,}))|.",
          " $1"
        ),
        "(\s)([A-Z])",
        "$1 $2"
      )
    )
  )
)

Input in A1:A12:

Law Engineer Engineer Executive Law Alice Law Engineer Engineer Executive Law Aikens Law Engineer Engineer Executive Law
Law Engineer Engineer Executive Law Bob Law Engineer Engineer Executive Law Lob Law Engineer Engineer Executive Law
Law Engineer Engineer Executive Law Carrie Law Engineer Engineer Executive Law Crenshaw Law Engineer Engineer Executive Law
Law Engineer Engineer Executive Law Don Law Engineer Engineer Executive Law Draper Law Engineer Engineer Executive Law Dapper Law Engineer Engineer Executive Law
Law Engineer Engineer Executive Law Elizabeth Law Engineer Engineer Executive Law Elias Law Engineer Engineer Executive Law
Law Engineer Engineer Executive Law Fatrick Law Engineer Engineer Executive Law Fitzgerald Law Engineer Engineer Executive Law
Alice Aikens
Bob Lob
Carrie Crenshaw
Don Draper Dapper
Elizabeth Elias
Fatrick Fitzgerald

Output in B1:B12:

Alice Aikens
Bob Lob
Carrie Crenshaw
Don Draper Dapper
Elizabeth Elias
Fatrick Fitzgerald
Alice Aikens
Bob Lob
Carrie Crenshaw
Don Draper Dapper
Elizabeth Elias
Fatrick Fitzgerald

Upvotes: 0

JPV
JPV

Reputation: 27282

Assuming you have the data in Col A, try:

=ArrayFormula(regexreplace(A1:A6, "Law Engineer|Engineer|Executive",""))

and see if that works ?

Note: you can simply expand the substrings to be removed by adding a "|" (means or) and then the substring. e.g:

=ArrayFormula(regexreplace(A1:A6, "Law Engineer|Engineer|Executive|Dapper Executive",""))

Upvotes: 4

Related Questions