Nidheesh

Reputation: 4562

Remove words from a string based on appearance

I need to remove a word (or several words) from my string in certain cases. For example, my string is incabcincdefinc inc. inc. and I need to remove both trailing occurrences of inc., i.e. the output should be incabcincdefinc. In other words, remove every inc that matches one of the following conditions:

<space>inc<space>
<space>inc<.>
<space>inc<end string>
<space>inc

Upvotes: 2

Views: 163

Answers (1)

Martin Ender

Reputation: 44299

You can probably get away with something like this:

str = str.replaceAll("[ ](?:inc|ltd|corp)\\b\\.?", "");

The square brackets are only there to make the space character more visible; they can be omitted as long as the space is kept. Your conditions are met by asserting a word boundary (\\b) after the business entity extension, which means that no letter, digit or underscore comes next (this covers all of your conditions). The pattern then also tries to include a literal period (\\.), but does not care if there is none (?). Every match is replaced with an empty string. Note that for your first condition I do not match and remove the trailing space, because then SomeCompany inc inc would become SomeCompanyinc: the second inc would no longer have a leading space to match.
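To see how the word boundary and the optional period behave, here is a minimal sketch that applies the same pattern to a few inputs. The class name SuffixStripper, the helper strip and the sample strings other than the two taken from this thread are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex itself comes from the answer.
final class SuffixStripper {
    // Space, one of the extensions, a word boundary, then an optional period.
    private static final Pattern SUFFIX =
            Pattern.compile("[ ](?:inc|ltd|corp)\\b\\.?");

    static String strip(String input) {
        // Every match is replaced with the empty string.
        return SUFFIX.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String[] samples = {
            "incabcincdefinc inc. inc.", // example from the question
            "SomeCompany inc inc",       // both occurrences keep their leading space
            "Acme incorporated",         // no word boundary after "inc", so untouched
            "Foo ltd. and Bar corp"      // illustrative input with two extensions
        };
        for (String s : samples) {
            System.out.println("[" + s + "] -> [" + strip(s) + "]");
        }
    }
}
```

Because the trailing space is never consumed, a removal in the middle of a string leaves the space that followed the extension in place, as in the last sample.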

If you want to look for the extension case-insensitively, you can use the longer syntax:

Pattern pattern = Pattern.compile(
    "[ ](?:inc|ltd|corp)\\b\\.?",
    Pattern.CASE_INSENSITIVE
);
Matcher matcher = pattern.matcher(str);
str = matcher.replaceAll("");
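As an alternative that is not part of the original answer, Java regexes also accept the embedded flag (?i), which turns on case-insensitive matching from inside the pattern and so still works with String.replaceAll. A minimal sketch, with the class name InlineFlagDemo and the sample input chosen for illustration:

```java
// Hypothetical demo class; (?i) is the embedded form of Pattern.CASE_INSENSITIVE.
final class InlineFlagDemo {
    static String strip(String input) {
        // Same pattern as before, prefixed with the inline case-insensitive flag.
        return input.replaceAll("(?i)[ ](?:inc|ltd|corp)\\b\\.?", "");
    }

    public static void main(String[] args) {
        // Mixed-case extensions are removed as well.
        System.out.println(strip("Acme INC. Ltd"));
    }
}
```

Compiling the Pattern once, as in the longer syntax, is usually preferable when the same replacement runs many times, since String.replaceAll recompiles the pattern on every call.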

Upvotes: 2
