user3392296
user3392296

Reputation: 644

Regexreplace all lowercase words not working with german umlauts

Trying this to replace all lowercase words is not working properly

=regexreplace(A1;"\b[a-züöäß]+\b";"")

Example sentence:

Mit Sätzen wie Gewinne laufen lassen Verluste begrenzen können vor allem weniger erfahrene Aktienkäufer oder Börseneinsteiger die wichtigsten Grundregeln des Aktienhandels kennenlernen und besser verinnerlichen.

also matches "ätzen" in "Sätzen" but Sätzen start with uppercase. Or matches "Aktienkäufer" to "Aktienk".

Upvotes: 3

Views: 90

Answers (3)

marikamitsos
marikamitsos

Reputation: 10573

Your text includes german umlaut characters

You want to "replace all lowercase words"

We always use this simple formula

=REGEXREPLACE(G106," [a-züöäß]+","")

If you also want to omit the first word you can try

=REGEXREPLACE(G106,"^\w+ | [a-züöäß]+","")

It even works as an arrayformula

=INDEX(REGEXREPLACE(G106:G109,"^\w+ | [a-züöäß]+",""))

(Do adjust the formulae according to your ranges and locale)

enter image description here

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521514

My guess is that the word boundaries are not working as expected because the German umlaut characters are considered non word characters. Try this version:

=TRIM(SUBSTITUTE(REGEXREPLACE(SUBSTITUTE(A1, " ", "  "), "(^| )[a-züöäß]+( |$)", ""), "  ", " "))

Upvotes: 2

user3392296
user3392296

Reputation: 644

This is working

=INDEX(TEXTJOIN(" "; 1; LAMBDA(x;IF(REGEXMATCH(x&""; "^[a-züöäß]");;x))(SPLIT(A1; " "&CHAR(10)))))

Upvotes: 1

Related Questions