Python regex to remove alphanumeric characters without removing words at the end of the string

Question

I'm trying to clean some text by removing alphanumeric codes from the end of each string, but my pattern also removes ordinary words, as shown in the output below. Can someone help me achieve the expected result?

re.sub(r'[a-zA-Z0-9/]{5,}$', '', text)

asus zenfone 3s max zc521tl
asus zenfone max plus (m1) zb570tl
asus zenfone max pro (m1) zb601kl/zb602k
nokia 3.1 c
nokia 3
asus zenfone 3 zoom ze553k
asus zenfone 3 deluxe zs570kl
blackberry keyone
htc explorer
lg tribute
acer liquid z520

Output:

asus zenfone 3s max 
asus zenfone max plus (m1) 
asus zenfone max pro (m1) 
nokia 3.1 c
nokia 3
asus zenfone 3 zoom 
asus zenfone 3 deluxe 
blackberry 
htc 
lg 
acer liquid z520

Expected output:

asus zenfone 3s max
asus zenfone max plus (m1) 
asus zenfone max pro (m1)
nokia 3.1 c
nokia 3
asus zenfone 3 zoom 
asus zenfone 3 deluxe 
**blackberry keyone**
**htc explorer**
**lg tribute**
acer liquid z520

The fourth bird · Accepted Answer

If it should be the last word in a string and there are always multiple words, you might use:

[ \t]+(?=[a-zA-Z0-9/]{5})[a-zA-Z/]*[0-9][a-zA-Z0-9/]*[A-Za-z]$

[ \t]+ Match 1+ spaces or tabs
(?=[a-zA-Z0-9/]{5}) Positive lookahead: assert that the next 5 chars are each one of the listed
[a-zA-Z/]* Match 0+ letters or slashes
[0-9] Match a single digit, so the word must contain at least one digit
[a-zA-Z0-9/]* Match 0+ times any of the listed in the character class
[A-Za-z] Match a single letter a-z or A-Z, so the word must end in a letter
$ End of string
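
As a rough sketch, the same pattern can be spread over several lines with re.VERBOSE so that each part carries its own comment. The name MODEL_CODE is only illustrative and is not part of the original answer. Whitespace inside a character class is still significant in verbose mode, so [ \t] keeps its meaning.

```python
import re

# MODEL_CODE is a hypothetical name; the pattern is the one from the answer,
# laid out with re.VERBOSE so each piece can be commented.
MODEL_CODE = re.compile(
    r"""
    [ \t]+                  # 1+ spaces or tabs before the last word
    (?=[a-zA-Z0-9/]{5})     # lookahead: next 5 chars are letters, digits or /
    [a-zA-Z/]*              # optional leading letters or slashes
    [0-9]                   # a digit, so pure words like "keyone" fail here
    [a-zA-Z0-9/]*           # any further letters, digits or slashes
    [A-Za-z]                # the word must end in a letter
    $                       # end of string
    """,
    re.VERBOSE,
)

print(bool(MODEL_CODE.search("asus zenfone 3 zoom ze553k")))  # True
print(bool(MODEL_CODE.search("blackberry keyone")))           # False
print(bool(MODEL_CODE.search("acer liquid z520")))            # False
```

"z520" is rejected twice over: it is only 4 characters long, and it ends in a digit.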

In the replacement use an empty string.
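
A minimal sketch of the replacement in Python, run against the sample strings from the question. The names PATTERN and strip_model_code are assumptions made for illustration only.

```python
import re

# Pattern from the answer; the tab is written as \t so it stays visible.
PATTERN = r"[ \t]+(?=[a-zA-Z0-9/]{5})[a-zA-Z/]*[0-9][a-zA-Z0-9/]*[A-Za-z]$"


def strip_model_code(text):
    """Remove a trailing model code (and the whitespace before it), if any."""
    return re.sub(PATTERN, "", text)


samples = [
    "asus zenfone 3s max zc521tl",
    "asus zenfone max plus (m1) zb570tl",
    "asus zenfone max pro (m1) zb601kl/zb602k",
    "nokia 3.1 c",
    "nokia 3",
    "asus zenfone 3 zoom ze553k",
    "asus zenfone 3 deluxe zs570kl",
    "blackberry keyone",
    "htc explorer",
    "lg tribute",
    "acer liquid z520",
]

cleaned = [strip_model_code(s) for s in samples]
for line in cleaned:
    print(line)
```

Because the match starts at the whitespace before the last word, the cleaned strings have no trailing space, unlike some lines in the expected output shown in the question.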
