mikey

Reputation: 135

Split on a numerical character

I have the following text strings:

vine-1.3.0.txt
wcwidth-0.1.8.txt
websocket-client-python-0.57.0.txt
xml-security-for-java-2.3.0.txt

How can I use split() to remove the -{version}.txt substring and return the following:

vine
wcwidth
websocket-client-python
xml-security-for-java

I am trying to emulate the following bash/sed command:

sed "s/[-0-9.]*$//"

Upvotes: 0

Views: 45

Answers (2)

ShadowRanger

Reputation: 155353

If you want to be sure you remove exactly one file extension, plus the numeric version suffix after the final hyphen, a mix of os.path.splitext and a regular expression works best:

import os.path
import re

filename = 'vine-1.3.0.txt'  # example input

# Strict pattern: a hyphen followed by dot-separated runs of digits at the end
# of the string, so repeated dots (e.g. "1..3") are not treated as a match.
# Optional precompile, done once at the top of the file.
trimmable_re = re.compile(r'-\d+(?:\.\d+)*$')
# Alternative that mirrors your original sed expression; it matches more than the
# pattern you describe (any trailing run of hyphens, digits and dots):
# trimmable_re = re.compile(r'[-0-9.]*$')

name, ext = os.path.splitext(filename)  # Split off the main extension the nice way
name = trimmable_re.sub('', name)  # Strip the numeric version suffix
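To check the splitext-plus-regex approach against the four names from the question, it can be wrapped in a small function. This is only a sketch: the names strip_version and VERSION_SUFFIX_RE are made up for illustration, and the pattern is the strict one shown above.

```python
import os.path
import re

# Hypothetical names for illustration; the pattern is the strict one above.
VERSION_SUFFIX_RE = re.compile(r'-\d+(?:\.\d+)*$')

def strip_version(filename):
    """Drop one file extension, then a trailing '-<digits>[.<digits>...]' suffix."""
    stem, _ext = os.path.splitext(filename)
    return VERSION_SUFFIX_RE.sub('', stem)

filenames = [
    'vine-1.3.0.txt',
    'wcwidth-0.1.8.txt',
    'websocket-client-python-0.57.0.txt',
    'xml-security-for-java-2.3.0.txt',
]
names = [strip_version(f) for f in filenames]
print(names)
```

A name with no version suffix, such as 'py3-tool.txt', only loses its extension, because the pattern needs a hyphen followed by digits at the very end of the stem.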

The rpartition approach below turned out less simple than I first thought (I forgot how rpartition handles a missing separator), but I've kept it, in working form, for illustration.

The safest simple solution (it works even if the string doesn't contain a hyphen) is rpartition:

name, sep, name2 = filename.rpartition('-')
name = name or name2  # When hyphen doesn't occur, entire string stored in name2
                      # and name/sep are empty
# Or, unless the string starts with its only hyphen, just take the first non-empty string from the result:
name = next(filter(None, filename.rpartition('-')))

Here name is initially everything to the left of the final hyphen (empty if there is no hyphen), sep is '-' (or the empty string when no hyphen is found), and name2 is whatever came after the final hyphen (or the entire string when there is no hyphen). In the no-hyphen case we just use name2.
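The two cases can be seen by printing the tuples rpartition returns. The helper below is a hypothetical variation for illustration: it tests sep rather than name, so it differs from `name or name2` only for a string whose sole hyphen is the first character.

```python
def before_last_hyphen(filename):
    """Return the text before the final hyphen, or the whole string if there is none."""
    head, sep, tail = filename.rpartition('-')
    # sep is '' exactly when no hyphen was found; the whole string is then in tail.
    return head if sep else tail

print('vine-1.3.0.txt'.rpartition('-'))  # ('vine', '-', '1.3.0.txt')
print('README.txt'.rpartition('-'))      # ('', '', 'README.txt')

print(before_last_hyphen('vine-1.3.0.txt'))
print(before_last_hyphen('README.txt'))
```

Note that this cuts at the last hyphen whether or not a version follows it, so an unversioned name like 'my-tool.txt' becomes 'my', and a name with no hyphen keeps its extension.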

Upvotes: 0

deadshot

Reputation: 9061

You can use rsplit() with maxsplit=1 to split once on the last hyphen:

data = '''vine-1.3.0.txt
wcwidth-0.1.8.txt
websocket-client-python-0.57.0.txt
xml-security-for-java-2.3.0.txt'''

for text in data.splitlines():
    name, *_ = text.rsplit('-', 1)
    print(name)

Output:

vine
wcwidth
websocket-client-python
xml-security-for-java

Upvotes: 2
