Reputation: 136
I have a few file names:
xyz-1.23.35.10.2.rpm
xyz-linux-version-90.12.13.689.tar.gz
xyz-xyz-xyz-13.23.789.0-xyz-xyz.rpm
Here xyz can be any string of any length (letters only, no digits).
The dot-separated numbers are the version of each file.
Can I write one common function to extract the version from each filename? I tried, but my function is getting too big and relies heavily on hard-coded constants. Please suggest a simple way.
Upvotes: 1
Views: 193
Reputation: 686
We can use the re module to do this. Let's define the pattern we're trying to match.
We'll need to match a string of digits:
\d+
These digits may be followed by either a period or a hyphen:
\d+[\-\.]?
And this pattern can repeat many times:
(\d+[\-\.]?)*
Finally, we always end with at least one digit:
(\d+[\-\.]?)*\d+
This pattern can be used to define a function that returns a version number from a filename:
import re

def version_from(filename, pattern=r'(\d+[\-\.]?)*\d+'):
    match = re.search(pattern, filename)
    if match:
        return match.group(0)
    else:
        return None
Now we can use the function to extract the versions from filenames like the ones you provided (the second one here uses hyphens as separators):
data = ['xyz-1.23.35.10.2.rpm', 'xyz-linux-version-90-12-13-689.tar.gz', 'xyz-xyz-xyz-13.23.789.0-xyz-xyz.rpm']
versions = [version_from(filename) for filename in data]
The result is the list you asked for:
['1.23.35.10.2', '90-12-13-689', '13.23.789.0']
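As a quick check, here is a minimal sketch that runs the same pattern over the dotted names from the question and over a name that contains no digits at all. The name no-version-here.rpm is a made-up example, and writing the group as non-capturing, (?:...), is a small tweak of mine that does not change what group(0) returns.

```python
import re

# Same pattern as above, written with a non-capturing group
# (an assumption of this sketch, not part of the original answer).
VERSION_RE = re.compile(r'(?:\d+[-.]?)*\d+')

def version_from(filename):
    """Return the first version-like run of digits in filename, or None."""
    match = VERSION_RE.search(filename)
    return match.group(0) if match else None

names = [
    'xyz-1.23.35.10.2.rpm',
    'xyz-linux-version-90.12.13.689.tar.gz',
    'xyz-xyz-xyz-13.23.789.0-xyz-xyz.rpm',
    'no-version-here.rpm',  # hypothetical name with no digits
]

for name in names:
    print(name, '->', version_from(name))
```

The last name prints None, so callers should be prepared for a missing version rather than assume a string always comes back.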
Upvotes: 1
Reputation: 11615
I'm not sure whether there's a better way, since regular expressions aren't really my thing, but here's one way to get the version of each file, assuming the only digits in the name belong to the version.
import re

strings = [
    "xyz-1.23.35.10.2.rpm",
    "xyz-linux-version-90.12.13.689.tar.gz",
    "xyz-xyz-xyz-13.23.789.0-xyz-xyz.rpm",
]

for string in strings:
    # Raw string so that \d is passed to the regex engine unchanged
    matches = re.findall(r"\d+", string)
    version = ".".join(matches)
    print(version)
Result:
1.23.35.10.2
90.12.13.689
13.23.789.0
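The assumption matters: re.findall collects every run of digits, so a digit anywhere else in the name ends up in the version. A slightly stricter sketch, assuming the version is always dot-separated with at least two components, matches the whole run at once instead. The filename xyz-1.2.3-el7.rpm and both function names are hypothetical and only there to show the difference; names like this fall outside what the question describes.

```python
import re

def version_by_findall(name):
    # The approach above: join every run of digits with dots.
    return ".".join(re.findall(r"\d+", name))

def version_by_search(name):
    # Assumed stricter rule: digits followed by one or more ".digits" groups.
    match = re.search(r"\d+(?:\.\d+)+", name)
    return match.group(0) if match else None

name = "xyz-1.2.3-el7.rpm"  # hypothetical: "el7" contains a stray digit
print(version_by_findall(name))  # 1.2.3.7
print(version_by_search(name))   # 1.2.3
```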
Upvotes: 1
Reputation: 660
Assuming that the only numbers in your string are the version you need to extract, you could try something like this:
def func(someString):
    version = ''
    found = False
    for character in someString:
        if character.isdigit():
            found = True
        elif character.isalpha():
            found = False
        if found:
            version += character
    # Drop the separator picked up just before the first letter after the version
    return version.rstrip('.-')
Basically, we scan the string character by character. When the version part begins, found becomes True (because isdigit() returns True for a digit), and from then on each character is appended to version until a letter sets found back to False. isdigit() and isalpha() are built-in string methods, so you don't need to import anything.
P.S. I haven't tested this for errors
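For comparison, here is a minimal sketch of the same scanning idea that stops at the first character that cannot belong to the version and trims any trailing separator. The function name scan_version and the set of allowed separators are assumptions made for this example.

```python
def scan_version(filename, separators=".-"):
    """Collect digits and separators, starting at the first digit."""
    collected = []
    for character in filename:
        if character.isdigit():
            collected.append(character)
        elif collected and character in separators:
            # Separators only count once the version has started.
            collected.append(character)
        elif collected:
            # Any other character after the version started ends the scan.
            break
    # Remove a separator picked up just before the scan stopped.
    return "".join(collected).rstrip(separators)

for name in [
    "xyz-1.23.35.10.2.rpm",
    "xyz-linux-version-90.12.13.689.tar.gz",
    "xyz-xyz-xyz-13.23.789.0-xyz-xyz.rpm",
]:
    print(scan_version(name))
```

If the name has no digits, this sketch returns an empty string.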
Upvotes: 0