Reputation: 5413
I have a string template that looks like 'my_index-{year}'.
I do something like string_template.format(year=year), where year is some string. The result is a string that looks like my_index-2011.
Now to my question: I have a string like my_index-2011 and my template 'my_index-{year}'. What might be a slick way to extract the {year} portion?
[Note: I know of the existence of the parse library.]
Upvotes: 2
Views: 151
Reputation: 7837
Yes, a regex would be helpful here.
In [1]: import re
In [2]: s = 'my_string-2014'
In [3]: print( re.search(r'\d{4}', s).group(0) )
2014
Edit: I should have mentioned that your regex can be more sophisticated. You can pull out a subcomponent of a more specific string with a capture group, for example:
In [4]: print( re.search(r'my_string-(\d{4})$', s).group(1) )
2014
Given the problem you presented, I think any "find the year" formula should be expressible in terms of a regular expression.
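To connect this back to the template in the question, here is a minimal sketch that turns a format()-style template into a regex with named groups, using only the standard library. The helper names template_to_regex and extract_fields are made up for illustration, and the sketch assumes every field is a plain named field such as {year} with no format spec and no repeated names.

```python
import re
from string import Formatter


def template_to_regex(template):
    """Compile a regex from a str.format-style template (hypothetical helper)."""
    parts = []
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    for literal, field, _spec, _conversion in Formatter().parse(template):
        parts.append(re.escape(literal))
        if field is None:
            continue  # trailing literal text with no field after it
        if not field.isidentifier():
            raise ValueError("only plain named fields are supported: %r" % field)
        # Non-greedy match, captured under the field's name
        parts.append("(?P<%s>.+?)" % field)
    return re.compile("".join(parts))


def extract_fields(template, text):
    """Return a dict of field values, or None if text does not fit the template."""
    match = template_to_regex(template).fullmatch(text)
    return match.groupdict() if match else None


fields = extract_fields("my_index-{year}", "my_index-2011")
print(fields)
```

Because fullmatch is used, a string that does not fit the whole template gives None rather than a partial match.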
Upvotes: 2
Reputation: 473893
There is a module called parse which provides the opposite of format():
Parse strings using a specification based on the Python format() syntax.
>>> from parse import parse
>>> s = "my_index-2011"
>>> f = "my_index-{year}"
>>> parse(f, s)['year']
'2011'
An alternative, since you are extracting a year, would be to use the dateutil parser in fuzzy mode:
>>> from dateutil.parser import parse
>>> parse("my_index-2011", fuzzy=True).year
2011
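If you would rather stay in the standard library, a similar effect can be sketched with datetime.strptime. This is a different technique from dateutil's fuzzy mode: it assumes you rewrite the {year} placeholder as %Y by hand and that the year is always four digits.

```python
from datetime import datetime

s = "my_index-2011"

# Literal text in the format string must match exactly; %Y matches a 4-digit year.
# strptime raises ValueError if the string does not fit the format.
year = datetime.strptime(s, "my_index-%Y").year
print(year)
```

Like the dateutil version, this gives the year as an int rather than a string.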
Upvotes: 2
Reputation: 6121
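I assume "year" is 4 digits and that you have multiple indexes. Here data is the text to search and idx is the list of index names:
import re
res = ''
# one pattern per index name, e.g. 'ewr-[0-9]{4}'
patterns = [ '%s-[0-9]{4}'%index for index in idx ]
for index,pattern in zip(idx,patterns):
    # strip the 'index-' prefix from each match so only the digits remain
    res +=' '.join( re.findall(pattern ,data) ).replace(index+'-','') + ' '
---update---
Example input:
data = 'adsf-1234 fsfdr lkjdfaif ln ewr-1234 adsferggs sfdgrsfgadsf-3456'
idx = ['ewr','adsf']
Output (the value of res):
1234 1234 3456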
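The same idea can be sketched with a capture group, so the prefix never has to be stripped afterwards. This is a variation rather than the code above: it reuses the sample input, the names data, idx and years_by_index are just for illustration, and re.escape is added on the assumption that index names might contain regex metacharacters.

```python
import re

data = 'adsf-1234 fsfdr lkjdfaif ln ewr-1234 adsferggs sfdgrsfgadsf-3456'
idx = ['ewr', 'adsf']

# With one capture group, findall returns only the captured digits.
years_by_index = {
    index: re.findall(re.escape(index) + r'-([0-9]{4})', data)
    for index in idx
}
print(years_by_index)
```

Note that, as with the original patterns, 'adsf' also matches at the end of the longer word 'sfdgrsfgadsf'.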
Upvotes: 2
Reputation: 550
You are going to want to use the string method split to split on "-", and then catch the last element as your year:
year = "any_index-2016".split("-")[-1]
Because you take the last element (using -1 as the index), your index name can itself contain hyphens, and you will still extract the year correctly.
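A small sketch of the same "take what follows the last hyphen" idea using str.rpartition, which splits only once and makes the no-hyphen case easy to detect. The function name year_after_last_hyphen is made up for illustration.

```python
def year_after_last_hyphen(name):
    """Return the text after the last '-' in name, or None if there is no '-'."""
    _head, sep, tail = name.rpartition("-")
    if not sep:
        return None  # no hyphen found, so there is nothing to extract
    return tail


print(year_after_last_hyphen("any-index-2016"))
```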
Upvotes: 1
Reputation: 33335
Use the split() string function to split the string around the dash, then grab the second part. This assumes the index name itself contains no dash.
mystring = "my_index-2011"
year = mystring.split("-")[1]
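If the index name might contain dashes, one option is to let the template tell you what to cut off. The following is only a sketch: extract_year is a hypothetical helper, and it assumes the template contains the placeholder exactly once.

```python
def extract_year(template, text, placeholder="{year}"):
    """Return the part of text that sits where placeholder is in template."""
    prefix, sep, suffix = template.partition(placeholder)
    if not sep:
        raise ValueError("placeholder not found in template")
    # The text must be long enough and carry the same prefix and suffix.
    if len(text) < len(prefix) + len(suffix):
        return None
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return None
    return text[len(prefix):len(text) - len(suffix)]


print(extract_year("my-index-{year}", "my-index-2011"))
```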
Upvotes: 2