Reputation: 47
I have a DataFrame with strings of mixed upper- and lower-case characters, and I need to extract only the lower-case characters that sit between two runs of three upper-case letters.
I'm using Python and pandas to do this, but have been unsuccessful so far. This is what the data looks like:
afklajrwouoivWERvalueineedREWkfjdsl
Upvotes: 0
Views: 183
Reputation: 3331
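You can also use the re package with the same regex:
import re
s = 'afklajrwouoivWERvalueineedREWkfjdsl'  # sample string from the question
# group(1) is the text captured between the two upper-case triples
re.search(r'[A-Z]{3}(.+?)[A-Z]{3}', s).group(1)
Output: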
valueineed
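If there are several occurrences, you should instead use:
# finditer yields non-overlapping matches, so a closing triple is not reused as the next opening one
matches = re.finditer(r'[A-Z]{3}(.+?)[A-Z]{3}', s)
results = [match.group(1) for match in matches]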
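Since the question asks for lower-case characters only, one possible variant restricts the captured group to [a-z]+ and uses a lookahead for the closing triple, so that the same three upper-case letters can close one match and open the next. This is only a sketch: the helper name extract_lower and the second sample string are made up for illustration.

```python
import re

# Capture runs of lower-case letters that sit between two blocks of
# three upper-case letters. The lookahead (?=...) checks for the closing
# block without consuming it, so it can also open the next match.
PATTERN = re.compile(r'[A-Z]{3}([a-z]+)(?=[A-Z]{3})')

def extract_lower(s):
    """Return every lower-case run enclosed by upper-case triples."""
    return PATTERN.findall(s)

print(extract_lower('afklajrwouoivWERvalueineedREWkfjdsl'))  # ['valueineed']
print(extract_lower('xxABCfirstDEFsecondGHIyy'))             # ['first', 'second']
```

With the plain finditer pattern, the second string would yield only 'first', because DEF is consumed as the closing delimiter of the first match.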
Upvotes: 1
Reputation: 153500
Let's try this:
import pandas as pd
df = pd.DataFrame({'text':['afklajrwouoivWERvalueineedREWkfjdsl']}, index=[0])
# str.extract returns the first capture group for each row
df['text'].str.extract(r'[A-Z]{3}(.+?)[A-Z]{3}')
Output:
valueineed
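Note that .+? captures any characters between the two groups of three upper-case letters, not only lower-case ones. To restrict the capture to lower-case letters, use [a-z]+ in place of .+?.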
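If only lower-case letters should be captured, as the question asks, the group can be narrowed to [a-z]+. The following is a minimal sketch of that idea; the second row and the new column name value are assumptions added for illustration, to show what happens when a row has no match.

```python
import pandas as pd

df = pd.DataFrame({'text': ['afklajrwouoivWERvalueineedREWkfjdsl',
                            'noDelimitersHere']})  # second row is hypothetical

# [a-z]+ limits the capture to lower-case letters.
# expand=False returns a Series; rows without a match become NaN.
df['value'] = df['text'].str.extract(r'[A-Z]{3}([a-z]+)[A-Z]{3}', expand=False)

print(df['value'].tolist())
```

For rows that may contain several delimited values, str.extractall with the same pattern returns one row per match instead of only the first.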
Upvotes: 2