Reputation: 24665
I use R a lot more and it is easier for me to do it in R:
> test <- c('bbb', 'ccc', 'axx', 'xzz', 'xaa')
> test[grepl("^x",test)]
[1] "xzz" "xaa"
But how do I do this in Python if test is a list?
P.S. I am learning Python using Google's Python exercises and I prefer using regular expressions.
Upvotes: 21
Views: 21327
Reputation: 626794
In general, you may use
import re # Add the re import declaration to use regex
test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa'] # Define a test list
reg = re.compile(r'^x') # Compile the regex
matches = list(filter(reg.search, test)) # filter returns an iterator, so cast to list; test itself stays unchanged
# => ['xzz', 'xaa']
Or, to inverse the results and get all items that do not match the regex:
list(filter(lambda x: not reg.search(x), test))
# >>> ['bbb', 'ccc', 'axx']
See the Python demo.
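The same filtering can also be written as a list comprehension, which some people find easier to read than filter. This is only a minimal sketch reusing the list from the question; the names starts_with_x and others are illustrative and not part of the original answer.

```python
import re

test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
reg = re.compile(r'^x')

# Keep the items where the pattern finds a match
starts_with_x = [s for s in test if reg.search(s)]
# Keep the items where it does not
others = [s for s in test if not reg.search(s)]

print(starts_with_x)  # ['xzz', 'xaa']
print(others)         # ['bbb', 'ccc', 'axx']
```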
USAGE NOTE:
re.search finds the first regex match anywhere in a string and returns a match object, otherwise None.
re.match looks for a match only at the string start; it does NOT require a full string match. So, re.search(r'^x', text) = re.match(r'x', text).
re.fullmatch only returns a match if the full string matches the pattern, so re.fullmatch(r'x', text) = re.match(r'x\Z', text) = re.search(r'^x\Z', text).
If you wonder what the r'' prefix means, see Python - Should I be using string prefix r when looking for a period (full stop or .) using regex? and Python regex - r prefix.
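To make the difference between the three functions concrete, here is a small sketch that runs them on one of the strings from the question. The variable names are made up for illustration.

```python
import re

text = 'xaa'

# search scans the whole string, so it finds 'a' at index 1
found_by_search = re.search(r'a', text) is not None
# match only tries at position 0, where the character is 'x'
found_by_match = re.match(r'a', text) is not None
# match succeeds on a prefix; the rest of the string is ignored
prefix_match = re.match(r'x', text) is not None
# fullmatch needs the pattern to cover the entire string
full_x = re.fullmatch(r'x', text) is not None
full_x_any = re.fullmatch(r'x.*', text) is not None

print(found_by_search, found_by_match, prefix_match, full_x, full_x_any)
# True False True False True
```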
Upvotes: 23
Reputation: 1496
An example for when you want to extract more than one data point from each string in the list:
Input:
2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] Requesting: https://test.com&uuid=1623\n
Code:
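import re # needed for re.findall
pat = r'(.* \d\d:\d\d:\d\d) .*_execute_request\] (.*?):.*uuid=(.*?)[\.\n]' # raw string, so \d and \] reach the regex engine as written
new_list = [re.findall(pat, s) for s in my_list] # one list of tuples per input string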
Output:
[[('2021-02-08 20:43:16', 'Requesting', '1623')]]
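For a version that runs on its own, here is a sketch that assumes my_list holds the single log line shown above and that the line ends in a real newline character. It uses search with groups() instead of findall, which gives one flat tuple per matching line; rows is a hypothetical name.

```python
import re

# Assumed input: the log line from above, terminated by a newline
my_list = [
    "2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] "
    "Requesting: https://test.com&uuid=1623\n"
]

pat = re.compile(
    r'(.* \d\d:\d\d:\d\d) .*_execute_request\] (.*?):.*uuid=(.*?)[.\n]'
)

rows = []
for s in my_list:
    m = pat.search(s)
    if m:  # skip lines that do not match instead of storing empty results
        rows.append(m.groups())

print(rows)  # [('2021-02-08 20:43:16', 'Requesting', '1623')]
```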
Upvotes: 1
Reputation: 141
Here is an improvised approach that works for this list, as long as the items contain no whitespace. It might help.
import re
l= ['bbb', 'ccc', 'axx', 'xzz', 'xaa'] #list
s= str( " ".join(l)) #flattening list to string
re.findall('\\bx\\S*', s) #regex to find string starting with x
['xzz', 'xaa']
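One thing to watch with the join approach: it treats the whole list as a single string, so it can return fragments rather than the original items. A small sketch with a made-up list (items and the variable names are hypothetical) shows how it differs from testing each item separately.

```python
import re

# Hypothetical list whose items contain spaces
items = ['x y', 'ab xz']

# Join-then-findall picks up every word starting with x
joined_result = re.findall(r'\bx\S*', ' '.join(items))

# Per-item test keeps only whole items that start with x
per_item_result = [s for s in items if re.match(r'x', s)]

print(joined_result)    # ['x', 'xz']
print(per_item_result)  # ['x y']
```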
Upvotes: 0
Reputation: 33370
You could use filter. I am assuming you want a new list with certain elements from the old one. In Python 3, filter returns an iterator, so wrap it in list to get an actual list.
new_test = list(filter(lambda x: x.startswith('x'), test))
Or, if you want to use a regular expression in the filter function, you could try the following.
It requires the re module to be imported.
new_test = list(filter(lambda s: re.match("^x", s), test))
Upvotes: 2
Reputation: 63727
You can use the following to get the strings in the list that start with 'x', or to check whether any of them does:
>>> [e for e in test if e.startswith('x')]
['xzz', 'xaa']
>>> any(e.startswith('x') for e in test)
True
Upvotes: 6