NachoMiguel

Reputation: 993

Get a specific string from a list - Python

I have a list that looks like this:

list = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']

and I just want the dates. I have a regex that looks like this:

r'\b(\d+/\d+/\d{4})\b'

but I don't really know how to apply it to a list. Or maybe it can be done some other way.

Any help would be really appreciated.

Upvotes: 1

Views: 90

Answers (3)

Moinuddin Quadri

Reputation: 48067

You can achieve this by using re.match().

Note: list is the name of a built-in type in Python (not a reserved keyword). You should not use it as a variable name, because doing so shadows the built-in.

import re
str_list = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']

# Iterate over a copy, list(str_list), so that unmatched strings
# can be removed from the original list safely during the loop
for s in list(str_list):
    if not re.match(r'\b(\d+/\d+/\d{4})\b', s):
        str_list.remove(s)

Or, if you also want to keep the original list, you can use a list comprehension:

import re
str_list = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']
new_list = [s for s in str_list if re.match(r'\b(\d+/\d+/\d{4})\b', s)]
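
As a rough illustration of why the loop above iterates over a copy, here is a minimal sketch. The function names and the sample data are made up for this example; only the regex comes from the question. Removing items from a list while iterating over that same list makes the iterator skip the element that slides into the removed slot, so some unmatched strings can survive.

```python
import re

DATE_RE = r'\b(\d+/\d+/\d{4})\b'  # pattern from the question

def remove_non_dates_buggy(items):
    # Mutates `items` while iterating over it: after each removal the
    # next element shifts left and is never checked.
    for s in items:
        if not re.match(DATE_RE, s):
            items.remove(s)
    return items

def remove_non_dates_safe(items):
    # Iterates over a copy, so removals from `items` cannot skip anything.
    for s in list(items):
        if not re.match(DATE_RE, s):
            items.remove(s)
    return items

# Hypothetical sample with two adjacent non-date strings.
sample = ['foo', 'bar', '1/4/2015', 'baz', '2/5/2016']
buggy_result = remove_non_dates_buggy(list(sample))
safe_result = remove_non_dates_safe(list(sample))
print(buggy_result)  # 'bar' is still there
print(safe_result)
```

The list comprehension avoids the problem entirely because it builds a new list instead of mutating the old one.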

Upvotes: 1

zyxue

Reputation: 8840

If the list is long, compiling the pattern first can give better performance:

import re

# list is a built-in name in Python (not a keyword), so to use it as a
# variable name, append an underscore. PEP 8
# (https://www.python.org/dev/peps/pep-0008/) describes
# single_trailing_underscore_ as a convention for avoiding conflicts with
# Python keywords; it works just as well for built-in names.
list_ = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']

date_pattern = re.compile(r'\b(\d+/\d+/\d{4})\b')

# In Python 3, filter() returns an iterator, so wrap it in list() to print it
print(list(filter(date_pattern.match, list_)))
# equivalent to
# print([i for i in list_ if date_pattern.match(i)])
# prints ['1/4/2015', '1/4/2015', '1/4/2015']
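
For reuse, the compiled-pattern approach could be wrapped in a small helper, sketched below for Python 3. The names DATE_PATTERN and keep_dates are hypothetical and not part of the original answer. Note that filter() is lazy in Python 3, so the helper materializes the result with list().

```python
import re

# Compiled once at import time; the pattern itself is from the question.
DATE_PATTERN = re.compile(r'\b(\d+/\d+/\d{4})\b')

def keep_dates(strings):
    """Return the strings that begin with a date-like token."""
    # filter() returns an iterator in Python 3, so convert it to a list.
    return list(filter(DATE_PATTERN.match, strings))

list_ = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']
dates = keep_dates(list_)
print(dates)
```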

Upvotes: 3

Patrick Maupin

Reputation: 8127

Very simple. Just use re.match:

>>> import re
>>> mylist = ['Julio Cesar por inhumana (?)', '1/4/2015', '1/4/2015', '1/4/2015']
>>> dates = [x for x in mylist if re.match(r'\b(\d+/\d+/\d{4})\b', x)]
>>> dates
['1/4/2015', '1/4/2015', '1/4/2015']

re.match only matches at the start of the string, so it's what you want for this case. Also, I wouldn't name a list "list" -- because that's the name of the built-in list class, you could hurt yourself later if you try to do list(some_iterable). Best not to get in that habit.
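
To make the naming problem concrete, here is a small sketch of what can go wrong. The function shadowing_demo is made up for illustration; the shadowing is confined to the function so it does not break anything else.

```python
def shadowing_demo():
    # Rebinding the name `list` hides the built-in class in this scope.
    list = ['a', 'b']
    try:
        # This now tries to call the list *object*, not the built-in class.
        list('abc')
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)
```

The call fails with a TypeError because the name now refers to a list instance, which is not callable.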

Finally, your regex will match any string that starts with a date. If you want to ensure that the entire string is your date, you could modify it slightly to r'(\d{1,2}/\d{1,2}/\d{4})$'. This will ensure that the month and day are each 1 or 2 digits and the year is exactly 4 digits.
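
A quick sketch of the difference between the two patterns, using made-up sample strings. It also shows re.fullmatch (available since Python 3.4) as one possible alternative to the trailing $, which is my addition rather than something from the question: $ still matches just before a trailing newline, while fullmatch does not.

```python
import re

original = re.compile(r'\b(\d+/\d+/\d{4})\b')        # pattern from the question
anchored = re.compile(r'(\d{1,2}/\d{1,2}/\d{4})$')   # stricter variant
strict = re.compile(r'\d{1,2}/\d{1,2}/\d{4}')        # used with fullmatch

# Hypothetical inputs: a clean date, a date followed by text,
# and a "date" with a three-digit first field.
samples = ['1/4/2015', '1/4/2015 and more text', '123/4/2015']

loose_hits = [s for s in samples if original.match(s)]
anchored_hits = [s for s in samples if anchored.match(s)]
full_hits = [s for s in samples if strict.fullmatch(s)]

# A trailing newline is accepted by `$` but rejected by fullmatch.
newline_anchored = bool(anchored.match('1/4/2015\n'))
newline_full = bool(strict.fullmatch('1/4/2015\n'))

print(loose_hits)
print(anchored_hits)
print(full_hits)
```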

Upvotes: 6
