Amritha Dilip

Reputation: 733

Regex: How to remove repetition of the same string?

I'm trying to extract the year from a date. The dates come in formats like these:

"Nov.-Dec. 2010"
"Aug. 30 2011-Sept. 3 2011"
"21-21 Oct. 1997"


My regular expression is:
import re
q = re.compile(r"\d\d\d\d")
a = q.findall(date)

So, obviously, the list has two items for a string like "Aug. 30 2011-Sept. 3 2011":

["2011","2011"]

I don't want the repetition. How do I avoid it?

Upvotes: 1

Views: 332

Answers (2)

MUY Belgium

Reputation: 2452

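Use the following function:

import re

def getUnique(date):
    # Match every run of four digits (the years)
    q = re.compile(r"\d\d\d\d")
    output = []
    for x in q.findall(date):
        # Keep only the first occurrence of each year, in order
        if x not in output:
            output.append(x)
    return output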

It's O(n^2) in the number of matches, though, because the "not in" test rescans the output list for each element of the input list.

See also the question "How to remove duplicates from Python list and keep order?"
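
As a sketch of the linear-time alternative that the linked question covers: dict.fromkeys keeps the first occurrence of each key in insertion order (guaranteed for dicts since Python 3.7), so it removes duplicates without the nested scan. The name get_unique_years is made up for illustration.

```python
import re

# Four consecutive digits, same as the pattern in the question
YEAR = re.compile(r"\d{4}")

def get_unique_years(date):
    # dict.fromkeys keeps the first occurrence of each year, in order
    return list(dict.fromkeys(YEAR.findall(date)))

print(get_unique_years("Aug. 30 2011-Sept. 3 2011"))  # ['2011']
```

For the handful of years in a single date string the speed difference is negligible, so this is mostly a matter of brevity.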

Upvotes: 0

Lev Levitsky

Reputation: 65781

You could use a backreference in the regex (see the syntax here):

(\d{4}).*\1
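
A rough sketch of how the backreference pattern might be used. One caveat: (\d{4}).*\1 only matches when the same year occurs twice, so a string such as "Nov.-Dec. 2010" needs a fallback to the plain four-digit pattern. The function name find_year and the fallback are assumptions for illustration, not part of the suggestion above.

```python
import re

# A four-digit year that appears again later in the string
REPEATED_YEAR = re.compile(r"(\d{4}).*\1")
# Fallback for strings that mention the year only once
ANY_YEAR = re.compile(r"\d{4}")

def find_year(date):
    m = REPEATED_YEAR.search(date)
    if m:
        # group(1) is the captured year, reported once
        return m.group(1)
    m = ANY_YEAR.search(date)
    # Returns the first year found, or None if there is none
    return m.group(0) if m else None
```

For a range that spans two different years, this sketch returns only the first one.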

Or you could use the current regex and put this logic in the python code:

if len(a) > 1 and a[0] == a[1]:
    ...

Upvotes: 1
