Reputation: 610
I am new to Python.
I have a list whose entries use "::" as a separator. It looks like this:
1::Erin Burkovich (2000)::Drama
2::Assassins (1995)::Thriller
I want to split each entry by "::", extract the year from the title, and append it to the end of the line. Each movie has its own index.
The desired list looks like this:
1::Erin Burkovich:Drama::2000
2::Assassins:Thriller:1995
I have the code below:
for i in movies:
    movie_id, movie_title, movie_genre = i.split("::")
    movie_year = ((movie_title.split(" "))[-1]).replace("(", "").replace(")", "")
    movies.insert(-1, movie_year)
But it doesn't work at all.
Any help?
Thanks in advance.
Upvotes: 1
Views: 70
Reputation: 1644
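There are several issues. split() returns a list rather than a tuple; a list can still be unpacked into three names, but only when it has exactly three parts. The bigger problem is that the loop inserts into movies while iterating over it and never rewrites the existing entries, so the next item it visits is the inserted year, which does not split into three parts.
I've rewritten the code based on what you wanted; hope it helps.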
movies = ["1::Erin Burkovich (2000)::Drama", "2::Assassins (1995)::Thriller"]
for i in range(len(movies)):
    # Split the entry into id, title (with year), and genre
    movie_details = movies[i].split("::")
    print(movie_details)
    movie_id = movie_details[0]
    movie_title = movie_details[1]
    movie_genre = movie_details[2]
    # The last word of the title is the year in parentheses
    movie_title_parts = movie_title.split(" ")
    movie_year = movie_title_parts[-1].replace("(", "").replace(")", "")
    del movie_title_parts[-1]
    movie_title = " ".join(movie_title_parts)
    print(movie_title + ", " + movie_year)
    # Overwrite the entry in place instead of inserting a new one
    movies[i] = movie_id + "::" + movie_title + "::" + movie_genre + "::" + movie_year
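For reference, here is a small reproduction of what the loop from the question does with the two sample entries. It is only a sketch: the try/except wrapper and the error variable are additions for illustration, so that the script finishes and the state of the list can be inspected afterwards.

```python
movies = ["1::Erin Burkovich (2000)::Drama", "2::Assassins (1995)::Thriller"]

error = None  # illustration only: holds the exception raised by the loop, if any
try:
    for i in movies:
        movie_id, movie_title, movie_genre = i.split("::")
        movie_year = movie_title.split(" ")[-1].replace("(", "").replace(")", "")
        # insert(-1, ...) places the year *before* the last element,
        # so the next item the loop visits is the year string itself
        movies.insert(-1, movie_year)
except ValueError as exc:
    # "2000".split("::") gives a single-item list, which cannot be unpacked into three names
    error = exc

print(movies)
print(type(error).__name__)
```

The list ends up with the bare year sitting between the two untouched entries, which is why rewriting each entry (or building a new list) is needed.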
Upvotes: 0
Reputation: 480
For Python 3.6, check this out:
a = """1::Erin Burkovich (2000)::Drama
2::Assassins (1995)::Thriller"""
a = a.split("\n")
c = []
for b in range(len(a)):
    g = []
    d = a[b].split("::")
    e = d[1].split(" (")[1].split(")")[0]
    f = d[1].split(" (")[0]
    g.append(d[0])
    g.append(f)
    g.append(d[2])
    g.append(e)
    h = "::".join(g)
    c.append(h)
print("\n".join(c))
Output:
1::Erin Burkovich::Drama::2000
2::Assassins::Thriller::1995
Upvotes: 0
Reputation: 1511
Another (probably less elegant) way:
for i in movies:
    split_list = i.split("::")
    movie_id = split_list[0]
    # Splitting on '(' gives ['Title ', 'YYYY)']
    movie_title = split_list[1].split('(')
    movie_genre = split_list[2]
    print(movie_id + '::' + movie_title[0].strip() + "::" + movie_genre + "::" + movie_title[1].strip(')'))
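One caveat with splitting on "(": it assumes the title contains only a single opening parenthesis. A possible variation, sketched below, cuts only at the last " (" by using rsplit with maxsplit=1. The function name move_year and the third sample entry are made up for illustration and do not come from the question.

```python
def move_year(line):
    """Return the line with the trailing '(YYYY)' moved to the end (hypothetical helper)."""
    movie_id, title, genre = line.split("::")
    # rsplit with maxsplit=1 cuts only at the last " (", so earlier parentheses stay in the name
    name, year = title.rsplit(" (", 1)
    return "::".join([movie_id, name, genre, year.rstrip(")")])

print(move_year("2::Assassins (1995)::Thriller"))
# Hypothetical entry with an extra parenthesized part in the title
print(move_year("3::Some Film (Remake) (2001)::Drama"))
```

This still assumes every title ends with a year in parentheses; an entry without one would raise a ValueError at the unpacking step.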
Upvotes: 0
Reputation: 95968
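You're modifying the list while iterating over it. Each insert puts a new item in front of the last element, so the loop keeps finding more items to visit; here the next item it reaches is the year string you just inserted, which cannot be unpacked into three parts.
You should create a new list with the result.
Also, you can extract the year in a much easier way:
import re
movie_year = re.findall(r'\d+', '(2000)')[0]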
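Putting both suggestions together, a minimal sketch might look like the following. It assumes each title ends with a four-digit year in parentheses and that "::" is the output separator throughout; add_year is a name chosen here for illustration, and leaving lines without a year unchanged is likewise an assumption.

```python
import re

movies = ["1::Erin Burkovich (2000)::Drama", "2::Assassins (1995)::Thriller"]

def add_year(line):
    """Move a trailing '(YYYY)' from the title to the end of the line."""
    movie_id, title, genre = line.split("::")
    match = re.search(r"\((\d{4})\)\s*$", title)
    if match is None:
        return line  # assumption: entries without a year are left untouched
    name = title[:match.start()].rstrip()
    return "::".join([movie_id, name, genre, match.group(1)])

# Build a new list instead of inserting into the one being iterated over
result = [add_year(m) for m in movies]
print("\n".join(result))
```

Because the results go into a separate list, the original movies list is never changed during the loop.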
Upvotes: 1
Reputation: 71451
Instead of splitting, you can use re.findall to grab every run of alphanumeric characters and whitespace, and then regroup:
import re
s = ['1::Erin Burkovich (2000)::Drama', '2::Assassins (1995)::Thriller']
# findall yields [id, 'title ', year, genre]; sub() drops the space left before the colon
new_data = [
    re.sub(r'\s(?=:)', '', "{}::{}:{}:{}".format(movie_id, name, genre, year))
    for movie_id, name, year, genre in (re.findall(r'[a-zA-Z0-9\s]+', i) for i in s)
]
Output:
['1::Erin Burkovich:Drama:2000', '2::Assassins:Thriller:1995']
Upvotes: 0