Getting second element of a tuple (in a list of tuples) as a string

Question

I have an output that is a list of tuples. It looks like this:

annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"), 
        (415L, u'[He very seldom has them in this show or his movies]')…

I need only the second part of each tuple, so that I can apply ‘split’ to it and get each word in the sentence separately.

At this point, I’m not able to isolate the second part of the tuple (the text).

This is my code:

def scope_match(annot1):
    scope = annot1[1:]
    scope_string = ''.join(scope)
    scope_set = set(scope_string.split(' '))

But I get:

TypeError: sequence item 0: expected string, tuple found

I tried to use annot1[1] but it gives me the second index of the text instead of the second element of the tuple.

Mohammad Yusuf · Accepted Answer

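The slice annot1[1:] is still a list of tuples, which is why join complains. Index each tuple with [1] instead; you can do something like this with a list comprehension: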

annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"), 
        (415L, u'[He very seldom has them in this show or his movies]')]
print [a[1].strip('[]').encode('utf-8').split() for a in annot1]

Output:

[["It's", 'very', 'seldom', 'that', "you're", 'blessed', 'to', 'find', 'your', 'equal'], ['He', 'very', 'seldom', 'has', 'them', 'in', 'this', 'show', 'or', 'his', 'movies']]
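For anyone on Python 3, here is a rough sketch of the same idea wrapped in a function. It assumes Python 3, where integers take no L suffix and strings are already Unicode, so the encode step is dropped. The function name second_elements_as_words is made up for illustration and is not part of the original code.

```python
# Sample data rewritten for Python 3: no `L` suffix, no `u` prefix needed.
annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]


def second_elements_as_words(annotations):
    """Return one list of words per tuple, taken from each tuple's second element.

    Hypothetical helper name, used only for this example.
    """
    # a[1] is the text; strip('[]') removes the surrounding brackets,
    # and split() with no argument breaks on any run of whitespace.
    return [a[1].strip('[]').split() for a in annotations]


words = second_elements_as_words(annot1)
print(words[0])
```

Calling split() without an argument, rather than split(' ') as in the question, avoids empty strings when the text contains consecutive spaces.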

You can calculate the intersection of the words in the strings at corresponding positions in annot1 and annot2 like this:

for x,y in zip(annot1,annot2):
    print set(x[1].strip('[]').encode('utf-8').split()).intersection(y[1].strip('[]').encode('utf-8').split())
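A Python 3 sketch of the pairwise intersection might look like the following. The contents of annot2 are not shown in the question, so the annot2 below is invented sample data, and word_set is a hypothetical helper name; both are assumptions made only so the example runs on its own.

```python
def word_set(text):
    """Strip the surrounding brackets and return the set of words in text."""
    return set(text.strip('[]').split())


annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]
# Hypothetical second annotation list, invented for this example only.
annot2 = [
    (402, "[very seldom that you're blessed]"),
    (415, "[in this show or his movies]"),
]

# zip pairs the tuples position by position; & is set intersection.
common = [word_set(x[1]) & word_set(y[1]) for x, y in zip(annot1, annot2)]

for shared in common:
    # Sets are unordered, so sort before printing for a stable display.
    print(sorted(shared))
```

Note that zip stops at the shorter of the two lists, so any unmatched trailing tuples are silently ignored.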
