Reputation: 51
Here is a list of (word, tag) tuples, where each tag gives the word's type:
t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]
I want to walk the entire list and, whenever consecutive tuples are tagged as ORGANIZATION, concatenate their words with a space so that each run becomes a single name.
Expected output:
Wall Mart, Thomas Cook
org_list = ''
for x in t:
    if x[1] == 'ORGANIZATION':
        org_list = org_list + ' | ' + x[0]
With this I was only able to extract the individual words; I could not find a way to concatenate consecutive words tagged as ORGANIZATION into one name.
I also referred to this related question: Concatenate elements of a tuple in a list in python
Upvotes: 1
Views: 172
Reputation: 88236
Given that there will always be at least one 'OTHER' between two separate runs of 'ORGANIZATION' tags, one approach is to use itertools.groupby to group consecutive tuples by their second element, and str.join their first items whenever the grouping key is 'ORGANIZATION':
from itertools import groupby
from operator import itemgetter

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),
     ('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),
     ('Cook','ORGANIZATION')]

# Group consecutive tuples by tag, then join the words of each ORGANIZATION group
[' '.join(word for word, _ in group)
 for tag, group in groupby(t, key=itemgetter(1)) if tag == 'ORGANIZATION']
# ['Wall Mart', 'Thomas Cook']
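The question asks for a single comma-separated string rather than a list, so here is a minimal sketch that wraps the groupby idea in a function and joins the result. The function name join_tagged_runs and its tag parameter are my own illustrative choices, not something from the question.

```python
from itertools import groupby
from operator import itemgetter


def join_tagged_runs(pairs, tag='ORGANIZATION'):
    """Join the words of each consecutive run of `tag` with a space.

    `join_tagged_runs` is a hypothetical helper name used for illustration.
    """
    return [' '.join(word for word, _ in run)
            for key, run in groupby(pairs, key=itemgetter(1))
            if key == tag]


t = [('The', 'OTHER'), ('name', 'OTHER'), ('is', 'OTHER'),
     ('Wall', 'ORGANIZATION'), ('Mart', 'ORGANIZATION'), ('and', 'OTHER'),
     ('Thomas', 'ORGANIZATION'), ('Cook', 'ORGANIZATION')]

names = join_tagged_runs(t)
print(', '.join(names))
# Wall Mart, Thomas Cook
```

Because groupby only merges adjacent items with the same key, two different organizations that sit next to each other with no 'OTHER' in between would still be merged into one name.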
If you prefer a for loop without any imports, you can do the following. Note that it only works when every organization consists of exactly two consecutive tagged words:
f = False
out = []
for i in t:
    if i[1] == 'ORGANIZATION':
        if not f:
            out.append(i[0])
            f = True
        else:
            out[-1] += f' {i[0]}'
            f = False
print(out)
# ['Wall Mart', 'Thomas Cook']
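If the names can be longer or shorter than two words, one possible variation is to remember the previous tag instead of toggling a flag. This is only a sketch: the function name merge_runs and the sample data are made up for illustration.

```python
def merge_runs(pairs, tag='ORGANIZATION'):
    """Merge consecutive words carrying `tag` into single space-separated names.

    `merge_runs` is a hypothetical name; runs may be of any length.
    """
    out = []
    prev_tag = None
    for word, current_tag in pairs:
        if current_tag == tag:
            if prev_tag == tag:
                # Still inside the same run: extend the last name
                out[-1] += f' {word}'
            else:
                # A new run starts here
                out.append(word)
        prev_tag = current_tag
    return out


# Made-up sample with a three-word name and a one-word name
sample = [('Bank', 'ORGANIZATION'), ('of', 'ORGANIZATION'),
          ('America', 'ORGANIZATION'), ('and', 'OTHER'),
          ('Google', 'ORGANIZATION')]
print(merge_runs(sample))
# ['Bank of America', 'Google']
```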
Upvotes: 2
Reputation: 17814
You can use the following solution:
t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]
result = [[]]
for i, j in t:
    if j == 'ORGANIZATION':
        result[-1].append(i)
    elif result[-1]:
        result.append([])
result = [' '.join(i) for i in result if i]
# ['Wall Mart', 'Thomas Cook']
Upvotes: 1