Reputation: 2158
I have a list of lists of tuples that I want to merge. The code below combines consecutive words that share a tag when a single list is passed in as 'classified_text'. How do I extend this to a nested list of tuples? I tried adding another for loop and calling append, but I get a different error. Is there a simple way to do this? Thanks!
Input Text 1 - Working:
classified_text = [('John', 'PERSON'), ('Smith', 'PERSON'),('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')] # Single list
Output Text 1 - Working:
[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC')]
Input Text 2 - Not Working: Nested list with tuples
classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')], [('some', 'O'), ('text', 'O'), ('here', 'O')], [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]
Code:
from itertools import groupby
entity_extracted_words = []
for tag, chunk in groupby(classified_text, lambda x:x[1]):
    if tag != "O":
        info_ner = "%-12s"%tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)
print('entity_extracted_words:\n', entity_extracted_words)
Out Text 2 - Trying to get this result:
[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('ORGANIZATION', 'University of CA')]
Error: TypeError: not all arguments converted during string formatting
Upvotes: 2
Views: 238
Reputation: 2036
Try something like this: loop over the sublists, join the non-'O' words of each sublist into a single string, and append the result to newlist.
classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
[('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
[('some', 'O'), ('text', 'O'), ('here', 'O')],
[('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]
newlist = []
for sublist in classified_text:
    combined = []
    for chunk, tag in sublist:
        # Skip words that are not part of a named entity
        if tag == 'O':
            continue
        combined_tag = tag
        combined.append(chunk)
    # Append tag and string to list
    if combined:
        # If you want the tag space-padded as in your example, you can use
        # the string's ljust method
        newlist.append((combined_tag.ljust(12), ' '.join(combined)))
print(newlist)
#[('PERSON      ', 'John Smith'),
# ('ORGANIZATION', 'University of ABC'),
# ('ORGANIZATION', 'University of CA')]
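Note that the loop above files every non-'O' word of a sublist under one tag, which is fine for the sample data. If a sublist could contain more than one entity type, one option is to apply the groupby from the question to each sublist separately. This is only a sketch under that assumption; the helper name merge_entities and its skip_tag and width parameters are made up for illustration.

```python
from itertools import groupby

def merge_entities(nested, skip_tag="O", width=12):
    """Merge consecutive same-tag words within each sublist (hypothetical helper)."""
    merged = []
    for sublist in nested:
        # Group consecutive (word, tag) pairs of this sublist by their tag.
        for tag, chunk in groupby(sublist, key=lambda pair: pair[1]):
            if tag != skip_tag:
                words = " ".join(word for word, _ in chunk)
                # Left-justify the tag to `width`, like "%-12s" in the question.
                merged.append((tag.ljust(width), words))
    return merged

classified_text = [
    [('John', 'PERSON'), ('Smith', 'PERSON')],
    [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
    [('some', 'O'), ('text', 'O'), ('here', 'O')],
    [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'),
     ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')],
]

print(merge_entities(classified_text))
```

Because grouping restarts for every sublist, entities never merge across sublist boundaries.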
Upvotes: 2
Reputation: 5429
You could first flatten your list of lists into just a list:
flat_list = [item for sublist in classified_text for item in sublist]
And that flat list should work with your original code.
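For context, the TypeError most likely arises because groupby on the nested list hands whole sublists to the key function, so x[1] is a tuple such as ('Smith', 'PERSON'), and "%-12s" % tag then receives two values for a single placeholder. Putting the flattening step together with the code from the question might look like the sketch below, which assumes the nesting is exactly one level deep.

```python
from itertools import groupby

classified_text = [
    [('John', 'PERSON'), ('Smith', 'PERSON')],
    [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
    [('some', 'O'), ('text', 'O'), ('here', 'O')],
    [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'),
     ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')],
]

# Flatten one level: a list of lists of tuples becomes a list of tuples.
flat_list = [item for sublist in classified_text for item in sublist]

entity_extracted_words = []
# Each element is now a (word, tag) tuple, so x[1] is the tag string again.
for tag, chunk in groupby(flat_list, lambda x: x[1]):
    if tag != "O":
        info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print(entity_extracted_words)
```

One caveat: flattening discards the sublist boundaries, so if one sublist ends with the same tag that the next one starts with, the two runs are merged into a single entity.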
Upvotes: 0