Reputation: 5
Good day, I'm working on a project and creating a dict out of the found entities.
Maybe somebody can help me. I would appreciate learning more about it.
I was thinking of counting the entities using Counter.
Greets!
Upvotes: 0
Views: 187
Reputation: 821
If I understand you correctly, you can achieve this by making some additions to your code.
All you have to do is use a Counter on your perlist and loclist, and store the results in a dict.
...
from collections import Counter  # needed for the counting below

final_dict = {}  # stores the desired final output in a single dict
for filepath in files:
    with open(filepath, 'r', encoding='UTF8') as file_to_read:
        some_text = file_to_read.read()
    base_name = os.path.basename(filepath)
    print(base_name)
    doc = nlp(some_text)
    perlist = []
    loclist = []
    for ent in doc.ents:
        if ent.label_ == "PER":
            perlist.append(str(ent))
        elif ent.label_ == "LOC":
            loclist.append(str(ent))

    # Count the number of PER/LOC entities and store in final_dict
    final_list = []  # {"1.txt": final_list}

    # Count PER entities
    c = Counter(perlist)
    for per, count in c.most_common():
        final_list.append({
            'name': per,
            'type': 'PER',
            'frequency': count
        })

    # Count LOC entities
    c = Counter(loclist)
    for loc, count in c.most_common():
        final_list.append({
            'name': loc,
            'type': 'LOC',
            'frequency': count
        })

    # Store the list of results in final_dict.
    # e.g. final_dict['1.txt'] = [{'name': 'Englishmen', 'type': 'PER', 'frequency': 1}, ...]
    final_dict[base_name] = final_list

print('Final result', final_dict)
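If you want to try the counting step on its own, without loading a spaCy model, here is a minimal sketch. The helper name count_entities and the sample data are made up for illustration and are not part of your code. It assumes you have already reduced each entity to a (text, label) pair. Note one difference from the loop above: it counts PER and LOC in a single Counter, so the result is ordered by frequency overall rather than all PER entries first and then all LOC entries.

```python
from collections import Counter


def count_entities(pairs, wanted=("PER", "LOC")):
    """Count (text, label) pairs and return a list of result dicts.

    `pairs` is any iterable of (entity_text, entity_label) tuples.
    Hypothetical helper, written only to illustrate the Counter step.
    """
    counts = Counter(
        (text, label) for text, label in pairs if label in wanted
    )
    # most_common() yields ((text, label), frequency), highest count first
    return [
        {"name": text, "type": label, "frequency": freq}
        for (text, label), freq in counts.most_common()
    ]


# Made-up sample data standing in for doc.ents
sample = [("Anna", "PER"), ("Berlin", "LOC"), ("Anna", "PER"), ("ACME", "ORG")]
result = count_entities(sample)
print(result)
```

With real spaCy output you could feed it something like ((ent.text, ent.label_) for ent in doc.ents) and store the returned list under final_dict[base_name].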
Upvotes: 0