For quantitative analysis, I would like to count how many entities of each type were recognized in a set of descriptions.
So far so good. Now I want to increment a per-type counter every time an entity of that type is recognized in the processed line/row, and print the totals afterward,
e.g.:
PERSON: 34,
ORG: 10,
PRODUCT: 23,...
print('RAWDATASIZE:',rawdata["Activity.Description"].size)
print('Summary of entities recognized:')
count = {}
for index, row in validation_rawdata.head(100).iterrows():
    line = row['Activity.Description']
    if not (line is None):
        doc = nlp(str(line))
        entities = {}
        entities_text = []
        for ent in doc.ents:
            count[ent.label_] =+ 1
print(count)
The current output looks like this:
RAWDATASIZE: 233291
Summary of entities recognized:
{'PERSON': 1, 'DATE': 1, 'GPE': 1, 'SHS_PRODUCT': 1, 'ORG': 1, 'NORP': 1, 'CARDINAL': 1, 'TIME': 1, 'LOC': 1, 'WORK_OF_ART': 1}
So it seems like it's resetting the count after each iteration. How can I change the code to keep counting?
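For what it's worth, the symptom matches how Python parses `count[ent.label_] =+ 1`: it reads as `count[ent.label_] = +1`, a plain assignment of positive one, so every label ends up at 1 no matter how often it occurs. Below is a minimal sketch of the accumulating version using `collections.Counter`. To keep it runnable without spaCy or the dataframe, `labels_per_row` is a hypothetical stand-in for the `ent.label_` values of each processed row, and `count_labels` is an illustrative helper name, not part of any library.

```python
from collections import Counter

# Hypothetical stand-in for the labels spaCy would yield per row
# (i.e. `ent.label_ for ent in doc.ents`); not real data.
labels_per_row = [
    ["PERSON", "ORG", "PERSON"],
    ["PRODUCT", "PERSON"],
    [],
]

def count_labels(rows):
    """Accumulate label counts across all rows."""
    count = Counter()  # missing keys start at 0, so `+=` is safe
    for labels in rows:
        for label in labels:
            count[label] += 1  # `+=` increments; `=+ 1` would assign +1
    return count

count = count_labels(labels_per_row)
print(dict(count))
```

In the original loop, the same idea would presumably be `count[ent.label_] += 1` with `count = Counter()`, or `count.update(ent.label_ for ent in doc.ents)` once per document. With a plain `dict`, `+=` raises `KeyError` on the first occurrence of a label, so `count[ent.label_] = count.get(ent.label_, 0) + 1` is the equivalent there.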