Malte

Reputation: 351

How can I count recognized entities per label?

For quantitative analysis, I would like to count how many entities of a specific type were recognized in a set of descriptions.

  1. I'm reading the Excel file
  2. checking its size
  3. iterating through the first 100 records

So far so good. Now I want to count every time an entity of a given type is recognized in the processed row, and print the results afterward:

e.g.:

PERSON: 34,
ORG: 10,
PRODUCT: 23,...
print('RAWDATASIZE:',rawdata["Activity.Description"].size)

print('Summary of entities recognized:')

count = {}

for index, row in validation_rawdata.head(100).iterrows():
    line = row['Activity.Description']
    if not (line is None):
        doc = nlp(str(line))
        entities = {}
        entities_text = []
        for ent in doc.ents:
            count[ent.label_] =+ 1
            
print(count)

The current output looks like this:

RAWDATASIZE: 233291
Summary of entities recognized:
{'PERSON': 1, 'DATE': 1, 'GPE': 1, 'SHS_PRODUCT': 1, 'ORG': 1, 'NORP': 1, 'CARDINAL': 1, 'TIME': 1, 'LOC': 1, 'WORK_OF_ART': 1}

So it seems like it's resetting the count after each iteration. How can I change the code to keep counting?

Upvotes: 0

Views: 97

Answers (1)

Lei Yang

Reputation: 4335

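There's a typo in your code: =+ 1 should be += 1. As written, count[ent.label_] =+ 1 is parsed as count[ent.label_] = +1, so it assigns 1 every time instead of incrementing. Note that with a plain dict, += 1 raises a KeyError the first time a label is seen, so either start from the current value with count[ent.label_] = count.get(ent.label_, 0) + 1, or make count a collections.Counter.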
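
A minimal sketch of the counting loop with the operator corrected, assuming count is a collections.Counter so that missing labels start at 0. To keep it runnable without spaCy or the Excel file, rows_of_labels is a hypothetical stand-in for what [ent.label_ for ent in doc.ents] would give for each row; it is not real model output.

```python
from collections import Counter

# Hypothetical stand-in for the labels spaCy would yield per row,
# i.e. [ent.label_ for ent in doc.ents]. Not real model output.
rows_of_labels = [
    ["PERSON", "ORG"],
    ["PERSON", "DATE", "PERSON"],
    [],  # a row in which no entities were recognized
]

count = Counter()
for labels in rows_of_labels:
    for label in labels:
        # "+=" increments; "=+ 1" would assign +1 on every pass.
        # A Counter returns 0 for unseen keys, so no KeyError here.
        count[label] += 1

print(dict(count))
```

In the original loop, the same change would amount to creating count = Counter() before the loop and using count[ent.label_] += 1 inside it.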

Upvotes: 1
