Jino

Reputation: 53

Is there any way to pos_tag values in a list inside a dictionary in Python NLTK?

I have a Python dictionary that contains lists of values. When I try to pos_tag the values inside each list, it raises an error. Is there any way to fix it?

import nltk
RuleSet = {1: ['drafts', 'duly', 'signed', 'beneficiary', 'drawn', 'issuing', 'bank', 'quoting', 'lc', ''], 2: ['date', ''], 3: ['signed', 'commerical', 'invoices', 'quadruplicate', 'gross', 'cifvalue', 'goods', '']}
for key in RuleSet:
    value = RuleSet[key]
    Tagged = nltk.pos_tag(value)
    print(Tagged)

IndexError: string index out of range

Upvotes: 2

Views: 387

Answers (1)

Wiktor Stribiżew

Reputation: 626689

You can use lists; you just cannot have an empty string among the items. See the traceback:

File "C:\Users\wstribizew\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nltk\tag\perceptron.py", line 240, in normalize
    elif word[0].isdigit():

There is no string length check before word[0].isdigit() in perceptron.py because nltk.pos_tag is usually called on the output of nltk.word_tokenize, which does not produce empty items when tokenizing a string. Indexing an empty string with [0] is what raises the IndexError.
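To make the failure concrete, here is a minimal sketch that reproduces the same IndexError without NLTK. The two function names are hypothetical and only mimic the unguarded word[0] access shown in the traceback; this is not NLTK's actual code.

```python
def first_char_is_digit(word):
    # Hypothetical stand-in for the unguarded check: word[0] fails on ''.
    return word[0].isdigit()


def safe_first_char_is_digit(word):
    # Guarded variant: an empty string short-circuits to False.
    return bool(word) and word[0].isdigit()


try:
    first_char_is_digit('')
except IndexError as exc:
    error_message = str(exc)  # same message as in the question
```

The guarded variant shows what a length check would look like, but filtering the input before tagging is the simpler fix on the caller's side.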

Here is the working snippet:

import nltk
RuleSet = {1: ['drafts', 'duly', 'signed', 'beneficiary', 'drawn', 'issuing', 'bank', 'quoting', 'lc', ''], 2: ['date', ''], 3: ['signed', 'commerical', 'invoices', 'quadruplicate', 'gross', 'cifvalue', 'goods', '']}
for key in RuleSet:
    value = list(filter(None, RuleSet[key])) # Get rid of empty items
    Tagged = nltk.pos_tag(value)
    print(Tagged)

Output:

[('drafts', 'NNS'), ('duly', 'RB'), ('signed', 'VBD'), ('beneficiary', 'JJ'), ('drawn', 'NN'), ('issuing', 'VBG'), ('bank', 'NN'), ('quoting', 'VBG'), ('lc', 'NN')]
[('date', 'NN')]
[('signed', 'VBN'), ('commerical', 'JJ'), ('invoices', 'NNS'), ('quadruplicate', 'VBP'), ('gross', 'JJ'), ('cifvalue', 'NN'), ('goods', 'NNS')]
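If the tagged results need to be kept rather than printed, the same filtering idea can be wrapped in a small helper. This is only a sketch: clean_tokens and tag_rule_set are hypothetical names, not part of NLTK, and the tagger is passed in as a parameter so that nltk.pos_tag (or any compatible callable) can be supplied by the caller. It also drops whitespace-only strings, which is an assumption beyond what the question requires.

```python
def clean_tokens(tokens):
    """Drop empty and whitespace-only strings, keeping the original order."""
    return [tok for tok in tokens if tok and tok.strip()]


def tag_rule_set(rule_set, tagger):
    """Return {key: tagger(cleaned tokens)}, skipping lists that end up empty."""
    tagged = {}
    for key, tokens in rule_set.items():
        cleaned = clean_tokens(tokens)
        if cleaned:  # nothing to tag if every item was empty
            tagged[key] = tagger(cleaned)
    return tagged


# Intended usage, assuming nltk and its tagger data are installed:
#     import nltk
#     tagged = tag_rule_set(RuleSet, nltk.pos_tag)
```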

Upvotes: 1
