Reputation: 21
I'm relatively new to Python NLP and I'm trying to process a CSV file with spaCy. I can load the file just fine with pandas, but when I attempt to process it with spaCy's nlp function, the interpreter raises an error roughly 5% of the way through the file's contents.
Code block follows:
import pandas as pd
df = pd.read_csv('./reviews.washington.dc.csv')
import spacy
nlp = spacy.load('en')
for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1, n_threads=4):
    print(parsed_doc.text)
I've also tried:
df['parsed'] = df['comments'].apply(nlp)
with the same result.
The traceback I'm receiving is:
Traceback (most recent call last):
File "/Users/john/Downloads/spacy_load.py", line 11, in <module>
for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1,
n_threads=4):
File "/usr/local/lib/python3.6/site-packages/spacy/language.py",
line 352, in pipe for doc in stream:
File "spacy/syntax/parser.pyx", line 239, in pipe
(spacy/syntax/parser.cpp:8912)
File "spacy/matcher.pyx", line 465, in pipe (spacy/matcher.cpp:9904)
File "spacy/syntax/parser.pyx", line 239, in pipe (spacy/syntax/parser.cpp:8912)
File "spacy/tagger.pyx", line 231, in pipe (spacy/tagger.cpp:6548)
File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 345,
in <genexpr> stream = (self.make_doc(text) for text in texts)
File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 293,
in <lambda> self.make_doc = lambda text: self.tokenizer(text)
TypeError: Argument 'string' has incorrect type (expected str, got float)
Can anyone shed some light on why this is happening, and how I might work around it? I've tried various workarounds from the site to no avail; try/except blocks have had no effect either.
Upvotes: 2
Views: 2975
Reputation: 2439
I just ran into a very similar error to the one you received.
>>> c.add_texts(df.DetailedDescription.astype('object'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Anonymous\AppData\Local\Programs\Python\Python36\lib\site
-packages\textacy\corpus.py", line 297, in add_texts
for i, spacy_doc in enumerate(spacy_docs):
File "C:\Users\Anonymous\AppData\Local\Programs\Python\Python36\lib\site
-packages\spacy\language.py", line 554, in pipe
for doc in docs:
File "nn_parser.pyx", line 369, in pipe
File "cytoolz/itertoolz.pyx", line 1046, in cytoolz.itertoolz.partition_all.__
next__ (cytoolz/itertoolz.c:14538)
for item in self.iterseq:
File "nn_parser.pyx", line 369, in pipe
File "cytoolz/itertoolz.pyx", line 1046, in cytoolz.itertoolz.partition_all.__
next__ (cytoolz/itertoolz.c:14538)
for item in self.iterseq:
File "pipeline.pyx", line 395, in pipe
File "cytoolz/itertoolz.pyx", line 1046, in cytoolz.itertoolz.partition_all.__
next__ (cytoolz/itertoolz.c:14538)
for item in self.iterseq:
File "C:\Users\Anonymous\AppData\Local\Programs\Python\Python36\lib\site
-packages\spacy\language.py", line 534, in <genexpr>
docs = (self.make_doc(text) for text in texts)
File "C:\Users\Anonymous\AppData\Local\Programs\Python\Python36\lib\site
-packages\spacy\language.py", line 357, in make_doc
return self.tokenizer(text)
TypeError: Argument 'string' has incorrect type (expected str, got float)
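For context, here is where the float comes from (a minimal sketch with a stand-in frame, not my real data): pandas represents empty CSV cells as float NaN, which is exactly the type the tokenizer rejects.

```python
import pandas as pd
import numpy as np

# Stand-in for the real DetailedDescription column: one cell is empty.
df = pd.DataFrame({"DetailedDescription": ["first item", np.nan, "third item"]})

# The missing cell is a float NaN, not a string.
print(df["DetailedDescription"].map(type).tolist())
# [<class 'str'>, <class 'float'>, <class 'str'>]
```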
Finally, I happened across a solution: use the pandas DataFrame to cast the values to Unicode, retrieve them as a native array, and feed that into the add_texts method of the textacy Corpus object.
c = textacy.corpus.Corpus(lang='en_core_web_lg')
c.add_texts(df.DetailedDescription.astype('unicode').values)
Doing this allowed me to add all texts to my corpus, even though I had already tried to force-load a Unicode-compliant file beforehand (snippet included below in case it helps others).
import codecs
import re
from io import StringIO
import pandas as pd

# Strip control and non-ASCII bytes (except newlines) before parsing.
with codecs.open(r'Base Data\Base Data.csv', 'r', encoding='utf-8', errors='replace') as base_data:
    df = pd.read_csv(StringIO(re.sub(r'(?!\n)[\x00-\x1F\x80-\xFF]', '', base_data.read())),
                     dtype={"DetailedDescription": object, "OtherDescription": object},
                     na_values=[''])
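The same idea works without textacy, in case anyone hits this with plain pandas + spaCy: casting the column to str turns NaN into the literal string 'nan', or you can drop the missing rows entirely before piping into nlp. A sketch with a made-up frame:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({"comments": ["great location", np.nan, "clean room"]})

# Option 1: cast everything to str (the NaN becomes the string 'nan').
texts = df["comments"].astype(str).values
print(all(isinstance(t, str) for t in texts))  # True

# Option 2: drop the missing rows so no 'nan' strings reach the pipeline.
clean = df["comments"].dropna().tolist()
print(len(clean))  # 2
```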
Upvotes: 1