Shahzeb Naveed

Reputation: 52

'Pipeline' object is not subscriptable

I'm trying to run the following code, but I get a "'Pipeline' object is not subscriptable" error when I call pipe['count'].


```python
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
import numpy as np

corpus = ['this is the first document',
          'this document is the second document',
          'and this is the third one',
          'is this the first document']

vocabulary = ['this', 'document', 'first', 'is', 'second', 'the',
              'and', 'one']

pipe = Pipeline([('count', CountVectorizer(vocabulary=vocabulary)),
                 ('tfid', TfidfTransformer())]).fit(corpus)

pipe['count'].transform(corpus).toarray()
# array([[1, 1, 1, 1, 0, 1, 0, 0],
#        [1, 2, 0, 1, 1, 1, 0, 0],
#        [1, 0, 0, 1, 0, 1, 1, 1],
#        [1, 1, 1, 1, 0, 1, 0, 0]])

pipe['tfid'].idf_
# array([1.        , 1.22314355, 1.51082562, 1.        , 1.91629073,
#        1.        , 1.91629073, 1.91629073])

pipe.transform(corpus).shape
# (4, 8)
```

Upvotes: 3

Views: 3914

Answers (1)

thomaskolasa

Reputation: 186

Instead of pipe['count'], try pipe.named_steps['count']. To access the TF-IDF step, which is named 'tfid' in your pipeline, try pipe.named_steps['tfid'].
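For reference, here is a minimal sketch of that suggestion applied to the code from the question. It assumes scikit-learn is installed and reuses the question's corpus, vocabulary, and step names. The variable names counts, idf, and first_step are only illustrative. As far as I know, indexing a Pipeline directly by step name was only added in scikit-learn 0.21, so the error most likely points to an older installed version, but treat that as an assumption and check your version.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

corpus = ['this is the first document',
          'this document is the second document',
          'and this is the third one',
          'is this the first document']

vocabulary = ['this', 'document', 'first', 'is', 'second', 'the',
              'and', 'one']

pipe = Pipeline([('count', CountVectorizer(vocabulary=vocabulary)),
                 ('tfid', TfidfTransformer())]).fit(corpus)

# Look up each fitted step by the name given when building the pipeline.
counts = pipe.named_steps['count'].transform(corpus).toarray()
idf = pipe.named_steps['tfid'].idf_

# Alternative: pipe.steps is a list of (name, estimator) tuples,
# so a step can also be reached by position.
first_step = pipe.steps[0][1]

print(counts.shape)
print(idf.shape)
print(pipe.transform(corpus).shape)
```

Both named_steps and steps avoid subscripting the Pipeline object itself, so they should work whether or not the installed version supports pipe['count'].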

Upvotes: 5
