Chau Loi

Reputation: 1225

How does sklearn.pipeline work when done manually, step by step?

I am currently working with sklearn.pipeline, which is just wonderful. Here is an example:

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)
labels = model.predict(test.data)

(The data comes from train = fetch_20newsgroups(subset='train', categories=categories), with categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics'].)

However, my understanding of it is still very vague. I would like to know what the same steps look like when done one at a time, without the pipeline. Here is what I tried, but it failed:

from sklearn.datasets import fetch_20newsgroups
categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics']
train = fetch_20newsgroups(subset='train', categories=categories)

from sklearn.feature_extraction.text import TfidfVectorizer
model1=TfidfVectorizer()
X=model1.fit_transform(train.data)

from sklearn.naive_bayes import MultinomialNB
model2=MultinomialNB
model2.fit(....)

At this point I don't know what to do next, because the shape of X does not seem suitable for model2.

For more context, see page 406 of 548 in the book at this link.

Please pardon my silly question. I know I can do this with a pipeline, but I want to try it by hand.

Upvotes: 2

Views: 392

Answers (1)

Venkatachalam

Reputation: 16966

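You are almost there! You need to use MultinomialNB() instead of MultinomialNB. Without the parentheses, model2 is the class itself rather than an instance, so model2.fit(...) has no object to fit. The shape of X is not the problem: the sparse matrix returned by fit_transform can be passed to fit directly.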

Try the following procedure.

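from sklearn.datasets import fetch_20newsgroups
categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics']
train = fetch_20newsgroups(subset='train', categories=categories)
# Test split, needed for the predict call at the end
test = fetch_20newsgroups(subset='test', categories=categories)

# Step 1: learn the vocabulary and IDF weights from the training text
from sklearn.feature_extraction.text import TfidfVectorizer
model1 = TfidfVectorizer()
X = model1.fit_transform(train.data)

# Step 2: fit the classifier (note the parentheses: an instance, not the class)
from sklearn.naive_bayes import MultinomialNB
model2 = MultinomialNB()
model2.fit(X, train.target)

# Step 3: transform (not fit_transform) the test text, then predict
model2.predict(model1.transform(test.data))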

# array([2, 1, 1, ..., 2, 1, 1])
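To make the mechanics concrete, here is a rough pure-Python sketch of what a two-step pipeline does on fit and predict. This is not scikit-learn's actual implementation: MiniPipeline, CountVectorizerSketch and OverlapClassifierSketch are hypothetical toy classes, and the four training sentences are made up for illustration. The only point is the calling pattern: fit_transform on every step except the last during fit, and transform only during predict.

```python
class CountVectorizerSketch:
    """Toy transformer (hypothetical): text -> word counts over a learned vocabulary."""

    def fit(self, docs, y=None):
        words = sorted({w for doc in docs for w in doc.lower().split()})
        self.vocab_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, docs):
        rows = []
        for doc in docs:
            row = [0] * len(self.vocab_)
            for w in doc.lower().split():
                if w in self.vocab_:  # words unseen during fit are ignored
                    row[self.vocab_[w]] += 1
            rows.append(row)
        return rows

    def fit_transform(self, docs, y=None):
        return self.fit(docs, y).transform(docs)


class OverlapClassifierSketch:
    """Toy classifier (hypothetical): sums the rows per class, then predicts
    the class whose summed row has the largest dot product with the input."""

    def fit(self, X, y):
        self.totals_ = {}
        for row, label in zip(X, y):
            acc = self.totals_.setdefault(label, [0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        return self

    def predict(self, X):
        preds = []
        for row in X:
            scores = {
                label: sum(a * b for a, b in zip(row, total))
                for label, total in self.totals_.items()
            }
            preds.append(max(scores, key=scores.get))
        return preds


class MiniPipeline:
    """Hypothetical stand-in for a pipeline: transformers first, estimator last."""

    def __init__(self, *steps):
        self.transformers = list(steps[:-1])
        self.estimator = steps[-1]

    def fit(self, X, y):
        for t in self.transformers:
            X = t.fit_transform(X, y)  # fit AND transform on training data
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        for t in self.transformers:
            X = t.transform(X)  # transform only: nothing is re-fitted
        return self.estimator.predict(X)


train_docs = ["rocket launch orbit", "orbit moon rocket",
              "pixel render shader", "shader texture pixel"]
train_y = ["space", "space", "graphics", "graphics"]
test_docs = ["moon rocket", "texture render"]

# Pipeline version
pipe = MiniPipeline(CountVectorizerSketch(), OverlapClassifierSketch())
pipeline_pred = pipe.fit(train_docs, train_y).predict(test_docs)

# The same steps done by hand
vec = CountVectorizerSketch()
X_train = vec.fit_transform(train_docs)
clf = OverlapClassifierSketch()
clf.fit(X_train, train_y)
manual_pred = clf.predict(vec.transform(test_docs))

print(pipeline_pred)                 # ['space', 'graphics']
print(pipeline_pred == manual_pred)  # True
```

The manual version at the bottom mirrors the procedure above: model1 plays the role of vec and model2 the role of clf. The part that is easy to get wrong by hand is calling fit_transform on the test data, which would relearn the vocabulary and produce features that no longer line up with what the classifier was trained on.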

Upvotes: 2
