How to tell a scikit-learn vectorizer to use specific features?

Question

I have a set of hand-picked features. Not all of them are single words; some are bigrams and others are trigrams. I want to model my texts, which are provided as raw text, explicitly in terms of these features. How can I do that in sklearn? This is how I have defined my vectorizer so far:

from sklearn.feature_extraction.text import CountVectorizer

def initialize():
    # Count every unigram, bigram, and trigram found in the corpus.
    return CountVectorizer(ngram_range=(1, 3))

Answers (1)
