owise

Reputation: 1065

Using hashing trick for new incoming data

Is there any way to use the hashing trick after I train and deploy my model? Assume I have the following data and I hashed the Cat feature as follows:

import pandas as pd
from sklearn.feature_extraction import FeatureHasher

D = {"ID": [1,2,3,4,5,6,7,8,9,10], "Cat": ["A", "A", "B", "A", "A", "B", "A", "B", "B", "B"]}
df = pd.DataFrame(D)
fh = FeatureHasher(n_features=1, input_type='string')
hashed_features = fh.fit_transform(df['Cat'])
hashed_features.toarray()

How can I use the hasher to hash new incoming data? I am looking for something like:

fh.predict('A')

Should I just build a dictionary from the hashing process during training and then map the new incoming data to the built dictionary? Is there a better way?

Upvotes: 0

Views: 265

Answers (1)

Qusai Alothman

Reputation: 2072

Use FeatureHasher.transform(). For example, try this in your code:

fh.transform(['A','B']).toarray()

# array([[ 1.], [-1.]])
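On the dictionary question: as far as I know, FeatureHasher is stateless, so the mapping depends only on the input string and n_features, and no lookup table from training needs to be stored. Below is a minimal sketch of that idea, not a definitive recipe. The value n_features=8, the variable names, and the unseen category "C" are my own illustrative assumptions. Each sample is wrapped in its own list because some scikit-learn versions reject bare strings as samples.

```python
import numpy as np
from sklearn.feature_extraction import FeatureHasher

# Two independent hashers with the same settings, standing in for
# "training time" and "serving time". n_features=8 is an illustrative
# choice, not a value taken from the question.
hasher_train = FeatureHasher(n_features=8, input_type="string")
hasher_serve = FeatureHasher(n_features=8, input_type="string")

# Each sample is an iterable of string tokens, so a single category
# value is wrapped in a one-element list.
train_rows = [["A"], ["B"]]
new_rows = [["A"], ["B"], ["C"]]  # "C" is a hypothetical unseen category

train_hashed = hasher_train.transform(train_rows).toarray()
new_hashed = hasher_serve.transform(new_rows).toarray()

# The same category lands in the same column with the same sign,
# even though the two hashers never shared any fitted state.
same_mapping = np.array_equal(train_hashed, new_hashed[:2])
print(same_mapping)
print(new_hashed.shape)
```

Since nothing is learned in fit, recreating the hasher with the same parameters at deployment time should be enough. The usual trade-off still applies: a small n_features makes collisions between different categories more likely.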

Upvotes: 1
