Reputation: 1065
Is there any way to use the hashing trick after I train and deploy my model? Assume I have the following data and I hashed the Cat feature as follows:
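import pandas as pd
from sklearn.feature_extraction import FeatureHasher

D = {"ID": [1,2,3,4,5,6,7,8,9,10], "Cat": ["A", "A", "B", "A", "A", "B", "A", "B", "B", "B"]}
df = pd.DataFrame(D)
fh = FeatureHasher(n_features=1, input_type='string')
# input_type='string' expects each sample to be an iterable of strings,
# so wrap every category value in a one-element list
hashed_features = fh.fit_transform([[c] for c in df['Cat']])
hashed_features.toarray()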
How can I use the hasher to hash incoming new data? I am looking for something like:
fh.predict('A')
Should I just build a dictionary from the hashing process during training and then map the new incoming data to the built dictionary? Is there a better way?
Upvotes: 0
Views: 265
Reputation: 2072
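Use FeatureHasher.transform(). FeatureHasher is stateless (fit does nothing), so the same call hashes new data exactly as it hashed the training data, and no dictionary is needed. For example, try this in your code: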
# Wrap each value in a list: every sample must be an iterable of strings
fh.transform([['A'], ['B']]).toarray()
# array([[ 1.], [-1.]])
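To see why no lookup dictionary has to be saved from training, here is a minimal standard-library sketch of the hashing trick. It is only an illustration: hash_feature and hash_rows are hypothetical helpers, and MD5 is used purely for convenience, whereas scikit-learn uses MurmurHash3, so the columns and signs will not match FeatureHasher's output.

```python
import hashlib


def hash_feature(value, n_features=8):
    """Map a category string to a (column index, sign) pair.

    Hypothetical helper for illustration only. The result depends solely
    on the string and n_features, never on previously seen data.
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")      # 32-bit unsigned hash
    index = h % n_features                     # which column the value lands in
    sign = 1 if (h >> 31) & 1 == 0 else -1     # top bit picks the sign
    return index, sign


def hash_rows(values, n_features=8):
    """Turn a list of category strings into dense hashed rows."""
    rows = []
    for v in values:
        row = [0] * n_features
        index, sign = hash_feature(v, n_features)
        row[index] += sign
        rows.append(row)
    return rows


# "Training" data and "incoming" data go through the same pure function.
train = hash_rows(["A", "A", "B"])
new = hash_rows(["A", "C"])  # "C" was never seen during training
```

Here new[0] is identical to train[0], and the unseen value "C" still gets a valid row. Note that with n_features=1, as in the question, every category falls into the same column, so only the sign can distinguish values and collisions are unavoidable once there are more than two categories.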
Upvotes: 1