Reputation: 17408
Looking at this code from here:
import numpy as np
from kmodes.kmodes import KModes
# random categorical data
data = np.random.choice(20, (100, 10))
km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)
clusters = km.fit_predict(data)
# Print the cluster centroids
print(km.cluster_centroids_)
Does anyone know how to save the "clustering model" and apply it to new data? In other words, how do I cluster previously unseen data? Thanks.
Upvotes: 1
Views: 329
Reputation: 7261
You can use pickle for this task.
import pickle
# Serialize the fitted KModes object to disk
with open('cluster_model.pickle', 'wb') as f:
    pickle.dump(km, f)
When you want to use it on new data, load the model back and call predict:
with open('cluster_model.pickle', 'rb') as f:
    km = pickle.load(f)
# "new_data" must have the same columns and category encoding as the training data
new_clusters = km.predict(new_data)
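To see why reloading is enough, here is a minimal standard-library sketch of the idea, not the kmodes implementation itself. It assumes that prediction amounts to assigning each new row to the centroid with the fewest mismatched categories (simple matching dissimilarity, which I believe is the kmodes default). The `state` dict, the centroid values, and the helper names `matching_dissim` and `predict` are all made up for illustration; they stand in for the fitted `km` object and its `cluster_centroids_`.

```python
import pickle


def matching_dissim(row, centroid):
    """Count the positions where two categorical rows differ."""
    return sum(a != b for a, b in zip(row, centroid))


def predict(centroids, rows):
    """Assign each row to the centroid with the fewest mismatches.

    Ties go to the lowest cluster index.
    """
    return [
        min(range(len(centroids)),
            key=lambda k: matching_dissim(row, centroids[k]))
        for row in rows
    ]


# Hypothetical fitted state: two centroids over three categorical columns.
state = {"centroids": [(0, 1, 2), (5, 5, 5)]}

# Round-trip through pickle, as you would via a file on disk.
restored = pickle.loads(pickle.dumps(state))

# Unseen rows with the same columns and the same category encoding.
new_data = [(0, 1, 9), (5, 5, 0), (5, 1, 2)]
labels = predict(restored["centroids"], new_data)
print(labels)  # [0, 1, 0]
```

The point is that the centroids are the only thing needed to label new rows, so anything that preserves them (pickle here) preserves the model. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.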
Upvotes: 2