cs0815

Reputation: 17408

assign cluster membership to new data using kmodes

Looking at this code from here:

import numpy as np
from kmodes.kmodes import KModes

# random categorical data
data = np.random.choice(20, (100, 10))

km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)
clusters = km.fit_predict(data)

# Print the cluster centroids
print(km.cluster_centroids_)

Does anyone know how to save the fitted clustering model and apply it to new data, that is, assign previously unseen rows to the existing clusters? Thanks.

Upvotes: 1

Views: 329

Answers (1)

artemis

Reputation: 7261

You can use pickle to save the fitted model to disk:

import pickle

with open('cluster_model.pickle', 'wb') as f:
    pickle.dump(km, f)

When you want to use it on new data, load the model back and call predict:

with open('cluster_model.pickle', 'rb') as f:
    km = pickle.load(f)

# If your new data is called "new_data" (same columns as the training data), you can:
new_clusters = km.predict(new_data)
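
To make it clearer what predicting on unseen rows amounts to, here is a minimal pure-Python sketch. It assumes the simple matching dissimilarity (the number of attributes on which two rows differ), which is the measure k-modes is built on. The centroids and rows are made up, and `matching_dissim` and `assign_to_modes` are hypothetical helpers written for illustration, not the kmodes API.

```python
def matching_dissim(row, mode):
    """Count the attributes on which row and mode differ."""
    return sum(a != b for a, b in zip(row, mode))


def assign_to_modes(rows, modes):
    """For each row, return the index of the closest mode.

    Ties go to the lowest index in this sketch.
    """
    labels = []
    for row in rows:
        distances = [matching_dissim(row, mode) for mode in modes]
        labels.append(distances.index(min(distances)))
    return labels


# Made-up centroids standing in for km.cluster_centroids_
modes = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

# Made-up unseen rows with the same number of columns as the centroids
new_data = [[1, 1, 2], [3, 2, 3], [2, 2, 2]]

new_labels = assign_to_modes(new_data, modes)
print(new_labels)  # [0, 2, 1]
```

No refitting happens here: the centroids stay fixed and each new row is only compared against them, which is why a model restored from a pickle can label data it has never seen.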

Upvotes: 2
