Sushant Patekar

Reputation: 473

"TypeError: 'int' object is not iterable" and PCA AssertionError in Python Clustering Function

I'm working on a Python function (cluster_articles) to perform document clustering and return a dictionary of results. However, I'm encountering the following test errors:

TypeError: 'int' object is not iterable (in test_number_of_observations_kmeans10 and possibly test_proper_dict_return)
AssertionError: Assertion error at PCA explained value (in test_pca_explained)

import pickle
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import completeness_score, v_measure_score

def cluster_articles(data):
    # K-Means on original data
    kmeans_100 = KMeans(n_clusters=10, random_state=2, tol=0.05, max_iter=50)
    kmeans_100.fit(data['vectors'])
    labels_100 = kmeans_100.labels_

    # PCA Dimensionality Reduction
    pca = PCA(n_components=10, random_state=2)
    reduced_data = pca.fit_transform(data['vectors'])

    # K-Means on reduced data
    kmeans_10 = KMeans(n_clusters=10, random_state=2, tol=0.05, max_iter=50)
    kmeans_10.fit(reduced_data)
    labels_10 = kmeans_10.labels_

    print(type(kmeans_10.n_iter_))  # Debugging output

    # Results Dictionary (Potential issue here)
    result = {
        'nobs_100': kmeans_100.n_iter_,
        'nobs_10': kmeans_10.n_iter_,
        'pca_explained': pca.explained_variance_ratio_[0],
        # ... rest of the results
    }
    return result 

Task and Data Description:

Goal: Cluster documents using K-Means (with and without PCA). Calculate metrics like completeness score, V-measure, and PCA explained variance.
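For reference, here is a minimal sketch of how the metrics named in the goal can be computed with scikit-learn. The label lists and the random matrix are made-up stand-ins, not the real data, and whether the test expects the first explained-variance ratio or the sum over all components is an assumption that I have not been able to confirm.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import completeness_score, v_measure_score

# Hypothetical ground-truth and predicted cluster labels (not the real data).
true_labels = [0, 0, 1, 1]
pred_labels = [1, 1, 0, 0]  # same grouping, only the label names differ

# Both scores ignore label names and compare the groupings themselves.
completeness = completeness_score(true_labels, pred_labels)
v_measure = v_measure_score(true_labels, pred_labels)

# Synthetic matrix standing in for data['vectors'].
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 5))
pca = PCA(n_components=2, random_state=2).fit(X)

# explained_variance_ratio_ has one entry per retained component.
first_component = pca.explained_variance_ratio_[0]    # first component only
all_components = pca.explained_variance_ratio_.sum()  # all retained components

print(completeness, v_measure)
print(first_component, all_components)
```

If the test compares against the variance explained by all retained components, the summed value would be the one to return rather than index 0.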

Data Structure (data dictionary):

Relevant Packages:

Questions:

What I've Tried: Printing the type of kmeans_10.n_iter_ confirms it's an integer.
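To illustrate the suspected cause: the test name test_number_of_observations_kmeans10 suggests that 'nobs_10' should hold the number of observations in each cluster, which is an assumption on my part, whereas n_iter_ is a single iteration count. The sketch below uses a hypothetical label list in place of kmeans_10.labels_ and a made-up n_iter_ value.

```python
from collections import Counter

# Hypothetical stand-in for kmeans_10.labels_: one cluster index per document.
labels = [0, 2, 1, 0, 2, 2, 1, 0, 0, 2]
n_clusters = 3

# n_iter_ is a single int, so anything that iterates over it raises TypeError.
n_iter = 7  # made-up value standing in for kmeans_10.n_iter_
try:
    sorted(n_iter)
    iteration_error = None
except TypeError as exc:
    iteration_error = str(exc)

# Observations per cluster: count how often each cluster index occurs.
counts = Counter(labels)
nobs = [counts.get(k, 0) for k in range(n_clusters)]

print(iteration_error)
print(nobs)
```

With a NumPy label array, np.bincount(labels, minlength=n_clusters) gives the same per-cluster counts.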

Additional Notes:

Thank you for your help!

Upvotes: 0

Views: 50

Answers (0)
