dodo

Reputation: 319

Multiple features in collaborative filtering (Spark)

I have a CSV file that looks like:

customer_ID, location, ....other info..., item-bought, score

I am trying to build a collaborative filtering recommender in Spark. Spark takes data of the form:

userID, itemID, value

but my data has more columns, and I want all of the user's info to be used instead of just the userID. I tried grouping the columns into a single column as:

(customerID,location,....),itemID,score

but ALS.train gives me this error:

TypeError: int() argument must be a string or a number, not 'tuple'

How can I make Spark accept multiple key/value columns instead of only three columns? Thanks.

Upvotes: 2

Views: 946

Answers (1)

Rohit Chatterjee

Reputation: 3107

For each customer, identify the columns which you would like to use to distinguish these user-entities. Create a table (e.g. in SQL) in which each row contains the information for one user-entity, and use the row number in this table as the userID.

Do the same for your items if necessary, and pass these integer IDs to ALS.train as the userID and itemID instead of the tuples.
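As a rough sketch of that idea in plain Python (no Spark required to try it): each distinct combination of user columns gets its own integer ID, and the tuples are replaced by those IDs. The sample rows, the choice of (customer_ID, location) as the user key, and the helper name build_id_map are all made up for illustration; your real columns will differ.

```python
def build_id_map(keys):
    """Assign a dense integer ID to each distinct key, in first-seen order."""
    ids = {}
    for key in keys:
        if key not in ids:
            ids[key] = len(ids)
    return ids


# Hypothetical rows: (customer_ID, location, item_bought, score)
rows = [
    ("c1", "NY", "book", 5.0),
    ("c1", "LA", "book", 3.0),
    ("c2", "NY", "pen", 4.0),
    ("c1", "NY", "pen", 2.0),
]

# A "user-entity" here is the (customer_ID, location) pair (an assumption).
user_ids = build_id_map((cust, loc) for cust, loc, _, _ in rows)
item_ids = build_id_map(item for _, _, item, _ in rows)

# Integer triples in the (userID, itemID, value) shape from the question.
ratings = [
    (user_ids[(cust, loc)], item_ids[item], score)
    for cust, loc, item, score in rows
]
```

The resulting triples contain only ints and a float, so int() no longer sees a tuple. In PySpark you would presumably parallelize them (or build the same mapping on the RDD) before passing them to ALS.train. Keep user_ids and item_ids around so you can translate recommendations back to the original columns.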

Upvotes: 1
