Reputation: 23
I am trying to train an ALS model on my dataset to find latent factors. The dataset contains implicit ratings.
In detail, it consists of three columns: User, Item (repositories), and Rating (number of stars, used as an implicit rating):
from pyspark.ml.recommendation import ALS
from pyspark.sql import Row
lines = spark.read.text("Dataset.csv").rdd
parts = lines.map(lambda row: row.value.split(","))
ratingsRDD = parts.map(lambda p: Row(userId=int(p[1]),repoId=int(p[2]),repoCount=float(p[3])))
ratings = spark.createDataFrame(ratingsRDD)
model = ALS.trainImplicit(ratings, rank=5,lambda_=0.01, alpha = 1.0, iterations =5)
I am getting this error:
AttributeError: type object 'ALS' has no attribute 'trainImplicit'
Upvotes: 1
Views: 2220
Reputation: 60319
You are mixing two APIs: the syntax comes from the old Spark MLlib ALS (pyspark.mllib.recommendation), which works with RDDs rather than DataFrames, but you are importing the new Spark ML ALS (pyspark.ml.recommendation), which indeed has no trainImplicit attribute (see the docs).
With the DataFrame-based API, you should try something like:
als = ALS(rank=5, maxIter=5, regParam=0.01, alpha=1.0, implicitPrefs=True, seed=0)
model = als.fit(ratings)
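provided that your users are in a column named user, your items in a column named item, and your ratings in a column named rating (these are the defaults). Otherwise, pass userCol, itemCol, and ratingCol explicitly; for the DataFrame in the question that would be userCol="userId", itemCol="repoId", ratingCol="repoCount". The old lambda_ argument is called regParam in this API, and iterations is called maxIter. Check the docs for further details, parameterization options, and examples.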
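To make the renaming between the two APIs concrete, here is a minimal sketch that translates trainImplicit-style keyword arguments into the names the DataFrame-based ALS constructor expects. The helper names (MLLIB_TO_ML, translate_params, build_implicit_als) are hypothetical and only for illustration; the column names are taken from the question. pyspark is imported lazily, so the translation step itself can be tried without a Spark installation.

```python
# Hypothetical helper: maps keyword names used by the RDD-based
# ALS.trainImplicit to the parameter names of pyspark.ml.recommendation.ALS.
MLLIB_TO_ML = {
    "rank": "rank",
    "iterations": "maxIter",
    "lambda_": "regParam",
    "alpha": "alpha",
    "nonnegative": "nonnegative",
    "seed": "seed",
}


def translate_params(**old_kwargs):
    """Rename trainImplicit-style keyword arguments for the DataFrame-based ALS."""
    new_kwargs = {MLLIB_TO_ML[name]: value for name, value in old_kwargs.items()}
    # trainImplicit always means implicit feedback, which the new API
    # expresses through a flag instead of a separate method.
    new_kwargs["implicitPrefs"] = True
    return new_kwargs


def build_implicit_als(user_col, item_col, rating_col, **old_kwargs):
    """Create an (unfitted) DataFrame-based ALS estimator; requires pyspark."""
    from pyspark.ml.recommendation import ALS  # imported lazily on purpose

    return ALS(
        userCol=user_col,
        itemCol=item_col,
        ratingCol=rating_col,
        **translate_params(**old_kwargs),
    )


# The arguments from the question, renamed for the new API.
params = translate_params(rank=5, lambda_=0.01, alpha=1.0, iterations=5)
print(params)
```

With a running Spark session and the ratings DataFrame from the question, usage would presumably look like build_implicit_als("userId", "repoId", "repoCount", rank=5, lambda_=0.01, alpha=1.0, iterations=5).fit(ratings).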
Upvotes: 3