Gui Kham

Reputation: 23

PySpark AttributeError: type object 'ALS' has no attribute 'trainImplicit'

I am trying to train ALS on my dataset to find latent factors. The dataset contains implicit ratings.

Specifically, my dataset consists of three columns: User, Item (repositories) and Rating (number of stars, used as an implicit rating):

from pyspark.sql import Row
from pyspark.ml.recommendation import ALS

lines = spark.read.text("Dataset.csv").rdd
parts = lines.map(lambda row: row.value.split(","))

ratingsRDD = parts.map(lambda p: Row(userId=int(p[1]),repoId=int(p[2]),repoCount=float(p[3])))
ratings = spark.createDataFrame(ratingsRDD)

model = ALS.trainImplicit(ratings, rank=5,lambda_=0.01, alpha = 1.0, iterations =5)

I am getting this error:

AttributeError: type object 'ALS' has no attribute 'trainImplicit'

Upvotes: 1

Views: 2220

Answers (1)

desertnaut

Reputation: 60319

You are using the syntax of the old Spark MLlib ALS (which works with RDDs, not dataframes) with the new Spark ML ALS, which indeed has no trainImplicit attribute (docs).
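For reference, here is a rough sketch of what the old RDD-based call would look like if you wanted to keep trainImplicit. The helper name train_implicit_mllib is made up for illustration, and it assumes parts is the RDD of split fields from the question, with the same column positions:

```python
def train_implicit_mllib(parts):
    """Train implicit-feedback ALS with the old RDD-based MLlib API.

    `parts` is assumed to be an RDD of lists of strings, as produced by
    the question's `row.value.split(",")` step.
    """
    # Imported lazily so that this module can be loaded without PySpark.
    from pyspark.mllib.recommendation import ALS, Rating

    # MLlib expects an RDD of Rating(user, product, rating), not a DataFrame.
    ratings_rdd = parts.map(
        lambda p: Rating(int(p[1]), int(p[2]), float(p[3]))
    )

    # Same hyperparameters as in the question.
    return ALS.trainImplicit(
        ratings_rdd, rank=5, iterations=5, lambda_=0.01, alpha=1.0
    )
```

Note the import from pyspark.mllib.recommendation rather than pyspark.ml.recommendation; the two ALS classes share a name but not an interface.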

You should try something like:

als = ALS(rank=5, maxIter=5, alpha = 1.0, implicitPrefs=True, seed=0)
model = als.fit(ratings)

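provided that your users, items and ratings are in columns named user, item and rating respectively (the defaults); otherwise, set userCol, itemCol and ratingCol accordingly. Note also that the old lambda_ argument corresponds to regParam in the new API. Check the docs for further details, parameterization options, and examples.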
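With the column names from the question (userId, repoId, repoCount), the setup might look roughly like the sketch below. The names als_params and train_implicit are hypothetical, and the hyperparameter values are simply carried over from the question:

```python
# Keyword arguments for pyspark.ml.recommendation.ALS, mirroring the
# question's settings and DataFrame column names.
als_params = dict(
    rank=5,
    maxIter=5,              # was iterations=5 in the MLlib call
    regParam=0.01,          # was lambda_=0.01 in the MLlib call
    alpha=1.0,
    implicitPrefs=True,     # replaces the separate trainImplicit method
    userCol="userId",
    itemCol="repoId",
    ratingCol="repoCount",
    seed=0,
)


def train_implicit(ratings_df, params=als_params):
    """Fit a DataFrame-based ALS model on `ratings_df` (hypothetical helper)."""
    # Imported lazily so that this module can be loaded without PySpark.
    from pyspark.ml.recommendation import ALS

    return ALS(**params).fit(ratings_df)
```

Calling train_implicit(ratings) with the ratings dataframe from the question should then return a fitted ALSModel.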

Upvotes: 3
