lrkwz

Reputation: 6513

Mahout for recommendation performance issue

I've built a simple web-based (Spring Boot) recommendation engine using Mahout, configured with:

All the beans are decorated with their caching counterparts.

Dataset is:

Read from a MySQLJDBCDataModel:

CREATE TABLE `taste_preferences` (
  `user_id` bigint(20) DEFAULT NULL,
  `item_id` int(11) NOT NULL DEFAULT '0',
  `preference` int(11) NOT NULL,
  `timestamp` datetime DEFAULT NULL,
  KEY `idx_taste_preferences_user_id` (`user_id`),
  KEY `idx_taste_preferences_item_id` (`item_id`),
  KEY `idx_taste_preferences_preference` (`preference`),
  KEY `idx_taste_preferences_distinct` (`user_id`,`item_id`,`preference`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

In such a scenario I use a 0.003 sampling rate (I imagine this means using about 12K taste preferences).

Even so, the first recommendation for a given user still takes 10 to 20 seconds.

How would you suggest improving performance on the same hardware? Would a FileDataModel be faster?

Upvotes: 0

Views: 321

Answers (1)

lrkwz

Reputation: 6513

Okay, performance is now definitely better! The key point is to wrap the DataModel in a ReloadFromJDBCDataModel:

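DataModel currentDataModel() throws TasteException {
    // ReloadFromJDBCDataModel loads the table into memory and serves reads from that copy,
    // instead of querying MySQL for every lookup.
    DataModel datamodel = new ReloadFromJDBCDataModel(
            new MySQLJDBCDataModel(new ConnectionPoolDataSource(datasource), preferenceTable, userIDColumn,
                    itemIDColumn, preferenceColumn, timestampColumn));
    return datamodel;
}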

The DataModel is read-only in this scenario, but that need not be a problem: it can be reloaded automatically behind the scenes.

For the sake of completeness, the significant parts of my configuration are:
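To illustrate the idea of reading from an in-memory copy and reloading it on demand, here is a minimal sketch using only the JDK. It is not Mahout's implementation: ReloadingSnapshot and its loader are hypothetical names, and the loader stands in for the slow full-table JDBC read.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical helper: serves reads from an in-memory copy and swaps it on reload(). */
final class ReloadingSnapshot<T> {
    private final Supplier<T> loader;          // slow source, e.g. a full-table JDBC read
    private final AtomicReference<T> current;  // snapshot that all reads are served from

    ReloadingSnapshot(Supplier<T> loader) {
        this.loader = loader;
        // Pay the load cost once, up front.
        this.current = new AtomicReference<>(loader.get());
    }

    /** Cheap read: never touches the slow source. */
    T get() {
        return current.get();
    }

    /** Re-runs the loader and atomically replaces the snapshot. */
    void reload() {
        current.set(loader.get());
    }
}
```

Readers keep seeing the old snapshot until reload() finishes, so a periodic reload (for example from a scheduled task) does not block recommendations. In Mahout itself the reload is, as far as I can tell, triggered through the data model's refresh(...) method rather than a helper like this one.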

UserSimilarity similarity(DataModel dataModel) throws TasteException {
    return new CachingUserSimilarity(new EuclideanDistanceSimilarity(dataModel, Weighting.WEIGHTED), dataModel);
}

UserNeighborhood userNeighborhood;

UserNeighborhood neighborhood(DataModel dataModel, UserSimilarity userSimilarity) throws TasteException {

    if (useThresholdUserNeighborhood) {
        logger.info("Using ThresholdUserNeighborhood - threshold value is {}", threshold);
        userNeighborhood = new CachingUserNeighborhood(
                new ThresholdUserNeighborhood(threshold, userSimilarity, dataModel), dataModel);
    } else {
        logger.info(
                "Using NearestNUserNeighborhood - neighborhood size is {}, min similarity is {}, sampling rate is {}",
                neighborhoodSize, minSimilarity, samplingRate);
        userNeighborhood = new CachingUserNeighborhood(new NearestNUserNeighborhood(neighborhoodSize, minSimilarity,
                userSimilarity, dataModel, samplingRate), dataModel);
    }
    return userNeighborhood;
}

@Bean
public Recommender buildRecommender(DataModel dataModel) throws TasteException {

    UserSimilarity userSimilarity = similarity(dataModel);
    return new CachingRecommender(
            new GenericUserBasedRecommender(dataModel, neighborhood(dataModel, userSimilarity), userSimilarity));
}
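The Caching* wrappers above are decorators: they compute a result on the first request and replay it afterwards, which is why they speed up repeat requests but not the first one for a given user. A minimal JDK-only sketch of that pattern follows; CachingLookup is a hypothetical name and this is not Mahout's code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical caching decorator around any key-to-value computation. */
final class CachingLookup<K, V> implements Function<K, V> {
    private final Function<K, V> delegate;                     // the expensive computation
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // results seen so far

    CachingLookup(Function<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V apply(K key) {
        // The first call for a key runs the delegate; later calls return the stored value.
        return cache.computeIfAbsent(key, delegate);
    }

    /** Drops all cached results, e.g. after the underlying data has been reloaded. */
    void clear() {
        cache.clear();
    }
}
```

Cached results go stale when the underlying data changes, so a cache like this should be cleared whenever the data model is reloaded.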

Upvotes: 1
