Nelson

Reputation: 301

recommenderlab: Extract similarities from the recommender

I’m using recommenderlab to get recommendations from both UBCF and IBCF models, and everything seems to be working fine (I get recommendations and they seem to make sense). I would like to explain why each recommendation is being produced, so I would like to get the similarities between the users (UBCF) and between the items (IBCF).

In the IBCF recommender I can see that the similarities are stored in the recommender structure (aux_recommender@model$sim), but I don’t know how to extract them properly. I want to choose a specific item and get the top x most similar items (used to build the recommendation). With UBCF I want to choose a specific user and get their most similar users.

My IBCF recommender structure is the following:

   > str(aux_recommender)
    Formal class 'Recommender' [package "recommenderlab"] with 5 slots
      ..@ method  : chr "IBCF"
      ..@ dataType: atomic [1:1] realRatingMatrix
      .. ..- attr(*, "package")= chr "recommenderlab"
      ..@ ntrain  : int 7106
      ..@ model   :List of 9
      .. ..$ description         : chr "IBCF: Reduced similarity matrix"
      .. ..$ sim                 :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      .. .. .. ..@ i       : int [1:2644] 12 105 649 705 1207 1282 555 62 365 485 ...
      .. .. .. ..@ p       : int [1:1323] 0 6 6 7 7 7 7 12 13 13 ...
      .. .. .. ..@ Dim     : int [1:2] 1322 1322
      .. .. .. ..@ Dimnames:List of 2
      .. .. .. .. ..$ : chr [1:1322] "1" "19" "22" "41" ...
      .. .. .. .. ..$ : chr [1:1322] "1" "19" "22" "41" ...
      .. .. .. ..@ x       : num [1:2644] 0.71 0.766 0.834 0.663 0.919 ...
      .. .. .. ..@ factors : list()
      .. ..$ k                   : num 2
      .. ..$ method              : chr "Pearson"
      .. ..$ normalize           : chr "Z-score"
      .. ..$ normalize_sim_matrix: logi FALSE
      .. ..$ alpha               : num 0.5
      .. ..$ na_as_zero          : logi FALSE
      .. ..$ minRating           : num 2
      ..@ predict :function (model, newdata, n = 10, data = NULL, type = c("topNList", "ratings"), ...)  

In my UBCF model I can't even spot where the similarities are stored (if they are stored at all).

My UBCF structure is:

> str(rec_ub)
Formal class 'Recommender' [package "recommenderlab"] with 5 slots
  ..@ method  : chr "UBCF"
  ..@ dataType: atomic [1:1] realRatingMatrix
  .. ..- attr(*, "package")= chr "recommenderlab"
  ..@ ntrain  : int 7106
  ..@ model   :List of 7
  .. ..$ description: chr "UBCF-Real data: contains full or sample of data set"
  .. ..$ data       :Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
  .. .. .. ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. ..@ i       : int [1:2103234] 0 1 2 3 4 5 6 7 8 9 ...
  .. .. .. .. .. ..@ p       : int [1:1323] 0 6908 8602 9037 9546 14311 17869 18006 23693 24432 ...
  .. .. .. .. .. ..@ Dim     : int [1:2] 7106 1322
  .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:7106] "10034" "10042" "10048" "10069" ...
  .. .. .. .. .. .. ..$ : chr [1:1322] "1" "19" "22" "41" ...
  .. .. .. .. .. ..@ x       : num [1:2103234] -0.371 0.465 -0.174 0.188 0.27 ...
  .. .. .. .. .. ..@ factors : list()
  .. .. .. ..@ normalize:List of 3
  .. .. .. .. ..$ method : chr "Z-score"
  .. .. .. .. ..$ row    : logi TRUE
  .. .. .. .. ..$ factors:List of 2
  .. .. .. .. .. ..$ means: Named num [1:7106] 2.48 1.57 2.2 1.82 2.63 ...
  .. .. .. .. .. .. ..- attr(*, "names")= chr [1:7106] "10034" "10042" "10048" "10069" ...
  .. .. .. .. .. ..$ sds  : Named num [1:7106] 1.287 0.928 1.134 0.934 1.377 ...
  .. .. .. .. .. .. ..- attr(*, "names")= chr [1:7106] "10034" "10042" "10048" "10069" ...
  .. ..$ method     : chr "Pearson"
  .. ..$ nn         : num 2
  .. ..$ sample     : logi FALSE
  .. ..$ normalize  : chr "Z-score"
  .. ..$ minRating  : num 2
  ..@ predict :function (model, newdata, n = 10, data = NULL, type = c("topNList", "ratings"), ...) 

What I need to know is why, for instance, item 102 was recommended to user 10034. In IBCF it should be because item 102 is similar to other items that the user rated highly (for instance items 1 and 250, if we are considering 2 neighbours). I need to know which items these are. How can I tell that item 102 was recommended because of items 1 and 250? I need the same for the users in the UBCF model.

I'd appreciate some help.

Upvotes: 2

Views: 1037

Answers (1)

Tina

Reputation: 146

Not sure if it's too late, but you can extract the IBCF similarity matrix using the following code:

similarity <- as.matrix(aux_recommender@model$sim)
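Once you have the matrix, picking the neighbours of one item is a matter of indexing by item ID. Below is a minimal sketch on a toy sparse matrix standing in for `aux_recommender@model$sim`. The values and the helpers `top_similar` and `explain_item` are hypothetical, made up for illustration. It also assumes that each row holds the neighbours kept for that item, which is worth checking against your installed recommenderlab version.

```r
library(Matrix)

# Toy stand-in for aux_recommender@model$sim: a sparse item-item matrix
# with item IDs as dimnames (similarity values are made up).
items <- c("1", "19", "102", "250")
sim <- sparseMatrix(
  i = c(3, 3, 1, 2),
  j = c(1, 4, 3, 3),
  x = c(0.83, 0.71, 0.83, 0.40),
  dims = c(4, 4),
  dimnames = list(items, items)
)

# Hypothetical helper: the k most similar items stored for one item.
# Assumption: row `item` holds the neighbours retained for that item.
top_similar <- function(sim, item, k = 2) {
  s <- as.matrix(sim[item, , drop = FALSE])[1, ]  # named numeric vector
  s <- s[s != 0]                                  # keep stored neighbours only
  head(sort(s, decreasing = TRUE), k)
}

# Hypothetical helper: stored neighbours of `item` that the user has rated,
# i.e. the candidates that could have driven the recommendation.
explain_item <- function(sim, item, rated_items) {
  s <- as.matrix(sim[item, , drop = FALSE])[1, ]
  s <- s[s != 0]
  s[names(s) %in% rated_items]
}

neighbours <- top_similar(sim, "102")
reasons <- explain_item(sim, "102", rated_items = c("1", "19"))
```

With the real model you would pass `aux_recommender@model$sim` directly instead of the toy matrix, and take `rated_items` from the items the user actually rated. For UBCF, the `str()` output in the question shows only the normalized training data in `rec_ub@model$data` and no stored similarity matrix, so user-to-user similarities would likely need to be recomputed from that data, for example with recommenderlab's `similarity()`.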

Upvotes: 3
