Reputation: 417
OK so I might be completely on the wrong path here. What I want to do is implement Bayesian Personalized Ranking in tensorflow.
If you are not familiar with BPR, the actual training procedure relies on online updates, kind of like SGD.
The problem I'm having is as follows. I need to define my own loss function, like so:
def objective(data, lam, item_biases, latent_items, latent_users):
    # Unpack the (user, rated item, unrated item) index triple.
    user, rated_item, unrated_item = data
    rated_item_bias = item_biases[rated_item]
    unrated_item_bias = item_biases[unrated_item]
    rated_latent_item = latent_items[rated_item]
    unrated_latent_item = latent_items[unrated_item]
    latent_user = latent_users[user]
    # Score each item as its bias plus the user/item latent dot product
    # (tf.dot doesn't exist; tf.reduce_sum of an elementwise product does).
    rated_pred = rated_item_bias + tf.reduce_sum(rated_latent_item * latent_user)
    unrated_pred = unrated_item_bias + tf.reduce_sum(unrated_latent_item * latent_user)
    difference = rated_pred - unrated_pred
    # BPR maximizes ln sigmoid(difference) minus the L2 regularization terms.
    obj = tf.log(tf.sigmoid(difference))
    obj -= lam * tf.reduce_sum(rated_item_bias**2)
    obj -= lam * tf.reduce_sum(unrated_item_bias**2)
    obj -= lam * tf.reduce_sum(rated_latent_item**2)
    obj -= lam * tf.reduce_sum(unrated_latent_item**2)
    obj -= lam * tf.reduce_sum(latent_user**2)
    return obj
Note that this code might be buggy with tf types and such, but that's not my concern here. As you can see, I have some trainable parameters (namely item_biases, latent_items, and latent_users) that are tensorflow variables, a lam hyperparameter, and my data. The data itself isn't conventional data: each example is a triple of indices corresponding to [user, seen item, unseen item], and I need to unpack these indices from the argument.
So my full dataset might be something like:
1 50 6
11 23 24
4 24 5
...
and each row might be one piece of data. Unfortunately, I'm not exactly sure how to "feed in" such data in a tf framework. My initial thought was to make data a tf.placeholder, because I'll be feeding various values into it as I train. But of course, if data is a Tensor, I can't just unpack it like it's some kind of tuple.
How should I proceed?
Upvotes: 0
Views: 144
Reputation: 11968
Usually hyperparameters are not fed as tensors, since you modify them outside of the model and they are constant for the duration of training. This also means they are written as constants into any model export you might do, which prevents accidents where you mismatch the parameters and the model. I usually pass them as flags to the model.
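For example, with TF 1.x's tf.app.flags (a minimal sketch; the flag name lam and its default are only illustrative):

import tensorflow as tf

# Hyperparameters are defined once, outside the graph, and are settable
# from the command line (python train.py --lam 0.05).
tf.app.flags.DEFINE_float('lam', 0.01, 'L2 regularization strength')
FLAGS = tf.app.flags.FLAGS

def main(_):
    # FLAGS.lam is a plain Python float, so it gets baked into the graph as
    # a constant rather than fed as a tensor; pass it wherever lam is expected.
    print(FLAGS.lam)

if __name__ == '__main__':
    tf.app.run()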
Another common pattern is to have input data as a dictionary of tensors (placeholders usually). This allows you to have different shapes for the different inputs.
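For example, a minimal sketch of that pattern (the placeholder names and shapes here are just illustrative):

import tensorflow as tf

# One placeholder per kind of input; each entry can have its own shape and dtype.
inputs = {
    'triples': tf.placeholder(tf.int32, shape=[None, 3], name='triples'),
    'sample_weights': tf.placeholder(tf.float32, shape=[None], name='sample_weights'),
}

user_column = inputs['triples'][:, 0]  # first column of every triple

with tf.Session() as sess:
    print(sess.run(user_column, feed_dict={
        inputs['triples']: [[1, 50, 6], [11, 23, 24], [4, 24, 5]],
    }))  # -> [ 1 11  4]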
You can also slice your tensors when you need to. For example, data[:, 2] will create a tensor with the third value from every entry in the data batch.
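Applied to your triples, a rough sketch of the unpacking (assuming data is a [None, 3] integer placeholder; the variable sizes here are made up):

import tensorflow as tf

data = tf.placeholder(tf.int32, shape=[None, 3])  # one [user, seen item, unseen item] row per example

# Split the batch column-wise so each index gets its own 1-D tensor.
users, rated_items, unrated_items = tf.unstack(data, num=3, axis=1)

# Batched lookups then replace the scalar indexing in your objective.
item_biases = tf.Variable(tf.zeros([1000]))  # hypothetical number of items
rated_item_bias = tf.gather(item_biases, rated_items)
unrated_item_bias = tf.gather(item_biases, unrated_items)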
Upvotes: 1