Reputation: 498
I plan to use the contextual bandit mode of Vowpal Wabbit (VW) to build a recommender system.
I have an M-dimensional (M = 26 in this case) numerical feature vector for each of N users, and feedback logs that record which user clicked which item (e.g. an ad). The total number of valid actions differs slightly between feedback logs (about 100~150). The only information available for the items (actions) is their unique ID.
So in this situation, I decided to use the ADF learning mode (--cb_explore_adf). But from the tutorial, it seems that VW only handles categorical features, not numerical ones. Anyway, I tried to format the test data like below:
shared |User feat_0=1.0 feat_1=0.00389094278216362 feat_2=0.004632890224456787 feat_3=0.003936515189707279 feat_4=0.0053831832483410835 ... feat_23=0.4192083477973938 feat_24=0.003969503100961447 feat_25=0.0038898871280252934
|Action item_id=hamny-kU9bbbbbak
|Action item_id=hamny-kU9bcxP9v1
...
|Action item_id=hamny-bbbbbcxP9v
|Action item_id=hamny-k7bbbbbcxd
|Action item_id=hamny-bbbbbbbbbc
|Action item_id=hamny-aaaaaaaaac
The example above asks the CB model to produce a PMF (prediction) over the ~100 actions, given the 26-dimensional user context features.
After getting a prediction from the model and observing the reward, the training data format would be:
shared |User feat_0=1.0 feat_1=0.00389094278216362 feat_2=0.004632890224456787 feat_3=0.003936515189707279 feat_4=0.0053831832483410835 ... feat_23=0.4192083477973938 feat_24=0.003969503100961447 feat_25=0.0038898871280252934
|Action item_id=hamny-kU9bbbbbak
|Action item_id=hamny-kU9bcxP9v1
...
|Action item_id=hamny-bbbbbcxP9v
0:-1:0.57124 |Action item_id=hamny-k7bbbbbcxd
|Action item_id=hamny-bbbbbbbbbc
|Action item_id=hamny-aaaaaaaaac
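For reference, here is a small pure-Python helper (hypothetical, just for illustration, not part of my actual pipeline) that builds this multiline ADF example from the user features, the item IDs, and the logged (action, cost, probability) triple:

```python
def build_adf_example(user_feats, item_ids, chosen_idx=None, cost=None, prob=None):
    """Build a VW --cb_explore_adf multiline example as a list of strings.

    If chosen_idx/cost/prob are given, the chosen action line gets the
    label 'action:cost:probability'; otherwise a test example is built.
    """
    shared = "shared |User " + " ".join(
        f"feat_{i}={v}" for i, v in enumerate(user_feats)
    )
    lines = [shared]
    for idx, item_id in enumerate(item_ids):
        label = ""
        if chosen_idx is not None and idx == chosen_idx:
            label = f"0:{cost}:{prob} "  # label on the chosen action line
        lines.append(f"{label}|Action item_id={item_id}")
    return lines

ex = build_adf_example(
    [1.0, 0.0039],
    ["hamny-k7bbbbbcxd", "hamny-aaaaaaaaac"],
    chosen_idx=0, cost=-1, prob=0.57124,
)
# ex[1] == "0:-1:0.57124 |Action item_id=hamny-k7bbbbbcxd"
```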
I'm not sure whether this is the proper format or not. But when I run a CTR simulation, I get almost the same result from the CB model regardless of the exploration option (e.g. epsilon, bag, softmax, etc.).
I used the same logic as the tutorial function (run_simulation). The only differences are the example's shared context, the number of actions, and ADF.
Upvotes: 0
Views: 238
Reputation: 821
The VW text format is quite simple. When specifying a feature, a ':' followed by a float lets you set the feature's value; if there is no explicit value after the name, the value is 1.
So when you supply a feature as feat_1=0.00389094278216362, it is a categorical feature with value 1. The important thing to note here is that if any part of that feature string changes, it results in a completely different feature (the entire string is hashed to determine its index), so feat_1=0.00389094278216363 (last character changed) is a completely different feature. There is no relation between the two.
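To illustrate the hashing point (a rough sketch only — VW actually uses MurmurHash3 modulo the weight-table size, so the real indices differ), here is how two nearly identical categorical feature strings land on unrelated buckets:

```python
import hashlib

def toy_index(feature_string, num_bits=18):
    # Stand-in for VW's feature hashing (VW uses MurmurHash3; MD5 is used
    # here only to show the principle: the *whole* string maps to an index).
    h = int(hashlib.md5(feature_string.encode()).hexdigest(), 16)
    return h % (1 << num_bits)

a = toy_index("feat_1=0.00389094278216362")
b = toy_index("feat_1=0.00389094278216363")  # last digit changed
# a and b are (almost certainly) different buckets: the two "values"
# share no weight, so the model sees no relation between them.
```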
You could try specifying the value like feat_1:0.00389094278216362, but I am not sure if that will really work. Perhaps if there is some sort of linear relationship between the feature and the outcome?
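If you go the name:value route, the shared line would look like this instead (my sketch, assuming VW's standard numeric-feature syntax); a tiny Python formatter:

```python
def format_numeric(namespace, values):
    # Emit VW numeric features as name:value, so the float is used as the
    # feature's value rather than being part of the hashed feature name.
    feats = " ".join(f"feat_{i}:{v}" for i, v in enumerate(values))
    return f"|{namespace} {feats}"

line = format_numeric("User", [1.0, 0.25])
# line == "|User feat_0:1.0 feat_1:0.25"
```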
You could also try binning the features to some number of decimal places with rounding. So feat_1=0.00389094278216362 may become feat_1=0.004.
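Binning is easy to sketch in Python (illustration only; the right precision is something you would have to tune empirically):

```python
def bin_feature(name, value, decimals=3):
    # Round the value and bake it into the categorical feature name, so
    # nearby values collide into the same hashed feature.
    return f"{name}={round(value, decimals)}"

binned = bin_feature("feat_1", 0.00389094278216362)
# binned == "feat_1=0.004"
```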
I am not sure of the theory behind what should be done here, but those are my thoughts of things you could try empirically.
Upvotes: 1