Reputation: 459
I would like to restrict the training of a recommendation algorithm to a specific time window; for instance, I only want to train on data from the last 6 months or the last 3 months. Is there a way to configure this, or do I have to implement it myself? Thank you very much.
Upvotes: 0
Views: 158
Reputation: 459
I found that PredictionIO already supports this: PEventStore.find accepts optional startTime and untilTime arguments.
This is my pseudocode:
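import org.apache.spark.rdd.RDD
import org.joda.time.DateTime
import org.joda.time.format.DateTimeFormat

// Parse a "yyyyMMddHHmmss" string; return None when the parameter is null or empty.
def readEventTime(procTime: String): Option[DateTime] =
  Option(procTime).filter(_.nonEmpty)
    .map(s => DateTimeFormat.forPattern("yyyyMMddHHmmss").parseDateTime(s))

// Only events inside the [startTime, untilTime) window are read for training.
val itemsViewRDD: RDD[(String, Item)] = PEventStore.find(
  appName = dsp.appName,
  startTime = readEventTime(dsp.startTime),
  untilTime = readEventTime(dsp.untilTime)
)(sc).map ...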
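To get a rolling window such as "6 months back" instead of a fixed date, the value passed as dsp.startTime can be computed at training time. The following is only a minimal sketch: TrainingWindow and startTimeMonthsBack are hypothetical names, not part of PredictionIO, and it uses java.time instead of Joda-Time so that it runs without extra dependencies. It produces a string in the same "yyyyMMddHHmmss" pattern that readEventTime parses.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of PredictionIO.
object TrainingWindow {
  // Same pattern that readEventTime expects.
  private val fmt = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  // Timestamp string for `months` months before `now`.
  def startTimeMonthsBack(now: LocalDateTime, months: Int): String =
    now.minusMonths(months.toLong).format(fmt)
}
```

For example, TrainingWindow.startTimeMonthsBack(LocalDateTime.now(), 6) could be written into the data source parameters before each training run, leaving untilTime empty so that no upper bound is applied.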
Regards.
Upvotes: 0
Reputation: 5702
We are working on a more general way to solve this, but there is an engine in the experimental branch that retains a period of event data in the EventStore and discards the rest. This is the most performant way to do the training. It does mean that you need to run this engine periodically. https://github.com/PredictionIO/PredictionIO/tree/develop/examples/experimental/scala-cleanup-app
Be aware that this removes all old events, including $set events. If you $set a property and that event is later removed, the property is no longer set.
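The effect on properties can be illustrated with a small, self-contained sketch. It does not use PredictionIO's API: the Event case class and the CleanupSketch object are simplified, hypothetical stand-ins, and properties are assumed to be rebuilt by folding $set events in time order.

```scala
import java.time.Instant

// Hypothetical, simplified event model; not PredictionIO's actual Event class.
final case class Event(
  name: String,
  entityId: String,
  properties: Map[String, String],
  time: Instant
)

object CleanupSketch {
  // Keep only events at or after the cutoff, as a cleanup run would.
  def retain(events: Seq[Event], cutoff: Instant): Seq[Event] =
    events.filter(e => !e.time.isBefore(cutoff))

  // Rebuild each entity's properties by folding its $set events in time order.
  def currentProperties(events: Seq[Event]): Map[String, Map[String, String]] =
    events
      .filter(_.name == "$set")
      .sortBy(_.time.toEpochMilli)
      .foldLeft(Map.empty[String, Map[String, String]]) { (acc, e) =>
        acc.updated(e.entityId, acc.getOrElse(e.entityId, Map.empty[String, String]) ++ e.properties)
      }
}
```

If an item's only $set event is older than the cutoff, currentProperties returns nothing for that item after retain has been applied, even though newer view events for it survive. Re-sending $set for entities that are still in use, before each cleanup, is one way to avoid that.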
In the future we plan to support a TTL for events, so that the event store itself ages events out automatically. We will also store properties in a mutable store to remove them from the event stream. For now, use the cleanup engine.
Upvotes: 0