Scalding: retaining all fields after groupBy

Question

I'm using groupBy to compute a value, but after grouping I lose every field that is not one of the grouping keys:

filtered.filterNot('site) {s:String => ...}
        .filterNot('date) {s:String => ...}
aggr = filtered.groupBy('id, 'contentHost) { group =>
    group.min('timestamp -> 'min)
    //how do I keep original fields? (eg: site, date)
}

aggr.store(Tsv(...)) //eg: field "site" won't be here

In Pig, it would be something like this:

aggr = group filtered by concat('id, 'contentHost);

result = foreach aggr {
  generate flatten(filtered), //how to do this in scalding?
           min(filtered.timestamp) as min;
}


Answers (1)
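
One possible approach, sketched here under assumptions rather than taken from the original thread, is to compute the per-group minimum in a separate pipe and join it back onto the filtered pipe, so that every original row keeps its fields (site, date, ...) and gains the 'min column. This is roughly what flatten(filtered) plus min(...) produces in Pig. The job name, the input/output arguments, the field list, and the temporary names 'gId and 'gContentHost are hypothetical; only 'id, 'contentHost, 'site, 'date, 'timestamp and 'min come from the question.

```scala
import com.twitter.scalding._

// Hypothetical job: the class name, args, and input field layout are assumptions.
class KeepFieldsAfterGroupBy(args: Args) extends Job(args) {

  // Stand-in for the question's `filtered` pipe (the filterNot steps are omitted).
  val filtered = Tsv(args("input"), ('id, 'contentHost, 'site, 'date, 'timestamp)).read

  // Per-group minimum. After groupBy only the keys and 'min remain.
  // Assumption: 'timestamp values order correctly under min (e.g. numeric values).
  val mins = filtered
    .groupBy('id, 'contentHost) { group => group.min('timestamp -> 'min) }
    // Rename the keys so they do not collide with the left side of the inner join.
    .rename(('id, 'contentHost) -> ('gId, 'gContentHost))

  // Join the (smaller) aggregate back onto every original row, then drop the
  // duplicated key columns. Each row keeps 'site, 'date, etc. and gains 'min.
  filtered
    .joinWithSmaller(('id, 'contentHost) -> ('gId, 'gContentHost), mins)
    .discard('gId, 'gContentHost)
    .write(Tsv(args("output")))
}
```

The join costs an extra shuffle, but it avoids buffering whole groups in memory. If the groups are known to be small, collecting the rows inside the groupBy and flattening them afterwards may be a workable alternative.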
