Reputation: 597
After reading the answers to this question, I'm still a bit confused about the whole PackedSequence object. As I understand it, it is an object optimized for parallel processing of variable-sized sequences in recurrent models, a problem to which zero padding is one [imperfect] solution. It seems that, given a PackedSequence object, a PyTorch RNN will process each sequence in the batch to its end and not continue to process the padding. So why is padding needed here? Why are there both pack_padded_sequence() and pack_sequence() methods?
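For concreteness, here is a minimal sketch of the behaviour I mean (shapes and names are just illustrative):

```python
import torch
from torch.nn.utils.rnn import pack_sequence

# Two sequences of different lengths (3 and 2 steps), longest first
seqs = [torch.randn(3, 1), torch.randn(2, 1)]

packed = pack_sequence(seqs)  # a PackedSequence, no explicit padding in sight
rnn = torch.nn.RNN(input_size=1, hidden_size=4)
output, h_n = rnn(packed)  # the RNN stops at each sequence's true end
```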
Upvotes: 4
Views: 3682
Reputation: 24701
Mostly for historical reasons; torch.nn.utils.rnn.pack_padded_sequence() was created before torch.nn.utils.rnn.pack_sequence() (the latter first appeared in version 0.4.0, if I see correctly), and I suppose there was no reason to remove this functionality and break backward compatibility.
Furthermore, it's not always clear what the best/fastest way to pad your input is; it depends heavily on the data you are using. When the data was somehow padded beforehand (e.g. it was pre-padded and provided to you like that), it is faster to use pack_padded_sequence() directly (see the source code of pack_sequence; it calculates the length of each data point for you and calls pad_sequence followed by pack_padded_sequence internally, as sketched below). Arguably, pad_packed_sequence is rarely of use right now, though.
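As a rough illustration of that relationship (a minimal sketch, not the actual library source):

```python
import torch
from torch.nn.utils.rnn import (
    pack_sequence, pad_sequence, pack_padded_sequence,
)

seqs = [torch.tensor([1., 2., 3.]), torch.tensor([4., 5.])]  # longest first

# The convenient one-liner:
packed_a = pack_sequence(seqs)

# Roughly what it does under the hood:
lengths = [s.size(0) for s in seqs]  # computes each length for you
padded = pad_sequence(seqs)          # zero-pads to (max_len, batch)
packed_b = pack_padded_sequence(padded, lengths)

assert torch.equal(packed_a.data, packed_b.data)
```

So when your data arrives already padded, calling pack_padded_sequence() directly simply skips the redundant padding pass.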
Lastly, please notice the enforce_sorted argument, provided since version 1.2.0 for both of those functions. Not so long ago, users had to sort their data (or batch) with the longest sequence first and the shortest last; now this can be done internally when this parameter is set to False.
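For example (a small sketch; the tensor values are arbitrary):

```python
import torch
from torch.nn.utils.rnn import pack_sequence

# Shortest sequence first -- this used to require manual sorting
seqs = [torch.tensor([1.]), torch.tensor([2., 3., 4.])]

packed = pack_sequence(seqs, enforce_sorted=False)  # sorted internally now
```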
Upvotes: 5