I used a torchtext vocab to convert text to indices.
For example, I have 2 names:
aaban aabharan
After vocab:
[0, 0, 1, 0, 2] [0, 0, 1, 3, 0, 4, 0, 2]
The longest name in my data has length 24, so after calling
torch.nn.utils.rnn.pad_sequence([torch.tensor(name) for name in names], batch_first=True, padding_value=-1)
I got
tensor([ 0, 0, 1, 0, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]) tensor([ 0, 0, 1, 3, 0, 4, 0, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1])
But I want to create tensors of length 50, since there might be longer names that are not in the training data. How can I do that?
That is, how can I get the following:
tensor([ 0, 0, 1, 0, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,-1, -1, -1, -1, -1, -1, -1]) tensor([ 0, 0, 1, 3, 0, 4, 0, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,-1, -1, -1, -1, -1, -1,-1, -1, -1, -1 ])
I went through the documentation for both torchtext.vocab and torch.nn.utils, but I couldn't find a way to do this.
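To make the target concrete, here is a minimal sketch of one possible way to reach a fixed width of 50. It assumes that right-padding the output of `pad_sequence` with `torch.nn.functional.pad` is acceptable, and that names longer than 50 may be truncated. The helper name `pad_to_length` and the constants `MAX_LEN` and `PAD` are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

MAX_LEN = 50  # assumed fixed length (hypothetical constant)
PAD = -1      # same padding value as in the question


def pad_to_length(seqs, max_len=MAX_LEN, pad_value=PAD):
    """Pad a list of index lists to a (batch, max_len) tensor.

    Hypothetical helper: first pads to the longest sequence in the batch,
    then right-pads (or truncates) to exactly max_len.
    """
    batch = pad_sequence(
        [torch.tensor(s) for s in seqs],
        batch_first=True,
        padding_value=pad_value,
    )
    # Assumption: sequences longer than max_len are simply cut off.
    batch = batch[:, :max_len]
    # (0, n) means: add 0 elements on the left and n on the right of the last dim.
    return F.pad(batch, (0, max_len - batch.size(1)), value=pad_value)


names = [[0, 0, 1, 0, 2], [0, 0, 1, 3, 0, 4, 0, 2]]
padded = pad_to_length(names)
print(padded.shape)
```

Each row keeps its original indices at the front and is filled with -1 up to position 50, regardless of the longest name in the batch.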