a_jelly_fish

Reputation: 480

Extract elements from .npy file, convert them to PyTorch tensors

I read a .npy file that contains only the labels for images. The labels are stored as a dictionary. I need to convert them to an array of tensors, but I'm unable to extract the elements one after the other from the object that np.load returns, which is of type numpy.ndarray.


import numpy as np
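# allow_pickle=True is needed on recent NumPy versions because the file stores a pickled dict
data = np.load('/content/drive/My Drive/targets.npy', allow_pickle=True)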
print(data.item())


{0: array(5), 1: array(0), 2: array(4), 3: array(1), 4: array(9), 5: array(2), 6: array(1), 7: array(3)}

print(data[()].values())

dict_values([array(5), array(0), array(4), array(1), array(9), array(2), array(1), array(3)])

I would like to create an array of tensors instead.

Thanks in advance.

Upvotes: 0

Views: 2023

Answers (2)

a_jelly_fish

Reputation: 480

The following worked for me, with guidance from @kmario23:

import numpy as np
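# allow_pickle=True is needed on recent NumPy versions because the file stores a pickled dict
data = np.load('/content/drive/My Drive/targets.npy', allow_pickle=True)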
print(data.item())

{0: array(5), 1: array(0), 2: array(4), 3: array(1), 4: array(9), 5: array(2), 6: array(1), 7: array(3)}
# data is a 0-d numpy.ndarray that contains a dictionary. 

print(list(data[()].values()))

[array(5),
 array(0),
 array(4),
 array(1),
 array(9),
 array(2),
 array(1),
 array(3),
 array(1),
 array(4),
 array(3)]

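# torch.Tensor(5) treats 5 as a size and returns an uninitialized float tensor with 5 elements,
# e.g. tensor([2.0581e-35, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00])
# torch.tensor(5) treats 5 as data and returns tensor(5)

import torch

# collect the 0-d label arrays from the loaded dictionary
list_of_labels_array_form = list(data[()].values())
Labels = torch.stack([torch.tensor(i) for i in list_of_labels_array_form])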

print(Labels)

tensor([5, 0, 4,  ..., 2, 5, 0])
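
For anyone puzzled by the torch.Tensor(5) versus torch.tensor(5) outputs mentioned in the comments above, here is a small sketch that inspects both results. The variable names are just for illustration. As far as I understand it, the capitalized constructor interprets a bare integer as a size, while the lowercase factory function interprets it as data.

```python
import torch

# Legacy constructor: the integer is read as a size, so this allocates
# an uninitialized float tensor with 5 elements (the values are arbitrary).
uninitialized = torch.Tensor(5)
print(uninitialized.shape, uninitialized.dtype)

# Factory function: the integer is read as data, so this is a
# 0-d integer tensor holding the value 5.
scalar = torch.tensor(5)
print(scalar.shape, scalar.dtype, scalar.item())
```

This is why torch.tensor is the one to use when converting label values.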

Upvotes: 0

kmario23

Reputation: 61325

Assuming your data is a dict:

In [59]: dct = {0: np.array([5]), 1: np.array([0]), 2: np.array([4]), 
                3: np.array([1]), 4: np.array([9]), 5: np.array([2]), 
                6: np.array([1]), 7: np.array([3])}

You can use numpy.concatenate() wrapped in torch.tensor() to get a tensor out of it:

In [63]: torch.tensor(np.concatenate(list(dct.values())))
Out[63]: tensor([5, 0, 4, 1, 9, 2, 1, 3])
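
Note that np.concatenate() works here because each value is a 1-element 1D array. If the values are 0-D arrays, as in the printed output of the question, np.concatenate() raises a ValueError. A minimal sketch of one alternative, np.stack(), follows; the dictionary dct0 is a shortened, made-up stand-in for the real data.

```python
import numpy as np
import torch

# Assumed layout: 0-D arrays as values (shortened stand-in for the real labels)
dct0 = {0: np.array(5), 1: np.array(0), 2: np.array(4), 3: np.array(1)}

# np.concatenate cannot join 0-D arrays
try:
    np.concatenate(list(dct0.values()))
    concatenate_failed = False
except ValueError:
    concatenate_failed = True

# np.stack adds a new axis, so 0-D inputs become one 1D array of shape (4,)
stacked = np.stack(list(dct0.values()))
labels = torch.from_numpy(stacked)
print(labels)
```

torch.from_numpy shares memory with the NumPy array; torch.tensor(stacked) would make a copy instead.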

Additionally, if you want both keys and values stacked in a single 2D tensor, use torch.cat():

# tensor with just keys
In [86]: tk = torch.tensor(list(dct.keys()))
In [87]: tk
Out[87]: tensor([0, 1, 2, 3, 4, 5, 6, 7])

# tensor with just values
In [88]: tv = torch.tensor(np.concatenate(list(dct.values())))
In [89]: tv
Out[89]: tensor([5, 0, 4, 1, 9, 2, 1, 3])

# horizontally stack them into a single 2D tensor
In [85]: torch.cat((tk[:, None], tv[:, None]), dim=1)
Out[85]: 
tensor([[0, 5],
        [1, 0],
        [2, 4],
        [3, 1],
        [4, 9],
        [5, 2],
        [6, 1],
        [7, 3]])

After a series of comments, I have now understood your problem and here is the way to solve it:

In [48]: data_item = {0: np.array(5), 1: np.array(0), 2: np.array(4), 
                      3: np.array(1), 4: np.array(9), 5: np.array(2),
                      6: np.array(1), 7: np.array(3)}

# convert keys to a 1D tensor
In [53]: tk = torch.tensor(list(data_item.keys()))

In [54]: tk
Out[54]: tensor([0, 1, 2, 3, 4, 5, 6, 7])

Since the values are 0-D arrays (i.e. scalars), we need to extract the element from each one. For this, we can use a lambda function with map, which applies the lambda to every item of the iterable (here: data_item.values()) and returns the Python scalar via .item(). The resulting list can be passed to torch.tensor to get the desired 1D tensor.

# convert values to a 1D tensor
In [57]: tv = torch.tensor(list(map(lambda a: a.item(), data_item.values())))

In [58]: tv
Out[58]: tensor([5, 0, 4, 1, 9, 2, 1, 3])
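
For completeness, here is a rough end-to-end sketch of the same idea starting from a .npy file, with a list comprehension in place of map and lambda. The temporary file and the shortened targets dictionary are made up for illustration, and allow_pickle=True is my assumption about what is needed to load a pickled dict on recent NumPy versions.

```python
import os
import tempfile

import numpy as np
import torch

# Made-up stand-in for the real labels: 0-D arrays keyed by image index
targets = {0: np.array(5), 1: np.array(0), 2: np.array(4), 3: np.array(1)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "targets.npy")  # hypothetical file name
    np.save(path, targets)                   # the dict is wrapped in a 0-D object array
    loaded = np.load(path, allow_pickle=True)

# .item() on the 0-D object array returns the original dictionary
data_item = loaded.item()

# .item() on each 0-D value returns a plain Python int
tv = torch.tensor([v.item() for v in data_item.values()])
print(tv)
```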

# horizontally stack them into a single 2D tensor, if needed
In [85]: torch.cat((tk[:, None], tv[:, None]), dim=1)
Out[85]: 
tensor([[0, 5],
        [1, 0],
        [2, 4],
        [3, 1],
        [4, 9],
        [5, 2],
        [6, 1],
        [7, 3]])
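
As a side note, torch.stack() with dim=1 should give the same 2D result without the [:, None] indexing. A small sketch that compares the two, with tk and tv redefined here so the snippet runs on its own:

```python
import torch

tk = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7])  # keys
tv = torch.tensor([5, 0, 4, 1, 9, 2, 1, 3])  # values

# torch.cat joins along an existing axis, so each 1D tensor needs a column axis first
pairs_cat = torch.cat((tk[:, None], tv[:, None]), dim=1)

# torch.stack creates the new axis itself
pairs_stack = torch.stack((tk, tv), dim=1)

print(torch.equal(pairs_cat, pairs_stack))
```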

Upvotes: 1
