Leockl

Reputation: 2156

Running GPU on multiple PyTorch tensor operators

I have the following PyTorch tensor:

import numpy as np
import torch

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6]])
X = torch.FloatTensor(X).cuda()

Is there any difference (especially in speed) between Scenarios A and B below when I chain multiple PyTorch operators in one line?

Scenario A:

X_sq_sum = (X**2).cuda().sum(dim = 1).cuda()

Scenario B:

X_sq_sum = (X**2).sum(dim = 1).cuda()

i.e. Scenario A has two .cuda() calls whereas Scenario B has only one.

Many thanks in advance.

Upvotes: 0

Views: 460

Answers (1)

Dario Sučić

Reputation: 36

They will perform identically, as the CUDA conversion is only done once.

As described in the docs, a repeated .cuda() call is a no-op: if the tensor is already in CUDA memory on the correct device, no copy is performed and the original tensor is returned.
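The same no-op behavior can be observed on CPU with .to(), which follows the same rule as .cuda(): when the tensor is already on the requested device, the original object is returned unchanged. A minimal sketch (run on CPU so it works without a GPU; on a CUDA machine you could check x.cuda() is x.cuda() the same way):

```python
import torch

x = torch.tensor([[1., 3., 2., 3.], [2., 3., 5., 6.]])

# Moving a tensor to the device it already lives on returns
# the very same object -- no copy, no extra work.
assert x.to("cpu") is x

# So the redundant device call in Scenario A costs nothing:
x_sq_sum = (x ** 2).sum(dim=1)
print(x_sq_sum)  # tensor([23., 74.])
```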

Upvotes: 2
