Brian Formento

Reputation: 781

Where can I find the source code for torch.unique()?

In the PyTorch source code (https://github.com/pytorch/pytorch/blob/2367face24afb159f73ebf40dc6f23e46132b770/torch/functional.py#L783) I can only find calls to the following functions:

_VF.unique_dim() and torch._unique2()

but they don't seem to be defined anywhere else in the repository.

Upvotes: 3

Views: 2933

Answers (1)

jodag

Reputation: 22244

Most of the PyTorch backend code is implemented in C++ and/or CUDA. To see it, you need to find the appropriate entry point in the source code. There are a couple of ways to do this, but the easiest I've found without downloading all the code yourself is to search for the keywords on GitHub.

For example, if you go to github.com, search for unique_dim repo:pytorch/pytorch, and then click the "Code" tab on the left side, you should quickly find the following.

From torch/jit/_builtins.py:103

 17: _builtin_ops = [
...
103:    (torch._VF.unique_dim, "aten::unique_dim"),

From this and further analysis of the code we can conclude that torch._VF.unique_dim is actually invoking the aten::unique_dim function from the ATen library.
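One way to check this mapping from Python is to call the ATen operator directly and compare it with the public API. This is only a sketch: it assumes a reasonably recent PyTorch build in which the operator is reachable as torch.ops.aten.unique_dim, and the input tensor is made up for illustration.

```python
import torch

# Small made-up input with a duplicated row.
x = torch.tensor([[1, 2], [1, 2], [3, 4]])

# Public API: unique rows along dim 0.
public = torch.unique(x, dim=0)

# Direct call to the ATen operator (assumed to be exposed via torch.ops.aten).
# It always returns a 3-tuple: (values, inverse_indices, counts).
values, inverse, counts = torch.ops.aten.unique_dim(
    x, 0, sorted=True, return_inverse=True, return_counts=True
)

# Both paths should produce the same unique rows.
assert torch.equal(public, values)
```

If the two results match, that is consistent with torch.unique(..., dim=...) being a thin Python wrapper around aten::unique_dim.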

Like most functions in ATen, this one has multiple implementations. Most ATen functions are registered in aten/src/ATen/native/native_functions.yaml, and the entries there generally dispatch to separate _cpu and _cuda versions.
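To make the registration idea concrete, here is a rough sketch that pulls the backend-to-kernel mapping out of a native_functions.yaml-style entry using only the standard library. The SAMPLE text is an approximation written for illustration, not a verbatim copy of the file, and dispatch_table is a hypothetical helper; check the real file for the exact entry in your PyTorch version.

```python
# Illustrative excerpt in the style of native_functions.yaml (an assumption,
# not copied from the repository).
SAMPLE = """\
- func: unique_dim(Tensor self, int dim, bool sorted=True, bool return_inverse=False, bool return_counts=False) -> (Tensor, Tensor, Tensor)
  dispatch:
    CPU: unique_dim_cpu
    CUDA: unique_dim_cuda
"""


def dispatch_table(yaml_text, op_name):
    """Return {backend: kernel_name} for the first entry named op_name."""
    table = {}
    in_entry = False
    in_dispatch = False
    for line in yaml_text.splitlines():
        if line.startswith("- func:"):
            if in_entry:
                break  # the next entry starts here, so we are done
            name = line[len("- func:"):].strip().split("(", 1)[0]
            in_entry = name == op_name
            in_dispatch = False
        elif in_entry:
            stripped = line.strip()
            if stripped == "dispatch:":
                in_dispatch = True
            elif in_dispatch and line.startswith("    ") and ":" in stripped:
                backends, kernel = stripped.split(":", 1)
                # One line may list several backends separated by commas.
                for backend in backends.split(","):
                    table[backend.strip()] = kernel.strip()
            elif in_dispatch:
                in_dispatch = False  # left the indented dispatch block
    return table


print(dispatch_table(SAMPLE, "unique_dim"))
```

The kernel names it reports (unique_dim_cpu, unique_dim_cuda) are the C++ function names to search for next.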

Going back to the search results, we can see that the CUDA implementation calls the function unique_dim_cuda at aten/src/ATen/native/cuda/Unique.cu:197:

196: std::tuple<Tensor, Tensor, Tensor>
197: unique_dim_cuda(const Tensor& self, const int64_t dim, const bool sorted, const bool return_inverse, const bool return_counts) {
198:   return AT_DISPATCH_ALL_TYPES_AND2(kBool, kHalf, self.scalar_type(), "unique_dim", [&] {
199:     return unique_dim_cuda_template<scalar_t>(self, dim, false, return_inverse, return_counts);
200:   });
201: }

and the CPU implementation calls the function unique_dim_cpu at aten/src/ATen/native/Unique.cpp:271:

270: std::tuple<Tensor, Tensor, Tensor>
271: unique_dim_cpu(const Tensor& self, const int64_t dim, const bool sorted, const bool return_inverse, const bool return_counts) {
272:   return AT_DISPATCH_ALL_TYPES_AND2(at::ScalarType::BFloat16, at::ScalarType::Bool, self.scalar_type(), "unique_dim", [&] {
273:     // The current implementation using `dim` always sorts due to unhashable tensors
274:     return _unique_dim_cpu_template<scalar_t>(self, dim, false, return_inverse, return_counts);
275:   });
276: }

From this point you should be able to trace the function calls further down to see exactly what they are doing.
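If you do end up cloning the repository, the same keyword tracing can be done locally. The following is a minimal sketch of a grep-like helper, not part of PyTorch; find_keyword and its default suffix list are my own choices for illustration.

```python
from pathlib import Path


def find_keyword(root, keyword, suffixes=(".py", ".cpp", ".cu", ".h", ".yaml")):
    """Yield (path, line_number, line) for each source line containing keyword."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip files that cannot be read
        for number, line in enumerate(text.splitlines(), start=1):
            if keyword in line:
                yield path, number, line.strip()
```

For example, list(find_keyword("pytorch", "unique_dim_cuda_template")), where "pytorch" is a hypothetical path to your local clone, would list every place the CUDA template is defined or called.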

Following a similar string of searches you should find that torch._unique2 is implemented at aten/src/ATen/native/cuda/Unique.cu:188 and aten/src/ATen/native/Unique.cpp:264 for CUDA and CPU respectively.
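As a quick sanity check that torch._unique2 is the dim-less counterpart, you can call it directly. This is a sketch that assumes the private function is still exposed under that name in your PyTorch version; being private, it may change without notice.

```python
import torch

# Made-up 1-D input with one repeated value.
x = torch.tensor([3, 1, 3, 2])

# Private entry point used by torch.unique when no dim is given.
# Returns (values, inverse_indices, counts).
values, inverse, counts = torch._unique2(
    x, sorted=True, return_inverse=True, return_counts=True
)

# The public API should agree on the unique values.
assert torch.equal(values, torch.unique(x))
```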

Upvotes: 6
