Reputation: 51
Say I just have random samples from a distribution and no other data, e.g. a list of numbers: [1, 15, 30, 4, etc.]. What's the best way to estimate the distribution so that I can draw more samples from it in PyTorch?
I am currently assuming that all samples come from a Normal distribution, and I just use the mean and std of the samples to build it and draw from it. The samples, however, could come from any distribution.
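import torch
from torch.distributions import Normal
samples = torch.tensor([1., 2., 3., 4., 3., 2., 2., 1.])  # float tensor of observations
Normal(samples.mean(), samples.std()).sample()  # one draw from the fitted Normal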
Upvotes: 1
Views: 1323
Reputation: 24815
If you have enough samples (and preferably the sample dimension is higher than 1), you could model the distribution using a Variational Autoencoder or a Generative Adversarial Network (though I would stick with the first approach as it's simpler).
Basically, after a correct implementation and training, you would get a deterministic decoder able to decode a hidden code you pass it (say, a vector of size 10 drawn from a normal distribution) into a value from your target distribution.
Note that it might not be reliable at all, though, and it would be even harder if your samples are only 1D.
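To make the decoder idea concrete, here is a minimal sketch of a VAE, not a tuned implementation. The names `TinyVAE` and `train_vae`, the layer sizes, the learning rate and the number of steps are all assumptions made for illustration; only the latent size of 10 comes from the text above. It is trained on the 1D samples from the question, which is exactly the hard case, so treat the output with suspicion.

```python
import torch
from torch import nn

torch.manual_seed(0)  # only for repeatable runs

class TinyVAE(nn.Module):
    """Hypothetical minimal VAE: MLP encoder and decoder."""
    def __init__(self, data_dim, latent_dim=10, hidden=32):
        super().__init__()
        self.latent_dim = latent_dim
        # Encoder outputs mean and log-variance of the latent code.
        self.encoder = nn.Sequential(
            nn.Linear(data_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim))
        # Decoder maps a latent code back to data space.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, data_dim))

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        # Reparameterisation trick: z = mu + eps * std
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.decoder(z), mu, logvar

    @torch.no_grad()
    def sample(self, n):
        # Draw latent codes from N(0, I) and decode them.
        z = torch.randn(n, self.latent_dim)
        return self.decoder(z)

def train_vae(model, data, steps=200, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon, mu, logvar = model(data)
        # Squared-error reconstruction term plus KL divergence to N(0, I).
        recon_loss = ((recon - data) ** 2).sum(dim=-1).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
        (recon_loss + kl).backward()
        opt.step()
    return model

data = torch.tensor([1., 2., 3., 4., 3., 2., 2., 1.]).unsqueeze(1)  # shape (8, 1)
model = train_vae(TinyVAE(data_dim=1), data)
new_draws = model.sample(5)  # shape (5, 1)
```

Since the decoder is deterministic, all randomness in `new_draws` comes from the latent code `z`.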
Upvotes: 1
Reputation: 1240
The best way depends on what you want to achieve. If you don't know the underlying distribution, you need to make assumptions about it and then fit a suitable distribution (that you know how to sample) to your samples. You may start with something simple like a Mixture of Gaussians (several normal distributions with different weightings).
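As a rough sketch of the Mixture of Gaussians idea in PyTorch, you can build the mixture with `MixtureSameFamily` and fit its parameters by maximising the log-likelihood. Note the assumptions: `fit_gmm`, the number of components, the step count, the learning rate and the `min_std` floor are all made up for illustration, and plain gradient descent stands in for the more usual EM algorithm.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, Normal

torch.manual_seed(0)  # only for repeatable runs

def fit_gmm(samples, n_components=2, steps=500, lr=0.05, min_std=1e-2):
    """Hypothetical helper: fit a 1D Gaussian mixture by gradient-based MLE."""
    x = samples.flatten().float()
    # Learnable parameters: mixture logits, component means, log std-devs.
    logits = torch.zeros(n_components, requires_grad=True)
    init_means = torch.quantile(x, torch.linspace(0.1, 0.9, n_components))
    means = init_means.detach().clone().requires_grad_(True)
    log_stds = torch.full((n_components,), x.std().log().item(),
                          requires_grad=True)
    opt = torch.optim.Adam([logits, means, log_stds], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # min_std keeps a component from collapsing onto a single point.
        stds = log_stds.exp() + min_std
        mix = MixtureSameFamily(Categorical(logits=logits), Normal(means, stds))
        loss = -mix.log_prob(x).mean()  # negative log-likelihood
        loss.backward()
        opt.step()
    # Return a mixture with fixed (detached) parameters for sampling.
    return MixtureSameFamily(
        Categorical(logits=logits.detach()),
        Normal(means.detach(), log_stds.detach().exp() + min_std))

samples = torch.tensor([1., 2., 3., 4., 3., 2., 2., 1.])
gmm = fit_gmm(samples)
new_draws = gmm.sample((5,))  # five new values from the fitted mixture
```

With only a handful of samples the fit is very loose, so the number of components is something you would have to choose or validate yourself.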
Another way is to define a discrete distribution over the values you have. You will give each value the same probability, say p(x)=1/N. When you sample from it, you simply draw a random integer from [0,N) that points to one of your samples.
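The discrete option amounts to resampling with replacement from the values you already have. A minimal sketch, where the function name `resample` is just an illustrative choice:

```python
import torch

def resample(samples: torch.Tensor, n: int) -> torch.Tensor:
    """Draw n values uniformly at random, with replacement, from the observed samples."""
    flat = samples.flatten()
    idx = torch.randint(0, flat.numel(), (n,))  # random integers in [0, N)
    return flat[idx]

samples = torch.tensor([1., 2., 3., 4., 3., 2., 2., 1.])
new_draws = resample(samples, 5)
```

This can only ever return values that were already observed, and values that occur several times in the list are drawn proportionally more often.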
Upvotes: 0