Harry

Reputation: 65

The optimal grid size for 2D kernel density distribution in R

I am generating 2D kernel density distributions for every pair of numeric columns in a data set, using the kde2d function in the MASS package.

This takes the following parameters:

kde2d(x, y, h, n=25, lims = c(range(x), range(y)))

where n is the "Number of grid points in each direction. Can be scalar or a length-2 integer vector".

I want to optimize the dimensions of the grid for every pair of columns. At the moment, I use fixed dimensions of 10x10. Does anyone know a formula for optimizing the grid size so I can generate optimal density estimates for each pair of columns?

Thanks

Upvotes: 0

Views: 459

Answers (1)

AEF

Reputation: 5670

The parameter n in this function does not influence the density estimate itself, only the resolution of the grid on which the estimate is evaluated, and therefore its graphical representation. It should depend on the size of the plot you want to create, not on the data.
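As a rough check of this point, the sketch below evaluates the same estimate on a 25x25 and a 49x49 grid with identical h and lims. The simulated data, the seed, and the variable names are assumptions made for illustration only. Because every second point of the 49-point grid coincides with a point of the 25-point grid, the density values at those shared points should agree up to floating-point error.

```r
library(MASS)

# Simulated data, purely for illustration (not from the question).
set.seed(42)
x <- rnorm(300)
y <- 0.5 * x + rnorm(300)

# Fix the bandwidth and the limits so that only n differs between the two calls.
h    <- c(bandwidth.nrd(x), bandwidth.nrd(y))
lims <- c(range(x), range(y))

d25 <- kde2d(x, y, h = h, n = 25, lims = lims)
d49 <- kde2d(x, y, h = h, n = 49, lims = lims)

# Every second point of the 49-point grid lies on the 25-point grid.
shared <- seq(1, 49, by = 2)

# Largest absolute difference between the two estimates at the shared points.
max_diff <- max(abs(d25$z - d49$z[shared, shared]))
print(max_diff)
```

A finer grid only adds evaluation points between the existing ones, so a larger n gives a smoother-looking plot at a higher computational cost.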

On the other hand, your density estimate is indeed influenced by the choice of bandwidth h. To choose an optimal bandwidth, you will need to know (or assume) the distribution of your data.
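A minimal sketch of choosing h per pair of columns, assuming the bandwidth selectors that ship with MASS are acceptable for your data: bandwidth.nrd is the normal reference rule that kde2d uses by default (it assumes roughly Gaussian data), and width.SJ is the Sheather-Jones selector, which relies less on that assumption. The simulated data and variable names are illustrative only.

```r
library(MASS)

# Simulated data, purely for illustration (not from the question).
set.seed(1)
x <- rnorm(200)
y <- rnorm(200, sd = 2)

# Normal reference rule: the default bandwidth used by kde2d.
h_nrd <- c(bandwidth.nrd(x), bandwidth.nrd(y))

# Sheather-Jones selector, computed separately for each margin.
h_sj <- c(width.SJ(x), width.SJ(y))

# Same grid, different bandwidths: here the estimate itself changes.
d_nrd <- kde2d(x, y, h = h_nrd, n = 50)
d_sj  <- kde2d(x, y, h = h_sj,  n = 50)

print(rbind(nrd = h_nrd, sj = h_sj))
```

Both selectors treat the two margins independently, which is a simplification. Whether it is adequate depends on how strongly the two columns are correlated.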

Upvotes: 1
