Kevin T

Reputation: 151

Print histograms including variable name for all variables in R

I'm trying to generate a simple histogram for every variable in my dataframe, which I can do using sapply as shown below. But how can I include the name of the variable in either the title or the x-axis label so I know which one I'm looking at? (I have about 20 variables.)

Here is my current code:

# x is my dataframe (initialized elsewhere)
sapply(x, hist)

Upvotes: 1

Views: 2576

Answers (2)

moooh

Reputation: 469

How about this? Assuming you have wide data, you can transform it to long format with gather. Then a ggplot solution with geom_histogram and facet_wrap gives one panel per variable, each labeled with the variable name:

library(tidyverse)

# make wide data (20 columns)
df <- matrix(rnorm(1000), ncol = 20)
df <- as.data.frame(df)
colnames(df) <- LETTERS[1:20]

# transform to long format (2 columns)
df <- gather(df, key = "name", value = "value")

# plot histograms per name
ggplot(df) +
  geom_histogram(aes(value)) +
  facet_wrap(~name, ncol = 5)
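
In current versions of tidyr, gather() is superseded by pivot_longer(). A minimal sketch of the same idea with that function follows; the object names `wide`, `long`, and `p`, the seed, the bin count, and the free x-axis scales are my own assumptions, not part of the answer above.

```r
library(tidyr)
library(ggplot2)

# hypothetical example data: 50 rows, 20 numeric columns named A..T
set.seed(1)
wide <- as.data.frame(matrix(rnorm(1000), ncol = 20))
colnames(wide) <- LETTERS[1:20]

# reshape to long format: one row per (variable name, value) pair
long <- pivot_longer(wide, cols = everything(),
                     names_to = "name", values_to = "value")

# one panel per variable; the strip above each panel shows the variable name
p <- ggplot(long, aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~name, ncol = 5, scales = "free_x")
print(p)
```

Setting scales = "free_x" lets each variable get its own x-axis range, which may help when the columns are on different scales; drop it if you want the panels to be directly comparable.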

Upvotes: 1

lefft

Reputation: 2105

Here's a way to modify your existing approach to include column name as the title of each histogram, using the iris dataset as an example:

# loop over column *names* instead of actual columns
# (invisible() keeps the returned histogram objects out of the console)
invisible(sapply(names(iris), function(cname) {
  # make sure we only plot the numeric columns
  if (is.numeric(iris[[cname]])) {
    # use `main` for the plot title and `xlab` for the x-axis label
    hist(iris[[cname]], main = cname, xlab = cname)
  }
}))

After you run that, you'll be able to flip through the plots with the arrows in the Plots pane (assuming you're using RStudio).

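p.s. if you want to arrange the histograms onto a single plot window and save them to a single file, check out par(mfrow = ...) for base graphics plots like these, or grid::grob(), gridExtra::grid.arrange(), and related functions for grid-based plots such as ggplot2 output.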
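
For base graphics histograms like the ones above, one possible way to get them all on one page and into a single file is par(mfrow = ...) together with a graphics device such as pdf(). This is only a sketch: the output file name, the page size, and the 2 x 2 layout are assumptions chosen for the four numeric columns of iris.

```r
# keep only the numeric columns of the example data
num_cols <- names(iris)[sapply(iris, is.numeric)]

# hypothetical output file; change the path as needed
out_file <- file.path(tempdir(), "iris_histograms.pdf")

pdf(out_file, width = 8, height = 8)
op <- par(mfrow = c(2, 2))  # 2 x 2 grid, filled row by row
for (cname in num_cols) {
  # column name as both the title and the x-axis label
  hist(iris[[cname]], main = cname, xlab = cname)
}
par(op)    # restore the previous layout setting
dev.off()  # close the device so the file is written
```

With about 20 variables, something like par(mfrow = c(4, 5)) and a larger page size would be the analogous layout.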

Upvotes: 2
