Reputation: 1

How do I create a histogram in R for two-column data?

user_a - 3
user_b - 4
user_c - 1
user_d - 4

I want to show the distribution of the number of tweets per author in R using a histogram. The original file has 1048575 such rows. I tried hist(df$twitter_count, nrow(df)), but I don't think it's correct.

Upvotes: 0

Answers (3)

LeMarque

Reputation: 783

Since you said distribution for 'each user', I think it should be a bar plot:

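library(data.table)

# Read the sample rows; fread is expected to split on spaces here,
# giving V1 = user, V2 = "-", V3 = count
dat <- fread("
  user_a - 3
  user_b - 4
  user_c - 1
  user_d - 4"
)

# One bar per user, with height equal to that user's tweet count
barplot(as.numeric(dat$V3), names.arg = dat$V1)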

Or, if you are looking for a histogram of the counts:

hist(as.numeric(dat$V3), xlab = "", main="Histogram")

Upvotes: 0

kangaroo_cliff

Reputation: 6222

It seems I misunderstood the question. I think the following could be what the OP is looking for.

library(ggplot2)

# Example data: one row per user with a random tweet count
df <- data.frame(user = letters, 
                 twitter_count = sample.int(200, 26))

# One bar per user
ggplot(df, aes(user, twitter_count)) +
  geom_col()

Alternatively, assuming you are looking for multiple histograms (one per user), use the code below and replace user with the corresponding variable name in your data.frame.

# Example data
df <- data.frame(user = iris$Species, 
                 twitter_count= round(iris[, 1]*10))

# Histograms using ggplot2 package
library(ggplot2)
ggplot(df, aes(x = twitter_count)) +
  geom_histogram() + facet_grid(.~user)

If your data contain many Twitter users, one panel per user quickly becomes unreadable, so it is best to use an alternative method to see the distribution of tweet counts.
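One possible alternative, as a rough sketch only: tabulate how many authors have each tweet count and plot that table as bars. This assumes one row per author and a twitter_count column; the data below are simulated stand-ins, not the OP's file, and the name count_freq is just for illustration.

```r
# Simulated stand-in for the real data: one row per author (assumption)
set.seed(42)
df <- data.frame(user = paste0("user_", seq_len(1000)),
                 twitter_count = rpois(1000, lambda = 3) + 1)

# How many authors have 1 tweet, 2 tweets, and so on
count_freq <- table(df$twitter_count)

# One bar per distinct tweet count, however many authors there are
barplot(count_freq,
        xlab = "Tweets per author",
        ylab = "Number of authors",
        main = "Distribution of tweets per author")
```

The number of bars depends on the number of distinct counts rather than the number of users, so the plot should stay readable even with a very large file.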

Upvotes: 3

James Thomas Durant

Reputation: 305

If each row of the data.frame represents a user:

set.seed(1)
df <- data.frame(user = letters, twitter_count = rpois(26, lambda = 4) + 1)
hist(df$twitter_count)
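On the call in the question: the second positional argument of hist() is breaks, so hist(df$twitter_count, nrow(df)) asks for roughly one bin per row. A minimal sketch with a modest number of bins, assuming one row per author (the data are simulated and the value 30 is an arbitrary choice, not a recommendation):

```r
set.seed(1)
# Hypothetical large data set: one row per author (assumption)
df <- data.frame(twitter_count = rpois(1e5, lambda = 4) + 1)

# A single number for `breaks` is only a suggestion; hist() picks
# "pretty" break points close to that number of bins
h <- hist(df$twitter_count,
          breaks = 30,
          xlab = "Tweets per author",
          main = "Histogram of tweets per author")

# hist() invisibly returns the bin counts; every row falls in some bin
sum(h$counts) == nrow(df)
```

Leaving breaks out entirely, as in the code above, lets hist() choose the bins by its default rule, which is often a reasonable starting point.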

Upvotes: 1
