Reputation: 33
Below is the dataset. https://docs.google.com/spreadsheet/ccc?key=0AjmK45BP3s1ydEUxRWhTQW5RczVDZjhyell5dUV4YlE#gid=0
Code:
counts = table(finaldata$satjob, finaldata$degree)
barplot(counts, xlab = "Highest Degree after finishing 9-12th Grade",
        col = c("Dark Blue", "Blueviolet", "deepPink4", "goldenrod"),
        legend = rownames(counts))
The below barplot is the result of the above code. https://docs.google.com/file/d/0BzmK45BP3s1yVkx5OFlGQk5WVE0/edit
Now I want to plot the relative frequency table of "counts".
To build a relative frequency table, I need to divide each cell of a column by that column's total, and do the same for every other column. How do I go about doing that?
I have tried counts/sum(counts), but that does not give what I need. counts[1:4]/sum(counts[1:4]) gives me the relative frequencies for the first column only.
Please help me obtain the same for the other columns in the same table.
Upvotes: 0
Views: 8011
Reputation: 3622
I'm a big fan of plyr and ggplot2, so you may have to download a few packages for the below to work.
install.packages('ggplot2') # only have to run once
install.packages('plyr') # only have to run once
install.packages('scales') # only have to run once
library(plyr)
library(ggplot2)
library(scales)
# your data frame (called finaldata in the question)
dat <- finaldata
# one row per degree/satjob combination, with the count in a 'freq' column
dat_count <- count(dat, c('degree', 'satjob'))
# within each degree, divide each count by that degree's total
dat_rel_freq <- ddply(dat_count, .(degree), transform, rel_freq = freq / sum(freq))
ggplot(dat_rel_freq, aes(x = degree, y = rel_freq, fill = satjob)) +
  geom_bar(stat = 'identity') +
  scale_y_continuous(labels = percent) +
  labs(title = 'Highest Degree After finishing 9-12th Grade\n',
       x = '',
       y = '',
       fill = 'Job Satisfaction')
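If you would rather stay in base R, column-wise relative frequencies can probably be had with prop.table() and margin = 2. The sketch below is only an illustration: the small finaldata frame is made up so the snippet runs on its own (its values are not from the linked spreadsheet), and the names rel_freq and rel_freq_sweep are just ones I picked.

```r
# Hypothetical stand-in for the real data; replace with your own finaldata.
finaldata <- data.frame(
  satjob = rep(c("Very Satisfied", "Mod. Satisfied",
                 "A Little Dissatisfied", "Very Dissatisfied"),
               times = c(8, 6, 4, 2)),
  degree = rep(c("High School", "Bachelor"), times = 10)
)

counts <- table(finaldata$satjob, finaldata$degree)

# margin = 2 divides every cell by its column total,
# so each column of rel_freq sums to 1.
rel_freq <- prop.table(counts, margin = 2)

# The same calculation spelled out with sweep(), for comparison.
rel_freq_sweep <- sweep(counts, 2, colSums(counts), "/")

barplot(rel_freq,
        xlab = "Highest Degree after finishing 9-12th Grade",
        ylab = "Relative frequency",
        col = c("Dark Blue", "Blueviolet", "deepPink4", "goldenrod"),
        legend.text = rownames(rel_freq))
```

By contrast, counts/sum(counts) divides by the grand total of the whole table, which is why the columns do not each add up to 1 with that formula.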
Upvotes: 1