Reputation: 1273
I have a dataframe which is as follow:
Name Condition NumMessage
Table 1 NULL 80
Table 1 Fair 20
Table 1 Good 60
Table 1 Ideal 50
Table 1 Great 80
Table 2 NULL 80
Table 2 Fair 100
Table 2 Good 90
Table 2 Ideal 50
Table 2 Great 40
and so on. I tried to create a frequency table for the number of message for each table.
data = as.data.frame(prop.table(table(dataframe$Name)))
colnames(data) = c('Table Name', 'Frequency')
data
but this returns same frequency for all tables. For example, Table 1 contains total of 290 messages where Table 2 contains 360 messages. But the above code gives same frequency for both tables.
Also when I tried to get frequency of each condition for each table, I also got same numbers across tables.
prop.table(table(dataframe$Condition, dataframe$Name))
NULL | some value
Fair | some value
Good | some value
Ideal | some value
Great | some value
is this the correct way to get the frequency of total number of messages for each table and frequency of conditions for each table?
Upvotes: 0
Views: 607
Reputation: 1632
You can always tackle this with the package sqldf
.
library(sqldf)
Name<-c('Table1','Table1','Table1','Table1','Table1','Table2','Table2','Table2','Table2','Table2')
Cond<-c(NA,'Fair','Good','Ideal','Great',NA,'Fair','Good','Ideal','Great')
Msg<-c(80,20,60,50,80,80,100,90,50,40)
df<-data.frame(Name,Cond,Msg)
Your dataframe:
Name Cond Msg
1 Table1 <NA> 80
2 Table1 Fair 20
3 Table1 Good 60
4 Table1 Ideal 50
5 Table1 Great 80
6 Table2 <NA> 80
7 Table2 Fair 100
8 Table2 Good 90
9 Table2 Ideal 50
10 Table2 Great 40
Now simply use this statement for sum of messages for each table:
sqldf("select Name, sum(Msg) from df group by Name ")
Name sum(Msg)
1 Table1 290
2 Table2 360
If you want sum of messages for each condition then use:
sqldf("select Cond, sum(Msg) from df group by Cond ")
Cond sum(Msg)
1 <NA> 160
2 Fair 120
3 Good 150
4 Great 120
5 Ideal 100
Hope that helps.
Upvotes: 0
Reputation: 887223
We could try with acast
library(reshape2)
prop.table(acast(df1, Name~Condition, value.var='NumMessage', sum),1)
# Fair Good Great Ideal NULL
#Table 1 0.06896552 0.2068966 0.2758621 0.1724138 0.2758621
#Table 2 0.27777778 0.2500000 0.1111111 0.1388889 0.2222222
Upvotes: 3
Reputation: 93843
xtabs
is the base R way to get a summed contingency table.
prop.table(xtabs(NumMessage ~ ., data=df), 1)
# Condition
#Name Fair Good Great Ideal NULL
# Table1 0.06896552 0.20689655 0.27586207 0.17241379 0.27586207
# Table2 0.27777778 0.25000000 0.11111111 0.13888889 0.22222222
Upvotes: 3
Reputation: 304
If we call your dataset df, then perhaps this is what you are looking for?
df1 = subset(df, Name=='Table1')
df2 = subset(df, Name=='Table2')
prop.table(df1[,3])
prop.table(df2[,3])
aggregate(df1$NumMessage, list(df1$Name), sum)
aggregate(df1$NumMessage, list(df2$Name), sum)
Upvotes: 0