Reputation: 11
I have the following dataset (abbreviated below). On occasion, I want to run a t-test (or other test) on a subset of the data, for instance comparing dcxd in rows with d=1 & c=1 vs. d=0 & c=0. The closest I've come is using aggregate() to get the means for these groups, but I have been unable to perform any tests on the data. Any ideas on how to achieve this?
(df <- read.table(header = TRUE, text = " exp n s d t dcxd brdud cod
1 1 966 0 1 1 44444 63248 20513
2 1 967 0 0 1 69124 165899 101382
3 1 968 0 0 1 126627 338462 195266
4 1 969 0 1 0 25517 10207 7655
5 1 970 0 0 0 62374 46278 28169
6 1 971 1 1 1 48366 73203 41830
7 1 972 1 0 1 78292 138790 65243
8 1 973 1 1 0 99379 49689 37267
9 1 974 1 0 0 52724 8787 1757
10 2 978 0 0 0 11686 6678 1669"))
# exp n s d t dcxd brdud cod
# 1 1 966 0 1 1 44444 63248 20513
# 2 1 967 0 0 1 69124 165899 101382
# 3 1 968 0 0 1 126627 338462 195266
# 4 1 969 0 1 0 25517 10207 7655
# 5 1 970 0 0 0 62374 46278 28169
# 6 1 971 1 1 1 48366 73203 41830
# 7 1 972 1 0 1 78292 138790 65243
# 8 1 973 1 1 0 99379 49689 37267
# 9 1 974 1 0 0 52724 8787 1757
# 10 2 978 0 0 0 11686 6678 1669
Upvotes: 1
Views: 55
Reputation: 3678
Here are two solutions.
Create subsets of df:
d1 <- df[df$d == 1 & df$s == 1, ]
d2 <- df[df$d == 0 & df$s == 0, ]
t.test(d1$dcxd, d2$dcxd)
Or, without creating subsets:
t.test(df[df$d == 1 & df$s == 1, 'dcxd'], df[df$d == 0 & df$s == 0, 'dcxd'])
Both give the same result:
Welch Two Sample t-test
data: df[df$d == 1 & df$s == 1, "dcxd"] and df[df$d == 0 & df$s == 0, "dcxd"]
t = 0.185, df = 2.759, p-value = 0.866
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-109662.0 122501.5
sample estimates:
mean of x mean of y
73872.50 67452.75
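Since the question also mentions "other tests", the same comparison can be written with the formula interface, which makes it easy to swap in a different test. The following is only a sketch under a few assumptions: it filters on d and s as the code above does (the question text says c, which is not a column in the abbreviated data), and the names sel, group, d1_s1 and d0_s0 are made up for illustration.

```r
# Same abbreviated data as in the question
df <- read.table(header = TRUE, text = " exp n s d t dcxd brdud cod
1 1 966 0 1 1 44444 63248 20513
2 1 967 0 0 1 69124 165899 101382
3 1 968 0 0 1 126627 338462 195266
4 1 969 0 1 0 25517 10207 7655
5 1 970 0 0 0 62374 46278 28169
6 1 971 1 1 1 48366 73203 41830
7 1 972 1 0 1 78292 138790 65243
8 1 973 1 1 0 99379 49689 37267
9 1 974 1 0 0 52724 8787 1757
10 2 978 0 0 0 11686 6678 1669")

# Keep only the rows belonging to one of the two groups of interest
# (assumption: the groups are defined by d and s, as in the answer above)
keep <- with(df, (d == 1 & s == 1) | (d == 0 & s == 0))
sel <- df[keep, ]

# Hypothetical grouping factor that labels each remaining row
sel$group <- factor(ifelse(sel$d == 1, "d1_s1", "d0_s0"))

# Welch t-test via the formula interface
res <- t.test(dcxd ~ group, data = sel)
print(res)

# Another two-sample test only needs a different function name
print(wilcox.test(dcxd ~ group, data = sel))
```

Because the factor levels are sorted alphabetically, d0_s0 becomes the first group here, so the sign of the t statistic is flipped relative to the output above; the p-value is the same.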
Upvotes: 1