doorguote

Reputation: 413

How do I do a pairwise t-test on data based on paired interactions of one column (i.e. factor)?

I have a data set as a comma-delimited .csv (here). In this data set, I have measured data ("value" column) which is factored both by the station at which it was measured ("Site") and the date it was taken ("Date"). What I'm trying to do is run a pairwise t-test on the data values for each combination of Site ("INLET EAST"/"SF EAST 1", "INLET EAST"/"OUTLET EAST", etc.). I have no problem running those iterations manually, but I have no idea how to tell the t.test function to pair each set of data based on common Date values. Anyone have any pointers? I wouldn't be opposed to any ideas on how to also streamline the iterative t.test process across the "Site" factor, either. Thanks for all your help over the few months I've been here.

For those for whom the link doesn't work, my data has this structure:

Date     Site        Slope    Location    Season    variable    value
15628    Inlet East  H        Inlet       W         TKN         1.92
15694    Inlet East  H        Inlet       W         TKN         0.98
15628    Outlet East L        Outlet      W         TKN         0.93

...etc.

Upvotes: 1

Views: 984

Answers (2)

albifrons

Reputation: 305

Here is a proposal in the form of a function; I have read your file data.csv in as "myData". You can pass the name (or column number) of the independent variable as you wish. In your case I would improve the function by adding a check that the grouped data are normally distributed.

foo <- function(dataFrame, dataDepV, dataIndepV) {
  # Accept either column names or column numbers
  if (is.character(dataDepV))   dataDepV   <- which(names(dataFrame) == dataDepV)
  if (is.character(dataIndepV)) dataIndepV <- which(names(dataFrame) == dataIndepV)
  # as.character() so that factor levels are compared by their labels
  allFactors <- as.character(unique(dataFrame[, dataIndepV]))

  # Welch two-sample (unpaired) t-test for one pair of levels; x has length 2
  foo2 <- function(x) {
    group1 <- dataFrame[dataFrame[, dataIndepV] == x[1], dataDepV]
    group2 <- dataFrame[dataFrame[, dataIndepV] == x[2], dataDepV]
    t.test(group1, group2)
  }

  # simplify = FALSE keeps one "htest" object per pair. The components that
  # t.test returns differ between R versions, so no fixed row names are set.
  myEndResult        <- combn(allFactors, 2, foo2, simplify = FALSE)
  names(myEndResult) <- combn(allFactors, 2, paste, collapse = " vs. ")
  myEndResult
}

A <- foo(dataFrame = myData, dataDepV = "value", dataIndepV = "Site")

A                                     # prints the t-test result for every pair of sites
sapply(A, function(res) res$p.value)  # just the p-values, one per pair
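
The function above compares each pair of sites with an unpaired test. A minimal sketch of the pairing on Date that the question asks for could look like the following. It assumes the column names "Date", "Site" and "value" from the question, at most one value per Site and Date (so filter to a single "variable" such as TKN first), and a toy data frame whose values are mostly made up. The helper name paired_by_date is hypothetical. The Shapiro-Wilk test on the paired differences is one possible form of the normality check suggested above.

```r
# Toy data with the question's column names; most values are invented
myData <- data.frame(
  Date  = rep(c(15628, 15694, 15720, 15750), times = 3),
  Site  = rep(c("Inlet East", "SF East 1", "Outlet East"), each = 4),
  value = c(1.92, 0.98, 1.40, 1.10,
            1.50, 0.90, 1.10, 1.00,
            0.93, 0.70, 0.80, 0.95)
)

# Hypothetical helper: paired t-test for two sites, matched on Date
paired_by_date <- function(df, sites) {
  a <- df[df$Site == sites[1], c("Date", "value")]
  b <- df[df$Site == sites[2], c("Date", "value")]
  # merge() keeps only Dates measured at both sites, so each row is one pair
  both <- merge(a, b, by = "Date", suffixes = c(".a", ".b"))
  d    <- both$value.a - both$value.b
  tt   <- t.test(both$value.a, both$value.b, paired = TRUE)
  data.frame(
    pair      = paste(sites, collapse = " vs. "),
    n_dates   = nrow(both),
    mean_diff = unname(tt$estimate),
    p_value   = tt$p.value,
    shapiro_p = shapiro.test(d)$p.value  # normality of the paired differences
  )
}

# Every pair of sites, one summary row per pair
site_pairs <- combn(unique(as.character(myData$Site)), 2, simplify = FALSE)
results    <- do.call(rbind, lapply(site_pairs, paired_by_date, df = myData))
results
```

Matching with merge() rather than relying on row order means dates missing at one site are dropped from that pair instead of silently misaligning the two vectors. With many pairs, the p-values would normally need a multiple-comparison adjustment, for example with p.adjust().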

Upvotes: 0

sckott

Reputation: 5903

Here's an example of how to run a t-test for each species of the iris data set, comparing Sepal.Length with Sepal.Width within each species:

library(plyr)
foo <- function(df) {
  t.test(df$Sepal.Length, df$Sepal.Width)  # the vector interface takes no data argument
}
models <- dlply(iris, .(Species), foo)
models
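
If you would rather not depend on plyr, the same split-and-apply pattern can be sketched in base R. This is only an equivalent of the example above, not something from the original answer; the names by_species and p_values are my own.

```r
# Split iris into one data frame per species, then test each piece
by_species <- split(iris, iris$Species)
models <- lapply(by_species, function(df) {
  t.test(df$Sepal.Length, df$Sepal.Width)
})

# Pull one number out of each result, e.g. the p-value
p_values <- sapply(models, function(m) m$p.value)
p_values
```

The result is a plain named list, so individual tests are available as models$setosa and so on.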

Upvotes: 2
