Describing expression over time in new column

Question

I have some gene expression data that look like this:

> d<-read.csv("gene_data.txt", header=TRUE, stringsAsFactors = FALSE)
> d

  gene_id     day_1     day_2     day_3     day_4
1  Gene_1 -3.836501 -4.643856 -5.058894 -5.058894
2  Gene_2 13.161867  6.740118 13.507918 13.349972
3  Gene_3 -6.643856  5.766860 -6.127014 -6.726967
4  Gene_4 -2.736966 -3.058894 -2.643856 -2.943416
5  Gene_5 -2.836501 -3.473931  3.643856 -4.321928
6  Gene_6  2.836501 -3.058894  3.836501 -5.643856
7  Gene_7 11.000232 11.353974 10.792245 10.309476

Reading each row from left to right, you can see that expression is always negative for some genes, always positive for others, and mixed for the rest. I'd like to add a new column describing whether each gene is consistently positive, consistently negative, or mixed. Something like this:

> d$new_column<-c("down","up","mixed","down","mixed","mixed","up")
> d
  gene_id     day_1     day_2     day_3     day_4 new_column
1  Gene_1 -3.836501 -4.643856 -5.058894 -5.058894       down
2  Gene_2 13.161867  6.740118 13.507918 13.349972         up
3  Gene_3 -6.643856  5.766860 -6.127014 -6.726967      mixed
4  Gene_4 -2.736966 -3.058894 -2.643856 -2.943416       down
5  Gene_5 -2.836501 -3.473931  3.643856 -4.321928      mixed
6  Gene_6  2.836501 -3.058894  3.836501 -5.643856      mixed
7  Gene_7 11.000232 11.353974 10.792245 10.309476         up

except done automatically rather than typed in by hand. Basically, I'd like R to read the numbers across each row and report whether they are all positive, all negative, or a mix of both, and to store that result in a new column alongside my gene IDs.

Thanks for the help!

devmacrile · Accepted Answer

If you subset your data.frame to just the numeric data (i.e. columns 2 to 5 in this case), this should work for you:

d$new_column <- apply(d[,2:5], 1, function(x) {
    if(sign(max(x)) == sign(min(x))) {  # Then all same sign
        if(sign(max(x)) == 1) "up"  # Then all positive
        else "down" # All negative
    }
    else "mixed"  # Signs of max/min not equal, so mixed
})
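
For larger tables, a vectorized variant of the same idea may be worth considering. The following is only a sketch, not the answer's method: it rebuilds the example data inline (values copied from the question) so that it runs on its own, picks the numeric columns by type instead of by position, and uses rowSums() in place of apply(). The intermediate names expr, all_pos and all_neg are made up for illustration. It assumes there are no missing values, and it treats a row containing an exact zero as "mixed", which is a choice the question does not specify.

```r
# Rebuild the example data from the question so the sketch is self-contained.
d <- data.frame(
  gene_id = paste0("Gene_", 1:7),
  day_1 = c(-3.836501, 13.161867, -6.643856, -2.736966, -2.836501,  2.836501, 11.000232),
  day_2 = c(-4.643856,  6.740118,  5.766860, -3.058894, -3.473931, -3.058894, 11.353974),
  day_3 = c(-5.058894, 13.507918, -6.127014, -2.643856,  3.643856,  3.836501, 10.792245),
  day_4 = c(-5.058894, 13.349972, -6.726967, -2.943416, -4.321928, -5.643856, 10.309476),
  stringsAsFactors = FALSE
)

# Select the expression columns by type rather than by position (assumption:
# every numeric column in the data frame is an expression measurement).
expr <- as.matrix(d[sapply(d, is.numeric)])

# A row is "up" if every value is > 0 and "down" if every value is < 0.
all_pos <- rowSums(expr > 0) == ncol(expr)
all_neg <- rowSums(expr < 0) == ncol(expr)

# Anything else (including rows that contain an exact zero) is "mixed".
d$new_column <- ifelse(all_pos, "up", ifelse(all_neg, "down", "mixed"))
```

On the seven example genes this should give the same labels as the hand-written column in the question. If the data can contain NA values, the comparisons would need something like na.rm = TRUE in rowSums() plus a decision about how such rows should be labeled.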
