Reputation: 329
I have an R script with the following line:
KSData.dataset.abbrev = aggregate(log2FC ~ Kinase.Gene+Substrate.Gene+Substrate.Mod+Source, data=KSData.dataset.abbrev, FUN=mean)
KSData.dataset.abbrev looks like this:
Kinase.Gene Substrate.Gene Substrate.Mod Peptide p FC log2FC Source
364 ABL1 RBM39 Y95 YRSPYSGPK 0.019590948 1.6158045 0.692252615 PhosphoSitePlus
8 AKT1 AKT1S1 T246 LNTSDFQK 0.800879536 0.8909224 -0.166628324 PhosphoSitePlus
121 AKT1 EPHA2 S897 LPSTSGSEGVPFR 0.500658346 0.7052020 -0.503891606 PhosphoSitePlus
After running the line above, the data frame looks similar to this:
Kinase.Gene Substrate.Gene Substrate.Mod Source log2FC
430 ABL1 RBM39 Y95 PhosphoSitePlus 0.6922526152
19 AKT1 PEA15 S116 PhosphoSitePlus 1.1782441053
80 AKT1 MDM2 S166 PhosphoSitePlus -0.7967537534
I have no clue what exactly this line does. Thanks for any help.
Upvotes: 0
Views: 43
Reputation: 39657
It calculates the mean of log2FC for each unique combination of Kinase.Gene, Substrate.Gene, Substrate.Mod and Source.
Using a small data sample, you can see what aggregate is doing:
(tt <- data.frame(a = 1:2, b=1:3, x=1:12))
# a b x
#1 1 1 1
#2 2 2 2
#3 1 3 3
#4 2 1 4
#5 1 2 5
#6 2 3 6
#7 1 1 7
#8 2 2 8
#9 1 3 9
#10 2 1 10
#11 1 2 11
#12 2 3 12
aggregate(x ~ a, data=tt, FUN=mean) #Average for the groups in col a
# a x
#1 1 6
#2 2 7
aggregate(x ~ a + b, data=tt, FUN=mean) #Average for the groups in col a and b
# a b x
#1 1 1 4
#2 2 1 7
#3 1 2 8
#4 2 2 5
#5 1 3 6
#6 2 3 9
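The same idea applied to a layout closer to the question's data can be sketched as follows. The rows, peptide names and values are made up for illustration and are not taken from the original data set; the data frame name ks is also hypothetical.

```r
# Made-up rows that mimic the question's column layout (illustrative values only)
ks <- data.frame(
  Kinase.Gene    = c("AKT1", "AKT1", "ABL1"),
  Substrate.Gene = c("MDM2", "MDM2", "RBM39"),
  Substrate.Mod  = c("S166", "S166", "Y95"),
  Peptide        = c("PEP1", "PEP2", "PEP3"),
  log2FC         = c(-1.0, -0.5, 0.7),
  Source         = "PhosphoSitePlus"
)

# One output row per unique combination of the four grouping columns.
# log2FC becomes the group mean; Peptide is dropped because it does
# not appear in the formula.
res <- aggregate(log2FC ~ Kinase.Gene + Substrate.Gene + Substrate.Mod + Source,
                 data = ks, FUN = mean)
res
```

Here the two AKT1/MDM2/S166 rows collapse into a single row whose log2FC is their average, while the ABL1 row stays as it is. This is why the result has fewer columns than the input and why its row names differ. Note also that the formula interface of aggregate drops rows with NA in the variables used by default (na.action = na.omit).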
Upvotes: 1