Chemist learns to code

Reputation: 487

How to sum values using R based on the same row ID?

I am new to the R programming language. I wonder if anyone can help me with the following.

  1. Sum the values in the columns "meta.TMT126", "meta.TMT127", etc. across all rows of data.frame 1 that share the same gene name.
  2. Put the summed values into data.frame 2.
  3. Make a plot (X, Y) with X values c(0.25, 0.5, 1, 2, 5, 10) and Y values taken from Summed.TMT126 to Summed.TMT131 in data.frame 2.
  4. Fit a curve with the equation Y = Bmax*X/(Kd + X).
  5. List the calculated Kd value for each gene in the "Kd" column of data.frame 2.

Any help would be highly appreciated. Thanks!

data.frame 1

No. Gene.Names meta.TMT126 meta.TMT127 meta.TMT128 meta.TMT129 meta.TMT131
11     CAMKK1    4072.936    9365.860    6849.890    8984.916    33329.95
12     CAMKK2       0.000    7274.314   11176.810   13428.840    23818.98
13     CAMKK2       0.000    2454.801   10336.700   14725.970    25393.29
14     CAMKK2       0.000    4632.481    7781.803   14236.950    17768.02
15     CAMKK2       0.000       0.000    9480.014   10525.650    16477.76
16       CDK1    7261.509   26724.260   31849.710   40430.470    49057.77
17       CDK1   11742.330   37562.090   62257.240   78345.980    78888.45
18       CDK1  110574.600  446760.000  451618.600  682500.800   567461.70
19       CDK1   36139.930   90902.490  178966.500  179064.500   167970.90
20       CDK1   10228.790   30630.880   45064.910   57638.250    60941.82
21       CDK1    3073.708    7608.870   11477.470   13113.130    16976.54
22       CDK1    5731.526   17815.080   23776.330   27493.160    20506.58
23       CDK1   14520.820   47537.810   75062.160   73013.450    92172.52
24       CDK1    9606.591   33498.880   43764.630   52139.970    49417.85
25       CDK1    5312.566   16361.420   26155.710   28099.830    32235.76
26       CDK1    2724.090    6696.917   10923.450   10441.160    13494.35
27       CDK1    3178.791    9800.487   16621.160   17990.620    20878.94
28       CDK1    1676.843    2900.603    5489.261    7645.588    35765.65

data.frame 2

    Gene.name Summed.TMT126 Summed.TMT127 Summed.TMT128 Summed.TMT129 Summed.TMT130 Summed.TMT131 Kd
8     CAMKK1            NA            NA            NA            NA            NA            NA NA
9     CAMKK2            NA            NA            NA            NA            NA            NA NA
10      CDK1            NA            NA            NA            NA            NA            NA NA

Upvotes: 1

Views: 3504

Answers (1)

akrun

Reputation: 887831

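We can use aggregate from base R to sum every other column within each Gene.Names group:

df2 <- aggregate(. ~ Gene.Names, df1, sum, na.rm = TRUE)

Then plot the summed values with matplot, which draws one line per gene:

matplot(t(df2[-1]), type = 'l')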
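
A hedged worked example of the aggregate step, using three rows and two channels copied from the question. It assumes "No." is an ordinary column of df1 (the printout does not make this clear). If it is, it has to be dropped first, otherwise the "." in the formula sums it along with the intensities. The renaming to the "Summed." prefix is only there to match data.frame 2.

```r
# Three rows and two channels copied from data.frame 1 in the question
df1 <- data.frame(
  No.         = 11:13,  # assumption: "No." is a real column, not row names
  Gene.Names  = c("CAMKK1", "CAMKK2", "CAMKK2"),
  meta.TMT127 = c(9365.860, 7274.314, 2454.801),
  meta.TMT128 = c(6849.890, 11176.810, 10336.700)
)

# Drop the row-number column so that only the intensities are summed
df2 <- aggregate(. ~ Gene.Names, df1[, names(df1) != "No."], sum)

# Rename meta.TMTxxx to Summed.TMTxxx to match data.frame 2
names(df2) <- sub("^meta\\.", "Summed.", names(df2))
print(df2)
```

The two CAMKK2 rows collapse into one row holding their column-wise sums, while CAMKK1 is carried over unchanged.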

If we want a separate panel for each gene, reshape to 'long' format with tidyverse and use facet_wrap:

library(dplyr)
library(tidyr)
library(ggplot2)
df2 %>%
    pivot_longer(cols = -Gene.Names) %>%
    group_by(Gene.Names) %>%
    mutate(rn = row_number()) %>%
    ggplot(aes(x = rn, y = value, color = Gene.Names)) +
    geom_line() +
    facet_wrap(~ Gene.Names)
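
The code above does not cover steps 4 and 5 of the question (fitting Y = Bmax*X/(Kd + X) and listing Kd per gene). A minimal sketch with nls from base R follows. The gene names "geneA" and "geneB", their intensities, and the helper fit_kd are hypothetical: the values are generated from the model itself with a small fixed perturbation, so they only illustrate the mechanics. The concentration vector reads the "0,5" in the question as 0.5, which is also an assumption.

```r
# Concentrations from the question, reading "0,5" as 0.5 (assumption)
conc <- c(0.25, 0.5, 1, 2, 5, 10)

# Synthetic stand-in for the summed table: two hypothetical genes generated
# from the model with known parameters and a small fixed perturbation
wobble <- c(1.02, 0.98, 1.01, 0.99, 1.02, 0.98)
summed <- rbind(
  geneA = 100 * conc / (1 + conc) * wobble,  # true Bmax = 100, Kd = 1
  geneB =  50 * conc / (2 + conc) * wobble   # true Bmax = 50,  Kd = 2
)

# Plot the summed values against the real concentrations (step 3)
matplot(conc, t(summed), type = "b", pch = 1,
        xlab = "Concentration", ylab = "Summed intensity")

# Fit Y = Bmax * X / (Kd + X) to one row and return the estimated Kd
fit_kd <- function(y, conc) {
  fit <- nls(y ~ Bmax * conc / (Kd + conc),
             start = list(Bmax = max(y), Kd = median(conc)))
  unname(coef(fit)["Kd"])
}

# One Kd per gene, collected in a table shaped like data.frame 2
kd <- apply(summed, 1, fit_kd, conc = conc)
result <- data.frame(Gene.Names = rownames(summed), Kd = kd, row.names = NULL)
print(result)
```

With real data, rows that contain zeros or show no clear saturation may make nls fail to converge. Wrapping the nls call in tryCatch and returning NA for those genes is one way to keep the loop running.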

Upvotes: 1
