sikisis

Reputation: 478

How to process multi-column data in a data.frame with plyr

I am trying to process DSC (differential scanning calorimetry) data with R, but I have run into some trouble. In my lab this has always been done tediously in Origin or QtiPlot, and I wonder whether it can be done in batch instead. My attempt did not go well. For example, maybe I have used the wrong column names for my data.frame, because the code

dat$0.5min
Error: unexpected numeric constant in "dat$0.5"

cannot reach my data.

Below is the full description of what I want to do. Thank you in advance!

The DSC data looks like this (I store the CSV file on my Google Drive: Link):

T1      0.5min      T2      1min    
40.59   -0.2904 40.59   -0.2545
40.81   -0.281  40.81   -0.2455
41.04   -0.2747 41.04   -0.2389
41.29   -0.2728 41.29   -0.2361
41.54   -0.2553 41.54   -0.2239
41.8    -0.07   41.8    -0.0732
42.06   0.1687  42.06   0.1414
42.32   0.3194  42.32   0.2817
42.58   0.3814  42.58   0.3421
42.84   0.3863  42.84   0.3493
43.1    0.3665  43.11   0.3322
43.37   0.3438  43.37   0.3109
43.64   0.3265  43.64   0.2937
43.9    0.3151  43.9    0.2819
44.17   0.3072  44.17   0.2735
44.43   0.2995  44.43   0.2656
44.7    0.2899  44.7    0.2563
44.96   0.2779  44.96   0.245

I have already merged the data into a single data.frame and hope to adjust it and process it further. The commands are:

dat <- read.csv("Book1.csv", header = FALSE)
colnames(dat) <- c('T1', '0.5min', 'T2', '1min', 'T3', '2min', 'T4', '4min', 'T5', '8min', 'T6', '10min',
                   'T7', '20min', 'T8', 'ascast1', 'T9', 'ascast2', 'T10', 'ascast3', 'T11', 'ascast4',
                   'T12', 'ascast5')

So dat is a data.frame with 1163 obs. of 24 variables.

T1, T2, T3, ..., T12 are the temperatures at which the samples were measured by DSC. They cover the same interval but differ slightly from scan to scan because of the instability of the machine.

The column that follows each of T1~T12 is the heat flow recorded by the machine for a different heat treatment duration. ascast1~ascast5 are samples that received no treatment and are used to check the accuracy of the machine.
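The layout described above (temperatures in the odd columns, the matching heat flow in the even columns) can be written down as a small lookup table. This is only a sketch on a made-up three-pair data frame; `toy` and `pairs` are illustrative names and the numbers are not from the real file.

```r
# Toy stand-in for dat: three (temperature, heat flow) column pairs.
toy <- data.frame(
  T1 = c(40, 41, 42), `0.5min` = c(-0.29, -0.28, -0.27),
  T2 = c(40, 41, 42), `1min`   = c(-0.25, -0.24, -0.23),
  T3 = c(40, 41, 42), ascast1  = c(-0.20, -0.19, -0.18),
  check.names = FALSE  # keep names such as "0.5min" unchanged
)

# Odd positions hold temperatures, even positions the matching heat flow.
temp_cols <- names(toy)[seq(1, ncol(toy), by = 2)]
flow_cols <- names(toy)[seq(2, ncol(toy), by = 2)]

# One row per pair, so later steps can loop over the rows of this table.
pairs <- data.frame(temp = temp_cols, flow = flow_cols, stringsAsFactors = FALSE)
print(pairs)
```

Looking columns up by name through such a table avoids typing names like `0.5min` directly in the code.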

Now I need to do something like the following:

  1. T1~T12 are in degrees Celsius and I need to convert them to kelvin, which means adding 273.15 to every value.

  2. Two temperatures are chosen for comparing the results: Ts=180.25 and Te=240.45 (both in degrees Celsius; I have checked them in QtiPlot). To be clear, I list the rows at these two temperatures for the first 8 columns of the data.

    T1      0.5min       T2      1min         T3      2min         T4      4min
    180.25  -0.01710000  180.25  -0.01780000  180.25  -0.02120000  180.25  -0.02020000
    ...
    240.45   0.05700000  240.45   0.04500000  240.45   0.05780000  240.45   0.05580000

All heat flow values at Ts should be the same, and can be set to 0 for convenience. So for every heat flow column (0.5min, 1min, 2min, 4min, 8min, 10min, 20min and ascast1~ascast5), the heat flow value at Ts should be subtracted from all values in that column.
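The baseline shift at Ts could be sketched as follows. This is a minimal example on made-up numbers, assuming the columns alternate temperature / heat flow and that the row closest to Ts is an acceptable stand-in when no row matches Ts exactly; `toy`, `temp_idx` and `flow_idx` are illustrative names.

```r
# Toy stand-in for dat: two (temperature, heat flow) pairs, temperatures in Celsius.
toy <- data.frame(
  T1 = c(180.00, 180.25, 240.45), `0.5min` = c(-0.0200, -0.0171, 0.0570),
  T2 = c(180.00, 180.25, 240.45), `1min`   = c(-0.0210, -0.0178, 0.0450),
  check.names = FALSE
)
Ts <- 180.25  # reference temperature, in the same unit as the T columns

temp_idx <- seq(1, ncol(toy), by = 2)  # positions of the temperature columns
flow_idx <- temp_idx + 1               # heat-flow column next to each one

# Subtract from each curve its own heat flow at the row closest to Ts.
for (k in seq_along(temp_idx)) {
  i_s <- which.min(abs(toy[[temp_idx[k]]] - Ts))
  toy[[flow_idx[k]]] <- toy[[flow_idx[k]]] - toy[[flow_idx[k]]][i_s]
}
print(toy)
```

Note that Ts is given in Celsius, so in this sketch the lookup has to happen before the temperature columns are converted to kelvin (or Ts has to be converted too).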

  3. The heat flow at Te should then be adjusted so that all curves have the same value at Te. The procedure is: (1) calculate the mean of the 12 heat flow values at Te; call it Hmean, the value that every curve should take at Te. (2) Apply a linear transform to each column. For the 0.5min column, written col("0.5min"), the formula is:

      col("0.5min")-[([0.05700000-(-0.01710000)]-Hmean)/(Te-Ts)]*(col(T1)-Ts)
    

Here [0.05700000-(-0.01710000)] is the heat flow at Te after step 2 has been applied; I write it out for your reference. The same formula is used for every pair of temperature and heat flow columns, such as (T1, 0.5min), (T2, 1min), (T3, 2min) and so on, 12 pairs in all.
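The linear correction above might look like this in R. It is a sketch on made-up, already baseline-corrected numbers, and it assumes that Hmean is the mean of the baseline-corrected values at Te, which is one reading of the description; `h_te`, `h_mean` and `slope` are illustrative names.

```r
# Toy data: baseline-corrected heat flow (zero at Ts) for two pairs.
toy <- data.frame(
  T1 = c(180.25, 210.00, 240.45), `0.5min` = c(0, 0.030, 0.0741),
  T2 = c(180.25, 210.00, 240.45), `1min`   = c(0, 0.025, 0.0628),
  check.names = FALSE
)
Ts <- 180.25
Te <- 240.45

temp_idx <- seq(1, ncol(toy), by = 2)  # positions of the temperature columns
flow_idx <- temp_idx + 1               # heat-flow column next to each one

# Heat flow of each curve at the row closest to Te.
h_te <- sapply(seq_along(temp_idx), function(k) {
  i_e <- which.min(abs(toy[[temp_idx[k]]] - Te))
  toy[[flow_idx[k]]][i_e]
})
h_mean <- mean(h_te)  # Hmean: the common target value at Te

# Tilt each curve about Ts so that its value at Te becomes h_mean.
for (k in seq_along(temp_idx)) {
  slope <- (h_te[k] - h_mean) / (Te - Ts)
  toy[[flow_idx[k]]] <- toy[[flow_idx[k]]] - slope * (toy[[temp_idx[k]]] - Ts)
}
print(toy)
```

Because the correction term is zero at Ts, the values set to 0 in step 2 stay at 0, and every curve ends at `h_mean` at Te.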

  4. Finally, the 12 pairs of data can be drawn on the same plot, restricted to the interval 180~240 (also in degrees Celsius), to magnify the differences between the DSC scans.

I have been stuck on this problem for 2 days, so I have turned to Stack Overflow for help.
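One possible way to overlay the curves in the 180~240 window with base graphics is sketched below. The data are synthetic straight lines, not DSC measurements, and `in_window` is a hypothetical helper introduced only for this example.

```r
# Toy data: two (temperature, heat flow) pairs in Celsius.
temps <- seq(170, 250, by = 10)
toy <- data.frame(
  T1 = temps, `0.5min` = 0.0010 * (temps - 180),
  T2 = temps, `1min`   = 0.0008 * (temps - 180),
  check.names = FALSE
)

temp_idx <- seq(1, ncol(toy), by = 2)  # positions of the temperature columns
flow_idx <- temp_idx + 1               # heat-flow column next to each one
in_window <- function(t) t >= 180 & t <= 240

# Collect the heat flow inside the window to get a common y-range.
y_all <- unlist(lapply(seq_along(temp_idx), function(k) {
  toy[[flow_idx[k]]][in_window(toy[[temp_idx[k]]])]
}))

pdf(NULL)  # draw off-screen; drop this line to see the plot interactively
plot(NA, xlim = c(180, 240), ylim = range(y_all),
     xlab = "Temperature (deg C)", ylab = "Heat flow")
for (k in seq_along(temp_idx)) {
  keep <- in_window(toy[[temp_idx[k]]])
  lines(toy[[temp_idx[k]]][keep], toy[[flow_idx[k]]][keep], col = k)
}
legend("topleft", legend = names(toy)[flow_idx],
       col = seq_along(flow_idx), lty = 1)
invisible(dev.off())
```

Each pair keeps its own temperature column, so curves whose temperatures differ slightly from scan to scan are still drawn at their own x positions.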

Thanks!

Upvotes: 0

Views: 202

Answers (1)

Avinash

Reputation: 2561

I am assuming that your actual question is the one at the beginning, where you got the following error:

dat$0.5min
Error: unexpected numeric constant in "dat$0.5"

I could not find a question in the rest of the steps; they read like a step-by-step description of an experiment.

The error occurs because the column name starts with a digit, which makes it a non-syntactic name in R. To reference such a column with $, wrap the name in backticks (`):

> dataF <- data.frame("0.5min" = 1:10, "T2" = 11:20, check.names = FALSE)
> dataF$`0.5min`
 [1]  1  2  3  4  5  6  7  8  9 10
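Backticks are not the only option. As a small sketch, `[[` with a quoted string reaches the same column, and that form also works when the name is held in a variable (`col_name` below is just an illustrative name).

```r
dataF <- data.frame("0.5min" = 1:10, "T2" = 11:20, check.names = FALSE)

# A quoted string inside [[ ]] works for any column name.
x1 <- dataF[["0.5min"]]

# The name can also come from a variable, which $ does not allow.
col_name <- "0.5min"
x2 <- dataF[[col_name]]

# Without check.names = FALSE, data.frame() rewrites the name to "X0.5min".
renamed <- names(data.frame("0.5min" = 1:10))
print(renamed)
```

The last line shows why `check.names = FALSE` matters when the data frame is built with `data.frame()`.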

Based on the comments, here is some more information.

You can add a constant to alternate columns in the following manner:

dataF <- data.frame(matrix(1:100,10,10))
const <- 237

> print(dataF)
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   1 11 21 31 41 51 61 71 81  91
2   2 12 22 32 42 52 62 72 82  92
3   3 13 23 33 43 53 63 73 83  93
4   4 14 24 34 44 54 64 74 84  94
5   5 15 25 35 45 55 65 75 85  95
6   6 16 26 36 46 56 66 76 86  96
7   7 17 27 37 47 57 67 77 87  97
8   8 18 28 38 48 58 68 78 88  98
9   9 19 29 39 49 59 69 79 89  99
10 10 20 30 40 50 60 70 80 90 100

dataF[,seq(1,ncol(dataF),by = 2)] <- dataF[,seq(1,ncol(dataF),by = 2)] + const

> print(dataF)
    X1 X2  X3 X4  X5 X6  X7 X8  X9 X10
1  238 11 258 31 278 51 298 71 318  91
2  239 12 259 32 279 52 299 72 319  92
3  240 13 260 33 280 53 300 73 320  93
4  241 14 261 34 281 54 301 74 321  94
5  242 15 262 35 282 55 302 75 322  95
6  243 16 263 36 283 56 303 76 323  96
7  244 17 264 37 284 57 304 77 324  97
8  245 18 265 38 285 58 305 78 325  98
9  246 19 266 39 286 59 306 79 326  99
10 247 20 267 40 287 60 307 80 327 100
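The same idea can be applied by name instead of by position. The sketch below assumes the question's naming scheme (temperature columns called T1, T2, ...) and uses the standard Celsius-to-kelvin offset of 273.15; the data frame `toy` holds only two rows copied from the sample shown in the question.

```r
# Toy stand-in for dat with the question's naming scheme.
toy <- data.frame(
  T1 = c(40.59, 40.81), `0.5min` = c(-0.2904, -0.2810),
  T2 = c(40.59, 40.81), `1min`   = c(-0.2545, -0.2455),
  check.names = FALSE
)

# Columns named "T" followed only by digits are the temperature columns.
temp_cols <- grep("^T[0-9]+$", colnames(toy), value = TRUE)

# Celsius to kelvin: add 273.15 to every temperature column at once.
toy[temp_cols] <- toy[temp_cols] + 273.15
print(toy)
```

Selecting by pattern keeps working if the column order changes, whereas `seq(1, ncol(dataF), by = 2)` relies on strict alternation.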

To generalize, the columns of a data frame can be referenced with a vector of column numbers or column names, and most operations in R are vectorized. So you can select columns by name or by number, depending on the pattern you are looking for.

For example, if I rename my first two columns and want to access just those, I do this:

colnames(dataF)[c(1,2)] <- c("Y1","Y2")

# Select every column whose name contains "Y"; any operation can then be applied to the result.
dataF[,grep("Y",colnames(dataF))]

    Y1 Y2
1  238 11
2  239 12
3  240 13
4  241 14
5  242 15
6  243 16
7  244 17
8  245 18
9  246 19
10 247 20

Upvotes: 1
