Reputation: 77
I'm completely new to R and not sure of the best way to deal with this file, so I'm hoping someone can point me in the right direction. I've searched for other solutions and tried using grepl, but I can't figure out how to read in only some of the data. The file I'm trying to read looks something like the text below:
##BLOCKS= 8
Plate: Plate01 1.3 PlateFormat Endpoint Absorbance Raw FALSE 1 1 630 1 12 96 1 8 None
Temperature(°C) 1 2 3 4 5 6 7 8 9 10 11 12
0.00 0.042 0.067 0.292 0.206 0.071 0.067 0.04 0.063 0.059 0.04 0.066 0.04
0.043 0.172 0.179 0.199 0.073 0.067 0.04 0.062 0.058 0.039 0.066 0.039
0.04 0.066 0.29 0.185 0.072 0.067 0.04 0.062 0.058 0.039 0.065 0.039
0.039 0.068 0.291 0.189 0.075 0.069 0.04 0.064 0.058 0.041 0.064 0.039
0.042 0.063 0.271 0.191 0.07 0.068 0.04 0.065 0.058 0.041 0.066 0.04
0.041 0.067 0.342 0.199 0.069 0.066 0.041 0.065 0.057 0.04 0.065 0.042
0.044 0.064 0.295 0.198 0.069 0.067 0.039 0.064 0.057 0.04 0.067 0.041
0.041 0.067 0.29 0.211 0.066 0.067 0.043 0.056 0.058 0.042 0.067 0.042
~End
Plate: Plate#1 1.3 PlateFormat Endpoint Absorbance Raw FALSE 1 1 630 1 12 96 1 8 None
Temperature(°C) 1 2 3 4 5 6 7 8 9 10 11 12
0.00 0.042 0.072 0.257 0.165 0.074 0.07 0.04 0.067 0.055 0.04 0.07 0.04
0.042 0.164 0.136 0.195 0.075 0.07 0.041 0.066 0.055 0.04 0.069 0.04
0.041 0.07 0.344 0.198 0.074 0.069 0.041 0.065 0.055 0.04 0.068 0.04
0.04 0.069 0.307 0.199 0.075 0.072 0.041 0.067 0.055 0.043 0.068 0.041
0.043 0.068 0.296 0.214 0.072 0.071 0.042 0.067 0.055 0.041 0.068 0.041
0.041 0.071 0.452 0.241 0.072 0.069 0.042 0.067 0.054 0.041 0.068 0.043
0.044 0.068 0.299 0.182 0.071 0.071 0.042 0.067 0.054 0.041 0.069 0.041
0.042 0.071 0.333 0.13 0.068 0.07 0.042 0.058 0.054 0.042 0.07 0.041
~End
I only want the columns numbered 1-12 (next to Temperature) and the data under them. I'm new to R but do have some programming experience, so I don't need someone to tell me exactly how to do this. If anyone could point me to the functions I should be looking at, I'd really appreciate the help!
Upvotes: 1
Views: 34
Reputation: 263451
Step 1: Get the data into an R session with readLines
Lines <- readLines(textConnection("##BLOCKS= 8
Plate: Plate01 1.3 PlateFormat Endpoint Absorbance Raw FALSE 1 1 630 1 12 96 1 8 None
Temperature(°C) 1 2 3 4 5 6 7 8 9 10 11 12
0.00 0.042 0.067 0.292 0.206 0.071 0.067 0.04 0.063 0.059 0.04 0.066 0.04
0.043 0.172 0.179 0.199 0.073 0.067 0.04 0.062 0.058 0.039 0.066 0.039
0.04 0.066 0.29 0.185 0.072 0.067 0.04 0.062 0.058 0.039 0.065 0.039
0.039 0.068 0.291 0.189 0.075 0.069 0.04 0.064 0.058 0.041 0.064 0.039
0.042 0.063 0.271 0.191 0.07 0.068 0.04 0.065 0.058 0.041 0.066 0.04
0.041 0.067 0.342 0.199 0.069 0.066 0.041 0.065 0.057 0.04 0.065 0.042
0.044 0.064 0.295 0.198 0.069 0.067 0.039 0.064 0.057 0.04 0.067 0.041
0.041 0.067 0.29 0.211 0.066 0.067 0.043 0.056 0.058 0.042 0.067 0.042
~End
Plate: Plate#1 1.3 PlateFormat Endpoint Absorbance Raw FALSE 1 1 630 1 12 96 1 8 None
Temperature(°C) 1 2 3 4 5 6 7 8 9 10 11 12
0.00 0.042 0.072 0.257 0.165 0.074 0.07 0.04 0.067 0.055 0.04 0.07 0.04
0.042 0.164 0.136 0.195 0.075 0.07 0.041 0.066 0.055 0.04 0.069 0.04
0.041 0.07 0.344 0.198 0.074 0.069 0.041 0.065 0.055 0.04 0.068 0.04
0.04 0.069 0.307 0.199 0.075 0.072 0.041 0.067 0.055 0.043 0.068 0.041
0.043 0.068 0.296 0.214 0.072 0.071 0.042 0.067 0.055 0.041 0.068 0.041
0.041 0.071 0.452 0.241 0.072 0.069 0.042 0.067 0.054 0.041 0.068 0.043
0.044 0.068 0.299 0.182 0.071 0.071 0.042 0.067 0.054 0.041 0.069 0.041
0.042 0.071 0.333 0.13 0.068 0.07 0.042 0.058 0.054 0.042 0.07 0.041
~End"))
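With a real file, the same readLines call takes a path instead of a textConnection. A minimal sketch, assuming a hypothetical file name `plate_export.txt` (written to a temporary directory here, with made-up shortened content, so the snippet runs on its own):

```r
# Hypothetical path; a temp file stands in for the real plate-reader export.
path <- file.path(tempdir(), "plate_export.txt")
writeLines(c("##BLOCKS= 1",
             "Plate: Plate01 1.3 PlateFormat",
             "Temperature(\u00b0C) 1 2 3",
             "0.00 0.042 0.067 0.292",
             "~End"), path)

# readLines returns a character vector with one element per line of the file.
Lines <- readLines(path)
length(Lines)
```

Everything after this point works the same whether Lines came from a file or from a textConnection.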
Steps 2 & 3: Build a condition that keeps the good data lines, then build a grouping variable
?strsplit
# Couldn't remember the name of `substr`; figured the ?strsplit page would link to it
start <- substr(Lines, 1,1) # 1st char was sufficient to build a rule
table(start)
#--- result ----
# start
#        #  ~  0  P  T    <- the first two column labels print as blanks
#  2 14  1  2  2  2  2    <- the 14 is the count of " " (just spaces)
#end table
goodL <- Lines[start %in% c(" ","T","0") ]
goodL # Look at result
group <- cumsum(substr(goodL , 1,4)=="Temp") #build grouping
group # check the grouping variable
[1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2
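Since the question mentions grepl, the same grouping idea can also be written with it. A toy sketch (the vector `x` is made-up illustration data, not the plate file) showing how cumsum over a logical "is this a header line?" test produces a running block id:

```r
# Toy input: two blocks, each a header line followed by data lines.
x <- c("Temperature 1 2", "0.1 0.2", "0.3 0.4",
       "Temperature 1 2", "0.5 0.6")

# TRUE where a new block starts; cumsum turns that into a running block id.
is_header <- grepl("^Temp", x)
group <- cumsum(is_header)
group

# split() then returns one character vector per block.
blocks <- split(x, group)
```

Each TRUE bumps the counter by one, so every line after a header inherits that header's id until the next header appears.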
Step 4: Process the groups with lapply(split(goodL, group), function(x) ...)
dfrms <- lapply(split(goodL, group),
function(x) read.table(text = substr(x, 16, 100), # stuff from the 16th char onward
header = TRUE))
str(dfrms) # check result: not correct, the 12th column is missing
List of 2
$ 1:'data.frame': 8 obs. of 11 variables:
..$ X1 : num [1:8] 0.042 0.043 0.04 0.039 0.042 0.041 0.044 0.041
..$ X2 : num [1:8] 0.067 0.172 0.066 0.068 0.063 0.067 0.064 0.067
# -----snipped output
dfrms <- lapply(split(goodL, group), # will be a list of dataframes
function(x) read.table(text = substr(x, 16, 120), header = TRUE)) # wider cutoff (120) keeps the 12th column
str(dfrms) # Looks good
List of 2
$ 1:'data.frame': 8 obs. of 12 variables:
..$ X1 : num [1:8] 0.042 0.043 0.04 0.039 0.042 0.041 0.044 0.041
..$ X2 : num [1:8] 0.067 0.172 0.066 0.068 0.063 0.067 0.064 0.067
..$ X3 : num [1:8] 0.292 0.179 0.29 0.291 0.271 0.342 0.295 0.29
#--- snipped output
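The first attempt lost the 12th column because the cutoff at character 100 was too short. If hard-coded character positions feel fragile, one alternative is to split each data line on whitespace and keep only the last n fields. This is a different technique from the substr approach above, not part of the original strategy. The helper name `parse_block` and the small sample block are hypothetical, and the sketch assumes every row has all of its well values present:

```r
# Made-up block in the same shape as the plate data (3 wells per row here,
# not 12): only the first data row carries a leading temperature value.
block <- c("Temperature(\u00b0C) 1 2 3",
           "0.00 0.042 0.067 0.292",
           "     0.043 0.172 0.179")

n_wells <- 3  # would be 12 for the file in the question

parse_block <- function(x, n) {
  # Drop the header line, trim padding, split each row into fields.
  rows <- strsplit(trimws(x[-1]), "[[:space:]]+")
  # Keep the last n fields of each row, which discards the temperature if present.
  vals <- lapply(rows, function(f) as.numeric(tail(f, n)))
  out <- as.data.frame(do.call(rbind, vals))
  names(out) <- paste0("X", seq_len(n))
  out
}

df <- parse_block(block, n_wells)
```

With the real data this could be applied per group, for example `lapply(split(goodL, group), parse_block, n = 12)`.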
I'd like to give credit to @G.Grothendieck for this strategy. Doing a search on "user:516548 readLines" will pull up many other elegant examples of a similar approach.
Upvotes: 3