Reputation: 358
Hello, I'm loading a data file that is formatted as a table with columns separated by multiple spaces. Ordinarily it loads easily via read.table(data_file, sep = "", header = TRUE, fill = TRUE), but some values are not separated by spaces when they are negative:
523.2 -166.1 1.62 0.079 0.0 0.0 0.0 2260 0
528.4 -168.6 -0.71-0.034 0.0 0.0 0.0 2284 0
533.9 -169.7 -1.75-0.085 0.0 0.0 0.0 2308 0
538.4 -169.5 -1.60-0.078 0.0 0.0 0.0 2333 0
543.3 -170.8 -2.83-0.137 0.0 0.0 0.0 2357 0
548.2 -171.8 -3.77-0.183 0.0 0.0 0.0 2381 0
552.8 -172.1 -3.87-0.187 0.0 0.0 0.0 2406 0
554.9 -172.5 -4.23-0.205 0.0 0.0 0.0 2430 0
The whole chunk, e.g. -3.77-0.183, is then read as a single value.
What is a convenient way to cope with this without first converting the file with other scripts?
Thanks in advance!
Upvotes: 0
Views: 77
Reputation: 78792
If the file is well-formatted (from a fixed-field perspective), then:
data <- read.fwf("fixed.dat", widths = c(6, 9, 10, 6, 12, 9, 9, 7, 9))
data
## V1 V2 V3 V4 V5 V6 V7 V8 V9
## 1 523.2 -166.1 1.62 0.079 0 0 0 2260 0
## 2 528.4 -168.6 -0.71 -0.034 0 0 0 2284 0
## 3 533.9 -169.7 -1.75 -0.085 0 0 0 2308 0
## 4 538.4 -169.5 -1.60 -0.078 0 0 0 2333 0
## 5 543.3 -170.8 -2.83 -0.137 0 0 0 2357 0
## 6 548.2 -171.8 -3.77 -0.183 0 0 0 2381 0
## 7 552.8 -172.1 -3.87 -0.187 0 0 0 2406 0
## 8 554.9 -172.5 -4.23 -0.205 0 0 0 2430 0
might work.
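To illustrate why fixed-width reading copes with values that touch, here is a small self-contained sketch. The four-column layout and the widths 6, 7, 6, 6 are assumptions made up for the demo, not the layout of the real file; the point is only that a negative number which fills its whole field leaves no room for a separating space, and read.fwf splits by position rather than by whitespace.

```r
# Hypothetical layout for illustration only: four fields of widths 6, 7, 6, 6.
vals <- data.frame(a = c(523.2, 528.4),
                   b = c(-166.1, -168.6),
                   c = c(1.62, -0.71),
                   d = c(0.079, -0.034))

# Write the values the way a fixed-width writer would; in the second row
# "-0.034" fills its 6-character field, so it touches the previous value.
txt <- sprintf("%6.1f%7.1f%6.2f%6.3f", vals$a, vals$b, vals$c, vals$d)

tmp <- tempfile(fileext = ".dat")
writeLines(txt, tmp)

# read.fwf cuts each line at the given character positions,
# so the missing space does not matter.
demo <- read.fwf(tmp, widths = c(6, 7, 6, 6))
demo
```

For a real file, the widths have to be counted from the raw lines (for example in a text editor with a monospaced font), since they depend on how the file was written.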
Upvotes: 2
Reputation: 887078
One way would be:
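lines <- readLines("datN.txt") # read the raw lines using `readLines`
# After any digit, put a space in front of a following "-" (or whitespace) plus digits,
# so fused values such as -0.71-0.034 become -0.71 -0.034
lines1 <- gsub("(?<=[0-9])((-|\\s)[0-9]+)", " \\1", lines, perl=TRUE)
dat <- read.table(text=lines1, sep="", header=FALSE)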
dat
# V1 V2 V3 V4 V5 V6 V7 V8 V9
#1 523.2 -166.1 1.62 0.079 0 0 0 2260 0
#2 528.4 -168.6 -0.71 -0.034 0 0 0 2284 0
#3 533.9 -169.7 -1.75 -0.085 0 0 0 2308 0
#4 538.4 -169.5 -1.60 -0.078 0 0 0 2333 0
#5 543.3 -170.8 -2.83 -0.137 0 0 0 2357 0
#6 548.2 -171.8 -3.77 -0.183 0 0 0 2381 0
#7 552.8 -172.1 -3.87 -0.187 0 0 0 2406 0
#8 554.9 -172.5 -4.23 -0.205 0 0 0 2430 0
str(dat)
#'data.frame': 8 obs. of 9 variables:
#$ V1: num 523 528 534 538 543 ...
#$ V2: num -166 -169 -170 -170 -171 ...
#$ V3: num 1.62 -0.71 -1.75 -1.6 -2.83 -3.77 -3.87 -4.23
#$ V4: num 0.079 -0.034 -0.085 -0.078 -0.137 -0.183 -0.187 -0.205
#$ V5: num 0 0 0 0 0 0 0 0
#$ V6: num 0 0 0 0 0 0 0 0
#$ V7: num 0 0 0 0 0 0 0 0
#$ V8: int 2260 2284 2308 2333 2357 2381 2406 2430
#$ V9: int 0 0 0 0 0 0 0 0
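A slightly narrower variant of the same idea, as a sketch: only insert a space before a minus sign that directly follows a digit. The sample lines are copied from the question so the snippet runs on its own; with a real file they would come from readLines. It assumes a minus sign never legitimately follows a digit in the data (an exponent such as 1e-5 is left alone, because there the minus follows "e").

```r
# Sample lines copied from the question; replace with readLines(<your file>).
lines <- c(
  "523.2 -166.1 1.62 0.079 0.0 0.0 0.0 2260 0",
  "528.4 -168.6 -0.71-0.034 0.0 0.0 0.0 2284 0",
  "554.9 -172.5 -4.23-0.205 0.0 0.0 0.0 2430 0"
)

# Lookbehind (?<=[0-9]) matches a "-" only when the previous character is a digit,
# so already separated negatives like " -166.1" are not touched.
fixed <- gsub("(?<=[0-9])-", " -", lines, perl = TRUE)

# With the values separated, the default whitespace splitting works again.
dat <- read.table(text = fixed, header = FALSE)
str(dat)
```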
Upvotes: 2