Reputation: 569
I have a dataset that takes the form:
# prate [mm/day] from 4x Daily NOAA-CIRES 20th Century Reanalysis V2c
# grid point lon,lat = 22.500 63.808
1851 1.8 0.9 1.7 1.5 1.6 2.7 2.7 2.6 1.3 2.5 1.8 1.7
1852 2.2 1.6 0.9 1.4 1.6 2.5 2.4 2.0 1.8 2.3 1.9 1.5
...
I wish to extract the longitude and latitude from the comments at the top of this text file and add them repeatedly as two additional columns in this dataset. My output should therefore look like so:
# prate [mm/day] from 4x Daily NOAA-CIRES 20th Century Reanalysis V2c
# grid point lon,lat = 22.500 63.808
1851 1.8 0.9 1.7 1.5 1.6 2.7 2.7 2.6 1.3 2.5 1.8 1.7 22.500 63.808
1852 2.2 1.6 0.9 1.4 1.6 2.5 2.4 2.0 1.8 2.3 1.9 1.5 22.500 63.808
...
Does anyone have any ideas on how to achieve this?
Upvotes: 1
Views: 66
Reputation: 83215
Using:
dat <- read.table('dataset.txt', header = FALSE, skip = 2)
txt <- readLines('dataset.txt', n = 2)
llcols <- read.table(text = trimws(gsub('.*=','',txt[2])), header = FALSE)
names(llcols) <- c('lon','lat')
dat <- cbind(dat, llcols)
gives:
> dat
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 lon lat
1 1851 1.8 0.9 1.7 1.5 1.6 2.7 2.7 2.6 1.3 2.5 1.8 1.7 22.5 63.808
2 1852 2.2 1.6 0.9 1.4 1.6 2.5 2.4 2.0 1.8 2.3 1.9 1.5 22.5 63.808
Explanation:
dat <- read.table('dataset.txt', header = FALSE, skip = 2)
you read the data and exclude the two comment lines.txt <- readLines('dataset.txt', n = 2)
you read the two comment lines as text.trimws(gsub('.*=','',txt[2]))
you extract the lon/lat values.read.table
in a new dataframe.cbind
you combine the two dataframes into one. The lon/lat values will be repeated till the end of dat
.Reading a bunch of files can be done as follows:
filenames <- list.files(pattern = '.txt')
dflist <- lapply(filenames, function(x) {
dat <- read.table(x, header = FALSE, skip = 2)
txt <- readLines(x, n = 2)
llcols <- read.table(text = trimws(gsub('.*=','',txt[2])), header = FALSE)
names(llcols) <- c('lon','lat')
cbind(dat,llcols)
})
Upvotes: 2
Reputation: 569
I have now found the solution for applying this to lists as follows:
dat=apply(data.frame(list.files()), 1, read.table, header=F, skip=2)
txt=apply(data.frame(list.files()), 1, readLines, n=2)
llcols=lapply(txt, function(x) read.table(text =trimws(gsub('.*=','',txt[2])), header = FALSE))
names(LLCOLS)=c('lon','lat')
dat=lapply(dat, function(x) cbind(x, llcols))
Upvotes: 1