Reputation: 75
I have thousands of files I need to do this for so I am trying to avoid doing this manually for each file. The only identifying characteristics in these files is the file name so I need to create a column based on the file name so they can be identified when I combine the files later. The file name contains a placeholder then the boat name then the net number separated by an underscore. My data looks like this:
file name = 3_Whip_1.1.csv (Boat = Whip, Net = 1.1)
Time Pred
11:00 10.2
12:00 8.4
13:00 9.6
I would am trying to get the data to look like this:
Boat Net Time Pred
Whip 1.1 11:00 10.2
Whip 1.1 12:00 8.4
Whip 1.1 13:00 9.6
Any help would be greatly appreciated.
Upvotes: 2
Views: 3400
Reputation: 39154
The following code will work for one data frame. You probably can create a function to include these operations and loop through (or use apply
family function) a vector or list of your filenames. The list.files
function can show all filenames in one directory, which could be useful for your work.
# Create the example filename
filename <- "3_Whip_1.1.csv"
# Create example data frame
dat1 <- data.frame(Time = c("11:00", "12:00", "13:00"),
Pred = c(10.2, 8.4, 9.6),
stringsAsFactors = FALSE)
# Remove ".csv"
filename2 <- sub(".csv", "", filename)
# Split the string by "_"
filename_vec <- strsplit(filename2, split = "_")[[1]]
# Create columns to store the information
dat1$Boat <- filename_vec[2]
dat1$Net <- filename_vec[3]
# Change column order
dat1 <- dat1[, c("Boat", "Net", "Time", "Pred")]
Upvotes: 1
Reputation: 887118
We can use gsub
to return a substring of the 'filename' and split it into two columns with read.table
and cbind
with the original data
d1 <- read.table(text=gsub("^\\d+_|\\.[^.]+$", "", filename),
sep="_", col.names = c("Boat", "Net"))
cbind(d1, dat1)
# Boat Net Time Pred
#1 Whip 1.1 11:00 10.2
#2 Whip 1.1 12:00 8.4
#3 Whip 1.1 13:00 9.6
dat1 <- structure(list(Time = c("11:00", "12:00", "13:00"), Pred = c(10.2,
8.4, 9.6)), .Names = c("Time", "Pred"), class = "data.frame", row.names = c(NA,
-3L))
filename <- "3_Whip_1.1.csv"
Upvotes: 1