Reputation: 274
I need to split one large .csv file, with about 9 columns and more than 9,000 rows, into a separate .txt file for each row, and name each newly generated file by the name in its first column.
e.g. for the .csv file:
01001_r1 32.4327 -86.6190 0.65 0.20 0.15 1.33 5.47 8
01001_r2 32.4327 -86.6190 0.65 0.20 0.15 1.33 5.46 8
01001_r3 32.4327 -86.6190 0.80 0.15 0.05 1.33 5.23 10
01003_r1 30.4887 -87.6918 0.65 0.20 0.15 1.33 5.23 9
01003_r2 30.4887 -87.6918 0.80 0.15 0.05 1.33 5.25 9
01003_r3 30.4887 -87.6918 0.65 0.20 0.15 1.33 4.96 8
I would end up with 6 files, with one row each.
Columns in the output file need to be 'tab separated', and the file must not contain neither row or column names.
e.g the output files should look like this:
01001_r1 32.4327 -86.6190 0.65 0.20 0.15 1.33 5.47 8
This is where I've gotten this far:
#set 'working directory'
setwd('C:/Users/Data/soils_data/sitesoil_in')
#identify data frame from .csv file
sd <- read.csv('site_soil.csv', sep="\t", header=F, fill=F)
lapply(1:nrow(sd), function(i) write.csv(sd[i,],
file = paste0(sd[i,1], ".txt"),
row.names = F, header = F,
quote = F))
And this is what I get for each output file:
file name: 01001_r1
V1,V2,V3,V4,V5,V6,V7,V8,V9
01001_r1,32.4327,-86.619,0.65,0.2,0.15,1.33,5.47,8
I can't get it to eliminate the columns names or to separate the columns with tabs. I have tried with header = F, or col.names = F to eliminate headings, and sep = "\t" to separate the columns but it does not recognize the commands.
I would appreciate any help. Thanks, E.
Following all suggestions, this is the simpler code that would do the trick:
#set 'working directory'
setwd('C:/Users/Elena/Desktop/DayCent_muvp_MODEL/DayCent_SourceData/soils_data/sitesoil_in')
#identify data frame from .csv file
sd <- read.csv('site_soil.csv', sep="\t", header=F, fill=F)
lapply(1:nrow(sd),
function(i) write.table(sd[i,],
file = paste0(sd[i,1], ".txt",collapse = ""),
row.names = FALSE, col.names = FALSE,
sep = "\t"
))
Thank you all for your help. E.
Upvotes: 0
Views: 5014
Reputation: 3369
This may work for what you are trying to accomplish.
df <-read.csv(text = "01001_r1,32.4327,-86.6190,0.65,0.20,0.15,1.33,5.47,8
01001_r2,32.4327,-86.6190,0.65,0.20,0.15,1.33,5.46,8
01001_r3,32.4327,-86.6190,0.80,0.15,0.05,1.33,5.23,10
01003_r1,30.4887,-87.6918,0.65,0.20,0.15,1.33,5.23,9
01003_r2,30.4887,-87.6918,0.80,0.15,0.05,1.33,5.25,9
01003_r3,30.4887,-87.6918,0.65,0.20,0.15,1.33,4.96,8",
stringsAsFactors = FALSE,
header = FALSE)
apply(df, 1, function(x){write.table(t(x),
file = paste0(x[1],".txt"),
sep = "\t",
quote = FALSE,
col.names = FALSE,
row.names = FALSE)})
Upvotes: 1
Reputation: 359
I adjusted your code:
lapply(1:nrow(sd),
function(i) write.table(sd[i,],
file = paste0(sd[i,1],".txt",collapse = ""),
row.names = FALSE,
sep = "\t"
))
Upvotes: 1
Reputation: 3555
Try this
dat <-"01001_r1,32.4327,-86.6190,0.65,0.20,0.15,1.33,5.47,8
01001_r2,32.4327,-86.6190,0.65,0.20,0.15,1.33,5.46,8
01001_r3,32.4327,-86.6190,0.80,0.15,0.05,1.33,5.23,10
01003_r1,30.4887,-87.6918,0.65,0.20,0.15,1.33,5.23,9
01003_r2,30.4887,-87.6918,0.80,0.15,0.05,1.33,5.25,9
01003_r3,30.4887,-87.6918,0.65,0.20,0.15,1.33,4.96,8
"
df <- read.delim(file = textConnection(dat), sep = ',', header = FALSE)
df
# V1 V2 V3 V4 V5 V6 V7 V8 V9
# 1 01001_r1 32.4327 -86.6190 0.65 0.20 0.15 1.33 5.47 8
# 2 01001_r2 32.4327 -86.6190 0.65 0.20 0.15 1.33 5.46 8
# 3 01001_r3 32.4327 -86.6190 0.80 0.15 0.05 1.33 5.23 10
# 4 01003_r1 30.4887 -87.6918 0.65 0.20 0.15 1.33 5.23 9
# 5 01003_r2 30.4887 -87.6918 0.80 0.15 0.05 1.33 5.25 9
# 6 01003_r3 30.4887 -87.6918 0.65 0.20 0.15 1.33 4.96 8
output_file_base <- "soil_"
output_file_ext <- ".tsv"
for(i in seq(nrow(df))){
output_file <- paste0(output_file_base, as.character(i), output_file_ext)
dfi <- df[i, ]
write.table(x = dfi, file = output_file, sep = '\t', quote = FALSE, col.names = FALSE, row.names = FALSE)
}
Output:
$ cat soil_6.tsv
01003_r3 30.4887 -87.6918 0.65 0.2 0.15 1.33 4.96 8
Upvotes: 1