Janet

Reputation: 225

How to import multiple .csv files that contain the same information at once?

I am trying to import multiple CSV files at once. All of my files have exactly the same format (the same variables), so when I use the code found here, I cannot tell the resulting datasets apart.

### the code I used
temp = list.files(pattern="*.csv", full.names=TRUE)
myfiles = lapply(temp, read_csv)

This code works fine, but I cannot tell which list element came from which file. Is there any way to adapt this code, or another approach, so that I can import multiple CSV files and see the name of each file attached to the imported dataset?

# this is an example of my output
 myfiles
[[1]]
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

[[2]]
# A tibble: 10 x 2
      mm    prob
   <dbl>   <dbl>
 1     1 0.002  
 2     2 0.003  
 3     3 0.00580
 4     4 0.007  
 5     5 0.006  
 6     6 0.01   
 7     7 0.03   
 8     8 0.011  
 9     9 0.02   
10    10 0.04   

[[3]]
# A tibble: 11 x 2
      mm   prob
   <dbl>  <dbl>
 1     0 0.0001
 2     4 0.0004
 3     5 0.0005
 4     8 0.007 
 5    10 0.0075
 6    15 0.03  
 7    20 0.042 
 8    23 0.05  
 9    25 0.052 
10    27 0.064 
11    30 0.071 

[[4]]
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

# my CSV files have different names: g1_a.csv, g2_b.csv, g3_c.csv, ...

The desired output would look something like this:

 myfiles
[[1]]
# name of the file attached to the dataset
#g1_a
# A tibble: 10 x 2
      mm     prob
   <dbl>    <dbl>
 1     0 0.0002  
 2     2 0.000300
 3     3 0.00580 
 4     4 0.007   
 5     5 0.006   
 6     8 0.02    
 7    10 0.032   
 8    12 0.015   
 9    13 0.045   
10    15 0.051   

[[2]]
#g2_b
# A tibble: 10 x 2
      mm    prob
   <dbl>   <dbl>
 1     1 0.002  
 2     2 0.003  
 3     3 0.00580
 4     4 0.007  
 5     5 0.006  
 6     6 0.01   
 7     7 0.03   
 8     8 0.011  
 9     9 0.02   
10    10 0.04   

[[3]]
#g3_c
# A tibble: 11 x 2
      mm   prob
   <dbl>  <dbl>
 1     0 0.0001
 2     4 0.0004
 3     5 0.0005
 4     8 0.007 
 5    10 0.0075
 6    15 0.03  
 7    20 0.042 
 8    23 0.05  
 9    25 0.052 
10    27 0.064 
11    30 0.071 

Thank you in advance for your help.

Upvotes: 1

Views: 205

Answers (4)

GuedesBF

Reputation: 9858

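Maybe you should try:

library(readr)
library(stringr)

# "\\.csv$" is a regular expression that matches only names ending in ".csv"
filenames = list.files(pattern="\\.csv$", full.names=TRUE)
myfiles = lapply(filenames, read_csv)

# name each list element after its file
myfiles = setNames(myfiles, basename(filenames))

# drop the ".csv" extension from the names
names(myfiles) <- str_remove(names(myfiles), "\\.csv$")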
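
The same idea can be sketched with base R only, so it does not depend on readr or stringr. The scratch directory and the two sample files below are made up for illustration; `tools::file_path_sans_ext()` removes the extension without needing a regular expression.

```r
# Hypothetical sample data written to a scratch directory, for illustration only
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(mm = c(0, 2), prob = c(0.0002, 0.0003)),
          file.path(demo_dir, "g1_a.csv"), row.names = FALSE)
write.csv(data.frame(mm = c(1, 2), prob = c(0.002, 0.003)),
          file.path(demo_dir, "g2_b.csv"), row.names = FALSE)

# "\\.csv$" is a regular expression matching names that end in ".csv"
filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
myfiles <- lapply(filenames, read.csv)

# Name each element after its file, without directory or extension
names(myfiles) <- tools::file_path_sans_ext(basename(filenames))

names(myfiles)  # "g1_a" "g2_b"
```

With names set, printing `myfiles` shows `$g1_a`, `$g2_b`, ... instead of `[[1]]`, `[[2]]`, ..., which is close to the output the question asks for.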

Upvotes: 2

Ronak Shah

Reputation: 388797

You can use sapply with simplify = FALSE. Because temp is a character vector, sapply uses its elements (here, the file paths) as the names of the resulting list.

temp = list.files(pattern="\\.csv$", full.names=TRUE)
result <- sapply(temp, read.csv, simplify = FALSE)
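
One thing to be aware of: with `full.names = TRUE` the names are the full paths, not just the file names. A small sketch of this, using a made-up scratch directory and sample files, with `basename()` applied afterwards to shorten the names:

```r
# Hypothetical sample files in a scratch directory, for illustration only
demo_dir <- file.path(tempdir(), "sapply_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(mm = c(0, 2), prob = c(0.0002, 0.0003)),
          file.path(demo_dir, "g1_a.csv"), row.names = FALSE)
write.csv(data.frame(mm = c(1, 2), prob = c(0.002, 0.003)),
          file.path(demo_dir, "g2_b.csv"), row.names = FALSE)

temp <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# sapply() names the list after the elements of 'temp' (the full paths)
result <- sapply(temp, read.csv, simplify = FALSE)
path_names <- names(result)

# Shorten the names to the bare file names
names(result) <- basename(names(result))

names(result)  # "g1_a.csv" "g2_b.csv"
```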

Upvotes: 2

David J. Bosak

Reputation: 1624

There is also a package called libr that is designed for exactly this situation. It loads a directory of datasets into a list, with each list item named after its file. It is very easy to use. Here is an example:

library(libr)

libname(dat, "<directory>", "csv")

Your datasets will be loaded into the variable named "dat". You can then also load them into the workspace with the following command:

lib_load(dat)

The datasets will be loaded with a two-level syntax, like: dat.g1_a, dat.g2_b, dat.g3_c, etc. so it is easy to reference them.

When you are done, just unload them, and it will clean up the workspace:

lib_unload(dat)

Upvotes: 2

GordonShumway

Reputation: 2056

Just add this line at the end of your code:

myfiles <- setNames(myfiles, basename(temp))
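
Once the list is named, each dataset can be selected by file name rather than by position. A minimal sketch, where `temp` and `myfiles` are small stand-ins for the objects in the question (the values are illustrative, not the real data):

```r
# Stand-ins for the question's objects: 'temp' holds the file paths,
# 'myfiles' holds one data frame per file (values are illustrative)
temp <- c("./g1_a.csv", "./g2_b.csv")
myfiles <- list(data.frame(mm = c(0, 2), prob = c(0.0002, 0.0003)),
                data.frame(mm = c(1, 2), prob = c(0.002, 0.003)))

# Attach the file names (without the directory part) to the list elements
myfiles <- setNames(myfiles, basename(temp))

names(myfiles)         # "g1_a.csv" "g2_b.csv"
myfiles[["g2_b.csv"]]  # the second dataset, selected by file name
```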

Upvotes: 3
