Reputation: 319
I have a folder of .csv files that I'd like to read and convert to a dataframe.
I have attempted a function to do this:
using CSV, DataFrames, Glob

function read_CSV_all(name_in::String)
    folder = joinpath(@__DIR__, "../validation", name_in)
    files = glob("*.csv", folder)
    dfs = CSV.read.(files, DataFrame)
    df = vcat(dfs...)
    return df
end
but I get an error:
ERROR: LoadError: ArgumentError: column(s) ... are missing from argument(s) 4
The columns in the .csv files I am reading all have different lengths. Could this be the issue? My function works when I return dfs (the array of dataframes), but I would like a single dataframe containing all the columns of the .csv files I am reading.
Upvotes: 3
Views: 151
Reputation: 69949
The columns (...) all have different lengths- could this be the issue?
No. vcat does a vertical concatenation, so the number of rows in each individual data frame does not matter.
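As a small illustration of that point, here is a sketch with two made-up data frames (the names short and long and their contents are hypothetical, not taken from the question) that share the same columns but have different numbers of rows:

```julia
using DataFrames

# Hypothetical data frames: same columns, different numbers of rows
short = DataFrame(x = [1], y = ["a"])
long = DataFrame(x = [2, 3, 4], y = ["b", "c", "d"])

# Vertical concatenation only requires the column names to match
stacked = vcat(short, long)
```

Here stacked should have the rows of both inputs, which suggests that differing row counts are not what triggers the error.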
column(s) ... are missing from argument(s) 4
This error message tells you that the data frames you are trying to concatenate do not all have the same columns. To allow non-matching columns when concatenating vertically, pass cols=:union, which fills any column absent from a given data frame with missing:
vcat(dfs..., cols=:union)
or
reduce(vcat, dfs, cols=:union)
which might be better if you have a huge number of data frames (to avoid splatting).
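To make the effect of cols=:union concrete, here is a minimal sketch using two small in-memory data frames in place of the CSV files. The names df1, df2, and combined, and their contents, are assumptions made up for illustration:

```julia
using DataFrames

# Hypothetical stand-ins for the data frames read from the CSV files;
# they share column a, but b and c each appear in only one of them
df1 = DataFrame(a = [1, 2], b = [3.0, 4.0])
df2 = DataFrame(a = [5], c = ["x"])
dfs = [df1, df2]

# Plain vcat(dfs...) would throw the ArgumentError about missing columns.
# With cols = :union the result keeps every column and fills gaps with missing.
combined = reduce(vcat, dfs, cols = :union)
```

In this sketch combined should contain columns a, b, and c, with missing wherever the source data frame lacked that column. In the function from the question, the same keyword would go into the vcat (or reduce) call that builds df.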
Upvotes: 4