qwebfub3i4u

Reputation: 319

Reading and joining dataframes in Julia

I have a folder of .csv files that I'd like to read and convert to a dataframe.

I have attempted a function to do this:

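using CSV, DataFrames, Glob

function read_CSV_all(name_in::String)
    folder = joinpath(@__DIR__, "../validation", name_in)
    files = glob("*.csv", folder)
    # Read every file into its own data frame
    dfs = CSV.read.(files, DataFrame)
    # Concatenate vertically (this is the line that throws the error)
    df = vcat(dfs...)
    return df
end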

but I get an error

ERROR: LoadError: ArgumentError: column(s) ... are missing from argument(s) 4

The columns in the .csv files I am reading from all have different lengths. Could this be the issue? My function works when I return dfs (the array of dataframes), but I would like a single dataframe containing all the columns of the .csv files I am reading.

Upvotes: 3

Views: 151

Answers (1)

Bogumił Kamiński

Reputation: 69949

The columns (...) all have different lengths- could this be the issue?

No. vcat does a vertical concatenation, so the number of rows in each individual data frame does not matter.

column(s) ... are missing from argument(s) 4

This error message tells you that the data frames you are trying to concatenate do not all have the same columns. To allow non-matching columns when concatenating data frames vertically, do:

vcat(dfs..., cols=:union)

or

reduce(vcat, dfs, cols=:union)

which might be better if you have a huge number of data frames (to avoid splatting).
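As a minimal sketch of what cols=:union does, the example below builds two small data frames by hand instead of reading files. The names df1, df2 and the columns a, b, c are made up for illustration and only stand in for the data frames read from the .csv files. A column that is absent from one of the inputs is filled with missing in the rows that come from that input.

```julia
using DataFrames

# Hypothetical stand-ins for the data frames read from the .csv files.
# They share column a, but b and c each appear in only one of them.
df1 = DataFrame(a = [1, 2], b = [3, 4])
df2 = DataFrame(a = [5], c = ["x"])

dfs = [df1, df2]

# A plain vcat(dfs...) throws an ArgumentError here because the column
# sets differ. With cols = :union the result has every column that
# occurs in any input, and the gaps are filled with missing.
combined = reduce(vcat, dfs, cols = :union)

println(combined)
```

In the function from the question, this would amount to replacing the vcat(dfs...) line with one of the two calls above.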

Upvotes: 4
