Reputation: 794
How I do a make this first row as my column names instead of x1, x2....xn?
Thanks
I am able to load the data with XLSX.readtable
, but unable to convert it to DataFrame
:
dir = expanduser("~/Downloads/")
using DataFrames, XLSX
f = dir * "data.xlsx"
d = XLSX.readtable(f, "dados", "A:C", first_row = 5, header = false)
df = DataFrame(d)
ArgumentError: 'Tuple{Vector{Any}, Vector{Symbol}}' iterates 'Vector{Any}' values, which doesn't satisfy the Tables.jl `AbstractRow` interface
This fails at the last step. The spreadsheet looks like pretty standard data. If I dump the data, I get:
Tuple{Vector{Any}, Vector{Symbol}}
1: Array{Any}((3,))
1: Array{Any}((225,))
1: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 731672
2: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 731702
3: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 731733
4: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 731763
5: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 731794
...
221: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 738368
222: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 738399
223: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 738429
224: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 738460
225: Dates.Date
instant: Dates.UTInstant{Dates.Day}
periods: Dates.Day
value: Int64 738490
2: Array{Any}((225,))
1: Float64 329.16091
2: Float64 303.23791
3: Float64 284.96296
4: Float64 283.10436
5: Float64 286.08795
...
221: Missing missing
222: Missing missing
223: Missing missing
224: Missing missing
225: Missing missing
3: Array{Any}((225,))
1: Float64 189.49076
2: Float64 191.64219
3: Float64 194.163
4: Float64 194.49731
5: Float64 198.85504
...
221: Missing missing
222: Missing missing
223: Missing missing
224: Missing missing
225: Missing missing
2: Array{Symbol}((3,))
1: Symbol A
2: Symbol B
3: Symbol C
The spreadsheet may be downloaded at the link I have indicated in the comment section.
Upvotes: 2
Views: 1375
Reputation: 13750
There is XLSX.readtable, which has signature
readtable(filepath, sheet, [columns]; [first_row], <args omitted for brevity>)
You can pass first_row
to tell it where the data begins.
Upvotes: 1
Reputation: 42214
You should have read this file using CSV.read
which by default processes column names. However having a DataFrame
such as this it is still possible to rename columns.
However let's assume you have such data:
julia> df = DataFrame([["a" "b"];[3 7]], :auto)
2×2 DataFrame
Row │ x1 x2
│ Any Any
─────┼──────────
1 │ a b
2 │ 3 7
You can use rename!
to assign the column names:
julia> rename!(df, Symbol.(Vector(df[1,:])))[2:end,:]
1×2 DataFrame
Row │ a b
│ Any Any
─────┼──────────
1 │ 3 7
Upvotes: 3