Reputation: 1681
I just found this package FreqTables, which allows one to easily construct frequency tables from DataFrames (I'm using DataFrames.jl).
The following lines of code return a frequency table:
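using CSV, DataFrames, FreqTables
df = CSV.read("exampledata.csv", DataFrame)  # recent CSV.jl versions need the sink type as the second argument
freqtable(df, :col_name)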
My question is how to turn the output into a DataFrame again. The output of the freqtable function seems to be a NamedArray, which I haven't been able to turn into a DataFrame.
Upvotes: 3
Views: 1825
Reputation: 17
How you convert the NamedArray to a DataFrame depends on whether the frequency table was built from one column or from two.
For a one-column frequency table:
using DataFrames, FreqTables
df = DataFrame(A = [2,2,2,2,5,5,5], B = [1,1,1,6,6,6,6])
ft = freqtable(df, :A)
DataFrame(A = names(ft)[1], Freq=ft)
If the frequency table is built from two columns:
ft = freqtable(df, :A, :B)
df_ft = DataFrame(ft |> Array, names(ft)[2] .|> string)
df_ft[:, :row] = names(ft)[1]
Note that a DataFrame in Julia has no row index that you can set, so the row labels are stored in an ordinary column here. See this post: Is it possible to set a chosen column as index in a julia dataframe?
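If a long (tidy) layout is more convenient than the wide one, the two-column table can also be flattened by hand. This is only a minimal sketch, assuming `ft` is a two-dimensional NamedArray as returned by `freqtable(df, :A, :B)`; the names `rowlabels`, `collabels`, `counts`, and `long` are made up for illustration.

```julia
using DataFrames, FreqTables

df = DataFrame(A = [2,2,2,2,5,5,5], B = [1,1,1,6,6,6,6])
ft = freqtable(df, :A, :B)

# names(ft) holds one vector of labels per dimension: rows (A), then columns (B)
rowlabels, collabels = names(ft)
counts = Array(ft)  # plain matrix of counts, without the names

# vec() walks the matrix column by column, so the row labels cycle fastest
long = DataFrame(
    A = repeat(rowlabels, outer = length(collabels)),
    B = repeat(collabels, inner = length(rowlabels)),
    Freq = vec(counts),
)
```

Unlike the wide version, this keeps one row per (A, B) pair, including pairs whose count is zero.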
Upvotes: 0
Reputation: 4827
I found this solution to work for me:
using DataFrames, FreqTables
df = DataFrame(A = [2,2,2,2,5,5,5], B = [1,1,1,6,6,6,6])
ft = freqtable(df, :A)  # df must be defined before the table is built
DataFrame(A = names(ft)[1], Freq = ft)
Result:
2×2 DataFrame
│ Row │ A │ Freq │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 2 │ 4 │
│ 2 │ 5 │ 3 │
Upvotes: 3
Reputation: 69819
This does not directly answer your question, but for frequency tables you can alternatively just write:
julia> df = DataFrame(A = [2,2,2,2,5,5,5])
7×1 DataFrame
│ Row │ A │
│ │ Int64 │
├─────┼───────┤
│ 1 │ 2 │
│ 2 │ 2 │
│ 3 │ 2 │
│ 4 │ 2 │
│ 5 │ 5 │
│ 6 │ 5 │
│ 7 │ 5 │
julia> combine(groupby(df, :A), nrow => :Freq)
2×2 DataFrame
│ Row │ A │ Freq │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 2 │ 4 │
│ 2 │ 5 │ 3 │
to get the same result.
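The same pattern should extend to more than one column by grouping on several keys. A minimal sketch, assuming a second column `B` like the one used in the other answers (the name `freq` is just for illustration):

```julia
using DataFrames

df = DataFrame(A = [2,2,2,2,5,5,5], B = [1,1,1,6,6,6,6])

# One row per observed (A, B) combination, with its count in :Freq
freq = combine(groupby(df, [:A, :B]), nrow => :Freq)
```

Note that this only lists combinations that actually occur in the data, so a pair with a count of zero (here A = 5, B = 1) gets no row, whereas freqtable would show it as 0.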
Upvotes: 4