TaylorR

Reputation: 4013

Julia DataFrames describe does not show the full output

I am new to Julia and DataFrames.

I tried to run some scripts, as attached.

I tried some commands and they work as expected, but describe does not output everything that some tutorials say it should, such as the mean, the 25% quantile, etc.

Did I miss anything here?

Upvotes: 1

Views: 611

Answers (1)

Bogumił Kamiński

Reputation: 69819

The describe function takes a keyword argument stats that specifies which statistics are calculated. Check the help for describe to get the full list. For example, if you set stats to :all, all summary statistics are calculated. Here is an example (the output is a bit wide, so you have to scroll the listing horizontally to see all the columns):

julia> df = DataFrame(a=1:3, b='a':'c')
3×2 DataFrame
│ Row │ a     │ b    │
│     │ Int64 │ Char │
├─────┼───────┼──────┤
│ 1   │ 1     │ 'a'  │
│ 2   │ 2     │ 'b'  │
│ 3   │ 3     │ 'c'  │

julia> describe(df)
2×8 DataFrame
│ Row │ variable │ mean   │ min │ median │ max │ nunique │ nmissing │ eltype   │
│     │ Symbol   │ Union… │ Any │ Union… │ Any │ Union…  │ Nothing  │ DataType │
├─────┼──────────┼────────┼─────┼────────┼─────┼─────────┼──────────┼──────────┤
│ 1   │ a        │ 2.0    │ 1   │ 2.0    │ 3   │         │          │ Int64    │
│ 2   │ b        │        │ 'a' │        │ 'c' │ 3       │          │ Char     │

julia> describe(df, stats=:all)
2×13 DataFrame
│ Row │ variable │ mean   │ std    │ min │ q25    │ median │ q75    │ max │ nunique │ nmissing │ first │ last │ eltype   │
│     │ Symbol   │ Union… │ Union… │ Any │ Union… │ Union… │ Union… │ Any │ Union…  │ Nothing  │ Any   │ Any  │ DataType │
├─────┼──────────┼────────┼────────┼─────┼────────┼────────┼────────┼─────┼─────────┼──────────┼───────┼──────┼──────────┤
│ 1   │ a        │ 2.0    │ 1.0    │ 1   │ 1.5    │ 2.0    │ 2.5    │ 3   │         │          │ 1     │ 3    │ Int64    │
│ 2   │ b        │        │        │ 'a' │        │        │        │ 'c' │ 3       │          │ 'a'   │ 'c'  │ Char     │
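
If only a few statistics are needed, they can be listed individually instead of requesting :all. The following is a minimal sketch, assuming a recent DataFrames.jl release (1.x), where the statistics are passed as positional arguments. In the older release that produced the listings above, the same selection would go into the stats keyword as a vector of symbols. The name stats_df is only illustrative.

```julia
using DataFrames

df = DataFrame(a=1:3, b='a':'c')

# Request only the mean, standard deviation, and quartiles.
# Positional symbols are the DataFrames.jl 1.x form (an assumption
# about the installed version); older releases used stats=[...].
stats_df = describe(df, :mean, :std, :q25, :median, :q75)

println(stats_df)
```

The result has one row per column of df and one column per requested statistic, in addition to the variable column.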

Also note that if your terminal is narrow, some of the columns may not be displayed so that the output fits the screen width. For example, here is the result of the last command on a narrow terminal:

julia> describe(df, stats=:all)
2×13 DataFrame. Omitted printing of 6 columns
│ Row │ variable │ mean   │ std    │ min │ q25    │ median │ q75    │
│     │ Symbol   │ Union… │ Union… │ Any │ Union… │ Union… │ Union… │
├─────┼──────────┼────────┼────────┼─────┼────────┼────────┼────────┤
│ 1   │ a        │ 2.0    │ 1.0    │ 1   │ 1.5    │ 2.0    │ 2.5    │
│ 2   │ b        │        │        │ 'a' │        │        │        │

Note that you are now informed that printing of 6 columns was omitted. This should not be a problem in Jupyter Notebook, though.
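
If the omitted columns are needed in a narrow terminal, one option is to print the data frame with show and ask for all columns explicitly. This is a minimal sketch, assuming the allcols keyword of the DataFrames.jl show method is available in the installed version; the exact layout of the printed table differs between releases, and stats_df is only an illustrative name.

```julia
using DataFrames

df = DataFrame(a=1:3, b='a':'c')

# Compute the default summary statistics.
stats_df = describe(df)

# allcols=true prints every column instead of truncating the table
# to the terminal width.
show(stats_df, allcols=true)
```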

Upvotes: 1
