Biarys

Reputation: 1173

Easy way to print out dask series/dataframe?

In pandas, methods such as head, tail, loc, and iloc let me inspect the data, but whenever I call one of them on a Dask object, all I get is:

Dask DataFrame Structure:
              Close
npartitions=1
               bool
                ...
Dask Name: try_loc, 9 tasks

regardless of whether I call .compute() first. How can I see the data inside a Dask DataFrame or Series?

I am using Visual Studio Code 1.38.1, Python 3.7, and Dask 2.13.0.

Upvotes: 3

Views: 2390

Answers (1)

MRocklin

Reputation: 57271

head, tail, and compute all return ordinary pandas objects, which print to the screen in the familiar way. Here is a simple example:

In [1]: import dask                                                             

In [2]: df = dask.datasets.timeseries()                                         

In [3]: df                                                                      
Out[3]: 
Dask DataFrame Structure:
                   id    name        x        y
npartitions=30                                 
2000-01-01      int64  object  float64  float64
2000-01-02        ...     ...      ...      ...
...               ...     ...      ...      ...
2000-01-30        ...     ...      ...      ...
2000-01-31        ...     ...      ...      ...
Dask Name: make-timeseries, 30 tasks

In [4]: df.head()                                                               
Out[4]: 
                       id     name         x         y
timestamp                                             
2000-01-01 00:00:00  1014  Michael  0.326006 -0.247279
2000-01-01 00:00:01  1001    Laura  0.429982 -0.545960
2000-01-01 00:00:02  1003      Bob -0.454010  0.096530
2000-01-01 00:00:03   964    Wendy  0.939114  0.826197
2000-01-01 00:00:04  1008   Xavier  0.035316  0.793430

Upvotes: 2
