BallerNacken
BallerNacken

Reputation: 355

Pandas extract daily data and write to csv

I have a pandas dataframe looking like this:

Index                   Stat    value1  value2  value3  value4  value5  value6 
2016-11-01 00:00:00.000 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.100 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.200 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.300 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.400 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-02 00:00:00.000 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-02 00:00:00.100 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-02 00:00:00.200 Gard    0.15    0.25    0.01    6.08    31.0    0.291719042916
2016-11-02 00:00:00.300 Gard    0.15    0.25    0.01    6.08    31.0    0.291719042916

Of course this is just a snippet, the whole dataframe has about 4.3 million rows.

I would like to extract each line that correponds to a date. So all lines that have the timestamp date 2016-11-01 into one file and 2016-1-02 into another file. So two files looking like this:

Index                   Stat    value1  value2  value3  value4  value5  value6
2016-11-01 00:00:00.000 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.100 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.200 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.300 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-01 00:00:00.400 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642

And:

Index                   Stat    value1  value2  value3  value4  value5  value6
2016-11-02 00:00:00.000 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-02 00:00:00.100 Gard    0.08    0.24    0.09    6.08    18.4    0.268514431642
2016-11-02 00:00:00.200 Gard    0.15    0.25    0.01    6.08    31.0    0.291719042916
2016-11-02 00:00:00.300 Gard    0.15    0.25    0.01    6.08    31.0    0.291719042916

I tried to use groupby in the following command:

grouped_df = df.groupby(df.index.date)["Stat","value1","value2","value3","value4","value5","value6"]

But I don't get any output or error. It runs but nothing happens. Am I doing anything wrong? Is this even the correct function to use? Or is there an easier, better way?

Upvotes: 1

Views: 375

Answers (1)

jezrael
jezrael

Reputation: 862511

I think you need groupby with apply and custom function with to_csv:

f = lambda x: x.to_csv(r'd:/folder/{}.csv'.format(x.name))
df.groupby(df.index.date).apply(f)

Upvotes: 2

Related Questions