Jesse Blocher

Reputation: 523

pandas data format to preserve DatetimeIndex

I do a lot of work with data that has datetime indexes and multi-indexes. Saving and reading as a .csv is tedious: every time I save, I have to reset_index and name the column "date", and when I read the file back, I have to convert the date to a datetime and set the index again. What format will help me avoid this? I'd prefer something open source. For instance, I think SAS and Stata will do this, but they are proprietary.

Upvotes: 1

Views: 1386

Answers (1)

tobsecret

Reputation: 2522

Feather was made for this: https://github.com/wesm/feather

Feather provides binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy. This initial version comes with bindings for python (written by Wes McKinney) and R (written by Hadley Wickham).

Feather uses the Apache Arrow columnar memory specification to represent binary data on disk. This makes read and write operations very fast. This is particularly important for encoding null/NA values and variable-length types like UTF8 strings.

Feather is a part of the broader Apache Arrow project. Feather defines its own simplified schemas and metadata for on-disk representation.

Feather currently supports the following column types:

- A wide range of numeric types (int8, int16, int32, int64, uint8, uint16, uint32, uint64, float, double).
- Logical/boolean values.
- Dates, times, and timestamps.
- Factors/categorical variables that have a fixed set of possible values.
- UTF-8 encoded strings.
- Arbitrary binary data.
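One caveat when using this from pandas: `DataFrame.to_feather` requires a default index, so the index itself is not stored. The datetime dtype does survive the round trip, though, so the string-to-datetime conversion step from the question goes away. Here is a minimal sketch of how that could look. The helper names `write_feather_keep_index` and `read_feather_keep_index` are made up for illustration and are not part of pandas or Feather, and the code assumes `pyarrow` is installed and every index level has a name.

```python
import pandas as pd


def write_feather_keep_index(df: pd.DataFrame, path: str) -> list:
    """Move the index into ordinary columns, write Feather, return the index names.

    Hypothetical helper, not a pandas API.
    """
    index_names = list(df.index.names)
    if any(name is None for name in index_names):
        # reset_index() would invent names such as "index" or "level_0";
        # requiring explicit names keeps the round trip predictable.
        raise ValueError("name every index level before saving")
    # to_feather() needs a default index, hence the reset_index().
    df.reset_index().to_feather(path)
    return index_names


def read_feather_keep_index(path: str, index_names: list) -> pd.DataFrame:
    """Read a Feather file and turn the given columns back into the index.

    Hypothetical helper, not a pandas API.
    """
    return pd.read_feather(path).set_index(index_names)
```

Usage would be something like `names = write_feather_keep_index(df, "prices.feather")` followed by `df2 = read_feather_keep_index("prices.feather", names)`, where the file name is just a placeholder. The same pair of calls works for a MultiIndex, since all levels are written as columns and restored together. If you want the index stored without any reset/set step, `DataFrame.to_parquet` keeps it by default and is also open source.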

Upvotes: 3
