pga

Reputation: 137

How to specify type of input data for Pandas DataFrame

I want to convert an existing Python list into a Pandas DataFrame object. How do I specify the data type for each column and define the index column?

Here is a sample of my code:

import pandas as pd

data = [[1444990457000286208, 0, 286],
        [1435233159000067840, 0, 68],
        [1431544002000055040, 1, 55]]
df = pd.DataFrame(data, columns=['time', 'value1', 'value2'])

In the above example I need the following types for the existing columns:

time: datetime
value1: bool
value2: int

Additionally, the time column should be used as the index column.

By default all three columns are int64, and I can't find how to specify column types when creating the DataFrame object.

Thanks!

Upvotes: 5

Views: 2830

Answers (2)

kikocorreoso

Reputation: 4219

You can use the dtype keyword of the pd.DataFrame constructor, but it accepts only a single dtype, which is applied to all columns. For per-column types, please see @Alex's answer.

To use a specific column as the index, you can use the set_index method of the DataFrame instance.
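
As a minimal sketch of both points, using the data from the question: the float64 dtype here is only an illustrative choice, and the names df_uniform and df_indexed are made up for this example.

```python
import pandas as pd

data = [[1444990457000286208, 0, 286],
        [1435233159000067840, 0, 68],
        [1431544002000055040, 1, 55]]
columns = ['time', 'value1', 'value2']

# dtype= takes one dtype and applies it to every column
# (float64 is just an illustrative choice here)
df_uniform = pd.DataFrame(data, columns=columns, dtype='float64')
print(df_uniform.dtypes)

# set_index returns a new DataFrame with 'time' moved from the columns to the index
df_indexed = pd.DataFrame(data, columns=columns).set_index('time')
print(df_indexed.index.name)
print(list(df_indexed.columns))
```

Because dtype cannot vary per column, it does not cover the mixed types asked for in the question.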

Upvotes: 1

Alex

Reputation: 19104

value2 is already of the correct dtype.

For time you can convert to datetimes with to_datetime and then set the index with set_index.

For value1 you can cast to bool with astype.

df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')
df['value1'] = df['value1'].astype(bool)
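
For reference, here is a self-contained sketch of the same steps starting from the list in the question. It assumes the time values are nanoseconds since the Unix epoch, which is how to_datetime interprets integers by default; unit='ns' is spelled out only to make that assumption visible. Passing a dict to astype is an optional variation that casts several columns in one call.

```python
import pandas as pd

data = [[1444990457000286208, 0, 286],
        [1435233159000067840, 0, 68],
        [1431544002000055040, 1, 55]]
df = pd.DataFrame(data, columns=['time', 'value1', 'value2'])

# Cast several columns at once by passing a {column: dtype} mapping to astype
df = df.astype({'value1': 'bool', 'value2': 'int64'})

# Assumption: the integers are nanoseconds since the Unix epoch
df['time'] = pd.to_datetime(df['time'], unit='ns')
df = df.set_index('time')

print(df.dtypes)
print(df.index)
```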

Upvotes: 4
