OPM_OK

Reputation: 190

Constructing Dictionary - List not same length

I was wondering if it is possible to create a dictionary and convert it into a Pandas dataframe when each dictionary key holds a list of values, but the lists vary in length.

For example, col3 has only 2 values while all the other lists have 3. Can I somehow fill in the missing values with NaN instead of getting an error?

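import pandas as pd

col1 = ["Bottom", "sss", "ddd"]
col2 = ["boo", "sss", "foo"]
col3 = [999, 89]  # one value shorter than the other lists

d = {"Type": col1, "Style": col2, "Profit": col3}
# Raises ValueError because the lists are not all the same length
df = pd.DataFrame.from_dict(d)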

Upvotes: 3

Views: 92

Answers (3)

BENY

Reputation: 323226

You can build the frame row-wise from the lists and then transpose it:

df=pd.DataFrame([col1,col2,col3],index=['T','S','P']).T
df
Out[165]: 
        T    S     P
0  Bottom  boo   999
1     sss  sss    89
2     ddd  foo  None

Another option

pd.Series(d).apply(pd.Series).T
Out[174]: 
     Type Style Profit
0  Bottom   boo    999
1     sss   sss     89
2     ddd   foo    NaN
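
A closely related variant, offered as a sketch rather than as part of the answer above: wrapping each list in its own pd.Series lets the DataFrame constructor align the columns on their index, so the shorter column is padded with NaN. The variable names follow the question; nothing else is assumed.

```python
import pandas as pd

col1 = ["Bottom", "sss", "ddd"]
col2 = ["boo", "sss", "foo"]
col3 = [999, 89]  # shorter than the other two lists

d = {"Type": col1, "Style": col2, "Profit": col3}

# Each list becomes a Series with its own 0..n-1 index; the constructor
# takes the union of those indexes and fills the gaps with NaN.
df = pd.DataFrame({key: pd.Series(values) for key, values in d.items()})
print(df)
```

Unlike the transpose approach, this builds each column separately, so a numeric column such as Profit should keep a numeric dtype.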

Upvotes: 1

jpp

Reputation: 164643

A dictionary isn't strictly required. Using itertools.zip_longest:

from itertools import zip_longest

df = pd.DataFrame(list(zip_longest(col1, col2, col3)),
                  columns=['Type', 'Style', 'Profit'])

print(df)

     Type Style  Profit
0  Bottom   boo   999.0
1     sss   sss    89.0
2     ddd   foo     NaN

Notice that the pd.DataFrame constructor infers a numeric dtype for the Profit column, even though each tuple in the input list mixes strings and numbers. The values print as floats (999.0, 89.0) because the missing entry is stored as NaN, which is a float.
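
To check the dtype inference described above, one option is to inspect df.dtypes after building the frame. This is a small sketch that reuses the lists from the question; zip_longest pads with None by default, and the check only confirms that pandas treats Profit as numeric.

```python
from itertools import zip_longest

import pandas as pd

col1 = ["Bottom", "sss", "ddd"]
col2 = ["boo", "sss", "foo"]
col3 = [999, 89]

# zip_longest pads the shorter list with None (its default fillvalue)
rows = list(zip_longest(col1, col2, col3))

df = pd.DataFrame(rows, columns=["Type", "Style", "Profit"])

# Inspect the inferred column dtypes
print(df.dtypes)
print(pd.api.types.is_numeric_dtype(df["Profit"]))
```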

Upvotes: 1

Tim

Reputation: 2843

Sure - you can fill the missing values with numpy.nan:

import numpy as np
import pandas as pd

col1 = ["Bottom", "sss", "ddd"]
col2 = ["boo", "sss", "foo"]
col3 = [999, 89, np.nan]

d = {"Type": col1, "Style": col2, "Profit": col3}
df = pd.DataFrame.from_dict(d)

Output

   Profit Style    Type
0   999.0   boo  Bottom
1    89.0   sss     sss
2     NaN   foo     ddd
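
If the lists come from elsewhere and you would rather not append np.nan by hand, the padding can be automated. The helper below is a minimal sketch; pad_to_longest is a hypothetical name made up for this example, not a pandas or NumPy function.

```python
import numpy as np
import pandas as pd


def pad_to_longest(columns, fill=np.nan):
    """Return a copy of `columns` with every list padded to the longest length.

    Hypothetical helper for illustration only.
    """
    longest = max(len(values) for values in columns.values())
    return {
        key: list(values) + [fill] * (longest - len(values))
        for key, values in columns.items()
    }


d = {
    "Type": ["Bottom", "sss", "ddd"],
    "Style": ["boo", "sss", "foo"],
    "Profit": [999, 89],  # one value short
}

padded = pad_to_longest(d)
df = pd.DataFrame.from_dict(padded)
print(df)
```

The original dictionary is left unchanged, and lists that are already the longest length are copied as-is.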

Upvotes: 0
