IronMan18

Reputation: 57

How to make a Pandas DataFrame from a dictionary whose values are arrays?

I have a dictionary in which each key maps to an array containing many lists (one list per row). I want to turn this dictionary into a DataFrame. This is what the dictionary looks like currently:

[image: screenshot of the dictionary]

However, I want the DataFrame formatted like this:

title                                  A       C      mC      G      T
000805ef-05c0-40d8-be27-a88ad2932a68  255     255     0      255    255
000805ef-05c0-40d8-be27-a88ad2932a68  255     20      235    255    255
......
00723b5f-95a1-44c4-93b5-ba4adb8ea5b7  255     255     0      255    255

I tried pd.DataFrame(dict) and pd.DataFrame.from_dict(dict), but neither worked. I think I need to restructure the dictionary before converting it so that the data frame comes out the way I want. Any help would be greatly appreciated!

Upvotes: 0

Views: 264

Answers (1)

perl

Reputation: 9941

You can stack the values with np.vstack and repeat each key once per row of its array:

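import numpy as np
import pandas as pd

# dictionary with numpy arrays as values
# (the np.int alias was removed in NumPy 1.24; the builtin int is equivalent)
d = {
    'a': np.ones((2, 3), dtype=int),
    'b': np.eye(3, dtype=int),
}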

df = pd.DataFrame(
    np.vstack(list(d.values())),
    index = np.repeat(
        list(d.keys()),
        [len(x) for x in d.values()]
    ),
    columns = ['x', 'y', 'z']
)

df

Output:

   x  y  z
a  1  1  1
a  1  1  1
b  1  0  0
b  0  1  0
b  0  0  1
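
To get the layout asked for in the question, the same idea can be followed by naming the index and moving it into a regular column. This is only a sketch: the `reads` dictionary below is a hypothetical stand-in built from the rows shown in the question, and it assumes every value is a 2-D array whose five columns are in the order A, C, mC, G, T.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's dictionary (the real one was only
# shown as an image): each key maps to a 2-D array, one row per record.
reads = {
    "000805ef-05c0-40d8-be27-a88ad2932a68": np.array(
        [[255, 255, 0, 255, 255], [255, 20, 235, 255, 255]]
    ),
    "00723b5f-95a1-44c4-93b5-ba4adb8ea5b7": np.array(
        [[255, 255, 0, 255, 255]]
    ),
}

# Assumed column order, taken from the desired output in the question.
columns = ["A", "C", "mC", "G", "T"]

result = (
    pd.DataFrame(
        # Stack all arrays vertically into one (total_rows, 5) array.
        np.vstack(list(reads.values())),
        # Repeat each key once per row of its array.
        index=np.repeat(list(reads.keys()), [len(v) for v in reads.values()]),
        columns=columns,
    )
    .rename_axis("title")  # name the index "title"
    .reset_index()         # turn it into an ordinary column
)

print(result)
```

If the keys should stay as the index rather than become a column, drop the final reset_index() call.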

Upvotes: 1
