waffl

Reputation: 5511

Convert pandas dataframe to list of tuples - ('Row', 'Column', Value)

There are a few other questions on the same subject, but each of them asks for a different output format.

I am trying to build a heatmap visualization using holoviews and bokeh.

My data is read from an Excel file into a dataframe that looks something like this:

    Foo    Bar    Bash    Baz   ...
A   1      2      3       4
B   2      1      0       3
C   0      0      2       0
D   2      3      5       1
...

The documentation says: "The data for a HeatMap may be supplied as 2D tabular data with one or more associated value dimensions."

Plotting the dataframe itself doesn't work. I feel like I need to get my data into a form like:

[('A', 'Foo', 1), ('A', 'Bar', 2), ('A', 'Bash', 3), ('A', 'Baz', 4), ('B', 'Foo', 2)...]

Is there a faster way to do this than iterating through the entire dataframe and building the list by hand?

Upvotes: 1

Views: 2527

Answers (3)

Draghi Puterity

Reputation: 1

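With iterrows and a list comprehension:

my_list = []
# iterrows() yields (index label, row Series) pairs
for idx, row in df.iterrows():
    # Series.items() replaces iteritems(), which was removed in pandas 2.0
    my_list.extend([(idx, col, val) for col, val in row.items()])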

Upvotes: 0

jezrael

Reputation: 862711

You can reshape first with stack and then convert to tuples:

tups = [tuple(x) for x in df.stack().reset_index().values.tolist()]
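To make the reshape step easier to follow, here is a minimal sketch on the sample data from the question. The dataframe is rebuilt by hand here (an assumption, since the real data comes from an Excel file), and it iterates over the stacked Series directly rather than going through reset_index:

```python
import pandas as pd

# Sample data typed in by hand; the real frame would come from pd.read_excel
df = pd.DataFrame(
    {
        "Foo": [1, 2, 0, 2],
        "Bar": [2, 1, 0, 3],
        "Bash": [3, 0, 2, 5],
        "Baz": [4, 3, 0, 1],
    },
    index=["A", "B", "C", "D"],
)

# stack() moves the column labels into a second index level, giving a
# Series keyed by (row label, column label)
stacked = df.stack()

# Each item is ((row, column), value); flatten it into a 3-tuple
tups = [(row, col, val) for (row, col), val in stacked.items()]
```

The tuples come out row by row, starting with ('A', 'Foo', 1), and there is one tuple per cell (16 for this 4x4 frame).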

A similar solution is to create a 3-level MultiIndex:

tups = df.stack().to_frame().set_index(0, append=True).index.tolist()

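Or zip 3 separate arrays built with numpy.repeat, numpy.tile and ravel:

import numpy as np

a = np.repeat(df.index, len(df.columns))   # each row label repeated once per column
b = np.tile(df.columns, len(df))           # the column labels cycled once per row
c = df.values.ravel()                      # values flattened row by row

tups = list(zip(a,b,c))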
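The repeat/tile version only works if the three arrays line up in the same order, so a quick sanity check may help. This is just a sketch on the sample data from the question, typed in by hand; the nested-loop comparison is a hypothetical check added here, not something you need in real code:

```python
import numpy as np
import pandas as pd

# Sample data typed in by hand; the real frame would come from pd.read_excel
df = pd.DataFrame(
    {
        "Foo": [1, 2, 0, 2],
        "Bar": [2, 1, 0, 3],
        "Bash": [3, 0, 2, 5],
        "Baz": [4, 3, 0, 1],
    },
    index=["A", "B", "C", "D"],
)

rows = np.repeat(df.index.to_numpy(), len(df.columns))  # A A A A B B B B ...
cols = np.tile(df.columns.to_numpy(), len(df))          # Foo Bar Bash Baz Foo ...
vals = df.to_numpy().ravel()                            # flattened row by row

tups = list(zip(rows, cols, vals))

# Cross-check against the slow, explicit nested loop
expected = [(r, c, df.at[r, c]) for r in df.index for c in df.columns]
assert tups == expected
```

The values in these tuples are numpy integers rather than plain Python ints; if that matters downstream, df.to_numpy().ravel().tolist() should give plain ints.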

Upvotes: 1

jpp

Reputation: 164683

Using pd.DataFrame.to_dict:

res = df.to_dict('index')

{'A': {'Bar': 2, 'Bash': 3, 'Baz': 4, 'Foo': 1},
 'B': {'Bar': 1, 'Bash': 0, 'Baz': 3, 'Foo': 2},
 'C': {'Bar': 0, 'Bash': 2, 'Baz': 0, 'Foo': 0},
 'D': {'Bar': 3, 'Bash': 5, 'Baz': 1, 'Foo': 2}}

Then via a list comprehension:

lst = [(k, a, b) for k, v in res.items() for a, b in v.items()]

[('A', 'Foo', 1),
 ('A', 'Bar', 2),
 ('A', 'Bash', 3),
 ...
 ('D', 'Baz', 1)]

Upvotes: 1
