bfh0474
bfh0474

Reputation: 33

How to partially transpose a CSV table in Python

I'm reading in a csv file with weekly sales data by store by product across many weeks and trying to partially transpose the data so that each row represents a weekly transaction by product, store, week

Going From This:

Product,Store,9/1/18,9/8/18,9/15/18,9/22/18
vacuum,123,1,5,3,3
toaster,456,5,7,4,10

To This:

Product,Store,Week,Sales
vacuum,123,9/1/18,1
vacuum,123,9/8/18,5
vacuum,123,9/15/18,3
vacuum,123,9/22/18,3
toaster,456,9/1/18,5
toaster,456,9/8/18,7
toaster,456,9/15/18,4
toaster,456,9/22/18,10...

I'm brand new to Python (2 days old) and I semi-understand a full transpose using zip_longest/reader/writer, but cannot figure out how to do a partial version

Upvotes: 3

Views: 635

Answers (3)

rahlf23
rahlf23

Reputation: 9019

You can use df.pivot():

df.pivot(index='Product', columns='Store').stack(level=[1,0]).reset_index()

For example:

import pandas as pd

df = pd.read_csv('test.csv')

df = df.pivot(index='Product', columns='Store').stack(level=[1,0]).reset_index()

df.columns = ['Product','Store','Week','Sales']

Which gives:

   Product  Store     Week  Sales
0  toaster    456   9/1/18    5.0
1  toaster    456   9/8/18    7.0
2  toaster    456  9/15/18    4.0
3  toaster    456  9/22/18   10.0
4   vacuum    123   9/1/18    1.0
5   vacuum    123   9/8/18    5.0
6   vacuum    123  9/15/18    3.0
7   vacuum    123  9/22/18    3.0

Upvotes: 2

Scott Boston
Scott Boston

Reputation: 153550

Also,

df.set_index(['Product','Store']).stack().reset_index()

Output:

   Product  Store  level_2   0
0   vacuum    123   9/1/18   1
1   vacuum    123   9/8/18   5
2   vacuum    123  9/15/18   3
3   vacuum    123  9/22/18   3
4  toaster    456   9/1/18   5
5  toaster    456   9/8/18   7
6  toaster    456  9/15/18   4
7  toaster    456  9/22/18  10

With cleaned up column naming,

(df.set_index(['Product','Store'])
   .rename_axis('Week', axis=1)
   .stack()
   .rename('Sales')
   .reset_index())

Output:

   Product  Store     Week  Sales
0   vacuum    123   9/1/18      1
1   vacuum    123   9/8/18      5
2   vacuum    123  9/15/18      3
3   vacuum    123  9/22/18      3
4  toaster    456   9/1/18      5
5  toaster    456   9/8/18      7
6  toaster    456  9/15/18      4
7  toaster    456  9/22/18     10

Upvotes: 5

It_is_Chris
It_is_Chris

Reputation: 14113

You can use melt to do this:

df.melt(id_vars=['Product', 'Store',],
                       value_vars=['9/1/18', '9/8/18','9/15/18','9/22/18'], 
                       var_name='week', value_name='Sales')

    Product Store   week    Sales
0   vacuum  123  9/1/18     1
1   toaster 456  9/1/18     5
2   vacuum  123  9/8/18     5
3   toaster 456  9/8/18     7
4   vacuum  123  9/15/18    3
5   toaster 456  9/15/18    4
6   vacuum  123  9/22/18    3
7   toaster 456  9/22/18    10

Upvotes: 3

Related Questions