user16024912
user16024912

Reputation:

How to rearrange a pandas table where data is split between multiple columns

I have a pandas dataframe that contains table data taken from online, where there are pairs of data split into multiple columns like this every 2 columns:

| name   | price | name.1 | price.1| name.2 | price.2|
|--------|-------|--------|--------|--------|--------|
| orange | 2.2   | banana | 3.1    | grapes | 1.8    |
| apple  | 2.5   | mango  | 4.2    | kiwi   | 2.4    |

is there a way to rearrange the table to end up with this?:

| name   | price |
|--------|-------|
| orange | 2.2   |
| apple  | 2.5   |
| banana | 3.1   |
| mango  | 4.2   |
| grapes | 1.8   |
| kiwi   | 2.4   |

Upvotes: 0

Views: 77

Answers (3)

Henry Ecker
Henry Ecker

Reputation: 35646

Try pd.wide_to_long:

new_df = pd.wide_to_long(df.rename(columns={'name': 'name.0',
                                            'price': 'price.0'})
                         .reset_index(),
                         stubnames=['name', 'price'],
                         sep='.',
                         i='index',
                         j='level_1').reset_index(drop=True)

new_df:

     name  price
0  orange    2.2
1   apple    2.5
2  banana    3.1
3   mango    4.2
4  grapes    1.8
5    kiwi    2.4

df constructor:

import pandas as pd

df = pd.DataFrame({
    'name': ['orange', 'apple'],
    'price': [2.2, 2.5],
    'name.1': ['banana', 'mango'],
    'price.1': [3.1, 4.2],
    'name.2': ['grapes', 'kiwi'],
    'price.2': [1.8, 2.4]
})

Upvotes: 2

sammywemmy
sammywemmy

Reputation: 28699

You could also try the pivot_longer function from pyjanitor:

# pip install pyjanitor
import janitor
import pandas as pd
df.pivot_longer(names_to = ('name', 'price'), 
                names_pattern = ('name', 'price')
                )
 
     name  price
0  orange    2.2
1   apple    2.5
2  banana    3.1
3   mango    4.2
4  grapes    1.8
5    kiwi    2.4

Your columns have a pattern (some start with name, others with price), so you pass that pattern to names_pattern, and pass the new column names to names_to.

Upvotes: 1

perl
perl

Reputation: 9941

An alternative with np.reshape (as this DataFrame is so conveniently structured):

pd.DataFrame(
    df.values.reshape(-1, 2),
    columns=['name', 'price'])

Output:

     name price
0  orange   2.2
1  banana   3.1
2  grapes   1.8
3   apple   2.5
4   mango   4.2
5    kiwi   2.4

Upvotes: 4

Related Questions