Zipped Columns in Pandas DataFrame

Question

I have two lists and want to create a pandas DataFrame with three columns, where one of the columns is generated by zipping the two lists. I tried the following:

import pandas as pd
import numpy as np

S_x = [80, 90, 100, 200, 300, 600, 800, 900, 1000, 1200]
S_y = [800, 1000, 1200, 450, 80, 100, 60, 300, 700, 900]
S_z = list(zip(S_x, S_y))

frame4 = pd.DataFrame(np.column_stack([S_x, S_y, S_z]), columns=["Recovered Data", "Percentage Error", "Zipped"])

In the "Zipped" column I want the elements to be tuples, as they appear in the list S_z, while the first two columns should stay as they are. When I run my code I get the error:

ValueError: Shape of passed values is (4, 10), indices imply (3, 10)

I don't know what I am doing wrong. I am using Python 3.x.

Bruno Mello · Accepted Answer

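np.column_stack converts S_z to a 2-D array of shape (10, 2), so each tuple is split into two separate columns. As a result, np.column_stack([S_x, S_y, S_z]) has shape (10, 4) rather than (10, 3), which does not match the three column names. Build the DataFrame from a dictionary instead: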

frame4 = pd.DataFrame({"Recovered Data": S_x, "Percentage Error": S_y,"Zipped": S_z})
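To check both claims, here is a minimal sketch that reuses the lists from the question. It prints the shape of the stacked array and then confirms that the dictionary approach keeps each tuple as a single object in the "Zipped" column. The variable name stacked is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

S_x = [80, 90, 100, 200, 300, 600, 800, 900, 1000, 1200]
S_y = [800, 1000, 1200, 450, 80, 100, 60, 300, 700, 900]
S_z = list(zip(S_x, S_y))

# column_stack turns the list of tuples into a (10, 2) array, so the
# stacked result has four columns instead of the three that were named.
stacked = np.column_stack([S_x, S_y, S_z])
print(stacked.shape)  # (10, 4)

# Building from a dict keeps each tuple intact as one cell of "Zipped".
frame4 = pd.DataFrame({"Recovered Data": S_x, "Percentage Error": S_y, "Zipped": S_z})
print(frame4.shape)  # (10, 3)
print(type(frame4.loc[0, "Zipped"]))  # <class 'tuple'>
```

The "Zipped" column ends up with object dtype, since pandas stores the tuples as generic Python objects.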
