Reputation: 308
I have two lists and I want to create a pandas DataFrame with 3 columns, where the third column is generated by zipping the two lists. I tried the following:
import pandas as pd
import numpy as np
S_x = [80, 90, 100, 200, 300, 600, 800, 900, 1000, 1200]
S_y = [800, 1000, 1200, 450, 80, 100, 60, 300, 700, 900]
S_z=list(zip(S_x,S_y))
frame4 = pd.DataFrame(np.column_stack([S_x, S_y,S_z]), columns=["Recovered Data", "Percentage Error","Zipped"])
In the "Zipped" column I want the elements to be tuples, exactly as they appear in the list S_z, while the first two columns should keep their original values. When I run my code I get this error:
ValueError: Shape of passed values is (4, 10), indices imply (3, 10)
I don't know what I am doing wrong. I am using Python 3.x.
Upvotes: 0
Views: 1995
Reputation: 4618
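np.column_stack converts every input to an array, so S_z (a list of ten 2-tuples) becomes a 10x2 array and is stacked as two separate columns. The result of np.column_stack([S_x, S_y, S_z]) therefore has shape (10, 4), which does not match the three column names you passed. Build the DataFrame from a dict instead: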
frame4 = pd.DataFrame({"Recovered Data": S_x, "Percentage Error": S_y,"Zipped": S_z})
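To see where the fourth column comes from, here is a small sketch that uses the lists from the question. It prints the shape of the stacked array and then checks that the dict-based constructor keeps the tuples intact. The variable name `stacked` is only for illustration.

```python
import numpy as np
import pandas as pd

S_x = [80, 90, 100, 200, 300, 600, 800, 900, 1000, 1200]
S_y = [800, 1000, 1200, 450, 80, 100, 60, 300, 700, 900]
S_z = list(zip(S_x, S_y))

# The list of 2-tuples is converted to a 10x2 array, so it adds two columns.
stacked = np.column_stack([S_x, S_y, S_z])
print(stacked.shape)  # 10 rows, 4 columns

# Passing a dict maps each list to one column; tuples stay as Python objects.
frame4 = pd.DataFrame(
    {"Recovered Data": S_x, "Percentage Error": S_y, "Zipped": S_z}
)
print(frame4.shape)
print(type(frame4.loc[0, "Zipped"]))
```

The "Zipped" column ends up with object dtype, since pandas has no native tuple dtype.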
Upvotes: 2
Reputation: 323316
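If I understand correctly, you can zip the three lists together so that each resulting tuple becomes one row: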
frame=pd.DataFrame(zip(S_x, S_y, S_z), columns=["Recovered Data", "Percentage Error","Zipped"])
   Recovered Data  Percentage Error       Zipped
0              80               800    (80, 800)
1              90              1000   (90, 1000)
2             100              1200  (100, 1200)
3             200               450   (200, 450)
4             300                80    (300, 80)
5             600               100   (600, 100)
6             800                60    (800, 60)
7             900               300   (900, 300)
8            1000               700  (1000, 700)
9            1200               900  (1200, 900)
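A quick way to confirm that this row-wise approach keeps the tuples is sketched below. It wraps the zip in list(), on the assumption that some older pandas versions may not accept a bare zip iterator. The name `rows` is only for illustration.

```python
import pandas as pd

S_x = [80, 90, 100, 200, 300, 600, 800, 900, 1000, 1200]
S_y = [800, 1000, 1200, 450, 80, 100, 60, 300, 700, 900]
S_z = list(zip(S_x, S_y))

# Each element of rows is (x, y, (x, y)), i.e. one DataFrame row.
rows = list(zip(S_x, S_y, S_z))
frame = pd.DataFrame(
    rows, columns=["Recovered Data", "Percentage Error", "Zipped"]
)

# The first two columns are numeric; "Zipped" holds the original tuples.
print(frame.dtypes)
print(frame["Zipped"].tolist() == S_z)
```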
Upvotes: 0