Reputation: 796
I need to create a dataframe and convert it to CSV so the output will look like this:
People,Age,Pets,Pet Age
Tom,24,Dog,5
Jim,30,Cat,10
Sally,21,Dog,1
, ,Dog,3
, ,Cat,15
, ,Horse,10
As you can see, there are more pets than people; the relationships between the objects aren't important. When the CSV is opened in Excel, the output should look like:
__________________________________
| People | Age | Pets  | Pet Age |
|--------------------------------|
| Tom    | 24  | Dog   | 5       |
|--------------------------------|
| Jim    | 30  | Cat   | 10      |
|--------------------------------|
| Sally  | 21  | Dog   | 1       |
|--------------------------------|
|        |     | Dog   | 3       |
|--------------------------------|
|        |     | Cat   | 15      |
|--------------------------------|
|        |     | Horse | 10      |
----------------------------------
My code so far is:
import pandas as pd

df = pd.DataFrame({
    "People": ["Tom", "Jim", "Sally"],
    "Age": [24, 30, 21],
    "Pets": ["Dog", "Cat", "Dog", "Dog", "Cat", "Horse"],
    "Pet Age": [5, 10, 1, 3, 15, 10]
})
But it's giving me: ValueError: arrays must all be same length
Any help is much appreciated, thanks.
Upvotes: 2
Views: 789
Reputation: 41347
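Instead of the DataFrame() constructor, you can use DataFrame.from_dict() with orient='index'. That treats each key as a row and pads the shorter rows with None, so transpose the result with .T to turn the keys back into columns: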
import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}
# Each key becomes a row (short rows are padded), then .T flips rows to columns
df = pd.DataFrame.from_dict(data, orient='index').T
#    People   Age   Pets  Pet Age
# 0     Tom    24    Dog        5
# 1     Jim    30    Cat       10
# 2   Sally    21    Dog        1
# 3    None  None    Dog        3
# 4    None  None    Cat       15
# 5    None  None  Horse       10
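If numeric dtypes matter later, one alternative (not part of the approach above, so treat it as a sketch) is to wrap each list in a pd.Series before building the frame. pandas aligns the Series on their index and pads the shorter ones with missing values. The name df_alt is only an illustrative choice:

```python
import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}

# Each list becomes a Series; the DataFrame constructor aligns them on
# their integer index and fills the gaps in shorter columns.
df_alt = pd.DataFrame({name: pd.Series(values) for name, values in data.items()})

# Padding makes the Age column float64 (24.0, 30.0, ...). The nullable
# Int64 dtype keeps whole numbers alongside the missing values.
df_alt['Age'] = df_alt['Age'].astype('Int64')
```

With this variant the empty cells are missing-value markers rather than None, and to_csv() still writes them as empty fields by default.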
To write as CSV:
df.to_csv('pets.csv', index=False)
# People,Age,Pets,Pet Age
# Tom,24,Dog,5
# Jim,30,Cat,10
# Sally,21,Dog,1
# ,,Dog,3
# ,,Cat,15
# ,,Horse,10
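To check the CSV text without creating a file, to_csv() can be called without a path, in which case it returns the CSV as a string. A small sketch that rebuilds the same df as above; csv_text and lines are illustrative names:

```python
import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}
df = pd.DataFrame.from_dict(data, orient='index').T

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df.to_csv(index=False)

# One header line plus one line per row; padded cells are empty fields.
lines = csv_text.splitlines()
for line in lines:
    print(line)
```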
Upvotes: 5