Mick

Reputation: 796

Create Pandas Dataframe with different sized columns

I need to create a dataframe and convert it to CSV so the output will look like this:

People,Age,Pets,Pet Age
Tom,24,Dog,5
Jim,30,Cat,10
Sally,21,Dog,1
     ,  ,Dog,3
     ,  ,Cat,15
     ,  ,Horse,10

As you can see, there are more pets than people; the relationships between the objects aren't important. When the CSV is opened in Excel, it should look like this:

 _______________________________
| People | Age | Pets | Pet Age |
|-------------------------------|
|  Tom   | 24  | Dog  |  5      |
|-------------------------------|
|  Jim   | 30  | Cat  |  10     |
|-------------------------------|
|  Sally | 21  | Dog  |  1      |
|-------------------------------|
|        |     | Dog  |  3      |
|-------------------------------|
|        |     | Cat  |  15     |
|-------------------------------|
|        |     | Horse|  10     |
---------------------------------

My code so far is:

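import pandas as pd

# The lists have different lengths (3 people, 6 pets), which triggers the error below
df = pd.DataFrame({
    "People": ["Tom", "Jim", "Sally"],
    "Age": [24, 30, 21],
    "Pets": ["Dog", "Cat", "Dog", "Dog", "Cat", "Horse"],
    "Pet Age": [5, 10, 1, 3, 15, 10]
})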

But it's giving me: ValueError: arrays must all be same length

Any help is much appreciated, thanks.

Upvotes: 2

Views: 789

Answers (1)

tdy

Reputation: 41347

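Instead of the DataFrame() constructor, you can use DataFrame.from_dict() with orient='index' and then transpose the result. Each key becomes a row, shorter rows are padded with missing values, and .T turns the rows back into columns: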

import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}

df = pd.DataFrame.from_dict(data, orient='index').T

#   People   Age   Pets  Pet Age
# 0    Tom    24    Dog        5
# 1    Jim    30    Cat       10
# 2  Sally    21    Dog        1
# 3   None  None    Dog        3
# 4   None  None    Cat       15
# 5   None  None  Horse       10
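
To see why the transpose is needed, it may help to look at the intermediate frame. The sketch below reuses the data from the question; the variable name wide is only for illustration. Note that after transposing, the columns will most likely all have object dtype, since each one passed through a mixed-type row.

```python
import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}

# orient='index' makes each dict key a row label; rows shorter than
# the longest list are padded with missing values on the right.
wide = pd.DataFrame.from_dict(data, orient='index')
print(wide.shape)  # 4 rows (one per key), 6 columns (longest list)

# Transposing swaps rows and columns, giving one column per key.
df = wide.T
print(df.shape)
print(df['People'].isna().sum())  # count of padded cells in 'People'
```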

To write it as CSV:

df.to_csv('pets.csv', index=False)

# People,Age,Pets,Pet Age
# Tom,24,Dog,5
# Jim,30,Cat,10
# Sally,21,Dog,1
# ,,Dog,3
# ,,Cat,15
# ,,Horse,10
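
As a possible alternative (not part of the approach above), each list can be wrapped in a pd.Series so the regular constructor aligns them by index and pads the shorter ones with NaN. This is a minimal sketch; the names df_alt and csv_text are illustrative. Because NaN forces Age to float, the cast to the nullable Int64 dtype is an assumption about how you want whole numbers to appear in the CSV.

```python
import pandas as pd

data = {
    'People': ['Tom', 'Jim', 'Sally'],
    'Age': [24, 30, 21],
    'Pets': ['Dog', 'Cat', 'Dog', 'Dog', 'Cat', 'Horse'],
    'Pet Age': [5, 10, 1, 3, 15, 10],
}

# Series of different lengths are aligned on their index (0..n-1),
# so shorter columns are filled with NaN instead of raising ValueError.
df_alt = pd.DataFrame({name: pd.Series(values) for name, values in data.items()})

# NaN turns 'Age' into float64; the nullable Int64 dtype keeps integers
# while still allowing missing values.
df_alt['Age'] = df_alt['Age'].astype('Int64')

# With no path given, to_csv returns the CSV text as a string.
csv_text = df_alt.to_csv(index=False)
print(csv_text)
```

Compared with from_dict(...).T, this keeps a per-column dtype rather than object everywhere, at the cost of the extra cast for integer columns that contain gaps.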

Upvotes: 5
