AmiB

Reputation: 41

Pandas DataFrames in a loop, df.to_csv()

I am trying to write DataFrames to a CSV file from a loop, where each row comes from one DataFrame. I am running into difficulties because the headers are not the same for every DataFrame: some have values for all dates and others do not.

I am writing the df using a function similar to this one:

def write_csv():
    for name, df in data.items():
        df.to_csv(meal+'mydf.csv', mode='a')

This creates one CSV per meal (lunch and dinner). Each DataFrame looks similar to this:

Name    Meal    22-03-18    23-03-18    25-03-18        
Peter   Lunch   12          10          9

or:

Name    Meal    22-03-18    23-03-18    25-03-18        
Peter   Dinner  12          10          9

I tried using pandas concatenation (pd.concat), but I cannot find a way to fit it into the function. My goal is a header containing all the dates (as in the desired output below), regardless of whether each DataFrame appended to the CSV has values for every date.

Actual output:
Name    Meal    22-03-18    23-03-18    25-03-18        
Peter   Lunch   12          10          9       
Mathew  Lunch   12          11          11         10     9
Ruth    Lunch   9           9           8          9    
Anna    Lunch   10          12          11         13     10


Output with headers:
Name    Meal    22-03-18    23-03-18    25-03-18           
Peter   Lunch   12          10          9       
Name    Meal    21-03-18    22-03-18    23-03-18    24-03-18    25-03-18
Mathew  Lunch   12          11          11          10          9
Name    Meal    21-03-18    22-03-18    24-03-18    25-03-18    
Ruth    Lunch   9           9           8           9   
Name    Meal    21-03-18    22-03-18    23-03-18    24-03-18    25-03-18
Anna    Lunch   10          12          11          13          10



Output desired:
Name    Meal    21-03-18    22-03-18    23-03-18    24-03-18    25-03-18
Peter   Lunch               12          10                      9
Mathew  Lunch   12          11          11          10          9
Ruth    Lunch   9           9                       8           9
Anna    Lunch   10          12          11          13          10

Upvotes: 1

Views: 15753

Answers (3)

AmiB

Reputation: 41

Using the following logic (from @saucoide's answer), I get my desired output.

It was necessary to create an empty DataFrame, populate it, then group by meal and write each group to a CSV.

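main_df = pd.DataFrame()

# Stack every DataFrame; dates missing from a frame are filled with NaN
for name, df in data.items():
    main_df = pd.concat([main_df, df])

# Split the combined frame by meal and write each group to its own CSV
main_df_group = main_df.groupby('Meal')
for name, group in main_df_group:
    mydf_group = group

    mydf_group.to_csv(meal+ ...)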
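
For reference, here is a self-contained sketch of the same concat-then-group idea. The sample frames, the invented Dinner row, the helper name `combine_by_meal`, the temporary output directory, and the `<meal>_mydf.csv` file names are all assumptions made for illustration; none of them come from the original code. The sketch also sorts the date columns chronologically, which `pd.concat` does not do on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the `data` dict from the question.
data = {
    "Peter": pd.DataFrame([{"Name": "Peter", "Meal": "Lunch",
                            "22-03-18": 12, "23-03-18": 10, "25-03-18": 9}]),
    "Ruth": pd.DataFrame([{"Name": "Ruth", "Meal": "Lunch",
                           "21-03-18": 9, "22-03-18": 9,
                           "24-03-18": 8, "25-03-18": 9}]),
    # Invented Dinner row, only here so that the groupby yields two groups.
    "Anna": pd.DataFrame([{"Name": "Anna", "Meal": "Dinner",
                           "21-03-18": 10, "23-03-18": 11}]),
}

ID_COLS = ["Name", "Meal"]


def combine_by_meal(frames):
    """Stack the frames, order date columns chronologically, split by meal."""
    combined = pd.concat(list(frames), ignore_index=True, sort=False)
    date_cols = sorted(
        (c for c in combined.columns if c not in ID_COLS),
        key=lambda c: pd.to_datetime(c, format="%d-%m-%y"),
    )
    combined = combined[ID_COLS + date_cols]
    return {meal: group for meal, group in combined.groupby("Meal")}


groups = combine_by_meal(data.values())

# Write one CSV per meal into a scratch directory (assumed location).
out_dir = Path(tempfile.mkdtemp())
for meal, group in groups.items():
    group.to_csv(out_dir / f"{meal}_mydf.csv", index=False)
```

Dates that a frame lacks become NaN in the combined frame and are written as empty fields. Because of those NaN values, the affected integer columns are stored as floats.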

Upvotes: 0

LogCapy

Reputation: 467

You could pass header=False to to_csv on every iteration after the first, so the header is written only once.

def write_csv():
    for i, (name, df) in enumerate(data.items()):
        df.to_csv('mydf.csv', mode='a', header=(i==0))
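
Note that `header=(i==0)` only suppresses the repeated header lines; each row is still written with its own frame's columns, so values can land under the wrong date. One possible way around that, sketched here under assumptions, is to compute the full column list first and `reindex` every frame to it before appending. The helper name `append_aligned`, the sample frames, and the in-memory `io.StringIO` buffer used in place of `'mydf.csv'` are illustrative choices, not part of the original answer.

```python
import io

import pandas as pd

ID_COLS = ["Name", "Meal"]


def append_aligned(frames, target):
    """Write all frames to `target` under one shared, date-sorted header."""
    frames = list(frames)
    # Union of every date column seen in any frame, in chronological order.
    date_cols = sorted(
        {c for df in frames for c in df.columns if c not in ID_COLS},
        key=lambda c: pd.to_datetime(c, format="%d-%m-%y"),
    )
    columns = ID_COLS + date_cols
    for i, df in enumerate(frames):
        # reindex adds the missing date columns as NaN (empty CSV fields).
        df.reindex(columns=columns).to_csv(target, header=(i == 0), index=False)


# Hypothetical sample frames based on rows shown in the question.
peter = pd.DataFrame([{"Name": "Peter", "Meal": "Lunch",
                       "22-03-18": 12, "23-03-18": 10, "25-03-18": 9}])
ruth = pd.DataFrame([{"Name": "Ruth", "Meal": "Lunch",
                      "21-03-18": 9, "22-03-18": 9,
                      "24-03-18": 8, "25-03-18": 9}])

buffer = io.StringIO()
append_aligned([peter, ruth], buffer)
print(buffer.getvalue())
```

To write to a real file instead, an open file handle can be passed as `target`.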

Upvotes: 2

saucoide

Reputation: 11

Can you try something like this? I am not sure it is exactly what you want, but it will concatenate DataFrames whose columns do not fully overlap.

def write_csv():
    df2 = pd.DataFrame()
    for name, df in data.items():
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
        df2 = pd.concat([df2, df])
    df2.to_csv('mydf.csv')

Upvotes: 1
