Reputation: 115
I'm working with an API request that allows you to pass certain fields. Sometimes these fields do not contain any information, and in that case the response skips them. For example:
Get request for A, B, C.
1. A returns 1,2,3,4
2. B returns 1,3,4
3. C returns 1,2,4
My issue is that I need to save all these different responses into 1 CSV, create the columns dynamically based on the results that I get from the call and populate the CSV with the results.
Also, I want to write to the CSV each response in a new row into the appropriate column, making my CSV look like this:
1. => 1,2,3,4
2. => 1,NULL,3,4
3. => 1,2,NULL,4
Any suggestions on how to achieve this? This is what I have so far:
def get_ad_insights(**kwargs):
    ad_ids = [A, B, C]
    df_headers_saved = []
    for i in ad_ids:
        # Calling API and saving results in variable:
        insights = Ad(i).get_insights(
            params=kwargs['params'],
            fields=kwargs['fields'])
        # Creating a DataFrame:
        df = pd.DataFrame(insights)
        # Saving headers in variable:
        df_headers = df.columns.to_list()
        print(df_headers)
        # Saving DataFrame to a CSV:
        if not os.path.isfile(kwargs['local_filepath']):
            df.to_csv(kwargs['local_filepath'], index=False)
            df_headers_saved = df_headers
            print('created')
            time.sleep(1)
        else:
            if df_headers == df_headers_saved:
                df.to_csv(kwargs['local_filepath'], index=False, mode='a', header=False)
                time.sleep(1)
                print('appended no update to header')
            elif len(df_headers_saved) > len(df_headers):
                df.to_csv(kwargs['local_filepath'], index=False, mode='a', header=False)
                time.sleep(1)
                print('appended no update to header, update based on column')
            else:
                df.to_csv(kwargs['local_filepath'], index=False, mode='a', header=False)
                df_headers_saved = df_headers
                time.sleep(1)
                print('appended update to header')
Just a note: the columns from the request always come in the same order, so sorting is not an issue. The issue is how to populate the appropriate columns.
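For what it's worth, here is what I'm picturing: if all the responses could be collected into one frame before writing, pandas would align the columns and fill the gaps for me. A minimal sketch with made-up field names (not my real API fields):

```python
import pandas as pd

# Hypothetical parsed responses; each dict may be missing some fields.
responses = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 1, "C": 3, "D": 4},   # "B" missing
    {"A": 1, "B": 2, "D": 4},   # "C" missing
]

# pandas builds the column set from the union of keys and fills gaps with NaN.
df = pd.DataFrame(responses)

# Replace NaN with the literal string NULL and write everything in one go.
df.fillna("NULL").to_csv("insights.csv", index=False)
```

But I'm not sure how to get from the raw API responses to something like this.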
Thank you in advance!
Upvotes: 0
Views: 146
Reputation: 3158
Your response seems to contain objects of type AdsInsights. One way is to remove this object marker and then parse the rest as JSON. Once you have the dictionaries, pandas will take care of alignment even if the keys differ between requests. A sample example:
import json
import pandas as pd
res_data1='[<AdsInsights> { "account_name": "Fob","date_stop": "2021-06-16", "inline_link_clicks": "7", "spend": "14.73" }]'
res_data2='[<AdsInsights> { "account_name1": "Fob","date_stop": "2021-06-16", "inline_link_clicks": "7", "spend": "14.73" }]'
res_data3='[<AdsInsights> { "account_name": "Fob", "action_type": "1", "date_start": "2021-06-16", "date_stop": "2021-06-16", "inline_link_clicks": "7", "spend": "14.73" }]'
res_list = [res_data1, res_data2, res_data3]
df = pd.DataFrame()
for each_response in res_list:
    # Strip the object marker so the string becomes valid JSON:
    a = json.loads(each_response.replace('<AdsInsights>', ""))
    for each_entry in a:
        # DataFrame.append was removed in pandas 2.0; pd.concat is the
        # current equivalent:
        df = pd.concat([df, pd.DataFrame([each_entry])], ignore_index=True)
df
As the response is not the same every time (i.e. some keys are missing), the gaps will be filled with NaN, which can then be replaced with 'NULL' or any other value:
import numpy as np
df = df.replace({np.nan: 'NULL'})
or
df = df.replace({np.nan: None})
Note that replace() returns a new frame rather than modifying in place, so reassign the result (and use np.nan, since np.NaN was removed in NumPy 2.0).
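Putting it together, the combined frame can then be written once to CSV with the NULLs in place (the file path here is just a placeholder):

```python
import numpy as np
import pandas as pd

# Two already-parsed responses with different key sets:
df = pd.DataFrame([
    {"account_name": "Fob", "spend": "14.73"},
    {"account_name": "Fob", "action_type": "1", "spend": "14.73"},
])

# replace() returns a new frame, so reassign the result:
df = df.replace({np.nan: "NULL"})

# One write; the columns are already aligned across all rows.
df.to_csv("insights.csv", index=False)
```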
Upvotes: 1