jason
jason

Reputation: 4449

Google sheets APIv4 comment data to pandas df, not values

I'm getting the notes for ranges A1:E7 of a sheet. There are notes in B1, E1, D4 and B7.

result = gsheets.service.spreadsheets().get(spreadsheetId=key, fields="sheets/data/rowData/values/note").execute()
data=result['sheets'][0]['data'][0]

produces the following dictionary:

{u'rowData': [
        {u'values': [{}, {u'note': u'B1 notes'}, {}, {}, {u'note': u'E1 notes'}]},
        {}, 
        {}, 
        {u'values': [{}, {}, {}, {u'note': u'D4 notes'}]},
        {}, 
        {}, 
        {u'values': [{}, {u'note': u'B7 notes'}]}
        ]
}

Now how do I get this into a 7x5 dataframe that mimics the range A1:E7? I want to use '' for the blank cells.

Upvotes: 1

Views: 137

Answers (1)

Peter Leimbigler
Peter Leimbigler

Reputation: 11105

This solution is a bit hard to read, but it works in my tests. The first step is to build a temporary DataFrame from the top-level list of dicts, where each (possibly empty) dict represents a row in the original spreadsheet.

Assuming your dictionary is named d:

import pandas as pd
from pandas.io.json import json_normalize

temp = pd.DataFrame.from_dict(d['rowData']) 
temp
                                                     values
0  [{}, {'note': 'B1 notes'}, {}, {}, {'note': 'E1 notes'}]
1                                                       NaN
2                                                       NaN
3                        [{}, {}, {}, {'note': 'D4 notes'}]
4                                                       NaN
5                                                       NaN
6                                [{}, {'note': 'B7 notes'}]

# JSON-normalize each non-null row
res = (pd.DataFrame(temp['values'].map(lambda x:
                                       json_normalize(x).values.flatten() 
                                       if x is not np.nan else [np.nan]) 
                                  .values
                                  .tolist()
                   )
      ).fillna('')

res.index = range(1, res.shape[0]+1)
res.columns = list('ABCDE')

res
  A         B C         D         E
1    B1 notes              E1 notes
2                                  
3                                  
4                D4 notes          
5                                  
6                                  
7    B7 notes                      

Upvotes: 1

Related Questions