Jim Evil Chippy Perry
Jim Evil Chippy Perry

Reputation: 51

Python 3 - write from constructed Dictionary to excel using pandas

Hello you Pythonic lovers.

I have run into quite an interesting little issue that I have not been able to resolve due to my inexperience. I am constructing a dictionary in python based on a set of answers in a graph database and I have run into an interesting dilemma. (I am running Python 3

When all is said and done, I receive the following example output in my excel file (this is from column 0 , every entry is a row:

ACTUAL EXCEL FORMAT:

0/{'RecordNo': 0}
1/{'Dept': 'DeptName'}
2/{'Option 1': 'Option1Value'}
3/{'Option 2': 'Option2Value'}
4/{'Question1': 'Answer1'}
5/{'Question2': 'Answer2'}
6/{'Question3': 'Answer3'}

etc..

Expected EXCEL format:

0/Dept, Option 1, Option 2, Question 1, Question 2, Question 3
1/DeptName, Option1Value, Option2Value, Answer1, Answer2, Answer3

The keys of the dictionary are supposed to be the headers and the values, the contents of every row, but for some reason, it's writing it out as the key and value when I use the following output code:

EXCEL WRITER CODE:

ReportDF = pd.DataFrame.from_dict(DomainDict)
WriteMe = pd.ExcelWriter('Filname.xlsx')
ReportDF.to_excel(WriteMe, 'Sheet1')
try:
    WriteMe.save()
    print('Save completed')
except:
    print('Error in saving file')

To build the dictionary, I use the following code: EDIT (Removed sub-addition of dictionary entries, as it is the same and will be streamlined into a function call once the primary works).

DICTIONARY PREP CODE:

for Dept in Depts:
ABBR = Dept['dept.ABBR']
#print('Department: ' + ABBR)
Forests = getForestDomains(Quarter,ABBR)
for Forest in Forests:
    DictEntryList = []
    DictEntryList.append({'RecordNo': DomainCount})
    DictEntryList.append({'Dept': ABBR})
    ForestName = Forest['d.DomainName']
    DictEntryList.append({'Forest ': ForestName})
    DictEntryList.append({'Domain': ''})
    AnswerEntryList = []

    QList = getApplicableQuestions(str(SA))
    for Question in QList:
        FAnswer = ''
        QDesc = Question['Question']
        AnswerResult = getAnswerOfQuestionForDomainForQuarter(QDesc, ForestName, Quarter)
        if AnswerResult:
            for A in AnswerResult:
                if(str(A['Answer']) != 'None'):
                    if(isinstance(A, numbers.Number)):    
                        FAnswer = str(int(A['Answer']))
                    else:
                        FAnswer = str(A['Answer'])
                else:
                    FAnswer = 'Unknown'
        else:
            print('GOBBLEGOBBLE')
            FAnswer = 'Not recorded'
        AnswerEntryList.append({QDesc: FAnswer})

    for Entry in AnswerEntryList:
        DictEntryList.append(Entry)

    DomainDict[DomainCount] = DictEntryList
    DomainCount+= 1

print('Ready to export')

If anyone could assist me in getting my data to export into the proper format within excel, it would be greatly appreciated.

EDIT: Print of the final dictionary to be exported to excel:

{0: [{'RecordNo': 0}, {'Dept': 'Clothing'}, {'Forest ': 'my.forest'}, {'Domain': 'my.domain'}, {'Question1': 'Answer1'}, {'Question2': 'Answer2'}, {'Question3': 'Answer3'}], 1: [{...}]}

Upvotes: 0

Views: 1229

Answers (1)

cmaher
cmaher

Reputation: 5215

The problem in writing to Excel is due to the fact that the values in the final dictionary are lists of dictionaries themselves, so it may be that you want to take a closer look at how you're building the dictionary. In its current format, passing the final dictionary to pd.DataFrame.from_dict results in a DataFrame that looks like this:

#                             0
# 0            {u'RecordNo': 0}
# 1      {u'Dept': u'Clothing'}
# 2  {u'Forest ': u'my.forest'}
# 3   {u'Domain': u'my.domain'}
# 4  {u'Question1': u'Answer1'}
# 5  {u'Question2': u'Answer2'}
# 6  {u'Question3': u'Answer3'}

So each value in the DataFrame row is itself a dict. To fix this, you can flatten/merge the inner dictionaries in your final dict before passing it into a DataFrame:

modified_dict = {k:{x.keys()[0]:x.values()[0] for x in v} for k, v in final_dict.iteritems()}
# {0: {'Domain': 'my.domain', 'RecordNo': 0, 'Dept': 'Clothing', 'Question1': 'Answer1', 'Question3': 'Answer3', 'Question2': 'Answer2', 'Forest ': 'my.forest'}}

Then, you can pass this dict into a Pandas object, with the additional argument orient=index (so that the DataFrame uses the keys in the inner dicts as columns) to get a DataFrame that looks like this:

ReportDF = pd.DataFrame.from_dict(modified_dict, orient='index')
#       Domain  RecordNo      Dept Question1 Question3 Question2    Forest 
# 0  my.domain         0  Clothing   Answer1   Answer3   Answer2  my.forest

From there, you can write to Excel as you had indicated.

Edit: I can't test this without sample data, but from the look of it you can simplify your Dictionary Prep by building a dict instead of a list of dicts.

for Dept in Depts:
    ABBR = Dept['dept.ABBR']
    Forests = getForestDomains(Quarter,ABBR)
    for Forest in Forests:
        DictEntry = {}
        DictEntry['RecordNo'] = DomainCount
        DictEntry['Dept'] = ABBR
        DictEntry['Forest '] = Forest['d.DomainName']
        DictEntry['Domain'] = ''

        QList = getApplicableQuestions(str(SA))
        for Question in QList:
            # save yourself a line of code and make 'Not recorded' the default value
            FAnswer = 'Not recorded'
            QDesc = Question['Question']
            AnswerResult = getAnswerOfQuestionForDomainForQuarter(QDesc, ForestName, Quarter)
            if AnswerResult:
                for A in AnswerResult:
                    # don't convert None to string and then test for inequality to 'None'
                    # if statements evaluate None as False already
                    if A['Answer']:
                        if isinstance(A, numbers.Number):    
                            FAnswer = str(int(A['Answer']))
                        else:
                            FAnswer = str(A['Answer'])
                    else:
                        FAnswer = 'Unknown'
            else:
                print('GOBBLEGOBBLE')
            DictEntry[QDesc] = FAnswer

        DomainDict[DomainCount] = DictEntry
        DomainCount += 1

print('Ready to export')

Upvotes: 1

Related Questions