Adam Szymański
Adam Szymański

Reputation: 365

Python - Iterating through rows of excel files with Pandas

I have problem with iterating through rows in excel file.

import os
import pandas as pd
import json

for file in os.listdir("./python_files"):
    if file.endswith(".xlsx"):
        df = pd.read_excel(os.path.join("./python_files", file)) 
        CRD_Array = df.iloc[:,1].values
        for single_CRD in CRD_Array:           
            with open("{}.json".format(single_CRD), 'w') as json_file:
                row_iterator = 0
                data = {}
                data['header']=[]
                data['header'].append({'Organization CRD#':  '{}'.format(df.iloc[row_iterator,1])})
                json.dump(data, json_file)
                row_iterator = row_iterator + 1
  

          

How u can see my script is

  1. Reading files from python_files folder
  2. Then it's reading second column with CRD number which returns an array of CRDs
  3. Then it's looping CRD array
  4. In that loop It's trying to save .json file with "hedar" field

What I get in output now

File name 172081.json

{"header": [{"Organization CRD#": "172081"}

File name 534123.json

{"header": [{"Organization CRD#": "172081"}

File name 184521.json

{"header": [{"Organization CRD#": "172081"}

I looks like df.iloc [row_iterator, 1] isn't changing row property despite adding +1 for each loop repeat

Can somebody help?

Edit: Excel file example-

enter image description here

What I want to achieve

File name 172081.json

{"header": [{"Organization CRD#": "172081"}

File name 534123.json

{"header": [{"Organization CRD#": "534123"}

File name 184521.json

{"header": [{"Organization CRD#": "184521"} 

Upvotes: 0

Views: 289

Answers (1)

Agnes Kis
Agnes Kis

Reputation: 499

In the for loop you are increasing the row_iterator, but in the first line after open you always set it back to 0. You need to take that line out from the loop. Like this:

row_iterator = 0

for single_CRD in CRD_Array: 
          
    with open("{}.json".format(single_CRD), 'w') as json_file:
       data = {}
       data['header']=[]
       data['header'].append({'Organization CRD#': '{}'.format(df.iloc[row_iterator,1])})
       json.dump(data, json_file)
       row_iterator = row_iterator + 1

Upvotes: 2

Related Questions