Iterate over list (2 dataframes in 1 list)

Question

I am importing 2 data frames at the same time:

import pandas as pd
import numpy as np
import time
import glob
import os

msci_folder = 'C:/Users/Mike/Desktop/docs'
mscifile = glob.glob(msci_folder + "/*.csv")

dfs = []
for file in mscifile:
    df = pd.read_csv(file)
    dfs.append(df)

Now I want to apply the code I was using on each individual data frame, but I get this error:

AttributeError: 'list' object has no attribute 'loc'

This is what I tried:

for i, df in enumerate(dfs):
    dfs = dfs.loc[dfs['URI'] == '/ID']
    dfs.TIMESTAMP = dfs.TIMESTAMP.apply(lambda x: '%.3f' % x)
    dfs.insert(0, 'Date', 0)
    dfs['Date'] = [x[:8] for x in dfs['TIMESTAMP']]
    dfs.to_csv('C:/Users/Mike/Desktop/docs/test.csv', index=False)

Quang Hoang · Accepted Answer

In your second for loop:

for i, df in enumerate(dfs):
    dfs = dfs.loc[dfs['URI'] == '/ID']
    dfs.TIMESTAMP = dfs.TIMESTAMP.apply(lambda x: '%.3f' % x)
    dfs.insert(0, 'Date', 0)
    dfs['Date'] = [x[:8] for x in dfs['TIMESTAMP']]
    dfs.to_csv('C:/Users/Mike/Desktop/docs/test.csv', index=False)

you used dfs, which is the list you created earlier rather than a DataFrame, and a list has no .loc attribute, hence the error. Replace every instance of dfs inside that for loop with df, the loop variable that holds each DataFrame:

for i, df in enumerate(dfs):
    # dfs = dfs.loc[dfs['URI'] == '/ID']
    df = df.loc[df['URI'] == '/ID']

    # ... and so on

    # save to different files
    df.to_csv(f'C:/Users/Mike/Desktop/docs/test_{i}.csv', index=False)
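For reference, here is a minimal sketch of what the whole corrected loop might look like once every dfs has been replaced with df. It is not your exact setup: the two small frames below are made-up sample data standing in for your CSV files, process_frame is a hypothetical helper name, and the output goes to a temporary folder instead of your docs folder. The .copy() after the filter is my own addition, so that the later column assignments work on an independent frame rather than on a slice of the original.

```python
import os
import tempfile

import pandas as pd


def process_frame(df):
    """Keep rows where URI == '/ID' and derive a Date column from TIMESTAMP."""
    # .copy() gives an independent frame, so the assignments below
    # do not operate on a slice of the original DataFrame
    out = df.loc[df['URI'] == '/ID'].copy()
    # Format the numeric timestamp as a string with three decimals
    out['TIMESTAMP'] = out['TIMESTAMP'].apply(lambda x: '%.3f' % x)
    # Insert Date as the first column: the first 8 characters of the timestamp
    out.insert(0, 'Date', [x[:8] for x in out['TIMESTAMP']])
    return out


# Hypothetical sample data standing in for the two CSV files
dfs = [
    pd.DataFrame({'URI': ['/ID', '/other'],
                  'TIMESTAMP': [20200101123456.0, 20200102123456.0]}),
    pd.DataFrame({'URI': ['/ID', '/ID'],
                  'TIMESTAMP': [20200103010203.5, 20200104010203.25]}),
]

# Temporary folder used here only so the sketch runs anywhere
out_dir = tempfile.mkdtemp()

results = []
for i, df in enumerate(dfs):
    result = process_frame(df)
    # One output file per input frame, so nothing gets overwritten
    result.to_csv(os.path.join(out_dir, f'test_{i}.csv'), index=False)
    results.append(result)
```

Putting the per-frame steps in a function also makes it harder to mix up the list and the loop variable again, since the function only ever sees a single DataFrame.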
