pod100
pod100

Reputation: 45

Why is my pandas dataframe only showing results for one dataset?

I have two datasets in CSV format that I want to loop through and show results for in one pandas dataframe. Right now my results in df are only showing for the B dataset but if I try to print "data" it shows my two CSV datasets that I have below.

The test CSV datasets are:

Date fastMA slowMA Ticker
01/01/2021 1 5 A
02/01/2021 2 5 A
03/01/2021 4 5 A
04/01/2021 5 5 A
05/01/2021 6 5 A
06/01/2021 7 5 A
07/01/2021 4.5 5 A
08/01/2021 4 5 A
09/01/2021 3 5 A
Date fastMA slowMA Ticker
01/01/2021 1 5 B
02/01/2021 2 5 B
03/01/2021 4 5 B
04/01/2021 5 5 B
05/01/2021 6 5 B
06/01/2021 7 5 B
07/01/2021 4.5 5 B
08/01/2021 4 5 B
09/01/2021 3 5 B

ticker = csv.reader(open('gdrive/My Drive/data/test/test_tickers/test_tickers.csv'))
for symbols in ticker:

  tickers = symbols
  data = pd.read_csv('gdrive/My Drive/data/test/test_data/{}.csv'.format(symbols))

  i = 1
  j = len(data)
  in_trade = 0
  dates = []
  symbol = []

  while i < j:
    long = data["fastMA"][i] > data["slowMA"][i]
    close = data["fastMA"][i] < data["slowMA"][i] 

    if long and in_trade == 0:
      in_trade = 1
      symbol.append(data["Ticker"][i])
      dates.append(data["Date"][i])
      i += 1
    elif long and in_trade == 1:
      in_trade = 1
      i += 1
    elif close and in_trade == 1:
      in_trade = 0
      i += 1
    else:
      i += 1
d = {'Date':dates,'Ticker':symbol}
d
df = pd.DataFrame(d, columns=('Date','Ticker'))
df

Do I need to add in something else?

EDIT:

The output I'm getting now in df:

Date Ticker
05/01/2021 B

My desired output in df:

Date Ticker
05/01/2021 A
05/01/2021 B

Upvotes: 3

Views: 222

Answers (1)

Martin Evans
Martin Evans

Reputation: 46779

For each ticker row that is read in, you were blanking dates and symbol, they need to be outside of your loop. If you restructure a bit, you should get your desired output:

import csv
import pandas as pd

with open('test_tickers.csv', newline='') as f_tickers:    
    csv_tickers = csv.reader(f_tickers)
    dates = []
    symbols = []

    for symbol in csv_tickers:
        data = pd.read_csv(f'{symbol[0]}.csv')

        i = 1
        j = len(data)
        in_trade = 0

        while i < j:
            long = data["fastMA"][i] > data["slowMA"][i]
            close = data["fastMA"][i] < data["slowMA"][i] 

            if long and in_trade == 0:
                in_trade = 1
                symbols.append(data["Ticker"][i])
                dates.append(data["Date"][i])
            elif long and in_trade == 1:
                in_trade = 1
            elif close and in_trade == 1:
              in_trade = 0
            
            i += 1

d = {'Date' : dates, 'Ticker' : symbols}
df = pd.DataFrame(d, columns=('Date','Ticker'))

print(df)

Giving:

         Date Ticker
0  05/01/2021      A
1  05/01/2021      B

(This assumes test_tickers.csv contains two lines containing A and B and the A.csv and B.csv contains the data in your question)

Upvotes: 1

Related Questions