Kumara Prasanna J

Reputation: 23

How to append dataframes inside a for loop in Python

I have been trying to append DataFrames inside a for loop. The loop itself runs fine, but the DataFrames are not being appended: after each iteration b only holds the data for the current symbol. Any help would be much appreciated.

import json
import urllib.request

import pandas as pd

symbols = ['MSFT', 'GOOGL', 'AAPL']
apikey = 'CR*****YDA'
for s in symbols:
    print(s)
    url = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=%s&apikey=%s" % (s, apikey)
    stockdata = urllib.request.urlopen(url)
    data = stockdata.read().decode()
    js = json.loads(data)
    a = pd.DataFrame(js['Time Series (Daily)']).T
    b = pd.DataFrame()
    print(b)
    b = b.append(a, ignore_index=True)
    print(b)
    print("loop successful")

print("run successfull")

Output:

MSFT
Empty DataFrame
Columns: []
Index: []
     1. open   2. high    3. low  4. close  5. volume
0   107.4600  107.9000  105.9100  107.7100   37427587
1   105.0000  106.6250  104.7600  106.1200   28393015
..       ...       ...       ...       ...        ...
99  109.2700  109.6400  108.5100  109.6000   19662331

[100 rows x 5 columns]
loop successful
GOOGL
Empty DataFrame
Columns: []
Index: []
      1. open    2. high     3. low   4. close 5. volume
0   1108.5900  1118.0000  1099.2800  1107.3000   2244569
1   1087.9900  1100.7000  1083.2600  1099.1200   1244801
..        ...        ...        ...        ...       ...
99  1244.1400  1257.8700  1240.6800  1256.2700   1428992

[100 rows x 5 columns]
loop successful
AAPL
Empty DataFrame
Columns: []
Index: []
     1. open   2. high    3. low  4. close 5. volume
0   157.5000  157.8800  155.9806  156.8200  33751023
1   154.2000  157.6600  153.2600  155.8600  29821160
..       ...       ...       ...       ...       ...
99  217.1500  218.7400  216.3300  217.9400  20525117

[100 rows x 5 columns]
loop successful
run successfull

Upvotes: 2

Views: 5359

Answers (3)

SafeDev

Reputation: 672

The problem is that b is overwritten with an empty DataFrame on every iteration, so each append starts from scratch. Define b as a DataFrame once, before the for loop.

symbols = ['MSFT', 'GOOGL', 'AAPL']
apikey = 'CR*****YDA'
b = pd.DataFrame()
for s in symbols:
  print(s)
  url = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=%s&apikey=%s"  % (s, apikey)
  stockdata = urllib.request.urlopen(url)
  data = stockdata.read().decode()
  js = json.loads(data)
  a = pd.DataFrame(js['Time Series (Daily)']).T
  print(b)
  b = b.append(a, ignore_index=True)
  print(b)
  print("loop successful")

print("run successfull")
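
One caveat for readers on newer pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the b.append(...) line raises AttributeError there. A minimal sketch of the same accumulate-before-the-loop idea using pd.concat follows. The frames dict is a hypothetical stand-in for the per-symbol DataFrames built from the API response, and its values are made up.

```python
import pandas as pd

# Hypothetical stand-ins for the frame `a` built from each API response.
frames = {
    "MSFT": pd.DataFrame({"4. close": ["107.7100", "106.1200"]}),
    "GOOGL": pd.DataFrame({"4. close": ["1107.3000", "1099.1200"]}),
}

b = pd.DataFrame()  # created once, before the loop
for s, a in frames.items():
    # pd.concat returns a new frame, so rebind b just as with b.append(...)
    b = pd.concat([b, a], ignore_index=True)

print(len(b))  # rows from both symbols are present
```

This keeps the structure of the loop above unchanged; only the append call is swapped out.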

Upvotes: 1

jpp

Reputation: 164623

The immediate problem is that you define b as an empty dataframe within each iteration of your for loop. Instead, define it once before your for loop begins:

b = pd.DataFrame()
for s in symbols:
    # some code
    a = pd.DataFrame(js['Time Series (Daily)']).T
    b = b.append(a, ignore_index=True)

But appending dataframes in a loop is not recommended. Each append copies all the data accumulated so far into a new object, so the total work grows with every iteration. The docs recommend using pd.concat on an iterable of dataframes:

list_of_dfs = []
for s in symbols:
    # some code
    list_of_dfs.append(pd.DataFrame(js['Time Series (Daily)']).T)

b = pd.concat(list_of_dfs, ignore_index=True)
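
For a version that runs without an API key, here is a rough sketch of the list-then-concat pattern on made-up data. The sample dict is hypothetical and only mimics the shape of js['Time Series (Daily)']; the float conversion assumes the API delivers values as strings. Passing keys= instead of ignore_index=True is an optional variation that records which symbol each row came from.

```python
import pandas as pd

# Hypothetical payloads shaped like js['Time Series (Daily)'];
# dates and prices are invented for illustration.
sample = {
    "MSFT": {
        "2019-01-02": {"1. open": "99.5", "4. close": "101.1"},
        "2019-01-03": {"1. open": "100.1", "4. close": "97.4"},
    },
    "GOOGL": {
        "2019-01-02": {"1. open": "1027.2", "4. close": "1054.7"},
        "2019-01-03": {"1. open": "1050.7", "4. close": "1025.5"},
    },
}

list_of_dfs = []
for s in sample:
    df = pd.DataFrame(sample[s]).T  # rows are dates, columns are fields
    df = df.astype(float)           # assumed: values arrive as strings
    list_of_dfs.append(df)

# A single concat at the end; keys= labels each block with its symbol.
combined = pd.concat(list_of_dfs, keys=list(sample), names=["symbol", "date"])
print(combined.loc["MSFT"])
```

With the symbol kept in the index, combined.loc["GOOGL"] pulls one ticker's rows back out, which ignore_index=True would not allow.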

Upvotes: 2

Tim

Reputation: 3407

Moving the following code

b = pd.DataFrame()

to before the loop would fix your problem. Right now, b is re-initialized as an empty DataFrame on every iteration.

Upvotes: 0
