Reputation: 1844
I have some data that I am trying to store in a pandas DataFrame, but it is showing up in a weird way. I know I am doing something wrong.
Can somebody help me find what's wrong?
Code
root@optstra:~# cat pandas_1.py
import pandas as pd
import numpy as np
numberOfRows = 1
SYMBOL = 'ABB'
volume_increasing = True
price_increase = True
OI_CHANGE = True
closedAboveYesterday = False
Above_22SMA = False
data_frame = pd.DataFrame(index=np.arange(0, numberOfRows), columns=('SYMBOL','Volume', 'Price', 'OI','OHLC','22SMA') )
for x in range(0,numberOfRows):
    data_frame.loc[x] = [{SYMBOL,volume_increasing,price_increase,OI_CHANGE,closedAboveYesterday,Above_22SMA} for n in range(6)]
print(data_frame)
Output
root@optstra:~# python3 pandas_1.py
SYMBOL Volume Price OI OHLC 22SMA
0 {False, True, ABB} {False, True, ABB} {False, True, ABB} {False, True, ABB} {False, True, ABB} {False, True, ABB}
If I change the line which writes the data to data frame as follows
for x in range(0,numberOfRows):
    data_frame.loc[x] = [(SYMBOL,volume_increasing,price_increase,OI_CHANGE,closedAboveYesterday,Above_22SMA) for n in range(6)]
Output changes to
root@optstra:~# python3 pandas_1.py
SYMBOL ... 22SMA
0 (ABB, True, True, True, False, False) ... (ABB, True, True, True, False, False)
Upvotes: 1
Views: 118
Reputation: 146
It seems to me you're not quite indexing the dataframe properly. You can either do this:
for x in range(0, numberOfRows):
    # .loc[row, column] assigns in a single step and avoids chained indexing
    data_frame.loc[x, 'SYMBOL'] = SYMBOL
    data_frame.loc[x, 'Volume'] = volume_increasing
    data_frame.loc[x, 'Price'] = price_increase
    data_frame.loc[x, 'OI'] = OI_CHANGE
    data_frame.loc[x, 'OHLC'] = closedAboveYesterday
    data_frame.loc[x, '22SMA'] = Above_22SMA
which will give you your desired output. Alternatively, you can use a dictionary and avoid the for loop altogether:
columns = ['SYMBOL', 'Volume', 'Price', 'OI', 'OHLC', '22SMA']
data = {'SYMBOL': 'ABB',
        'Volume': True,
        'Price': True,
        'OI': True,
        'OHLC': False,
        '22SMA': False}
data_frame = pd.DataFrame(data=data, index=np.arange(0, 1), columns=columns)
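If you end up needing more than one row, the dictionary idea should extend to a list of dictionaries, one per row. This is only a rough sketch: the second row ('XYZ' and its flags) is made up for illustration and is not from the question.

```python
import pandas as pd

columns = ['SYMBOL', 'Volume', 'Price', 'OI', 'OHLC', '22SMA']

# One dictionary per row; the second row is made-up illustration data.
rows = [
    {'SYMBOL': 'ABB', 'Volume': True, 'Price': True,
     'OI': True, 'OHLC': False, '22SMA': False},
    {'SYMBOL': 'XYZ', 'Volume': False, 'Price': True,
     'OI': False, 'OHLC': True, '22SMA': True},
]

# A list of dicts needs no explicit index; pandas numbers the rows 0, 1, ...
data_frame = pd.DataFrame(rows, columns=columns)
print(data_frame)
```

Passing columns here only fixes the column order; the keys of each dictionary already decide which value goes where.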
Upvotes: 0
Reputation: 863166
Updating an empty frame (e.g. using loc one row at a time) is inefficient.
It is better and faster to build a list with append and pass it to the DataFrame constructor:
data = []
for x in np.arange(numberOfRows):
    row = [SYMBOL,volume_increasing,price_increase,OI_CHANGE,closedAboveYesterday,Above_22SMA]
    data.append(row)
c = ('SYMBOL','Volume', 'Price', 'OI','OHLC','22SMA')
data_frame = pd.DataFrame(data, columns=c)
List comprehension alternative:
data = [[SYMBOL,volume_increasing,price_increase,OI_CHANGE,closedAboveYesterday,Above_22SMA] for x in np.arange(numberOfRows)]
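Put together as a standalone script, the list approach might look like the sketch below. The `signals` list is a hypothetical stand-in for wherever the per-symbol flags really come from, and the 'XYZ' row is invented just to show more than one row.

```python
import pandas as pd

# Hypothetical per-symbol flags, in the same order as the columns below.
signals = [
    ('ABB', True, True, True, False, False),
    ('XYZ', False, True, False, True, True),  # made-up second symbol
]

columns = ['SYMBOL', 'Volume', 'Price', 'OI', 'OHLC', '22SMA']

# Collect plain lists first, then build the frame once at the end.
data = [list(row) for row in signals]
data_frame = pd.DataFrame(data, columns=columns)

print(data_frame)
print(data_frame.dtypes)
```

A side benefit of building the frame in one go is that pandas can infer the column types from the data, so the flag columns should come out as bool instead of the object dtype you get when filling an empty frame.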
Upvotes: 2
Reputation: 588
Why don't you try this? I'm not sure if it's exactly what you're looking for, since you took that part out in your edit:
for x in range(0,numberOfRows):
    data_frame.loc[x] = [SYMBOL,volume_increasing,price_increase,OI_CHANGE,closedAboveYesterday,Above_22SMA]
Output:
SYMBOL Volume Price OI OHLC 22SMA
0 ABB True True True False False
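As for why the original version looked odd: as far as I can tell, the curly braces build a set, which drops the repeated True/False values, and the `for n in range(6)` part then puts that same set into each of the six columns. A small sketch without pandas (the variable names `values`, `as_set` and `cell_values` are just mine):

```python
SYMBOL = 'ABB'
values = (SYMBOL, True, True, True, False, False)

# A set keeps each distinct value only once, so six items collapse to three.
as_set = set(values)

# The comprehension from the question repeats the same object six times,
# which is why every column showed the whole collection.
cell_values = [as_set for n in range(6)]

print(len(values), len(as_set), len(cell_values))
```

The tuple version in the question keeps all six values but still repeats the whole tuple per column, so passing one flat list is what lines each value up with its own column.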
Upvotes: 2