Reputation: 1039
I'm trying to use pandas to create a ledger of activity. My object will have a pandas DataFrame that tracks the balances and transactions associated with that object.
I'm struggling with how to append single rows of data to that DataFrame as orders get associated with the object. The most common answer seems to be "only create the frame once you have all the data", but I can't do that. I want to be able to compute on the fly as I add new data.
Here's my associated code (which fails):
self.ledger = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-01')],
'qty' : [np.float64(startingBalance)],
'element_type' : [pd.Categorical(["startingBalance"])],
'avail_bal' : [np.float64(startingBalance)],
'firm_ind' : True,
'deleted_ind' : False,
'ord_id' : ["fooA"],
'parent_ord_id' : ["fooB"] },
columns=ledgerColumnList
)
self.ledger.iloc[-1] = dict({'entry_date' : ['1900-01-02'],
'qty' : [startingBalance],
'element_type' : ["startingBalance"],
'avail_bal' : [startingBalance],
'firm_ind' : [True],
'deleted_ind' : [False],
'ord_id' : ["foofa"],
'parent_ord_id' : ["foofb"] })
Here's the error I'm getting:
File "C:\Users\MyUser\My Documents\Workspace\myscript.py", line 135, in __init__
'parent_ord_id' : ["foofb"] })
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 117, in __setitem__
self._setitem_with_indexer(indexer, value)
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 492, in _setitem_with_indexer
setter(item, v)
File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 422, in setter
s._data = s._data.setitem(indexer=pi, value=v)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2843, in setitem
return self.apply('setitem', **kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2823, in apply
applied = getattr(b, f)(**kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 636, in setitem
values, _, value, _ = self._try_coerce_args(self.values, value)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 2066, in _try_coerce_args
raise TypeError
TypeError
Thoughts?
1) How can I do this in Pandas?
or
2) Is there something better I should be using that would give me the built-in calculation tools of pandas but would be more well-suited to my little-at-a-time data needs?
Upvotes: 4
Views: 1855
Reputation: 2253
You can also create a new DataFrame for the new data and then use concat.
For illustration purposes, let's take a simple dataframe:
import pandas as pd
df = pd.DataFrame({'a':[0,1,2],'b':[3,4,5]})
print(df)
>> a b
0 0 3
1 1 4
2 2 5
Let's say you have new data coming in, with values a=4 and b=7. Create a new DataFrame containing only the new data:
newresults = {'a':[4],'b':[7]}
_dfadd = pd.DataFrame(newresults)
print(_dfadd)
>> a b
0 4 7
Then concatenate:
df = pd.concat([df,_dfadd]).reset_index(drop=True)
print(df)
>> a b
0 0 3
1 1 4
2 2 5
3 4 7
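If rows arrive one at a time, the two steps above can be wrapped in a small helper. This is only a sketch: append_row is a made-up name, not a pandas function, and it assumes each new row comes in as a dict of scalars keyed by column name. Note that every call copies the whole frame, so it suits modest row counts.

```python
import pandas as pd


def append_row(df, row):
    """Return a new DataFrame with one extra row built from a dict of scalars.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Wrapping the dict in a list makes pandas treat it as a single record.
    new_row = pd.DataFrame([row], columns=df.columns)
    # ignore_index=True renumbers the result 0..n-1, like reset_index(drop=True).
    return pd.concat([df, new_row], ignore_index=True)


df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]})
df = append_row(df, {'a': 4, 'b': 7})
print(df)
```

The original frame is not modified; the caller has to rebind the name, as in df = append_row(df, ...).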
Upvotes: 2
Reputation: 6581
You can also use df.loc[] to add a row under a new index label:
df = pd.DataFrame({'A': [1,2,3,4], 'B': [5,6,7,8], 'C': [9,10,11,12]})
df
A B C
0 1 5 9
1 2 6 10
2 3 7 11
3 4 8 12
new_row = pd.DataFrame({'A': [35], 'B': [27], 'C': [43]})
new_row
A B C
0 35 27 43
df.loc[4] = new_row.loc[0]
df
A B C
0 1 5 9
1 2 6 10
2 3 7 11
3 4 8 12
4 35 27 43
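A slightly shorter variant of the same idea, as a sketch: assign a plain list to the next unused label. It assumes the index is the default 0..n-1 range, so that len(df) is a label that does not exist yet, and that the values are given in column order. The running_total name is only there to show that computations work right after the assignment.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})

# len(df) is the next unused label only while the index is 0..n-1.
# Assigning to a label that does not exist yet enlarges the frame by one row.
df.loc[len(df)] = [35, 27, 43]

# The new row takes part in calculations immediately.
running_total = df['A'].sum()
print(df)
print(running_total)
```

If the index is not a simple range (for example after rows were deleted), len(df) may collide with an existing label and overwrite that row instead of adding one.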
Upvotes: 3
Reputation: 15433
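One way is to use pandas.DataFrame.append() (note that it was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat is the replacement):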
self.ledger = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-01')],
'qty' : [np.float64(startingBalance)],
'element_type' : [pd.Categorical(["startingBalance"])],
'avail_bal' : [np.float64(startingBalance)],
'firm_ind' : [True],
'deleted_ind' : [False],
'ord_id' : ["fooA"],
'parent_ord_id' : ["fooB"] },
columns=ledgerColumnList)
df = pd.DataFrame(data={'entry_date' : [pd.Timestamp('1900-01-02')],
'qty' : [np.float64(startingBalance)],
'element_type' : ["startingBalance"],
'avail_bal' : [np.float64(startingBalance)],
'firm_ind' : [True],
'deleted_ind' : [False],
'ord_id' : ["foofa"],
'parent_ord_id' : ["foofb"] },
columns=ledgerColumnList)
# append() returns a new DataFrame, so assign the result back
self.ledger = self.ledger.append(df)
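On pandas versions where DataFrame.append() is no longer available, the same ledger update can be sketched with pd.concat. Several things here are assumptions and not taken from the question: the Ledger class and its add_entry method are invented for illustration, LEDGER_COLUMNS is a guess at what ledgerColumnList contains, and the starting balance of 100.0 is a placeholder value.

```python
import pandas as pd

# Assumed column order; the original ledgerColumnList is not shown.
LEDGER_COLUMNS = ['entry_date', 'qty', 'element_type', 'avail_bal',
                  'firm_ind', 'deleted_ind', 'ord_id', 'parent_ord_id']


class Ledger(object):
    """Hypothetical wrapper around the ledger frame, for illustration only."""

    def __init__(self, starting_balance):
        # One dict per row, scalars as values (no single-element lists).
        self.ledger = pd.DataFrame([{
            'entry_date': pd.Timestamp('1900-01-01'),
            'qty': float(starting_balance),
            'element_type': 'startingBalance',
            'avail_bal': float(starting_balance),
            'firm_ind': True,
            'deleted_ind': False,
            'ord_id': 'fooA',
            'parent_ord_id': 'fooB',
        }], columns=LEDGER_COLUMNS)

    def add_entry(self, entry):
        # Build a one-row frame, then rebind self.ledger to the combined result.
        row = pd.DataFrame([entry], columns=LEDGER_COLUMNS)
        self.ledger = pd.concat([self.ledger, row], ignore_index=True)


book = Ledger(100.0)  # placeholder starting balance
book.add_entry({
    'entry_date': pd.Timestamp('1900-01-02'),
    'qty': 100.0,
    'element_type': 'startingBalance',
    'avail_bal': 100.0,
    'firm_ind': True,
    'deleted_ind': False,
    'ord_id': 'foofa',
    'parent_ord_id': 'foofb',
})

# The ledger can be queried right after each entry is added.
total_qty = book.ledger['qty'].sum()
print(book.ledger)
print(total_qty)
```

Keeping the rebinding inside add_entry means callers cannot forget to assign the result back.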
Upvotes: 1