Reputation:
I'm trying to create a simple backtester on python which allows me to assess the performance of a trading strategy. As part of the backtester, I need to record the transactions which occur. E.g.,
Day Stock Action Quantity
1 AAPL BUY 20
2 CSCO SELL 30
2 AMZN SELL 50
During the trading simulation, I'll need to add more transactions.
What's the most efficient way to do this. Should I create a transactions
list at the start of the simulation and append lists such as [5, 'AAPL', 'BUY', 20]
as I go. Should I instead use a dictionary, or a numpy array? Or just a Pandas DataFrame directly?
Thanks,
Jack
Upvotes: 1
Views: 538
Reputation: 402403
list.append
operations are amortised constant time operations, because it just involves shifting pointers around.
OTOH, numpy.ndarray
and pd.DataFrame
objects are internally represented as arrays in C, and those are immutable. Each time you "append" to an array/dataframe, you have to reallocate new memory for an entire copy of the old data plus the appended, and ends up being linear in complexity.
So, as @ayhan said in a comment, accumulate your data in a list
, and then load into a dataframe once you're done.
Upvotes: 2