Reputation: 21
After watching "Google I/O 2009: Building Scalable, Complex Apps on App Engine", I ran some tests to understand the impact of list de-serialization, but the results are quite surprising. The test descriptions are below.
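For context, the two entity shapes under test look roughly like the sketch below. I don't have the original code in front of me, so the property names (c0..c499, items) and the key names are my own; the kind names match the tables in the results. It assumes the 2009-era google.appengine.ext.db API:

    from google.appengine.ext import db

    class ChartTestDbRdFt500C500R(db.Expando):
        """One row = one entity carrying 500 individual properties."""
        pass

    class ChartTestDbRdFt500L500R(db.Model):
        """One row = one entity carrying a single 500-item list."""
        items = db.ListProperty(int)

    def make_column_row(i):
        # Expando lets each entity carry 500 dynamically named properties.
        e = ChartTestDbRdFt500C500R(key_name='row%d' % i)
        for c in range(500):
            setattr(e, 'c%d' % c, c)
        return e

    def make_list_row(i):
        return ChartTestDbRdFt500L500R(key_name='row%d' % i,
                                       items=list(range(500)))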
TEST 1
- Fetch a single row (a sketch of how these timings might be produced follows the results)
- Table size: 500 columns vs a list of 500 items (both tables contain 500 rows)
Table:ChartTestDbRdFt500C500R <-- 500 Columns x 500 Rows
OneRowCol Result <-- Fetching one row
[0] 0.02 (52) <-- Test 0, time taken = 0.02, CPU usage = 52
[1] 0.02 (60)
[2] 0.02 (56)
[3] 0.01 (46)
[4] 0.02 (57)
Table:ChartTestDbRdFt500L500R <-- List of 500 x 500 Rows
OneRowLst Result
[0] 0.01 (40)
[1] 0.02 (38)
[2] 0.01 (42)
[3] 0.05 (154)
[4] 0.01 (41)
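A minimal sketch of how a single-row timing like "[0] 0.02 (52)" might be produced, reusing the model classes above. I'm assuming the second number is CPU megacycles from the old quota API, and that rows are keyed 'row0'..'rowN':

    import time
    from google.appengine.api import quota

    def time_one_row(model_class, key_name='row0'):
        # Returns (entity, elapsed seconds, CPU megacycles) for a single fetch.
        start_time = time.time()
        start_cpu = quota.get_request_cpu_usage()
        row = model_class.get_by_key_name(key_name)
        return (row,
                time.time() - start_time,
                quota.get_request_cpu_usage() - start_cpu)

Called as time_one_row(ChartTestDbRdFt500C500R) for the column table and time_one_row(ChartTestDbRdFt500L500R) for the list table.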
TEST 2
- Fetch all rows (see the sketch after these results)
- Table size: 500 columns vs a list of 500 items (both tables contain 500 rows)
Table:ChartTestDbRdFt500C500R
AllRowCol Result
[0] 11.54 (32753)
[1] 10.99 (31140)
[2] 11.07 (31245)
[3] 11.55 (37177)
[4] 10.96 (34300)
Table:ChartTestDbRdFt500L500R
AllRowLst Result
[0] 7.46 (20872)
[1] 7.02 (19632)
[2] 6.8 (18967)
[3] 6.33 (17709)
[4] 6.81 (19006)
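The corresponding sketch for the fetch-all case, under the same assumptions as the single-row version:

    import time
    from google.appengine.api import quota

    def time_all_rows(model_class, limit=500):
        # Returns (row count, elapsed seconds, CPU megacycles) for a full fetch.
        start_time = time.time()
        start_cpu = quota.get_request_cpu_usage()
        rows = model_class.all().fetch(limit)
        return (len(rows),
                time.time() - start_time,
                quota.get_request_cpu_usage() - start_cpu)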
TEST 3
- Fetch a single row
- Table size: 4500 columns vs a list of 4500 items (both tables contain 10 rows)
Table:ChartTestDbRdFt4500C10R
OneRowCol Result
[0] 0.15 (419)
[1] 0.15 (433)
[2] 0.15 (415)
[3] 0.23 (619)
[4] 0.14 (415)
Table:ChartTestDbRdFt4500L10R
OneRowLst Result
[0] 0.08 (212)
[1] 0.16 (476)
[2] 0.07 (215)
[3] 0.09 (242)
[4] 0.08 (217)
CONCLUSION
Fetching a list of N items is actually quicker than fetching N individual columns. Does anyone know why this is the case? I thought there was a performance hit on list de-serialization. Or did I perform my tests incorrectly? Any insight would be helpful, thanks!
Upvotes: 1
Views: 239
Reputation: 62593
Bigtable is a column-oriented database.
That means that fetching a 'row' of N columns is in fact N separate read operations, all on the same index.
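Purely as an analogy (this is not the datastore's actual wire format), the per-value overhead is easy to demonstrate: de-serializing 500 individually named values costs more than de-serializing one 500-item list, because each named value carries its own key:

    import pickle
    import timeit

    # 500 named values vs. one 500-item list, with pickle standing in
    # for the entity encoding (an analogy only, not the real format).
    as_columns = pickle.dumps(dict(('c%d' % i, i) for i in range(500)))
    as_list = pickle.dumps(list(range(500)))

    print(timeit.timeit(lambda: pickle.loads(as_columns), number=1000))
    print(timeit.timeit(lambda: pickle.loads(as_list), number=1000))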
Upvotes: 1