Reputation: 85
I am trying to add a large number of chunks (items) to a list. A list can only hold up to 536,870,912 items, but my dataset has 1 billion lines of code, so I am getting a MemoryError because I am trying to append roughly 1 billion items to a list.
lst = ["1", "2", "3", "4", ... 1 billion]
window_size = 4
db_chunk_hash = []
for i in range(len(lst) - window_size + 1):
    c = lst[i: i + window_size]
    chunk_bytes = bytes(str(c), encoding='utf8')
    db_chunk_hash.append(chunk_bytes)
How can I solve this problem?
Upvotes: 2
Views: 194
Reputation: 860
You can use a Python generator function; that way you only produce each chunk as it is needed, which is much easier on memory.
def chunks(lst, window_size):
    # Yield one chunk at a time instead of building a huge list in memory
    for i in range(len(lst) - window_size + 1):
        c = lst[i: i + window_size]
        yield bytes(str(c), encoding='utf8')

window_size = 4
db_chunk_hash = chunks(lst, window_size)  # a generator, not a list
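You can then consume the generator lazily and process each chunk as you go, rather than storing all of them. A minimal sketch of that idea follows; hash_and_store is a hypothetical placeholder for whatever you actually do with each chunk (write it to a file, insert it into a database, etc.):

import hashlib

def hash_and_store(chunk_bytes):
    # Placeholder: replace with your real per-chunk work
    return hashlib.sha256(chunk_bytes).hexdigest()

for chunk in chunks(lst, window_size):
    digest = hash_and_store(chunk)
    # use `digest` here (e.g. write it out) instead of
    # accumulating a billion items in a list

Because only one chunk exists in memory at a time, this keeps memory use roughly constant regardless of how long the input is.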
Upvotes: 1