Reputation: 11
I am trying to load a large amount of data (13 million rows) into Firebase Firestore, but it is taking forever to finish.
Currently, I am inserting the data row by row using Python. I tried using multithreading, but it is still very slow and not efficient (I have to stay connected to the Internet the whole time).
So, is there another way to insert a file into Firebase, i.e. a more efficient way to (batch) insert the data?
This is the data format:
[
    {
        '010045006031': {
            'FName': 'Ahmed',
            'LName': 'Aline'
        }
    },
    {
        '010045006031': {
            'FName': 'Ali',
            'LName': 'Adel'
        }
    },
    {
        '010045006031': {
            'FName': 'Osama',
            'LName': 'Luay'
        }
    }
]
This is the code that I am using:
import random

import firebase_admin
from firebase_admin import credentials, firestore

def Insert2DB(I):
    doc_ref = db.collection('DBCoolect').document(I['M'])
    doc_ref.set({"FirstName": I['FName'], "LastName": I['LName']})

cred = credentials.Certificate("ServiceAccountKey.json")
firebase_admin.initialize_app(cred)
db = firestore.client()

List = []
# List is read from a file; document IDs must be strings
List.append({'M': str(random.random()), 'FName': 'Ahmed', 'LName': 'Aline'})
List.append({'M': str(random.random()), 'FName': 'Ali', 'LName': 'Adel'})
List.append({'M': str(random.random()), 'FName': 'Osama', 'LName': 'Luay'})

for item in List:
    Insert2DB(item)
Thanks a lot ...
Upvotes: 1
Views: 2993
Reputation: 1
Yes, there is a way to bulk-add data into Firebase using Python.
def process_streaming(self, response):
    # Read the streamed response line by line, score each record,
    # and accumulate the processed records in a list.
    for response_line in response.iter_lines():
        if response_line:
            json_response = json.loads(response_line)
            sentiment = self.sentiment_model(json_response["data"]["text"])
            self.datalist.append(self.post_process_data(json_response, sentiment))
This extracts each record from the response and saves it into a list. If we want the loop to flush every few minutes, uploading whatever has accumulated in the list to the database, add the code below after the if block.
if self.start_time + 300 < time.time():
    print(f"{len(self.datalist)} records sent to the database")
    self.batch_upload_data(self.datalist)
    self.datalist = []
    self.start_time = time.time()  # reset the timer for the next batch
The final function looks like this:
def process_streaming(self, response):
    for response_line in response.iter_lines():
        if response_line:
            json_response = json.loads(response_line)
            sentiment = self.sentiment_model(json_response["data"]["text"])
            self.datalist.append(self.post_process_data(json_response, sentiment))
        # Every 300 seconds, flush the accumulated records to the database.
        if self.start_time + 300 < time.time():
            print(f"{len(self.datalist)} records sent to the database")
            self.batch_upload_data(self.datalist)
            self.datalist = []
            self.start_time = time.time()  # reset the timer for the next batch
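The batch_upload_data method is not shown in this answer. A minimal sketch, assuming it wraps Firestore's batched writes (capped at 500 operations per commit); the client handle self.db, the collection name, and the 'id' field are placeholders for this sketch:
def batch_upload_data(self, datalist):
    # Commit in chunks of at most 500 writes, Firestore's per-batch limit.
    # self.db, the collection name, and record['id'] are assumptions.
    for start in range(0, len(datalist), 500):
        batch = self.db.batch()
        for record in datalist[start:start + 500]:
            doc_ref = self.db.collection('tweets').document(record['id'])
            batch.set(doc_ref, record)
        batch.commit()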
Upvotes: 0
Reputation: 317750
Firestore does not offer any way to "bulk update" documents. They have to be added individually. There is a facility to batch write, but that's limited to 500 documents per batch, and that's not likely to speed up your process by a large amount.
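For illustration, here is a sketch of chunked batch commits with the Admin SDK's db.batch(); the helper name insert_in_batches is hypothetical, and the collection and field names are copied from the question:
from itertools import islice

def insert_in_batches(db, rows):
    # rows is an iterable of dicts like {'M': ..., 'FName': ..., 'LName': ...}
    it = iter(rows)
    while True:
        chunk = list(islice(it, 500))  # Firestore allows at most 500 writes per batch
        if not chunk:
            break
        batch = db.batch()
        for row in chunk:
            doc_ref = db.collection('DBCoolect').document(row['M'])
            batch.set(doc_ref, {"FirstName": row['FName'], "LastName": row['LName']})
        batch.commit()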
If you want to optimize the rate at which documents can be added, I suggest reading the documentation on best practices for read and write operations and designing for scale. All things considered, however, there is really no "fast" way to get 13 million documents into Firestore. You're going to be writing code to add each one individually. Firestore is not optimized for fast writes. It's optimized for fast reads.
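Depending on the installed google-cloud-firestore version (2.1+), the client also exposes a BulkWriter that handles batching and retries in the background. A sketch under that assumption, reusing the question's collection and fields:
bulk = db.bulk_writer()  # BulkWriter batches and parallelizes writes under the hood
for row in rows:
    doc_ref = db.collection('DBCoolect').document(row['M'])
    bulk.set(doc_ref, {"FirstName": row['FName'], "LastName": row['LName']})
bulk.close()  # flush all pending writes and wait for them to complete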
Upvotes: 3