Reputation: 25
I am writing a script with a GUI and have run into very poor performance.
I want to create many entries per column (129), across 7 columns. I created a dictionary for this purpose, so the empty entries are created with dynamically generated names.
I want to insert text into these empty entries depending on what is provided in specific cells (Country / Location).
The task is accomplished; however, it takes a lot of time to load the data, and I don't know what can be done to make it faster. I cannot let it be this slow, especially at this stage of the project.
Function for getting information from the extract file:
def Load_Neighborhood_Details(self, inx, what_return):
    file_path = r'C:\Users\krzysztof-wirkus\Desktop\Od nowa\extracts\neighborhood_details.csv'
    file = pd.read_csv(file_path, encoding = "ISO-8859-1")
    country = file[file['Country Name'] == self.p0.country_input.get()]
    location = country[country['Location Name'] == self.p0.location_input.get()].iloc[inx]
    neighborhood_name = location['Neighborhood Name']
    dwelling = location['Dwelling']
    furniture = location['Furnished Unfurnished Indicator']
    item_cost_category = location['Item Cost Category Name']
    neigh_class = location['Class']
    try:
        start_date = location['Start Date'].split()[0]
    except:
        start_date = ""
    try:
        end_date = location['End Date'].split()[0]
    except:
        end_date = ""
    if what_return == "neighborhood_name":
        return neighborhood_name
    elif what_return == "dwelling":
        return dwelling
    elif what_return == "furniture":
        return furniture
    elif what_return == "item_cost_category":
        return item_cost_category
    elif what_return == "neigh_class":
        return neigh_class
    elif what_return == "start_date":
        return start_date
    elif what_return == "end_date":
        return end_date
My slow loop:
for i in range(2, 131):
    self.p3.dict['neighborhood_details_name_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "neighborhood_name"))
    self.p3.dict['neighborhood_details_dwelling_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "dwelling"))
    self.p3.dict['neighborhood_details_furniture_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "furniture"))
    self.p3.dict['neighborhood_details_item_cost_category_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "item_cost_category"))
    self.p3.dict['neighborhood_details_class_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "neigh_class"))
    self.p3.dict['neighborhood_details_start_date_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "start_date"))
    self.p3.dict['neighborhood_details_end_date_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "end_date"))
Upvotes: 0
Views: 239
Reputation: 452
I'm not sure, but the problem could be that you re-read your CSV file on every call. Try replacing
def Load_Neighborhood_Details(self, inx, what_return):
    file_path = r'path/to/file'
    file = pd.read_csv(file_path, encoding = "ISO-8859-1")
    country = file[file['Country Name'] == self.p0.country_input.get()]
    [...]
with:
def Load_Neighborhood_Details(self, inx, what_return, file):
    country = file[file['Country Name'] == self.p0.country_input.get()]
    [...]
then:
file_path = 'path/to/file'
file = pd.read_csv(file_path, encoding = "ISO-8859-1")
for i in range(2, 131):
    self.p3.dict['neighborhood_details_name_entry_' + str(i)].insert(tk.END, self.Load_Neighborhood_Details(i-2, "neighborhood_name", file))
    [...]
(Note that file, the DataFrame returned by pd.read_csv, is now passed to Load_Neighborhood_Details() as an argument.)
I hope it can help!
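Going one step further, the country/location filter could also be done once instead of on every call. Below is a minimal sketch of that idea, not a drop-in replacement: the in-memory CSV, the helper names load_rows and date_only, and the reduced set of columns are all assumptions made for illustration, and the widget insert is left as a comment because it needs the real GUI objects.

```python
import io
import pandas as pd

# Hypothetical stand-in for neighborhood_details.csv (only a few of the columns).
CSV_TEXT = """Country Name,Location Name,Neighborhood Name,Dwelling,Start Date
Poland,Gdansk,Oliwa,House,2020-01-01 00:00:00
Poland,Gdansk,Wrzeszcz,Apartment,
Germany,Berlin,Mitte,Apartment,2019-05-01 00:00:00
"""

def load_rows(source, country, location):
    """Read the CSV once, filter once, and return one dict per matching row."""
    df = pd.read_csv(source)
    matches = df[(df["Country Name"] == country) & (df["Location Name"] == location)]
    # Empty cells are read as NaN; replace them with "" so they can be inserted as text.
    return matches.fillna("").to_dict("records")

def date_only(value):
    """Keep the date part of 'YYYY-MM-DD hh:mm:ss'; an empty value stays empty."""
    parts = str(value).split()
    return parts[0] if parts else ""

rows = load_rows(io.StringIO(CSV_TEXT), "Poland", "Gdansk")
for i, row in enumerate(rows, start=2):
    # In the GUI, each widget would be filled here, for example:
    # self.p3.dict['neighborhood_details_name_entry_' + str(i)].insert(tk.END, row["Neighborhood Name"])
    print(i, row["Neighborhood Name"], row["Dwelling"], date_only(row["Start Date"]))
```

With this shape, every field of a row comes from one dictionary lookup, so the loop no longer calls the loading function seven times per row.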
Upvotes: 2
Reputation: 392
Your code performs very badly because it reads the CSV file on every call: 7 calls per row over 129 rows means 903 reads. You should read the file just once (if possible) and pass the result as an argument to your function; that will improve performance.
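If changing the function signature is awkward, one alternative is to cache the loaded data so that repeated calls reuse it. This is only a sketch under assumptions: load_details and read_count are hypothetical names, and the temporary CSV stands in for the real extract file.

```python
import functools
import os
import tempfile
import pandas as pd

read_count = 0  # only here to show how often the file is actually parsed

@functools.lru_cache(maxsize=None)
def load_details(file_path):
    """Parse the CSV on the first call; later calls with the same path reuse the result."""
    global read_count
    read_count += 1
    return pd.read_csv(file_path, encoding="ISO-8859-1")

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the real extract file.
    path = os.path.join(tmp, "neighborhood_details.csv")
    with open(path, "w", encoding="ISO-8859-1") as handle:
        handle.write("Country Name,Location Name,Neighborhood Name\nPoland,Gdansk,Oliwa\n")
    for _ in range(5):
        df = load_details(path)

print(read_count)
```

Keep in mind that every caller gets the same DataFrame object, so it should be treated as read-only, and load_details.cache_clear() would be needed if the file changes on disk.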
Upvotes: 2