Reputation: 21
I have a very simple small database, 2 of tables are:
Node (Node_ID, Node_name, Node_Date
) : Node_ID is primary key
Citation (Origin_Id, Target_Id
) : PRIMARY KEY (Origin_Id, Target_Id)
each is FK in Node
Now I write a query that first find all citations that their Origin_Id has a specific date and then I want to know what are the target dates of these records.
I'm using sqlite in python the Node table has 3000 record and Citation has 9000 records, and my query is like this in a function:
def cited_years_list(self, date):
c=self.cur
try:
c.execute("""select n.Node_Date,count(*) from Node n INNER JOIN
(select c.Origin_Id AS Origin_Id, c.Target_Id AS Target_Id, n.Node_Date AS
Date from CITATION c INNER JOIN NODE n ON c.Origin_Id=n.Node_Id where
CAST(n.Node_Date as INT)={0}) VW ON VW.Target_Id=n.Node_Id
GROUP BY n.Node_Date;""".format(date))
cited_years=c.fetchall()
self.conn.commit()
print('Cited Years are : \n ',str(cited_years))
except Exception as e:
print('Cited Years retrival failed ',e)
return cited_years
Then I call this function for some specific years, But it's crazy slowwwwwwwww :( (around 1 min for a specific year) Although my query works fine, it is slow. would you please give me a suggestion to make it faster? I'd appreciate any idea about optimizing this query :)
I also should mention that I have indices on Origin_Id and Target_Id, so the inner join should be pretty fast, but it's not!!!
Upvotes: 1
Views: 902
Reputation: 5836
Instead of COUNT(*) use MAX(n.Node_Date)
SQLite doesn't keep a counter on number of tables like mysql does but instead it scans all your rows everytime you call COUNT meaning extremely slow.. yet you can use MAX() to fix that problem.
Upvotes: 1
Reputation: 663
If this script runs over a period of time, you may consider loading the database into memory. Since you seem to be coding in python, there is a connection function called connection.backup that can backup an entire database into memory. Since memory is much faster than disk, this should increase speed. Of course, this doesn''t do anything to optimize the statement itself, since I don't have enough of the code to evaluate what it is you are doing with the code.
Upvotes: 1