user502083
user502083

Reputation: 21

How Optimize sql query make it faster

I have a very simple small database, 2 of tables are:

Node (Node_ID, Node_name, Node_Date) : Node_ID is primary key
Citation (Origin_Id, Target_Id) : PRIMARY KEY (Origin_Id, Target_Id) each is FK in Node

Now I write a query that first find all citations that their Origin_Id has a specific date and then I want to know what are the target dates of these records.

I'm using sqlite in python the Node table has 3000 record and Citation has 9000 records, and my query is like this in a function:

def cited_years_list(self, date):
    c=self.cur
    try:
    c.execute("""select n.Node_Date,count(*) from Node n INNER JOIN 
              (select c.Origin_Id AS Origin_Id, c.Target_Id AS Target_Id, n.Node_Date AS 
               Date from CITATION c INNER JOIN NODE n ON c.Origin_Id=n.Node_Id where
               CAST(n.Node_Date as INT)={0}) VW ON VW.Target_Id=n.Node_Id 
               GROUP BY n.Node_Date;""".format(date))
    cited_years=c.fetchall()
    self.conn.commit()
    print('Cited Years are : \n ',str(cited_years))
    except Exception as e:
       print('Cited Years retrival failed ',e)
   return cited_years

Then I call this function for some specific years, But it's crazy slowwwwwwwww :( (around 1 min for a specific year) Although my query works fine, it is slow. would you please give me a suggestion to make it faster? I'd appreciate any idea about optimizing this query :)

I also should mention that I have indices on Origin_Id and Target_Id, so the inner join should be pretty fast, but it's not!!!

Upvotes: 1

Views: 902

Answers (2)

SSpoke
SSpoke

Reputation: 5836

Instead of COUNT(*) use MAX(n.Node_Date)

SQLite doesn't keep a counter on number of tables like mysql does but instead it scans all your rows everytime you call COUNT meaning extremely slow.. yet you can use MAX() to fix that problem.

Upvotes: 1

Varriount
Varriount

Reputation: 663

If this script runs over a period of time, you may consider loading the database into memory. Since you seem to be coding in python, there is a connection function called connection.backup that can backup an entire database into memory. Since memory is much faster than disk, this should increase speed. Of course, this doesn''t do anything to optimize the statement itself, since I don't have enough of the code to evaluate what it is you are doing with the code.

Upvotes: 1

Related Questions