Reputation: 3
I start by saying that I'm a beginner with C# and SQL, so sorry if my code contains some rubbish.
I've a sqlite table which is a Japanese Dictionary.
It has the following rows:
ID: an unique, autoincremental integer;
word: the word in Japanese "ideograms" (es.: 元気);
reading: the phonetic reading of the word (es.: げんき);
definition: the dictionary definition for the word;
A word may appear more than once inside the dictionary.
Sometimes there may be multiple entries where both "word" and "reding" are the same, and I need to retrieve all of them:
ID - word - reading - definition
1 - 愛 - あい - Love;
2 - 愛 - あい - The title of a 1800 book by ... ;
This is how I retrieve the result:
void Search(string _word, string _reading) //es.: Search(愛, あい);
{
...
using(var conn = new SQLiteConnection("Data Source=dictsdb.sqlite;Version=3;"))
{
conn.Open();
string sql = "SELECT * FROM dictionary WHERE word = '" + _word + "' AND reading = '" + _reading + "'";
using(var command = new SQLiteCommand(sql, conn))
{
using(var reader = command.ExecuteReader(CommandBehavior.CloseConnection))
{
while(reader.Read())
{
(...)
}
}
}
conn.Close();
}
...
}
The dictionary is not sorted in a particular way, so I can not use optimized search algorithms.
I can not sort the dictionary either because I need it in its actual order.
The slow part is the command.ExecuteReader()
.
I suppose because for each search it must check all the entries of the dictionary, which is huge.
Any suggestion to make it faster?
EDIT: Thank you all for your suggestions! The index trick solved the issue and now is teen times faster! Thank to the other suggestions too, like the parametrized input one!
Upvotes: 0
Views: 2270
Reputation: 21
First off, congrats on starting your first C#/SQLite project, best of luck mate. I would recommend looking into restructuring your data a bit. For example:
1) Table Structure. It sounds like there is a many to one relation on your word -> reading definitions. Meaning, you have many reading that may use the same work multiple times. You could then restructure your table structure to represent something like this:
JD_Word JD_Reading
|------------| |------------|
| ID | |----->| ID |
| Word | | | Reading |
| Reading ID |--| |------------|
| Definition |
|------------|
2) Indexing. In general, it may be a good idea to create a simple index. As an overview, and index reduced the number of data pages your database has to scan. In SQLite, your primary key should be automatically indexed. More on that here. You could create a multi column index on your JD_Reading Table:
Current Structure:
CREATE INDEX dictionary_word ON dictionary(word);
Current Structure w/Muti-Column
CRAETE INDEX dictionary_word ON dictionary(word, reading);
3NF (See first point):
CREATE INDEX dictionary_word ON JD_Word(Word);
CREATE INDEX dictionary_reading ON JD_Reading(Reading);
3) There are many other options! Check out this post highlighting some C#/SQLite performance issues. Leave a comment if you have any questions mate. Happy Programming.
Upvotes: 2
Reputation: 8726
To speed up the query, add an index to the dictionary
table, include at least the word
column (reference):
CREATE INDEX IdxDictionary ON dictionary(word);
Depending on the number of readings per word, it may be worth comparing performance to that of a multi-column index:
CREATE INDEX IdxDictionary ON dictionary(word, reading);
The multi-column index will take up more space than the first one; if storage is of no concern, you can just create the second one.
When we assume that _word
and _reading
are both user input, they need to be parametrized to prevent users from hacking your database:
string sql = "SELECT * FROM dictionary WHERE word=@word AND reading=@reading";
using(var command = new SQLiteCommand(sql, conn))
{
command.Parameters.AddWithValue("@word", _word);
command.Parameters.AddWithValue("@reading", _reading);
...
}
If the number of expected matches is high, you could further optimize: Fetch only the ID
and definition
columns, word
and reading
are redundant.
Upvotes: 1