Reputation: 3182
Is it bad to have text as a primary key in an SQLite database? I heard that it's bad for performance reasons, is this true? And will the rowid be used as the actual primary key in such a case?
Upvotes: 48
Views: 56278
Reputation: 340
There's nothing intrinsically wrong with using a text primary key. What makes a primary key work is that it is orderable and has unique values in the table; other than that, the type of the data doesn't strictly matter. However, when the data is text, its "real-world" source often makes it impractical as a primary key: names can repeat, change, or be misspelled. Using arbitrary, meaningless integers for the primary key means you don't have to worry about any of that.
That's probably the nicest thing about integer primary keys, arguably more so than the speed of integer vs. string comparison. String comparison is generally a little more work for the computer than integer comparison, true, but that's unlikely to matter much in this context. SQLite creates an index for the primary key in any table, which means that even if you have a million entries, SQLite will only need around 20 comparisons at worst to find the row (O(log n), and log2 of a million is about 20). It would take a really unusual case for that to have major performance implications, I'd say.
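One way to check that a lookup on a text primary key goes through an index rather than a full table scan is to ask SQLite for its query plan. This is a minimal sketch using Python's built-in sqlite3 module; the table name `words` and its columns are invented for illustration, and the exact wording of the plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: an ordinary (rowid) table with a text primary key.
conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY, cnt INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?)",
    ((f"word{i}", i) for i in range(10_000)),
)

# The last column of each EXPLAIN QUERY PLAN row describes one step of the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT cnt FROM words WHERE word = ?", ("word1234",)
).fetchall()
plan_details = [row[-1] for row in plan]
print(plan_details)
```

If the plan mentions a SEARCH using the automatically created index instead of a SCAN, the lookup is logarithmic in the number of rows.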
On that note, something you might consider if you're planning on using a text primary key is SQLite's WITHOUT ROWID feature. A table with a text primary key is unlikely to need a rowid column, because the rowid is essentially an integer primary key. WITHOUT ROWID not only eliminates the rowid column, but also tells SQLite to base the search tree for the table itself on the primary key you specify instead of the rowid. Otherwise, it will create two search trees: the main search tree for the table itself using rowid keys, and a separate search tree for the text primary key associating text values with rowids. This wastes space and adds needless overhead to lookups using the text primary key, presuming you have no need for the rowid.
SQLite's docs for WITHOUT ROWID explain all this stuff. They give an example of a table storing word counts in a text corpus with the word as the primary key, which seems to me like a nice example of a situation where a text primary key makes sense.
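A word-count table along those lines could look roughly like this. It is a sketch using Python's built-in sqlite3 module: the table and column names are modelled on the docs' example, while the `count_words` helper and the sample sentence are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The word itself is the key of the table's search tree; no hidden rowid is kept.
conn.execute(
    "CREATE TABLE wordcount (word TEXT PRIMARY KEY, cnt INTEGER NOT NULL) WITHOUT ROWID"
)

def count_words(text):
    """Hypothetical helper: add each whitespace-separated word to the counts."""
    for word in text.lower().split():
        # Create the row with a count of 0 if the word is new, then increment it.
        conn.execute("INSERT OR IGNORE INTO wordcount VALUES (?, 0)", (word,))
        conn.execute("UPDATE wordcount SET cnt = cnt + 1 WHERE word = ?", (word,))

count_words("the quick brown fox jumps over the lazy dog the end")

the_count = conn.execute(
    "SELECT cnt FROM wordcount WHERE word = ?", ("the",)
).fetchone()[0]
distinct_words = conn.execute("SELECT COUNT(*) FROM wordcount").fetchone()[0]
print(the_count, distinct_words)
```

Every read and write here addresses the row by its word, which is the access pattern WITHOUT ROWID is meant for.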
Upvotes: 7
Reputation: 1919
Although this thread discusses INTEGER vs TEXT primary keys, for context, see Blob vs. Text for primary keys circa 2021 where SQLite creator Richard Hipp replies. I've pasted and emphasized the relevant portion of his reply below.
(2) By Richard Hipp (drh) on 2021-03-04 16:00:22 in reply to 1 [source]
Both approaches should work fine. Storing the hash as a BLOB might be very slightly faster, since (as you observe) there is less content, hence less file I/O.
The Fossil version control system does something very much like this. But it stores the hash as text rather than as a blob. Performance is not an issue, and text is easier for developers to deal with when debugging.
Upvotes: 3
Reputation: 6966
A PRIMARY KEY column implies comparing values. Comparing a number is simpler than comparing text.
The reason is that there is a specific assembly instruction for 64-bit numeric comparison. This will always be much faster than comparing text, which in theory can be unlimited in size.
Example of comparing a number:
CMP RDX, 0 ; Compare the 64-bit value in RDX with zero
JE L7 ; If they are equal, jump to label L7
.
.
L7: ...
Read more about the CMP assembly instruction here: https://www.tutorialspoint.com/assembly_programming/assembly_conditions.htm
Knowing this tells us that numbers will always be more performant (at least in the computing we have today).
Upvotes: -5
Reputation: 352
Yes, if you use TEXT you get android.database.sqlite.SQLiteConstraintException: UNIQUE constraint failed: TableName.ColumnName (code 1555)
The insert call inserts a row and returns the row ID of the newly inserted row if the insert succeeds; otherwise it returns -1.
The return value is mapped to _ID, which is why you are pushed to implement the BaseColumns interface for the table.
It's strange that the insert call has to return the rowid instead of a boolean or similar.
I wish TEXT PRIMARY KEY capability were available in SQLite.
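For what it's worth, a "UNIQUE constraint failed" error is what SQLite reports when a second row reuses an existing key value. The sketch below uses Python's built-in sqlite3 module rather than the Android API, so the exception class differs from the one quoted above; the `items` table is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a text primary key.
conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY, qty INTEGER)")

# The first insert succeeds. The cursor still reports an integer rowid,
# because an ordinary table keeps a hidden rowid alongside the text key.
cur = conn.execute("INSERT INTO items VALUES (?, ?)", ("apple", 1))
first_rowid = cur.lastrowid

# Reusing the same key value is what triggers the constraint error.
try:
    conn.execute("INSERT INTO items VALUES (?, ?)", ("apple", 2))
    error_message = None
except sqlite3.IntegrityError as exc:
    error_message = str(exc)

print(first_rowid, error_message)
```

Under these assumptions the first insert goes through with a text key, and only the duplicate is rejected.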
Upvotes: -3
Reputation: 1143
In the real world, using strings as primary keys has a lot of benefits when those strings are UUIDs. Being able to give an entity its identity (its "passport") at the very moment it is created can massively simplify asynchronous code and distributed systems (for example, a more complex mobile client/server architecture).
As for performance, I did not find any measurable difference when running a benchmark of 10000 primary key lookups. An indexed search only compares a handful of keys per lookup, so the extra cost of comparing strings instead of integers stays small.
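A rough way to repeat this kind of measurement yourself is sketched below with Python's built-in sqlite3, uuid, and time modules. The table names `by_int` and `by_uuid` and the `time_lookups` helper are made up for illustration, and an in-memory database will not reflect disk I/O, so treat the timings as indicative only.

```python
import sqlite3
import time
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE by_int (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE by_uuid (id TEXT PRIMARY KEY, payload TEXT)")

N = 10_000
# UUIDs are generated on the client, before the rows ever reach the database.
uuids = [str(uuid.uuid4()) for _ in range(N)]
conn.executemany("INSERT INTO by_int VALUES (?, ?)", ((i, "x") for i in range(N)))
conn.executemany("INSERT INTO by_uuid VALUES (?, ?)", ((u, "x") for u in uuids))

def time_lookups(table, keys):
    """Return (elapsed seconds, rows found) for one primary key lookup per key."""
    found = 0
    start = time.perf_counter()
    for key in keys:
        row = conn.execute(
            f"SELECT payload FROM {table} WHERE id = ?", (key,)
        ).fetchone()
        found += row is not None
    return time.perf_counter() - start, found

int_seconds, int_found = time_lookups("by_int", range(N))
uuid_seconds, uuid_found = time_lookups("by_uuid", uuids)
print(f"integer keys: {int_seconds:.4f}s, UUID text keys: {uuid_seconds:.4f}s")
```

Running it several times with your own row counts and key lengths gives a better picture than any single number.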
Upvotes: 41
Reputation: 152817
Is it bad to have text as a primary key in an SQLite database? I heard that it's bad for performance reasons, is this true?
From a correctness point of view, TEXT PRIMARY KEY is all right.
From a performance point of view, prefer INTEGER keys. But as with any performance issue, measure it yourself to see if there's a significant difference with your data and use cases.
And will the rowid be used as the actual primary key in such a case?
Only INTEGER PRIMARY KEY gets aliased with ROWID. Other kinds of primary keys don't, and there will be the implicit integer rowid unless WITHOUT ROWID is specified. Reference.
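The three cases can be compared side by side. This is a small sketch with Python's built-in sqlite3 module; the table names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE int_pk (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE text_pk (id TEXT PRIMARY KEY, v TEXT)")
conn.execute(
    "CREATE TABLE text_pk_norowid (id TEXT PRIMARY KEY, v TEXT) WITHOUT ROWID"
)

conn.execute("INSERT INTO int_pk VALUES (42, 'a')")
conn.execute("INSERT INTO text_pk VALUES ('forty-two', 'a')")
conn.execute("INSERT INTO text_pk_norowid VALUES ('forty-two', 'a')")

# INTEGER PRIMARY KEY: rowid and id are two names for the same value.
int_row = conn.execute("SELECT rowid, id FROM int_pk").fetchone()
# TEXT PRIMARY KEY: SQLite assigns a separate integer rowid implicitly.
text_row = conn.execute("SELECT rowid, id FROM text_pk").fetchone()

# WITHOUT ROWID: there is no rowid column, so the query is rejected.
try:
    conn.execute("SELECT rowid FROM text_pk_norowid")
    norowid_has_rowid = True
except sqlite3.OperationalError:
    norowid_has_rowid = False

print(int_row, text_row, norowid_has_rowid)
```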
Upvotes: 57
Reputation: 33505
Is it bad to have text as a primary key in an SQLite database? I heard that it's bad for performance reasons, is this true?
I have never heard of anybody using a string as the primary key of a table. To me (and, I honestly hope, to others too) it is a very "ugly" practice with very poor performance.
If you use a string as the primary key, you need to think about a "few" things:
Here, each row must have the same format (a readability issue, of course) and must also be unique. Oh! And here comes the next piece of "piggy work":
you'll need to create some "unique string generator" that will generate a unique[1] string identifier[2].
There are also further issues worth considering, such as comparisons automatically becoming harder and harder.
It's a more complex topic, but I would say that, OK, for very small tables it would be possible to use strings as the primary key (if it makes sense). If you look at the disadvantages, though, using a number as the primary key is a much better technique for sure!
And what is the conclusion?
I don't recommend using a string as the primary key. It has more disadvantages than advantages (does it really have any advantage?).
Using a number as the primary key is a much better (I'm afraid to say the best) practice.
And will the rowid be used as the actual primary key in such a case?
If you use a string as the primary key, no.
[1] In reality, strings are rarely unique.
[2] Of course, you could say that you can create the identifier from the name of the item in the row, but that's spaghetti code once again (items can have the same name).
Upvotes: -46