Reputation: 2108
I have a table in PostgreSQL which is under heavy load (reads). It is practically core table of an application. One column is used as a discriminator - column used by application, that determines type of entity (class) that represents given row. It has to be exactly one varchar column. Currently I do store full class name in it, like: "bank_statement_transaction"
.
When application selects all bank statement transactions, query is built like ... WHERE Discriminator = 'bank_statement_transaction'
. This brings more readability and clarity to data, structure and code.
Table contains currently 3M rows and counting, approximately 100k new rows monthly. Discriminator
was indexed during some performance tunings. I don't have any performance issues right now.
I am working on a new feature that requires some little refactoring and yeah I had an idea to change full class name (bank_statement_transaction
) to short unique codes (BST
)
I replicated dbo and changed full class name to code. With 3M rows, performance gain is barely measurable, same or 1-2 milliseconds faster.
Can anyone share experience with VARCHAR length impact on INDEX size and performance? On bigger data set? Is this change worth of it?
Upvotes: 2
Views: 2086
Reputation: 247270
If you index strings, the index will become larger if the strings are long. The fan-out will be less, so the index will become deeper.
With an index scan that searches for a few rows, this won't be noticable: reading a few blocks more and running comparisons on longer strings may be lost in the noise for any but the simplest queries. Still, you'll be faster with smaller strings.
Maybe the most noticeable effect will be that a smaller index needs less RAM for caching, so the number of disk reads should go down.
Upvotes: 5