kheya
kheya

Reputation: 7612

SQL Insert performance question

I have this table phonebook SQL Server 2005:

username(PK) Serial(PK) contact_name  contact_adr      contact_email  contact_phone 
bob          1           Steve         12 abc street    [email protected]   1234          
bob          2           John          34 xyz street    [email protected]    5345          
bob          3           Mark          98 ggs street    [email protected]    1234          
patrick      4           lily          77 fgs street    [email protected]    1234          
patrick      5           mily          76 fgs street    [email protected]    1234          
von          8           jim           6767 jsd way     [email protected]     4564          

Now you can see the phonebook stores all contacts of same user together. Storing this way has advantages which I can't avoid.

My question is: If I have 100 million entries in the table for all users, will my future insertion in the above table be very expensive?

Since SQL Engine needs to find the actual location where to enter the data (I mean under which username)

I tested with 1 million rows, I don't see noticeable issues.

I am asking if anyone has this experience or suggestions for me?

Thanks

Upvotes: 0

Views: 220

Answers (5)

Tim
Tim

Reputation: 5421

The approach that is optimal for an address book is a NOSQL hashed-table. There's no need for an index on the PK. The algorithm returns the "page" where the row identified by the PK can be found. The address book of the user is also stored with the user, as a denormalized relation. Insert overhead is negligible. Hashed-PK is optimized for insert/retrieval when the PK is known. Excellent for OLTP systems. Now if you want to do something like figure out who knows whom, so that a given user's contacts need to be related to the contacts of all other users, then you have a different can of worms. But a straightforward address-book application, where the contacts of a given user remain "private" to that user, then a hashed primary key system is superb.

Upvotes: 1

Mike M.
Mike M.

Reputation: 12511

You can't force the data to be stored together. Are you re-sequencing the Serial upon an insert? How are you ensuring the data is "stored together"?

If you mean putting all this data in one table, then it really depends on your index structure. The more indexes on the table, the more processing that takes place on very insert. Since user tables are usually heavily queried and rarely inserted (relatively), they are usually indexed heavily, in which case inserts can be slow. The answer, as with almost every DB question is: "It depends".

Upvotes: 0

Pablo Santa Cruz
Pablo Santa Cruz

Reputation: 181280

First of all, username doesn't seem to be a primary key for your table by itself. You will probably have to use it in combination with another field if you want it to work. At this point, I would rather use your serial column as primary key, and have an index on username to answer the query get bob's contacts efficiently.

You insert will certainly become slower as your table grow. But I don't think it will be too slow to avoid following this approach.

Upvotes: 0

darioo
darioo

Reputation: 47183

It depends on the underlying database. Every implementation has something different under its sleeves.

But! Performance will almost definitely suffer if you use indexes on that table and you have many, many, many, many rows inside of it.

Upvotes: 0

Sasha Kostic
Sasha Kostic

Reputation: 46

One of the first principles in db design is data non-redundancy: your db table design doesn't comply to that principle as you have same data repeated many times. A resonable solution would be to create separate table for users, a separate table for contacts and a table for realationship between users and contacts.

Upvotes: 0

Related Questions