Reputation: 185
We are creating a website where users can create a certain profile. At the moment we already have about 662000 profiles (records in our database). The user can link certain keywords (divided into 5 categories) to their profile. They can link up to about 1250 keywords per category (no, this isn't nonsense, for certain profiles this would actually make sense). At the moment we save these keywords into an array and insert the serialized array in the profile's record in the database.
When a different user uses the search function and searches for one of the keywords, an SQL query is executed with 'WHERE keyword LIKE %keyword%'. This means that is has to go to a pretty big number of records and go through the entire serialized array for each record. Adding an index to the keyword columns is pretty tricky, since they don't have a defined max lenght (this could be 22000+ chars!).
Is there any other more sensible and practical way to go about this?
Thanks!
Upvotes: 1
Views: 234
Reputation: 204756
Use a mapping table
user_keywords TABLE
--------------------
user_id INT
keyword_id INT
users TABLE
---------------------
id INT
name VARCHAR
...
keywords TABLE
---------------------
id INT
name VARCHAR
...
You could then return all users having a specific keyword in their profile like this
select u.*
from users u
inner join user_keywords uk on uk.user_id = u.id
inner join keywords k on uk.keyword_id = k.id
where k.name = 'keyword_name'
Upvotes: 4
Reputation: 534
Since you are dealing with a large data you should use NoSQL databases such as Hadoop/Hbase, Cassandra etc. You should also take a look at Lucene/Solr...
Upvotes: 0