stalepretzel
stalepretzel

Reputation: 15913

Count occurrences of a word in a row in MySQL

I'm making a search function for my website, which finds relevant results from a database. I'm looking for a way to count occurrences of a word, but I need to ensure that there are word boundaries on both sides of the word ( so I don't end up with "triple" when I want "rip").

Does anyone have any ideas?


People have misunderstood my question:

How can I count the number of such occurences within a single row?

Upvotes: 6

Views: 11078

Answers (9)

Akhil
Akhil

Reputation: 2602

create a user defined function like this and use it in your query

DELIMITER $$

CREATE FUNCTION `getCount`(myStr VARCHAR(1000), myword VARCHAR(100))
    RETURNS INT
    BEGIN
    DECLARE cnt INT DEFAULT 0;
    DECLARE result INT DEFAULT 1;

    WHILE (result > 0) DO
    SET result = INSTR(myStr, myword);
    IF(result > 0) THEN 
        SET cnt = cnt + 1;
        SET myStr = SUBSTRING(myStr, result + LENGTH(myword));
    END IF;
    END WHILE;
    RETURN cnt;    

    END$$

DELIMITER ;

Hope it helps Refer This

Upvotes: 1

user423443
user423443

Reputation: 19

You can overcome the issue of mysql's case-sensitive REPLACE() function by using LOWER().

Its sloppy, but on my end this query runs pretty fast.

To speed things along I retrieve the resultset in a select which I have declared as a derived table in my 'outer' query. Since mysql already has the results at this point, the replace method works pretty quickly.

I created a query similar to the one below to search for multiple terms in multiple tables and multiple columns. I obtain a 'relevance' number equivalent to the sum of the count of all occurrances of all found search terms in all columns searched

SELECT DISTINCT ( 
((length(x.ent_title) - length(replace(LOWER(x.ent_title),LOWER('there'),''))) / length('there')) 
+ ((length(x.ent_content) - length(replace(LOWER(x.ent_content),LOWER('there'),''))) / length('there'))
 + ((length(x.ent_title) - length(replace(LOWER(x.ent_title),LOWER('another'),''))) / length('another')) 
+ ((length(x.ent_content) - length(replace(LOWER(x.ent_content),LOWER('another'),''))) / length('another')) 
) as relevance, 
x.ent_type, 
x.ent_id, 
x.this_id as anchor,
page.page_name
FROM ( 
(SELECT 
'Foo' as ent_type, 
sp.sp_id as ent_id, 
sp.page_id as this_id, 
sp.title as ent_title, 
sp.content as ent_content,
sp.page_id as page_id
FROM sp
WHERE (sp.title LIKE '%there%' OR sp.content LIKE '%there%' OR sp.title LIKE '%another%' OR sp.content LIKE '%another%' ) AND (sp_content.title NOT LIKE '%goes%' AND sp_content.content NOT LIKE '%goes%')
) UNION (
  [search a different table here.....]
)
) as x
JOIN page ON page.page_id = x.page_id 
WHERE page.rstatus = 'ACTIVE'
ORDER BY relevance DESC, ent_title;

Hope this helps someone

-- Seacrest out

Upvotes: 1

Paul
Paul

Reputation: 2759

If you want a search I would advise something like Sphinx or Lucene, I find Sphinx (as an independent full text indexer) to be a lot easier to set up and run. It runs fast, and generates the indexes very fast. Even if you were using MyISAM I would suggest using it, it has a lot more power than a full text index from MyISAM.

It can also integrate (somewhat) with MySQL.

Upvotes: 0

wlk
wlk

Reputation: 5785

It depends on what DBMS you are using, some allow writing UDFs that could do this.

Upvotes: -3

Alison R.
Alison R.

Reputation: 4294

This is not the sort of thing that relational databases are very good at, unless you can use fulltext indexing, and you have already stated that you cannot, since you're using InnoDB. I'd suggest selecting your relevant rows and doing the word count in your application code.

Upvotes: 2

Slava Popov
Slava Popov

Reputation: 142

You can try this perverted way:

SELECT 
(LENGTH(field) - LENGTH(REPLACE(field, 'word', ''))) / LENGTH('word') AS `count`
ORDER BY `count` DESC
  • This query can be very slow
  • It looks pretty ugly
  • REPLACE() is case-sensitive

Upvotes: 1

Rishi Agarwal
Rishi Agarwal

Reputation: 1185

I have used the technique as described in the link below. The method uses length and replace functions of MySQL.

Keyword Relevance

Upvotes: 0

ʞɔıu
ʞɔıu

Reputation: 48406

Something like LIKE or REGEXP will not scale (unless it's a leftmost prefix match).

Consider instead using a fulltext index for what you want to do.

select count(*) from yourtable where match(title, body) against ('some_word');

Upvotes: 0

Alex
Alex

Reputation: 2726

Something like this should work:

select count(*) from table where fieldname REGEXP '[[:<:]]word[[:>:]]';

The gory details are in the MySQL manual, section 11.4.2.

Upvotes: 0

Related Questions