Reputation: 35
I created a table that is populated by the responses people stated as the first thing that came into their mind when they viewed a photo. I have ~1400 entries. Now, I want to see what is the most common description.
CREATE TABLE descript (
wordID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
wordText TEXT(50)
)
ENGINE=MyISAM;
INSERT INTO descript VALUES(0,"Big");
INSERT INTO descript VALUES(0,"blue");
INSERT INTO descript VALUES(0,"blue");
INSERT INTO descript VALUES(0,"fast");
INSERT INTO descript VALUES(0,"impressive");
INSERT INTO descript VALUES(0,"big");
INSERT INTO descript VALUES(0,"big");
INSERT INTO descript VALUES(0,"red");
INSERT INTO descript VALUES(0,"his");
INSERT INTO descript VALUES(0,"her");
INSERT INTO descript VALUES(0,"His");
INSERT INTO descript VALUES(0,"Black");
INSERT INTO descript VALUES(0,"black");
INSERT INTO descript VALUES(0,"black");
INSERT INTO descript VALUES(0,"blue");
INSERT INTO descript VALUES(0,"a black");
INSERT INTO descript VALUES(0,"his");
INSERT INTO descript VALUES(0,"her");
INSERT INTO descript VALUES(0,"pleasant");
INSERT INTO descript VALUES(0,"the fast");
INSERT INTO descript VALUES(0,"blue");
and on and on and on ....
I have to make it so it is lower case, which is done with this:
select LOWER(wordText) descript;
How do I go about having it count the most common answer and display it? I have a few stop words (that I do not want to be included in the count such as 'a" or 'the'. How do I go about having them not counted?
Upvotes: 3
Views: 263
Reputation: 1269463
The basic query is:
SELECT lower(wordText) as word, count(*)
FROM descript
GROUP BY lower(wordText)
ORDER BY count(*) DESC
LIMIT 1;
You can remove stop words using not in
, if you want to include them in the query:
SELECT lower(wordText) as word, count(*)
FROM descript
WHERE lower(wordText) not in ('a', 'the', . . . )
GROUP BY lower(wordText)
ORDER BY count(*) DESC
LIMIT 1;
Alternatively, if you have them in a table:
SELECT lower(sw.wordText) as word, count(*)
FROM descript d left join
stopwords sw
on d.wordText = sw.word
WHERE sw.word is not null
GROUP BY lower(sw.wordText)
ORDER BY count(*) DESC
LIMIT 1;
You can learn about the stop words included with MySQL here.
Upvotes: 1
Reputation: 37
As per getting the most frequent value you can use this query.
SELECT wordText, count(*) FROM descript GROUP BY wordText ORDER BY count(*) DESC LIMIT 1;
Upvotes: 0
Reputation: 715
IF you do a
SELECT COUNT(LOWER(wordText)) FROM descript GROUP BY LOWER(wordText);
you should be able to see how many of each word there are.
You can add an
ORDER BY
clause to arrange them based on the count of each result
Upvotes: 0