Yasin Yaqoobi
Yasin Yaqoobi

Reputation: 2040

Wordnet sqlite Synonyms and Samples

I am trying to get the list of Synonyms and Samples given the wordid. After a lot of trial and error I can get the samples for all the synsets but not the actual synonyms. Here is my query which gives me the following results.

select senses.wordid, senses.synsetid, senses.sensekey, synsets.definition FROMsenses LEFT OUTER JOINsynsetsON senses.synsetid = synsets.synsetid where senses.wordid = 79459

enter image description here

I know you can get the synonyms by submiting the synsetid back to the senses table which gives you unique wordid and sensekey which you can then join with the words table. My problem is I can't seem to build that query.

I would like to get these columns if possible. If not synsetid, lemma and definition would do. The current database is mySql but I am hoping the answer would also be applicable to sqlite, since I am using this for an android APP.

wordid, lemma, senseid, synsetid, definition

schema:

CREATE TABLE `synsets` (
  `synsetid` int(10) unsigned NOT NULL DEFAULT '0',
  `pos` enum('n','v','a','r','s') NOT NULL,
  `definition` mediumtext,
  PRIMARY KEY (`synsetid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


CREATE TABLE `words` (
  `wordid` int(10) unsigned NOT NULL DEFAULT '0',
  `lemma` varchar(80) NOT NULL,
  `mantiq` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`wordid`),
  UNIQUE KEY `unq_words_lemma` (`lemma`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


CREATE TABLE `senses` (
  `wordid` int(10) unsigned NOT NULL DEFAULT '0',
  `synsetid` int(10) unsigned NOT NULL DEFAULT '0',
  `senseid` int(10) unsigned DEFAULT NULL,
  `sensekey` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`wordid`,`synsetid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `samples` (
  `synsetid` int(10) unsigned NOT NULL DEFAULT '0',
  `sampleid` smallint(5) unsigned NOT NULL DEFAULT '0',
  `sample` mediumtext NOT NULL,
  PRIMARY KEY (`synsetid`,`sampleid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Link to database: https://cloud.generatedesign.com/index.php/s/LA2G8ZvqNClqHFN

Upvotes: 0

Views: 1122

Answers (2)

Uueerdo
Uueerdo

Reputation: 15951

I'm not sure I exactly understand the question, but wouldn't something like this work?

SELECT s1.wordid, s1.synsetid, s1.sensekey, synsets.definition
   , s2.wordid AS matchedWordID, w.*  -- Additional info not from question's query
FROM senses AS s1
   LEFT JOIN synsets ON s1.synsetid = synsets.synsetid
   LEFT JOIN senses AS s2 ON s1.synsetid = s2.synsetid AND s1.wordid <> s2.wordid
   LEFT JOIN words AS w ON s2.wordid = w.wordid
WHERE s1.wordid = 79459
;

Note: ... is just short hand for the list of fields you actually want.

Note#2: You can of course JOIN to samples using the synsets reference, but keep in mind the results would be repeated for every word pair and sample; and it is possible some word pairs may be repeated if they are synonyms in multiple meanings.

Upvotes: 2

Spade
Spade

Reputation: 2280

EDIT after schema publication

It appears that the senses table captures all relationships between each word and all its synsets and should be used in conjunction with inner join with words and synsets tables to to unravel all relationships

select sen.wordid,
         w.lemma,
         w.mantiq,
       sen.senseid,
       sen.synsetid,
       syn.definition,
from senses sen 
inner join words w on sen.wordid = w.wordid
inner join synsets syn on sen.synsetid = syn.synsetid
order by sen.wordid, sen.synsetid;

You do not need a LEFT JOIN since the fields you join upon do not appear to be nullable.

Upvotes: 0

Related Questions