Raghavan vmvs

Reputation: 1265

Add synonyms from qdap to a preexisting dataframe in R

I have created the following dataframe DF in R:

Sl NO  Word
1       get
2       Free
3       Joshi
4       Hello
5       New

I used the following code to get the synonyms, but the result comes back as a list:

        library(qdap)
        synonyms(DF$Word)

This returns a list of synonyms for each word. I want the synonyms for each word in the dataframe appended row-wise to the dataframe as separate columns.

  DF <-
          Sl NO   Word    Syn1          Syn2
          1       get     obtain        receive
          2       Free    independent   NA
          3       Joshi   NA            NA
          4       Hello   Greeting      NA
          5       New     Unused        Fresh

Is there an elegant way to obtain this? Are there other dictionaries that can be used for this?

Upvotes: 1

Views: 832

Answers (3)

Prem

Reputation: 11955

I am not sure exactly how you would like to add all the synonyms of a word: running synonyms("get") returns 75 definitions of "get", and I feel the desired layout would not be much help if the values from all 75 definitions were added to a single row.

So in the solution below I have selected only the very first definition.

library(qdap)
library(dplyr)
library(splitstackshape)

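df %>%
  rowwise() %>%
  # keep only the first definition of each word, joined into one comma-separated string
  mutate(synonym_of_word = paste(synonyms(tolower(word))[[1]], collapse=",")) %>%
  # split that string into one column per synonym
  cSplit("synonym_of_word", ",")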

Sample data:

df <- structure(list(sl_no = 1:5, word = c("get", "Free", "Joshi", 
"Hello", "New")), .Names = c("sl_no", "word"), class = "data.frame", row.names = c(NA, 
-5L))

Upvotes: 0

hpesoj626

Reputation: 3629

Here is another approach with splitstackshape::cSplit.

library(tidyverse)
library(qdap)
library(splitstackshape)

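# `tt` is not defined in this answer; it is assumed to hold the question's table as text
DF <- read.table(text = tt, header = TRUE)
DF <- DF %>% mutate_at(vars(Word), tolower)
syns <- synonyms_frame(synonyms(tolower(DF$Word))) %>%
  # drop everything from the first "." onward so each entry maps back to its word,
  # and keep only the first run of letters of each synonym entry
  mutate(x = str_remove(x, "\\..*"),
         y = str_extract(y, '[:alpha:]+')) %>%
  group_by(x) %>%
  summarise(Syn = toString(y)) %>%
  rename(Word = x) %>% cSplit('Syn')

# joins on the shared Word column
left_join(DF, syns)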

Upvotes: 1

MKR

Reputation: 20095

One approach is to use mapply to pass one word at a time to qdap::synonyms. The result of synonyms can be collapsed into a single column using paste0 with collapse = "|". Then use tidyr::separate to split that column into Syn 1, Syn 2, and so on.

Note: synonyms is called with two extra arguments, return.list = FALSE and multiwords = FALSE.

The code below caps the output at 10 synonyms, but the solution can be extended to handle the number dynamically.

library(tidyverse)
library(qdap)
df %>%
  mutate(Synonyms = mapply(
    function(x) paste0(
      head(synonyms(x, return.list = FALSE, multiwords = FALSE), 10),
      collapse = "|"),
    tolower(.$Word))) %>%
  separate(Synonyms, paste("Syn", 1:10), sep = "\\|", extra = "drop")

Result:

# SlNO  Word    Syn 1         Syn 2       Syn 3        Syn 4   Syn 5     Syn 6       Syn 7           Syn 8     Syn 9      Syn 10
# 1    1   get  achieve       acquire      attain          bag   bring      earn       fetch            gain     glean     inherit
# 2    2  Free buckshee complimentary      gratis   gratuitous  unpaid footloose independent       liberated     loose uncommitted
# 3    3 Joshi                   <NA>        <NA>         <NA>    <NA>      <NA>        <NA>            <NA>      <NA>        <NA>
# 4    4 Hello                   <NA>        <NA>         <NA>    <NA>      <NA>        <NA>            <NA>      <NA>        <NA>
# 5    5   New advanced   all-singing all-dancing contemporary current different       fresh ground-breaking happening      latest

Data

df <- read.table(text = 
"SlNO  Word
1       get
2       Free
3       Joshi
4       Hello
5       New", 
header = TRUE, stringsAsFactors = FALSE)
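
The 10-synonym cap mentioned above could probably be lifted by padding each word's synonyms with NA up to the length of the longest vector. Here is a minimal base R sketch of that idea, assuming the synonyms have already been collected into a named list. The syn_list below is made-up illustrative data, not actual qdap output, and qdap is not called here.

```r
# Hypothetical stand-in for per-word synonym vectors; in practice these
# would be collected from qdap::synonyms(), which is not called here.
syn_list <- list(
  get   = c("obtain", "receive", "acquire"),
  free  = c("independent"),
  joshi = character(0)
)

# Number of Syn columns = length of the longest synonym vector.
# max(..., 1L) keeps at least one column even if no word has synonyms.
n_max <- max(lengths(syn_list), 1L)

# Pad every vector with NA so they all have length n_max.
padded <- lapply(syn_list, function(s) {
  c(s, rep(NA_character_, n_max - length(s)))
})

# Stack the padded vectors into a matrix, one row per word.
syn_mat <- do.call(rbind, padded)
colnames(syn_mat) <- paste0("Syn", seq_len(n_max))

# Bind the words and their synonym columns into one data frame.
result <- data.frame(Word = names(syn_list), syn_mat,
                     row.names = NULL, stringsAsFactors = FALSE)
print(result)
```

With this layout the number of Syn columns follows the data, and a word with no synonyms gets NA in every Syn column.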

Upvotes: 1
