Reputation: 1265
I have created the following data frame DF in R:
Sl NO Word
1 get
2 Free
3 Joshi
4 Hello
5 New
I used the following code to get synonyms, but the result comes back as a list:
library(qdap)
synonyms(DF$Word)
This returns a list of synonyms for the words. I want the synonyms for each word in the data frame appended row-wise to the data frame as separate columns, like this:
DF<-
Sl NO Word Syn1 Syn2
1 get obtain receive
2 Free independent NA
3 Joshi NA NA
4 Hello Greeting NA
5 New Unused Fresh
Is there an elegant way to obtain this? Are there other dictionaries that can be used for this?
Upvotes: 1
Views: 832
Reputation: 11955
I am not sure exactly how you would like to add all the synonyms of a word, because synonyms("get") returns 75 definitions of "get", and I feel the desired layout would not be of much help if the values from all 75 definitions went into a single row.
So in the solution below I have selected only the very first definition.
library(qdap)
library(dplyr)
library(splitstackshape)
df %>%
rowwise() %>%
mutate(synonym_of_word = paste(synonyms(tolower(word))[[1]], collapse=",")) %>%
cSplit("synonym_of_word", ",")
Sample data:
df <- structure(list(sl_no = 1:5, word = c("get", "Free", "Joshi",
"Hello", "New")), .Names = c("sl_no", "word"), class = "data.frame", row.names = c(NA,
-5L))
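To make the "first definition only" step concrete, here is a minimal base R sketch. It assumes, as the [[1]] indexing above implies, that the lookup returns a list with one character vector per definition. The list below is a hypothetical stand-in with made-up names and values, not real qdap output.

```r
# Hypothetical stand-in for the value of synonyms("get"): one character
# vector per definition. Names and contents are illustrative only.
get_syns <- list(
  get.def_1 = c("achieve", "acquire", "attain"),
  get.def_2 = c("catch", "contract"),
  get.def_3 = c("arrest", "capture")
)

# Keep only the first definition, as in the answer above.
first_def <- get_syns[[1]]

# Collapse to one comma-separated string so it fits in a single cell ...
collapsed <- paste(first_def, collapse = ",")

# ... which is what a later split on "," turns back into separate columns.
split_back <- strsplit(collapsed, ",", fixed = TRUE)[[1]]

print(collapsed)
print(split_back)
```

Synonyms that themselves contain a comma would be split in the wrong place, so a less common separator may be safer.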
Upvotes: 0
Reputation: 3629
Here is another approach with splitstackshape::cSplit.
library(tidyverse)
library(qdap)
library(splitstackshape)
DF <- read.table(text = tt, header = TRUE)  # tt: the question's table as text (not defined in this answer)
DF <- DF %>% mutate_at(vars(Word), tolower)
syns <- synonyms_frame(synonyms(tolower(DF$Word))) %>%
mutate_at(vars(x), funs(str_remove(x, "\\..*"))) %>%
mutate_at(vars(y), funs(str_extract(y, '[:alpha:]+'))) %>%
group_by(x) %>%
summarise(Syn = toString(y)) %>%
rename(Word = x) %>% cSplit('Syn')
left_join(DF, syns)
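The group, collapse, and left-join steps can also be tried without qdap. The sketch below uses base R equivalents (sub, aggregate, merge) on a hypothetical long-format table. The column names x and y follow the code above, but the values and the ".def_n" key format are assumptions made for illustration.

```r
# Hypothetical stand-in for a long-format synonym table: one row per
# (definition, synonym) pair. Values are made up for illustration.
syn_long <- data.frame(
  x = c("get.def_1", "get.def_2", "free.def_1", "new.def_1", "new.def_2"),
  y = c("obtain", "receive", "independent", "unused", "fresh"),
  stringsAsFactors = FALSE
)

DF <- data.frame(SlNO = 1:5,
                 Word = c("get", "free", "joshi", "hello", "new"),
                 stringsAsFactors = FALSE)

# Drop everything from the first dot onward so rows are keyed by the bare word.
syn_long$Word <- sub("\\..*", "", syn_long$x)

# One row per word, with its synonyms collapsed into a comma-separated string.
syns <- aggregate(y ~ Word, data = syn_long, FUN = toString)
names(syns)[names(syns) == "y"] <- "Syn"

# all.x = TRUE makes merge() behave like a left join: words without
# synonyms stay in the result with NA in the Syn column.
joined <- merge(DF, syns, by = "Word", all.x = TRUE)
joined <- joined[order(joined$SlNO), ]
print(joined)
```

Because the join is a left join, words such as "joshi" are kept rather than dropped, which matches the layout requested in the question.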
Upvotes: 1
Reputation: 20095
One approach is to use mapply to pass one word at a time to qdap::synonyms. The result from synonyms can be collapsed into a single column using paste0 with collapse = "|". The data is then ready for tidyr::separate, which splits that column into Syn1, Syn2, etc.
Note: synonyms is called with two extra arguments, return.list = FALSE and multiwords = FALSE.
The code below is limited to a maximum of 10 synonyms per word, but the solution can be extended to handle the number dynamically.
library(tidyverse)
library(qdap)
df %>%
mutate(Synonyms =
mapply(function(x)paste0(
head(synonyms(x, return.list = FALSE, multiwords = FALSE),10), collapse = "|"),
tolower(.$Word))) %>%
separate(Synonyms, paste("Syn",1:10), sep = "\\|", extra = "drop" )
Result:
# SlNO Word Syn 1 Syn 2 Syn 3 Syn 4 Syn 5 Syn 6 Syn 7 Syn 8 Syn 9 Syn 10
# 1 1 get achieve acquire attain bag bring earn fetch gain glean inherit
# 2 2 Free buckshee complimentary gratis gratuitous unpaid footloose independent liberated loose uncommitted
# 3 3 Joshi <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 4 Hello <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 5 New advanced all-singing all-dancing contemporary current different fresh ground-breaking happening latest
Data
df <- read.table(text =
"SlNO Word
1 get
2 Free
3 Joshi
4 Hello
5 New",
header = TRUE, stringsAsFactors = FALSE)
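One possible way to handle the number of synonym columns dynamically, sketched in base R: pad every synonym vector with NA up to the length of the longest one, then bind the padded vectors as rows. The syn_list below is a hypothetical stand-in for the per-word lookups, with made-up values, so the example runs without qdap.

```r
# Stand-in for per-word synonym lookups (values are illustrative only).
# In practice each element might come from
# synonyms(word, return.list = FALSE, multiwords = FALSE).
syn_list <- list(
  get   = c("obtain", "receive", "acquire"),
  free  = c("independent"),
  joshi = character(0),   # no synonyms found
  hello = c("greeting"),
  new   = c("unused", "fresh")
)

df <- data.frame(SlNO = 1:5,
                 Word = c("get", "Free", "Joshi", "Hello", "New"),
                 stringsAsFactors = FALSE)

# The longest synonym vector decides how many Syn columns are needed.
max_syn <- max(lengths(syn_list))

# Pad each vector with NA up to max_syn, then stack the vectors as rows.
padded <- do.call(rbind, lapply(syn_list, function(s) {
  c(s, rep(NA_character_, max_syn - length(s)))
}))
colnames(padded) <- paste0("Syn", seq_len(max_syn))

result <- cbind(df, as.data.frame(padded, stringsAsFactors = FALSE))
rownames(result) <- NULL
print(result)
```

The column count now follows the data instead of a fixed limit of 10. If a cap is still wanted, head() can be applied to each vector before padding.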
Upvotes: 1