Reputation: 245
I have replicated the word embeddings code from Ben Schmidt's excellent tutorial on word2vec in R.
With a trained model, the code below takes the first 3000 terms in the model and plots each one by its cosine similarity to "sweet" (x-axis) and "salty" (y-axis).
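library(wordVectors)
library(magrittr)
# Vectors for the two taste words, kept separate rather than averaged
tastes = model[[c("sweet","salty"), average = FALSE]]
# Cosine similarity of the first 3000 terms to each taste word
sweet_and_saltiness = model[1:3000,] %>% cosineSimilarity(tastes)
plot(sweet_and_saltiness, type = 'n')
text(sweet_and_saltiness, labels = rownames(sweet_and_saltiness))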
This works great, but how can I specify the food words I want to plot? Let's say I only care about "salmon" and "tuna" and want to plot those two only?
I tried filtering the matrix (sweet_and_saltiness = sweet_and_saltiness[c("salmon","tuna")]), but it didn't work.
My apologies for not providing a reproducible example; I'm not sure how to do so, since I'm using a trained model.
I found a similar question here on SO but it's for Python, not R.
Edit:
sweet_and_saltiness is a matrix that contains many terms. Below are some of them:
structure(c(0.528436401795892, 1, 0.563471203034216, 0.502205073864983,
0.0589914271300194, -0.0237981616846065, -0.0657883169365425,
-0.0558463233095463, 0.15991116770716, 0.13954111689771, 0.064859364561648,
0.0109053881576116, 0.387838863143423, 0.366834478524629, 0.349148925405899,
0.338632667643554), .Dim = c(8L, 2L), .Dimnames = list(c("very",
"sweet", "red", "rich", "olive_oil", "if_necessary", "until_done",
"15_minutes"), c("sweet", "salty")))
The figures are the plot coordinates for all the terms (red, rich, olive_oil, etc.). My question is: how can I exclude all the other words from the plot and show only the words I'm interested in (assuming those words are in the sweet_and_saltiness matrix)?
Upvotes: 1
Views: 197
Reputation: 76402
An option with ggplot2 graphics could be the following.
ggplot2 graphics expect a data.frame in long format, whereas the data here is a matrix in wide format, so it has to be reshaped before plotting. See this post on how to reshape the data from wide to long format.
library(dplyr)
library(tidyr)
library(ggplot2)

wanted_fish <- c("red", "until_done")

sweet_and_saltiness %>%
  as.data.frame() %>%
  mutate(fish = rownames(.)) %>%
  pivot_longer(-fish) %>%
  filter(fish %in% wanted_fish) %>%
  ggplot(aes(name, value)) +
  geom_text(aes(label = fish)) +
  theme_bw()
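For completeness, here is a minimal base R sketch that keeps the original plot() and text() calls. The attempt in the question returned NA because indexing a matrix with a single character vector treats it as a plain vector and looks the strings up in names(), which a matrix does not have; the row names live in dimnames. Indexing with [rows, , drop = FALSE] selects by row name instead. The names wanted, present and selected are only illustrative, and "red" and "until_done" stand in for "salmon" and "tuna" because those two are not in the posted sample.

```r
# Sample of sweet_and_saltiness, copied from the dput() output in the question
sweet_and_saltiness <- structure(
  c(0.528436401795892, 1, 0.563471203034216, 0.502205073864983,
    0.0589914271300194, -0.0237981616846065, -0.0657883169365425,
    -0.0558463233095463, 0.15991116770716, 0.13954111689771,
    0.064859364561648, 0.0109053881576116, 0.387838863143423,
    0.366834478524629, 0.349148925405899, 0.338632667643554),
  .Dim = c(8L, 2L),
  .Dimnames = list(c("very", "sweet", "red", "rich", "olive_oil",
                     "if_necessary", "until_done", "15_minutes"),
                   c("sweet", "salty")))

# Stand-ins for "salmon" and "tuna" (assumption: swap in your own terms)
wanted <- c("red", "until_done")

# Keep only the wanted terms that actually occur as row names,
# so a missing term does not raise a "subscript out of bounds" error
present <- intersect(wanted, rownames(sweet_and_saltiness))

# Select rows by name; drop = FALSE keeps a matrix even if one row remains
selected <- sweet_and_saltiness[present, , drop = FALSE]

# Same plotting calls as in the question, applied to the subset only
plot(selected, type = "n")
text(selected, labels = rownames(selected))
```

With the real model, replacing wanted with c("salmon", "tuna") should be enough, provided both terms are among the rows of sweet_and_saltiness.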
Upvotes: 1