Reputation: 33
I hope someone can help me.
I have this dataset, called "dialogue":
turn word freq
1 A 0.18181818
1 B 0.13636364
1 C 0.31818182
1 D 0.13636364
1 F 0.13636364
1 G 0.09090909
2 A 0.25000000
2 B 0.10000000
2 C 0.25000000
2 D 0.15000000
2 F 0.10000000
2 G 0.15000000
I want to create a function that plots how the frequency of a single word changes across turns. The function must take a string argument so that anyone who uses it can plot only the word ("A", "B", "C" ...) they want.
I tried to write the function, but it plots all the words at the same time, each in its own panel, using this code:
plot_word_frq <- function(x){
  df_x <- data.frame(dialogue)
  ggplot(dialogue,
         # the frequency column in the data above is called freq
         aes(x = turn, y = freq, colour = word)) +
    ggtitle("Change of Word Frequency") +
    theme(plot.title = element_text(hjust = 0.5)) +
    theme_bw() +
    geom_point() +
    labs(y = "Percentage of words") +
    facet_wrap(~ word) +
    scale_x_continuous(limits = c(0.5, 2.5)) +
    scale_y_continuous(label = scales::percent) +
    theme(legend.position = "none")
}
plot_word_frq(dialogue)
But what I want is to select a single word to plot through an argument. For example, the function should work like this:
plot_word_frq(data=dialogue, word="B")
And it should automatically return only the plot of the word "B". How can I do this? Also, if I want to use another similar dataset which has NA values in the column "word", how can I remove the NAs inside my function?
Sorry for my bad English, I hope I was clear. Thank you.
Upvotes: 0
Views: 158
Reputation: 12440
The only thing you need to change is to subset your data before plotting. You can use base R's data[data$word %in% word, ] for this or, if you prefer, dplyr's filter function. I'm using base R here because your column and your function argument are both called word, which causes trouble inside filter:
library(ggplot2)

plot_word_frq <- function(data, word) {
  # Keep only the rows for the requested word(s)
  ggplot(data[data$word %in% word, ],
         aes(x = turn, y = freq, colour = word)) +
    ggtitle("Change of Word Frequency") +
    # theme_bw() replaces the whole theme, so it must come before theme() tweaks
    theme_bw() +
    theme(plot.title = element_text(hjust = 0.5)) +
    geom_point() +
    labs(y = "Percentage of words") +
    facet_wrap(~ word) +
    scale_x_continuous(limits = c(0.5, 2.5)) +
    scale_y_continuous(labels = scales::percent) +
    theme(legend.position = "none")
}
plot_word_frq(data = dialogue, word = "B")
plot_word_frq(data = dialogue, word = c("B", "G"))
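If you would rather go the dplyr route, the name clash can be worked around with the .data and .env pronouns that dplyr makes available inside filter. This is only a sketch: filter_words is a made-up helper name, and the data frame is rebuilt from the table in the question. Without the pronouns, filter(data, word %in% word) compares the column with itself and keeps every row.

```r
library(dplyr)

# Example data copied from the table in the question
dialogue <- data.frame(
  turn = rep(1:2, each = 6),
  word = rep(c("A", "B", "C", "D", "F", "G"), times = 2),
  freq = c(0.18181818, 0.13636364, 0.31818182, 0.13636364, 0.13636364, 0.09090909,
           0.25, 0.10, 0.25, 0.15, 0.10, 0.15)
)

# filter_words() is a hypothetical helper, not a dplyr function.
# .data$word refers to the column, .env$word to the function argument.
filter_words <- function(data, word) {
  filter(data, .data$word %in% .env$word)
}

filter_words(dialogue, "B")
filter_words(dialogue, c("B", "G"))
```

The result of filter_words() could then be passed to ggplot() in place of data[data$word %in% word, ].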
To differentiate the plots, you could use the title. Just replace the title line with ggtitle(paste0("Change of Word Frequency (words: ", toString(word), ")")) +
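About the NA part of the question: subsetting with %in% should already take care of it, because %in% returns FALSE (never NA) for a missing word, so those rows are dropped along with the other non-matching words. Here is a small base R sketch of that behaviour. The dialogue_na data frame and its values are invented for illustration, not taken from your data.

```r
# Toy data in the same shape as the question's, with one missing word
dialogue_na <- data.frame(
  turn = c(1, 1, 2, 2, 2),
  word = c("A", "B", "A", "B", NA),
  freq = c(0.6, 0.4, 0.5, 0.3, 0.2)
)

# %in% never returns NA: a missing word simply does not match
keep <- dialogue_na$word %in% "B"
dialogue_na[keep, ]

# By contrast, == propagates NA, which adds an all-NA row to the subset
dialogue_na[dialogue_na$word == "B", ]

# Dropping rows with a missing word explicitly, if you want to be safe
clean <- dialogue_na[!is.na(dialogue_na$word), ]
clean
```

If you prefer to make the intent explicit, the !is.na() line can be applied to data at the top of the function before the subsetting step.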
Upvotes: 1