UzbeKistaN

Reputation: 33

Create an R function that takes a character element of a vector as an input argument

I hope someone can help me.

I have this dataset, called "dialogue":

turn  word     freq
 1     A   0.18181818
 1     B   0.13636364
 1     C   0.31818182
 1     D   0.13636364
 1     F   0.13636364
 1     G   0.09090909
 2     A   0.25000000
 2     B   0.10000000
 2     C   0.25000000
 2     D   0.15000000
 2     F   0.10000000
 2     G   0.15000000

I want to create a function that plots the change in a single word's frequency across turns. The function must take a string argument so that anyone using it can plot only the word ("A", "B", "C", ...) they want.

I tried to write the function, but it plots all the words at the same time, in separate panels, using this code:

plot_word_frq <- function(x){

  df_x <- data.frame(dialogue)


  ggplot(dialogue,
         aes(x = turn, y = freq, colour = word)) +
    ggtitle("Change of Word Frequency") +
    theme(plot.title = element_text(hjust = 0.5)) +
    theme_bw() +
    geom_point() +
    labs(y = "Percentage of words") +
    facet_wrap(~ word) +
    scale_x_continuous(limits = c(0.5, 2.5)) +
    scale_y_continuous(label = scales::percent) + 
    theme(legend.position = "none")    
  }



plot_word_frq(dialogue)

But what I want is to be able to select a single word to plot through an argument. For example, the function should work like this:

plot_word_frq(data=dialogue, word="B")

It should then return only the plot for the word "B". How can I do this? Also, if I use a similar dataset that has NA values in the "word" column, how can I remove them inside my function?

Sorry for my bad English; I hope I was clear. Thank you.

Upvotes: 0

Views: 158

Answers (1)

JBGruber

Reputation: 12440

The only change needed is to subset the data before plotting. You can use base R for this, data[data$word %in% word, ], or dplyr's filter function if you prefer. I'm using base R here because both your column and the function argument are called word, which causes trouble in filter, where both names would resolve to the column:

library(ggplot2)

plot_word_frq <- function(data, word) {

  # Keep only the rows for the requested word(s)
  ggplot(data[data$word %in% word, ],
         aes(x = turn, y = freq, colour = word)) +
    ggtitle("Change of Word Frequency") +
    theme_bw() +
    # Must come after theme_bw(), which would otherwise reset the title alignment
    theme(plot.title = element_text(hjust = 0.5)) +
    geom_point() +
    labs(y = "Percentage of words") +
    facet_wrap(~ word) +
    scale_x_continuous(limits = c(0.5, 2.5)) +
    scale_y_continuous(labels = scales::percent) +
    theme(legend.position = "none")
}


plot_word_frq(data = dialogue, word = "B")

plot_word_frq(data = dialogue, word = c("B", "G"))
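
On the NA part of the question: the %in% subset used in the function should already take care of this, because %in% returns FALSE rather than NA for a missing word. The sketch below illustrates the difference from == on a small made-up data frame (dialogue_na and its values are assumptions for illustration, not your real data):

```r
# Toy data in the same shape as `dialogue`, with a missing word in each turn
dialogue_na <- data.frame(
  turn = c(1, 1, 1, 2, 2, 2),
  word = c("A", "B", NA, "A", "B", NA),
  freq = c(0.5, 0.3, 0.2, 0.6, 0.3, 0.1)
)

# `%in%` always returns TRUE or FALSE, so rows with a missing word are dropped
by_in <- dialogue_na[dialogue_na$word %in% "B", ]

# `==` returns NA for a missing word, and indexing with NA yields all-NA rows
by_eq <- dialogue_na[dialogue_na$word == "B", ]

# Optional explicit cleanup, if the NA rows should go before anything else happens
clean <- dialogue_na[!is.na(dialogue_na$word), ]

nrow(by_in)  # 2
nrow(by_eq)  # 4 (two real rows plus two all-NA rows)
nrow(clean)  # 4
```

If you want the removal to be explicit, the !is.na() line could go at the top of the function body before the ggplot call.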

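To tell plots for different words apart, you could put the selected words in the title. Just replace the title line with ggtitle(paste0("Change of Word Frequency (words: ", toString(word), ")")) +, which uses paste0 so that no stray spaces end up inside the parentheses.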
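
If you would rather use dplyr for the subsetting step, the name clash between the word column and the word argument can be resolved with the .data and .env pronouns. A minimal sketch, assuming dplyr is installed; subset_words is a hypothetical helper name and toy is made-up data:

```r
# Hypothetical helper: returns the rows of `data` whose word column matches `word`
subset_words <- function(data, word) {
  # .data$word is the column, .env$word is the function argument
  dplyr::filter(data, .data$word %in% .env$word)
}

# Toy data in the same shape as `dialogue` (values made up for illustration)
toy <- data.frame(
  turn = c(1, 1, 2, 2),
  word = c("A", "B", "A", "B"),
  freq = c(0.6, 0.4, 0.7, 0.3)
)

b_rows <- subset_words(toy, "B")
```

The result could then be passed to ggplot() in place of data[data$word %in% word, ].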

Upvotes: 1
