Lisa

Reputation: 959

Reformatting data in R for a line graph

I have data (part of a larger set) that looks like the example below. I can't have more than one row per vowel because of the way the entire data frame is set up.

info.df <- data.frame(
    vowelFormantF2_90 = c(1117, 1433, 2392), 
    vowelFormantF3_90 = c(2820, 3062, 2670), 
    vowelFormantF2_50 = c(1016, 1313, 2241),
    vowelFormantF3_50 = c(2842, 3150, 3038),
    previousVowel = c("U", "U", "ae"))

The 50 and 90 correspond to time (the 50% point of the duration of the vowel comes before the 90% point of the duration of the vowel).

I want to plot time on the x-axis and the formant value (the four-digit number) on the y-axis, with lines colored by whether the column name contains F2 or F3. The previousVowel column is also important because eventually I'll want to subset my data by vowel. I planned on using ggplot2, but I'm open to other plotting methods.

I thought about doing something like this:

time <- c(50,50,50,50,50,50)
formant <- c("F2","F2","F2","F3","F3","F3")
hz <- c(info.df$vowelFormantF2_50, info.df$vowelFormantF3_50)
newdataframe.df <- data.frame(time, formant, hz)

But this will get cumbersome as the data set grows, and it doesn't keep track of the vowel itself.

Is there a way to format this data in the way I want?

Upvotes: 0

Views: 66

Answers (1)

jeremycg

Reputation: 24945

I'd use tidyr:

library(tidyr)
df <- info.df %>% gather(var, val, -previousVowel) %>%
            separate(var, into = c("formant", "time"))

which will give (first six of twelve rows shown):

   previousVowel        formant time  val
1              U vowelFormantF2   90 1117
2              U vowelFormantF2   90 1433
3             ae vowelFormantF2   90 2392
4              U vowelFormantF3   90 2820
5              U vowelFormantF3   90 3062
6             ae vowelFormantF3   90 2670

You can then add:

library(dplyr)
df <- df %>% mutate(formant = sub("vowelFormant", "", formant))

to strip the vowelFormant prefix and leave just F2, F3, etc.
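To get from the long data to the line graph the question asks for, something along these lines might work. This is only a sketch: it swaps in tidyr's pivot_longer() (the newer replacement for gather()) to do the reshape and prefix removal in one step, and it adds a hypothetical token column (not in the original data) so that each original row's 50% and 90% points can be joined into one line per formant. The names long.df, token and hz are assumptions for illustration.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Sample data from the question
info.df <- data.frame(
    vowelFormantF2_90 = c(1117, 1433, 2392),
    vowelFormantF3_90 = c(2820, 3062, 2670),
    vowelFormantF2_50 = c(1016, 1313, 2241),
    vowelFormantF3_50 = c(2842, 3150, 3038),
    previousVowel = c("U", "U", "ae"))

long.df <- info.df %>%
    # Hypothetical id, one per original row, so points can be connected
    mutate(token = row_number()) %>%
    pivot_longer(
        cols = starts_with("vowelFormant"),
        names_prefix = "vowelFormant",      # drop the shared prefix
        names_to = c("formant", "time"),    # "F2_90" -> formant "F2", time "90"
        names_sep = "_",
        values_to = "hz") %>%
    # time comes out as character; make it numeric for the x-axis
    mutate(time = as.integer(time))

# One line per token and formant, colored by formant
p <- ggplot(long.df, aes(x = time, y = hz, colour = formant,
                         group = interaction(token, formant))) +
    geom_line() +
    geom_point()
print(p)
```

Since previousVowel is carried along in the long data, subsetting by vowel later could be as simple as filter(long.df, previousVowel == "U") before plotting, or adding facet_wrap(~ previousVowel) to the plot.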

Upvotes: 1
