Ash

Reputation: 3

Renaming data in R

I am fairly new to R and am trying to make some figures, but having trouble with renaming data. Basically, I had a super large data set from SPSS that I imported into R and created a smaller data table with one variable I am trying to look at. I was successful in getting my data into the long format, but my Time column is not represented the way I'd like.

When I got my data into the long format, I made a Time column, and the values in that column are TIME1COMPOSITE, TIME2COMPOSITE, and TIME3COMPOSITE, which are the original column names from the SPSS file. I would prefer it to read Time1, Time2, or Time3 instead (so that it looks better on the axis of the graph I am making). Is there a simple way to do this, either by renaming the data values or by just renaming the labels on the graph?

Here is an example of what my code looks like:

library(data.table)
library(ggplot2)

dt <- data.table(dt)

#Putting into long format

dt <- melt(dt, measure.vars = c("TIME1COMPOSITE", "TIME2COMPOSITE", "TIME3COMPOSITE"), variable.name = "Time", value.name = "CompositeScore")

#Computing means

dt[, meanCompositeScore:= mean(CompositeScore), by=c("Condition", "Time")]

#Plotting

plot <- ggplot(dt, aes(x=Time, y=meanCompositeScore, color=Condition)) + geom_point()

plot

Upvotes: 0

Views: 2983

Answers (1)

Joel Kandiah

Reputation: 1525

With the code you have, the easiest method would be to rename the columns at the beginning, before melting, using the colnames() function. If you do this, remember to pass the new names to measure.vars in melt().

colnames(dt) <- c("colname1","colname2", ...)

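Another method, in the tidyverse style, would be to use the rename() function from dplyr. As with colnames(), this has to happen before melt(), and the result has to be assigned back to dt.

library(dplyr)

dt <- dt %>%
   rename(Time1 = TIME1COMPOSITE, Time2 = TIME2COMPOSITE, Time3 = TIME3COMPOSITE)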
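For a data.table, the rename-before-melt idea can be sketched with setnames(), which renames by reference. This is only an illustration: the toy data below is hypothetical and stands in for the SPSS import, and old_names, new_names, and long are names made up for the example.

```r
library(data.table)

# Hypothetical toy data standing in for the imported SPSS file
dt <- data.table(
  Condition = c("A", "A", "B", "B"),
  TIME1COMPOSITE = c(1, 2, 3, 4),
  TIME2COMPOSITE = c(2, 3, 4, 5),
  TIME3COMPOSITE = c(3, 4, 5, 6)
)

old_names <- c("TIME1COMPOSITE", "TIME2COMPOSITE", "TIME3COMPOSITE")
new_names <- c("Time1", "Time2", "Time3")

# Rename the columns in place, then melt using the new names
setnames(dt, old_names, new_names)
long <- melt(dt, measure.vars = new_names,
             variable.name = "Time", value.name = "CompositeScore")

levels(long$Time)
```

The Time column then carries the short names from the start, so nothing needs relabelling later.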

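To change the names after the data has been melted, you can relabel the levels of the Time factor. melt() already returns Time as a factor by default, so the as.factor() call is only a safeguard. The revalue() function comes from the plyr package; it expects the old values as the names and the new values as the values, and its result has to be assigned back to the column.

dt$Time <- as.factor(dt$Time)

dt$Time <- plyr::revalue(dt$Time, c("TIME1COMPOSITE" = "Time1", "TIME2COMPOSITE" = "Time2", "TIME3COMPOSITE" = "Time3"))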
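If you would rather not load another package, the factor levels can also be rewritten in base R. This is an alternative to revalue(), not part of the code above, and the time vector below is a hypothetical stand-in for dt$Time after melting.

```r
# Hypothetical Time column as it might look after melt()
time <- factor(c("TIME1COMPOSITE", "TIME2COMPOSITE",
                 "TIME3COMPOSITE", "TIME1COMPOSITE"))

# Rewrite only the level names; the rows themselves are not reordered
levels(time) <- sub("^TIME([0-9]+)COMPOSITE$", "Time\\1", levels(time))

time
```

Because the pattern captures the number, the same line would also handle further time points without listing each one.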

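To relabel inside the plotting call instead, convert Time with factor() in the aes() mapping, giving the original values as levels and the new names as labels. Note that as.factor() does not accept a levels argument, so factor() is needed here.

time_levels <- c("TIME1COMPOSITE", "TIME2COMPOSITE", "TIME3COMPOSITE")
time_labels <- c("Time1", "Time2", "Time3")

plot <- ggplot(dt, aes(x=factor(Time, levels = time_levels, labels = time_labels), y=meanCompositeScore, color=Condition)) + geom_point() + xlab("Time")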

A final method would be to change only the axis tick labels, leaving the data values as they are, using the scale_x_discrete() function from ggplot2. The labels are applied in the order of the factor levels.

plot <- ggplot(dt, aes(x=Time, y=meanCompositeScore, color=Condition)) + 
  geom_point() +
  scale_x_discrete(labels = c('Time1','Time2','Time3'))
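A slightly more defensive variant is to pass scale_x_discrete() a named vector, so each label is matched to its value by name rather than by position. The sketch below uses a hypothetical three-row summary table (df) and made-up names (time_labels, p) purely for illustration.

```r
library(ggplot2)

# Hypothetical summary data standing in for the melted table
df <- data.frame(
  Time = factor(c("TIME1COMPOSITE", "TIME2COMPOSITE", "TIME3COMPOSITE")),
  meanCompositeScore = c(1.5, 2.5, 3.5),
  Condition = "A"
)

# Names are the values in the data, values are the labels to display
time_labels <- c(TIME1COMPOSITE = "Time1",
                 TIME2COMPOSITE = "Time2",
                 TIME3COMPOSITE = "Time3")

p <- ggplot(df, aes(x = Time, y = meanCompositeScore, color = Condition)) +
  geom_point() +
  scale_x_discrete(labels = time_labels)
```

With a named vector, the labels stay attached to the right values even if the level order changes.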

Let me know if any method doesn't work for you and I will attempt to clarify the method or rectify the mistake.

Upvotes: 1
