Lost

Reputation: 351

Using unique() when naming dataframe columns

I am having trouble naming the columns of a data frame I reshaped. With reshape alone I get the wrong column titles, so I tried to set them myself, but I cannot get the right names in the right spots.

df<-data.frame(color=rep(c("red", "blue", "green"), 10), letter=c("a", "b", "c", "d", "e", "b", "c", "d", "e", "f", "c", "d", "e", "f", "g", "d", "e", "f", "g", "h", "e", "b", "c", "d", "e", "f", "c", "d", "e", "f"))
b<-as.data.frame(table(df))
c<-reshape(b, direction="wide", idvar="color", timevar="letter")

  color Freq.a Freq.b Freq.c Freq.d Freq.e Freq.f Freq.g Freq.h
1  blue      0      1      2      1      3      2      0      1
2 green      0      1      2      2      2      2      1      0
3   red      1      1      1      3      2      1      1      0

To get rid of the "Freq." prefix, I used names(), but the column names came out as numbers instead of the letters. This happens whatever name I give the first column.

names(c)<-c("color", unique(b$letter))
  color 1 2 3 4 5 6 7 8
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0

I also tried unique() on its own, without concatenating a name for the first column. The correct letters then show up as column names, but they are shifted one column to the left. How can I get the right unique values over the correct columns?

names(c)<-unique(b$letter)

      a b c d e f g h NA
1  blue 0 1 2 1 3 2 0  1
2 green 0 1 2 2 2 2 1  0
3   red 1 1 1 3 2 1 1  0

Upvotes: 0

Views: 56

Answers (2)

acylam

Reputation: 18661

Your b$letter column is a factor, so unique(b$letter) is a factor too. When you concatenate it with a character string, R implicitly coerces the factor's underlying integer codes (not its level labels) to character, which is why you get numbers.
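The coercion can be reproduced in isolation. This is a minimal sketch with a hypothetical three-level factor `f` standing in for unique(b$letter); the names `f`, `wrong` and `right` are only for illustration.

```r
# Hypothetical toy factor, standing in for unique(b$letter)
f <- factor(c("a", "b", "c"))

# c() dispatches on its first argument (a character string here), so the
# factor is reduced to its underlying integer codes, not its labels.
wrong <- c("color", f)

# Converting to character first keeps the labels.
right <- c("color", as.character(f))

print(wrong)  # "color" "1" "2" "3"
print(right)  # "color" "a" "b" "c"
```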

df <- data.frame(color=rep(c("red", "blue", "green"), 10), 
                 letter=c("a", "b", "c", "d", "e", 
                          "b", "c", "d", "e", "f", 
                          "c", "d", "e", "f", "g", 
                          "d", "e", "f", "g", "h", 
                          "e", "b", "c", "d", "e", 
                          "f", "c", "d", "e", "f"))

b <- as.data.frame(table(df))
c <- reshape(b, direction="wide", idvar="color", timevar="letter")

You can easily verify this by comparing the following:

> unique(b$letter)
[1] a b c d e f g h
Levels: a b c d e f g h

> class(unique(b$letter))
[1] "factor"

> as.character(unique(b$letter))
[1] "a" "b" "c" "d" "e" "f" "g" "h"

> class(as.character(unique(b$letter)))
[1] "character"

To solve this, convert the factor to character before concatenating, as in the as.character() version above:

names(c) <- c("color", as.character(unique(b$letter)))

Alternatively, you can also use sub to remove "Freq." from names(c) (which IMO is a safer and easier approach):

names(c) <- sub('^Freq\\.', '', names(c))

Result:

  color a b c d e f g h
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0
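Both fixes can be checked end to end. The sketch below rebuilds the question's data and compares the two sets of names; `wide`, `by_unique` and `by_sub` are illustrative names, not from the original post (`wide` is used instead of `c` to avoid shadowing the c() function). The sub() route derives each name from the column it already sits on, whereas the unique() route relies on the order of unique(b$letter) matching the column order, which happens to hold here.

```r
# Rebuild the question's data (letters copied from the question)
df <- data.frame(
  color  = rep(c("red", "blue", "green"), 10),
  letter = c("a", "b", "c", "d", "e", "b", "c", "d", "e", "f",
             "c", "d", "e", "f", "g", "d", "e", "f", "g", "h",
             "e", "b", "c", "d", "e", "f", "c", "d", "e", "f")
)

# Long frequency table, then one column per letter
b <- as.data.frame(table(df))
wide <- reshape(b, direction = "wide", idvar = "color", timevar = "letter")

# Fix 1: build the names from the factor, converted to character first
by_unique <- c("color", as.character(unique(b$letter)))

# Fix 2: strip the "Freq." prefix from the existing names
by_sub <- sub("^Freq\\.", "", names(wide))

# Check that the two approaches agree for this data
print(identical(by_unique, by_sub))

names(wide) <- by_sub
print(wide)
```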

Upvotes: 1

mysteRious

Reputation: 4294

Is this what you mean?

> setNames(reshape(b, timevar="letter", idvar="color", direction="wide"), 
      c("Name", unique(b$letter)))
   Name 1 2 3 4 5 6 7 8
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0

Upvotes: 1
