Reputation:
I want to convert my data from long into wide format. If find it really annoying that reshape
changes the column labels, is there any way to avoid that?
For example if I would have data in long format like this:
Year Name Value
1996 a 1
1997 a 2
1998 b 3
1999 b 4
I would use reshape()
like this
reshape(long, timevar = "Year", idvar = "Name", direction = "wide")
The conversion itself works fine, but the column names would look like these:
Name Time.1996 Time.1997 Time.1998 Time.1999
The only thing I can think of right now is to rename the column names manually, which is really annoying if you have to do it for multiple data sets.
Upvotes: 10
Views: 9044
Reputation: 193547
I don't see a question here, but if it bothers you that much, there are a few alternatives.
In base R, you can use xtabs
(a lot of the time) for this type of conversion.
as.data.frame.matrix(xtabs(Value ~ Name + Year, long))
# 1996 1997 1998 1999
# a 1 2 0 0
# b 0 0 3 4
Or, resorting to packages, you can use dcast
from "reshape2".
library(reshape2)
dcast(long, Name ~ Year)
# Using Value as value column: use value.var to override.
# Name 1996 1997 1998 1999
# 1 a 1 2 NA NA
# 2 b NA NA 3 4
I do agree that with a single variable being reshaped in this manner, it can be annoying to have the names modified the way they are. However, keep in mind other typical use-cases of the reshape
function, and you'll see it can be quite helpful. For example, let's add another column to your "long" dataset and then reshape:
> long$Something <- 5:8
> reshape(long, timevar = "Year", idvar = "Name", direction = "wide")
Name Value.1996 Something.1996 Value.1997 Something.1997 Value.1998
1 a 1 5 2 6 NA
3 b NA NA NA NA 3
Something.1998 Value.1999 Something.1999
1 NA NA NA
3 7 4 8
Without retaining the name of the column that is being reshaped, you wouldn't know which 1996 is for "Value" and which is for "Something".
Finally, if you wanted to use reshape
(which I happen to like), and your actual data are accurately reflected by this example (just one column that needs to be "pivoted" by one or more columns), here are two more options:
Use setNames
. It's easy to predict what the names should be: a vector of your ID variables, followed by the time variable.
> setNames(reshape(long, timevar = "Year", idvar = "Name", direction = "wide"),
+ c("Name", long$Year))
Name 1996 1997 1998 1999
1 a 1 2 NA NA
3 b NA NA 3 4
Use sub
or gsub
to edit the variable names. Again, since we know we just have one variable that was pivoted, it is easy to predict the change that needs to be made.
> wide <- reshape(long, timevar = "Year", idvar = "Name", direction = "wide")
> names(wide)
[1] "Name" "Value.1996" "Value.1997" "Value.1998" "Value.1999"
> names(wide) <- gsub("Value.", "", names(wide))
> wide
Name 1996 1997 1998 1999
1 a 1 2 NA NA
3 b NA NA 3 4
Upvotes: 23