Jd Baba
Jd Baba

Reputation: 6118

Change data in wide form to long form in R

Hi I have a data as follows:

  SN       X1      Y1       X2      Y2       X3      Y3       X4      Y4
1  1 512398.4 3001847 512397.4 3002770 513305.4 3002771 513306.4 3001848
2  2 513306.4 3001848 513305.4 3002771 514213.4 3002772 514214.4 3001849
3  3 514214.4 3001849 514213.4 3002772 515121.5 3002773 515123.2 3001850
4  4 515123.2 3001850 515121.5 3002773 516030.1 3002774 516031.4 3001852
5  5 516031.4 3001852 516030.1 3002774 516938.4 3002776 516939.7 3001853
6  6 516939.7 3001853 516938.4 3002776 517846.6 3002777 517848.2 3001854

I want to change this data such that the output would look like this:

512398.4 3001847
512397.4 3002770
513305.4 3002771
513306.4 3001848
................
517848.2 3001854

I could use the reshape2 package to change the columns to variables and then values on one column but not sure how to do that automatically as I have thousands of rows.

dput of data is as follows:

structure(list(SN = 1:6, X1 = c(512398.4, 513306.4, 514214.4, 
515123.2, 516031.4, 516939.7), Y1 = c(3001846.8, 3001847.8, 3001848.8, 
3001850, 3001851.5, 3001852.8), X2 = c(512397.4, 513305.4, 514213.4, 
515121.5, 516030.1, 516938.4), Y2 = c(3002769.8, 3002770.8, 3002771.8, 
3002772.5, 3002774, 3002775.5), X3 = c(513305.4, 514213.4, 515121.5, 
516030.1, 516938.4, 517846.6), Y3 = c(3002770.8, 3002771.8, 3002772.5, 
3002774, 3002775.5, 3002776.8), X4 = c(513306.4, 514214.4, 515123.2, 
516031.4, 516939.7, 517848.2), Y4 = c(3001847.8, 3001848.8, 3001850, 
3001851.5, 3001852.8, 3001853.8)), .Names = c("SN", "X1", "Y1", 
"X2", "Y2", "X3", "Y3", "X4", "Y4"), row.names = c(NA, 6L), class = "data.frame")

Upvotes: 1

Views: 75

Answers (2)

akrun
akrun

Reputation: 887058

You could also do this with melt from the devel version of data.table (v1.9.5) by specifying the column index in the measure.vars.

library(data.table)
melt(setDT(mydf), measure.vars=list(grep('X', names(mydf)), 
  grep('Y', names(mydf))), value.name=c('X', 'Y'), variable.name='Var')
#    SN Var        X       Y
# 1:  1   1 512398.4 3001847
# 2:  2   1 513306.4 3001848
# 3:  3   1 514214.4 3001849
# 4:  4   1 515123.2 3001850
# 5:  5   1 516031.4 3001852
# 6:  6   1 516939.7 3001853
# 7:  1   2 512397.4 3002770
# 8:  2   2 513305.4 3002771
# 9:  3   2 514213.4 3002772
#10:  4   2 515121.5 3002773
#11:  5   2 516030.1 3002774
#12:  6   2 516938.4 3002776
#13:  1   3 513305.4 3002771
#14:  2   3 514213.4 3002772
#15:  3   3 515121.5 3002773
#16:  4   3 516030.1 3002774
#17:  5   3 516938.4 3002776
#18:  6   3 517846.6 3002777
#19:  1   4 513306.4 3001848
#20:  2   4 514214.4 3001849
#21:  3   4 515123.2 3001850
#22:  4   4 516031.4 3001852
#23:  5   4 516939.7 3001853
#24:  6   4 517848.2 3001854

Upvotes: 3

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

You can try merged.stack from my "splitstackshape" package:

library(splitstackshape)
merged.stack(mydf, var.stubs = c("X", "Y"), sep = "var.stubs")
#     SN .time_1        X       Y
#  1:  1       1 512398.4 3001847
#  2:  1       2 512397.4 3002770
#  3:  1       3 513305.4 3002771
#  4:  1       4 513306.4 3001848
#  5:  2       1 513306.4 3001848
#  6:  2       2 513305.4 3002771
#  7:  2       3 514213.4 3002772
#  8:  2       4 514214.4 3001849
#  9:  3       1 514214.4 3001849
# 10:  3       2 514213.4 3002772
# 11:  3       3 515121.5 3002772
# 12:  3       4 515123.2 3001850
# 13:  4       1 515123.2 3001850
# 14:  4       2 515121.5 3002772
# 15:  4       3 516030.1 3002774
# 16:  4       4 516031.4 3001852
# 17:  5       1 516031.4 3001852
# 18:  5       2 516030.1 3002774
# 19:  5       3 516938.4 3002776
# 20:  5       4 516939.7 3001853
# 21:  6       1 516939.7 3001853
# 22:  6       2 516938.4 3002776
# 23:  6       3 517846.6 3002777
# 24:  6       4 517848.2 3001854
#     SN .time_1        X       Y

The base R approach would be to use reshape, like this:

reshape(mydf, direction = "long", idvar = "SN", varying = 2:ncol(mydf), sep = "")

Upvotes: 3

Related Questions