Reputation: 45
My data looks a bit like:
dummy.from <- data.frame(SetID = rep(c(104:109), times=4), Name = rep(c("A1", "A2", "A3", "A4"), each=6), Value=sample(c(1:100,0.5), 24) )
So:
SetID Name Value
1 104 A1 82
2 105 A1 79
3 106 A1 54
4 107 A1 87
5 108 A1 62
6 109 A1 28
7 104 A2 37
8 105 A2 72
9 106 A2 100
10 107 A2 64
11 108 A2 14
...
Basically, what I want to do is to transfer part of the data to another data frame, based on another value (not shown) calculated separately for each SetID.
For that I use a for loop like:
dummy.to <- data.frame(SetID=numeric(0), Name=character(0), value=numeric(0), stringsAsFactors=FALSE)
for(i in 104:109){
dummy.to[(nrow(dummy.to)+1):(nrow(dummy.to)+4),] <- dummy.from[dummy.from$SetID==i,]
}
The problem I encounter is that, although the latter part of the code on its own (dummy.from[dummy.from$SetID==i,]) looks exactly the way I want it to be stored, when I then look at dummy.to, the Name column has for some reason been converted to numbers like this:
> dummy.to
SetID Name value
1 104 1 82
7 104 2 37
13 104 3 52
19 104 4 73
2 105 1 79
8 105 2 72
14 105 3 91
....
Strangely, though, when looking at the structure (str(dummy.to)), the Name column is still of type character.
I'm really confused about this, as I'd like my names to show up as they were in the original data.frame.
I know I could just create a loop to change the names back, but I'm wondering if there's something that I'm overlooking in the code which could be causing this behaviour.
Any advice would be much appreciated!
Upvotes: 0
Views: 217
Reputation: 28441
Your for loop is sorting the data frame by the "SetID" column. There is a function for that called order:
dummy.from[order(dummy.from$SetID),]
Or, using the devel version of data.table, you can order your data by reference. Link here: Installation: data.table
library(data.table) ## v 1.9.5+
setorder(dummy.from, SetID)
Upvotes: 1
Reputation: 319
I'm not sure I understood what you need to do, but it seems, in the first place, that you don't need any for loop. Instead, to obtain the result you need:
dummy.to <- dummy.from[dummy.from$SetID %in% 104:109,]
The problem you mentioned about the types arises because the Name column in dummy.from is not character but a factor, which is stored internally as integer codes.
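To illustrate the point about factors, here is a minimal sketch of what presumably happens to each Name value inside the loop. The vector names f, x and y are hypothetical, and factor() is called explicitly so the behaviour does not depend on the R version's default for strings:

```r
# A factor stores integer codes plus a table of labels (levels)
f <- factor(c("A1", "A2", "A3", "A4"))

# Assigning the factor into elements of a character vector copies the
# integer codes, not the labels, and coerces those codes to strings
x <- character(0)
x[1:4] <- f
x  # "1" "2" "3" "4"

# Converting explicitly first keeps the labels
y <- character(0)
y[1:4] <- as.character(f)
y  # "A1" "A2" "A3" "A4"
```

This would also explain why str() still reports the column as character: the codes have been stored as the strings "1", "2" and so on.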
Upvotes: 0
Reputation: 1569
data.frame automatically converts strings to factors (this is the default in R versions before 4.0.0). You want to change that.
dummy.from <- data.frame(SetID = rep(c(104:109), times=4), Name = rep(c("A1", "A2", "A3", "A4"), each=6), Value=sample(c(1:100,0.5), 24) )
str(dummy.from)
'data.frame': 24 obs. of 3 variables:
$ SetID: int 104 105 106 107 108 109 104 105 106 107 ...
$ Name : Factor w/ 4 levels "A1","A2","A3",..: 1 1 1 1 1 1 2 2 2 2 ...
$ Value: num 37 9 69 38 93 71 91 34 86 51 ...
Here's what you want:
dummy.from <- data.frame(SetID = rep(c(104:109), times=4), Name = rep(c("A1", "A2", "A3", "A4"), each=6), Value=sample(c(1:100,0.5), 24), stringsAsFactors = FALSE) # your desired output just requires stringsAsFactors = FALSE
str(dummy.from)
'data.frame': 24 obs. of 3 variables:
$ SetID: int 104 105 106 107 108 109 104 105 106 107 ...
$ Name : chr "A1" "A1" "A1" "A1" ...
$ Value: num 80 46 61 52 38 9 7 59 15 56 ...
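If the data frame already exists and cannot easily be recreated, converting the column in place should give the same effect. The following is only a sketch under some assumptions: Name is built explicitly as a factor to mimic the situation in the question, Value uses placeholder numbers instead of sample(), and the third column of dummy.to is named Value to match dummy.from:

```r
# Mimic the question's data, with Name explicitly stored as a factor
dummy.from <- data.frame(
  SetID = rep(104:109, times = 4),
  Name  = factor(rep(c("A1", "A2", "A3", "A4"), each = 6)),
  Value = seq_len(24)  # placeholder values, not the original sample()
)

# Convert the factor column to character before copying rows
dummy.from$Name <- as.character(dummy.from$Name)

# Empty target data frame with matching column types
dummy.to <- data.frame(SetID = numeric(0), Name = character(0),
                       Value = numeric(0), stringsAsFactors = FALSE)

# Same loop as in the question, but sized by the number of matching rows
for (i in 104:109) {
  rows <- dummy.from[dummy.from$SetID == i, ]
  dummy.to[(nrow(dummy.to) + 1):(nrow(dummy.to) + nrow(rows)), ] <- rows
}

head(dummy.to$Name, 4)  # "A1" "A2" "A3" "A4"
```

Using nrow(rows) rather than a hard-coded 4 keeps the loop valid if the groups ever differ in size.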
Upvotes: 1