Reputation: 2775
Here is my dataset:
> data<-read.csv(file.choose())
> data$MaxDate<-as.character(data$MaxDate)
> data$Batch<-gsub(" ", "\\.",data$Batch)
> p<-data[1:5,]
> p
ManagerName Employee.Name Employee.ID MaxDate Batch
1 Abarrientos, Claire Vinnikov, Olga 32403 8/11/2015 Batch.47
2 Adel, Bonnie Adams, Tracy 201850
3 Adel, Bonnie Black, Chantal 213746 7/29/2011 Batch.17
4 Adel, Bonnie Brandoli, Morena 201990 7/29/2011 Batch.17
5 Adel, Bonnie Campbell, Melissa 201931
X.New.Employee.EHS.Document.Sign.off Batch.01 Batch.02 Batch.03 Batch.04
1 1 NA NA NA NA
2 NA 1 1 1 1
3 1 1 1 1 1
4 1 1 1 1 1
5 NA 1 1 1 1
Batch.06 Batch.07 Batch.08 Batch.09 Batch.10 Batch.11 Batch.18 Batch.19
1 NA NA NA NA NA NA NA NA
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1
Batch.20 Batch.22 Batch.24 Batch.25 Batch.26 Batch.27 Batch.28 Batch.29
1 NA NA NA NA NA NA NA NA
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1
Batch.30 Batch.31 Batch.32 Batch.33 Batch.34 Batch.35 Batch.36 Batch.37
1 NA NA NA NA NA NA NA NA
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1
Batch.38 Batch.39 Batch.40 Batch.41 Batch.42 Batch.43 Batch.44 Batch.45
1 NA NA NA NA NA NA NA NA
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1
5 1 NA NA NA NA NA NA NA
Batch.46 Batch.47
1 NA NA
2 1 1
3 NA NA
4 1 1
5 NA NA
I'm a novice R user and I'm trying to figure out how to change all column values to 1 given the Batch Number. For example, for the first row, the Batch Number is "Batch.47". Therefore, I want to replace all the values from the columns "Batch.1", "Batch.2", "Batch.3" and all the way through to "Batch.47" to "1". However, I only want to do this for rows that have a value of "1" under the New.Employee.Sign.Off column. For the second row, there is no corresponding Batch Number because "Adams, Tracy" has "NA" under employee sign-off. Therefore, I want this particular row to stay the same. Keep in mind that not all Batch Numbers are included. For example, there is no Batch Number 13 to 17.
Here's my code thus far:
for (i in 1:nrow(p)) {
if (p$X.New.Employee.EHS.Document.Sign.off[i] == 1) {
k<-which(colnames(p)==p$Batch[i])
p[i,]<-replace(p[i,],6:k[i],1)
i=i+1
}
else if (is.na(p$X.New.Employee.EHS.Document.Sign.off[i])) {
i=i+1
}
}
This produces the following error:
Error in if (p$X.New.Employee.EHS.Document.Sign.off[i] == 1) { :
missing value where TRUE/FALSE needed
Would really appreciate any guidance. Much appreciated and thanks so very much in advance!
Here are the properties of the data set:
> str(data)
'data.frame': 3372 obs. of 44 variables:
$ ManagerName : Factor w/ 209 levels "Abarrientos, Claire",..: 1 2 2 2 2 2 2 2 2 2 ...
$ Employee.Name : Factor w/ 3371 levels "Abas, Ma Cecilia",..: 3155 14 304 346 455 648 850 934 1021 1089 ...
$ Employee.ID : Factor w/ 3368 levels "(blank)","0",..: 3257 278 2025 359 325 3092 1695 2075 1043 1196 ...
$ MaxDate : chr "8/11/2015" "" "7/29/2011" "7/29/2011" ...
$ Batch : chr "Batch.47" "" "Batch.17" "Batch.17" ...
$ X.New.Employee.EHS.Document.Sign.off: int 1 NA 1 1 NA 1 1 NA NA 1 ...
$ Batch.01 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.02 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.03 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.04 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.06 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.07 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.08 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.09 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.10 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.11 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.18 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.19 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.20 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.22 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.24 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.25 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.26 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.27 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.28 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.29 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.30 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.31 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.32 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.33 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.34 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.35 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.36 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.37 : int NA 1 1 1 1 NA 1 1 1 1 ...
$ Batch.38 : int NA 1 1 1 1 NA NA 1 1 1 ...
$ Batch.39 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.40 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.41 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.42 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.43 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.44 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.45 : int NA 1 1 1 NA NA 1 1 1 1 ...
$ Batch.46 : int NA 1 NA 1 NA NA NA NA NA 1 ...
$ Batch.47 : int NA 1 NA 1 NA 1 1 1 NA NA ...
>
Upvotes: 0
Views: 114
Reputation: 1615
colnames(p) -> clnames
clnames[grep("Batch.", x = clnames)] -> Batchvec
apply(p[,Batchvec], 2,
function(x) ifelse(p$X.New.Employee.EHS.Document.Sign.off == 1, 1, 0)
) -> newp
Then bind newp to the columns in p that don't start with "Batch" etc..as in..
cbind(p[,"employee.sign"], newp)
Upvotes: 1