Reputation: 411
Let's say I have a data frame with numerical values like this:
AA01.AVG_Beta AA02.AVG_Beta AA03.AVG_Beta AA04.AVG_Beta AA05.AVG_Beta
1 0.15851770 0.44264830 0.46662180 0.79579230 0.555430100
2 0.87148450 0.93462340 0.92591830 0.93812860 0.942683400
3 0.60907060 0.92463760 0.62698660 0.86852790 0.457659300
4 0.10728340 0.07848221 0.06340047 0.08589865 0.118239800
5 0.72353630 0.91198210 0.87339600 0.88050440 0.902925300
6 0.52616050 0.57114700 0.29431990 0.56032260 0.530103800
7 0.50321330 0.78129660 0.26986880 0.77825860 0.924097500
8 0.47808630 0.11267250 0.30519660 0.36128510 0.741012600
9 0.17698960 0.11461960 0.57776080 0.37801670 0.465766500
10 0.01268375 0.01370702 0.01194124 0.01227029 0.009222724
I want to convert every numerical value to a letter code using these conditions:
Avg beta 0-0.2 becomes AA, Avg beta 0.4-0.6 becomes AB, and Avg beta 0.8-1 becomes BB.
So I wrote something like this:
apply(table, 2, function(x) ifelse(x>0 & x<0.2, "AA", ifelse(x>0.4 & x<0.6, "AB", "BB")))
But I get this:
AA01.AVG_Beta AA02.AVG_Beta AA03.AVG_Beta AA04.AVG_Beta AA05.AVG_Beta
[1,] "AA" NA NA NA NA
[2,] "BB" NA NA NA NA
[3,] "BB" NA NA NA NA
[4,] "AA" NA NA NA NA
[5,] "BB" NA NA NA NA
[6,] "AB" NA NA NA NA
[7,] "AB" NA NA NA NA
[8,] "AB" NA NA NA NA
[9,] "AA" NA NA NA NA
[10,] "AA" NA NA NA NA
Only the first column is converted. Maybe I am missing something related to for loops?
Thanks in advance
Upvotes: 3
Views: 9785
Reputation: 179388
Use sapply instead of apply:
Recreate your data:
dat <- read.table(text="
AA01.AVG_Beta AA02.AVG_Beta AA03.AVG_Beta AA04.AVG_Beta AA05.AVG_Beta
1 0.15851770 0.44264830 0.46662180 0.79579230 0.555430100
2 0.87148450 0.93462340 0.92591830 0.93812860 0.942683400
3 0.60907060 0.92463760 0.62698660 0.86852790 0.457659300
4 0.10728340 0.07848221 0.06340047 0.08589865 0.118239800
5 0.72353630 0.91198210 0.87339600 0.88050440 0.902925300
6 0.52616050 0.57114700 0.29431990 0.56032260 0.530103800
7 0.50321330 0.78129660 0.26986880 0.77825860 0.924097500
8 0.47808630 0.11267250 0.30519660 0.36128510 0.741012600
9 0.17698960 0.11461960 0.57776080 0.37801670 0.465766500
10 0.01268375 0.01370702 0.01194124 0.01227029 0.009222724
")
Use sapply:
sapply(dat, function(x)
ifelse (x>0 & x< 0.2, "AA",ifelse(x>0.4 & x<0.6,"AB", "BB"))
)
AA01.AVG_Beta AA02.AVG_Beta AA03.AVG_Beta AA04.AVG_Beta AA05.AVG_Beta
[1,] "AA" "AB" "AB" "BB" "AB"
[2,] "BB" "BB" "BB" "BB" "BB"
[3,] "BB" "BB" "BB" "BB" "AB"
[4,] "AA" "AA" "AA" "AA" "AA"
[5,] "BB" "BB" "BB" "BB" "BB"
[6,] "AB" "AB" "BB" "AB" "AB"
[7,] "AB" "BB" "BB" "BB" "BB"
[8,] "AB" "AA" "BB" "BB" "BB"
[9,] "AA" "AA" "AB" "BB" "AB"
[10,] "AA" "AA" "AA" "AA" "AA"
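Note that the sapply call above labels every value outside the AA and AB ranges as BB, including values in the 0.2-0.4 and 0.6-0.8 gaps. If those gap values should stay unclassified instead, a minimal sketch could look like the following. The helper name to_genotype and the small stand-in data frame are assumptions for illustration, and treating 1 as part of the BB range is also an assumption.

```r
# Hypothetical helper (not from the original answer): returns NA for any
# value that falls outside the three ranges stated in the question.
to_genotype <- function(x) {
  ifelse(x > 0 & x < 0.2, "AA",
  ifelse(x > 0.4 & x < 0.6, "AB",
  ifelse(x > 0.8 & x <= 1, "BB", NA_character_)))
}

# Small stand-in for dat: first three rows of the first two columns.
small <- data.frame(
  AA01.AVG_Beta = c(0.15851770, 0.87148450, 0.60907060),
  AA02.AVG_Beta = c(0.44264830, 0.93462340, 0.92463760)
)

# sapply applies the helper to each column and returns a character matrix.
result <- sapply(small, to_genotype)
print(result)
```

Here 0.60907060 falls in the 0.6-0.8 gap, so it comes back as NA rather than BB.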
Upvotes: 2
Reputation: 66834
You can use cut:
x <- c(0.15,0.2,0.4,0.6,0.8,1.0)
cut(x,c(0,0.2,0.4,0.6,0.8,1.0),labels=c("AA",NA,"AB",NA,"BB"))
[1] AA AA <NA> AB <NA> BB
Levels: AA <NA> AB <NA> BB
Warning message:
In `levels<-`(`*tmp*`, value = c("AA", NA, "AB", NA, "BB")) :
duplicated levels will not be allowed in factors anymore
Note the warning: I used NA as the label for both gaps in your partition, so the NA level is duplicated.
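One possible way to avoid duplicated labels is to give each gap its own placeholder label and turn those into NA afterwards. This is only a sketch: the labels gap1 and gap2 are made up for illustration. Also keep in mind that cut uses right-closed intervals by default, so 0.2 lands in AA and 0.6 lands in AB, which differs slightly from the strict inequalities in the question.

```r
x <- c(0.15, 0.2, 0.4, 0.6, 0.8, 1.0)

# Distinct placeholder labels (gap1, gap2 are hypothetical names) so that
# no factor level is duplicated.
f <- cut(x, breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1.0),
         labels = c("AA", "gap1", "AB", "gap2", "BB"))

# Convert to character and blank out the gap intervals.
codes <- as.character(f)
codes[codes %in% c("gap1", "gap2")] <- NA
print(codes)
# [1] "AA" "AA" NA   "AB" NA   "BB"
```

To run this over every column of a data frame, the same steps could be wrapped in a function and passed to sapply.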
Upvotes: 4