Reputation: 39
I have a data frame and the following task: "Define your own criterion of income level, and split data according to levels of this criterion"
dput(head(creditcard))
structure(list(card = structure(c(2L, 2L, 2L, 2L, 2L, 2L), levels = c("no",
"yes"), class = "factor"), reports = c(0L, 0L, 0L, 0L, 0L, 0L
), age = c(37.66667, 33.25, 33.66667, 30.5, 32.16667, 23.25),
income = c(4.52, 2.42, 4.5, 2.54, 9.7867, 2.5), share = c(0.03326991,
0.005216942, 0.004155556, 0.06521378, 0.06705059, 0.0444384
), expenditure = c(124.9833, 9.854167, 15, 137.8692, 546.5033,
91.99667), owner = structure(c(2L, 1L, 2L, 1L, 2L, 1L), levels = c("no",
"yes"), class = "factor"), selfemp = structure(c(1L, 1L,
1L, 1L, 1L, 1L), levels = c("no", "yes"), class = "factor"),
dependents = c(3L, 3L, 4L, 0L, 2L, 0L), days = c(54L, 34L,
58L, 25L, 64L, 54L), majorcards = c(1L, 1L, 1L, 1L, 1L, 1L
), active = c(12L, 13L, 5L, 7L, 5L, 1L), income_fam = c(1.13,
0.605, 0.9, 2.54, 3.26223333333333, 2.5)), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")
I defined the criterion like this:
inc_l<-c("low","average","above average","high")
grad_fact<-function(x){
ifelse(x>=10, 'high',
ifelse(x>6 && x<10, 'above average',
ifelse(x>=3 && x<=6,'average',
ifelse(x<3, 'low'))))
}
And added a column like this
creditcard<-transform(creditcard, incom_level=factor(sapply(creditcard$income, grad_fact), inc_l, ordered = TRUE))
But I need to do this without sapply, so I tried the following:
creditcard<-transform(creditcard, incom_level=factor(grad_fact(creditcard$income),inc_l, ordered = TRUE))
But in this case every element of the column takes the value "average", and I don't understand why. Please help me figure out the problem.
Upvotes: 1
Views: 26
Reputation: 887048
We may need to change the && to &, as && will return a single TRUE/FALSE. According to ?"&&":
& and && indicate logical AND and | and || indicate logical OR. The shorter forms performs elementwise comparisons in much the same way as arithmetic operators. The longer forms evaluates left to right, proceeding only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
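As a minimal sketch of that difference (the vector x below is made up for illustration, with the first values borrowed from the question's income column): & yields one result per element, while && is intended for single values. As far as I know, older R versions silently used only the first element when && was given longer vectors, which would explain why every row ended up as "average"; from R 4.3.0 onward that call is an error instead.

```r
# Illustrative incomes; not the full creditcard data
x <- c(4.52, 2.42, 9.7867, 12)

# `&` works element by element and returns a logical vector as long as x
in_range <- x > 6 & x < 10
print(in_range)

# `&&` is meant for length-one conditions, e.g. inside `if`
first <- x[1]
if (first >= 3 && first <= 6) {
  print("first income is average")
}
```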
In addition, the last ifelse didn't have a no case:
grad_fact<-function(x){
ifelse(x>=10, 'high',
ifelse(x>6 & x<10, 'above average',
ifelse(x>=3 & x<=6,'average',
ifelse(x<3, 'low', NA_character_))))
}
and then use
creditcard <- transform(creditcard, incom_level=
factor(grad_fact(income),inc_l, ordered = TRUE))
-output
creditcard
  card reports      age income       share expenditure owner selfemp dependents days majorcards active income_fam   incom_level
1  yes       0 37.66667 4.5200 0.033269910  124.983300   yes      no          3   54          1     12   1.130000       average
2  yes       0 33.25000 2.4200 0.005216942    9.854167    no      no          3   34          1     13   0.605000           low
3  yes       0 33.66667 4.5000 0.004155556   15.000000   yes      no          4   58          1      5   0.900000       average
4  yes       0 30.50000 2.5400 0.065213780  137.869200    no      no          0   25          1      7   2.540000           low
5  yes       0 32.16667 9.7867 0.067050590  546.503300   yes      no          2   64          1      5   3.262233 above average
6  yes       0 23.25000 2.5000 0.044438400   91.996670    no      no          0   54          1      1   2.500000           low
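As a possible alternative to the nested ifelse calls, base R's cut() can bin a numeric vector in one step. This is only a sketch under one assumption: with right = FALSE the intervals are [-Inf, 3), [3, 6), [6, 10), [10, Inf), so an income of exactly 6 would land in "above average" rather than "average" as in grad_fact. The income values are the six rows from the question.

```r
inc_l <- c("low", "average", "above average", "high")

# The six income values from head(creditcard) in the question
income <- c(4.52, 2.42, 4.5, 2.54, 9.7867, 2.5)

# Left-closed bins; note that exactly 6 falls into "above average" here,
# which differs from the original grad_fact definition
incom_level <- cut(income,
                   breaks = c(-Inf, 3, 6, 10, Inf),
                   labels = inc_l,
                   right = FALSE,
                   ordered_result = TRUE)
print(incom_level)
```

On the full data this could be used as transform(creditcard, incom_level = cut(income, ...)) with the same arguments.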
Upvotes: 0