Reputation: 13
I am working with R 3.4.3 on Windows 10. I have a data frame that contains both numeric and character columns. I would like to replace only the numeric values that reach a threshold, but when I do that the character values are replaced as well.
How can I edit my function to make it affect only the numeric values and not the characters?
Here is my function:
dataframeChange <- function(dFrame){
  thresholdVal <- 20
  dFrame[dFrame >= thresholdVal] <- -1
  return(dFrame)
}
Here is a dataframe example:
example_df <- data.frame(
  myNums = c(1:5),
  myChars = c("A","B","C","D","E"),
  stringsAsFactors = FALSE
)
Thanks for the help!
Upvotes: 1
Views: 2338
Reputation: 8120
Another data.table solution.
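library(data.table)
dataframeChange_dt <- function(dFrame){
  # convert to a data.table by reference
  setDT(dFrame)
  for(j in seq_along(dFrame)){
    # replace, by reference, the rows of column j that are below 20
    set(dFrame, i = which(dFrame[[j]] < 20), j = j, value = -1)
  }
}
dataframeChange_dt(example_df)
example_df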
# myNums myChars
# 1: -1 A
# 2: 20 B
# 3: 30 C
# 4: -1 D
# 5: 70 E
It does not explicitly select only the numeric columns; however, I tested it on multiple datasets and it did not affect the non-numeric columns.
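Why an unguarded comparison can touch character columns: when R compares a string with a number, the number is coerced to a string and the comparison becomes alphabetical. The sketch below shows that, followed by a base R variant of the column loop with an explicit is.numeric() guard. The helper name replace_below, its parameters, and mixed_df are hypothetical and used only for illustration; base R is used here to keep the example dependency-free, it is not the data.table code above.

```r
# Comparing a character value with a number coerces the number to a string,
# so the comparison is alphabetical rather than numeric.
print("1" < 20)    # evaluated as "1" < "20"
print("A" >= 20)   # evaluated as "A" >= "20"; the result depends on collation order

# Hypothetical helper: loop over the columns but skip anything non-numeric.
replace_below <- function(df, threshold = 20, replacement = -1) {
  for (j in seq_along(df)) {
    if (is.numeric(df[[j]])) {
      hit <- which(df[[j]] < threshold)   # which() ignores NA comparisons
      df[[j]][hit] <- replacement
    }
  }
  df
}

# The character column deliberately contains "1", which an unguarded
# string comparison against 20 would match.
mixed_df <- data.frame(
  myNums  = c(10, 20, 30, 5, 70),
  myChars = c("1", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)
result <- replace_below(mixed_df)
print(result)
```

With the guard in place, the character column is never compared at all, so its contents cannot influence the result.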
Upvotes: 0
Reputation: 21284
Use mutate_if from dplyr:
library(dplyr)
example_df %>% mutate_if(is.numeric, funs(if_else(. >= thresh, repl, .)))
myNums myChars
1 10 A
2 -1 B
3 -1 C
4 5 D
5 -1 E
Explanation:
The mutate family of functions is for variable assignment or updating.
With mutate_if, the functions (specified within funs()) are applied only to columns that satisfy the first argument (in this case, is.numeric()).
The if_else clause implements the OP's rule.
Data:
thresh <- 20
repl <- -1.0
example_df <- data.frame(
  myNums = c(10,20,30,5,70),
  myChars = c("A","B","C","D","E"),
  stringsAsFactors = FALSE
)
example_df
myNums myChars
1 10 A
2 20 B
3 30 C
4 5 D
5 70 E
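For readers on a newer dplyr: funs() was deprecated in dplyr 0.8.0, and across() (available from dplyr 1.0.0) is the usual replacement for the scoped mutate_if() verb. The following is a minimal sketch of the same rule in that style, assuming dplyr 1.0.0 or later is installed; it reuses the thresh, repl and example_df values from this answer.

```r
library(dplyr)

thresh <- 20
repl <- -1.0

example_df <- data.frame(
  myNums = c(10, 20, 30, 5, 70),
  myChars = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# across() with where(is.numeric) selects the numeric columns;
# the lambda is then applied to each selected column.
result <- example_df %>%
  mutate(across(where(is.numeric), ~ if_else(.x >= thresh, repl, .x)))
print(result)
```

As with the original call, if_else() is strict about types, so repl should be a double when the column is a double.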
Upvotes: 1
Reputation: 13135
As Tim's comment points out, you need to know which columns are numeric. We can locate them using ind <- sapply(dFrame, is.numeric):
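dataframeChange <- function(dFrame){
  thresholdVal <- 20
  # logical vector: TRUE for each numeric column
  ind <- sapply(dFrame, is.numeric)
  # compare and replace within the numeric columns only
  dFrame[ind][dFrame[ind] >= thresholdVal] <- -1
  # original attempt, which also hits character columns:
  # dFrame[dFrame >= thresholdVal] <- -1
  return(dFrame)
}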
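A quick check of the same idea on a data frame with more than one numeric column may be useful. This is only a sketch: the helper name replace_at_or_above, its parameters, and wide_df are hypothetical, and it uses logical-matrix indexing restricted to the numeric columns.

```r
# Hypothetical helper; threshold and replacement are parameters for illustration.
replace_at_or_above <- function(df, threshold = 20, replacement = -1) {
  num <- vapply(df, is.numeric, logical(1))       # TRUE for each numeric column
  df[num][df[num] >= threshold] <- replacement    # logical-matrix indexing
  df
}

wide_df <- data.frame(
  a = c(10, 20, 30),
  b = c(25, 5, 15),
  label = c("x", "y", "z"),
  stringsAsFactors = FALSE
)
result <- replace_at_or_above(wide_df)
print(result)
```

Here df[num] >= threshold yields a logical matrix covering only the numeric columns, so the character column never takes part in the comparison.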
Upvotes: 2
Reputation: 21759
Using data.table, we can avoid explicit loops, and it is faster. Here I've set the threshold value to 2:
library(data.table)
# set to data table
setDT(example_df)
# get numeric columns
num_cols <- names(example_df)[sapply(example_df, is.numeric)]
# update all numeric columns at once
example_df[, (num_cols) := lapply(.SD, function(x) ifelse(x > 2, -1, x)), .SDcols = num_cols]
print(example_df)
myNums myChars
1: 1 A
2: 2 B
3: -1 C
4: -1 D
5: -1 E
Upvotes: 0