00schneider

Reputation: 798

R: trimming a variable and adding it to a dataframe

I am an R beginner. I would like to trim a variable using the Trim function of the package "DescTools". This works fine with:

library(DescTools)

mydata <- data.frame(
  a = rnorm(40, mean = 0, sd = 1)
)
a_trim <- Trim(mydata$a, trim = 0.2, na.rm = TRUE)

This creates a separate object, but I would like to add it to my data frame mydata. When I try to do this with

mydata$a_trim <- Trim(mydata$a, trim = 0.2, na.rm = TRUE)

R gives me an error because mydata$a_trim has fewer rows than the dataframe (obviously, since it is a trimmed variable). How can I do this?

Thanks for your patience and help!

Upvotes: 0

Views: 5115

Answers (2)

Andri Signorell

Reputation: 1309

In response to this post, I changed the function Trim so that it returns the indices of the trimmed elements as the attribute "trim". You still get the trimmed vector back, but if you simply want to label the elements that were trimmed, you can do something like:

a <- rnorm(40, mean = 0, sd = 1)
a_trim <- Trim(a, trim = 0.2, na.rm = TRUE)
# TRUE for every element whose index is listed in the "trim" attribute
data.frame(x = a,
           trim = is.element(seq_along(a), attr(a_trim, "trim")))

(since DescTools 0.99.18)
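For readers on an older DescTools version, or without the package, the same kind of flag can be sketched in base R. This is only an illustration, assuming that trimming means dropping floor(n * trim) values from each end (the rule base R's mean(x, trim = ...) uses). The names k, r, is_trimmed and flagged are made up for this example, and the seed is only there to make it reproducible.

```r
set.seed(1)                      # assumption: fixed seed for reproducibility
a <- rnorm(40, mean = 0, sd = 1)
trim <- 0.2

k <- floor(length(a) * trim)     # number of values dropped at each end
r <- rank(a, ties.method = "first")

# TRUE for the k smallest and the k largest values
is_trimmed <- r <= k | r > length(a) - k

flagged <- data.frame(x = a, trim = is_trimmed)

# The mean of the unflagged values should agree with base R's trimmed mean
mean(a[!is_trimmed])
mean(a, trim = trim)
```

Because the flag has one entry per row, it fits into the data frame without any length mismatch.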

Upvotes: 0

Dex Groves

Reputation: 331

Trim isn't suitable for what you want to do. It removes extreme values from a vector so that you can pass the result to something like mean or sd and compute those quantities without the influence of outliers.

To set extreme values to NA you can use quantile.

upper_quantile <- quantile(mydata$a, 0.9)
lower_quantile <- quantile(mydata$a, 0.1)

# values of a above its 90th percentile (or below its 10th) become NA
mydata$a[mydata$a > upper_quantile] <- NA
mydata$a[mydata$a < lower_quantile] <- NA
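If you would rather keep the original column, a variation on the same idea is to write the result to a new column. The following is a rough sketch, not the only way to do it. It uses the 20th and 80th percentiles as cut-offs to roughly correspond to trim = 0.2 in the question (an assumption; quantile cut-offs and Trim need not select exactly the same elements), and the name bounds is made up for this example.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
mydata <- data.frame(a = rnorm(40, mean = 0, sd = 1))

# Lower and upper cut-offs in one call
bounds <- quantile(mydata$a, probs = c(0.2, 0.8), na.rm = TRUE)

# Keep values inside the cut-offs; everything else becomes NA.
# The new column has the same number of rows as the data frame.
inside <- mydata$a >= bounds[1] & mydata$a <= bounds[2]
mydata$a_trim <- ifelse(inside, mydata$a, NA)

# Summaries of the trimmed column then only need na.rm = TRUE
mean(mydata$a_trim, na.rm = TRUE)
```

Since a_trim is padded with NA rather than shortened, the assignment no longer fails, and a stays untouched for later use.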

Upvotes: 3
