Reputation: 1521
I want to replace missing values or character values in a data with the row mean. for example in the data below, the missing values are denoted with "U" and I want to replace all the "U" from p1 to p6 with the values in the "ave" column for each row. There are thousands of the rows to replace.
num p1 p2 p3 p4 p5 p6 ave
L1 0 10 1 U 0 -10 1.3
L2 10 1 10 10 U 10 7.1
L3 U 10 10 U 1 -10 3.1
Upvotes: 1
Views: 720
Reputation: 193517
Here's one approach:
mydf <- read.table(
header = TRUE, stringsAsFactors = FALSE,
text = "num p1 p2 p3 p4 p5 p6 ave
L1 0 10 1 U 0 -10 1.3
L2 10 1 10 10 U 10 7.1
L3 U 10 10 U 1 -10 3.1")
cbind(mydf[1],
t(apply(mydf[-1], 1,
function(x) ifelse(x == "U", x["ave"], x))))
# num p1 p2 p3 p4 p5 p6 ave
# 1 L1 0 10 1 1.3 0 -10 1.3
# 2 L2 10 1 10 10 7.1 10 7.1
# 3 L3 3.1 10 10 3.1 1 -10 3.1
Upvotes: 0
Reputation: 6104
for loops are generally discouraged in the r language so sven's answer is better, but here's a straightforward way to do what you're trying to do..
# example data table
mtcars
# here's how to access all the columns below two in the first row
mtcars[ 1 , mtcars[ 1 , ] < 2 ]
# here's how to take the mean of all columns at or above two in the first row
rowMeans( mtcars[ 1 , mtcars[ 1 , ] >= 2 ] , na.rm = T )
# here's how to overwrite the values below two with the mean of all columns at or above two
mtcars[ 1 , mtcars[ 1 , ] < 2 ] <- rowMeans( mtcars[ 1 , mtcars[ 1 , ] >= 2 ] , na.rm = T )
# run this command for every row, and you're done
for ( i in seq( nrow( mtcars ) ) ){
mtcars[ i , mtcars[ i , ] < 2 ] <-
rowMeans( mtcars[ i , mtcars[ i , ] >= 2 ] , na.rm = T )
}
Upvotes: 0
Reputation: 81693
The data:
df<-read.table(text="num p1 p2 p3 p4 p5 p6 ave
L1 0 10 1 U 0 -10 1.3
L2 10 1 10 10 U 10 7.1
L3 U 10 10 U 1 -10 3.1 ", header = TRUE)
You could use apply
to replace the U
s:
as.data.frame(t(apply(df, 1, function(x) replace(x, x == "U", tail(x, 1)))))
num p1 p2 p3 p4 p5 p6 ave
1 L1 0 10 1 1.3 0 -10 1.3
2 L2 10 1 10 10 7.1 10 7.1
3 L3 3.1 10 10 3.1 1 -10 3.1
Upvotes: 1