TYL

Reputation: 1637

Applying a custom function to every row in R

I created a function to calculate the rolling mean of a row in a data frame:

library(zoo)  # provides rollmean()

rollmean_circular <- function(x) {t(rollmean(t(cbind(x[9:10],x,x[1:2])),5))}

df <- structure(list(X1 = c(5L, 5L, 9L, 0L, 9L, 10L, 10L, 1L, 0L, 10L
), X2 = c(6L, 8L, 6L, 9L, 7L, 5L, 0L, 7L, 5L, 8L), X3 = c(10L, 
7L, 2L, 1L, 2L, 10L, 2L, 9L, 6L, 4L), X4 = c(6L, 0L, 9L, 1L, 
6L, 8L, 3L, 7L, 8L, 1L), X5 = c(0L, 9L, 8L, 3L, 1L, 8L, 3L, 9L, 
5L, 2L), X6 = c(0L, 10L, 9L, 10L, 3L, 1L, 6L, 0L, 6L, 9L), X7 = c(9L, 
10L, 0L, 10L, 10L, 9L, 0L, 1L, 10L, 2L), X8 = c(2L, 6L, 3L, 7L, 
7L, 9L, 8L, 9L, 1L, 0L), X9 = c(0L, 8L, 8L, 9L, 0L, 5L, 9L, 9L, 
4L, 8L), X10 = c(1L, 4L, 3L, 0L, 1L, 7L, 3L, 6L, 5L, 0L)), class = "data.frame", row.names = c(NA, 
-10L))

   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   5  6 10  6  0  0  9  2  0   1
2   5  8  7  0  9 10 10  6  8   4
3   9  6  2  9  8  9  0  3  8   3
4   0  9  1  1  3 10 10  7  9   0
5   9  7  2  6  1  3 10  7  0   1
6  10  5 10  8  8  1  9  9  5   7
7  10  0  2  3  3  6  0  8  9   3
8   1  7  9  7  9  0  1  9  9   6
9   0  5  6  8  5  6 10  1  4   5
10 10  8  4  1  2  9  2  0  8   0

Given a vector, the function prepends the last 2 elements and appends the first 2 elements, then computes a rolling mean of width 5, so there are no NAs at the front or back.

It works as expected when I apply it to a single row of the data frame.
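As a minimal sketch of the padding described above, here is the same calculation done by hand in base R for the first row of df, written as a plain numeric vector. The names x, padded and circ_mean are only illustrative and are not part of the original function.

```r
# Row 1 of df as a plain numeric vector
x <- c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1)

# Wrap around: last 2 elements in front, first 2 elements at the back
padded <- c(tail(x, 2), x, head(x, 2))

# Mean of each window of 5 consecutive padded values;
# window i is centred on original position i
circ_mean <- sapply(seq_along(x), function(i) mean(padded[i:(i + 4)]))

print(circ_mean)
# 4.4 5.6 5.4 4.4 5.0 3.4 2.2 2.4 3.4 2.8
```

These values agree with the single-row output of rollmean_circular.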

r = df[1,]
rollmean_circular(r)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
1  4.4  5.6  5.4  4.4    5  3.4  2.2  2.4  3.4   2.8

However, when I use apply to run this function on every row of my data frame, it returns logical(0).

apply(df,1,rollmean_circular)

logical(0)

What am I missing?

When I apply another function that gives the same output for a single row, it works:

stdize <- function(x, na.rm=TRUE) {(x - min(x, na.rm=na.rm)) / (max(x, na.rm=na.rm) - min(x, na.rm=na.rm))}

stdize(r)

   X1  X2 X3  X4 X5 X6  X7  X8 X9 X10
1 0.5 0.6  1 0.6  0  0 0.9 0.2  0 0.1

apply(df,1,stdize)

    [,1] [,2]      [,3] [,4] [,5]      [,6] [,7]      [,8] [,9] [,10]
X1   0.5  0.5 1.0000000  0.0  0.9 1.0000000  1.0 0.1111111  0.0   1.0
X2   0.6  0.8 0.6666667  0.9  0.7 0.4444444  0.0 0.7777778  0.5   0.8
X3   1.0  0.7 0.2222222  0.1  0.2 1.0000000  0.2 1.0000000  0.6   0.4
X4   0.6  0.0 1.0000000  0.1  0.6 0.7777778  0.3 0.7777778  0.8   0.1
X5   0.0  0.9 0.8888889  0.3  0.1 0.7777778  0.3 1.0000000  0.5   0.2
X6   0.0  1.0 1.0000000  1.0  0.3 0.0000000  0.6 0.0000000  0.6   0.9
X7   0.9  1.0 0.0000000  1.0  1.0 0.8888889  0.0 0.1111111  1.0   0.2
X8   0.2  0.6 0.3333333  0.7  0.7 0.8888889  0.8 1.0000000  0.1   0.0
X9   0.0  0.8 0.8888889  0.9  0.0 0.4444444  0.9 1.0000000  0.4   0.8
X10  0.1  0.4 0.3333333  0.0  0.1 0.6666667  0.3 0.6666667  0.5   0.0

Upvotes: 1

Views: 152

Answers (2)

jay.sf

Reputation: 73802

It seems you're confusing vectors and matrices in your function: apply passes each row to the function as a plain vector, not as a one-row data frame. You could unlist inside the function and transpose afterwards.

rollmean_circular <- function(x) zoo::rollmean(unlist(c(x[9:10], x, x[1:2])),5)

t(apply(df, 1, rollmean_circular))
#       X1  X2  X3  X4  X5  X6  X7  X8  X9 X10
#  [1,] 4.4 5.6 5.4 4.4 5.0 3.4 2.2 2.4 3.4 2.8
#  [2,] 6.4 4.8 5.8 6.8 7.2 7.0 8.6 7.6 6.6 6.2
#  [3,] 5.6 5.8 6.8 6.8 5.6 5.8 5.6 4.6 4.6 5.8
#  [4,] 3.8 2.2 2.8 4.8 5.0 6.2 7.8 7.2 5.2 5.0
#  [5,] 3.8 5.0 5.0 3.8 4.4 5.4 4.2 4.2 5.4 4.8
#  [6,] 7.4 8.0 8.2 6.4 7.2 7.0 6.4 6.2 8.0 7.2
#  [7,] 4.8 3.6 3.6 2.8 2.8 4.0 5.2 5.2 6.0 6.0
#  [8,] 6.4 6.0 6.6 6.4 5.2 5.2 5.6 5.0 5.2 6.4
#  [9,] 4.0 4.8 4.8 6.0 7.0 6.0 5.2 5.2 4.0 3.0
# [10,] 6.0 4.6 5.0 4.8 3.6 2.8 4.2 3.8 4.0 5.2

This can also be done in base R (with most of the credit going to @MattiPastell):

fun <- function(x, n=5) na.omit(stats::filter(c(tail(x, 2), x, head(x, 2)), rep(1 / n, n), sides=2))
t(apply(df, 1, fun))
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
#  [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
#  [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
#  [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
#  [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
#  [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
#  [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
#  [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
#  [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
# [10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2
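To see where the vector/matrix mix-up comes from, it may help to compare what cbind builds from a one-row data frame with what it builds from the plain vector that apply passes in. The sketch below uses a small hypothetical 4-column data frame (toy) rather than df, and base R only.

```r
# Small illustrative data frame (not the df from the question)
toy <- data.frame(X1 = c(5L, 5L), X2 = c(6L, 8L), X3 = c(10L, 7L), X4 = c(6L, 0L))

# Case 1: a one-row data frame, as in rollmean_circular(df[1, ])
r <- toy[1, ]
wide <- cbind(r[3:4], r, r[1:2])
print(class(r))    # "data.frame"
print(dim(wide))   # 1 8  -> one long row, as intended

# Case 2: what apply(toy, 1, f) actually hands to f: a plain vector
row_classes <- apply(toy, 1, class)
print(row_classes) # "integer" "integer"

v <- unlist(r)
tall <- cbind(v[3:4], v, v[1:2])
print(dim(tall))   # 4 3  -> the pieces become columns, not one long row
```

In the second case the padding and the data end up side by side as three columns, so after transposing there are only three rows to roll over, fewer than the window width of 5. Flattening with c() or unlist(), as in the fixed function above, avoids this.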

Upvotes: 2

G. Grothendieck

Reputation: 270348

rollmean automatically works on every column of its input, so this can be done directly, eliminating the apply:

library(zoo)
t(rollmean(t(cbind(df[9:10], df, df[1:2])), 5))

or use stats::filter from base R, which also works on every column:

t(stats::filter(t(df), rep(1, 5)/5, circular = TRUE))

Either of these gives this matrix:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
 [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
 [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
 [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
 [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
 [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
 [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
 [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
 [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
[10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2

Depending on the needs of your application, you could consider storing these series in columns rather than rows, in which case the transposes would not be needed.
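As a rough sketch of the column-oriented layout, assuming the first two rows of df are stored as the columns of a matrix (the names m and sm are only illustrative), stats::filter can be called without any transposes:

```r
# Each series is a column: rows 1 and 2 of df, stored column-wise
m <- cbind(
  c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1),
  c(5, 8, 7, 0, 9, 10, 10, 6, 8, 4)
)

# Centred 5-point moving average with wrap-around, applied to each column
sm <- stats::filter(m, rep(1, 5) / 5, sides = 2, circular = TRUE)

print(as.numeric(sm[, 1]))
# 4.4 5.6 5.4 4.4 5.0 3.4 2.2 2.4 3.4 2.8
print(as.numeric(sm[, 2]))
# 6.4 4.8 5.8 6.8 7.2 7.0 8.6 7.6 6.6 6.2
```

The result is a multivariate time series object with one smoothed series per column; as.matrix(sm) converts it back to an ordinary matrix if that is more convenient.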

Upvotes: 0
