Reputation: 429
UPDATED:
I have a dataframe as follows:
x <- data.frame("A" = c(15, 13.5, 12, 9.1),
"B" = c(13.6, 8.4, 6.7, NA),
"C" = c(8.5, 2.43, 1.23, NA),
"D" = c(8.6, 6.43, 4.23, NA),
"E" = c(7.5, 4.43, 3.23, NA))
And I need to arrange to something like this:
xR <- data.frame("A" = c(15, 13.5, 12, 9.1, NA, NA, NA, NA, NA),
"B" = c(NA, 13.6, 8.4, 6.7, NA, NA, NA, NA, NA),
"C" = c(NA, NA, 8.5, 2.43, 1.23, NA, NA, NA, NA),
"D" = c(NA, NA, 8.6, 6.43, 1.23, NA, NA, NA, NA),
"E" = c(NA, NA, NA, 7.5, 4.43, 3.23, NA, NA, NA))
The logic is as follows: I need to place the first value on col B where it is equal to or greater than the first value it meets in Col A and so on for Col B and Col C. I need to try automate it as my dataframe is huge. Thanks for your help.
Upvotes: 0
Views: 91
Reputation: 8844
Perhaps try these two functions
align_proper <- function(df, at = c("first", "last"), offset = 0L) {
df <- as.matrix(df)
dr <- nrow(df)
dc <- ncol(df)
m <- cbind(t(df), NA_real_)
pos <- is.na(m); m[pos] <- Inf
pos <- cumsum(c(0L, pmin(
max.col(m[-1L, 1L] >= m[-dc, ], match.arg(at)),
max.col(pos[-dc, ], "first")
) - 1L) + offset)
pos <- pos - min(pos)
outlen <- max(pos) + dr
pos <- pos + 1L
proto <- matrix(
nrow = outlen, ncol = dc,
dimnames = list(NULL, dimnames(df)[[2L]])
)
as.data.frame(Reduce(function(out, i) {
out[seq.int(pos[[i]], by = 1L, length.out = dr), i] <- df[, i]
out
}, seq_along(pos), proto))
}
align_top <- function(df) {
collen <- max(colSums(!is.na(df)))
as.data.frame(vapply(df, function(x) {
x[!is.na(x)][seq_len(collen)]
}, numeric(collen)))
}
We first use align_top
to remove extra leading or trailing NAs (if any) for each column. You can skip this step if your actual data looks exactly like what you showed in your question, but we need to ensure a standardised input.
A B C A B C
1 15.0 NA NA 1 15.0 13.6 8.50
2 13.5 13.6 NA 2 13.5 8.4 2.43
3 12.0 8.4 8.50 ====> 3 12.0 6.7 1.23
4 9.1 6.7 2.43 4 9.1 NA NA
5 NA NA 1.23
6 NA NA NA
7 NA NA NA
8 NA NA NA
align_proper
then gives you the desired output
> align_proper(align_top(xR))
A B C
1 15.0 NA NA
2 13.5 13.6 NA
3 12.0 8.4 8.50
4 9.1 6.7 2.43
5 NA NA 1.23
6 NA NA NA
> align_proper(align_top(x))
A B C
1 15.0 NA NA
2 13.5 13.6 NA
3 12.0 8.4 8.50
4 9.1 6.7 2.43
5 NA NA 1.23
6 NA NA NA
> align_proper(data.frame("A" = c(1.23, 2.43, 6.5, NA),
+ "B" = c(6.7, 8.4, 8.9, NA),
+ "C" = c(9.1, 12, 13.5, 15)), "last")
A B C
1 1.23 NA NA
2 2.43 NA NA
3 6.50 6.7 NA
4 NA 8.4 NA
5 NA 8.9 9.1
6 NA NA 12.0
7 NA NA 13.5
8 NA NA 15.0
Update
I add one more argument offset
in the function. Now you can do this
> align_proper(data.frame("A" = c(1.23, 2.43, 6.5, NA),
+ "B" = c(6.7, 8.4, 8.9, NA),
+ "C" = c(9.1, 12, 13.5, 15)), "last", -1L)
A B C
1 1.23 NA NA
2 2.43 6.7 NA
3 6.50 8.4 9.1
4 NA 8.9 12.0
5 NA NA 13.5
6 NA NA 15.0
> align_proper(data.frame("A" = c(1.23, 2.43, 6.5, NA),
+ "B" = c(6.7, 8.4, 8.9, NA),
+ "C" = c(9.1, 12, 13.5, 15)), "last", 3L)
A B C
1 1.23 NA NA
2 2.43 NA NA
3 6.50 NA NA
4 NA NA NA
5 NA NA NA
6 NA 6.7 NA
7 NA 8.4 NA
8 NA 8.9 NA
9 NA NA NA
10 NA NA NA
11 NA NA 9.1
12 NA NA 12.0
13 NA NA 13.5
14 NA NA 15.0
Upvotes: 2
Reputation: 174586
I don't know of an way to easily automate this, but you can do it in base R with the following algorithm:
x <- data.frame("A" = c(1.23, 2.43, 6.5, NA),
"B" = c(6.7, 8.4, 8.9, NA),
"C" = c(9.1, 12, 13.5, 15))
x <- as.list(x)
for(i in 2:length(x)) {
x[[i]] <- c(rep(NA, which.max(x[[i-1]][x[[i-1]] < x[[i]][1]])), x[[i]])
}
x <- lapply(x, function(y) c(y, rep(NA, max(sapply(x, length)) - length(y))))
as.data.frame(x)
#> A B C
#> 1 1.23 NA NA
#> 2 2.43 NA NA
#> 3 6.50 NA NA
#> 4 NA 6.7 NA
#> 5 NA 8.4 NA
#> 6 NA 8.9 NA
#> 7 NA NA 9.1
#> 8 NA NA 12.0
#> 9 NA NA 13.5
#> 10 NA NA 15.0
Upvotes: 1