Reputation: 1523
This question is not a duplicate because my data.frame
does not have the same amount of NA
values in all Columns and therefore the solution mentioned in that question does not work.
I have a data.frame
with a lot of NA
values and I would like to delete all cells (important: not rows or columns, cells) that have NA values. The original would look like this:
A B
1 NA
NA 2
2 NA
NA NA
NA NA
NA 4
3 5
The desired result would look like this:
A B
1 2
2 4
3 5
The number of columns would have to stay the same, but it does not matter if the values remain on the same rows. They can just be moved up.
I could image one could delete all cells with the condition NA (maybe with apply) and get the result. Or maybe a simple sorting ?
Thanks.
Update:
A B C
1 3
2
4 3
1 2
3 5
4
9
7 1
Upvotes: 4
Views: 6352
Reputation: 42592
The OP has requested to remove the NA
s columnwise but has pointed out that there might be different numbers of NA in each column.
This can be solved using data.table
in two steps:
library(data.table)
# step 1: coerce to data.table in place, move NAs to the bottom of each column,
# maintain the original order of non-NA values
result <- data.table(DF)[, lapply(.SD, function(x) x[order(is.na(x))])]
A B C 1: 1 2 3 2: 4 1 3 3: 3 9 2 4: 7 NA 5 5: NA NA 4 6: NA NA 1 7: NA NA NA 8: NA NA NA 9: NA NA NA 10: NA NA NA
# step 2: trim result
# either using Reduce
result[!result[, Reduce(`&`, lapply(.SD, is.na))]]
# or using zoo::na.trim()
zoo::na.trim(result, is.na = "all")
A B C 1: 1 2 3 2: 4 1 3 3: 3 9 2 4: 7 NA 5 5: NA NA 4 6: NA NA 1
So, there will be some NA
s at the end of each colummn unavoidably because all columns in a data.frame have the same length.
Or, alternatively, only complete rows can be kept using the is.na
parameter to na.trim()
:
zoo::na.trim(result, is.na = "any")
A B C 1: 1 2 3 2: 4 1 3 3: 3 9 2
As mentioned before, data.frame
s and cbind()
expect all column vectors to have the same length. Here is an alternative solution without data.table
which uses the cbind.fill()
function from the rowr
package which pads vectors with the fill
value until the same length:
setNames(do.call(function(...) rowr::cbind.fill(..., fill = NA), lapply(DF, na.omit)),
colnames(DF))
A B C 1 1 2 3 2 4 1 3 3 3 9 2 4 7 NA 5 5 NA NA 4 6 NA NA 1
As supplied by the OP in the update:
DF <- structure(list(A = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
B = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA), C = c(3L,
NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)), .Names = c("A", "B",
"C"), row.names = c(NA, -10L), class = "data.frame")
Upvotes: 5
Reputation: 323386
You can try ,Base on @UweBlock's Data and your updated question.
dat=as.data.frame(na.omit(apply(dat,2,function (x) x[order(is.na(x))])))
dat
A B C
1 1 2 3
2 4 1 3
3 3 9 2
Upvotes: 2