Reputation: 63994
I have data that looks like this:
FOO,yyy,Unigene126925_All,Unigene137063_All,0.238087
,,Unigene126925_All,Unigene24551_All,0.374231
,,Unigene126925_All,Unigene31835_All,0.367897
BAR,xxx,Unigene126925_All,Unigene165366_All,0.247844
,,Unigene126925_All,Unigene111784_All,0.344493
I read it with the following code:
dt <- read.csv("http://dpaste.com/1612639/plain/",header=FALSE,fill=FALSE,na.strings = "")
# The <NA> coercion here is intentional.
It produces this result:
> dt
V1 V2 V3 V4 V5
1 FOO yyy Unigene126925_All Unigene137063_All 0.238087
2 <NA> <NA> Unigene126925_All Unigene24551_All 0.374231
3 <NA> <NA> Unigene126925_All Unigene31835_All 0.367897
4 BAR xxx Unigene126925_All Unigene165366_All 0.247844
5 <NA> <NA> Unigene126925_All Unigene111784_All 0.344493
What I want to do is replace the <NA> cells with the preceding values, yielding this:
FOO yyy Unigene126925_All Unigene137063_All 0.238087
FOO yyy Unigene126925_All Unigene24551_All 0.374231
FOO yyy Unigene126925_All Unigene31835_All 0.367897
BAR xxx Unigene126925_All Unigene165366_All 0.247844
BAR xxx Unigene126925_All Unigene111784_All 0.344493
In the above example, the 2nd row has NA in V1 and V2, so it has to take the values for those columns from the nearest preceding row that contains them.
How can I achieve that in R?
Upvotes: 1
Views: 177
Reputation: 57210
You can use the na.locf function (package zoo):
library(zoo)
dt$V1 <- na.locf(dt$V1)
dt$V2 <- na.locf(dt$V2)
or in one shot:
dt <- na.locf(dt)
obtaining
> dt
V1 V2 V3 V4 V5
1 FOO yyy Unigene126925_All Unigene137063_All 0.238087
2 FOO yyy Unigene126925_All Unigene24551_All 0.374231
3 FOO yyy Unigene126925_All Unigene31835_All 0.367897
4 BAR xxx Unigene126925_All Unigene165366_All 0.247844
5 BAR xxx Unigene126925_All Unigene111784_All 0.344493
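If you would rather not depend on zoo, the same "carry the last value forward" idea can be sketched in base R. This is only an illustration: fill_down is a hypothetical helper name (not part of base R or zoo), the small data frame is retyped from the question rather than read from the URL, and it assumes V1 and V2 are character columns, not factors.

```r
# Hypothetical helper (not from zoo or base R): carry the last non-NA value forward.
# Assumes x is an atomic, non-factor vector such as character or numeric.
fill_down <- function(x) {
  seen <- cumsum(!is.na(x))       # count of non-NA values seen so far (0 before the first one)
  c(NA, x[!is.na(x)])[seen + 1]   # position 1 is NA, so leading NAs stay NA
}

# First two columns of the question's data, retyped by hand for the example.
dt <- data.frame(
  V1 = c("FOO", NA, NA, "BAR", NA),
  V2 = c("yyy", NA, NA, "xxx", NA),
  stringsAsFactors = FALSE
)

# Fill only the columns that need it.
dt[c("V1", "V2")] <- lapply(dt[c("V1", "V2")], fill_down)
print(dt)
```

Unlike na.locf with its default na.rm = TRUE, this sketch keeps leading NAs in place, so the result always has the same length as the input; na.locf(x, na.rm = FALSE) behaves the same way in that respect.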
Upvotes: 4