Rilcon42

Reputation: 9763

removing NaN using dplyr

I am trying to remove the rows containing NaN values and then sort by row.names. I tried to do this with dplyr, but my attempt didn't work. Can someone suggest a way to fix it?

require(markovchain)
data1<-data.frame(dv=rep(c("low","high"),3),iv1=sample(c("A","B","C"),replace=T,6))
markov<-markovchainFit(data1)
markovDF<-as(markov, "data.frame")
library(dplyr)
markovDF%>%filter(rowSums>0)%>%arrange(desc(markovDF[,1]))


> markov
$estimate
             A         B         C high low
A          NaN       NaN       NaN  NaN NaN
B          NaN       NaN       NaN  NaN NaN
C          NaN       NaN       NaN  NaN NaN
high 0.3333333 0.0000000 0.6666667    0   0
low  0.6666667 0.3333333 0.0000000    0   0

GOAL:

      A    B  C  high low
high .33 .00 .67  0    0
low  .67 .33  .00 0    0

Upvotes: 8

Views: 9727

Answers (2)

Joel Carlson

Reputation: 640

It seems that nelsonauner's answer alters the row.names attribute. Since you want to sort by row.names, that seems like an issue.

You don't need dplyr to do this:

library(markovchain)
data1 <- data.frame(dv=rep(c("low","high"),3),iv1=sample(c("A","B","C"),replace=T,6))
markov<-markovchainFit(data1)

#Get into dataframe
markov <- as.data.frame(markov$estimate@transitionMatrix)

#Remove rows that contain nans
markov <- markov[complete.cases(markov), ]

#sort by rowname
markov <- markov[order(row.names(markov)),]

             A         B         C high low
high 0.0000000 0.3333333 0.6666667    0   0
low  0.3333333 0.3333333 0.3333333    0   0
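
The filtering step works because complete.cases() treats NaN the same way as NA. A minimal sketch on a made-up data frame (the name x and its values are illustrative only, not taken from the question):

```r
# Toy data: row 1 contains NaN, row 3 contains NA
x <- data.frame(a = c(NaN, 1, NA), b = c(0, 2, 3))

# TRUE only for rows with no missing values; NaN counts as missing
keep <- complete.cases(x)

# Only the second row survives
x_clean <- x[keep, ]
print(x_clean)
```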

Upvotes: 5

Nelson Auner

Reputation: 1509

There are two problems to be solved here.

  1. dplyr is meant to operate on dataframes, so we need to get the data into a dataframe. You attempt to do this with markovDF<-as(markov, "data.frame"), but I couldn't get that to work. (Did you get a non-empty dataframe?)

  2. Remove rows that have a NaN in a specific column (I'll use column A; you can change it to check all columns if you want).

You can solve this problem with

> markov$estimate@transitionMatrix %>% 
    as.data.frame %>% 
    dplyr::filter(!is.na(A)) %>% 
    arrange(-A)


          A         B         C high low
1 0.3333333 0.3333333 0.3333333    0   0
2 0.0000000 0.6666667 0.3333333    0   0
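
Note that dplyr::filter() drops the row names, which is why the rows above are labelled 1 and 2. If the state labels are needed for sorting, one option is to move them into a regular column first. This is a minimal sketch, assuming the tibble package is installed; the matrix tm is a hand-built stand-in for markov$estimate@transitionMatrix using the values printed in the question, and the column name "state" is an arbitrary choice:

```r
library(dplyr)
library(tibble)

# Stand-in for markov$estimate@transitionMatrix (values copied from the question)
states <- c("A", "B", "C", "high", "low")
tm <- matrix(
  c(rep(NaN, 15),
    1/3, 0,   2/3, 0, 0,
    2/3, 1/3, 0,   0, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(states, states)
)

result <- tm %>%
  as.data.frame() %>%
  rownames_to_column("state") %>%  # keep the row names as an ordinary column
  filter(!is.nan(A)) %>%           # drop rows where column A is NaN
  arrange(state) %>%               # sort by the former row names
  column_to_rownames("state")      # turn the column back into row names

print(result)
```

Use arrange(desc(state)) instead if the rows should be in descending order.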

Upvotes: 4
