Isaiah
Isaiah

Reputation: 2155

Using tidytext and broom but not finding tidier for LDA_VEM

The tidytext book has examples with a tidier for topicmodels:

library(tidyverse)
library(tidytext)
library(topicmodels)
library(broom)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
+                            word = c("dog", "cat", "chicken"),
+                            n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

animal_lda <- tidy(animal_lda, matrix = "beta")

# Console output
Error in as.data.frame.default(x) : 
  cannot coerce class "structure("LDA_VEM", package = "topicmodels")" to a data.frame
In addition: Warning message:
In tidy.default(animal_lda, matrix = "beta") :
  No method for tidying an S3 object of class LDA_VEM , using as.data.frame

Replicating the error which is also seen here but in this instance library(tidytext) is present.

Below is a list of all packages are their corresponding version:

 packageVersion("tidyverse")
 ‘1.2.1’

 packageVersion("tidytext")
 ‘0.1.6’   

 packageVersion("topicmodels")
 ‘0.2.7’  

 packageVersion("broom")
 ‘0.4.3’

Output from function call sessionInfo():

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] broom_0.4.3       tidytext_0.1.6    forcats_0.2.0     stringr_1.2.0     dplyr_0.7.4       purrr_0.2.4       readr_1.1.1       tidyr_0.8.0      
 [9] tibble_1.4.2      ggplot2_2.2.1     tidyverse_1.2.1   topicmodels_0.2-7

loaded via a namespace (and not attached):
 [1] modeltools_0.2-21 slam_0.1-42       NLP_0.1-11        reshape2_1.4.3    haven_1.1.1       lattice_0.20-35   colorspace_1.3-2  SnowballC_0.5.1  
 [9] stats4_3.4.3      yaml_2.1.16       rlang_0.1.6       pillar_1.1.0      foreign_0.8-69    glue_1.2.0        modelr_0.1.1      readxl_1.0.0     
[17] bindrcpp_0.2      bindr_0.1         plyr_1.8.4        munsell_0.4.3     gtable_0.2.0      cellranger_1.1.0  rvest_0.3.2       psych_1.7.8      
[25] tm_0.7-3          parallel_3.4.3    tokenizers_0.1.4  Rcpp_0.12.15      scales_0.5.0      jsonlite_1.5      mnormt_1.5-5      hms_0.4.1        
[33] stringi_1.1.6     grid_3.4.3        cli_1.0.0         tools_3.4.3       magrittr_1.5      lazyeval_0.2.1    janeaustenr_0.1.5 crayon_1.3.4     
[41] pkgconfig_2.0.1   Matrix_1.2-12     xml2_1.2.0        lubridate_1.7.2   assertthat_0.2.0  httr_1.3.1        rstudioapi_0.7    R6_2.2.2         
[49] nlme_3.1-131      compiler_3.4.3   

Upvotes: 2

Views: 2290

Answers (4)

Matthias Reccius
Matthias Reccius

Reputation: 111

Adding to the very helpful answer provided by Julia Silge:

I too believe that the interaction between loading .Rdata and the topicmodels package is the culprit here. But you can still work with your saved workspace:

I was able to eliminate the problem by starting with a fresh restart of RStudio, loading the topicmodels package and then loading the .Rdata. Done in this sequence, the error message disappears. Loading first the data and then the package does not work.

One more word on workspaces: In the case of LDA, using these along with your RScripts is really the only way I could figure out to work efficiently. Depending on the parameters and the size of the corpus, fitting an LDA-model may take several hours. It is crucial to be able to save the model fit to then do further analyses down the road.

Upvotes: 1

L&#233;o Henry
L&#233;o Henry

Reputation: 137

I had the same issue when I loaded the LDA I had saved. Finally, for no apparent reasons when I restarted the R session I worked again.

Upvotes: 0

Isaiah
Isaiah

Reputation: 2155

Deleting .Rhistory and .RData led to correct behaviour.

Upvotes: 6

Julia Silge
Julia Silge

Reputation: 11663

Wow, that is extremely mysterious to me. I am not able to reproduce that error. I installed to all the same versions/etc as you, except that I am on MacOS instead of Windows. I do have tests for the LDA tidiers that run and pass on Windows on Appveyor, so I would expect this to work.

The code you have should work without loading broom, for what it's worth.

library(tidyverse)
library(tidytext)
library(topicmodels)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
                           word = c("dog", "cat", "chicken"),
                           n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

class(animal_lda)
#> [1] "LDA_VEM"
#> attr(,"package")
#> [1] "topicmodels"

tidy(animal_lda, matrix = "beta")
#> # A tibble: 15 x 3
#>    topic term                                                beta
#>    <int> <chr>                                              <dbl>
#>  1     1 dog     0.0000000000000000000000000000000000000000000372
#>  2     2 dog     0.0000000000000000000000000000000000000000000372
#>  3     3 dog     0.0000000000000000000000000000000000000000000372
#>  4     4 dog     1.00                                            
#>  5     5 dog     0.0000000000000000000000000000000000000000000372
#>  6     1 cat     0.0000000000000000000000000000000000000000000372
#>  7     2 cat     0.0000000000000000000000000000000000000000000372
#>  8     3 cat     0.0000000000000000000000000000000000000000000372
#>  9     4 cat     0.0000000000000000000000000000000000000000000372
#> 10     5 cat     1.00                                            
#> 11     1 chicken 0.0000000000000000000000000000000000000000000372
#> 12     2 chicken 0.0000000000000000000000000000000000000000000372
#> 13     3 chicken 1.00                                            
#> 14     4 chicken 0.0000000000000000000000000000000000000000000372
#> 15     5 chicken 0.0000000000000000000000000000000000000000000372

Created on 2018-02-14 by the reprex package (v0.2.0).

What happens if you load library(methods) as well?

Upvotes: 2

Related Questions