Reputation: 1352
For a project I use the R package wordVectors and its function train_word2vec() (see an example here).
My first question: This function requires a train_file, which is a single .txt file stored in a specific directory on your computer. But I also have the text in my R environment, in a data.frame column called text (df$text).
I want to avoid reading the .txt file, but instead use an R data.frame with text. Is there a workaround?
My second question: The same function (train_word2vec) has an 'output' parameter, described as 'Path of the output file'. Again, I don't want anything written to my computer, so is there a workaround that lets me keep the output model (let's say "vec.bin") in my R environment (R script)?
CODE:
library(devtools)
install_github("mukul13/rword2vec")
library(rword2vec)
model <- word2vec(train_file = "text8", output_file = "vec.bin", binary = 1)
# Instead of "text8" I want to pass a data.frame column (containing text).
# Instead of "vec.bin" I want something like "foo <- vec.bin" in R, so that the output stays within R and not on my PC.
Upvotes: 0
Views: 135
Reputation: 26843
rword2vec is a thin wrapper around word2vec, a program written in C that expects to read from a training file and write to an output file. See for example here: https://github.com/mukul13/rword2vec/blob/master/R/word2vec.R#L28. The corresponding C function is here: https://github.com/mukul13/rword2vec/blob/master/src/word2vec.c#L638. There is no way to read or write data.frames there.
Have you tried text2vec as an alternative? At least at first sight it looks more flexible.
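If you do stay with rword2vec, one possible workaround is to let R manage the files for you in the session's temporary directory. This does not avoid the disk entirely (the C code still reads and writes files), but nothing ends up in your own folders and the result lives in an R object. The following is only a minimal sketch: the helper names write_corpus, read_vectors and train_from_column are made up for illustration, and it assumes that binary = 0 makes word2vec write its usual plain-text format (a header line with vocabulary size and dimensions, then one word per line followed by its vector).

```r
# Hypothetical helper: write a text column (e.g. df$text) to a temporary
# file and return the path. as.character() also covers factor columns.
write_corpus <- function(text) {
  path <- tempfile(fileext = ".txt")
  writeLines(as.character(text), path)
  path
}

# Hypothetical helper: read vectors stored in word2vec's plain-text format.
# Assumed layout: first line "<vocab size> <dimensions>", then one line per
# word with the word followed by its vector components.
read_vectors <- function(path) {
  raw <- read.table(path, skip = 1, quote = "", comment.char = "",
                    stringsAsFactors = FALSE)
  vectors <- as.matrix(raw[, -1, drop = FALSE])
  dimnames(vectors) <- list(raw[[1]], NULL)  # words become row names
  vectors
}

# Hypothetical wrapper: train on an in-memory text column and return the
# vectors as a matrix. Both temporary files are removed when it exits.
train_from_column <- function(text) {
  if (!requireNamespace("rword2vec", quietly = TRUE)) {
    stop("rword2vec is not installed")
  }
  train_path <- write_corpus(text)
  out_path <- tempfile(fileext = ".txt")
  on.exit(unlink(c(train_path, out_path)))
  # binary = 0 is assumed to produce plain-text output.
  rword2vec::word2vec(train_file = train_path, output_file = out_path,
                      binary = 0)
  read_vectors(out_path)
}
```

With that in place, foo <- train_from_column(df$text) would give you a numeric matrix with one row per word, provided the corpus is large enough for word2vec to keep any vocabulary at all.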
Upvotes: 1