Reputation: 47
I have a .csv file with a single column of 1000 rows. Each row contains one word (bag-of-words model). For each word, I want to find out whether it is a noun, verb, adjective, etc. I would like to add a second column (also 1000 rows) that holds the part of speech (noun, verb, etc.) of the word in column 1.
I have already imported the CSV into R. What do I have to do next?
[Image: an example list of words; for each one I want to find out whether it is a noun, verb, etc.]
Upvotes: 1
Views: 209
Reputation: 6813
You could use spacyr, which is an R wrapper for the Python package spaCy.
Note: you will first have to load the package and initialize spaCy with the path to your Python executable:
library(spacyr)
spacy_initialize(python_executable = '/path/to/python')
Then for your terms:
Terms <- data.frame(Term = c("unit",
"determine",
"generate",
"digital",
"mount",
"control",
"position",
"input",
"output",
"user"), stringsAsFactors = FALSE)
Use the function spacy_parse() to tag your terms and add the tags to your data frame:
Terms$POS_TAG <- spacy_parse(Terms$Term)$pos
The result is:
        Term POS_TAG
1       unit    NOUN
2  determine    VERB
3   generate    VERB
4    digital     ADJ
5      mount    VERB
6    control    NOUN
7   position    NOUN
8      input    NOUN
9     output    NOUN
10      user    NOUN
Upvotes: 0
Reputation: 23608
There are multiple options, but you could use udpipe for this.
terms <- data.frame(term = c("unit", "determine", "generate", "digital", "mount", "control", "position", "input", "output", "user"),
stringsAsFactors = FALSE)
library(udpipe)
# Check whether the model has already been downloaded.
# Note: the file name depends on the model version; adjust it if yours differs.
model_file <- "english-ud-2.0-170801.udpipe"
if (file.exists(model_file)) {
  ud_model <- udpipe_load_model(file = model_file)
} else {
  ud_model <- udpipe_download_model(language = "english")
  ud_model <- udpipe_load_model(ud_model$file_model)
}
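# No need for dependency parsing, as this data only contains single words.
# (The result is named `anno` rather than `t` to avoid masking base::t().)
anno <- udpipe_annotate(ud_model, terms$term, parser = "none")
anno <- as.data.frame(anno)
# This assumes one token per term, so the rows line up with `terms`.
terms$POSTAG <- anno$upos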
terms
        term POSTAG
1       unit   NOUN
2  determine   VERB
3   generate   VERB
4    digital    ADJ
5      mount   NOUN
6    control   NOUN
7   position   NOUN
8      input   NOUN
9     output   NOUN
10      user   NOUN
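The direct assignment above works because each term yields exactly one token. If some rows of a real file are split into several tokens (hyphenated words, for example), the annotation has more rows than the input and the assignment fails. A minimal sketch of a more defensive merge follows, assuming you pass your own document ids to udpipe_annotate() through its doc_id argument. The helper name attach_first_tag and its arguments are hypothetical and not part of udpipe.

```r
# Hypothetical helper (not part of udpipe): keep exactly one tag per input term.
# `annotation` is any data frame with a `doc_id` column and a tag column.
attach_first_tag <- function(terms, annotation, ids, tag_col = "upos") {
  stopifnot(length(ids) == nrow(terms))
  # match() gives the first annotation row for each id, or NA if a term has no token
  first_row <- match(ids, annotation$doc_id)
  terms$POSTAG <- annotation[[tag_col]][first_row]
  terms
}

# Assumed usage with the objects from the answer above:
# ids   <- paste0("doc", seq_len(nrow(terms)))
# anno  <- as.data.frame(udpipe_annotate(ud_model, terms$term, doc_id = ids, parser = "none"))
# terms <- attach_first_tag(terms, anno, ids)
```

Taking the first token's tag is only one possible choice; for multi-token rows you may prefer to inspect them by hand.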
Upvotes: 1