user3827298

Reputation: 31

Error in `row.names<-.data.frame`(`*tmp*`, value = c(NA_real_, NA_real_

I am trying to build a classification model from tweets and their polarity labels. Partway through, I get a strange error at this line:

analytics <- create_analytics(container, MAXENT_CLASSIFY)

This is the output:

Error in `row.names<-.data.frame`(`*tmp*`, value = c(NA_real_, NA_real_,  : 
  duplicate 'row.names' are not allowed
In addition: Warning messages:
1: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs,  :
  NAs introduced by coercion
2: In create_documentSummary(container, score_summary) :
  NAs introduced by coercion
3: In cbind(MANUAL_CODE = testing_codes, CONSENSUS_CODE = scores$BEST_LABEL,  :
  NAs introduced by coercion
4: In create_topicSummary(container, score_summary) :
  NAs introduced by coercion
5: In cbind(TOPIC_CODE = as.numeric(as.vector(topic_codes)), NUM_MANUALLY_CODED = manually_coded,  :
  NAs introduced by coercion
6: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs,  :
  NAs introduced by coercion
7: non-unique values when setting 'row.names':

My CSV file looks like:

text, polarity
Hello I forget the password of my credit card need to know how I can make my statement, neutral
can provide the swift code thanks, neutral
thanks just one more doubt has this card commissions with these characteristics, neutral
Thanks, neutral
are arriving mail scam, negative
can you help me I need to pay an online purchase and ask me for a terminal my debit which is, neutral
if I do not win anything this time I change banks, negative
you can be the next winner of the million that circumvents account award date January, neutral
account and see my accounts so I can have the, negative
thanks i just send the greetings consultation, neutral
may someday enable office not sick people, negative
hello is running payments through the online banking no, negative
thanks hope they do, neutral
should pay attention to many happened to us that your system flushed insufficient balance or had no money in the accounts, negative
yesterday someone had the dignity to answer the telephone banking and verify that the system is crap, negative
and tried but apparently the problem is just to pay movistar services, neutral
good morning was trying to pay for services through the website but get error retry in minutes, negative
if no system agent is non clients or customers also, positive

The code I am using is:

library(RTextTools)

pg <- read.csv("cleened_tweets.csv", header=TRUE, row.names=NULL)

head(pg)

pgT <- as.factor(pg$text)

pgP <- as.factor(pg$polarity)

doc_matrix <- create_matrix(pgT, language="spanish", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998)

dim(doc_matrix)

container <- create_container(doc_matrix, pgP, trainSize=1:275, testSize=276:375, virgin=FALSE)

MAXENT <- train_model(container,"MAXENT")

MAXENT_CLASSIFY <- classify_model(container, MAXENT)

analytics <- create_analytics(container, MAXENT_CLASSIFY)

summary(analytics)

Upvotes: 2

Views: 3096

Answers (2)

ganong

Reputation: 311

I have also run into this error with RTextTools. In my experience, the create_analytics function cannot handle labels stored as factors or strings; it only works with numeric labels. I usually run the code with numeric labels and merge my text labels back on at the end.
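As a rough illustration of merging the text labels back on, here is a minimal base R sketch. The `polarity` vector stands in for `pg$polarity`, and `predicted_codes` is a made-up vector of predictions, not real RTextTools output; in practice the predicted column may come back as a factor or character and need `as.integer(as.character(...))` first.

```r
# Hypothetical labels standing in for pg$polarity
polarity <- c("neutral", "negative", "positive", "neutral")

# Encode as a factor once and keep it, so the level order is recorded
polarity_factor <- factor(polarity)
label_lookup <- levels(polarity_factor)        # text label for each code
numeric_codes <- as.integer(polarity_factor)   # codes to pass to the container

# Hypothetical predicted codes (assumption: not actual classifier output)
predicted_codes <- c(2, 1, 1, 3)

# Map the codes back to text labels by indexing into the lookup
predicted_labels <- label_lookup[predicted_codes]
print(predicted_labels)
```

Keeping `label_lookup` around means the mapping does not have to be reconstructed by hand after the analytics step.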

Upvotes: 1

Pawan Kumar

Reputation: 546

Convert your pgP variable from a factor to numeric codes. This should resolve the issue:

pgP <- as.numeric(as.factor(pg$polarity))
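To see what that conversion does, here is a small base R sketch with a made-up `polarity` vector in place of `pg$polarity`. It also shows why the warnings in the question mention "NAs introduced by coercion": calling `as.numeric` on text labels directly gives `NA`, which is presumably what happens inside `create_analytics` when it calls `as.numeric(best_labels)`.

```r
# Hypothetical labels standing in for pg$polarity
polarity <- c("neutral", "negative", "positive", "neutral")

# Coercing text straight to numeric yields NA for every element
# (the warning is suppressed here only to keep the output clean)
direct <- suppressWarnings(as.numeric(polarity))
print(direct)

# Going through a factor yields one integer code per level instead
codes <- as.numeric(as.factor(polarity))
print(codes)

# The codes follow the sorted level order, so check which label got which code
print(levels(as.factor(polarity)))
```

Note that the codes are assigned by level order, not by order of appearance, so it is worth printing the levels before interpreting the results.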

Upvotes: 0
