user1477388

Reputation: 21430

Using View with data.frame subset adds row.names column

I create two subsets (data.frames) like so:

sms_raw_train <- sms_raw[1:4169, ]
sms_raw_test <- sms_raw[4170:5559, ]

The first, sms_raw_train, looks like this:

    type    text
1   ham Hope you are having a good week. Just checking in
2   ham K..give back my thanks.
3   ham Am also doing in cbe only. But have to pay.

The second, sms_raw_test, looks like this:

    row.names   type    text
1   4170    ham I'm coming home 4 dinner.
2   4171    ham Come by our room at some point so we can iron out the plan for this weekend
3   4172    ham Its sunny in california. The weather's just cool

As you can see, it adds a row.names column. However, if I do this:

> str(sms_raw_test[1:3, ])
'data.frame':   3 obs. of  2 variables:
 $ type: Factor w/ 2 levels "ham","spam": 1 1 1
 $ text: chr  "I'm coming home 4 dinner." "Come by our room at some point so we can iron out the plan for this weekend" "Its sunny in california. The weather's just cool"

The column doesn't actually exist.

What is the purpose of this column? Why was it added to View(sms_raw_test)?

Upvotes: 2

Views: 491

Answers (1)

Matthew Lundberg

Reputation: 42639

View is adding that column for display. As you have seen, it is not actually present in the subset.

From help(View):

If there are row names on the data frame that are not 1:nrow, they are displayed in a separate first column called row.names.

The row names for sms_raw_test are (presumably) 4170:5559, because subsetting a data frame keeps the row names of the original rows.

The row names for sms_raw_train are 1:nrow so this behavior is not evident there.
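
To check this yourself, you can inspect the row names directly. The sketch below uses a small made-up data frame in place of sms_raw (the real data isn't shown in the question), so the values are illustrative only. It also shows one way to get rid of the extra column in View, assuming you don't need the original row numbers: reset the row names to the default 1:nrow.

```r
# Hypothetical stand-in for sms_raw; the real data is not shown in the question.
sms_raw <- data.frame(
  type = factor(c("ham", "spam", "ham", "ham", "spam", "ham")),
  text = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)

# Same kind of split as in the question, on a smaller scale.
sms_raw_train <- sms_raw[1:4, ]
sms_raw_test  <- sms_raw[5:6, ]

# The subset keeps the row names of the rows it was taken from.
train_rn <- rownames(sms_raw_train)  # "1" "2" "3" "4" (same as 1:nrow)
test_rn  <- rownames(sms_raw_test)   # "5" "6" (not 1:nrow)

# Row names are an attribute, not a column.
test_cols <- names(sms_raw_test)     # "type" "text"

# Resetting the row names restores the default 1:nrow numbering.
rownames(sms_raw_test) <- NULL
reset_rn <- rownames(sms_raw_test)   # "1" "2"
```

After the reset, View(sms_raw_test) should no longer show a separate row.names column, since the row names are 1:nrow again.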

Upvotes: 4
