Tyler Rinker

Reputation: 109864

data.table consistency for assignment by reference for n = 1 rows vs n > 1

I have the following setup where I want to split a text string, producing a list, and then assign the result back to the data.table as a nested (list) column. It works great for n > 1 rows but not for n = 1 row. What can I do to ensure consistent behavior?

library(data.table)

cutdash <- function(x) {strsplit(x, '-')}

## n > 1 row
x <- data.frame(
    text = c("I paid for books-He was not interesting in his teaching.", 'good'),
    id = 1:2, stringsAsFactors = FALSE)

## n = 1 row
y <- data.frame(text = "I paid for books-He was not interesting in his teaching.", 
    id = 3, stringsAsFactors = FALSE)


## good
x2 <- data.table::data.table(x)
x2[, text := cutdash(text)][]

##                                                        text id
## 1: I paid for books,He was not interesting in his teaching.  1
## 2:                                                     good  2

## bad 
y2 <- data.table::data.table(y)
y2[, text := cutdash(text)][]

##                text id
## 1: I paid for books  3

## Warning message:
## In `[.data.table`(y2, , `:=`(text, cutdash(text))) :
##   Supplied 2 items to be assigned to 1 items of column 'text' (1 unused)

Note that for the second data frame, which has only 1 row, only the first element is assigned.

Upvotes: 0

Views: 54

Answers (1)

Taylor H

Reputation: 436

You need to wrap your call to cutdash (that is, the strsplit result) in list, or in . (an alias for list inside the data.table [ call), like so:

y2[, text := .(cutdash(text))][]

or

y2[, text := list(cutdash(text))][]
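
To check that the wrapped form behaves the same for both cases, here is a minimal sketch that reuses the question's data (built directly with data.table() for brevity). As far as I can tell, when a single column is assigned, := treats a length-1 list on the right-hand side as "a list holding that column's values" and unwraps it, which is why the one-row case loses its list structure. The extra list()/.() layer gives it something to unwrap. Treat that explanation as my reading of the behavior rather than a statement from the documentation.

```r
library(data.table)

cutdash <- function(x) strsplit(x, "-")

# Same data as in the question: one table with 2 rows, one with 1 row
x2 <- data.table(
    text = c("I paid for books-He was not interesting in his teaching.", "good"),
    id = 1:2)
y2 <- data.table(
    text = "I paid for books-He was not interesting in his teaching.",
    id = 3)

# Wrap the strsplit() result in .() so the inner list becomes the column
x2[, text := .(cutdash(text))]
y2[, text := .(cutdash(text))]

# Both columns should now be list columns with one element per row
is.list(x2$text)
is.list(y2$text)
lengths(x2$text)
lengths(y2$text)
```

Since the wrapped form also works when n > 1, you can use it unconditionally instead of branching on the number of rows.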

Upvotes: 2
