tushaR

Reputation: 3116

Evaluate a function on .SDcols and return all columns in data.table

I have a data.table, "result", which holds the output of eclat() from the arules package in R.

        lhs           rhs         sup        conf      lift       itemset
1: {XXXXXXXXXXX} {XXXXXXXXOOO} 0.001635323 1.0000000 611.50000       1
2: {XXXXXXXXOOO} {XXXXXXXXXXX} 0.001635323 1.0000000 611.50000       1
3: {XXXXX00XXXX} {XXXXXX0XXXX} 0.001635323 1.0000000  32.18421       2
4: {XXXXXXX00XX} {XXX0XXXXXXX} 0.001635323 0.5000000 203.83333       3
5: {XXX0XXXXXXX} {XXXXXXX00XX} 0.001635323 0.6666667 203.83333       3  

I want to remove the "{" and "}" from the lhs and rhs columns of "result". I can run gsub on them, but the following statement returns a data.table containing only lhs and rhs:

result[,lapply(.SD,gsub,pattern = "{",replacement = "",fixed = T),.SDcols=c("lhs","rhs")][,lapply(.SD,gsub,pattern="}",replacement="",fixed=T)]

Output that I am getting:

         lhs           rhs      
1:  XXXXXXXXXXX   XXXXXXXXOOO
2:  XXXXXXXXOOO   XXXXXXXXXXX
3:  XXXXX00XXXX   XXXXXX0XXXX
4:  XXXXXXX00XX   XXX0XXXXXXX
5:  XXX0XXXXXXX   XXXXXXX00XX

But I also want to retain the values of sup, conf, lift and itemset.

Expected output:

        lhs           rhs         sup      conf      lift         itemset
1:  XXXXXXXXXXX   XXXXXXXXOOO  0.001635323 1.0000000 611.50000       1
2:  XXXXXXXXOOO   XXXXXXXXXXX  0.001635323 1.0000000 611.50000       1
3:  XXXXX00XXXX   XXXXXX0XXXX  0.001635323 1.0000000  32.18421       2
4:  XXXXXXX00XX   XXX0XXXXXXX  0.001635323 0.5000000 203.83333       3
5:  XXX0XXXXXXX   XXXXXXX00XX  0.001635323 0.6666667 203.83333       3

How can I keep the remaining columns, with their names, in the result?

Upvotes: 2

Views: 1019

Answers (2)

Sathish

Reputation: 12703

library('data.table')
for(col in c("lhs", "rhs")){
  set(result, , col, gsub( "[{}]", '', result[[col]] ) )  # set() updates by reference and avoids the overhead of `[.data.table` in a loop. see ?`:=`
}

result
#            lhs         rhs         sup      conf      lift itemset
# 1: XXXXXXXXXXX XXXXXXXXOOO 0.001635323 1.0000000 611.50000       1
# 2: XXXXXXXXOOO XXXXXXXXXXX 0.001635323 1.0000000 611.50000       1
# 3: XXXXX00XXXX XXXXXX0XXXX 0.001635323 1.0000000  32.18421       2
# 4: XXXXXXX00XX XXX0XXXXXXX 0.001635323 0.5000000 203.83333       3
# 5: XXX0XXXXXXX XXXXXXX00XX 0.001635323 0.6666667 203.83333       3
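
To try the loop above without running eclat(), here is a minimal sketch on a made-up table. The toy values are copied from the question, and it assumes lhs and rhs are character columns; the real eclat() output may store them differently.

```r
library(data.table)

# Toy stand-in for the eclat() output (assumption: lhs/rhs are character columns)
result <- data.table(
  lhs     = c("{XXXXXXXXXXX}", "{XXXXXXXXOOO}", "{XXXXX00XXXX}"),
  rhs     = c("{XXXXXXXXOOO}", "{XXXXXXXXXXX}", "{XXXXXX0XXXX}"),
  sup     = 0.001635323,
  conf    = 1,
  lift    = c(611.5, 611.5, 32.18421),
  itemset = c(1L, 1L, 2L)
)

for (col in c("lhs", "rhs")) {
  # i = NULL means "all rows"; j names the column to overwrite by reference
  set(result, i = NULL, j = col, value = gsub("[{}]", "", result[[col]]))
}

print(names(result))  # all six columns are still present
print(result$lhs)     # braces removed
```

Because set() modifies "result" by reference, nothing needs to be reassigned and the untouched columns are never copied.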

Upvotes: 1

akrun

Reputation: 886938

We can assign the output of lapply back to the same columns with `:=`. This updates "result" in place and leaves the other columns untouched:

result[,c("lhs", "rhs") := lapply(.SD,gsub,pattern = "[{}]",
       replacement = ""), .SDcols=c("lhs","rhs")]
result
#           lhs         rhs         sup      conf      lift itemset
#1: XXXXXXXXXXX XXXXXXXXOOO 0.001635323 1.0000000 611.50000       1
#2: XXXXXXXXOOO XXXXXXXXXXX 0.001635323 1.0000000 611.50000       1
#3: XXXXX00XXXX XXXXXX0XXXX 0.001635323 1.0000000  32.18421       2
#4: XXXXXXX00XX XXX0XXXXXXX 0.001635323 0.5000000 203.83333       3
#5: XXX0XXXXXXX XXXXXXX00XX 0.001635323 0.6666667 203.83333       3
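
Since `:=` changes "result" itself, it may help to work on a copy when the original braces are still needed. The following is only a sketch under assumptions: the small table is made up from the question's values, and the names `cols` and `cleaned` are hypothetical.

```r
library(data.table)

# Made-up sample resembling two rows of the question's table
result <- data.table(
  lhs  = c("{XXXXXXX00XX}", "{XXX0XXXXXXX}"),
  rhs  = c("{XXX0XXXXXXX}", "{XXXXXXX00XX}"),
  sup  = 0.001635323,
  conf = c(0.5, 0.6666667)
)

cols <- c("lhs", "rhs")  # hypothetical helper holding the columns to clean

# copy() makes a deep copy, so the := below does not touch `result`
cleaned <- copy(result)[, (cols) := lapply(.SD, gsub, pattern = "[{}]", replacement = ""),
                        .SDcols = cols]

print(cleaned)
print(result)  # still has the braces
```

Wrapping `cols` in parentheses on the left of `:=` tells data.table to use the names stored in the variable rather than create a column called "cols".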

Upvotes: 3
