Jonathan Ramirez

Reputation: 127

R inner join different data types

I was wondering if there is a way, or perhaps another package that uses SQL queries to manipulate data frames, to join on a key without having to convert numeric variables to strings/characters.

library(dplyr)

input_key <- c(9061, 8680, 1546, 5376, 9550, 9909, 3853, 3732, 9209)
output_data <- data.frame(input_key)

answer_product <- c("Water", "Bread", "Soda", "Chips", "Chicken", "Cheese", "Chocolate", "Donuts", "Juice")
answer_data <- data.frame(cbind(input_key, answer_product), stringsAsFactors = FALSE)

left_join(output_data, answer_data, by = "input_key")

Upvotes: 1

Views: 802

Answers (1)

rconradin

Reputation: 346

The left_join function from dplyr also works with numeric values as the key.

I think your problem comes from the cbind function: its output is a matrix, which can only store a single data type. In your case, the numeric values are coerced to character. Unlike a matrix, a data.frame can store columns of different types, much like a list.

With your code, the key column is converted to character:
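To make the coercion visible, here is a minimal sketch using a shortened version of the vectors from the question (the names m and df are only illustrative). It compares what cbind produces with what data.frame produces from the same inputs.

```r
# Shortened versions of the vectors from the question
input_key <- c(9061, 8680, 1546)
answer_product <- c("Water", "Bread", "Soda")

# cbind() on two vectors builds a matrix, and a matrix holds one type only,
# so the numeric keys are coerced to character
m <- cbind(input_key, answer_product)
is.matrix(m)   # TRUE
typeof(m)      # "character"

# data.frame() keeps each column's own type
df <- data.frame(input_key, answer_product, stringsAsFactors = FALSE)
sapply(df, class)
```

The last call should report numeric for input_key and character for answer_product, which is why building the data.frame directly avoids the type mismatch.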

> str(answer_data)
'data.frame':   9 obs. of  2 variables:
 $ input_key     : chr  "9061" "8680" "1546" "5376" ...
 $ answer_product: chr  "Water" "Bread" "Soda" "Chips" ...

If you instead construct the data.frame with:

answer_data_2 <- data.frame(
  input_key = input_key,
  answer_product = answer_product,
  stringsAsFactors = FALSE
)

the key column stays numeric:

> str(answer_data_2)
'data.frame':   9 obs. of  2 variables:
 $ input_key     : num  9061 8680 1546 5376 9550 ...
 $ answer_product: chr  "Water" "Bread" "Soda" "Chips" ...

and

left_join(output_data, answer_data_2, by = "input_key")

works with the numeric keys.
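If the data frame has already been built through cbind, another option is to convert the key column back to numeric before joining. The following is only a sketch under that assumption, with shortened data; base R's merge is used so that it runs without extra packages, and the commented left_join call would be the dplyr equivalent.

```r
input_key <- c(9061, 8680, 1546)
answer_product <- c("Water", "Bread", "Soda")
output_data <- data.frame(input_key)

# Built as in the question: the key column ends up as character
answer_data <- data.frame(cbind(input_key, answer_product), stringsAsFactors = FALSE)

# Convert the key back to numeric so both sides of the join have the same type
answer_data$input_key <- as.numeric(answer_data$input_key)

# Left join in base R; all.x = TRUE keeps every row of output_data
joined <- merge(output_data, answer_data, by = "input_key", all.x = TRUE)
# dplyr equivalent: left_join(output_data, answer_data, by = "input_key")
joined
```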

Upvotes: 1
