Reputation: 31
I have a data frame with the following structure:
library(tidyverse)
df <- tibble::tribble(
~var1, ~var2, ~var3,
"ano 2005", "km 128000", "marca chevrolet",
"ano 2019", "marca hyundai tucson", "km 50000",
"marca grand vitara sz", "ano 2012", "NA"
)
I need to create new variables where the corresponding information is assigned for this use the following code
df %>%
stack() %>%
select(-ind) %>%
separate(values, into = c("column", "value")) %>%
pivot_wider(value, column, values_fn = list(value = list)) %>%
unnest(cols = c(marca, ano, km))
but having NA I have the following error: No common size for marca
, size 120, and km
, size 119.
There is also another error and it just returns the first word and the rest eliminates it.
if anyone can help me I would really appreciate it
ano marca km
2005 chevrolet 128000
2019 hyundai 50000
2012 grand
Upvotes: 1
Views: 63
Reputation: 3943
Here is a tidyverse
solution. This requires the NA
value in your data frame to be a NA_character_
, not the quoted string "NA"
. The argument extra='merge'
tells separate()
not to drop additional words in the car model name beyond the first.
# replace quoted NA with true NA
df[df == 'NA'] <- NA_character_
df %>%
mutate(id = 1:nrow(df)) %>%
pivot_longer(-id) %>%
separate(value, into = c('column', 'value'), extra = 'merge') %>%
select(-name) %>%
filter(!is.na(column)) %>%
pivot_wider(id_cols = id, names_from = column)
# A tibble: 3 x 4
# id ano km marca
# <int> <chr> <chr> <chr>
#1 1 2005 128000 chevrolet
#2 2 2019 50000 hyundai tucson
#3 3 2012 NA grand vitara sz
Upvotes: 0
Reputation: 27732
here is my data.table
-approach
library( data.table )
#set to data.table format
setDT(df)
#create row_id's
df[, id := .I][]
#melt to long
ans <- melt( df, id.vars = "id" )
#split strings, using first space as separator
ans[, c("col_name", "col_value") := as.data.table( stringr::str_split_fixed( value, " ", 2 ) ) ]
#cast to wide
dcast( ans[!col_name == "NA",], id ~ col_name, value.var = "col_value")
# id ano km marca
# 1: 1 2005 128000 chevrolet
# 2: 2 2019 50000 hyundai tucson
# 3: 3 2012 <NA> grand vitara sz
Upvotes: 1