Adam_S

Reputation: 770

Convert column from list to factor or character

I have addresses I need to compare. I got 90% of the way there thanks to a helpful answer on this site, but I need the last 10%.

I have the code below to generate addresses for comparison. I need to see if there is any difference between addr1 and addr2.

eg_data <- data.frame(
  addr1 = c('123 Main St', '742 Evergreen Ter', '8435 Roanoke Dr', '1340 N State Pkwy'),
  addr2 = c('123 Main St Apt 4', '742 Evergreen Terrace', '8435 Roanoke Dr Unit 5', '1340 N State Pkwy'),
  stringsAsFactors = FALSE
)

The next part, which was very helpful, combines vsetdiff from the vecsets package with strsplit to compare the two addresses character by character and extract any difference:

library(vecsets)  # provides vsetdiff

eg_data$addr_comp2_1 <- mapply(vsetdiff, strsplit(eg_data$addr2, split = ""),
                               strsplit(eg_data$addr1, split = ""))

If you run the code, you will see that the differences come back in a form like c(" ","A","p","t"," ","4") for the row 1 addresses, and that the column is a list. I need this column to hold a single string (or factor value) per row. In the data view, I need to see addr_comp2_1 : chr " Apt 4" ... rather than addr_comp2_1 : List of 4, so that the data frame itself gives me " Apt 4" in col3 / row1 and not c(" ","A","p","t"," ","4").

I have tried:

eg_data$fix <- paste(eg_data$addr_comp2_1, collapse=', ')
eg_data$fix2 <- str_c(eg_data$addr_comp2_1, collapse=',')
eg_data$fix3 <- as.factor(eg_data$addr_comp2_1)
eg_data$fix4 <- lapply(eg_data$addr_comp2_1, unlist)
eg_data$fix5 <- matrix(unlist(eg_data$addr_comp2_1), nrow = 4, byrow = FALSE)
eg_data$fix6 <- unlist(eg_data$addr_comp2_1, use.names = FALSE, recursive = FALSE)

None of these work. fix5 comes closest, but it gives each individual character its own row instead of keeping the c() groupings together, so I end up with 17 rows instead of a single new column with four entries.
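For anyone wondering why the attempts above misbehave, here is a minimal sketch. The `diffs` list is a hand-built stand-in for the list column described in the question (an assumption, written out so the example runs without vecsets). It shows that paste() with collapse on a list first deparses each element and then glues all rows into one string, while unlist() flattens everything to one character per element.

```r
# Hand-built stand-in for the list column from the question (assumed values)
diffs <- list(c(" ", "A", "p", "t", " ", "4"),
              c("r", "a", "c", "e"),
              c(" ", "U", "n", "i", "t", " ", "5"),
              character(0))

# paste() coerces the list with as.character(), which deparses each
# element into text such as 'c(" ", "A", ...)' instead of joining it
deparsed <- as.character(diffs[1])

# collapse = then merges all four rows into a single string
whole <- paste(diffs, collapse = ", ")
length(whole)          # 1

# unlist() drops the per-row grouping: one entry per character
length(unlist(diffs))  # 17
```

So the collapsing has to happen inside each list element, not across the whole column.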

Any help is appreciated.

Upvotes: 0

Views: 66

Answers (1)

Naveen

Reputation: 1210

You just have to concatenate the characters within each list element. Applying paste with collapse = '' to every element via lapply will do it for you.

Code

library(vecsets)  # provides vsetdiff

eg_data <- data.frame(
  addr1 = c('123 Main St', '742 Evergreen Ter', '8435 Roanoke Dr', '1340 N State Pkwy'),
  addr2 = c('123 Main St Apt 4', '742 Evergreen Terrace', '8435 Roanoke Dr Unit 5', '1340 N State Pkwy'),
  stringsAsFactors = FALSE
)

# Character-level difference between addr2 and addr1, one list element per row
eg_data$addr_comp2_1 <- mapply(vsetdiff, strsplit(eg_data$addr2, split = ""), strsplit(eg_data$addr1, split = ""))

# Collapse each element's characters into a single string
eg_data$addr_comp2_2 <- lapply(eg_data$addr_comp2_1, paste, collapse = '')
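One caveat: lapply always returns a list, so str() would still report that column as a list of 4 even though each element is now a single string. If a plain chr column is needed, as the question asks, a possible variant is vapply with a character(1) template. This is only a sketch: the list column is hand-built here as a stand-in for the vsetdiff result, and the column name addr_comp2_3 is made up for illustration.

```r
# Hand-built stand-in for the vsetdiff list column (assumed values)
eg_data <- data.frame(addr1 = c("123 Main St", "742 Evergreen Ter",
                                "8435 Roanoke Dr", "1340 N State Pkwy"),
                      stringsAsFactors = FALSE)
eg_data$addr_comp2_1 <- list(c(" ", "A", "p", "t", " ", "4"),
                             c("r", "a", "c", "e"),
                             c(" ", "U", "n", "i", "t", " ", "5"),
                             character(0))

# vapply() returns an atomic character vector: one string per row.
# addr_comp2_3 is a hypothetical column name.
eg_data$addr_comp2_3 <- vapply(eg_data$addr_comp2_1, paste,
                               character(1), collapse = "")

is.character(eg_data$addr_comp2_3)  # TRUE
eg_data$addr_comp2_3[1]             # " Apt 4"
```

Rows with no difference come out as an empty string "", because paste(character(0), collapse = "") returns "". Wrapping the result in as.factor() would give a factor column instead.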


Upvotes: 1
