Reputation: 1133
I am working with Census data and I need to combine four character columns into a single column.
Example:
LOGRECNO STATE COUNTY TRACT BLOCK
60 01 001 021100 1053
61 01 001 021100 1054
62 01 001 021100 1055
63 01 001 021100 1056
64 01 001 021100 1057
65 01 001 021100 1058
I want to create a new column that adds the strings of STATE, COUNTY, TRACT, and BLOCK together into a single string. Example:
LOGRECNO STATE COUNTY TRACT BLOCK BLOCKID
60 01 001 021100 1053 010010211001053
61 01 001 021100 1054 010010211001054
62 01 001 021100 1055 010010211001055
63 01 001 021100 1056 010010211001056
64 01 001 021100 1057 010010211001057
65 01 001 021100 1058 010010211001058
I've tried:
AL_Blocks$BLOCK_ID<- paste(c(AL_Blocks$STATE, AL_Blocks$County, AL_Blocks$TRACT, AL_Blocks$BLOCK), collapse = "")
But this combines all rows of all four columns into a single string.
Upvotes: 31
Views: 102542
Reputation: 369
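The new kid on the block is the glue package. Use glue_data() when the values come from the columns of a data frame:
library(glue)
library(magrittr)  # provides %>%
my_data %>%
  glue_data("{STATE}{COUNTY}{TRACT}{BLOCK}")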
Upvotes: 2
Reputation: 1
You can both write and read text files with any specified string separator, not necessarily a single-character one. This is useful when the data can contain practically any character, so no single character is safe to use as a separator. Here are examples of write and read functions:
writeSepText <- function(df, fileName, separator) {
con <- file(fileName)
data <- apply(df, 1, paste, collapse = separator)
# data
writeLines(data, con)
close(con)
invisible(NULL)  # nothing useful to return; the file is the result
}
writeSepText(df=as.data.frame(Titanic), fileName="/Users/user/break_sep.txt", separator="<break>")
readSepText <- function(fileName, separator) {
data <- readLines(con <- file(fileName))
close(con)
records <- sapply(data, strsplit, split=separator)
dataFrame <- data.frame(t(sapply(records,c)))
rownames(dataFrame) <- 1: nrow(dataFrame)
return(as.data.frame(dataFrame,stringsAsFactors = FALSE))
}
df <- readSepText(fileName="/Users/user/break_sep.txt", separator="<break>"); df
Upvotes: 0
Reputation: 105
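You can use unite() from the tidyr package (part of the tidyverse). Set sep = "", because the default separator is "_":
library(tidyr)
library(dplyr)  # for %>%
DF %>% unite(new_var, STATE, COUNTY, TRACT, BLOCK, sep = "")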
Upvotes: 4
Reputation: 175
You can try this too:
AL_Blocks <- transform(AL_Blocks, BLOCKID = paste(STATE, COUNTY,
                                                  TRACT, BLOCK, sep = ""))
Upvotes: 6
Reputation: 193517
You can use do.call and paste0. Try:
AL_Blocks$BLOCK_ID <- do.call(paste0, AL_Blocks[c("STATE", "COUNTY", "TRACT", "BLOCK")])
Example output:
do.call(paste0, AL_Blocks[c("STATE", "COUNTY", "TRACT", "BLOCK")])
# [1] "010010211001053" "010010211001054" "010010211001055" "010010211001056"
# [5] "010010211001057" "010010211001058"
do.call(paste0, AL_Blocks[2:5])
# [1] "010010211001053" "010010211001054" "010010211001055" "010010211001056"
# [5] "010010211001057" "010010211001058"
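For readers unsure why do.call works here, the following is a minimal sketch, assuming a two-row copy of the question's data (the names `cols`, `via_do_call`, and `explicit` are made up for illustration). A data frame is a list of columns, so do.call hands each selected column to paste0 as a separate argument, which is the same as writing the columns out by hand:

```r
# Two rows copied from the question; all columns are character.
AL_Blocks <- data.frame(
  LOGRECNO = c("60", "61"),
  STATE    = c("01", "01"),
  COUNTY   = c("001", "001"),
  TRACT    = c("021100", "021100"),
  BLOCK    = c("1053", "1054"),
  stringsAsFactors = FALSE
)

cols <- c("STATE", "COUNTY", "TRACT", "BLOCK")

# do.call() passes each selected column to paste0() as its own argument ...
via_do_call <- do.call(paste0, AL_Blocks[cols])

# ... which is equivalent to listing the columns explicitly.
explicit <- paste0(AL_Blocks$STATE, AL_Blocks$COUNTY,
                   AL_Blocks$TRACT, AL_Blocks$BLOCK)

identical(via_do_call, explicit)  # TRUE
```

The do.call form is handy when the column names are held in a variable or the list of columns is long.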
You can also use unite from "tidyr", like this:
library(tidyr)
library(dplyr)
AL_Blocks %>%
unite(BLOCK_ID, STATE, COUNTY, TRACT, BLOCK, sep = "", remove = FALSE)
# LOGRECNO BLOCK_ID STATE COUNTY TRACT BLOCK
# 1 60 010010211001053 01 001 021100 1053
# 2 61 010010211001054 01 001 021100 1054
# 3 62 010010211001055 01 001 021100 1055
# 4 63 010010211001056 01 001 021100 1056
# 5 64 010010211001057 01 001 021100 1057
# 6 65 010010211001058 01 001 021100 1058
where "AL_Blocks" is provided as:
AL_Blocks <- structure(list(LOGRECNO = c("60", "61", "62", "63", "64", "65"),
STATE = c("01", "01", "01", "01", "01", "01"), COUNTY = c("001", "001",
"001", "001", "001", "001"), TRACT = c("021100", "021100", "021100",
"021100", "021100", "021100"), BLOCK = c("1053", "1054", "1055", "1056",
"1057", "1058")), .Names = c("LOGRECNO", "STATE", "COUNTY", "TRACT",
"BLOCK"), class = "data.frame", row.names = c(NA, -6L))
Upvotes: 22
Reputation: 464
Or try this (only the four ID columns are pasted; LOGRECNO is not part of the block ID):
DF$BLOCKID <-
  paste(DF$STATE, DF$COUNTY,
        DF$TRACT, DF$BLOCK, sep = "")
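(Here is one way to set up the data frame, for people coming to this discussion later. The columns are stored as character strings so the leading zeros are kept.)
DF <-
  data.frame(LOGRECNO = c("60", "61", "62", "63", "64", "65"),
             STATE = c("01", "01", "01", "01", "01", "01"),
             COUNTY = c("001", "001", "001", "001", "001", "001"),
             TRACT = c("021100", "021100", "021100", "021100", "021100", "021100"),
             BLOCK = c("1053", "1054", "1055", "1056", "1057", "1058"),
             stringsAsFactors = FALSE)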
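One thing to watch: if the columns hold numbers rather than character strings, the leading zeros are gone and a plain paste gives a shorter ID. The sketch below is one possible workaround, assuming field widths of 2, 3, 6, and 4 digits (taken from the example rows in the question, not from any other source); `num_df`, `short_id`, and `padded_id` are hypothetical names:

```r
# Numeric versions of the ID columns: the leading zeros are lost.
num_df <- data.frame(
  STATE  = c(1, 1),
  COUNTY = c(1, 1),
  TRACT  = c(21100, 21100),
  BLOCK  = c(1053, 1054)
)

# Plain pasting yields IDs that are shorter than the expected 15 characters.
short_id <- with(num_df, paste0(STATE, COUNTY, TRACT, BLOCK))

# sprintf() zero-pads each field to an assumed fixed width (2, 3, 6, 4).
padded_id <- with(num_df, sprintf("%02d%03d%06d%04d",
                                  as.integer(STATE), as.integer(COUNTY),
                                  as.integer(TRACT), as.integer(BLOCK)))
padded_id
# [1] "010010211001053" "010010211001054"
```

Reading the columns as character in the first place avoids the need for padding.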
Upvotes: 5
Reputation: 1538
Try this:
AL_Blocks$BLOCK_ID<- with(AL_Blocks, paste0(STATE, COUNTY, TRACT, BLOCK))
There was a typo in County; it should have been COUNTY. Also, you don't need the collapse parameter: wrapping the columns in c() and collapsing is what merged every row into one string.
I hope that helps.
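To make the difference concrete, here is a small sketch using two rows copied from the question (the variable name `one_string` is made up for illustration). It contrasts the c() plus collapse call with element-wise pasting:

```r
# Two rows from the question, stored as character columns.
AL_Blocks <- data.frame(
  STATE  = c("01", "01"),
  COUNTY = c("001", "001"),
  TRACT  = c("021100", "021100"),
  BLOCK  = c("1053", "1054"),
  stringsAsFactors = FALSE
)

# c() stacks the four columns into one long vector, and collapse = ""
# then merges every element into a single string.
one_string <- paste(c(AL_Blocks$STATE, AL_Blocks$COUNTY,
                      AL_Blocks$TRACT, AL_Blocks$BLOCK), collapse = "")
length(one_string)  # 1

# Passing the columns as separate arguments pastes them row by row.
AL_Blocks$BLOCK_ID <- with(AL_Blocks, paste0(STATE, COUNTY, TRACT, BLOCK))
AL_Blocks$BLOCK_ID
# [1] "010010211001053" "010010211001054"
```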
Upvotes: 30