Reputation: 982
I'm trying to subset a dataframe according to the value of a column which can change name over different versions of the dataframe. The value I want to test for is "----" in a column named either "SIC" or "NAICS".
Version 1:
df
  MSA  SIC EMPFLAG   EMP
1  40 ----         43372
2  40 07--           192
3  40 0700           192
Version 2:
df
  MSA NAICS EMPFLAG   EMP
1  40  ----         78945
2  40  07--           221
3  40  0700           221
The expected result is:
Version 1:
df
  MSA   EMP
1  40 43372
Version 2:
df
  MSA   EMP
1  40 78945
The following code doesn't work:
df <- ifelse("SIC" %in% colnames(df),
df[df$SIC=="----", c("MSA", "EMP")],
df[df$NAICS=="----", c("MSA", "EMP")])
Upvotes: 0
Views: 211
Reputation: 76402
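The problem with your code is the use of the vectorized ifelse when you don't need it. ifelse returns a result with the same length as its test, so a length-one condition cannot give back a whole data frame. A plain if/else returns whichever branch is chosen, unchanged: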
df <- if(any(grepl("SIC", colnames(df)))) {
df[df$SIC=="----", c("MSA", "EMP")]
} else {
df[df$NAICS=="----", c("MSA", "EMP")]
}
df
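To see why ifelse goes wrong here, a minimal sketch on toy data mirroring Version 1 of the question may help. The names bad and good are only illustrative, and the empty EMPFLAG values are an assumption read off the printed table.

```r
# Toy data mirroring Version 1 of the question (EMPFLAG assumed empty)
df <- data.frame(MSA = c(40, 40, 40),
                 SIC = c("----", "07--", "0700"),
                 EMPFLAG = c("", "", ""),
                 EMP = c(43372, 192, 192),
                 stringsAsFactors = FALSE)

# ifelse() shapes its result like the test, which has length 1 here,
# so the selected data frame is cut down to a single element
bad <- ifelse("SIC" %in% colnames(df),
              df[df$SIC == "----", c("MSA", "EMP")],
              df[df$NAICS == "----", c("MSA", "EMP")])

# if/else simply returns the value of the branch that runs
good <- if ("SIC" %in% colnames(df)) {
  df[df$SIC == "----", c("MSA", "EMP")]
} else {
  df[df$NAICS == "----", c("MSA", "EMP")]
}

is.data.frame(bad)   # FALSE
is.data.frame(good)  # TRUE
```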
Note that you can also use %in%, which is probably simpler.
df <- if ("SIC" %in% colnames(df)) {
df[df$SIC=="----", c("MSA", "EMP")]
} else {
df[df$NAICS=="----", c("MSA", "EMP")]
}
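If more versions with other column names turn up, the same idea can be wrapped in a small function. This is only a sketch: subset_total and its arguments are hypothetical names, not anything from the question, and it assumes exactly one of the candidate columns is present.

```r
# Hypothetical helper: filter on whichever industry-code column is present
subset_total <- function(df, candidates = c("SIC", "NAICS"), flag = "----") {
  code_col <- intersect(candidates, colnames(df))
  # Fail loudly if no candidate column, or more than one, is found
  stopifnot(length(code_col) == 1)
  df[df[[code_col]] == flag, c("MSA", "EMP")]
}

# Toy versions of the two data frames from the question
v1 <- data.frame(MSA = 40, SIC = c("----", "07--", "0700"),
                 EMPFLAG = "", EMP = c(43372, 192, 192))
v2 <- data.frame(MSA = 40, NAICS = c("----", "07--", "0700"),
                 EMPFLAG = "", EMP = c(78945, 221, 221))

r1 <- subset_total(v1)
r2 <- subset_total(v2)
```

Using df[[code_col]] instead of df$SIC or df$NAICS lets the column name be a variable, so the if/else disappears.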
Finally, after reading the answer by William Ashford, the following one-liner will do exactly what you've asked. It relies on the fact that the column in question is always the second one.
df <- df[df[, 2] == "----",-which(names(df) %in% c('SIC','NAICS','EMPFLAG'))]
The credits for this go to him.
Upvotes: 1
Reputation: 339
As seen in "How to drop columns by name in a data frame", subset your data frame like so:
df = df[,-which(names(df) %in% c('SIC','NAICS'))]
This was a very easy answer to find, so might I suggest you take a look through SO before posting questions.
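One caveat with the -which(...) idiom: when none of the listed names is present, which() returns an empty vector and the negative index then selects no columns at all. A small sketch of the difference, using a made-up data frame without SIC or NAICS (the variable names are illustrative):

```r
# Toy data frame that has neither SIC nor NAICS
df <- data.frame(MSA = 40, EMP = c(43372, 192, 192))
drop_cols <- c("SIC", "NAICS")

# which() finds nothing, -integer(0) is still integer(0): every column is lost
with_which <- df[, -which(names(df) %in% drop_cols)]

# Logical negation keeps every column that is not in drop_cols
with_logical <- df[, !(names(df) %in% drop_cols), drop = FALSE]

ncol(with_which)    # 0
ncol(with_logical)  # 2
```

The logical form behaves the same as -which() whenever at least one name matches, so it is a safe default.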
Upvotes: 0