Reputation: 45
Totally new to R and I trying to solve this using the dplyr package. I want to filter out and return countries that both have Import and Export values and view them separately. I tried a lot of methods such as select and filter but have been unable to do so.
Country Year Quantity Description Import/Export
A 2001 10 Frozen Export
B 2001 50 Fresh Import
B 2004 20 Frozen Export
C 2003 30 Frozen Import
C 2005 40 Fresh Export
C 2006 60 Frozen Import
D 2007 290 Fresh Import
Ideally, the end result should be this:
Country Year Quantity Description Import/Export
B 2001 50 Fresh Import
B 2004 20 Frozen Export
C 2003 30 Frozen Import
C 2005 40 Fresh Export
C 2006 60 Frozen Import
Upvotes: 2
Views: 119
Reputation: 887901
Using data.table
library(data.table)
setDT(df)[df[, .I[all(c('Import', 'Export') %in% `Import/Export`)], Country]$V1]
Country Year Quantity Description Import/Export
1: B 2001 50 Fresh Import
2: B 2004 20 Frozen Export
3: C 2003 30 Frozen Import
4: C 2005 40 Fresh Export
5: C 2006 60 Frozen Import
Upvotes: 1
Reputation: 9878
We can group_by()
Country, then filter all groups with any
"Import/Export" =='Import' and any
""Import/Export"" == 'Export'
library(dplyr)
df %>% group_by(Country) %>%
filter(any(`Import/Export`=='Import') &
any(`Import/Export`=='Export')) %>%
ungroup()
# A tibble: 5 x 5
Country Year Quantity Description `Import/Export`
<chr> <dbl> <dbl> <chr> <chr>
1 B 2001 50 Fresh Import
2 B 2004 20 Frozen Export
3 C 2003 30 Frozen Import
4 C 2005 40 Fresh Export
5 C 2006 60 Frozen Import
data
structure(list(Country = c("A", "B", "B", "C", "C", "C", "D"),
Year = c(2001, 2001, 2004, 2003, 2005, 2006, 2007), Quantity = c(10,
50, 20, 30, 40, 60, 290), Description = c("Frozen", "Fresh",
"Frozen", "Frozen", "Fresh", "Frozen", "Fresh"), `Import/Export` = c("Export",
"Import", "Export", "Import", "Export", "Import", "Import"
)), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"
))
Upvotes: 3
Reputation: 389275
Using data from @GuedesBF answer here is another dplyr
way to filter groups which has both 'Import'
and 'Export'
.
library(dplyr)
df %>%
group_by(Country) %>%
filter(all(c('Import', 'Export') %in% `Import/Export`)) %>%
ungroup()
# Country Year Quantity Description `Import/Export`
# <chr> <dbl> <dbl> <chr> <chr>
#1 B 2001 50 Fresh Import
#2 B 2004 20 Frozen Export
#3 C 2003 30 Frozen Import
#4 C 2005 40 Fresh Export
#5 C 2006 60 Frozen Import
Upvotes: 3
Reputation: 10637
library(tidyverse)
data <- tribble(
~Country, ~Year, ~Quantity, ~Description, ~`Import/Export`,
"A", 2001, 10, "Frozen", "Export",
"B", 2001, 50, "Fresh", "Import",
"B", 2004, 20, "Frozen", "Export",
"C", 2003, 30, "Frozen", "Import",
"C", 2005, 40, "Fresh", "Export",
"C", 2006, 60, "Frozen", "Import",
"D", 2007, 290, "Fresh", "Import"
)
data
#> # A tibble: 7 x 5
#> Country Year Quantity Description `Import/Export`
#> <chr> <dbl> <dbl> <chr> <chr>
#> 1 A 2001 10 Frozen Export
#> 2 B 2001 50 Fresh Import
#> 3 B 2004 20 Frozen Export
#> 4 C 2003 30 Frozen Import
#> 5 C 2005 40 Fresh Export
#> 6 C 2006 60 Frozen Import
#> 7 D 2007 290 Fresh Import
selected_countries <-
data %>%
mutate(is_there = TRUE) %>%
distinct(Country, `Import/Export`, is_there) %>%
pivot_wider(names_from = "Import/Export", values_from = is_there) %>%
filter(!is.na(Export) & !is.na(Import)) %>%
pull(Country) %>%
unique()
selected_countries
#> [1] "B" "C"
data %>% filter(Country %in% selected_countries)
#> # A tibble: 5 x 5
#> Country Year Quantity Description `Import/Export`
#> <chr> <dbl> <dbl> <chr> <chr>
#> 1 B 2001 50 Fresh Import
#> 2 B 2004 20 Frozen Export
#> 3 C 2003 30 Frozen Import
#> 4 C 2005 40 Fresh Export
#> 5 C 2006 60 Frozen Import
Created on 2021-09-11 by the reprex package (v2.0.1)
Upvotes: 0