Reputation: 11
I have a observation month(ascending order) column as well an account number column. I want to run a unique/distinct in which I want unique account numbers but make sure that the rows R is taking is in ascending order as the observation month.
Observation month A/C num
2010-09 abc1
2010-10 abc1
2010-11 xyz
I want the unique to filter out only the first row as its in ascending order I want to run a unique in my account number column so as I get the first entry of every account
Upvotes: 1
Views: 63
Reputation: 42544
The OP has requested:
I want the unique to filter out only the first row as its in ascending order I want to run a unique in my account number column so as I get the first entry of every account
This can be translated to an aggregation task
Show the first, i.e., minimum, month of observation for each account number
dplyr
library(dplyr) DF %>% group_by(A.C.num) %>% summarise(Observation.month = min(Observation.month))
# A tibble: 2 x 2 A.C.num Observation.month <chr> <chr> 1 abc1 2010-09 2 xyz 2010-11
data.table
library(data.table)
setDT(DF)[, .(Observation.month = min(Observation.month)), by = A.C.num]
A.C.num Observation.month 1: abc1 2010-09 2: xyz 2010-11
library(data.table)
DF <- fread(
" Observation.month A.C.num
2010-09 abc1
2010-10 abc1
2010-11 xyz")
Upvotes: 0
Reputation: 2101
Something like this?
df <- read.table(text = "Observation_month A_C_num
2010-09 abc1
2010-10 abc1
2010-11 xyz", header = T)
df_correct_order <- df[order(df$Observation_month), ]
df_without_duplicates <- df_correct_order[!duplicated(df_correct_order$A_C_num), ]
df_without_duplicates
> df_without_duplicates
Observation_month A_C_num
1 2010-09 abc1
3 2010-11 xyz
Edit:
If you prefer a dplyr
-solution:
library(dplyr)
df %>% arrange(Observation_month) %>% distinct(A_C_num, .keep_all = T)
Upvotes: 1