alisha
alisha

Reputation: 11

How to run a unique in a column making sure it keeps the rows in ascending order

I have a observation month(ascending order) column as well an account number column. I want to run a unique/distinct in which I want unique account numbers but make sure that the rows R is taking is in ascending order as the observation month.

   Observation month A/C num

2010-09                 abc1
2010-10                 abc1
2010-11                  xyz

I want the unique to filter out only the first row as its in ascending order I want to run a unique in my account number column so as I get the first entry of every account

Upvotes: 1

Views: 63

Answers (2)

Uwe
Uwe

Reputation: 42544

The OP has requested:

I want the unique to filter out only the first row as its in ascending order I want to run a unique in my account number column so as I get the first entry of every account

This can be translated to an aggregation task

Show the first, i.e., minimum, month of observation for each account number

dplyr

library(dplyr) DF %>% group_by(A.C.num) %>% summarise(Observation.month = min(Observation.month))

# A tibble: 2 x 2
  A.C.num Observation.month
    <chr>             <chr>
1    abc1           2010-09
2     xyz           2010-11

data.table

library(data.table)
setDT(DF)[, .(Observation.month = min(Observation.month)), by = A.C.num]
   A.C.num Observation.month
1:    abc1           2010-09
2:     xyz           2010-11

Data

library(data.table)
DF <- fread(
  "   Observation.month A.C.num
2010-09                 abc1
2010-10                 abc1
2010-11                  xyz")

Upvotes: 0

Tino
Tino

Reputation: 2101

Something like this?

df <- read.table(text = "Observation_month A_C_num
                 2010-09                 abc1
                 2010-10                 abc1
                 2010-11                  xyz", header = T)

df_correct_order <- df[order(df$Observation_month), ]
df_without_duplicates <- df_correct_order[!duplicated(df_correct_order$A_C_num), ]
df_without_duplicates

> df_without_duplicates
  Observation_month A_C_num
1           2010-09    abc1
3           2010-11     xyz

Edit:

If you prefer a dplyr-solution:

library(dplyr)
df %>% arrange(Observation_month) %>% distinct(A_C_num, .keep_all = T)

Upvotes: 1

Related Questions