Reputation: 401
My updated data set is shown below:
tmp_number_1 tmp_name_1 ID
7990918840 Yvette 33098
7958376552 Mum 33098
7951755055 Dad 33098
7951755055 Dad mob 33098
7581498864 Wynne Lewis 33098
7581498864 Wynne Lewis mob 33098
87128486 James Braithewaite 33098
1869353690 Fleetclaims 33098
447915381850 Kath 33098
919446540717 Sujata Egbert 33098
87124812 Chris Riley 33098
7958376552 Mum Mob 33098
7958376552 Mum Mob new 33098
For every "tmp_number_1" value that is repeated, I want to keep only the last row; values that occur once should be kept as they are.
The output I am looking for is:
tmp_number_1 tmp_name_1 ID
7990918840 Yvette 33098
**7951755055 Dad mob** 33098
**7581498864 Wynne Lewis mob** 33098
87128486 James Braithewaite 33098
1869353690 Fleetclaims 33098
447915381850 Kath 33098
919446540717 Sujata Egbert 33098
87124812 Chris Riley 33098
**7958376552 Mum Mob new** 33098
The rows marked with ** are the last occurrence of a repeated "tmp_number_1".
Upvotes: 1
Views: 68
Reputation: 19960
Wouldn't this be a good place to use setkey and unique? Borrowing df1 from @akrun, with the updated ID column.
EDIT: As per @Arun's suggestion, setkey is not needed here.
library(data.table)
unique(setDT(df1), by="tmp_number_1", fromLast=TRUE)
tmp_number_1 tmp_name_1 ID
1: 7990918840 Yvette 33098
2: 7951755055 Dad mob 33098
3: 7581498864 Wynne Lewis mob 33098
4: 87128486 James Braithewaite 33098
5: 1869353690 Fleetclaims 33098
6: 447915381850 Kath 33098
7: 919446540717 Sujata Egbert 33098
8: 87124812 Chris Riley 33098
9: 7958376552 Mum Mob new 33098
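For comparison, the same "keep the last occurrence" idea can be sketched in base R with duplicated(), which also accepts fromLast = TRUE. This is only an illustration under the assumption that df1 holds the question's data with a constant ID column; the name last_rows is made up for the example.

```r
# Rebuild the question's data (values copied from the post; ID is constant).
df1 <- data.frame(
  tmp_number_1 = c(7990918840, 7958376552, 7951755055, 7951755055,
                   7581498864, 7581498864, 87128486, 1869353690,
                   447915381850, 919446540717, 87124812,
                   7958376552, 7958376552),
  tmp_name_1 = c("Yvette", "Mum", "Dad", "Dad mob", "Wynne Lewis",
                 "Wynne Lewis mob", "James Braithewaite", "Fleetclaims",
                 "Kath", "Sujata Egbert", "Chris Riley", "Mum Mob",
                 "Mum Mob new"),
  ID = 33098,
  stringsAsFactors = FALSE
)

# duplicated(..., fromLast = TRUE) flags every row that is followed by a
# later row with the same key, so negating it keeps each key's last row.
last_rows <- df1[!duplicated(df1$tmp_number_1, fromLast = TRUE), ]
print(last_rows)
```

The surviving rows stay in their original order, so "Mum Mob new" ends up last, as in the desired output.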
Upvotes: 1
Reputation: 887048
You could try
library(dplyr)
df1 %>%
group_by(tmp_number_1) %>%
slice(n())
Or
library(data.table)
setDT(df1)[, .SD[.N], tmp_number_1]
Or
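# .I[.N] is the original row number of the last row in each group, so this
# also works when repeated numbers are not adjacent (e.g. the "Mum" rows)
setDT(df1)[df1[, .I[.N], by = tmp_number_1]$V1]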
df1 <- structure(list(tmp_number_1 = c(7990918840, 7958376552,
7951755055,
7951755055, 7581498864, 7581498864, 87128486, 1869353690,
447915381850,
919446540717, 87124812, 7958376552, 7958376552),
tmp_name_1 = c("Yvette",
"Mum", "Dad", "Dad mob", "Wynne Lewis", "Wynne Lewis mob",
"James Braithewaite",
"Fleetclaims", "Kath", "Sujata Egbert", "Chris Riley", "Mum Mob",
"Mum Mob new")), .Names = c("tmp_number_1", "tmp_name_1"),
class = "data.frame", row.names = c(NA, -13L))
Upvotes: 4
Reputation: 31161
If df is your data.frame, you can try:
library(data.table)
setDT(df)[,tail(tmp_name_1,1),by=tmp_number_1]
# tmp_number_1 V1
#1: 7990918840 Yvette
#2: 7958376552 Mum Mob new
#3: 7951755055 Dad mob
#4: 7581498864 Wynne Lewis mob
#5: 87128486 James Braithewaite
#6: 1869353690 Fleetclaims
#7: 447915381850 Kath
#8: 919446540717 Sujata Egbert
#9: 87124812 Chris Riley
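The call above returns only tmp_name_1 (as V1). If the other columns such as ID are needed too, one possible variation is to take the tail of .SD instead of a single column. A minimal sketch, assuming a small made-up subset of the question's data; the name res is hypothetical:

```r
library(data.table)

# Small subset of the question's data, with non-adjacent repeats.
df <- data.frame(
  tmp_number_1 = c(7958376552, 7951755055, 7951755055, 7958376552),
  tmp_name_1 = c("Mum", "Dad", "Dad mob", "Mum Mob new"),
  ID = 33098,
  stringsAsFactors = FALSE
)

# tail(.SD, 1) keeps the last row of every group with all remaining columns;
# groups appear in order of their first occurrence.
res <- setDT(df)[, tail(.SD, 1), by = tmp_number_1]
print(res)
```

As with the single-column version, the groups come back in order of first appearance rather than in the position of the kept row.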
Upvotes: 3