C# Deedle equivalent to pandas df.drop_duplicates?

Question

In Python pandas, I can easily drop duplicates in a DataFrame with:

df1.drop_duplicates(['Service Date', 'Customer Number'], inplace=True)

Is there anything in C# or Deedle that's this simple and fast? Or do I need to iterate over the entire frame (from a large CSV file) to drop duplicates?

The data I'm working with is imported from a large CSV file with about 40 columns and 12k rows. For each date, there are multiple entries per Customer Number. I need to eliminate the duplicate Customer Number rows, keeping only one row per Customer Number for each date.

Here's some simplified data, with DATE and RECN as the columns to de-duplicate on:

NAME,       TYPE,  DATE,      RECN,  COMM
Kermit,     Frog,  06/30/14,  1,     1test
Kermit,     Frog,  06/30/14,  1,     2test
Ms. Piggy,  Pig,   07/01/14,  2,     1test
Fozzy,      Bear,  06/29/14,  3,     1test
Kermit,     Frog,  07/02/14,  1,     3test
Kermit,     Frog,  07/02/14,  1,     4test
Kermit,     Frog,  07/02/14,  1,     5test
Ms. Piggy,  Pig,   07/02/14,  2,     3test
Fozzy,      Bear,  07/02/14,  3,     2test
Ms. Piggy,  Pig,   07/02/14,  2,     2test
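
To make the expected result concrete, here is a minimal sketch in plain Python (standard library only, no pandas) of what I want to happen to the sample above. It assumes the default pandas behaviour of keeping the first row for each (DATE, RECN) pair. The helper name `drop_duplicates` and the `SAMPLE` string are just for illustration, and the padding spaces are removed from the data so it parses as ordinary CSV.

```python
import csv
import io

# The sample data from above, with the alignment padding removed.
SAMPLE = """NAME,TYPE,DATE,RECN,COMM
Kermit,Frog,06/30/14,1,1test
Kermit,Frog,06/30/14,1,2test
Ms. Piggy,Pig,07/01/14,2,1test
Fozzy,Bear,06/29/14,3,1test
Kermit,Frog,07/02/14,1,3test
Kermit,Frog,07/02/14,1,4test
Kermit,Frog,07/02/14,1,5test
Ms. Piggy,Pig,07/02/14,2,3test
Fozzy,Bear,07/02/14,3,2test
Ms. Piggy,Pig,07/02/14,2,2test
"""


def drop_duplicates(rows, keys):
    """Keep the first row seen for each combination of the key columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
deduped = drop_duplicates(rows, ["DATE", "RECN"])

for row in deduped:
    print(row["NAME"], row["DATE"], row["RECN"], row["COMM"])
```

This is a single pass over the rows with a set of keys already seen, so the 10 sample rows reduce to 6. I'm hoping for something comparably short on a Deedle frame.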
