L-dan
L-dan

Reputation: 19

Adding column indicating the number of times its corresponding value occurs in another column

My data set is attached. I would like to create a new column that has the length of each unique quarter (QUART) identifier. In other words, for each row I want to create a new value that has the number of times its corresponding QUART appears in the data set

So row 1 should have a new column with the value "4" because 1992.2 occurs 4 times.

My data structure looks like"

ID   QUART      Trasaction   New Column (I want)
1     1992.2     Company 1         4
2     1992.2     Company 2         4 
3     1992.2     Company 3         4 
4     1992.2     Company 4         4
5     1992.3     Company 5         1
6     1992.4     Company 6         1
7     1993.1     Company 7         1

Thanks

enter image description here

Upvotes: 0

Views: 39

Answers (2)

MKR
MKR

Reputation: 20085

One option in base-R can be achieved using mapply as:

df$count <- mapply(function(x)sum(df$QUART == x), df$QUART)
#    ID  QUART Trasaction count
# 1  1 1992.2  Company 1     4
# 2  2 1992.2  Company 2     4
# 3  3 1992.2  Company 3     4
# 4  4 1992.2  Company 4     4
# 5  5 1992.3  Company 5     1
# 6  6 1992.4  Company 6     1
# 7  7 1993.1  Company 7     1

Note: Since QUART is of type numeric / double. Hence my suggestion is to instead of comparing with == difference of two values should be compared against double precision limit of hardware. To take care of those situation a solution could be as

mapply(function(x)sum(abs(df$QUART - x) <= 0.000001), df$QUART)

Data

df <- read.table(text = "
ID   QUART      Trasaction   
1     1992.2     'Company 1'   
2     1992.2     'Company 2'   
3     1992.2     'Company 3'   
4     1992.2     'Company 4'   
5     1992.3     'Company 5'   
6     1992.4     'Company 6'   
7     1993.1     'Company 7'",
header = TRUE, stringsAsFactors = FALSE)

Upvotes: 0

Maurits Evers
Maurits Evers

Reputation: 50668

You can use dplyr::group_by with n() to count the number of identical entries per QUART:

library(tidyverse);
df %>%
    group_by(QUART) %>%
    mutate(count = n());
## A tibble: 7 x 4
## Groups:   QUART [4]
#     ID QUART Trasaction count
#  <int> <dbl> <fct>      <int>
#1     1 1992. Company 1      4
#2     2 1992. Company 2      4
#3     3 1992. Company 3      4
#4     4 1992. Company 4      4
#5     5 1992. Company 5      1
#6     6 1992. Company 6      1
#7     7 1993. Company 7      1

Sample data

df <- read.table(text =
    "ID   QUART      Trasaction
1     1992.2     'Company 1'
2     1992.2     'Company 2'
3     1992.2     'Company 3'
4     1992.2     'Company 4'
5     1992.3     'Company 5'
6     1992.4     'Company 6'
7     1993.1     'Company 7'", header = T)

Upvotes: 1

Related Questions