sunitprasad1
sunitprasad1

Reputation: 768

R : Calculate frequency of a transaction and increment for every transaction

I have a dataframe called cust

    > head(cust,15)
    Txn_date  Cust_no   Credit
1  2013-12-02 12345000  400.00
2  2013-12-02 12345000  300.00
3  2013-12-02 12345000  304.71
4  2013-12-02 12345000  475.00
5  2013-12-02 12345000  325.00
6  2013-12-02 34567890  1390.00
7  2013-12-02 34567890  100.00
8  2013-12-02 34567890  500.00
9  2013-12-02 23232323  5.00
10 2013-12-02 23232323  130.00
11 2013-12-02 23232323  5975.00
12 2013-12-02 23232323  3711.00
13 2013-12-02 14345422  12530.50
14 2013-12-02 14345422  3312.00
15 2013-12-02 98765432  370.00

To calculate total amount, based on the Cust_no, I have used the within and then applied cumsum to it, like below

newCust <-within(cust, {
   RunningTotal <- ave(cust$Credit, cust$Cust_no, FUN = cumsum)})

and hence my result is this, as you can see the Credit is getting incremented based on the cust number

>head(newCust,15)
   Txn_date  Cust_no   Credit   RunningTotal    
1  2013-12-02 12345000  400.00   400.00
2  2013-12-02 12345000  300.00   700.00
3  2013-12-02 12345000  304.71   1004.71
4  2013-12-02 12345000  475.00   1479.71
5  2013-12-02 12345000  325.00   1804.71
6  2013-12-02 34567890  1390.00  1390.00
7  2013-12-02 34567890  100.00   1490.00
8  2013-12-02 34567890  500.00   1990.00
9  2013-12-02 23232323  5.00     5.00
10 2013-12-02 23232323  130.00   135.00
11 2013-12-02 23232323  5975.00  6110.00
12 2013-12-02 23232323  3711.00  9821.00
13 2013-12-02 14345422  12530.50 12530.50
14 2013-12-02 14345422  3312.00  15842.5
15 2013-12-02 98765432  370.00   370

Now my question is, how do I find out the number of transactions for every Cust_no, using the above logic of within and length?? or maybe, some other logic.

I am not getting the desired output as shown below even I tried using aggregate() and apply..

   Txn_date  Cust_no   Credit   Frequency   
1  2013-12-02 12345000  400.00   1
2  2013-12-02 12345000  300.00   2
3  2013-12-02 12345000  304.71   3
4  2013-12-02 12345000  475.00   4
5  2013-12-02 12345000  325.00   5
6  2013-12-02 34567890  1390.00  1
7  2013-12-02 34567890  100.00   2
8  2013-12-02 34567890  500.00   3
9  2013-12-02 23232323  5.00     1
10 2013-12-02 23232323  130.00   2
11 2013-12-02 23232323  5975.00  3
12 2013-12-02 23232323  3711.00  4
13 2013-12-02 14345422  12530.50 1
14 2013-12-02 14345422  3312.00  2
15 2013-12-02 98765432  370.00   1

Upvotes: 0

Views: 578

Answers (1)

akrun
akrun

Reputation: 887711

You can try

 within(cust, Frequency <- ave(seq_along(Cust_no), Cust_no, FUN=seq_along))
 #     Txn_date  Cust_no   Credit Frequency
 #1  2013-12-02 12345000   400.00         1
 #2  2013-12-02 12345000   300.00         2
 #3  2013-12-02 12345000   304.71         3
 #4  2013-12-02 12345000   475.00         4
 #5  2013-12-02 12345000   325.00         5
 #6  2013-12-02 34567890  1390.00         1
 #7  2013-12-02 34567890   100.00         2
 #8  2013-12-02 34567890   500.00         3
 #9  2013-12-02 23232323     5.00         1
 #10 2013-12-02 23232323   130.00         2
 #11 2013-12-02 23232323  5975.00         3
 #12 2013-12-02 23232323  3711.00         4
 #13 2013-12-02 14345422 12530.50         1
 #14 2013-12-02 14345422  3312.00         2
 #15 2013-12-02 98765432   370.00         1

Or using splitstackshape

 library(splitstackshape)
 getanID(cust, 'Cust_no')

Or using data.table

 library(data.table)
 setDT(cust)[, Frequency:=1:.N, by=Cust_no]

Upvotes: 2

Related Questions