Reputation: 43
This is my DataFrame:
CustomerID InvoiceNo
0 12346.0 [541431, C541433]
1 12347.0 [537626, 542237, 549222, 556201, 562032, 57351]
2 12348.0 [539318, 541998, 548955, 568172]
3 12349.0 [577609]
4 12350.0 [543037]
Desired Output:
CustomerID InvoiceCount
0 12346.0 2
1 12347.0 6
2 12348.0 4
3 12349.0 1
4 12350.0 1
I want to count the total number of invoices each customer (CustomerID) has. Please help.
Upvotes: 1
Views: 1128
Reputation: 142641
If the column holds real lists, you can do:
df['InvoiceCount'] = df['InvoiceNo'].apply(len)
If the column holds strings that look like lists, you have to convert each string to a real list before counting:
df['InvoiceNo'] = df['InvoiceNo'].apply(eval)
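But this may not work if a value such as C541433 (with the C) is valid, because eval treats it as an undefined variable name and raises NameError. In that case you may need: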
df['InvoiceCount'] = df['InvoiceNo'].apply(lambda x: len(x.split(',')))
or, similar to the example in @Datanovice's comment:
df['InvoiceCount'] = df['InvoiceNo'].str.split(',').str.len()
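One thing to watch with the split-based count: an empty list stored as the string '[]' would be counted as 1, since splitting an empty string still yields one piece. A minimal sketch of a helper that handles this, assuming the cells are either real lists or bracketed strings; the name count_items and the customer 99999.0 with an empty list are made up for illustration:

```python
import pandas as pd

def count_items(cell):
    """Count entries in a cell that is a real list or a string like '[a, b]'."""
    if isinstance(cell, (list, tuple)):
        return len(cell)
    # Remove surrounding whitespace and the enclosing brackets.
    inner = str(cell).strip().strip("[]")
    # Drop empty pieces so that '[]' counts as 0 rather than 1.
    return len([part for part in inner.split(",") if part.strip()])

df = pd.DataFrame({
    "CustomerID": [12346.0, 12349.0, 99999.0],  # 99999.0 is a hypothetical row
    "InvoiceNo": ["[541431, C541433]", "[577609]", "[]"],
})
df["InvoiceCount"] = df["InvoiceNo"].apply(count_items)
print(df["InvoiceCount"].tolist())  # [2, 1, 0]
```

Because nothing is evaluated, values such as C541433 are counted like any other entry.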
Minimal working example
import pandas as pd
import io
text = '''CustomerID;InvoiceNo
12346.0;[541431, 541433]
12347.0;[537626, 542237, 549222, 556201, 562032, 57351]
12348.0;[539318, 541998, 548955, 568172]
12349.0;[577609]
12350.0;[543037]'''
df = pd.read_csv(io.StringIO(text), sep=';')
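# Parse each string with eval, then count the items
print(df['InvoiceNo'].apply(lambda x: len(eval(x))))
print(df['InvoiceNo'].apply(eval).apply(len))
# Count the comma-separated pieces without parsing the string
print(df['InvoiceNo'].apply(lambda x: len(x.split(','))))
print(df['InvoiceNo'].str.split(',').str.len())
# Convert the column to real lists once, then count
df['InvoiceNo'] = df['InvoiceNo'].apply(eval)
print(df['InvoiceNo'].apply(len))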
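If the strings contain only plain literals, ast.literal_eval from the standard library could be swapped in for eval, since it parses literals without executing arbitrary code. This is a sketch of that substitution, not part of the example above; like eval, it cannot parse an unquoted value such as C541433:

```python
import ast
import pandas as pd

s = pd.Series(["[541431, 541433]", "[577609]"])

# Parse each string into a real list, then count its items.
counts = s.apply(ast.literal_eval).apply(len).tolist()
print(counts)  # [2, 1]

# An unquoted identifier is not a literal, so parsing is rejected.
try:
    ast.literal_eval("[541431, C541433]")
    parsed_ok = True
except (ValueError, SyntaxError) as exc:
    parsed_ok = False
    print("cannot parse:", type(exc).__name__)
```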
Upvotes: 0
Reputation: 1260
See if this works:
df["InvoiceCount"] = df['InvoiceNo'].str.len()
Upvotes: 4
Reputation: 74
If that's in a list, you can use the built-in function len.
So let's say the list is in the variable values:
values = [537626, 542237, 549222, 556201, 562032, 57351]
Then the number of items is:
len(values)  # returns 6 in this example
Upvotes: -1