Reputation: 159
I want to count the occurrences of the pipe symbol (|) in each value of a data frame column and, if the count equals 5, append another pipe (|) to the existing value.
df2['smartexpenseid']
0 878497|253919815?HOTEL?141791520780|||305117||
1 362593||||35068||
2 |231931871509?CARRT?231940968972||||177849|
3 955304|248973233?HOTEL?154687992630||||93191|
4 27984||||5883|3242|
5 3579321|253872763?HOTEL?128891721799|92832814|||
6 127299|248541768?HOTEL?270593355555|||||
7 |231931871509?CARRT?231940968972||||177849|
8 831665||||80658||
9 |247132692?HOTEL?141790728905||||6249|
For example, in row 5 the (|) count is 5, so another (|) should be appended to the existing value. The other rows already have a count of 6, so they should be left as they are. Can somebody help me with this?
I tried this:
if df2['smartexpenseid'].str.count('\|')==5:
    df2['smartexpenseid'].append('\|')
This throws an error saying "The truth value of a Series is ambiguous".
I also tried:
a = df2['smartexpenseid'].str.count('\|')
if 5 in a:
    a.index(5)
Upvotes: 1
Views: 1537
Reputation: 19104
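So you have the vectorized str methods down. Now you need to append an extra '|' character conditionally: build a boolean mask of the rows whose count is 5 and assign through .loc so that only those rows change. See the pandas documentation section on masking (boolean indexing) for more info.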
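# Boolean mask: True for rows containing exactly five pipes (raw string avoids an invalid-escape warning)
m = df2['smartexpenseid'].str.count(r'\|') == 5
# Append '|' only to the masked rows
df2.loc[m, 'smartexpenseid'] = df2['smartexpenseid'][m].values + '|'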
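For context on the error in the question: `df2['smartexpenseid'].str.count(...) == 5` yields a whole Series of booleans, one per row, and a plain `if` cannot reduce that to a single True/False, hence the "truth value of a Series is ambiguous" message. A minimal, self-contained sketch of the masking approach follows. The small `df2` here is an assumption, rebuilt from three of the rows shown in the question purely for illustration, and the name `mask` is likewise illustrative.

```python
import pandas as pd

# Hypothetical sample frame, using three of the values from the question.
df2 = pd.DataFrame({
    "smartexpenseid": [
        "878497|253919815?HOTEL?141791520780|||305117||",    # 6 pipes
        "3579321|253872763?HOTEL?128891721799|92832814|||",  # 5 pipes
        "831665||||80658||",                                 # 6 pipes
    ]
})

# str.count takes a regex, so the pipe must be escaped; this gives one bool per row.
mask = df2["smartexpenseid"].str.count(r"\|") == 5

# Append a pipe only where the mask is True; the other rows are untouched.
df2.loc[mask, "smartexpenseid"] = df2.loc[mask, "smartexpenseid"] + "|"

# Every row now has six pipes.
print(df2["smartexpenseid"].str.count(r"\|").tolist())  # [6, 6, 6]
```

Assigning through `.loc` with the mask on both sides keeps the operation vectorized and avoids looping over rows.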
Upvotes: 3