Reputation: 305
I have an example of a df that looks like this:
Group 0 1 Total
0 1 Text 5 5
1 1 Text2 5 5
2 1 SUM 10 10
3 2 10% 10 10
4 2 100% 100 100
5 2 SUM 110 110
6 3 Text3 4 4
7 3 SUM 4 4
Each row is divided into groups - in this case groups from 1 to 3 - the number of groups is different, but this does not matter.
I would like all elements in df[0]
being number and %
sign (e.g. 10% or 100%) to change their value in Total
column to negative values
An example of such a solution:
Group 0 1 Total
0 1 Text 5 5
1 1 Text2 5 5
2 1 SUM 10 10
3 2 10% 10 -10 <--- change the value here
4 2 100% 100 -100 <--- change the value here
5 2 SUM 110 110
6 3 Text3 4 4
7 3 SUM 4 4
as far as I can pull this off, I don't know how to change these values. I've tried that way:
df[df[0].str.contains('\d%', regex= True, na=False)]
This way I have all the rows in which the value of df[0]
have these requirements - the problem is that I don't know how to change their value in the Total
column to negative. I tried that, but it doesn't help and there are errors:
df['Total'] = np.where(df[df[0].str.contains('\d%', regex= True, na=False)], df['Total'] *= -1, df['Total'])
In addition, I would like to change the value df[0] == SUM
to a negative value but only when all the elements in the same group as SUM were values that met the previous requirements.
Example:
Group 0 1 Total
0 1 Text 5 5
1 1 Text2 5 5
2 1 SUM 10 10
3 2 10% 10 -10
4 2 100% 100 -100
5 2 SUM 110 -110 <--- here should be a negative value - they are in the same group
6 3 Text3 4 4
7 3 SUM 4 4
if in a given group (in this case group 2) there are values which for df[0]
contain only a number and a sign of percentage, then their values for the Total
column should be negative. Additionally, if all rows in the group meet this condition (these are numbers with a percentage sign) then for df[0] == SUM
the value in Total
column should also change the sign.
How to connect it to such a regex? How to make SUM
dependence in a given group on previous values in same group?
Upvotes: 1
Views: 2651
Reputation: 8302
try this,
df.loc[df['0'].str.contains("%"), "Total"] *= -1
df.loc[df.groupby("Group")['Total']
.transform(lambda x : x * -1 if any(x < 0) else x) < 0, "Total"] *= -1
output,
idx Group 0 1 Total
0 0 1 Text 5 5
1 1 1 Text2 5 5
2 2 1 SUM 10 10
3 3 2 10% 10 -10
4 4 2 100% 100 -100
5 5 2 SUM 110 -110
6 6 3 Text3 4 4
7 7 3 SUM 4 4
Upvotes: 1
Reputation: 862611
Use:
#test number with %
m = df[0].str.contains('\d%', regex= True, na=False)
#multiple if matched condition
df.loc[m, 'Total'] *= -1
#test if all values without last are matched m condition
m1 = m.groupby(df['Group']).transform(lambda x: x[:-1].all())
#multiple last value of group if matched m1 and not matched m
df.loc[m1 & ~m, 'Total'] *= -1
print (df)
Group 0 1 Total
0 1 Text 5 5
1 1 Text2 5 5
2 1 SUM 10 10
3 2 10% 10 -10
4 2 100% 100 -100
5 2 SUM 110 -110
6 3 Text3 4 4
7 3 SUM 4 4
Upvotes: 4