user15649753
user15649753

Reputation: 523

how I can count signs columns?

How I can count the number of repetitive positive or negative elements in each row?

Suppose I have the following data:

ski      2020    2021      2022     2023       2024      2025
book      1.2     5.6       8.4      -2         -5         6
jar       4.2      -5        -8      2          4           6
kook       -4      -5.2      -2.3    -5.6        -7        8

The output is a list for each row that counts the number of similar signs. For example in the first row we have 3 positive elements and then 2 negative and again one positive. So the output is [3,-2,1]. and for 2 other rows the output is as follows:

 jar   [1,-2,3]
 kook   [-5,1]

Upvotes: 2

Views: 139

Answers (5)

G.G
G.G

Reputation: 765

df1.set_index('ski').stack().reset_index()\
    .assign(col1=lambda dd:np.sign(dd.iloc[:,2]))\
    .assign(col2=lambda dd2:(dd2.col1.diff()!=0).cumsum())\
    .groupby(['ski','col2'],as_index=False).col1.sum()\
    .groupby(['ski'],as_index=False).col1.agg(list).pipe(print)

  ski              col1
0  book  [3.0, -2.0, 1.0]
1   jar  [1.0, -2.0, 3.0]
2  kook       [-5.0, 1.0]

Upvotes: 0

jezrael
jezrael

Reputation: 862731

Use DataFrame.clip with custom lambda function for count consecutive values:

#if necessary
df = df.set_index('ski')
print (df)
      2020  2021  2022  2023  2024  2025
ski                                     
book   1.2   5.6   8.4  -2.0    -5     6
jar    4.2  -5.0  -8.0   2.0     4     6
kook  -4.0  -5.2  -2.3  -5.6    -7     8


from  itertools import groupby

f = lambda x: [ int(sum(key for _ in group)) for key, group in groupby( x )]
s = df.clip(upper=1, lower=-1).apply(f, 1)
print (s)
ski
book    [3, -2, 1]
jar     [1, -2, 3]
kook       [-5, 1]
dtype: object

Upvotes: 2

Valdi_Bo
Valdi_Bo

Reputation: 30991

I assume that ski is the index column. If not, set it as the index, dropping the current one.

Start from defining a function, to be applied to each row:

def myCounts(row):
    sgn = row.ge(0)
    return sgn.groupby(sgn.ne(sgn.shift()).cumsum()).apply(
        lambda grp: grp.count() * (1 if grp.iloc[0] else -1)).tolist()

Then apply it:

result = df.apply(myCounts, axis=1)

For your source data, I got:

ski
book    [3, -2, 1]
jar     [1, -2, 3]
kook       [-5, 1]
dtype: object

My solution is significantly shorter than the other.

Upvotes: 1

keramat
keramat

Reputation: 4543

Use:

import pandas as pd
import numpy as np
data = '''ski      2020    2021      2022     2023       2024      2025
book      1.2     5.6       8.4      -2         -5         6
jar       4.2      -5        -8      2          4           6
kook       -4      -5.2      -2.3    -5.6        -7        8'''
data = np.array([x.split() for x in data.split('\n')])

import seaborn as sns

df = pd.DataFrame(data[1:,1:], columns = data[0,1:], index = data[1:,0])


output = []
import math
for i, row in df.iterrows():
    out = []
    c=0
    prev = math.copysign(1,float(row[0]))
    temp = row.append(pd.Series(-math.copysign(1,float(row[-1]))))
    for cell in temp:
        
        currrent_sign = math.copysign(1,float(cell))
        #print(prev, currrent_sign, c)
        if currrent_sign==prev:
            c+=currrent_sign
        else:
            prev = currrent_sign
            out.append(c)
            c=currrent_sign
    output.append(out)

Output:

[[3.0, -2.0, 1.0], [1.0, -2.0, 3.0], [-5.0, 1.0]]

Upvotes: 1

Shubham Sharma
Shubham Sharma

Reputation: 71689

Let us try:

s = np.sign(df.set_index('ski').stack())
s.groupby([pd.Grouper(level=0), s.diff().ne(0).cumsum()]).sum().groupby(level=0).agg(list)

ski
book    [3.0, -2.0, 1.0]
jar     [1.0, -2.0, 3.0]
kook         [-5.0, 1.0]
dtype: object

Upvotes: 1

Related Questions