Build a Dict of Counts based on Two Dataframe Columns

Question

I have a dataframe that looks like this:

    start   stop
0   1       2
1   3       4
2   2       1
3   4       3

I'm trying to build a dictionary with key= (start, stop) pairs from my list of tuples and the value= count of their occurrence, regardless of the order. In other words, (1,2) and (2,1) would both count as an occurrence of the pair (1,2) in the list of tuples.

Desired output: dict_count= {('1','2'):2, ('3','4'):2}

Here's my attempt:

my_list=[('1','2'),('3','4')]

for pair in my_list:
    count=0
    if ((df[df['start']]==pair[0] and df[df['end']]==pair[1]) or (df[df['start']]==pair[1]) and df[df['end']]==pair[0])::
        count+=1
    dict_count[pair]=count

However, this gives me a KeyError: KeyError: "['1' ...] not in index"

BENY · Accepted Answer

Using values + sort then we do groupby

df.values.sort()
df
  start stop
0   '1'  '2'
1   '3'  '4'
2   '1'  '2'
3   '3'  '4'
df.groupby(df.columns.tolist()).size()
start  stop
'1'    '2'     2
'3'    '4'     2
dtype: int64

If you need dict

df.groupby(df.columns.tolist()).size().to_dict()
{("'1'", "'2'"): 2, ("'3'", "'4'"): 2}

Update

df['orther']=1
df[['start','stop']]=np.sort(df[['start','stop']].values)
df.groupby(['start','stop']).size().to_dict()
{("'1'", "'2'"): 2, ("'3'", "'4'"): 2}

Build a Dict of Counts based on Two Dataframe Columns

Answers (2)

Related Questions