John Laudun
John Laudun

Reputation: 407

Find columns with only one non-zero value in pandas

Let me begin by noting that this question is very close to this question about getting non-zero values for each column in a pandas dataframe, but in addition to getting the values, I would also like to know the row from which it was drawn. (And, ultimately, I would like to be able to re-use the code to find columns in which a non-zero value occurred some x amount of times.)

What I have is a dataframe with a count of words for a given year of documents:

|Year / Term | word1 | word2 | word3 | ... | wordn |
|------------|-------|-------|-------|-----|-------|
| 2001       |  23   |   0   |   0   |     |   0   |
| 2002       |   0   |   0   |  12   |     |   0   |
| 2003       |   0   |  42   |  34   |     |   0   |
| year(n)    |   0   |   0   |   0   |     |  45   |

So for word1 I would like to get both 23 and 2001 -- this could be as a tuple or as a dictionary. (It doesn't matter so long as I can work through the data.) And ultimately, I would like very much to be able to discover that word3 enjoyed a two-year span of usage.

FTR, the dataframe has only 16 rows but it has a lot, a lot of columns. If there's an answer to this questions already available, revealing the weakness of my search fu, I will take the scorn as my just due.

Upvotes: 0

Views: 379

Answers (1)

BENY
BENY

Reputation: 323226

In you case melt then groupby

df.melt('Year / Term').loc[lambda x : x['value']!=0].groupby('variable')['value'].apply(tupl)

Upvotes: 2

Related Questions