Reputation: 407
Let me begin by noting that this is very close to an existing question about getting the non-zero values for each column in a pandas dataframe, but in addition to getting the values, I would also like to know the row from which each value was drawn. (And, ultimately, I would like to be able to re-use the code to find columns in which a non-zero value occurred some x number of times.)
What I have is a dataframe with a count of words for a given year of documents:
|Year / Term | word1 | word2 | word3 | ... | wordn |
|------------|-------|-------|-------|-----|-------|
| 2001 | 23 | 0 | 0 | | 0 |
| 2002 | 0 | 0 | 12 | | 0 |
| 2003 | 0 | 42 | 34 | | 0 |
| year(n) | 0 | 0 | 0 | | 45 |
So for `word1` I would like to get both 23 and 2001, whether as a tuple or as a dictionary. (It doesn't matter so long as I can work through the data.) And ultimately, I would like very much to be able to discover that `word3` enjoyed a two-year span of usage.
FTR, the dataframe has only 16 rows but it has a lot, a lot of columns. If there's an answer to this question already available, revealing the weakness of my search fu, I will take the scorn as my just due.
Upvotes: 0
Views: 379
Reputation: 323226
In your case, melt, then groupby:
df.melt('Year / Term').loc[lambda x: x['value'] != 0].groupby('variable')['value'].apply(tuple)
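The one-liner above collects only the counts. If you also want the year each count came from, plus the number of years in which each word appears, something along these lines may work. This is a minimal sketch, not a tested solution for your real data: the sample frame is made up from the first three rows of the table in the question, and the names `long_df`, `nonzero`, `pairs`, `years_used`, and `frequent` are just illustrative.

```python
import pandas as pd

# Hypothetical sample data mirroring the first rows of the table in the question
df = pd.DataFrame({
    "Year / Term": [2001, 2002, 2003],
    "word1": [23, 0, 0],
    "word2": [0, 0, 42],
    "word3": [0, 12, 34],
})

# Reshape to long format: one row per (year, word) pair
long_df = df.melt("Year / Term", var_name="word", value_name="count")

# Keep only the non-zero counts
nonzero = long_df[long_df["count"] != 0]

# Map each word to a list of (year, count) tuples
pairs = {
    word: list(zip(grp["Year / Term"].tolist(), grp["count"].tolist()))
    for word, grp in nonzero.groupby("word")
}

# Number of years in which each word has a non-zero count
years_used = nonzero.groupby("word").size()

# Words with a non-zero count in at least x years (x is an assumed threshold)
x = 2
frequent = years_used[years_used >= x].index.tolist()
```

Here `pairs["word1"]` holds the single pair for 2001, and `frequent` lists the words that show up in at least `x` years. Note that this counts years with a non-zero value; it does not check that those years are consecutive.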
Upvotes: 2