Reputation: 2797
I have a combination of a for loop and a while loop that checks value differences between person-year observations. The whole thing produces a sequence of booleans, which I need as a list for further analysis.
I tried several versions of append, but none of them worked.
Here is my data:
import pandas as pd
df = pd.DataFrame({'year': ['2001', '2004', '2005', '2006', '2007', '2008', '2009',
'2003', '2004', '2005', '2006', '2007', '2008', '2009',
'2003', '2004', '2005', '2006', '2007', '2008', '2009'],
'id': ['1', '1', '1', '1', '1', '1', '1',
'2', '2', '2', '2', '2', '2', '2',
'5', '5', '5','5', '5', '5', '5'],
'money': ['15', '15', '15', '21', '21', '21', '21',
'17', '17', '17', '20', '17', '17', '17',
'25', '30', '22', '25', '8', '7', '12']}).astype(int)
Here is my code:
# for every person
for i in df.id.unique():
    # find the first and last index value
    first = df[df['id']==i].index.values.astype(int)[0]
    last = df[df['id']==i].index.values.astype(int)[-1]
    # first element has to be kept
    print(False)
    # for all elements, compare values next to each other
    while first < last:
        abs_diff = abs( df['money'][first] - df['money'][first+1] ) > 0
        # print TRUE, when adjacent values differ
        print(abs_diff)
        # update the counter
        first +=1
It prints the following booleans, one per line: False, False, False, True, False, False, False, False, False, False, True, True, False, False, False, True, True, True, True, True, True
Question: How can I save that loop output in a list?
Upvotes: 0
Views: 143
Reputation: 18647
If I understand correctly, use groupby, diff, fillna and ne:
df.groupby('id')['money'].diff().fillna(0).ne(0).to_list()
[out]
[False,
False,
False,
True,
False,
False,
False,
False,
False,
False,
True,
True,
False,
False,
False,
True,
True,
True,
True,
True,
True]
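To see why the one-liner reproduces the loop, it may help to split the chain into steps. The following is only a sketch on a shortened, made-up sample (two ids, three rows each); the names step and changed are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

# Shortened, hypothetical sample: two persons with three observations each.
df = pd.DataFrame({
    'id':    [1, 1, 1, 2, 2, 2],
    'money': [15, 15, 21, 17, 20, 17],
})

# Difference to the previous row within each id.
# The first row of every id has no predecessor, so it becomes NaN.
step = df.groupby('id')['money'].diff()

# Treat the first row of each id as "no change".
step = step.fillna(0)

# True wherever the difference is non-zero, then convert to a plain list.
changed = step.ne(0).to_list()
print(changed)
```

Because diff is computed per group, the first row of a new id is never compared with the last row of the previous id, which is what the print(False) at the start of each loop iteration achieves.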
Upvotes: 2
Reputation: 443
from collections import defaultdict

# maps each id to its own list of booleans
result = defaultdict(list)
for i in df.id.unique():
    # find the first and last index value
    first = df[df['id']==i].index.values.astype(int)[0]
    last = df[df['id']==i].index.values.astype(int)[-1]
    # first element has to be kept
    print(False)
    result[i].append(False)
    # for all elements, compare values next to each other
    while first < last:
        abs_diff = abs( df['money'][first] - df['money'][first+1] ) > 0
        # print and store True when adjacent values differ
        print(abs_diff)
        result[i].append(abs_diff)
        # update the counter
        first +=1
This should work if you want to store the results for each person separately: result maps every id to its own list of booleans.
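If a per-person mapping is the goal, the same structure can probably be built without the explicit loop. This is a sketch under assumptions: it uses a shortened, made-up sample, and the names flags and per_id are hypothetical.

```python
import pandas as pd

# Shortened, hypothetical sample: two persons with three observations each.
df = pd.DataFrame({
    'id':    [1, 1, 1, 2, 2, 2],
    'money': [15, 15, 21, 17, 20, 17],
})

# One boolean per row: True when money differs from the previous row of the same id.
flags = df.groupby('id')['money'].diff().fillna(0).ne(0)

# Group the flags by id again and collect one list per person.
per_id = {int(key): grp.to_list() for key, grp in flags.groupby(df['id'])}
print(per_id)
```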
Upvotes: 1
Reputation: 8917
If I understand your question correctly, you just need to define the variable outside of the for loop:
output = list()
for i in df.id.unique():
    # find the first and last index value
    first = df[df['id']==i].index.values.astype(int)[0]
    last = df[df['id']==i].index.values.astype(int)[-1]
    # first element has to be kept
    output.append(False)
    # for all elements, compare values next to each other
    while first < last:
        abs_diff = abs( df['money'][first] - df['money'][first+1] ) > 0
        # store True when adjacent values differ
        output.append(abs_diff)
        # update the counter
        first +=1
print(output)
At the moment you're creating the list inside the for loop, so it is reset on every iteration and you only end up with the output for the last id.
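The difference between the two placements can be shown without pandas. This is a minimal sketch on made-up values; groups, inside and outside are hypothetical names used only for this illustration.

```python
# Hypothetical sample: money values per id.
groups = {1: [15, 15, 21], 2: [17, 20, 17]}

# List created inside the loop: it is rebuilt for every id,
# so only the flags of the last id survive.
for key, values in groups.items():
    inside = [False]
    for prev, cur in zip(values, values[1:]):
        inside.append(prev != cur)

# List created once, before the loop: the flags of all ids accumulate.
outside = []
for key, values in groups.items():
    outside.append(False)
    for prev, cur in zip(values, values[1:]):
        outside.append(prev != cur)

print(inside)
print(outside)
```

Here inside ends up holding three values (those of id 2 only), while outside holds all six.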
Upvotes: 1
Reputation: 794
Try this:
import pandas as pd
df = pd.DataFrame({'year': ['2001', '2004', '2005', '2006', '2007', '2008', '2009',
'2003', '2004', '2005', '2006', '2007', '2008', '2009',
'2003', '2004', '2005', '2006', '2007', '2008', '2009'],
'id': ['1', '1', '1', '1', '1', '1', '1',
'2', '2', '2', '2', '2', '2', '2',
'5', '5', '5','5', '5', '5', '5'],
'money': ['15', '15', '15', '21', '21', '21', '21',
'17', '17', '17', '20', '17', '17', '17',
'25', '30', '22', '25', '8', '7', '12']}).astype(int)
# collect the results for all persons in one list
l=list()
# for every person
for i in df.id.unique():
    # find the first and last index value
    first = df[df['id']==i].index.values.astype(int)[0]
    last = df[df['id']==i].index.values.astype(int)[-1]
    # first element has to be kept
    print(False)
    l.append(False)
    # for all elements, compare values next to each other
    while first < last:
        abs_diff = abs( df['money'][first] - df['money'][first+1] ) > 0
        # store True when adjacent values differ
        l.append(abs_diff)
        # update the counter
        first +=1
print(l)
output:
[False, False, False, True, False, False, False, False, False, False, True, True, False, False, False, True, True, True, True, True, True]
Upvotes: 1