Reputation: 1147
I need to fill a pandas dataframe in a list comprehension.
Although rows satisfying the criterias are appended to the dataframe.
However, at the end, dataframe is empty.
Is there a way to resolve this?
In real code, I'm doing many other calculations. This is a simplified code to regenerate it.
import pandas as pd
main_df = pd.DataFrame(columns=['a','b','c','d'])
main_df=main_df.append({'a':'a1', 'b':'b1','c':'c1', 'd':'d1'},ignore_index=True)
main_df=main_df.append({'a':'a2', 'b':'b2','c':'c2', 'd':'d2'},ignore_index=True)
main_df=main_df.append({'a':'a3', 'b':'b3','c':'c3', 'd':'d3'},ignore_index=True)
main_df=main_df.append({'a':'a4', 'b':'b4','c':'c4', 'd':'d4'},ignore_index=True)
print(main_df)
sub_df = pd.DataFrame()
df_columns = main_df.columns.values
def search_using_list_comprehension(row,sub_df,df_columns):
if row[0]=='a1' or row[0]=='a2':
dict= {a:b for a,b in zip(df_columns,row)}
print('dict: ', dict)
sub_df=sub_df.append(dict, ignore_index=True)
print('sub_df.shape: ', sub_df.shape)
[search_using_list_comprehension(row,sub_df,df_columns) for row in main_df.values]
print(sub_df)
print(sub_df.shape)
Upvotes: 0
Views: 380
Reputation: 1875
The problem is that you define an empty frame with sub_df = dp.DataFrame()
then you assign the same variable within the function parameters and within the list comprehension you provide always the same, empty sub_df as parameter (which is always empty). The one you append to within the function is local to the function only. Another “issue” is using python’s dict
variable as user defined. Don’t do this.
Here is what can be changed in your code in order to work, but I would strongly advice against it
import pandas as pd
df_columns = main_df.columns.values
sub_df = pd.DataFrame(columns=df_columns)
def search_using_list_comprehension(row):
global sub_df
if row[0]=='a1' or row[0]=='a2':
my_dict= {a:b for a,b in zip(df_columns,row)}
print('dict: ', my_dict)
sub_df = sub_df.append(my_dict, ignore_index=True)
print('sub_df.shape: ', sub_df)
[search_using_list_comprehension(row) for row in main_df.values]
print(sub_df)
print(sub_df.shape)
Upvotes: 2