Reputation: 704
When I'm building out dataframes inside a loop, I often find myself using this convention:
complete_df = None
for data_chunk in data_chunks:
    partial_df = get_partial_df(data_chunk)
    partial_df = do_some_stuff_to_my_df(partial_df)
    if complete_df is None:
        complete_df = partial_df
    else:
        complete_df = complete_df.append(partial_df)
I'm looking for a better, shorter, more Pythonic way to do this. A ternary expression doesn't seem like it would be an improvement.
Upvotes: 1
Views: 94
Reputation: 1371
You can do away with the if/else block if you initialize complete_df to an empty DataFrame, like this:
import pandas as pd

complete_df = pd.DataFrame()
for data_chunk in data_chunks:
    partial_df = get_partial_df(data_chunk)
    partial_df = do_some_stuff_to_my_df(partial_df)
    complete_df = complete_df.append(partial_df)
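One caveat: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent install the loop above raises an AttributeError. A minimal sketch of the same empty-DataFrame idea using pd.concat follows. The bodies of get_partial_df and do_some_stuff_to_my_df, the sample data_chunks, and the column names are hypothetical stand-ins, since the question does not show them.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data and helpers.
data_chunks = [[1, 2], [3, 4], [5]]

def get_partial_df(data_chunk):
    return pd.DataFrame({"value": data_chunk})

def do_some_stuff_to_my_df(df):
    return df.assign(doubled=df["value"] * 2)

# Start from an empty DataFrame so no None check is needed.
complete_df = pd.DataFrame()
for data_chunk in data_chunks:
    partial_df = get_partial_df(data_chunk)
    partial_df = do_some_stuff_to_my_df(partial_df)
    # pd.concat returns a new frame, much like append used to.
    complete_df = pd.concat([complete_df, partial_df], ignore_index=True)
```

Here ignore_index=True renumbers the rows from 0; drop it if the original index of each partial frame should be kept.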
Upvotes: 1
Reputation: 1
Try this:
complete_df = None
for data_chunk in data_chunks:
    partial_df = get_partial_df(data_chunk)
    complete_df = partial_df if complete_df is None else complete_df.append(partial_df)
Upvotes: 0
Reputation: 95
data_chunks = range(1, 100, 4)

def get_partial_df(num):
    return num

# complete_df = None
complete_df = list()
print(type(complete_df))
for data_chunk in data_chunks:
    partial_df = get_partial_df(data_chunk)
    complete_df.append(partial_df)
    # if complete_df is None:
    #     complete_df = partial_df  # here complete_df would be an int
    # else:
    #     complete_df = complete_df.append(partial_df)  # append must be called on a list, not an int
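The list-accumulation idea above carries over to real DataFrames: collect the partial frames in a plain list and combine them once at the end with pd.concat. This is only a sketch. The helper bodies, the sample data_chunks, and the column names are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data and helpers.
data_chunks = [[1, 2], [3, 4], [5]]

def get_partial_df(data_chunk):
    return pd.DataFrame({"value": data_chunk})

def do_some_stuff_to_my_df(df):
    return df.assign(doubled=df["value"] * 2)

# Build every partial frame first; list.append is cheap and mutates in place.
partial_dfs = [
    do_some_stuff_to_my_df(get_partial_df(data_chunk))
    for data_chunk in data_chunks
]

# Concatenate once, instead of copying the growing frame on every iteration.
complete_df = pd.concat(partial_dfs, ignore_index=True)
```

This removes both the None check and the repeated copying, at the cost of holding all partial frames in memory until the final concat.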
Upvotes: 0