Stacey

Reputation: 5097

Append a row to a dataframe

I am fairly new to pandas and have created a DataFrame called rollParametersDf:

 rollParametersDf = pd.DataFrame(columns=['insampleStart','insampleEnd','outsampleStart','outsampleEnd'], index=[])

It has the four column headings shown and should hold the reference dates for a study I am running. I want to add rows one at a time, with the index names roll1, roll2, ..., rolln, using values created by the following code:

            outsampleEnd = customCalender.iloc[[totalDaysAvailable]]
            outsampleStart = customCalender.iloc[[totalDaysAvailable-outsampleLength+1]]
            insampleEnd = customCalender.iloc[[totalDaysAvailable-outsampleLength]]
            insampleStart = customCalender.iloc[[totalDaysAvailable-outsampleLength-insampleLength+1]]

            print('roll',rollCount,'\t',outsampleEnd,'\t',outsampleStart,'\t',insampleEnd,'\t',insampleStart,'\t')

            rollParametersDf.append({insampleStart,insampleEnd,outsampleStart,outsampleEnd})

I have tried using append but cannot get an individual row to append.

I would like the final dataframe to look like:

       insampleStart  insampleEnd  outsampleStart  outsampleEnd
roll1              1            5               6             8
roll2              2            6               7             9
:
rolln

Upvotes: 2

Views: 9403

Answers (2)

kilojoules

Reputation: 10083

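Pass append a dict of column-name/value pairs, with one scalar value per column, and repeat the call for each row:

import pandas as pd
df = pd.DataFrame({'insampleStart':[], 'insampleEnd':[], 'outsampleStart':[], 'outsampleEnd':[]})
# one scalar per column for each row; a list would be stored as a single cell (DataFrame.append requires pandas < 2.0)
df = df.append({'insampleStart':1, 'insampleEnd':5, 'outsampleStart':6, 'outsampleEnd':8}, ignore_index=True)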
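For the labelled rows the question asks for (roll1, roll2, ...), label-based assignment with .loc may be a simpler fit, and it also works on pandas 2.0 and later, where DataFrame.append has been removed. This is only a minimal sketch: roll_df is a name made up for illustration, and the numbers are the placeholder values from the question's desired output, not real dates.

```python
import pandas as pd

columns = ['insampleStart', 'insampleEnd', 'outsampleStart', 'outsampleEnd']
roll_df = pd.DataFrame(columns=columns)  # hypothetical name, empty frame

# Assigning to a label that does not exist yet adds a new row with that label.
# Placeholder values copied from the desired output in the question.
roll_df.loc['roll1'] = [1, 5, 6, 8]
roll_df.loc['roll2'] = [2, 6, 7, 9]

print(roll_df)
```

Inside the question's loop, the label could presumably be built from rollCount, e.g. 'roll' + str(rollCount).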

Upvotes: 2

b-r-oleary

Reputation: 166

The pandas documentation has an example of appending rows to a DataFrame. Unlike list.append, DataFrame.append does not modify the DataFrame in place; it returns a new DataFrame, so the result has to be assigned back. Each call therefore rebuilds and reindexes the whole DataFrame, which is inefficient when done in a loop. Here is an example solution:

import numpy as np
import pandas as pd

# create empty dataframe
columns=['insampleStart','insampleEnd','outsampleStart','outsampleEnd']
rollParametersDf = pd.DataFrame(columns=columns)

# loop through 5 rows and append them to the dataframe
for i in range(5):
    # create some artificial data
    data = np.random.normal(size=(1, len(columns)))
    # append creates a new dataframe which makes this operation inefficient
    # ignore_index causes reindexing on each call.
    rollParametersDf = rollParametersDf.append(pd.DataFrame(data, columns=columns),
                                               ignore_index=True)

print(rollParametersDf)

   insampleStart  insampleEnd  outsampleStart  outsampleEnd
0       2.297031     1.792745        0.436704      0.706682
1       0.984812    -0.417183       -1.828572     -0.034844
2       0.239083    -1.305873        0.092712      0.695459
3      -0.511505    -0.835284       -0.823365     -0.182080
4       0.609052    -1.916952       -0.907588      0.898772
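Since each append copies the whole DataFrame, one way to avoid the repeated rebuild is to collect the rows in a plain dict and construct the DataFrame once at the end. The following is a sketch under assumptions, not the asker's actual loop: roll_df and rows are made-up names, and the integer values stand in for the dates that would come from customCalender.

```python
import pandas as pd

columns = ['insampleStart', 'insampleEnd', 'outsampleStart', 'outsampleEnd']

rows = {}  # maps row label -> list of values, in insertion order
for roll_count in range(1, 4):
    # Hypothetical stand-in values; in the question these come from customCalender.
    rows[f'roll{roll_count}'] = [roll_count, roll_count + 4,
                                 roll_count + 5, roll_count + 7]

# Build the DataFrame in a single step; dict keys become the index labels.
roll_df = pd.DataFrame.from_dict(rows, orient='index', columns=columns)
print(roll_df)
```

This keeps the roll1..rolln labels and does a single construction no matter how many rows are collected.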

Upvotes: 1
