user

Reputation: 2093

Creating new pandas df from old one

I have a dataframe data and want to append another one to the end of it. The new dataframe is the same as the original, except that the entries in some columns are swapped. I have the following code that works and illustrates what I am doing:

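from math import ceil, floor

listL = data.shape[0]   # number of rows
length = data.shape[1]  # number of columns
mid = (length-1) / 2.0
start = -int(floor(mid))
end = int(floor(mid))

# copy the first 5 rows to the end (assumes the default 0..n-1 index)
for j in range(0, 5) :
    data.loc[listL+j] = data.iloc[j]

# overwrite the copied rows with the swapped columns
for j in range(0, 5) :
    for i in range(start, end) :
        left = int(ceil(mid+i)) + 1
        right = int(ceil(mid-i))
        # one .iloc[row, col] call; chained data.iloc[row][col] = ... may write to a copy
        data.iloc[listL+j, left] = data.iloc[j, right]
    data.iloc[listL+j, 0] = data.iloc[j, 0] + 10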

In this example I am adding only the first 5 rows at the end and swapping the columns. This does not scale well and is very inefficient. Can you help me make this more efficient, eliminate the loops, and make it scale (I would like to work with dataframes that have tens of thousands of entries)? In particular, how can I make the swapping more efficient?

Update: Using one of the answers, I can now do:

tmpdf = data
data = pandas.concat([data, tmpdf])

for j in range(0, listL) :
    for i in range(start, end) :
        left = int(ceil(mid+i)) + 1
        right = int(ceil(mid-i))
        # read from the original row j, so values already swapped are not reused
        data.iloc[listL+j, left] = data.iloc[j, right]
    data.iloc[listL+j, 0] = data.iloc[j, 0] + 10

where listL is the number of rows in the original df data. I need to optimise the second part:

# computed on the original data, before the concat above
listL = data.shape[0]
length = data.shape[1]
mid = (length-1) / 2.0
for j in range(0, listL) :
    for i in range(start, end) :
        left = int(ceil(mid+i)) + 1
        right = int(ceil(mid-i))
        # read from the original row j, so values already swapped are not reused
        data.iloc[listL+j, left] = data.iloc[j, right]
    data.iloc[listL+j, 0] = data.iloc[j, 0] + 10

Upvotes: 1

Views: 538

Answers (2)

user

Reputation: 2093

This is what I ended up doing, thanks to the answers and comments received:

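from math import ceil, floor
import pandas

length = data.shape[1]  # number of columns
mid = (length-1) / 2.0

# range of offsets around the middle column
start = -int(floor(mid))
end = int(floor(mid))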

#for j in range(0, 5) :
#    data.loc[listL+j] = data.iloc[j]

tmpdf = data.copy(deep=True)
for i in range(start, end) :
    left = int(ceil(mid+i)) + 1
    right = int(ceil(mid-i))
    tmpdf[data.columns[left]] = data[data.columns[right]]

data = pandas.concat([data, tmpdf])
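
The remaining loop over columns can also be replaced by a single array assignment. The following is only a sketch under a few assumptions: all columns share one numeric dtype, the helper name append_swapped and its first_col_offset parameter are made up for illustration (the offset stands in for the "+ 10" on the first column from the question), and ignore_index=True is my own choice to avoid duplicate row labels.

```python
import numpy as np
import pandas as pd


def append_swapped(data, first_col_offset=0):
    """Return data followed by a copy of itself with the columns swapped.

    Hypothetical helper: uses the same left/right index formula as the
    loop above, but builds both index arrays at once.
    """
    n_cols = data.shape[1]
    mid = (n_cols - 1) / 2.0

    # Same offsets as range(start, end) in the loop version.
    offsets = np.arange(-int(np.floor(mid)), int(np.floor(mid)))
    left = np.ceil(mid + offsets).astype(int) + 1
    right = np.ceil(mid - offsets).astype(int)

    source = data.to_numpy()
    values = source.copy()
    # Fancy indexing on the right-hand side reads from the untouched source.
    values[:, left] = source[:, right]
    values[:, 0] += first_col_offset

    swapped = pd.DataFrame(values, columns=data.columns)
    return pd.concat([data, swapped], ignore_index=True)


demo = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
                    columns=list("abcde"))
result = append_swapped(demo, first_col_offset=10)
```

Here rows 0 and 1 of result are the original rows, and rows 2 and 3 are the swapped copies. With mixed column dtypes, to_numpy() would fall back to an object array, so this approach would need adjusting.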

Upvotes: 0

Colonel Beauvel

Reputation: 31161

If you have df1 and df2, you can simply use pd.concat to append the first five rows of df2, independently of how the columns are ordered:

pd.concat([df1, df2.iloc[:5]])
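
To illustrate the point about column order, here is a small sketch with made-up frames (the column names a and b and all values are invented for the example). It uses .iloc[:5] for the first five rows, since .ix was removed in pandas 1.0, and ignore_index=True is an optional choice to get a fresh 0..n-1 index.

```python
import pandas as pd

# df2 lists its columns in the opposite order to df1.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"b": [30, 40, 50, 60, 70, 80],
                    "a": [10, 20, 30, 40, 50, 60]})

# concat aligns on column names, not on column positions.
combined = pd.concat([df1, df2.iloc[:5]], ignore_index=True)
```

The values from df2 land under the matching column names of df1, so no manual reordering is needed before concatenating.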

Upvotes: 1
