justinian482

Reputation: 1075

Add data from one column to another column on every other row

I have two data frames:

import pandas as pd
import numpy as np
sgRNA = pd.Series(["ABL1_sgABL1_130854834","ABL1_sgABL1_130862824","ABL1_sgABL1_130872883","ABL1_sgABL1_130884018"])
sequence = pd.Series(["CTTAGGCTATAATCACAATG","GGTTCATCATCATTCAACGG","TCAGTGATGATATAGAACGG","TTGCTCCCTCGAAAAGAGCG"])
df1 = pd.DataFrame(sgRNA, columns=["sgRNA"])
df1["sequence"] = sequence

df2 = pd.DataFrame(columns=["column"],
                   index=np.arange(len(df1) * 2))

I want to interleave the values from both columns of df1 into df2, alternating row by row, like this:

ABL1_sgABL1_130854834
CTTAGGCTATAATCACAATG
ABL1_sgABL1_130862824
GGTTCATCATCATTCAACGG
ABL1_sgABL1_130872883
TCAGTGATGATATAGAACGG
ABL1_sgABL1_130884018
TTGCTCCCTCGAAAAGAGCG

To do this for df1["sgRNA"] I used this code:

df2.iloc[0::2, :]=df1["sgRNA"]

But I get this error: ValueError: could not broadcast input array from shape (4,) into shape (4,1). What am I doing wrong?

Upvotes: 2

Views: 304

Answers (2)

VirtualScooter

Reputation: 1888

Andrej Kesely's solution is the better approach, but to answer what went wrong in your code: the problem is minor.

df1["sgRNA"] is a Series, which is one-dimensional (shape (4,)), while df2.iloc[0::2, :] is a DataFrame, which is two-dimensional (shape (4, 1)). That mismatch is exactly what the error message reports.

The fix is to make the df2 side one-dimensional as well, by selecting its one and only column by position instead of slicing all columns with ":":

df2.iloc[0::2, 0] = df1["sgRNA"]
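
For completeness, here is a minimal sketch that checks the two shapes and then fills both the even and the odd rows. It rebuilds df1 from a dict for brevity, and the .to_numpy() calls are my own addition rather than part of the line above: they pass plain arrays, so the values are placed purely by position and index alignment cannot come into play. The names target_shape, source_shape and interleaved are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "sgRNA": ["ABL1_sgABL1_130854834", "ABL1_sgABL1_130862824",
              "ABL1_sgABL1_130872883", "ABL1_sgABL1_130884018"],
    "sequence": ["CTTAGGCTATAATCACAATG", "GGTTCATCATCATTCAACGG",
                 "TCAGTGATGATATAGAACGG", "TTGCTCCCTCGAAAAGAGCG"],
})
df2 = pd.DataFrame(columns=["column"], index=np.arange(len(df1) * 2))

# The original target slice is 2-D, while the source Series is 1-D.
target_shape = df2.iloc[0::2, :].shape  # (4, 1)
source_shape = df1["sgRNA"].shape       # (4,)

# Selecting column 0 by position makes the target 1-D as well.
# Even rows receive the sgRNA names, odd rows receive the sequences.
df2.iloc[0::2, 0] = df1["sgRNA"].to_numpy()
df2.iloc[1::2, 0] = df1["sequence"].to_numpy()

interleaved = df2["column"].tolist()
print(df2)
```

The second assignment uses the slice 1::2, which starts at row 1 and again steps by two, so the two columns end up alternating.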

Upvotes: 2

Andrej Kesely

Reputation: 195573

I think you're looking for DataFrame.stack():

df2["column"] = df1.stack().reset_index(drop=True)
print(df2)

Prints:

                  column
0  ABL1_sgABL1_130854834
1   CTTAGGCTATAATCACAATG
2  ABL1_sgABL1_130862824
3   GGTTCATCATCATTCAACGG
4  ABL1_sgABL1_130872883
5   TCAGTGATGATATAGAACGG
6  ABL1_sgABL1_130884018
7   TTGCTCCCTCGAAAAGAGCG
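
To see why this interleaves the columns, the sketch below looks at the intermediate result of stack(). It assumes df1 has no missing values (how stack() treats them can differ between pandas versions) and rebuilds df1 from a dict for brevity. The names stacked, first_keys, flat and raveled are only illustrative, and the to_numpy().ravel() comparison is an extra cross-check of mine, not part of the solution above.

```python
import pandas as pd

df1 = pd.DataFrame({
    "sgRNA": ["ABL1_sgABL1_130854834", "ABL1_sgABL1_130862824",
              "ABL1_sgABL1_130872883", "ABL1_sgABL1_130884018"],
    "sequence": ["CTTAGGCTATAATCACAATG", "GGTTCATCATCATTCAACGG",
                 "TCAGTGATGATATAGAACGG", "TTGCTCCCTCGAAAAGAGCG"],
})

# stack() moves the column labels into an inner index level, giving one
# long Series keyed by (row label, column name) and ordered row by row.
stacked = df1.stack()
first_keys = list(stacked.index[:4])

# Dropping that two-level index leaves a plain 0..n-1 index that lines
# up with the index of df2.
flat = stacked.reset_index(drop=True)

# Cross-check: flattening the underlying 2-D array in row-major order
# produces the same sequence of values.
raveled = df1.to_numpy().ravel().tolist()

print(first_keys)
print(flat.tolist() == raveled)
```

Because the stacked values come out row by row, each sgRNA name is immediately followed by its sequence, which is the alternating layout asked for.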

Upvotes: 4
