Reputation: 13
My data looks like the image. I need to take the sequences whose Type is "secstr" and put them in a new column next to the Sequence column, in the row with the same PDB_ID and Chain. Finally, I want to delete the rows that hold the "secstr" sequences.
So far I have something like this:
sequences["Secstr"] = sequences.Sequence[
(sequences['PDB_ID'] == sequences['PDB_ID']) &
(sequences['Chain'] == sequences['Chain']) &
(sequences['Type'] == 'secstr')]
The data I need should look like this:
PDB_ID Chain Sequence Secstr
0 101M A MVLSEGEWQLVLHVWAKVEA HHHH HHHHGGHH HHHH
1 102L A MVLSEGEWQLVLHVWAKVEA HHHH HHHHHHHGGHH HH
2 102M A MVLSEGEWQLVLHVWAKVEA HHHHHHHHHGGHH HHH
3 103L A MVLSEGEWQLVLHVWAKVEA HHHHH HHHHHH HHGGH
4 103L B MVLSEGEWQLVLHVWAKVEA HHHHH HHHHHH HHHHH
Upvotes: 0
Views: 57
Reputation: 35115
Split the original DataFrame by Type, join the 'secstr' rows to the remaining rows on PDB_ID and Chain, and then drop the columns that are no longer needed. Does this match what you are after?
import pandas as pd

# Rows whose Type is 'secstr', indexed by (PDB_ID, Chain)
df2 = df[df['Type'] == 'secstr'].set_index(['PDB_ID', 'Chain'])
# Remaining rows (the 'sequence' rows), indexed the same way
df = df[df['Type'] != 'secstr'].set_index(['PDB_ID', 'Chain'])
# Join the two frames side by side on the shared index
new_df = pd.concat([df, df2], axis=1).reset_index()
# Rename the columns so the two halves can be told apart
new_df.columns = ['PDB_ID', 'Chain', 'Type', 'Sequence', 'Type1', 'Secstr']
# Drop the two Type columns, which are no longer needed
new_df = new_df.drop(columns=['Type', 'Type1'])
new_df
PDB_ID Chain Sequence Secstr
0 101M A HJGDSDDLKEIEWUSKDSK OLKDSJDJFYEUKIBK
1 102L A HJGDSDDLKEIEWUSKDSK OLKDSJDJFYEUKIBK
2 102M A HJGDSDDLKEIEWUSKDSK OLKDSJDJFYEUKIBK
3 103L A HJGDSDDLKEIEWUSKDSK OLKDSJDJFYEUKIBK
4 103M A HJGDSDDLKEIEWUSKDSK OLKDSJDJFYEUKIBK
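For comparison, here is a minimal sketch of the same idea written with merge instead of concat. It assumes the frame has exactly the columns PDB_ID, Chain, Type and Sequence, and that the non-'secstr' rows are labelled 'sequence'. The sample values are made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample data; the real frame is assumed to have these four columns.
df = pd.DataFrame({
    "PDB_ID": ["101M", "101M", "103L", "103L", "103L", "103L"],
    "Chain": ["A", "A", "A", "A", "B", "B"],
    "Type": ["sequence", "secstr"] * 3,
    "Sequence": ["MVLSEGEW", "  HHHH  ", "MNIFEMLR", " HHHHH  ", "MNIFEMLR", "  HHHHHH"],
})

is_secstr = df["Type"] == "secstr"
key_cols = ["PDB_ID", "Chain"]

# Amino-acid rows keep the name Sequence; the Type column is dropped.
seq = df.loc[~is_secstr, key_cols + ["Sequence"]]

# Secondary-structure rows: the same column, renamed to Secstr.
sec = df.loc[is_secstr, key_cols + ["Sequence"]].rename(columns={"Sequence": "Secstr"})

# A left merge keeps every sequence row and attaches the Secstr string
# that shares its (PDB_ID, Chain) pair; unmatched rows get NaN.
result = seq.merge(sec, on=key_cols, how="left")
print(result)
```

Because the merge matches on the PDB_ID and Chain values rather than on an index, it does not rely on column positions when renaming or dropping. A left merge also keeps sequence rows that have no 'secstr' partner.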
Upvotes: 1
Reputation: 13
PDB_ID Chain Sequence Secstr
0 101M A MVLSEGEWQLVLHVWAKVEA HHHH HHHHGGHH HHHH
1 102L A MVLSEGEWQLVLHVWAKVEA HHHH HHHHHHHGGHH HH
2 102M A MVLSEGEWQLVLHVWAKVEA HHHHHHHHHGGHH HHH
3 103L A MVLSEGEWQLVLHVWAKVEA HHHHH HHHHHH HHGGH
4 103L B MVLSEGEWQLVLHVWAKVEA HHHHH HHHHHH HHHHH
I need the data to look something like the table above.
Upvotes: 0