Reputation: 25
I want to change the order of certain column in certain row. The dataframe looks like this.
movieStream | optin_1 | musicStream | optin_2 | DramaStream | optin_3 |
---|---|---|---|---|---|
netflix | yes | spotify | yes | hulu | yes |
netflix | yes | spotify | yes | hulu | yes |
hulu | no | netflix | yes | spotify | no |
netflix | yes | spotify | yes | hulu | yes |
I want to reorder the bold part along with the others in the same column like this:
movieStream | optin_1 | musicStream | optin_2 | DramaStream | optin_3 |
---|---|---|---|---|---|
netflix | yes | spotify | yes | hulu | yes |
netflix | yes | spotify | yes | hulu | yes |
netflix | yes | spotify | no | hulu | no |
netflix | yes | spotify | yes | hulu | yes |
I don't know it's specific index number...and I also don't how many mixed rows going to be in this Dataframe.
Upvotes: 0
Views: 71
Reputation: 13349
One of the easy and worst solution I can think of using apply.
stream
and optin
pairs.key_order
list. Then convert the ordered list of tuples back to single list(flattening) 3) Return that list by converting it to pandas Series
.key_order = ['netflix', 'spotify','hulu']
def order_func(row):
tup = list(zip(*[iter(row.values)]*2))
sort_tup = (sorted(tup, key = lambda x: key_order.index(x[0])))
li = [i for sub in sort_tup for i in sub]
return pd.Series(li, index=row.index)
df = df.apply(order_func, axis=1)
df:
movieStream | optin_1 | musicStream | optin_2 | DramaStream | optin_3 | |
---|---|---|---|---|---|---|
0 | netflix | yes | spotify | yes | hulu | yes |
1 | netflix | yes | spotify | yes | hulu | yes |
2 | netflix | yes | spotify | no | hulu | no |
3 | netflix | yes | spotify | yes | hulu | yes |
Upvotes: 1
Reputation: 770
Assuming that there is only one value for each category (movie, music, drama), this code should work. Added comments for clarity.
# Load data to dataframe
df = pd.read_csv("data.csv")
# Filter out wrong df (to do less loops) - you can omit this step and just proceed with loop below
wrong_dfs = df.loc[(df['movieStream'] != "netflix") | (df['musicStream'] != "spotify") | (df['DramaStream'] != "hulu")]
# Setting proper order (from left to right)
order = ["netflix", "spotify", "hulu"]
# Working on raw list and indexes. Probably can be achieved with some pandas magic, but here is simple/easy to understand version of it ;)
for index, rows in wrong_dfs.iterrows():
raw_row = rows.tolist()
new_row = []
for item in order:
start_idx = raw_row.index(item)
new_row += raw_row[start_idx:start_idx+2]
df.loc[index] = new_row
Upvotes: 0