Min.Y Ku
Min.Y Ku

Reputation: 25

Change certain column order for certain row

I want to change the order of certain column in certain row. The dataframe looks like this.

movieStream optin_1 musicStream optin_2 DramaStream optin_3
netflix yes spotify yes hulu yes
netflix yes spotify yes hulu yes
hulu no netflix yes spotify no
netflix yes spotify yes hulu yes

I want to reorder the bold part along with the others in the same column like this:

movieStream optin_1 musicStream optin_2 DramaStream optin_3
netflix yes spotify yes hulu yes
netflix yes spotify yes hulu yes
netflix yes spotify no hulu no
netflix yes spotify yes hulu yes

I don't know it's specific index number...and I also don't how many mixed rows going to be in this Dataframe.

Upvotes: 0

Views: 71

Answers (2)

Pygirl
Pygirl

Reputation: 13349

One of the easy and worst solution I can think of using apply.

  1. Create a tuple of stream and optin pairs.
  2. Order by the key_order list. Then convert the ordered list of tuples back to single list(flattening) 3) Return that list by converting it to pandas Series.

key_order = ['netflix', 'spotify','hulu']

def order_func(row):
    tup = list(zip(*[iter(row.values)]*2))
    sort_tup = (sorted(tup, key = lambda x: key_order.index(x[0])))
    li = [i for sub in sort_tup for i in sub]
    return pd.Series(li, index=row.index)

df = df.apply(order_func, axis=1)

df:

movieStream optin_1 musicStream optin_2 DramaStream optin_3
0 netflix yes spotify yes hulu yes
1 netflix yes spotify yes hulu yes
2 netflix yes spotify no hulu no
3 netflix yes spotify yes hulu yes

Upvotes: 1

How about nope
How about nope

Reputation: 770

Assuming that there is only one value for each category (movie, music, drama), this code should work. Added comments for clarity.

# Load data to dataframe
df = pd.read_csv("data.csv")

# Filter out wrong df (to do less loops) - you can omit this step and just proceed with loop below
wrong_dfs = df.loc[(df['movieStream'] != "netflix") | (df['musicStream'] != "spotify") | (df['DramaStream'] != "hulu")]

# Setting proper order (from left to right)
order = ["netflix", "spotify", "hulu"]

# Working on raw list and indexes. Probably can be achieved with some pandas magic, but here is simple/easy to understand version of it ;)
for index, rows in wrong_dfs.iterrows():
    raw_row = rows.tolist()
    new_row = []
    for item in order:
        start_idx = raw_row.index(item)
        new_row += raw_row[start_idx:start_idx+2]
    df.loc[index] = new_row

Upvotes: 0

Related Questions