Luiz Fernando Lobo
Luiz Fernando Lobo

Reputation: 704

Create new columns based on a key column and a value column

I have two columns keys and of values (legacy bad design inherited), I want to break this columns into multiple other columns depending on the key column and using the value as the value in the row.

| Key                         | Value                       |
|-----------------------------|-----------------------------|
| .                           | None                        |
| Review Required?            | Yes                         |
| Installed?                  | No                          |
| Required Publish?           | No                          |
| Published?                  | Yes                         |
| Published?                  | No                          |
| Consult Required?           | Yes                         |
| IS Catalog Reference        | Null                        |
| License Review Required?    | No                          |

I tried something like this but it didn't work

  df.loc[~df['Key'].isnull()]
 .apply(lambda cols: pd.DataFrame({cols['Key']:cols['Value']},index=[1]),axis=0)

A sample on what I wish as a result:

| Review Required? | Installed?  | Required Publish? | Published?   | Published?  |
|------------------|-------------|-------------------|--------------|-------------|
|  Yes             | None        | None              | None         | None        |
| None             |          No | None              | None         | None        |
| None             | None        |  No               | None         | None        |
| None             | None        | None              |          Yes | None        |
| None             | None        | None              | None         |          No |

This happens a lot of times and not only for yes or no. Thanks in advance.

Upvotes: 1

Views: 307

Answers (1)

jezrael
jezrael

Reputation: 862611

Use DataFrame.dropna for remove rows with missing values, add Key to index and reshape by Series.unstack:

df = df.dropna(subset=['Key']).set_index('Key', append=True)['Value'].unstack()

Upvotes: 1

Related Questions