PySpark rename multiple columns based on regex pattern list

Question

I have a PySpark DataFrame and want to rename its columns based on the following list of regex patterns:

patterns = ["price-usd-([0-9]+)", "list_price_([0-9]+)", "price_per_([0-9]+)_units", "pricefor([0-9]+)", "([0-9]+)_plus_price", "break_price_([0-9]+)", "price_break_pricing_([a-z]+)"]

Based on the above patterns, I want to rename the columns of the DataFrame below.

------------------------------------------------------------------------------------------------------------------------------------------
| item_name | price-usd-1 | break_price_7  |    pricefor5  |  price_per_9_units | price_break_pricing_a |  2_plus_price  | list_price_8  |
------------------------------------------------------------------------------------------------------------------------------------------
| Samsung Z |   10000     |         5      |    9000       |         10         |          7000         |      4         |       21      |
| Moto G4   |   12000     |         10     |    10000      |         20         |          6000         |      3         |       43      |
| Mi 4i     |   15000     |         8      |    12000      |         20         |         10000         |      5         |       25      |
| Moto G3   |   20000     |         5      |    18000      |         12         |         15000         |      10        |       15      |
------------------------------------------------------------------------------------------------------------------------------------------

Expected output:

----------------------------------------------------------------------------------------------------------------------
| item_name |    price_1  |    price_7     |     price_5   |       price_9      |  price_a   |   price_2  |  price_8 |   
----------------------------------------------------------------------------------------------------------------------
| Samsung Z |   10000     |         5      |    9000       |         10         |  7000      |    4       |    21    |
| Moto G4   |   12000     |         10     |    10000      |         20         |  6000      |    3       |    43    |
| Mi 4i     |   15000     |         8      |    12000      |         20         |  10000     |    5       |    25    |
| Moto G3   |   20000     |         5      |    18000      |         12         |  15000     |    10      |    15    |
----------------------------------------------------------------------------------------------------------------------


Answers (1)
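
One possible approach, as a minimal sketch: compute the new name for each column in plain Python with `re.fullmatch`, then apply all names at once with `DataFrame.toDF`. The helper names `new_name` and `rename_by_patterns` and the `prefix` parameter are made up for illustration. The sketch assumes every pattern has exactly one capture group and that the new name is always `price_` followed by that group, as in the expected output.

```python
import re

patterns = [
    "price-usd-([0-9]+)",
    "list_price_([0-9]+)",
    "price_per_([0-9]+)_units",
    "pricefor([0-9]+)",
    "([0-9]+)_plus_price",
    "break_price_([0-9]+)",
    "price_break_pricing_([a-z]+)",
]


def new_name(column, patterns, prefix="price_"):
    """Return prefix + captured group for the first pattern that matches
    the whole column name; return the name unchanged if none match."""
    for pattern in patterns:
        match = re.fullmatch(pattern, column)
        if match:
            return prefix + match.group(1)
    return column


def rename_by_patterns(df, patterns):
    """Rename every column of a PySpark DataFrame in a single pass.

    DataFrame.toDF(*cols) returns a new DataFrame whose columns carry
    the given names, in the same order as df.columns.
    """
    return df.toDF(*[new_name(c, patterns) for c in df.columns])


# The name mapping can be checked without a Spark session:
columns = [
    "item_name", "price-usd-1", "break_price_7", "pricefor5",
    "price_per_9_units", "price_break_pricing_a", "2_plus_price",
    "list_price_8",
]
renamed = [new_name(c, patterns) for c in columns]
print(renamed)
```

With a real DataFrame, the call would be `df = rename_by_patterns(df, patterns)`. Using `re.fullmatch` rather than `re.search` keeps a pattern from matching only part of a longer column name. If two source columns reduce to the same target (for example `price-usd-1` and `list_price_1`), the result would contain duplicate column names, so that case needs separate handling.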
