Replace a comma only if it is followed by an integer in a PySpark column

Question

values = [("3", "100;PerMonth;BB;1500;Tm;TkU,2500;Trm;TU"), ("4", "100;CalendarDay;g;440;Term;Degram")]
df = spark.createDataFrame(values, ["id", "derivate"])

I want to replace a comma with a pipe inside a column, but only where the comma is immediately followed by an integer. How can I do this in PySpark?

input

+---+---------------------------------------+
|id |derivate                               |
+---+---------------------------------------+
|3  |100;PerMonth;BB;1500;Tm;TkU,2500;Trm;TU|
|4  |100;CalendarDay;g;440;Term;Degram      |
+---+---------------------------------------+

expected output

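+---+--------------------------------------------+--------------------------------------------+
|id |derivate                                    |ITEMS                                       |
+---+--------------------------------------------+--------------------------------------------+
|3  |100;PerMonth;BBL;1500;Term;TkU,2500;Term;TEU|100;PerMonth;BBL|1500;Term;TkU|2500;Term;TEU|
|4  |100;CalendarDay;g;440;Term;Degram           |100;CalendarDay;g|440;Term;Degram           |
+---+--------------------------------------------+--------------------------------------------+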


Answers (1)
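
One possible approach, offered as a minimal sketch rather than a definitive answer: use `regexp_replace` with a lookahead, so that a comma is matched only when the next character is a digit. The helper name `add_items_column` and the pattern constants are hypothetical names introduced here for illustration. Note that the expected output in the question also shows the semicolon before a number turned into a pipe; assuming that is intended, the second pattern `[,;](?=\d)` covers both separators.

```python
import re

# Java-style regex (also valid for Python's re): a comma that is immediately
# followed by a digit. The lookahead (?=\d) checks the digit without consuming it.
COMMA_BEFORE_DIGIT = r",(?=\d)"

# Variant (an assumption based on the expected-output table): also match a
# semicolon that is followed by a digit.
SEP_BEFORE_DIGIT = r"[,;](?=\d)"


def add_items_column(df, pattern=COMMA_BEFORE_DIGIT, src="derivate", dst="ITEMS"):
    """Hypothetical helper: add column `dst` with matching separators replaced by '|'."""
    # Imported here so the regex check below can run without a Spark installation.
    from pyspark.sql import functions as F

    return df.withColumn(dst, F.regexp_replace(F.col(src), pattern, "|"))


# Plain-Python check of both patterns on the first sample row.
sample = "100;PerMonth;BB;1500;Tm;TkU,2500;Trm;TU"
comma_only = re.sub(COMMA_BEFORE_DIGIT, "|", sample)
both = re.sub(SEP_BEFORE_DIGIT, "|", sample)
print(comma_only)  # 100;PerMonth;BB;1500;Tm;TkU|2500;Trm;TU
print(both)        # 100;PerMonth;BB|1500;Tm;TkU|2500;Trm;TU
```

With the DataFrame from the question, `add_items_column(df)` would replace only commas, while `add_items_column(df, pattern=SEP_BEFORE_DIGIT)` would also replace semicolons that precede a number. The leading `100` is untouched in both cases because no separator comes before it.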
