Reputation: 71
I have the following dataframe with numeric values for each column:
Total.show(5)
+----------+----------------+------------+----------------------+---------------------+---------------+---------------------+-----------------+---------------------------+--------------------------+
|id_cliente|consumo_datos_MB|sms_enviados|minutos_llamadas_movil|minutos_llamadas_fijo|sum(id_cliente)|sum(consumo_datos_MB)|sum(sms_enviados)|sum(minutos_llamadas_movil)|sum(minutos_llamadas_fijo)|
+----------+----------------+------------+----------------------+---------------------+---------------+---------------------+-----------------+---------------------------+--------------------------+
|         2|             611|           0|                    41|                   38|              2|                  611|                0|                         41|                        38|
|         8|             284|           5|                    71|                   31|              8|                  284|                5|                         71|                        31|
|        14|            1324|           0|                    28|                   29|             14|                 1324|                0|                         28|                        29|
|        21|            1748|           0|                    81|                   12|             21|                 1748|                0|                         81|                        12|
|        25|            1555|           0|                    60|                    6|             25|                 1555|                0|                         60|                         6|
+----------+----------------+------------+----------------------+---------------------+---------------+---------------------+-----------------+---------------------------+--------------------------+
What I need to do is create another DF with each of those values multiplied by a coefficient, which would be the following:
0.4 --> minutos_llamadas_movil
0.3 --> consumo_datos_MB
0.2 --> minutos_llamadas_fijo
0.1 --> sms_enviados
That means I have to multiply every item in each column by a different value: every item under minutos_llamadas_movil has to be multiplied by 0.4, every one under consumo_datos_MB by 0.3, and so on.
This is what I have tried:
Sumas = Total["consumo_datos_MB", "sms_enviados", "minutos_llamadas_movil", "minutos_llamadas_fijo"]=
0.3 * Total["consumo_datos_MB"],
0.4 * Total["minutos_llamadas_movil"],
0.2 * Total["minutos_llamadas_fijo"],
0.1 * Total["sms_enviados"]
print(Sumas)
I got the following error message in PySpark when running the code above: TypeError: 'DataFrame' object does not support item assignment
Everything I've seen online about this error is about date formats, which is not the case here, so I'm not sure what the problem might be.
Can anyone help please?
Thanks a lot in advance!
Upvotes: 0
Views: 1179
Reputation: 14845
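You can use withColumn to multiply the values. Spark DataFrames are immutable, so they do not support item assignment; withColumn instead returns a new DataFrame in which the named column is replaced: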
Sumas = Total.withColumn("consumo_datos_MB", 0.3 * Total["consumo_datos_MB"]) \
    .withColumn("minutos_llamadas_movil", 0.4 * Total["minutos_llamadas_movil"]) \
    .withColumn("minutos_llamadas_fijo", 0.2 * Total["minutos_llamadas_fijo"]) \
    .withColumn("sms_enviados", 0.1 * Total["sms_enviados"])
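If the list of columns grows, the same chain of withColumn calls could be driven by a dict of coefficients instead of being written out by hand. This is only a minimal sketch: it assumes Total is the DataFrame from the question, and COEFFICIENTS and apply_coefficients are hypothetical names introduced here for illustration, not part of PySpark.

```python
from functools import reduce

# Weights taken from the question, keyed by column name.
COEFFICIENTS = {
    "minutos_llamadas_movil": 0.4,
    "consumo_datos_MB": 0.3,
    "minutos_llamadas_fijo": 0.2,
    "sms_enviados": 0.1,
}


def apply_coefficients(df, coefficients):
    """Return a new DataFrame with each listed column multiplied by its weight.

    Hypothetical helper: it relies only on df.withColumn(name, column) and
    df[name], both of which a PySpark DataFrame provides. The input DataFrame
    is not modified; every withColumn call yields a new DataFrame.
    """
    return reduce(
        lambda acc, item: acc.withColumn(item[0], acc[item[0]] * item[1]),
        coefficients.items(),
        df,
    )
```

With the DataFrame from the question this would be called as Sumas = apply_coefficients(Total, COEFFICIENTS), followed by Sumas.show(5). Columns not named in the dict, such as id_cliente, are left unchanged.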
Upvotes: 2