AltShift

Reputation: 346

SparkR: "Cannot resolve column name..." when adding a new column to Spark data frame

I am trying to add some computed columns to a SparkR data frame, as follows:

Orders <- withColumn(Orders, "Ready.minus.In.mins",   
(unix_timestamp(Orders$ReadyTime) - unix_timestamp(Orders$InTime)) / 60)
Orders <- withColumn(Orders, "Out.minus.In.mins", 
(unix_timestamp(Orders$OutTime) - unix_timestamp(Orders$InTime)) / 60)

The first command executes OK, and head(Orders) shows the new column. The second command throws this error:

15/12/29 05:10:02 ERROR RBackendHandler: col on 359 failed
Error in select(x, x$"*", alias(col, colName)) : 
error in evaluating the argument 'col' in selecting a method for function 
'select': Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
org.apache.spark.sql.AnalysisException: Cannot resolve column name 
"Ready.minus.In.mins" among (ASAP, AddressLine, BasketCount, CustomerEmail, CustomerID, CustomerName, CustomerPhone, DPOSCustomerID, DPOSOrderID, ImportedFromOldDb, InTime, IsOnlineOrder, LineItemTotal, NetTenderedAmount, OrderDate, OrderID, OutTime, Postcode, ReadyTime, SnapshotID, StoreID, Suburb, TakenBy, TenderType, TenderedAmount, TransactionStatus, TransactionType, hasLineItems, Ready.minus.In.mins);
at org.apache.spark.sql.DataFrame$$anonfun$resolve$1.apply(DataFrame.scala:159)
at org.apache.spark.sql.DataFrame$$anonfun$resolve$1.apply(DataFrame.scala:159)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.DataFrame.resolve(DataFrame.scala:158)
at org.apache.spark.sql.DataFrame$$anonfun$col$1.apply(DataFrame.scala:650)
at org.apa 

Do I need to do something to the data frame after adding the new column before it will accept another one?

Upvotes: 0

Views: 1912

Answers (2)

rcidadef

Reputation: 78

From the link: wrap the column name in backticks when accessing it. So instead of

df['Fields.fields1']

use:

df['`Fields.fields1`']
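
The same backtick-quoting can be applied in SparkR itself via selectExpr, which parses its argument as a SQL expression. A minimal sketch, assuming the `Orders` data frame from the question already has the dotted column:

```r
# Backticks tell the SQL parser that "Ready.minus.In.mins" is one
# column name, not a nested field access on "Ready".
ready <- selectExpr(Orders, "`Ready.minus.In.mins`")
head(ready)
```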

Upvotes: 1

AltShift

Reputation: 346

Found the cause here: spark-issues mailing list archives.

SparkR isn't entirely happy with "." in a column name: the analyzer treats the dot as a nested-field separator, so the second withColumn fails to resolve the column even though it is listed in the schema.
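
The simplest workaround is to avoid dots in the new column names altogether. A sketch of the original two commands with dot-free names (assuming the same `Orders` data frame):

```r
# Dot-free column names resolve without any backtick quoting.
Orders <- withColumn(Orders, "ReadyMinusInMins",
  (unix_timestamp(Orders$ReadyTime) - unix_timestamp(Orders$InTime)) / 60)
Orders <- withColumn(Orders, "OutMinusInMins",
  (unix_timestamp(Orders$OutTime) - unix_timestamp(Orders$InTime)) / 60)
```

An existing dotted column can also be renamed with withColumnRenamed(Orders, "Ready.minus.In.mins", "ReadyMinusInMins") before adding further columns.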

Upvotes: 0
