Guillermo Herrera

Reputation: 510

How to add rows and values for given column?

I currently have the following DataFrame, which contains a single row:

Dataset<Row> ds = sparkSession.read().text(pathFile);

+-------+--------+
| VALUE |  TIME  |
+-------+--------+
| 5000  |        |
+-------+--------+

Here TIME doesn't have a value (or is null). How can I add a value to the TIME column? Later in my program I will also be adding more rows, so I will need to append values for both the VALUE and TIME columns. How can I do this?

Upvotes: 0

Views: 1192

Answers (1)

Jacek Laskowski

Reputation: 74779

"How can I add a value to the TIME column?" and "TIME doesn't have a value (or is null)" lead me to believe that you may want to explore the na operator.

na: DataFrameNaFunctions
Returns a DataFrameNaFunctions for working with missing data.

That, in turn, gives you a way to fill missing values.

fill(value: String, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
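
Since the question uses the Java API, here is a minimal sketch of the fill call from Java. It assumes TIME is a string column. The class name FillNullTime, the helper fillTime, the "12:00" placeholder, and the hand-built one-row DataFrame are all made up for illustration and do not come from the question.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class FillNullTime {

    // Hypothetical helper: replaces null entries of the string column TIME
    // with the given placeholder and returns a new DataFrame.
    public static Dataset<Row> fillTime(Dataset<Row> ds, String placeholder) {
        return ds.na().fill(placeholder, new String[] {"TIME"});
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("fill-null-time")
                .getOrCreate();

        // Assumed schema: both columns are nullable strings.
        StructType schema = new StructType()
                .add("VALUE", DataTypes.StringType)
                .add("TIME", DataTypes.StringType);

        // One row whose TIME is null, mirroring the table in the question.
        List<Row> rows = Arrays.asList(RowFactory.create("5000", null));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        fillTime(ds, "12:00").show();
        spark.stop();
    }
}
```

Note that fill only touches nulls. If TIME holds an empty string rather than null, this call leaves it unchanged.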

If you just want to replace the value, you should use the withColumn operator.

withColumn(colName: String, col: Column): DataFrame
Returns a new Dataset by adding a column or replacing the existing column that has the same name.

As the value for col, you could use the lit function.

lit(literal: Any): Column
Creates a Column of literal value.
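
Put together, withColumn and lit might look like this from Java. This is only a sketch: the class name SetTimeColumn, the helper setTime, the sample time value, and the hand-built input DataFrame are illustrative assumptions.

```java
import static org.apache.spark.sql.functions.lit;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class SetTimeColumn {

    // Hypothetical helper: replaces the existing TIME column with a constant.
    // Every row gets the same value, whether or not TIME was null before.
    public static Dataset<Row> setTime(Dataset<Row> ds, String time) {
        return ds.withColumn("TIME", lit(time));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("set-time-column")
                .getOrCreate();

        // Assumed schema: both columns are nullable strings.
        StructType schema = new StructType()
                .add("VALUE", DataTypes.StringType)
                .add("TIME", DataTypes.StringType);

        List<Row> rows = Arrays.asList(RowFactory.create("5000", null));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        setTime(ds, "12:00").show();
        spark.stop();
    }
}
```

Unlike na().fill, this overwrites TIME on all rows, so it fits best when the column is known to be entirely empty.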

The other requirement was "be adding more rows as well". That's the union operator.

union(other: Dataset[T]): Dataset[T]
Returns a new Dataset containing union of rows in this Dataset and another Dataset. This is equivalent to UNION ALL in SQL.
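
Appending a row with union could be sketched as follows in Java. The class name AppendRows, the helper appendRow, and the sample values are hypothetical. The sketch assumes both columns are strings and builds the new row with the schema of the existing DataFrame.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class AppendRows {

    // Hypothetical helper: wraps one (VALUE, TIME) pair in a single-row
    // DataFrame with the same schema as ds and unions it onto ds.
    public static Dataset<Row> appendRow(SparkSession spark, Dataset<Row> ds,
                                         String value, String time) {
        List<Row> extra = Arrays.asList(RowFactory.create(value, time));
        Dataset<Row> newRows = spark.createDataFrame(extra, ds.schema());
        return ds.union(newRows);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("append-rows")
                .getOrCreate();

        // Assumed schema: both columns are nullable strings.
        StructType schema = new StructType()
                .add("VALUE", DataTypes.StringType)
                .add("TIME", DataTypes.StringType);

        List<Row> rows = Arrays.asList(RowFactory.create("5000", "12:00"));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        // A Dataset is immutable, so keep the result of each append.
        Dataset<Row> appended = appendRow(spark, ds, "6000", "12:05");
        appended.show();
        spark.stop();
    }
}
```

union matches columns by position rather than by name, so the new rows must list their values in the same column order as the existing DataFrame.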

Upvotes: 2
