Sawan S

Reputation: 97

PySpark: auto-increment an ID column when appending DataFrames

I have a PySpark DataFrame in the format below.

Table A:

+----+--------+------+-------------+
| ID |  date  | type | description |
+----+--------+------+-------------+
|  1 | 201905 | A    | descA       |
|  2 | 202006 | B    | descB       |
|  3 | 201503 | C    | descC       |
|  4 | 201507 | D    | descD       |
|  5 | 201601 | E    | descE       |
|  6 | 201809 | F    | descF       |
|  7 | 201011 | G    | descG       |
+----+--------+------+-------------+

I have another table, B, which I need to append to table A. Table B does not have the ID column.

Table B:

+--------+------+-------------+
|  Date  | Type | description |
+--------+------+-------------+
| 201001 | H    | descH       |
| 201507 | I    | descI       |
| 201907 | J    | descJ       |
+--------+------+-------------+

Table B needs to be appended to table A, and the ID column must continue to auto-increment by 1 for each appended row, as shown below.

Output table:

+----+--------+------+-------------+
| ID |  date  | type | description |
+----+--------+------+-------------+
|  1 | 201905 | A    | descA       |
|  2 | 202006 | B    | descB       |
|  3 | 201503 | C    | descC       |
|  4 | 201507 | D    | descD       |
|  5 | 201601 | E    | descE       |
|  6 | 201809 | F    | descF       |
|  7 | 201011 | G    | descG       |
|  8 | 201001 | H    | descH       |
|  9 | 201507 | I    | descI       |
| 10 | 201907 | J    | descJ       |
+----+--------+------+-------------+

Can you please tell me how I can do this using PySpark?
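
For reference, the two DataFrames can be created as follows (a minimal sketch; it assumes an active SparkSession named spark, and storing the dates as strings is my assumption):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Table A: already has the ID column
df_a = spark.createDataFrame(
    [(1, "201905", "A", "descA"), (2, "202006", "B", "descB"),
     (3, "201503", "C", "descC"), (4, "201507", "D", "descD"),
     (5, "201601", "E", "descE"), (6, "201809", "F", "descF"),
     (7, "201011", "G", "descG")],
    ["ID", "date", "type", "description"],
)

# Table B: no ID column, and its headers are capitalized
df_b = spark.createDataFrame(
    [("201001", "H", "descH"), ("201507", "I", "descI"),
     ("201907", "J", "descJ")],
    ["Date", "Type", "description"],
)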

Thanks.

Upvotes: 0

Views: 393

Answers (1)

Cena

Reputation: 3419

You can assign row numbers to Table B starting from 1 and then add the max of the ID column from Table A to it. So the first row of Table B becomes 8 (1+7).

For the union, use unionByName, since the column order differs between the two DataFrames; it unions DataFrames by column name rather than by position. Note that name matching follows spark.sql.caseSensitive (false by default), so Table B's Date/Type resolve against Table A's date/type.
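
If your session runs with case-sensitive resolution, you can normalize Table B's column names first (an optional step I'm adding, not part of the original answer):

df_b = df_b.withColumnRenamed("Date", "date").withColumnRenamed("Type", "type")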

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Largest existing ID in Table A (7 in this example)
max_id = df_a.agg({"ID": "max"}).collect()[0][0]

# Window with no partitionBy: fine for a small Table B, but note that
# it moves all of Table B's rows to a single partition for numbering
w = Window.orderBy("date", "type")

# Offset Table B's row numbers by max_id, then union by column name
df_a.unionByName(
    df_b.withColumn("ID", (max_id + F.row_number().over(w)).cast("int"))
).show()

+---+------+----+-----------+
| ID|  date|type|description|
+---+------+----+-----------+
|  1|201905|   A|      descA|
|  2|202006|   B|      descB|
|  3|201503|   C|      descC|
|  4|201507|   D|      descD|
|  5|201601|   E|      descE|
|  6|201809|   F|      descF|
|  7|201011|   G|      descG|
|  8|201001|   H|      descH|
|  9|201507|   I|      descI|
| 10|201907|   J|      descJ|
+---+------+----+-----------+
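
If Table B is large, the single-partition window above can become a bottleneck. One possible alternative (a sketch reusing the df_a, df_b, and max_id from above, not from the original answer) is to number rows with zipWithIndex on the underlying RDD. Unlike the window version, this numbers rows in their existing order rather than by date and type:

# zipWithIndex assigns 0-based row indexes without collapsing the
# data to one partition; offset them by max_id + 1 to continue the IDs
df_b_with_id = (
    df_b.rdd.zipWithIndex()
        .map(lambda pair: pair[0] + (max_id + pair[1] + 1,))
        .toDF(df_b.columns + ["ID"])
)
df_a.unionByName(df_b_with_id).show()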

Upvotes: 2
