Reputation: 41
I want to process a text file using a Spark RDD.
The file has data like this:
----------------------------*-----------------------
state:xx sub:z |Basic info
company:abc rate:123 |
----------------------------*------------------------
Date: 12-03-2019
I expect the data to be in the following format:
State:XX
Sub:z
Company:abc
rate:123
Date:12-03-2019
When I tried to remove the special character '-' using data1=data.ReplaceAll('-',""),
it also removed the '-' from the date, giving 12032019, but the date should stay as 12-03-2019. I also can't figure out how to move sub:z, company:abc and rate:123
onto new lines. Please help.
Upvotes: 1
Views: 288
Reputation: 448
Without further details, here are my suggestions:
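First, remove the separator lines made up of - as whole lines, instead of replacing every - in the data, so the date is left alone. You may get something like this:
state:xx sub:z |Basic info
company:abc rate:123 |
Date: 12-03-2019
Then split each line on | and keep only the first part:
state:xx sub:z
company:abc rate:123
Date: 12-03-2019
Finally, replace each ' ' (blank space) with a newline (\n, or \r\n on Windows). I'm not sure whether Date: has a blank space after it; if so, replace 'Date: ' with 'Date:' first:
state:xx
sub:z
company:abc
rate:123
Date:12-03-2019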
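A minimal sketch of the suggestions above, written as plain Python so it can be tried without a cluster. The names SEPARATOR and normalize_line are made up for illustration, and the separator pattern is only an assumption about what those lines look like in the real file:

```python
import re

# Assumption: a separator line contains only '-', '*' and whitespace,
# e.g. "--------*--------". Adjust the pattern if the real file differs.
SEPARATOR = re.compile(r"^[-*\s]+$")


def normalize_line(line):
    """Return the cleaned key:value tokens found in one raw input line."""
    # Drop separator lines as a whole, so '-' inside a date is never touched.
    if SEPARATOR.match(line):
        return []
    # Keep only the text before the first '|'.
    text = line.split("|", 1)[0]
    # Turn "key: value" into "key:value" (e.g. "Date: 12-03-2019").
    text = re.sub(r":\s+", ":", text)
    # Each whitespace-separated field becomes its own output record.
    return text.split()


# Sample input copied from the question.
lines = [
    "----------------------------*-----------------------",
    "state:xx sub:z |Basic info",
    "company:abc rate:123 |",
    "----------------------------*------------------------",
    "Date: 12-03-2019",
]

result = [token for line in lines for token in normalize_line(line)]
print("\n".join(result))
```

With PySpark the same function could probably be passed to flatMap, for example sc.textFile(path).flatMap(normalize_line), where sc is an existing SparkContext and path is a placeholder for your file. Returning a list per line lets one input line produce zero, one or several output records.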
Hope this helps.
Upvotes: 1