Reputation: 359
I am new to AWS Glue.
I am facing an issue converting a Glue DynamicFrame to a PySpark DataFrame:
Below is the crawler configuration I created for reading the CSV file:
glue_cityMapDB="csvDb"
glue_cityMapTbl="csv table"
datasource2 = glue_context.create_dynamic_frame.from_catalog(database = glue_cityMapDB, table_name = glue_cityMapTbl, transformation_ctx = "datasource2")
datasource2.show()
print("Show the data source2 city DF")
cityDF=datasource2.toDF()
cityDF.show()
Here I get output from the Glue DynamicFrame (datasource2.show()), but after converting it to a PySpark DataFrame I get the following error:
S3NativeFileSystem (S3NativeFileSystem.java:open(1208)) - Opening 's3://s3source/read/names.csv' for reading
2020-04-24 05:08:39,789 ERROR [Executor task launch worker for task
I would appreciate any help with this.
Upvotes: 3
Views: 8287
Reputation: 851
Make sure the files are UTF-8 encoded. You can check the encoding with the file command and convert it with iconv, or use a text editor such as Sublime Text.
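If you would rather do the check in Python than on the command line, something like the sketch below might work. It assumes the CSV has already been copied from S3 to a local path, and that the original encoding is Latin-1 (a guess; substitute whatever your file actually uses). The function names is_utf8 and convert_to_utf8 are just illustrative.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the whole file decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding="latin-1"):
    """Re-encode src into UTF-8 and write the result to dst.

    source_encoding is an assumption: set it to the file's real encoding.
    Working on bytes keeps the original line endings untouched.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, call is_utf8("names.csv") first, and only if it returns False run convert_to_utf8("names.csv", "names_utf8.csv") and upload the converted file back to S3.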
You can also read the files as a dataframe using:
df = spark.read.csv('s3://s3source/read/names.csv')
Then convert it to a DynamicFrame using DynamicFrame.fromDF().
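Put together, reading with Spark and converting back could look roughly like this. It is a sketch, not tested against your job: it assumes you already have a GlueContext (glue_context in the question), that the CSV has a header row, and the helper name read_csv_as_dynamic_frame and the frame name "city_dyf" are made up for illustration.

```python
def read_csv_as_dynamic_frame(glue_context, path, encoding="UTF-8"):
    """Read a CSV with Spark's reader, then wrap it as a Glue DynamicFrame."""
    # Imported here because awsglue only exists inside a Glue job or dev endpoint.
    from awsglue.dynamicframe import DynamicFrame

    spark = glue_context.spark_session
    # header=True is an assumption about the file; drop it if there is no header row.
    df = spark.read.csv(path, header=True, encoding=encoding)
    # fromDF(dataframe, glue_ctx, name) converts the Spark DataFrame.
    return DynamicFrame.fromDF(df, glue_context, "city_dyf")
```

In the job you would call it as read_csv_as_dynamic_frame(glue_context, 's3://s3source/read/names.csv'). Passing a different encoding value is one way to read a file that is not UTF-8 without converting it first.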
Upvotes: 2