Reputation: 51
I have a Glue job that loads data from RDS to Snowflake.
The job used to write to S3 before this Snowflake instance existed. Now, running it with Snowflake as the sink fails with this error: "IllegalArgumentException: No group with name <host>"
From the driver logs:
23/03/29 09:45:32 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script job-rds-to-snowflake-visual.py: 50
23/03/29 09:45:32 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] {"Event":"GlueETLJobExceptionEvent","Timestamp":1680083132028,"Failure Reason":"Traceback (most recent call last):\n File \"/tmp/job-rds-to-snowflake-visual.py\", line 50, in <module>\n transformation_ctx=\"SnowflakeDataCatalog_node1680082896733\",\n File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 819, in from_catalog\n return self._glue_context.write_dynamic_frame_from_catalog(frame, db, table_name, redshift_tmp_dir, transformation_ctx, additional_options, catalog_id)\n File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 386, in write_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id)\n File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py\", line 1305, in __call__\n answer, self.gateway_client, self.target_id, self.name)\n File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>","Stack Trace":[{"Declaring Class":"deco","Method Name":"raise converted from None","File Name":"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py","Line Number":117},{"Declaring Class":"__call__","Method Name":"answer, self.gateway_client, self.target_id, self.name)","File Name":"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py","Line Number":1305},{"Declaring Class":"write_dynamic_frame_from_catalog","Method Name":"makeOptions(self._sc, additional_options), catalog_id)","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py","Line Number":386},{"Declaring Class":"from_catalog","Method Name":"return self._glue_context.write_dynamic_frame_from_catalog(frame, db, table_name, redshift_tmp_dir, transformation_ctx, additional_options, catalog_id)","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py","Line Number":819},{"Declaring Class":"<module>","Method Name":"transformation_ctx=\"SnowflakeDataCatalog_node1680082896733\",","File Name":"/tmp/job-rds-to-snowflake-visual.py","Line Number":50}],"Last Executed Line number":50,"script":"job-rds-to-snowflake-visual.py"}
23/03/29 09:45:32 ERROR ProcessLauncher: Error from Python:Traceback (most recent call last):
File "/tmp/job-rds-to-snowflake-visual.py", line 50, in <module>
transformation_ctx="SnowflakeDataCatalog_node1680082896733",
File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 819, in from_catalog
return self._glue_context.write_dynamic_frame_from_catalog(frame, db, table_name, redshift_tmp_dir, transformation_ctx, additional_options, catalog_id)
File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 386, in write_dynamic_frame_from_catalog
makeOptions(self._sc, additional_options), catalog_id)
File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
raise converted from None
pyspark.sql.utils.IllegalArgumentException: No group with name <host>
23/03/29 09:45:31 INFO GlueContext: getCatalogSink: catalogId: null, nameSpace: sf_audit_db, tableName: auditlog_dev_public_rds_auditlog, isRegisteredWithLF: false
23/03/29 09:45:26 WARN SharedState: URL.setURLStreamHandlerFactory failed to set FsUrlStreamHandlerFactory
23/03/29 09:45:24 INFO GlueContext: The DataSource in action : com.amazonaws.services.glue.JDBCDataSource
23/03/29 09:45:24 INFO GlueContext: Glue secret manager integration: secretId is not provided.
23/03/29 09:45:24 INFO GlueContext: nameSpace: pg_audit_db, tableName: supportdatabase_public_audit_log_condensed, connectionName conn-rds-pg-auditdb, vendor: postgresql
23/03/29 09:45:24 INFO GlueContext: getCatalogSource: transactionId: <not-specified> asOfTime: <not-specified> catalogPartitionIndexPredicate: <not-specified>
23/03/29 09:45:24 INFO GlueContext: getCatalogSource: catalogId: null, nameSpace: pg_audit_db, tableName: supportdatabase_public_audit_log_condensed, isRegisteredWithLF: false, isGoverned: false, isRowFilterEnabled: false, useAdvancedFiltering: false, isTableFromSchemaRegistry: false
23/03/29 09:45:22 INFO GlueContext: GlueMetrics configured and enabled
23/03/29 09:45:19 INFO Utils: Successfully started service 'sparkDriver' on port 42465.
I didn't touch the generated script because we want to keep the job in visual mode. Here is the script in case it helps:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue import DynamicFrame
def sparkSqlQuery(glueContext, query, mapping, transformation_ctx) -> DynamicFrame:
    for alias, frame in mapping.items():
        frame.toDF().createOrReplaceTempView(alias)
    result = spark.sql(query)
    return DynamicFrame.fromDF(result, glueContext, transformation_ctx)

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node RDS (Data Catalog)
RDSDataCatalog_node1 = glueContext.create_dynamic_frame.from_catalog(
    database="pg_audit_db",
    table_name="supportdatabase_public_audit_log_condensed",
    transformation_ctx="RDSDataCatalog_node1",
)

# Script generated for node SQL Query
SqlQuery0 = """
SELECT
    *
FROM
    webapirequestlog
"""
SQLQuery_node1679649943271 = sparkSqlQuery(
    glueContext,
    query=SqlQuery0,
    mapping={"webapirequestlog": RDSDataCatalog_node1},
    transformation_ctx="SQLQuery_node1679649943271",
)

# Script generated for node Snowflake (Data Catalog)
SnowflakeDataCatalog_node1680082896733 = glueContext.write_dynamic_frame.from_catalog(
    frame=SQLQuery_node1679649943271,
    database="sf_audit_db",
    table_name="auditlog_dev_public_rds_auditlog",
    transformation_ctx="SnowflakeDataCatalog_node1680082896733",
)
job.commit()
I have tried googling the error, but none of the results were helpful. Any ideas on what to check?
Upvotes: 2
Views: 467
Reputation: 41
The issue is that a JDBC connection defined for Snowflake can be used as a data source for the crawler, but it cannot be used in your ETL job. In the ETL job you must use the snowflake connection type, which unfortunately cannot be used as a data source for the crawler, at least as of now.
Here is the relevant documentation: https://docs.aws.amazon.com/glue/latest/dg/connection-properties.html
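For reference, here is a minimal sketch of what the write step could look like when the job writes through the snowflake connection type with from_options instead of a Data Catalog table backed by a JDBC connection. The connection name, database, schema, and table names below are placeholders, and the exact supported options depend on your Glue version, so treat this as an illustration rather than a drop-in fix:

# Sketch only: write the transformed frame using the native "snowflake"
# connection type. "conn-snowflake-auditdb", "AUDITLOG_DEV", "PUBLIC" and
# "RDS_AUDITLOG" are placeholders for your own connection and target table.
glueContext.write_dynamic_frame.from_options(
    frame=SQLQuery_node1679649943271,
    connection_type="snowflake",
    connection_options={
        "connectionName": "conn-snowflake-auditdb",
        "sfDatabase": "AUDITLOG_DEV",
        "sfSchema": "PUBLIC",
        "dbtable": "RDS_AUDITLOG",
    },
    transformation_ctx="SnowflakeWrite_node1",
)

If you rebuild the target node in Glue Studio with a Snowflake connection (rather than the catalog table crawled over JDBC), the visual editor should generate an equivalent write step for you.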
Upvotes: 0