Reputation: 197
I have created a list and am trying to pass it to StructType(), but I get this error:
AttributeError: 'str' object has no attribute 'name'
My code:
from pyspark.sql import SparkSession
import logging
from pyspark.sql.types import *
from pyspark.sql.functions import to_timestamp
from pyspark.sql.functions import udf
from pyspark.sql.functions import lit
from pyspark.sql.functions import year, month, dayofmonth
from pyspark.context import SparkContext
from pyspark.sql import SQLContext
import argparse
logging.basicConfig(level=logging.INFO,filename = 'parquet.log')
logger = logging.getLogger(__name__)
parser = argparse.ArgumentParser()
parser.add_argument('--schema_py', '--list', nargs='+', required=True, dest='schema_py', help='Schema def')
args = parser.parse_args()
schemaField = args.schema_py
print(type(schemaField)) #It will print <class 'list'>
schema = StructType(schemaField) # On this line facing issue
print(type(schema))
Output
$ python tst.py --schema_py 'StructField('col1', StringType(), True),StructField('col2', StringType(), True),StructField('col3', StringType(), True),StructField('col4', StringType(), True),'
<class 'list'>
Traceback (most recent call last):
File "brrConvertParquet.py", line 41, in <module>
schema = StructType(schemaField)
File "/home/sysbrrd/anaconda3/lib/python3.6/site-packages/pyspark/sql/types.py", line 484, in __init__
self.names = [f.name for f in fields]
File "/home/sysbrrd/anaconda3/lib/python3.6/site-packages/pyspark/sql/types.py", line 484, in <listcomp>
self.names = [f.name for f in fields]
AttributeError: 'str' object has no attribute 'name'
Please help me to understand what's going wrong here.
Upvotes: 2
Views: 17172
Reputation: 160
The problems I see are:
1. You are passing a str into the StructType() call rather than a list of StructField objects. Since you have nargs='+', you are most likely passing in a list of strings, i.e.
["StructField('col1', StringType(), True)", "StructField('col2', StringType(), True)", "StructField('col3', StringType(), True)", "StructField('col4', StringType(), True)"]
2. Command-line arguments always arrive as text, so they have to be converted into StructField objects before they reach StructType(), for example with json, pickle, eval or exec.
Aside from that, everything else should work.
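As a rough sketch of the json route, the script could accept plain name:type pairs and build the dictionary layout that StructType.fromJson reads. The name:type argument format and the helper schema_dict_from_args are assumptions made up for this illustration, not part of the original script, and the argument list is hard-coded so the snippet runs on its own. The pyspark import is guarded so the parsing part can be tried without Spark installed.

```python
import argparse


def schema_dict_from_args(specs):
    """Turn hypothetical 'name:type' strings into a struct-schema dict."""
    fields = []
    for spec in specs:
        # "col1:string" -> ("col1", "string"); the type defaults to "string".
        name, _, type_name = spec.partition(":")
        fields.append({
            "name": name,
            "type": type_name or "string",
            "nullable": True,
            "metadata": {},
        })
    return {"type": "struct", "fields": fields}


parser = argparse.ArgumentParser()
parser.add_argument("--schema_py", nargs="+", required=True, help="Schema def")

# Hard-coded here for illustration; a real script would call parse_args().
args = parser.parse_args(["--schema_py", "col1:string", "col2:string"])
schema_dict = schema_dict_from_args(args.schema_py)

try:
    from pyspark.sql.types import StructType
    # fromJson builds real StructField objects from the dict.
    schema = StructType.fromJson(schema_dict)
except ImportError:
    schema = None  # pyspark is not installed; only the dict was built
```

This keeps the command line free of Python expressions, so nothing has to go through eval or exec.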
self.names = [f.name for f in fields] breaks because each element of fields is a str rather than a StructField. If fields were a list of StructField objects as expected, the f.name lookup would work just fine :-)
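The failure can be reproduced without Spark. The sketch below uses a namedtuple called Field as a hypothetical stand-in for StructField and applies the same list comprehension that appears in the traceback, first to proper field objects and then to a list of strings.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark's StructField so this runs without Spark.
Field = namedtuple("Field", ["name", "data_type", "nullable"])


def field_names(fields):
    # Same pattern as the line shown in the traceback.
    return [f.name for f in fields]


# A list of field objects works: every element has a .name attribute.
good = [Field("col1", "string", True), Field("col2", "string", True)]
names = field_names(good)

# A list of strings fails: str has no .name attribute.
bad = ["StructField('col1', StringType(), True)"]
try:
    field_names(bad)
    error = None
except AttributeError as exc:
    error = str(exc)

print(names)
print(error)
```

The text of a StructField(...) call is still just a str, which is why the list has to hold real objects before it is handed to StructType().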
I hope this helps.
Upvotes: 4