White Shadows

Reputation: 139

Got multiple values for keyword argument 'inputLocForTrain'

I have a method defined in a class:

class X:

    def __init__(self, logger, tableDataLoader, dataCleanser, timeSeriesFunctions):
        self.logger = logger
        self.tableDataLoader = tableDataLoader
        self.dataCleanser = dataCleanser
        self.timeSeriesFunctions = timeSeriesFunctions

    def preProcess(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):

        # Do something
        pass

I am trying to call preProcess through a multiprocessing wrapper class, which is defined like this:

from multiprocessing import Process


class ProcessManager:

    def __init__(self, spark, logger):
        self.spark = spark
        self.logger = logger

    def applyMultiProcessExecution(self, func_arguments, targetFunction, iterableList):

        self.logger.info("Function Arguments : {}".format(func_arguments))
        jobs = []
        for x in iterableList:
            try:
                p = Process(target=targetFunction, args=(x,), kwargs=func_arguments)
                jobs.append(p)
                p.start()
            except Exception:
                raise RuntimeError("Unable to create process for GL : {}".format(x))

        for job in jobs:
            job.join()

I call my ProcessManager like this:

processManager = ProcessManager(spark=spark, logger=logger)
dataFetcherFactory = DataFetcherFactory(logger)
dataFetcher = dataFetcherFactory.getDataFetcher(pipelineType=pipelineType)
dataCleanser = DataCleanser(logger)
timeSeriesFunctions = TimeSeriesFunctions(logger)
tableDataLoader = TableDataLoader(logger=logger, dataFetcher=dataFetcher, dataCleanser=dataCleanser,
                                  timeSeriesFunctions=timeSeriesFunctions)
preProcessDataForPCAModel = X(logger=logger,
                              tableDataLoader=tableDataLoader,
                              dataCleanser=dataCleanser,
                              timeSeriesFunctions=timeSeriesFunctions)
arguments = {FeatureConstants.INPUT_LOCATION_FOR_TRAIN: inputLocForTrain,
             FeatureConstants.INPUT_LOCATION_FOR_TEST: inputLocForTest,
             FeatureConstants.OUTPUT_LOCATION: outputLoc,
             REGION: region}

processManager.applyMultiProcessExecution(func_arguments=arguments,
                                          targetFunction=preProcessDataForPCAModel.preProcess,
                                          iterableList=[504])

This fails with the following error:

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
TypeError: preProcess() got multiple values for keyword argument 'inputLocForTrain'

I went through several Stack Overflow posts where people suggest that this is caused by the self parameter of the method. I don't understand how to resolve the problem, since I need the constructor arguments stored on self in order to do my computation.

Can anyone please tell me how to resolve this?

Upvotes: 0

Views: 539

Answers (1)

georgexsh

Reputation: 16624

Try changing:

def preProcess(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):

to:

def preProcess(self, gl, inputLocForTrain, inputLocForTest, outputLoc, region):

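The positional argument should come first. Process calls the target as target(*args, **kwargs) (see the last frame of the traceback), so the value passed in args=(x,) is bound to the first parameter after self. With the original signature that parameter is inputLocForTrain, which the kwargs dict then supplies a second time, hence the error.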
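The binding can be reproduced without multiprocessing at all. The following is only a minimal sketch: the class Demo, its method names, and the argument values are made up for illustration, and the keys are assumed to match the parameter names (as the error message in the question suggests). It calls each method the same way Process.run() does, with *args and **kwargs.

```python
class Demo(object):
    # Original parameter order: gl is last, so a positional value
    # binds to inputLocForTrain instead.
    def pre_process_old(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):
        return gl

    # Reordered: gl is first, so the positional value binds to gl.
    def pre_process_new(self, gl, inputLocForTrain, inputLocForTest, outputLoc, region):
        return gl


demo = Demo()
args = (504,)  # mirrors args=(x,) in applyMultiProcessExecution
kwargs = {  # hypothetical values standing in for the real locations
    "inputLocForTrain": "train",
    "inputLocForTest": "test",
    "outputLoc": "out",
    "region": "NA",
}

# Original order: 504 and kwargs both target inputLocForTrain -> TypeError.
try:
    demo.pre_process_old(*args, **kwargs)
    old_error = None
except TypeError as exc:
    old_error = str(exc)

# Reordered signature: 504 goes to gl, kwargs fill the rest.
new_result = demo.pre_process_new(*args, **kwargs)

# Alternative that keeps the original signature: pass gl by keyword too.
keyword_result = demo.pre_process_old(**dict(kwargs, gl=504))
```

If changing the method signature is not an option, the last call suggests another route: inside applyMultiProcessExecution, drop args and merge the loop value into the keyword arguments, for example kwargs=dict(func_arguments, gl=x). That hard-codes the parameter name gl in the process manager, so it is a trade-off rather than a strictly better fix.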

Upvotes: 1
