Reputation: 929
I have a simple CSV file, nmbr.csv:
1
2
3
4
When I run the following code over it in the Grunt shell:
grunt> SET job.name 'this_and_that';
grunt> SET mapreduce.job.queuename adhoc;
grunt> SET default_parallel 50;
grunt> index_row = load 'nmbr.csv' as (number:int);
grunt> dump index_row;
I get the correct result:
(1)
(2)
(3)
(4)
But when I save the same code in a file, test.pig:
SET job.name 'this_and_that';
SET mapreduce.job.queuename adhoc;
SET default_parallel 50;
index_row = load 'nmbr.csv' as (number:int);
dump index_row;
and try to run it this way:
$ pig -x mapreduce hdfs://nameservice1/user/evkuzmin/test.pig
I get only this output:
17/01/11 16:14:14 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
17/01/11 16:14:14 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
17/01/11 16:14:14 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2017-01-11 16:14:14,306 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.5.0.0-1245 (rexported) compiled Aug 26 2016, 02:07:35
2017-01-11 16:14:14,307 [main] INFO org.apache.pig.Main - Logging error messages to: /export/home/evkuzmin/pig_1484140454299.log
2017-01-11 16:14:20,083 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /export/home/evkuzmin/.pigbootup not found
2017-01-11 16:14:20,301 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://nameservice1
2017-01-11 16:14:20,401 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-test.pig-b92d8d10-6d6c-4018-b55c-da85716c482b
2017-01-11 16:14:21,549 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://hd-has011.vimpelcom.ru:8188/ws/v1/timeline/
2017-01-11 16:14:21,571 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2017-01-11 16:14:26,403 [main] INFO org.apache.pig.Main - Pig script completed in 12 seconds and 711 milliseconds (12711 ms)
I tried looking here for the errors,
/export/home/evkuzmin/pig_1484140454299.log
but the file wasn't there.
Upvotes: 0
Views: 984
Reputation: 1445
Do not put your test.pig in an HDFS location.
Instead, keep it on the local file system and change the load path in it to the full HDFS path of the data file:
SET job.name 'this_and_that';
SET mapreduce.job.queuename adhoc;
SET default_parallel 50;
index_row = load 'hdfs://nameservice1/user/evkuzmin/nmbr.csv' as (number:int);
dump index_row;
Then run test.pig from its local path, still in MAPREDUCE mode:
pig -x MAPREDUCE your/local/path/to/test.pig
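If the script currently exists only in HDFS, a rough sketch of getting a local copy and running it could look like the following. The local destination ./test.pig is just an assumption for illustration; the HDFS path is the one from the question. The check at the top only guards against the tools not being on the PATH.

```shell
# HDFS path taken from the question; the local path is an assumed example.
HDFS_SCRIPT='hdfs://nameservice1/user/evkuzmin/test.pig'
LOCAL_SCRIPT='./test.pig'

if command -v hdfs >/dev/null 2>&1 && command -v pig >/dev/null 2>&1; then
  # Copy the script out of HDFS (this fails if the local file already exists),
  # then run the local copy in MapReduce mode.
  hdfs dfs -get "$HDFS_SCRIPT" "$LOCAL_SCRIPT" && pig -x mapreduce "$LOCAL_SCRIPT"
else
  echo "hdfs and/or pig not found on PATH; run this on a cluster edge node" >&2
fi
```

Only the script moves to the local file system; the data file stays in HDFS and is read through the full hdfs:// path in the load statement.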
Upvotes: 1