PIG: How to remove '::' in the column name

Question

I have a pig relation like below:

FINAL= {input_md5::type: chararray,input_md5::name: chararray,input_md5::id: long,input_md5::age: chararray,test_1:: type: chararray,test_2::name:chararray}

I am trying to store all columns for input_md5 relation to a hive table. like all input_md5::type: chararray,input_md5::name: chararray,input_md5::id: long,input_md5::age: chararray not taking test_1:: type: chararray,test_2::name:chararray

is there any command in pig which filters only columns of input_md5.Something like below:

STORE= FOREACH FINAL GENERATE all input_md5::type . I know that pig have :

FOREACH FINAL GENERATE all input_md5::type as type syntax, but i have many columns so I cannot use as in my code.

Because when i try: STORE= FOREACH FINAL GENERATE input_md5::type .. bus_input_md5::name;

Pig throws an error:

org.apache.hive.hcatalog.common.HCatException : 2007 : Invalid column position in partition schema : Expected column at position 1, found column

Thanks in advance,

Neethu Lalitha · Accepted Answer

Resolved this issue , below is the fix:

Create a relation with some filter condition as below:

DUMMY_RELATION= FILTER SOURCE_TABLE BY type== ''; (I took a column named type ,this can be filtered by any column in the table , all that matters is we need its schema)

FINAL_DATASET= UNION DUMMY_RELATION,SCHEMA_1,SCHEMA_2;

(this new DUMMY_RELATIONn should be placed 1st in the union) Now you no more have :: operator And your column names would match hive table's column names, provided your source table (to DUMMY_RELATION) and target table have same column order.

Thanks to myself :)

PIG: How to remove '::' in the column name

Answers (2)

Related Questions

PIG: How to remove &#39;::&#39; in the column name

Answers (2)

Related Questions

PIG: How to remove '::' in the column name