user6325753

Reputation: 577

Load into Hive table imported entire data into first column only

I am trying to copy Hive data from one server to another. To do this, I export the Hive data to a CSV file on server1 and then try to import that CSV file into Hive on server2.

My table contains the following data types:

bigint

string

array

Here are my commands:

export:

hive -e 'select * from sample' > /home/hadoop/sample.csv

import:

load data local inpath '/home/hadoop/sample.csv' into table sample;

After importing into the Hive table, the entire row is inserted into the first column only.

How can I fix this, or is there a better way to copy data from one server to another?

Upvotes: 0

Views: 1764

Answers (3)

HbnKing

Reputation: 1882

Why not use the hadoop command to transfer the data from one cluster to another, such as:

 bash$ hadoop distcp hdfs://nn1:8020/foo/bar \
                    hdfs://nn2:8020/bar/foo

Then load the data into your new table:

load data inpath '/bar/foo/*' into table wyp;

Your problem may be caused by the delimiter. If you did not set one when creating the Hive table, the default field delimiter is '\001'. The output of hive -e 'select * from sample' > /home/hadoop/sample.csv is tab-separated rather than '\001'-separated, so when the file is loaded every column ends up in the first column.
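
To make the delimiter mismatch concrete, here is a minimal shell sketch. It assumes a row shaped like the tab-separated output that hive -e prints; the row values and variable names are made up for illustration, and awk only stands in for Hive's field splitting, it is not how Hive parses files internally.

```shell
# A made-up row, with columns separated by tabs as in `hive -e` output
row=$(printf '123\tRaju\tHello')

# Ctrl-A (\001), the default Hive field delimiter mentioned above
ctrl_a=$(printf '\001')

# Count the fields seen when splitting the row on each delimiter
fields_ctrl_a=$(printf '%s\n' "$row" | awk -F "$ctrl_a" '{ print NF }')
fields_tab=$(printf '%s\n' "$row" | awk -F '\t' '{ print NF }')

printf 'fields when splitting on Ctrl-A: %s\n' "$fields_ctrl_a"
printf 'fields when splitting on tab: %s\n' "$fields_tab"
```

Splitting on Ctrl-A finds a single field, which matches the symptom of the whole line landing in the first column; splitting on tab finds three.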

Upvotes: 0

OneCricketeer

Reputation: 191691

You really should not be using CSV as your data transfer format.

Upvotes: 1

Jay Shankar Gupta

Reputation: 6088

When creating the table, add the line below at the end of the CREATE statement:

ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

For example:

hive> CREATE TABLE sample(id INT,
                          name STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

Then load the data:

hive> LOAD DATA LOCAL INPATH '/home/hadoop/sample.csv' INTO TABLE sample;

For your example:

sample.csv

123,Raju,Hello|How Are You
154,Nishant,Hi|How Are You

In the sample data above, the first column is a BIGINT, the second is a STRING, and the third is an ARRAY whose items are separated by |.

hive> CREATE TABLE sample(id BIGINT,
                          name STRING,
                          messages ARRAY<String>) 
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      COLLECTION ITEMS TERMINATED BY '|';
hive> LOAD DATA LOCAL INPATH '/home/hadoop/sample.csv' INTO TABLE sample;

Most important points:

Define a delimiter for the collection items, and do not write the array in the bracketed form you would use in a normal programming language.
Also, make the field delimiter different from the collection-item delimiter to avoid confusion and unexpected results.
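
As a rough illustration of the two points above, the shell sketch below splits the first sample row the way the table definition describes: fields on ',' first, then array items on '|'. The awk and cut calls only stand in for Hive's parsing, and the variable names and the "clash" row are hypothetical, added here to show what happens when both delimiters are the same.

```shell
# First row from sample.csv above
line='123,Raju,Hello|How Are You'

# Step 1: split the row into fields on ','
field_count=$(printf '%s\n' "$line" | awk -F ',' '{ print NF }')
messages=$(printf '%s\n' "$line" | cut -d ',' -f 3)

# Step 2: split the third field into array items on '|'
item_count=$(printf '%s\n' "$messages" | awk -F '|' '{ print NF }')
first_item=$(printf '%s\n' "$messages" | cut -d '|' -f 1)

# Hypothetical bad row: ',' used for both fields and array items
clash='123,Raju,Hello,How Are You'
clash_count=$(printf '%s\n' "$clash" | awk -F ',' '{ print NF }')

printf 'fields: %s, array items: %s, first item: %s\n' \
  "$field_count" "$item_count" "$first_item"
printf 'fields when both delimiters are commas: %s\n' "$clash_count"
```

With distinct delimiters the row yields three fields and a two-item array. With ',' for both, the same data looks like four fields, so a three-column table has no way to tell where the array begins.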

Upvotes: 1
