Dr. Fabien Tarrade
Dr. Fabien Tarrade

Reputation: 1696

create view on Hive table: comment for each variable are lost

I created a Hive table for which we added some description in the "comment" field for each variable as shown below:

spark.sql("create table test_comment (col string comment 'col comment') comment 'hello world table comment ' ")
spark.sql("describe test_comment").show()
+--------+---------+-----------+
|col_name|data_type|    comment|
+--------+---------+-----------+
|     col|   string|col comment|
+--------+---------+-----------+

All is good and we see the comment "col comment" in the commennt field of the variable "col".

Now when I am creating a view on this table, the "comment" field is not propagated to the view and the "comment" column is empty:

spark.sql("""create view test_comment_view as select * from test_comment""")
spark.sql("describe test_comment_view")

+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     col|   string|   null|
+--------+---------+-------+

Is there a way to keep the values of the comment field when created a view ? What is the reason of this "feature" ?

I am using:

Hadoop 2.6.0-cdh5.8.0

Hive 1.1.0-cdh5.8.0

Spark 2.1.0.cloudera1

Upvotes: 1

Views: 1906

Answers (1)

Bala
Bala

Reputation: 11234

What I have observed is that, comments are not inherited even when creating a table from another table. Looks like it is the default behaviour.

create table t1 like another_table 
desc t1  //includes comments
+-----------+------------+------------------+--+
| col_name  | data_type  |     comment      |
+-----------+------------+------------------+--+
| id        | int        | new employee id  |
| name      | string     | employee name    |
+-----------+------------+------------------+--+

create table t1 as select * from another_table
desc t1 //excludes comments
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | int        |          |
| name      | string     |          |
+-----------+------------+----------+--+

But there is a workaround. You can specify individual columns with comment when creating a view

create view v2(id2 comment 'vemp id', name2 comment 'vemp name') as select * from another_table;

+-----------+------------+------------+--+
| col_name  | data_type  |  comment   |
+-----------+------------+------------+--+
| id2       | int        | vemp id    |
| name2     | string     | vemp name  |
+-----------+------------+------------+--+

Upvotes: 3

Related Questions