lezebulon

Reputation: 7994

How are CDH packages defined?

I have questions concerning CDH and how it is maintained:

Thanks!

Upvotes: 0

Views: 46

Answers (1)

tk421

Reputation: 5957

You are correct in assuming that parquet-1.5.0+cdh5.5.5+181 is closest to parquet 1.5.0. However, the code will not be identical to upstream parquet 1.5.0, because:

  1. CDH enforces cross-component compatibility. Code and applications using parquet-1.5.0 must also work with all the other Hadoop services (HDFS, Hive, Oozie, YARN, Spark, Solr, HBase). Any incompatibilities have to be fixed, so the packaged parquet code includes those bug fixes.

  2. CDH enforces compatibility within a major version. An application written on CDH5.1 should still work on CDH5.5, CDH5.7, and all other CDH5.x versions. Maintaining that guarantee also alters the codebase.

The best way to interpret this: parquet-1.5.0+cdh5.5.5+181 supports all features provided in parquet 1.5.0 and also works with the corresponding Hadoop services packaged with CDH5.5.
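To make the naming concrete, here is a minimal Python sketch that splits a package string such as parquet-1.5.0+cdh5.5.5+181 into its parts. The helper `parse_cdh_package`, the `CdhPackageVersion` type, and the regular expression are my own illustration, not a Cloudera tool, and they assume the `component-upstream+cdhX.Y.Z+number` shape seen in this one example. The trailing number is extracted but deliberately left uninterpreted.

```python
import re
from typing import NamedTuple


class CdhPackageVersion(NamedTuple):
    """Parts of a CDH package string (hypothetical helper type)."""
    component: str          # e.g. "parquet"
    upstream_version: str   # upstream release the package is based on
    cdh_version: str        # CDH release the package ships with
    trailing_number: int    # final "+NNN" field, not interpreted here


# Assumed layout: <component>-<upstream>+cdh<cdh version>+<number>
_PATTERN = re.compile(
    r"^(?P<component>[A-Za-z0-9_-]+?)-"
    r"(?P<upstream>\d+(?:\.\d+)*)"
    r"\+cdh(?P<cdh>\d+(?:\.\d+)*)"
    r"\+(?P<trailing>\d+)$"
)


def parse_cdh_package(name: str) -> CdhPackageVersion:
    """Split a CDH package string into its components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised CDH package string: {name!r}")
    return CdhPackageVersion(
        component=match.group("component"),
        upstream_version=match.group("upstream"),
        cdh_version=match.group("cdh"),
        trailing_number=int(match.group("trailing")),
    )


def cdh_major(version: CdhPackageVersion) -> int:
    """Return the CDH major version, the level at which compatibility is promised."""
    return int(version.cdh_version.split(".")[0])


parsed = parse_cdh_package("parquet-1.5.0+cdh5.5.5+181")
```

Here `parsed.upstream_version` is the upstream release to compare features against, and `cdh_major(parsed)` gives the CDH major line within which applications are expected to keep working.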

Version compatibility is also why the Hadoop services in CDH run older versions of their upstream projects: backwards compatibility is much harder to maintain, especially when APIs change between versions.

Upvotes: 0
