Reputation: 410
According to sqoop.apache.org, Sqoop 2 is not feature complete and should not be used for production systems. Fair enough; some people may want to try out Sqoop 2's new features in their test environments.
Cloudera has a feature comparison between Sqoop 1 and Sqoop 2 (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_ig_sqoop_vs_sqoop2.html), but according to the page there is nothing that Sqoop 2 provides that Sqoop 1 does not also provide.
So why would anyone use Sqoop 2 in its current form? Does it provide any advantages over Sqoop 1? If not, why is it available for use? Thanks in advance!
Upvotes: 4
Views: 5230
Reputation: 11597
Just a quick note: according to Cloudera (as of Nov 2017):
"Note: Sqoop 2 is being deprecated. Cloudera recommends using Sqoop 1."
Upvotes: 10
Reputation: 28247
Some of the features expected in the Sqoop2 stable release:
Currently there are no stable releases of Sqoop 2 available. However, you can build the latest version of the project to test it, and contribute to the open-source project if you are interested.
Refer:
Upvotes: 4
Reputation: 753
Apache Sqoop uses a client model, where the user needs to install Sqoop along with the connectors/drivers on the client. Sqoop2 uses a service-based model, where the connectors/drivers are installed on the Sqoop2 server. All configuration also needs to be done on the Sqoop2 server.
From an MR perspective, another difference is that Sqoop submits a map-only job, while Sqoop2 submits a MapReduce job in which the Mappers transport the data from the source and the Reducers transform the data according to the source specified. This provides a clean abstraction. In Sqoop, both the transportation and the transformation are handled by the Mappers alone.
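To make the split concrete, here is a toy sketch in plain Python. It is not Sqoop code and does not touch Hadoop; the function names (`extract`, `transform`, `map_only_job`, `map_reduce_job`) and the uppercase-keys transformation are made up purely for illustration. It only shows the idea of one stage doing everything versus two stages with separate responsibilities.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def extract(rows: Iterable[Record]) -> List[Record]:
    """Transport stage: copy rows out of the source unchanged."""
    return [dict(row) for row in rows]


def transform(rows: Iterable[Record]) -> List[Record]:
    """Transformation stage: reshape rows (hypothetical rule: uppercase keys)."""
    return [{key.upper(): value for key, value in row.items()} for row in rows]


def map_only_job(rows: Iterable[Record]) -> List[Record]:
    # Sqoop-style, as described above: a single mapper stage both
    # transports and transforms each row.
    return [{key.upper(): value for key, value in dict(row).items()} for row in rows]


def map_reduce_job(rows: Iterable[Record]) -> List[Record]:
    # Sqoop2-style, as described above: mappers transport, reducers transform.
    return transform(extract(rows))


if __name__ == "__main__":
    source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    print(map_only_job(source))
    print(map_reduce_job(source))
```

Both functions produce the same rows; the difference is only where the responsibilities live, which is the "clean abstraction" point.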
Another major difference in Sqoop2 is from a security perspective. The administrator sets up the connections to the sources and the targets, while the operator uses the already established connections, so the operator does not need to know the connection details. Operators are given access only to the connectors they require.
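The admin/operator separation can be pictured with a small toy model in Python. This is not the Sqoop2 API; `ToyServer`, `Link`, `create_link`, `grant`, `run_job`, and the example URL and password are all hypothetical names chosen for illustration. The point is only that credentials live on the server and the operator refers to a connection by name.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Link:
    """A stored connection; only the administrator ever supplies these details."""
    name: str
    jdbc_url: str
    password: str


@dataclass
class ToyServer:
    """Toy stand-in for a server that holds connections and access grants."""
    _links: Dict[str, Link] = field(default_factory=dict)
    _grants: Dict[str, Set[str]] = field(default_factory=dict)

    def create_link(self, link: Link) -> None:
        # Administrator action: register a connection on the server.
        self._links[link.name] = link

    def grant(self, operator: str, link_name: str) -> None:
        # Administrator action: allow an operator to use one connection.
        self._grants.setdefault(operator, set()).add(link_name)

    def run_job(self, operator: str, link_name: str, table: str) -> str:
        # Operator action: refer to the connection by name only.
        if link_name not in self._grants.get(operator, set()):
            raise PermissionError(f"{operator} may not use {link_name}")
        link = self._links[link_name]
        # A real server would connect with link.jdbc_url and link.password here;
        # neither value is returned to the operator.
        return f"{operator} imported {table} via {link.name}"


if __name__ == "__main__":
    server = ToyServer()
    server.create_link(Link("sales_db", "jdbc:mysql://db.example.com/sales", "s3cret"))
    server.grant("alice", "sales_db")
    print(server.run_job("alice", "sales_db", "orders"))
```

An operator without a grant (say, `bob`) gets a `PermissionError`, and at no point does an operator see the JDBC URL or password.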
Upvotes: 4