Reputation: 3127
Azure Data Lake Analytics and Azure Databricks can both be used for batch processing. Could anyone please help me understand when to choose one over the other?
Upvotes: 22
Views: 10775
Reputation: 51
Databricks offers more language options, which allows professionals with different skills to work on the data. Also, with Databricks you can run jobs on high-performance, in-memory clusters.
In one project, we use the data lake mainly as storage and do all the jobs (ETL, analytics) via Databricks notebooks. Storing data in the data lake is cheaper.
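As a rough illustration of that setup, here is a minimal sketch of a batch job as it might appear in a Databricks notebook. It assumes ADLS Gen2 as the storage layer (hence the `abfss://` URI scheme) and that access to the storage account is already configured on the cluster. The container, account, path, and column names (`order_date`, `amount`) are hypothetical, as are the helper names `adls_path` and `run_daily_batch`.

```python
def adls_path(container: str, account: str, relative_path: str) -> str:
    """Build an abfss:// URI for a path in ADLS Gen2 (assumed storage layer)."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


def run_daily_batch(spark, source: str, target: str) -> None:
    """Read raw CSV from the lake, aggregate per day, write Parquet back."""
    # Imported lazily so the path helper works without PySpark installed;
    # PySpark is available by default on Databricks clusters.
    from pyspark.sql import functions as F

    raw = spark.read.option("header", "true").csv(source)
    daily = (
        raw.withColumn("amount", F.col("amount").cast("double"))  # hypothetical column
        .groupBy("order_date")                                    # hypothetical column
        .agg(F.sum("amount").alias("total_amount"))
    )
    daily.write.mode("overwrite").parquet(target)


# In a notebook, `spark` is the SparkSession that Databricks provides:
# run_daily_batch(
#     spark,
#     adls_path("raw", "mylake", "sales/"),
#     adls_path("curated", "mylake", "sales_daily/"),
# )
```

The data stays in the lake before and after the job; the Databricks cluster is only used for the compute step.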
Back to your question: if the batch job is complex and different types of professionals will work on the data, you may choose an Azure Data Lake + Databricks architecture. Otherwise, Azure Data Lake on its own would satisfy your needs.
Taking a look at these two articles would help: https://databricks.com/glossary/data-lake https://visualbi.com/blogs/microsoft/azure/etl-azure-databricks-vs-data-lake-analytics/
Upvotes: 5
Reputation: 14379
In my humble opinion, a lot of it comes down to existing skillsets. If you have a team experienced in Spark, Java, Python, R or Scala, then Databricks is a natural fit. If, on the other hand, you have a team with existing SQL and C# skills, then the learning curve for them with U-SQL will be less steep.
That aside, there are other questions which can drive out differences:
UPDATE October 2018: As far as I am aware, U-SQL does not currently support ADLS Gen 2, which would count against it (happy to be corrected). I will update the post if and when that support is added.
UPDATE January 2019: U-SQL has not had any meaningful updates since Spring 2018.
HTH
Upvotes: 33