Su1tan

Reputation: 49

How to handle CSV files in the Bronze layer without the extra layer

My raw data is in CSV format. If I store it in the Bronze layer as Delta tables, I end up with four layers: Raw + Bronze + Silver + Gold. Which approach should I consider?

Upvotes: 0

Views: 449

Answers (1)

Chris

Reputation: 559

This is a fairly open question. On retaining the "raw" data as CSV: I would normally recommend it, because storing these files is usually cheap relative to the value of being able to re-process them if there are problems, or of having them available for data audit and traceability.

I would normally compress the raw files after processing, perhaps bundling them into a tarball, and then move the archive to colder, cheaper storage.
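As a rough illustration of the compress-and-tarball step, here is a minimal sketch using only the Python standard library. The function name `archive_processed_csvs` and its parameters are hypothetical, and it assumes the CSVs sit in a single local directory and have already been loaded into Bronze. Moving the resulting archive to colder storage is not shown, since that depends on your storage provider.

```python
import tarfile
from pathlib import Path


def archive_processed_csvs(raw_dir, archive_path, delete_originals=False):
    """Bundle every *.csv in raw_dir into one gzip-compressed tarball.

    Hypothetical helper: returns the sorted list of archived file names.
    """
    csv_files = sorted(Path(raw_dir).glob("*.csv"))
    if not csv_files:
        return []

    # "w:gz" writes a gzip-compressed tar archive.
    with tarfile.open(archive_path, "w:gz") as tar:
        for csv_file in csv_files:
            # arcname stores only the file name, not the full local path.
            tar.add(csv_file, arcname=csv_file.name)

    # Re-open the archive and check its contents before deleting anything.
    with tarfile.open(archive_path, "r:gz") as tar:
        archived = sorted(tar.getnames())
    expected = sorted(f.name for f in csv_files)
    if archived != expected:
        raise RuntimeError("archive contents do not match the source files")

    if delete_originals:
        for csv_file in csv_files:
            csv_file.unlink()
    return archived
```

You might call it as `archive_processed_csvs("landing/2024-01", "landing/2024-01.tar.gz", delete_originals=True)` once the Bronze load for that batch has succeeded (the paths here are made up). Keeping `delete_originals` off by default is a deliberate choice so nothing is removed unless you ask for it.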

Upvotes: 1
