Reputation: 507
I have created an AWS Glue crawler which uses a classifier to import CSV files into a data table, and it is working fine.
Issue: every time the crawler runs, it overwrites the old data. I want to keep the previous data and append the new content of the CSV files.
For example, I uploaded a CSV file with 250 records, and when I ran the crawler it populated the table with 250 rows.
Now if I replace that CSV file with different content, it overwrites the old 250 rows and populates the table with the latest data only.
Can anyone please tell me how I can keep the old records and append the new data?
Thank you,
Upvotes: 1
Views: 1415
Reputation: 2658
A Glue crawler doesn't populate a table with rows/records. It simply defines metadata about your data, i.e. it infers the table schema and the location of the files on S3 (or other sources). This means you need to keep both files on S3 if you want to preserve the old records.
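For example, here is a minimal boto3 sketch (the bucket, prefix, and file names are placeholders) of uploading each new CSV under its own key, so that both the old and new files remain under the table's S3 location:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and prefix; the Glue table's location points at this prefix.
bucket = "my-data-bucket"
prefix = "sales/"

# Upload the new batch under a new key instead of replacing the old object,
# so both CSV files stay under the same prefix.
s3.upload_file("new_batch.csv", bucket, prefix + "new_batch.csv")
```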
Note that if you keep the new files in the same "folder" (prefix) on S3 as the old ones, you don't need to re-run the crawler, since the information required to query those files (e.g. with Athena) has already been defined.
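As a rough sketch (the database, table, and results location below are placeholders), you could then query the table with Athena right away; Athena reads every object under the table's S3 location at query time, so the result includes both the old and the new rows:

```python
import boto3

athena = boto3.client("athena")

# Hypothetical database/table names and results bucket.
athena.start_query_execution(
    QueryString="SELECT COUNT(*) FROM my_database.my_table",
    QueryExecutionContext={"Database": "my_database"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
```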
Upvotes: 3