Reputation: 723
I am using the Serverless Data Lake Framework (SDLF) to ingest files into a data lake on AWS S3. It needs some configuration to move a file through the stages of the S3 repository. The first step moves a file from the S3/Landing stage to the S3/Raw stage, and part of the configuration for that step is the file source_mappings.json. Here is an example:
[
  {
    "SourceId": "ABC123",
    "Target": {
      "Location": {
        "Subdirectory": "domainxxx/systemyyy/filezzz/file_XX%Y%m%d"
      }
    },
    "Source": {
      "Location": {
        "IncludePatterns": ["systemyyy/file_XX*"],
        "DatePattern": "file_%Y%m%d"
      }
    },
    "System": "systemyyy"
  }
]
That works because the files to ingest normally have a date in the file name. Now I have files with no date in the name; instead they carry a consecutive number, for example "file_1084.dat", "file_1085.dat", ..., "file_1090.dat".
Has anyone handled this before? I tried several other patterns in place of the date, such as //d{4}, [0-9]{4}, or just *, but nothing seems to work.
Upvotes: 0
Views: 22
Reputation: 723
As a workaround: since the number in the file name has exactly four digits, using %Y in the pattern works.
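To illustrate why this can work, here is a minimal Python sketch. It assumes the date pattern is evaluated with strptime-style format codes; I have not confirmed how SDLF parses DatePattern internally, and the helper name extract_sequence and the pattern "file_%Y" are hypothetical, used only for this illustration. With those codes, %Y accepts exactly four digits, so a four-digit sequence number is read as a year.

```python
from datetime import datetime


def extract_sequence(file_stem: str, pattern: str = "file_%Y") -> int:
    """Read a four-digit sequence number by parsing it as a year.

    Hypothetical helper: it only mimics strptime-style matching and is
    not part of SDLF.
    """
    return datetime.strptime(file_stem, pattern).year


for name in ["file_1084.dat", "file_1085.dat", "file_1090.dat"]:
    stem = name.rsplit(".", 1)[0]  # drop the .dat extension
    print(name, "->", extract_sequence(stem))
```

Under the same assumption, the trick has limits: %Y consumes only four digits, so a name like file_10000 raises ValueError once the counter grows to five digits, and names with fewer than four digits do not match either.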
Upvotes: 0