Reputation: 32326
I am reading data from S3 bucket using Athena and the data from following file is correct.
# aws s3 ls --human s3://some_bucket/email_backup/email1/
2020-08-17 07:00:12 0 Bytes
2020-08-17 07:01:29 5.0 GiB email_logs_old1.csv.gz
When I change the path to _updated as shown below, I get an error.
# aws s3 ls --human s3://some_bucket/email_backup_updated/email1/
2020-08-22 12:01:36 5.0 GiB email_logs_old1.csv.gz
2020-08-22 11:41:18 5.0 GiB
This is because of the extra file without name in the same location. I have no idea how I managed to upload a file without a name. I will like to know how to repeat it (so that I can avoid it)
Upvotes: 1
Views: 1238
Reputation: 32326
If you add an extra non-breaking space at the end of the destination path, the file will be copied to S3 but with a blank name. for e.g.
aws s3 cp t.txt s3://some_bucket_123/email_backup_updated/email1/
(Note the non-breaking space after email1/ )
\xa0 is actually non-breaking space in Latin1, also chr(160). The non breaking space itself is the name of the file!
Using the same logic, I can remove the "space" file by adding the non-breaking space at the end.
aws s3 rm s3://some_bucket_123/email_backup_updated/email1/
I can also login to console and remove it from User Interface.
Upvotes: 1
Reputation: 35188
All S3 files have a name (in fact the full path is actually the object key, which is the metadata to define your object name).
If you see a blank named file in the path of s3://some_bucket/email_backup_updated/email1/
you have likely created a file named s3://some_bucket/email_backup_updated/email1/
.
As I mentioned earlier S3 objects use key, for this reason the file hierarchy does not exist. You simply are filtering by prefix instead.
You should be able to validate this by performing the following without the trailing slash aws s3 ls --human s3://some_bucket/email_backup_updated/email1
.
Upvotes: 1