Lee

Reputation: 35

Inconsistency in GET BUCKET request

I'm noticing differing results when listing the contents of folders within the same bucket. Sometimes the folder itself is listed in the 'Contents' section (as a Key element), and other times it is not. See the following two outputs:

This output does not include the prefixed folder:

<?xml version='1.0' encoding='UTF-8'?>
<ListBucketResult xmlns='http://doc.s3.amazonaws.com/2006-03-01'>
<Name>
test22</Name>                            <=== Bucket
<Prefix>
16-Jul-2013</Prefix>                     <=== Prefixed folder
<Marker>
</Marker>
<IsTruncated>
false</IsTruncated>
<Contents>
<Key>
16-Jul-2013/0371.txt</Key>               <=== ONLY OBJECTS LISTED
<Generation>
1374016944689000</Generation>
<MetaGeneration>
1</MetaGeneration>
<LastModified>
2013-07-16T23:22:24.664Z</LastModified>
<ETag>
"5d858b3ddbf51fb5ec4501799e637b47"</ETag>
<Size>
96712</Size>
<Owner>
<ID>
00b4903a97d860d9d5a7d98a1c6385dc6146049499b88ceae217eaee7a0b2ff4</ID>
</Owner>
</Contents>

But this output does:

<?xml version='1.0' encoding='UTF-8'?>
<ListBucketResult xmlns='http://doc.s3.amazonaws.com/2006-03-01'>
<Name>
test22</Name>                            <=== Bucket
<Prefix>
22-Aug-2013</Prefix>                     <=== Prefixed folder
<Marker>
</Marker>
<IsTruncated>
false</IsTruncated>
<Contents>
<Key>
22-Aug-2013/</Key>                       <=== FOLDER INCLUDED IN LIST
<Generation>
1377178774399000</Generation>
<MetaGeneration>
1</MetaGeneration>
<LastModified>
2013-08-22T13:39:34.337Z</LastModified>
<ETag>
"d41d8cd98f00b204e9800998ecf8427e"</ETag>
<Size>
0</Size>
<Owner>
<ID>
00b4903a97d0b7e1f638009476bba4c5d964f744e50c23c3681357a290cb7b16</ID>
</Owner>
</Contents>

Both requests were made with the following code (note that I did not use an authenticated session; the items are public-readable):

require 'net/http'
require 'uri'

# The prefix was changed for each case.
uri = URI('https://storage.googleapis.com/test22?prefix=16-Jul-2013')
req3 = Net::HTTP::Get.new(uri.request_uri)

#req3['Authorization'] = "#{token['token_type']} #{token['access_token']}"
req3['Content-Length'] = 0
req3['Content-Type'] = 'text/plain - GB'
req3['Date'] = Time.now.strftime("%a, %d %b %Y %H:%M:%S %Z")
req3['Host'] = 'storage.googleapis.com'
req3['x-goog-api-version'] = 2
req3['x-goog-project-id'] = '###############'   # project ID redacted

Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') { |http|
   resp3 = http.request(req3)
   # Break the response after every '>' so each element is easier to read.
   puts resp3.body.gsub(/>/, ">\n")
}

Why the difference? Is there something basic I'm missing? Thanks in advance...

-Lee

Upvotes: 1

Views: 171

Answers (1)

Travis Hobrla

Reputation: 5511

When you create a folder using the Cloud Console, it creates a placeholder object with the name of the folder + '/' to represent the empty folder. Even if you later add objects to the folder, the placeholder remains.

On the other hand, if you directly upload an object with a '/' in the name using the API (for example, an upload to 'folder/object.txt'), no placeholder object is created, because the presence of the object is enough to infer the existence of the folder. If you then delete 'folder/object.txt', the folder will no longer appear in the root listing of the Cloud Console, as there is no placeholder object to keep it there.
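As a rough illustration of how a folder can be inferred from object names alone, here is a minimal Ruby sketch. The helper name inferred_folders and the sample key 'readme.txt' are made up for this example; this is not how the Cloud Console is actually implemented, only the general idea.

```ruby
# Hypothetical helper: derive top-level "folders" from a flat list of object names.
# Any name containing a '/' implies a folder named by everything up to the first '/'.
def inferred_folders(keys)
  keys.select { |key| key.include?('/') }
      .map { |key| key[0..key.index('/')] }
      .uniq
end

keys = ['16-Jul-2013/0371.txt', '22-Aug-2013/', 'readme.txt']
p inferred_folders(keys)
```

Both folders show up here, but only '22-Aug-2013/' would survive if every object under it were deleted, since it is the only one backed by an object of its own.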

To answer your question explicitly, that means that '16-Jul-2013/0371.txt' was created via a direct upload to '16-Jul-2013/0371.txt'. By contrast, '22-Aug-2013/' was created by the New Folder button in the Cloud Console. In the latter case a placeholder object is created, which is why the listing shows a '22-Aug-2013/' key with a Size of 0; in the former case, it is not.

All of this is because the GCS namespace is flat, not hierarchical. The folder abstraction is there to help you visualize things hierarchically, but it has some limitations.
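If you would rather not see the placeholder in your own listings, one option is to filter it out on the client side. The sketch below assumes that a zero-byte object whose name ends in '/' is a folder placeholder, which is a heuristic rather than anything the API guarantees. The Entry struct and the 'report.txt' object are hypothetical and simply mirror the Key and Size elements of the XML listing.

```ruby
# Hypothetical record mirroring the <Key> and <Size> elements of a listing entry.
Entry = Struct.new(:key, :size)

# Heuristic (an assumption): a zero-byte object named "something/" is a placeholder.
def placeholder?(entry)
  entry.key.end_with?('/') && entry.size.zero?
end

# Return only the entries that are not folder placeholders.
def real_objects(entries)
  entries.reject { |entry| placeholder?(entry) }
end

listing = [
  Entry.new('22-Aug-2013/', 0),              # created by the New Folder button
  Entry.new('22-Aug-2013/report.txt', 1024)  # hypothetical uploaded object
]
real_objects(listing).each { |entry| puts entry.key }
```

Filtering like this makes the two listings in the question look the same, at the cost of also hiding any zero-byte object that someone deliberately named with a trailing '/'.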

Upvotes: 2
