ire

Reputation: 581

Splitting a string based on another variable

I get the desired output with the following code:

row='s3://bucket-name/qwe/2022/02/24/qwe.csv'
new_row = row.split('s3://bucket-name/')[1]

print(new_row)
# Output: qwe/2022/02/24/qwe.csv

I want to achieve this while having the bucket name saved in a variable, like this:

bucket_name="bucket-name"
new_row = row.split('s3://'+bucket_name+'/')[1]

This doesn't work (says invalid syntax).

Is there another way I can define this or will I have to use a different function to split?

Upvotes: 0

Views: 60

Answers (3)

tdelaney

Reputation: 77337

I don't see any advantage to using split when you could just slice the URL to get the part you want.

>>> row='s3://bucket-name/qwe/2022/02/24/qwe.csv'
>>> bucket_name = "bucket-name"
>>> row[len("s3://" + bucket_name + "/"):]
'qwe/2022/02/24/qwe.csv'
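
If you also want to guard against rows that don't start with the expected prefix, the slice can be wrapped in a small check. This is only a sketch; the helper name `strip_bucket_prefix` is made up for illustration and is not from the question.

```python
def strip_bucket_prefix(row, bucket_name):
    """Hypothetical helper: slice off 's3://<bucket_name>/' after checking it is there."""
    prefix = "s3://" + bucket_name + "/"
    if not row.startswith(prefix):
        raise ValueError("row does not start with " + prefix)
    # Everything after the prefix is the part we want
    return row[len(prefix):]

row = 's3://bucket-name/qwe/2022/02/24/qwe.csv'
bucket_name = "bucket-name"
print(strip_bucket_prefix(row, bucket_name))
```

Without the check, a row from a different bucket would be sliced at the same offset and silently give a wrong result.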

But since this is a URL, you will get a more robust solution if you parse it. You can use the parts to verify that you got the string you want, and parsing also deals with other issues such as appended query strings.

from urllib.parse import urlsplit
row='s3://bucket-name/qwe/2022/02/24/qwe.csv'
parts = urlsplit(row)
if parts.scheme != "s3":
    raise ValueError("not s3 bucket")
if parts.netloc != "bucket-name":
    raise ValueError("not my bucket")
print(parts.path[1:])
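
To tie this back to the question, the same checks can take the bucket name from a variable. A minimal sketch, assuming a hypothetical helper called `object_key` (my own name, not part of urllib or any AWS library):

```python
from urllib.parse import urlsplit

def object_key(row, bucket_name):
    """Hypothetical helper: return the path inside the bucket, or raise if the URL doesn't match."""
    parts = urlsplit(row)
    if parts.scheme != "s3":
        raise ValueError("not s3 bucket")
    if parts.netloc != bucket_name:
        raise ValueError("not my bucket")
    # parts.path starts with "/", so drop that first character
    return parts.path[1:]

bucket_name = "bucket-name"
print(object_key('s3://bucket-name/qwe/2022/02/24/qwe.csv', bucket_name))
```

Because urlsplit puts any query string in `parts.query`, a trailing `?...` on the URL should not end up in the returned path.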

Upvotes: 1

Bhargav

Reputation: 4062

Oops, you missed the quotes:

bucket_name='bucket-name'
new_row = row.split('s3://'+bucket_name+'/')[1]

Output:

'qwe/2022/02/24/qwe.csv'


Upvotes: 1

Daniele Scalco

Reputation: 191

You can also do it like this, with an f-string:

row='s3://bucket-name/qwe/2022/02/24/qwe.csv'
bucket_name='bucket-name'
new_row = row.split(f"s3://{bucket_name}/")[1]
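
One thing to keep in mind with `split(...)[1]` is that it raises an IndexError when the prefix isn't in the row. A possible alternative, sketched here with the same variable names, is `str.partition`, which lets you check whether the prefix was found:

```python
row = 's3://bucket-name/qwe/2022/02/24/qwe.csv'
bucket_name = 'bucket-name'

# partition returns (text before, separator, text after);
# the separator slot is an empty string when the prefix isn't found
before, sep, new_row = row.partition(f"s3://{bucket_name}/")
if not sep:
    raise ValueError("prefix not found in row")
print(new_row)
```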

Upvotes: 1
