Reputation: 61
So I am working on a project that requires specific data from the COSMIC-2 satellite mission.
The data is stored in compressed tar.gz archives, and there are thousands of files, so I don't want to download them all and then process them one by one, due to time and storage constraints.
Instead, I would like to find an alternative approach that lets me read data from the files directly, without having to download them first.
Maybe requests or urllib can do that?
Currently I tried:

import requests
import tarfile

url = "https://sitename.com/data.tar.gz"
response = requests.get(url, stream=True)
with tarfile.open(response, "r:gz") as f:
    f.extractall()
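(For context, a minimal sketch of the streamed-extraction idea, using only the standard library; the URL is the placeholder from above. tarfile's "r|gz" mode reads sequentially and never seeks, so it can consume the HTTP response directly without the archive ever being saved to disk:)

```python
import tarfile
import urllib.request


def extract_tar_stream(fileobj, path="."):
    """Extract a gzipped tar archive from any readable file object.

    Mode "r|gz" is tarfile's streaming mode: it decompresses the
    archive sequentially and never seeks, so a socket-backed HTTP
    response works directly, with no temporary file on disk.
    """
    with tarfile.open(fileobj=fileobj, mode="r|gz") as tar:
        tar.extractall(path)


if __name__ == "__main__":
    url = "https://sitename.com/data.tar.gz"  # placeholder URL from the question
    with urllib.request.urlopen(url) as response:
        extract_tar_stream(response)
```

(With requests instead of urllib, you would pass `response.raw` from `requests.get(url, stream=True)` as the file object.)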
Upvotes: 0
Views: 841
Reputation: 186
You can read data from a tar.gz file online, without saving it locally, by using Python's urllib module to fetch the file and the tarfile module to extract its contents.
Here's an example of how you can do this:
import urllib.request
import tarfile
import io

url = "http://example.com/your_file.tar.gz"  # Replace with the actual URL of the tar.gz file

# Fetch the tar.gz file
response = urllib.request.urlopen(url)
tar_bytes = io.BytesIO(response.read())

# Extract the contents
with tarfile.open(fileobj=tar_bytes, mode="r:gz") as tar:
    for member in tar.getmembers():
        f = tar.extractfile(member)
        if f is not None:
            content = f.read()
            print(content.decode("utf-8"))
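Note that this buffers the entire archive in memory via `response.read()`. If the archives are too large for that, the same idea works with tarfile's streaming mode ("r|gz"), which decompresses members as bytes arrive from the socket; a sketch (the URL is a placeholder):

```python
import tarfile
import urllib.request


def iter_tar_members(fileobj):
    """Yield (name, bytes) for each regular file in a gzipped tar stream.

    "r|gz" reads strictly sequentially, so fileobj only needs a
    .read() method and the archive is never fully buffered in memory.
    """
    with tarfile.open(fileobj=fileobj, mode="r|gz") as tar:
        for member in tar:
            extracted = tar.extractfile(member)
            if extracted is not None:
                yield member.name, extracted.read()


if __name__ == "__main__":
    url = "http://example.com/your_file.tar.gz"  # placeholder URL
    with urllib.request.urlopen(url) as response:
        for name, data in iter_tar_members(response):
            print(name, len(data))
```

One caveat of stream mode: each member's data must be read while the iterator is positioned on it, which the generator above does before yielding.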
Upvotes: 2
Reputation: 14
I looked up a few options and found this: https://extract.me/
You can use the URL directly, so you could right-click the file link (for FTP), copy it, and paste it there to check.
Hope it helps.
Upvotes: 0