babel113

Reputation: 59

Reading files from FTP server to DataFrame in Python

I would like to load a file from an FTP server into a pandas DataFrame without downloading it to disk first. The script below works, but it saves the file to disk before reading it. Can this be done with ftplib alone, or is there another way to solve this?

from ftplib import FTP
import os
import pandas as pd
ftps = FTP('gssc.esa.int')
ftps.login()
ftps.cwd('/gnss/data/daily/2019/001/')
filename = '19001.V3status'
local_filename = os.path.join(r"C:/path/where/download/files", filename) #example
with open(local_filename, "wb") as lf:  # close the file so everything is flushed before reading
    ftps.retrbinary('RETR ' + filename, lf.write)
file = local_filename #example
dataV3status = pd.read_fwf(file,
                           names = ('Mon_ID', 'Full_Mon_ID', 'RNX_Ver.', 'Dly(H)',
                                    'Dly(M)', 'V', 'Receiver_Type', 'Antenna_Type',
                                    'Mkr_Name', 'Marker_Number', 'Typ', 'G', 'R',
                                    'E', 'C', 'J', 'S', 'I', 'MD5_Checksum'), 
                           widths = [5,9,5,5,6,2,20,22,5,10,3,3,2,2,2,2,2,2,32],
                           header = None,
                           skiprows = 5,
                           skipfooter = 16)

Upvotes: 3

Views: 2464

Answers (2)

Martin Prikryl

Reputation: 202272

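If you want to stick with ftplib, you can download the file into an in-memory buffer instead of a local file (ftps is the connected FTP object from the question):

from io import BytesIO

flo = BytesIO()
ftps.retrbinary('RETR ' + filename, flo.write)  # write each received chunk into the buffer
flo.seek(0)  # rewind so pandas reads from the start
pd.read_fwf(flo, ...)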
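
The buffer approach above can be wrapped in a small helper. This is only a sketch: the names read_fwf_from_ftp and load_status_table are hypothetical, the host, directory and file name are taken from the question, and the keyword arguments are simply forwarded to pd.read_fwf.

```python
from ftplib import FTP
from io import BytesIO

import pandas as pd


def read_fwf_from_ftp(ftp, filename, **read_fwf_kwargs):
    """Download `filename` over an open FTP connection into memory and parse it.

    `ftp` only needs a retrbinary(cmd, callback) method, as ftplib.FTP has.
    """
    buffer = BytesIO()
    # retrbinary calls buffer.write once for every chunk it receives
    ftp.retrbinary('RETR ' + filename, buffer.write)
    buffer.seek(0)  # rewind so pandas reads from the beginning
    return pd.read_fwf(buffer, **read_fwf_kwargs)


def load_status_table():
    """Example call using the server and file from the question (needs network)."""
    with FTP('gssc.esa.int') as ftp:
        ftp.login()  # anonymous login, as in the question
        ftp.cwd('/gnss/data/daily/2019/001/')
        return read_fwf_from_ftp(ftp, '19001.V3status',
                                 header=None, skiprows=5, skipfooter=16)
```

Nothing is written to disk here. The whole file is held in memory, which should be fine for small status files but may not suit very large downloads. The names and widths arguments from the question can be passed through in the same way.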

However, the pandas.read_fwf documentation says that it accepts FTP URLs directly, so this should work too:

pd.read_fwf("ftp://gssc.esa.int//gnss/data/daily/2019/001/19001.V3status", ...)

Upvotes: 4

Aravind tronix

Reputation: 51

Yes, you can stream the file using the tentaclio package:

import pandas as pd
import tentaclio
with tentaclio.open("ftp://user:password@host/path/.../19001.V3status") as reader:
    df = pd.read_fwf(reader)

Upvotes: 1
