ftplib encoding error (latin-1 vs. utf-8): 'utf-8' codec can't decode byte 0xf1 in position 132: invalid continuation byte

I have a function (code attached below) that should download the most recent day's files and save them in Azure Blob.

The problem is that the encoding of these files is 'latin-1' as they may contain characters like "ñ".

I use the 'ftplib' library to work over FTP. If I don't set

ftp.encoding = 'latin-1'

I get an error when listing the files, on the line

files = ftp.nlst()

However, with this configuration I get the following error

'utf-8' codec can't decode byte 0xf1 in position 132: invalid continuation byte

in the line

ftp.retrbinary('RETR ' + original_file_name, audio_buffer.write)
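For reference, the byte in the message, 0xf1, is "ñ" in latin-1 but is not valid on its own in UTF-8. A minimal sketch that reproduces the same decoder message outside of FTP (the file name is hypothetical, chosen only for illustration):

```python
# Hypothetical file name, used only for illustration.
raw_name = "ni\u00f1o.wav".encode("latin-1")  # the "ñ" becomes the single byte 0xf1


def try_decode(data, encoding):
    """Return the decoded text, or the decoder's error reason if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.reason


print(try_decode(raw_name, "latin-1"))  # decodes cleanly
print(try_decode(raw_name, "utf-8"))    # invalid continuation byte
```

So whatever raises the error is decoding latin-1 bytes as UTF-8 somewhere, even though ftp.encoding has been set.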

I need the retrbinary call

ftp.retrbinary('RETR ' + original_file_name, audio_buffer.write)

to handle latin-1 encoded files, or some other way to download the files and save them in Azure.

from ftplib import FTP
from io import BytesIO

import pandas as pd
from azure.storage.blob import BlobServiceClient


# is_stereo() and convert_file_to_mono() are helper functions defined elsewhere (not shown).
def download_audios_from_ftp_to_azure_blob(ftp_details, connection_name, container_name):
    ftp = FTP(ftp_details['host'])
    ftp.login(ftp_details['username'], ftp_details['password'])
    ftp.set_pasv(True)
    ftp.encoding = 'latin-1'  # without this, ftp.nlst() fails
    ftp.cwd(ftp_details['remote_directory'])

    blob_service_client = BlobServiceClient.from_connection_string(connection_name)
    container_client = blob_service_client.get_container_client(container_name)

    files = ftp.nlst()

    df_files = pd.DataFrame(files, columns=['file_name'])

    for _, row in df_files.iterrows():
        original_file_name = row["file_name"]
        with BytesIO() as audio_buffer:
            # This is the call that raises the 'utf-8' codec error.
            ftp.retrbinary('RETR ' + original_file_name, audio_buffer.write)
            audio_buffer.seek(0)
            if is_stereo(audio_buffer):
                audio_buffer = convert_file_to_mono(audio_buffer)

            # Upload inside the with block, while the buffer is still open.
            blob_client = container_client.get_blob_client(blob=original_file_name)
            blob_client.upload_blob(audio_buffer, overwrite=True)

    ftp.quit()

I have tried using UTF-8 encoding but I get an error when listing the files.

I have also tried encoding and decoding the file names, but once a name has been re-encoded, the FTP server no longer finds the file to download.

I have also tried the Paramiko library, but the server is plain FTP, not SFTP.
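A possible explanation, offered as a sketch rather than a confirmed diagnosis: retrbinary hands the file contents to the callback as raw bytes and never decodes them, so the error more likely comes from a server reply on the control connection that echoes the file name. In Python 3.9 and later, ftplib creates its reply reader at connect time using the encoding in effect at that moment (UTF-8 by default), so assigning ftp.encoding after FTP(host) has already connected may be too late for those replies. The sketch below sets the encoding before connecting. The names connect_latin1 and download_to_buffer are hypothetical helpers, and the result should be checked against the actual server.

```python
from ftplib import FTP
from io import BytesIO


def connect_latin1(host, username, password, encoding="latin-1"):
    """Open an FTP session whose replies are decoded with `encoding` from the start."""
    # No host is passed here, so nothing is connected yet (Python 3.9+ keyword).
    ftp = FTP(encoding=encoding)
    # The reader for server replies is created during connect(), using ftp.encoding.
    ftp.connect(host)
    ftp.login(username, password)
    ftp.set_pasv(True)
    return ftp


def download_to_buffer(ftp, file_name):
    """Download one file as raw bytes; the file contents are never decoded."""
    buffer = BytesIO()
    ftp.retrbinary("RETR " + file_name, buffer.write)
    buffer.seek(0)
    return buffer
```

With this, the function above would call connect_latin1(ftp_details['host'], ftp_details['username'], ftp_details['password']) in place of FTP(...), login(...) and the later ftp.encoding assignment, and leave the rest unchanged.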
