Is there a way to convert ANSI (Windows-only) encoded files to UTF-8 using Python?

Question

I am opening a new question because every answer I can find uses code that only runs on Windows.
Here is the situation.
Every month I receive new files for work that I need to convert from an ANSI encoding to UTF-8. There are enough files that I automated the conversion with a Python script. Until recently I was on Windows and everything worked fine. After switching to a Mac, I realized that ANSI is a Windows-only encoding type, and now my script no longer works.
Question: Is there a way to convert ANSI-encoded CSVs to UTF-8 while on a Mac?

Here is the code that WAS working on my Windows machine.

import sys
import os

if len(sys.argv) != 2:
  print("Converts the contents of a folder to UTF-8 from ANSI.")
  print("USAGE:\n"
        "    python ANSI_to_UTF8.py <folder>\n"
        "    If targeting a nested folder, make sure to use an escaped \\. ie: parent\\\\child")
  sys.exit()

from_encoding = "ANSI"
to_encoding = "UTF-8"
list_of_files = []
current_dir = os.getcwd()
folder = sys.argv[1]
suffix = "_utf8"
target_folder = folder + "_utf8"


try:
  os.mkdir(target_folder)
except FileExistsError:
  print("Target folder already exists.")
except:
  print("Error making directory!")

for root, dirs, files in os.walk(folder):
    for file in files:
        list_of_files.append(os.path.join(root,file))


for file in list_of_files:
  print(f"Converting {file}")

  original_path = file

  filename = file.split("\\")[-1].split(".")[0]
  extension = file.split("\\")[-1].split(".")[1]
  folder = "\\".join(original_path.split("\\")[0:-1])
  new_filename = filename + "." + extension
  new_path = os.path.join(target_folder, new_filename)

  f= open(original_path, 'r', encoding=from_encoding)
  content= f.read()
  f.close()
  f= open(new_path, 'w', encoding=to_encoding)
  f.write(content)
  f.close()

print(f"Finished converting {len(list_of_files)} files to {target_folder}")
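
Separately from the encoding problem, the script above builds output names by splitting paths on a backslash, which only matches Windows-style paths. A minimal sketch of a separator-independent alternative using `pathlib` follows. The helper name `output_path` and the example paths are hypothetical and not part of the original script.

```python
from pathlib import Path


def output_path(source, target_folder):
    """Return the destination path for `source` inside `target_folder`.

    Path.name keeps the full file name (including names that contain
    several dots), regardless of which separator the OS uses.
    """
    return Path(target_folder) / Path(source).name


# Hypothetical example paths, for illustration only.
destination = output_path("reports/2023/sales.csv", "reports_utf8")
print(destination)
```

Unlike `split(".")[0]` and `split(".")[1]`, this keeps a name such as `sales.2023.csv` intact instead of dropping everything after the second dot.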

It seems that no matter what approach I take, my Mac does not recognize the ANSI encoding type. Any help would be much appreciated. Thank you.

Edit 1: Regarding the question "Convert from ANSI to UTF-8":
That question has two answers and neither works for me. With answer one, I get a UTF-8 decoding error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 25101: invalid continuation byte
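
The error above can be reproduced in a few lines, which may help show what is going on. This is a sketch under an assumption: that the files use Windows-1252 (`cp1252`), a common "ANSI" code page on Western European Windows installs. The actual code page of the files is not stated here, and the sample bytes are made up for illustration.

```python
# Hypothetical sample: "café au lait" as a single-byte "ANSI" file would store it.
raw = b"caf\xe9 au lait"

# Reading the bytes as UTF-8 fails: 0xe9 announces a multi-byte sequence,
# but the byte after it (a space) is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the assumed source code page succeeds on any OS.
text = raw.decode("cp1252")

# Re-encoding as UTF-8 turns the single byte 0xe9 into the pair 0xc3 0xa9.
utf8_bytes = text.encode("utf-8")
```

In other words, the file is not UTF-8 yet, so it has to be read with its original single-byte encoding before it can be written out as UTF-8.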

With answer two, I believe the root cause is that I am on a Mac, and this OS does not support the mbcs encoding:

LookupError: unknown encoding: mbcs
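
Python's `mbcs` codec is only available on Windows, where it maps to whatever the system's ANSI code page is. One possible workaround is to name the code page explicitly, since codecs such as `cp1252` ship with Python on every platform. The sketch below assumes the files are Windows-1252; that is a guess, and a different code page (for example `cp1250` or `cp1251`) would need to be substituted if the files came from a differently configured machine. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: the "ANSI" files are Windows-1252. Change this if the files
# were produced on a Windows machine using a different code page.
SOURCE_ENCODING = "cp1252"


def ansi_bytes_to_utf8(data, source_encoding=SOURCE_ENCODING):
    """Decode bytes from the assumed source encoding and re-encode as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_file(src, dst, source_encoding=SOURCE_ENCODING):
    """Convert one file to UTF-8.

    Working on raw bytes leaves the line endings (e.g. CRLF in CSVs
    exported on Windows) exactly as they were.
    """
    Path(dst).write_bytes(ansi_bytes_to_utf8(Path(src).read_bytes(), source_encoding))
```

Calling `convert_file(original_path, new_path)` inside the existing loop would replace the two `open()` calls. If a file contains bytes that are undefined in the chosen code page, decoding raises `UnicodeDecodeError`, which is a hint that the assumed encoding is wrong.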
