Reputation: 27
I have the following Python code:
    import os
    from os import listdir

    def find_csv_filenames(path_to_dir, suffix=".csv"):
        filenames = listdir(path_to_dir)
        return [filename for filename in filenames if filename.endswith(suffix)]

    # I always get the error on the line below
    filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv')
    for name in filenames:
        print name
I get the error on this line:
    filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv')
Error message: `TabError: inconsistent use of tabs and spaces in indentation`
What I need: I want to read all the CSV files and convert them from ANSI encoding to UTF-8, but the code above only collects the name of each CSV file. What is wrong with it?
Upvotes: 0
Views: 7506
Reputation: 4462
The code below finds the CSV files in a directory and prints each line of them converted from the source 8-bit encoding to UTF-8:
    import os
    from os import listdir

    def find_csv_filenames(path_to_dir, suffix=".csv"):
        path_to_dir = os.path.normpath(path_to_dir)
        filenames = listdir(path_to_dir)
        # Skip directories whose name happens to end with the suffix
        fp = lambda f: not os.path.isdir(os.path.join(path_to_dir, f)) and f.endswith(suffix)
        return [os.path.join(path_to_dir, fname) for fname in filenames if fp(fname)]

    def convert_files(files, ascii, to="utf-8"):
        for name in files:
            print "Convert {0} from {1} to {2}".format(name, ascii, to)
            with open(name) as f:
                for line in f:
                    # Decode from the source encoding, then re-encode to the target one
                    print unicode(line, ascii).encode(to)

    csv_files = find_csv_filenames('/path/to/csv/dir', ".csv")
    convert_files(csv_files, "cp866")  # cp866 is my source encoding. Replace it with yours.
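
The code above only prints the converted lines. If the goal is to end up with UTF-8 files on disk, a minimal sketch could look like the following. The function name `convert_file`, the separate destination path, and the `cp1252` default (a common meaning of "ANSI" on Western European Windows systems) are assumptions for illustration, not part of the original answer. `io.open` is used because it accepts `encoding` and `newline` on both Python 2.6+ and Python 3.

```python
import io


def convert_file(src_path, dst_path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Copy src_path to dst_path, re-encoding the text on the way.

    src_encoding defaults to cp1252, which is only an assumption;
    pass the code page your files were really saved with.
    """
    # newline="" disables newline translation, so existing line endings
    # (for example \r\n) are written back unchanged.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        with io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
            for line in src:
                dst.write(line)
```

Writing to a separate destination file, rather than overwriting the source, keeps the original intact in case the assumed source encoding turns out to be wrong.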
Upvotes: 1
Reputation: 43096
Your code only lists the CSV files; it doesn't do anything with them. If you need to read them, you can use the csv module. If you need to handle the encoding, you can do something like this:
    import csv, codecs

    def safe_csv_reader(the_file, encoding, dialect=csv.excel, **kwargs):
        csv_reader = csv.reader(the_file, dialect=dialect, **kwargs)
        for row in csv_reader:
            # Decode every cell from the file's encoding to unicode
            yield [codecs.decode(cell, encoding) for cell in row]

    # csv_file must be an already opened file object
    reader = safe_csv_reader(csv_file, "utf-8", delimiter=',')
    for row in reader:
        print row
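
For comparison, here is a rough sketch of the same idea assuming Python 3, where `open` decodes the file itself and `csv.reader` works directly on text. The helper name `read_csv_rows` and the `cp1252` default are hypothetical; use whatever encoding your files really have.

```python
import csv


def read_csv_rows(path, encoding="cp1252", **kwargs):
    """Yield each row of a CSV file as a list of already decoded strings.

    Assumes Python 3. The cp1252 default is only a guess at what
    "ANSI" means on the machine that produced the file.
    """
    # The csv module documentation recommends newline="" so that the
    # reader handles line endings inside quoted fields itself.
    with open(path, "r", encoding=encoding, newline="") as f:
        for row in csv.reader(f, **kwargs):
            yield row
```

Extra keyword arguments such as `delimiter=';'` are passed straight through to `csv.reader`.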
Upvotes: 0
Reputation: 364
Refer to the documentation: http://docs.python.org/2/howto/unicode.html
If you have a unicode string, say it is stored as s, that you want to encode in a specific encoding, use s.encode(), for example s.encode("utf-8").
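
As a small illustration of that round trip, the sketch below decodes bytes from an assumed source encoding and re-encodes them as UTF-8. The sample bytes and the choice of `cp1252` are made up for the example.

```python
# Bytes as they might be stored in a cp1252 ("ANSI") file: "café"
raw = b"caf\xe9"

# bytes -> unicode text, using the encoding the file was saved with
text = raw.decode("cp1252")

# unicode text -> UTF-8 bytes
utf8 = text.encode("utf-8")  # b"caf\xc3\xa9"
```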
Upvotes: 0