Reputation: 131
I have a txt file containing pipe | delimited values, but unfortunately, I have encountered some errors , the csv DictReader assumed that the comma in a column is actually n values for n columns. I have tried the
skipinitialspace = True
parameter, but it didn't work.
My txt file, "test.csv";
Name|Company|Age|Gender|Fav Color|Fav Food
John|Custom Inc|23|Male|Red, Blue, Yellow|Chocolate
Mary|Troopers , Inc|25|Female|Black|Chocolate
Riki|Dorw Technology|33|Male|White, Yellow, Black|Ice cream, pudding
My code:
import csv
name= []
company = []
age = []
gender = []
fav_color = []
fav_food = []
with open("test.csv", "rU") as f:
reader = csv.Dictreader(f, delimiter = '|')
for row in reader:
name.append(row['Name'])
company.append(row['Company'])
age.append(row['Age'])
gender.append(row['Gender'])
fav_color.append(row['Fav Color'])
fav_food.append(row['Fav Food'])
RESULT_ADD = [[] for _ in range(len(name))]
for attr in range(len(name)):
RESULT_ADD[attr].append(name[attr])
RESULT_ADD[attr].append(company[attr])
RESULT_ADD[attr].append(age[attr])
RESULT_ADD[attr].append(gender[attr])
RESULT_ADD[attr].append(fav_color[attr])
RESULT_ADD[attr].append(fav_food[attr])
with open("result.csv", "w") as newfile:
header = ['NAME_OF_CEO', 'NAME_OF_COMPANY' , 'AGE', 'GENDER', 'FAV_COLOR', 'FAV_FOOD']
wr.csv.writer(newfile, delimiter= ';', quoting = csv.QUOTE_MINIMAL)
wr.writerow(header)
for item in RESULT_ADD:
wr.writerow(item)
What did I get;
NAME_OF_CEO;NAME_OF_COMPANY;AGE;GENDER;FAV_COLOR;FAV_FOOD
John;Custom Inc;23;Male;Red,Blue,Yellow;Chocolate
Mary;T,r,o,o,p,e,r,s,,, ,I,n,c;25;Female;Black;Chocolate
Riki;Dorw Technology;33;Male;White,Yellow,Black;Ice cream,pudding
The first problem here is;
For John, FAV_COLOR contains "Red", while FAV_FOOD contains "Blue", and the next nameless column contains "Yellow" and same as "Banana", located to the next empty nameless column.
For Mary, T,r,o,o,p,e,r,s,,, ,I,n,c , when opening in LibreOffice , the word "Troopers" was located in the NAME_OF_THE_COMPANY but "Inc" was located in the AGE column, then the Age 25 was located in the next column, FAV_COLOR, etc,
For Riki, same problems as John
Upvotes: 0
Views: 774
Reputation: 126
You can try using pandas, as dataframes are very useful to work with tabular data:
import pandas as pd
df = pd.read_csv("test.csv", sep = "|")
df.to_csv("result.csv", sep=";")
In this way you will not need to maintain and populate individual lists for each attribute. Once you get accustomed to dataframes, they are great.
About importing in LibreOffice, I haven't tested this but according to the docs, https://help.libreoffice.org/Calc/Importing_and_Exporting_CSV_Files:
If the field or cell contains a comma, the field or cell must be enclosed by single quotes (') or double quotes (").
I hope that helps!
Upvotes: 1