Reputation: 41
I have this CSV (file):
uid,aabreo
objectClass,top
objectClass,inetOrgPerson
objectClass,UnabPerson
cn,Angela Abreo Garcia
sn,Abreo Garcia
administrativo,no
AdmPortales,no
AdmSisInformacion,no
uid,aabreo265
objectClass,top
objectClass,inetOrgPerson
objectClass,UnabPerson
cn,ANDRES FELIPE ABREO SERRANO
sn,ABREO SERRANO
administrativo,no
uid,aabreo602
objectClass,top
objectClass,inetOrgPerson
objectClass,UnabPerson
cn,ANDRES FELIPE ABREO SERRANO
sn,ABREO SERRANO
administrativo,no
uid,aabril
objectClass,top
objectClass,inetOrgPerson
objectClass,UnabPerson
cn,ALEYDA SMITH ABRIL RINCON
sn,ABRIL RINCON
administrativo,no
I want to write another CSV in which the values in the first column become the headers and the values in the second column become the row values.
Here is my code:
import csv
import pandas as pd
f= open(r"C:\Users\USER\Downloads\LDAP_1.csv",encoding="utf-8")
print (f.read())
datos = pd.read_csv(r"C:\Users\USER\Downloads\LDAP_1.csv",header=0)
#print(datos)
dict_data={}
with open(r"C:\Users\USER\Downloads\LDAP_1.csv",encoding="utf-8") as file:
    dict_data= dict(filter(None,csv.reader(file)))
print(dict_data)
#print(dict_data.values())
#print(dict_data.keys())
#csv_columns=['uid','objectClass','objectClass','objectClass','cn','sn']
csv_file = "Names.csv"
try:
    with open(csv_file, 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=dict_data.keys())
        writer.writeheader()
        for data in dict_data.values():
            writer.writerow(dict_data)
except IOError:
    print("I/O error")
But the result contains only the last values, repeated on every row, and it adds blank space, so I don't know what I am doing wrong.
uid,objectClass,cn,sn,administrativo,AdmPortales,AdmSisInformacion
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
aabril,UnabPerson,ALEYDA SMITH ABRIL RINCON,ABRIL RINCON,no,no,no
Upvotes: 0
Views: 52
Reputation: 148900
Your initial file is not a CSV file. In a CSV file, each record should be contained in a single row, while in your file a record only ends at an empty line. Using pandas' CSV reader to process it is close to using a hammer to drive a screw: if the hammer is heavy enough, the screw will end up in the board, yet it is not the right tool.
That means that you have a text file with a custom structure, so my opinion is that you should use a custom parser to build the records and then write those records to a true csv file directly with the csv module. You could of course use pandas here, but (still IMHO) the added value is not worth it.
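As a rough illustration of such a custom parser, here is a minimal sketch. The function name parse_records and the inline sample are hypothetical, and it assumes that a record ends at an empty line or when a new uid line appears. Each attribute is stored as a list so that multi-valued attributes such as objectClass are not lost.

```python
import csv
import io

def parse_records(lines):
    """Group 'key,value' lines into records (dicts mapping key -> list of values)."""
    records = []
    current = None
    for row in csv.reader(lines):
        if not row:
            # An empty line closes the current record.
            current = None
            continue
        key = row[0]
        # Re-join the remaining fields in case the value itself contained commas.
        value = ",".join(row[1:])
        # Assumption: 'uid' is always the first attribute of a record.
        if current is None or key == "uid":
            current = {}
            records.append(current)
        current.setdefault(key, []).append(value)
    return records

# Hypothetical sample in the same layout as the question's file.
sample = """uid,aabreo
objectClass,top
objectClass,inetOrgPerson
cn,Angela Abreo Garcia

uid,aabril
objectClass,top
cn,ALEYDA SMITH ABRIL RINCON
"""
records = parse_records(io.StringIO(sample))
for record in records:
    print(record)
```

With a real file, the same function could be called on the object returned by open(path, encoding="utf-8", newline="").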
But your problem is directly caused by the objectClass field being multi-valued in the LDAP database. You are trying to build a CSV with duplicated column names, which should be avoided, and to use a dict for that, which is not possible because keys have to be unique in a dict.
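A small demonstration of that effect, using a few made-up pairs in the shape that csv.reader yields for your file: every repeated key overwrites the previous value, so only the last one survives.

```python
# Pairs as csv.reader would yield them (shortened, illustrative sample).
pairs = [
    ("uid", "aabreo"),
    ("objectClass", "top"),
    ("objectClass", "inetOrgPerson"),
    ("uid", "aabril"),
]

# dict() keeps one entry per key; later values replace earlier ones.
as_dict = dict(pairs)
print(as_dict)  # {'uid': 'aabril', 'objectClass': 'inetOrgPerson'}
```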
You have different ways to solve that:
Concatenate the various objectClass values into a single field with a different separator. Easy to build, but slightly harder to decode.
Add something to make the column names distinct, for example objectClass, objectClass1, objectClass2. Easy to process, but if you cannot know in advance the possible number of objectClass values for all the records, it will be harder to format your file.
Duplicate the records to write one record per objectClass value:
uid,objectClass
aabreo,top
aabreo,inetOrgPerson
aabreo,UnabPerson
Without knowing more about how you want to use the final file, I cannot guess which way is best for you...
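For illustration only, the first and third options could be sketched as follows. The records list is hypothetical data shaped as a custom parser might produce it (each key mapped to a list of values), the function names and the "|" separator are arbitrary choices, and the output goes to an in-memory buffer so the example is self-contained.

```python
import csv
import io

# Hypothetical parsed records: each key maps to a list of values.
records = [
    {"uid": ["aabreo"], "objectClass": ["top", "inetOrgPerson", "UnabPerson"],
     "cn": ["Angela Abreo Garcia"], "AdmPortales": ["no"]},
    {"uid": ["aabril"], "objectClass": ["top", "inetOrgPerson", "UnabPerson"],
     "cn": ["ALEYDA SMITH ABRIL RINCON"]},
]

def write_joined(records, out, sep="|"):
    """Option 1: one row per record, multiple values joined with `sep`."""
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills the columns that a given record does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for rec in records:
        writer.writerow({key: sep.join(values) for key, values in rec.items()})

def write_one_row_per_value(records, out, key="objectClass"):
    """Option 3: one (uid, value) row for each value of `key`."""
    writer = csv.writer(out)
    writer.writerow(["uid", key])
    for rec in records:
        for value in rec.get(key, []):
            writer.writerow([rec["uid"][0], value])

joined = io.StringIO()
write_joined(records, joined)
print(joined.getvalue())

flat = io.StringIO()
write_one_row_per_value(records, flat)
print(flat.getvalue())
```

When writing to a real file instead of a buffer, open it with newline="" as the csv module documentation recommends; otherwise extra blank lines can appear between rows on Windows.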
Upvotes: 1