Reputation: 21
I need to convert
Name | Org
a | 5
a | 6
b | 5
c | 7
into
Name | Org
a | 5,6
b | 5
c | 7
my first attempt was with this code
while i < len(nameColumn):
if nameColumn[i] not in resultC1:
resultC1.append(nameColumn[i])
while l < len(nameColumn):
if nameColumn[l] == nameColumn[i]:
tempdata += organizationColumn[l] + ','
l += 1
resultC2.append(tempdata[:-1])
tempdata = ''
k += 1
i += 1
which just ends up with the result
Name | Org
a |
b |
c |
Any help would be greatly appreciated. I havent had luck finding anything on this yet. I'm reading the data from a .CSV file into lists and working with that data and storing results into resultC1 and resultC2
Upvotes: 0
Views: 54
Reputation: 24153
Use OrderedDict
calling setdefault
with an empty list
, and the csv
module:
import csv
from collections import OrderedDict
organizations = OrderedDict()
with open(filename) as infile:
for name, org in csv.reader(infile, delimiter='|'):
organizations.setdefault(name, []).append(org)
Then you can write the dictionary:
with open(filename, 'w') as outfile:
writer = csv.writer(outfile, delimiter='|')
for name, orgs in organizations.items():
writer.writerow([name, ','.join(orgs)])
Upvotes: 0
Reputation: 4187
Here is another solution that can be generic to have delimiter for input and output file.
def parseData(fileName, delimiter):
dictionary={}
with open(fileName, 'r') as iFile:
for line in iFile.readlines():
row = line.split(delimiter)
values = []
if (row[0] in dictionary.keys()):
values = dictionary[row[0]]
values.append(row[1].replace('\n',''))
else:
values.append(row[1].replace('\n',''))
dictionary[row[0]] = values
dictionary[row[0]] = values
## print for debugging purpose
print(dictionary)
return dictionary
def writeData(fileName, odelimiter, idelimiter, dictionary):
with open(fileName, 'w') as oFile:
for key, values in dictionary.items():
data=""
for value in values:
data = data + value + idelimiter
data=data[:-1]
## print for debugging purpose
print(key, data)
oFile.write(key + odelimiter + data + "\n")
## main
dictionary=parseData('inputPipe.txt', "|")
writeData('output.txt', "|", ",", dictionary)
inputPipe.txt
a|5
a|6
b|5
c|7
output.txt
a|5,6
b|5
c|7
Sample Run
{'a': ['5', '6'], 'b': ['5'], 'c': ['7']}
a 5,6
b 5
c 7
Upvotes: 0
Reputation: 92874
The solution using itertools.groupby()
function:
import csv, itertools
with open('yourfile.csv', 'r') as f:
reader = csv.reader(f, delimiter='|', skipinitialspace=True)
head = next(reader) # header line
items = [list(g) for k,g in itertools.groupby(sorted(reader), key=lambda x: x[0])]
fmt = '{0[0]:<5} | {0[1]:^5}' # format spec
print(fmt.format(head))
for item in items:
print(fmt.format([item[0][0], ','.join(i[1] for i in item)] if len(item) > 1 else item[0]))
The output:
Name | Org
a | 5,6
b | 5
c | 7
Upvotes: 0
Reputation: 402844
Here's a solution using collections.OrderedDict
:
import csv
from collections import OrderedDict
data = OrderedDict()
with open('test.csv') as f:
reader = csv.reader(f)
for i, line in enumerate(reader):
if i == 0:
continue
if line[0] not in data:
data[line[0]] = []
data[line[0]].append(line[1])
for k, v in data.items():
print(k, '|', ', '.join(v))
An OrderedDict
retains its order. The keys are the Name
s, and the values are lists of all Org
s associated with it.
Output:
a | 5, 6
b | 5
c | 7
If your csv has different delimiters from comma, then you'll have to specify that delimiter. I've assumed commas in my example.
Here's a much simpler solution with pandas
:
In [443]: df.head()
Out[443]:
Name Org
0 a 5
1 a 6
2 b 5
3 c 7
In [445]: for k, v in df.groupby('Name').apply(lambda x: list(x['Org'])).iteritems():
...: print(k, '|', ', '.join(map(str, v)))
...:
a | 5, 6
b | 5
c | 7
Upvotes: 1
Reputation: 7030
Assuming you're starting with the two arrays implied in your sample code, I'd go with something like this:
from collections import defaultdict
nameColumn = ['a', 'a', 'b', 'c']
organizationColumn = ["5", "6", "5", "7"]
merged = defaultdict(list)
for name, org in zip(nameColumn, organizationColumn):
merged[name].append(org)
for k, v in merged.items():
print(f'{k} | {v}'))
Upvotes: 0