Reputation: 1875
I want to split a file that contains a tab-delimited list of words into a "list" with each word preceded by a number.
So if the input file contains this (where the space between words is a tab): tree car house blanket
I'd like this output:
1 tree
2 car
3 house
4 blanket
I've got this code working that prints out the "list of words", but I'm not sure how to get the counter in front of the words:
#!/usr/bin/env python
import csv
with open("commonwords.tsv") as file:
    for line in file:
        # put each tab-separated word on its own line
        print(line.replace("\t", "\n"))
Thanks
Upvotes: 1
Views: 261
Reputation: 180391
You can use enumerate:
import csv
with open("commonwords.tsv") as f:
    for line in f:
        line = line.replace("\t", "\n")
        # enumerate(..., 1) starts counting at 1; the count restarts on every line
        for ind, word in enumerate(line.split(), 1):
            print("{0} {1}".format(ind, word))
1 tree
2 car
3 house
4 blanket
I'm not sure whether you want the count to reset on each line or continue to the end of the file. To keep counting across lines, read the whole file at once:
with open("commonwords.tsv") as f:
    line = f.read().replace("\t", "\n")
    for ind, word in enumerate(line.split(), 1):
        print("{0} {1}".format(ind, word))
You can also just split without replacing:
with open("commonwords.tsv") as f:
    # split() with no argument splits on any whitespace, tabs and newlines included
    lines = f.read().split()
    for ind, word in enumerate(lines, 1):
        print("{0} {1}".format(ind, word))
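To make the difference between the two counting styles concrete, here is a minimal sketch that works on an in-memory string instead of the file. The function names `number_per_line` and `number_overall` are made up for illustration, and the sample text is an assumption about what a two-line input might look like.

```python
def number_per_line(text):
    """Number the words, restarting at 1 on every line."""
    result = []
    for line in text.splitlines():
        for ind, word in enumerate(line.split(), 1):
            result.append("{0} {1}".format(ind, word))
    return result


def number_overall(text):
    """Number the words with a single counter across the whole text."""
    return ["{0} {1}".format(ind, word)
            for ind, word in enumerate(text.split(), 1)]


# hypothetical two-line sample, tab between the words on each line
sample = "tree\tcar\nhouse\tblanket\n"
print(number_per_line(sample))
print(number_overall(sample))
```

Both versions split on any whitespace, so an entry that itself contains a space would be counted as two words; splitting on "\t" explicitly would avoid that.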
Upvotes: 1
Reputation: 77337
The enumerate function can count the words for you, but you need an iterator or a list of the words, not just the lines of the file. Here's a generator that goes through the rows of a csv file and outputs each column individually. It's fed through enumerate to get the result.
import csv

def yield_col(reader):
    # flatten the rows: yield every cell of every row, one at a time
    for row in reader:
        for item in row:
            yield item

with open("commonwords.tsv") as fp:
    reader = csv.reader(fp, dialect='excel-tab')
    for num, word in enumerate(yield_col(reader), 1):
        print("{0} {1}".format(num, word))
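If you'd rather not write the generator yourself, the same flattening can probably be done with itertools.chain.from_iterable. This is a rough sketch assuming Python 3; the helper name `numbered_words` is hypothetical, and an io.StringIO stands in for the real file so the example runs on its own.

```python
import csv
import io
from itertools import chain


def numbered_words(fp):
    """Return (number, word) pairs for every cell of a tab-delimited file object."""
    reader = csv.reader(fp, dialect="excel-tab")
    # chain.from_iterable turns the sequence of rows into one stream of cells
    return enumerate(chain.from_iterable(reader), 1)


# stand-in for open("commonwords.tsv"); the contents are an assumed sample
sample = io.StringIO("tree\tcar\nhouse\tblanket\n")
for num, word in numbered_words(sample):
    print(num, word)
```

Like the generator version, this keeps one counter running across all rows.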
Upvotes: 1
Reputation: 113905
import csv
import itertools

with open('commonwords.tsv') as infile, open('/path/to/output', 'w') as outfile:
    writer = csv.writer(outfile, delimiter='\t')
    count = itertools.count(1)  # running counter: 1, 2, 3, ...
    for row in csv.reader(infile, delimiter='\t'):
        for word in row:
            # one output row per word: number, tab, word
            writer.writerow([next(count), word])
Upvotes: 0