Reputation: 552
I want to iterate through a list and find the index at which the list's first item occurs again. The output should be mylist[0:first_match].
Here is what I mean:
.APT 5B APT 5B .
.BUSINESS JOEY BUSINESS.
. 1ST FL .
. NATE JR SAM .
. JOE 7 .
. .
.2ND FLR TOM 2ND FLR .
.A1 2FL APT 71E .
.APT E205 APT 1R .
. CONSTRUCTION .
.APT 640 APT 545.
.PART1 SYNC PART2 .
. NATE JR SAM .
The problem I'm running into is that the program keeps adding items to the dictionary even after the first match is found, so it appends data that I want to ignore.
Here is what I have:
dictt = {}
with open(path + 'sample33.txt', 'rb') as txtin:
    for line in txtin:
        part2 = line[1:29].split()
        uniq = []
        print '%r' % part2
        for key in part2:
            if key not in dictt:
                dictt[key] = key
                uniq.append(key)
        dictt = {}
        print ' '.join(uniq)
Results:
['APT', '5B', 'APT', '5B']
APT 5B
['BUSINESS', 'JOEY', 'BUSINESS']
BUSINESS JOEY
['1ST', 'FL']
1ST FL
['NATE', 'JR', 'SAM']
NATE JR SAM
['JOE', '7']
JOE 7
[]
['2ND', 'FLR', 'TOM', '2ND', 'FLR']
2ND FLR TOM
['A1', '2FL', 'APT', '71E']
A1 2FL APT 71E
['APT', 'E205', 'APT', '1R']
APT E205 1R # Would like to stop adding items after first 'APT' match
['CONSTRUCTION']
CONSTRUCTION
['APT', '640', 'APT', '545']
APT 640 545 # same here...
['PART1', 'SYNC', 'PART2']
PART1 SYNC PART2
['NATE', 'JR', 'SAM']
NATE JR SAM
[Finished in 0.1s]
I hope I have explained this clearly and that someone can fine-tune it.
Thank you.
Edit #1: Here is an example of what I would like to print:
listt:
['APT', '640', 'APT', '1', '2', '3']
found 'APT' match so:
print:
APT 640
ignore ...'APT', '1', '2', '3']
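The behaviour described in this edit can be sketched as a loop that stops at the first repeat of the list's first item, instead of skipping every duplicate the way the dictionary check does. This is only a minimal sketch in Python 3 syntax (the snippets in this thread use Python 2 print statements), and the function name `truncate_at_first_repeat` is a hypothetical one chosen for illustration:

```python
def truncate_at_first_repeat(tokens):
    """Return the tokens that come before the first item's second occurrence."""
    kept = []
    for position, token in enumerate(tokens):
        # Position 0 is the first item itself; any later equal token is the match.
        if position > 0 and token == tokens[0]:
            break  # stop collecting instead of merely skipping the duplicate
        kept.append(token)
    return kept


print(' '.join(truncate_at_first_repeat(['APT', '640', 'APT', '1', '2', '3'])))  # APT 640
print(' '.join(truncate_at_first_repeat(['2ND', 'FLR', 'TOM', '2ND', 'FLR'])))   # 2ND FLR TOM
print(' '.join(truncate_at_first_repeat(['NATE', 'JR', 'SAM'])))                 # NATE JR SAM
```

The `break` is the key difference from the dictionary version: a membership test only drops the repeated item, while `break` also drops everything after it.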
Upvotes: 0
Views: 781
Reputation: 23
I'm not sure I completely understand what you need, but this may be useful:
import os

def read_text(name_file, string):
    index_found = [0, 0]
    result = [0, 0]
    with open(name_file) as f:
        # Flatten the whole file into a single list of words.
        read_temp = [word for line in f for word in line.split()]
    for s in read_temp:
        if string in str(s):
            index_str = read_temp.index(s)
            index_found[0] = index_str
            index_found[1] = index_str + 1
            result[0] = read_temp[index_found[0]]
            result[1] = read_temp[index_found[1]]
    return result

os.chdir('Path to your .txt')
result_list = read_text("your_file.txt", "APT")  # "APT" or whatever string you need to find.
print result_list
Output:
['APT', '5B']
Upvotes: 0
Reputation: 19763
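Here you go:
>>> import re
>>> f = open('your_file.txt')
>>> for x in f:
...     line = re.findall(r'\w+', x.strip())
...     print line
...     try:
...         # Slice up to the first place where line[0] occurs again.
...         print " ".join(line[:line[1:].index(line[0]) + 1])
...     except (ValueError, IndexError):
...         # No repeat (ValueError) or an empty line (IndexError): print it all.
...         print " ".join(line)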
output:
['APT', '5B', 'APT', '5B']
APT 5B
['BUSINESS', 'JOEY', 'BUSINESS']
BUSINESS JOEY
['1ST', 'FL']
1ST FL
['NATE', 'JR', 'SAM']
NATE JR SAM
['JOE', '7']
JOE 7
[]
['2ND', 'FLR', 'TOM', '2ND', 'FLR']
2ND FLR TOM
['A1', '2FL', 'APT', '71E']
A1 2FL APT 71E
['APT', 'E205', 'APT', '1R']
APT E205 # not printing after match
['CONSTRUCTION']
CONSTRUCTION
['APT', '640', 'APT', '545']
APT 640 # not printing after match
['PART1', 'SYNC', 'PART2']
PART1 SYNC PART2
['NATE', 'JR', 'SAM']
NATE JR SAM
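The same idea can also be written as a small reusable function. The sketch below is in Python 3 syntax and uses the optional start argument of `list.index` so that no sliced copy of the list is needed; the function name `head_before_repeat` is hypothetical, and the sample strings are lines from the question with their original column spacing collapsed:

```python
import re


def head_before_repeat(text):
    """Split text into words and cut at the second occurrence of the first word."""
    words = re.findall(r'\w+', text)
    if not words:
        return words  # blank line: nothing to cut
    try:
        # Search for words[0] starting at position 1, so the first word itself is skipped.
        cut = words.index(words[0], 1)
    except ValueError:
        # The first word never repeats: keep the whole line.
        return words
    return words[:cut]


for sample in ['.APT E205 APT 1R .', '.2ND FLR TOM 2ND FLR .', '. CONSTRUCTION .']:
    print(' '.join(head_before_repeat(sample)))
# APT E205
# 2ND FLR TOM
# CONSTRUCTION
```

Catching only `ValueError` keeps unrelated errors visible, which a bare `except` would hide.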
Upvotes: 1
Reputation: 1935
If your concern is removing duplicate entries from your list, then "set" comes to the rescue.
uniqlist = list(set(dupelist))
I should also mention there is another article that references the ability to remove duplicates from a list.
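As a hedged illustration of what de-duplication does to one of the question's lists (Python 3 syntax; the variable names are only for this example): a set drops repeats but does not keep the original order, while `dict.fromkeys` keeps the first occurrence of each item in order. Neither one cuts the list at the first repeat, so '1R' is still present.

```python
tokens = ['APT', 'E205', 'APT', '1R']

# set() removes duplicates but does not keep the original order.
unordered = set(tokens)

# dict.fromkeys() keeps the first occurrence of each item, in order
# (dictionaries preserve insertion order in Python 3.7+).
ordered = list(dict.fromkeys(tokens))

print(sorted(unordered))  # ['1R', 'APT', 'E205']
print(ordered)            # ['APT', 'E205', '1R']
```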
Upvotes: 0