Reputation: 61
I have a text file ("name_data.txt") that has the following contents:
name: Kelo
family name: Lam
location: Asia
members: Kelo, Kiko, Jil
name: Miko
family name: Naiton
location: Japan
members: Miko,Kayati
The text file keeps going with the same pattern (name, family name, location, members).
I want to print the first line and then every 5th line after it, so that I only print the lines that begin with "name". I then want to collect the names in a list.
I want my output to be:
["Kelo","Miko"]
So far, I have gotten (although, it is wrong):
name_data= load_local_file('name_data.txt',ignore_header=False,delimiter='\t')
def __init __(name_reader):
    names=list()
    count=0
    name_line=5
    line_number=0
    for name in name_data:
        if line_number<5:
            line_number +=1
        if line_number ==5:
            names.append(line_number)
Upvotes: 6
Views: 2402
Reputation: 2929
You can use regular expressions; Python's module for that is re.
Then with name_data.txt being:
name: Kelo
family name: Lam
location: Asia
members: Kelo, Kiko, Jil
name: Miko
family name: Naiton
location: Japan
members: Miko,Kayati
Getting the names is a simple one-liner:
import re
def get_names():
    with open('name_data.txt', 'r') as f:
        print(re.findall(r'^name:\s*(\w+)', f.read(), flags=re.MULTILINE))
if __name__ == '__main__':
    get_names()
Note the re.MULTILINE flag: it makes ^ match at the start of every line instead of only at the start of the whole text. The ^ anchor itself is what keeps the regex from also matching the lines with family name: ...
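To make the effect of the flag and the anchor concrete, here is a small sketch that runs the same pattern against an in-memory copy of the data. The `sample` string and the blank line between the two records are assumptions made for illustration; they are not taken from the real file.

```python
import re

# In-memory stand-in for name_data.txt. The blank line between the two
# records is an assumption about the real file layout.
sample = (
    "name: Kelo\n"
    "family name: Lam\n"
    "location: Asia\n"
    "members: Kelo, Kiko, Jil\n"
    "\n"
    "name: Miko\n"
    "family name: Naiton\n"
    "location: Japan\n"
    "members: Miko,Kayati\n"
)

pattern = r'^name:\s*(\w+)'

# Without re.MULTILINE, ^ only matches at the very start of the string.
first_only = re.findall(pattern, sample)

# With re.MULTILINE, ^ matches at the start of every line.
all_names = re.findall(pattern, sample, flags=re.MULTILINE)

# Without the ^ anchor, "name:" is also found inside "family name:".
unanchored = re.findall(r'name:\s*(\w+)', sample)

print(first_only)  # ['Kelo']
print(all_names)   # ['Kelo', 'Miko']
print(unanchored)  # ['Kelo', 'Lam', 'Miko', 'Naiton']
```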
Upvotes: 0
Reputation: 152647
You can identify every fifth line by comparing the line number modulo 5 against a number. In your case this should be 0, because you want the first line, the 6th, the 11th, ... (note that Python starts counting at index 0).
To get the line numbers as well as the content, you can iterate over the file with enumerate.
Then, to discard the name: part of the string and keep what comes after it, you can use str.split().
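As a quick illustration of the str.split() step on its own, the sketch below splits a single line once on the first ":". The `tricky` value is a hypothetical example, invented here only to show what the second argument (maxsplit) changes.

```python
# One line as it would come out of the file, trailing newline included.
line = "name: Kelo\n"

# Split once on the first ":" and strip the surrounding whitespace.
name = line.split(":", 1)[1].strip()

# Hypothetical value that itself contains ":"; maxsplit=1 keeps it whole.
tricky = "name: Kelo: the first\n"
kept_whole = tricky.split(":", 1)[1].strip()
cut_short = tricky.split(":")[1].strip()

print(name)        # Kelo
print(kept_whole)  # Kelo: the first
print(cut_short)   # Kelo
```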
A working implementation could look like this:
# Create an empty list for the names
names = []
# Opening the file with "with" makes sure it is automatically closed even
# if the program encounters an Exception.
with open('name_data.txt', 'r') as file:
    for lineno, line in enumerate(file):
        # The lineno modulo 5 is zero for the first line and every fifth line thereafter.
        if lineno % 5 == 0:
            # Make sure it really starts with "name"
            if not line.startswith('name'):
                raise ValueError('line did not start with "name".')
            # Split the line by the ":" and keep only what is coming after it.
            # Using `maxsplit=1` makes sure you don't run into trouble if the name
            # contains ":" as well (may be unnecessary but better safe than sorry!)
            name = line.split(':', 1)[1]
            # Remove any remaining whitespaces around the name
            name = name.strip()
            # Save the name in the list of names
            names.append(name)
# print out the list of names
print(names)
Instead of enumerate you could also use itertools.islice with a step argument:
from itertools import islice
with open('name_data.txt', 'r') as file:
    for line in islice(file, None, None, 5):
        ... # like above except for the "if lineno % 5 == 0:" line
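For reference, a filled-in version of the islice variant might look like the sketch below. It uses io.StringIO as a stand-in for the opened file so that it runs without name_data.txt; `sample_file` and the blank separator line between records are assumptions for illustration.

```python
import io
from itertools import islice

# Stand-in for open('name_data.txt'). The blank line between records is
# an assumption; it is what makes each record five lines long.
sample_file = io.StringIO(
    "name: Kelo\n"
    "family name: Lam\n"
    "location: Asia\n"
    "members: Kelo, Kiko, Jil\n"
    "\n"
    "name: Miko\n"
    "family name: Naiton\n"
    "location: Japan\n"
    "members: Miko,Kayati\n"
)

names = []
# islice(iterable, start, stop, step) yields lines 0, 5, 10, ...
for line in islice(sample_file, None, None, 5):
    if not line.startswith('name'):
        raise ValueError('line did not start with "name".')
    names.append(line.split(':', 1)[1].strip())

print(names)  # ['Kelo', 'Miko']
```

With a real file, the io.StringIO object would simply be replaced by the file object from open().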
Depending on your needs you might consider using the re module to completely parse the file:
import re
# The regular expression
group = re.compile(r"name: (.+)\nfamily name: (.+)\nlocation: (.+)\nmembers: (.+)\n", flags=re.MULTILINE)
with open('name_data.txt', 'r') as file:
    # Apply the regex to the file's contents (findall needs a string, not a file object)
    all_data = group.findall(file.read())
# To get the names you just need the first element in each group:
firstnames = [item[0] for item in all_data]
The firstnames list will be ['Kelo', 'Miko'] for your example. Similarly, if you use [item[1] for item in all_data] you get the last names: ['Lam', 'Naiton'].
To successfully use a regular expression you must ensure it really matches your file layout, otherwise you'll get wrong results.
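One possible way to catch a layout mismatch early is to compare the number of full-record matches with the number of lines that start with "name:". This is only a sketch: `parse_records`, `good` and `bad` are hypothetical names introduced here, and the blank line between records is an assumption about the file.

```python
import re

record = re.compile(
    r"name: (.+)\nfamily name: (.+)\nlocation: (.+)\nmembers: (.+)\n"
)

def parse_records(text):
    """Return (name, family name, location, members) tuples.

    Raises ValueError if some line starting with "name:" did not end up
    in a full-record match, which hints at a layout problem.
    """
    matches = record.findall(text)
    expected = len(re.findall(r"^name:", text, flags=re.MULTILINE))
    if len(matches) != expected:
        raise ValueError(
            f"matched {len(matches)} records, expected {expected}"
        )
    return matches

good = (
    "name: Kelo\nfamily name: Lam\nlocation: Asia\n"
    "members: Kelo, Kiko, Jil\n"
    "\n"
    "name: Miko\nfamily name: Naiton\nlocation: Japan\n"
    "members: Miko,Kayati\n"
)
# Hypothetical malformed input: the second record lacks its location line.
bad = good.replace("location: Japan\n", "")

print([item[0] for item in parse_records(good)])  # ['Kelo', 'Miko']
try:
    parse_records(bad)
except ValueError as exc:
    print("layout problem:", exc)
```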
Upvotes: 4
Reputation: 355
Having a name_data.txt file with data as follows:
1
2
3
4
5
6
7
8
9
10
Here's how you can print the first and every 5th line of it:
# Read all lines once, without their trailing newlines
with open('name_data.txt') as fp:
    content = [line.rstrip('\n') for line in fp]
names = []
limit = 4
# Always keep the first line
names.append(content[0])
for i, line in enumerate(content):
    # Indices 4, 9, 14, ... are lines 5, 10, 15, ...
    if i == limit:
        names.append(line)
        limit += 5
print(names)
Check out http://shortcode.pro/code/read-txt-file-and-print-first-and-every-5th-line/
Upvotes: 0
Reputation: 1686
A simple way to do this would be as follows:
with open('name_data.txt', 'r') as file:
    index = 0
    for line in file:
        if index % 5 == 0:
            print(line.split()[1])
        index += 1
Upvotes: 2
Reputation: 1567
You could do this in one line with a list comprehension:
c = open('test.txt', 'r').readlines()
# for every fifth line extract out name and store in list
a = [i.replace('name: ', '').replace('\n', '') for i in c[::5]]
print(a) # ['Kelo', 'Miko']
Upvotes: 2
Reputation: 1537
Assuming that name_data is a list of lines in the file, you can do
names = []
# Start at index 0 (the first "name" line) and step by 5
for i in range(0, len(name_data), 5):
    names.append(name_data[i].split(":")[1].strip())
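In case it is unclear how to get name_data as a list of lines in the first place, here is one way it could be done. The sketch reads from an io.StringIO stand-in (`sample_file`, an assumption for illustration, as is the blank line between records); with a real file you would call open('name_data.txt') instead.

```python
import io

# Stand-in for open('name_data.txt'); the blank separator line between
# records is an assumption about the file layout.
sample_file = io.StringIO(
    "name: Kelo\n"
    "family name: Lam\n"
    "location: Asia\n"
    "members: Kelo, Kiko, Jil\n"
    "\n"
    "name: Miko\n"
    "family name: Naiton\n"
    "location: Japan\n"
    "members: Miko,Kayati\n"
)

# splitlines() returns the lines without their trailing newlines.
name_data = sample_file.read().splitlines()

# Index 0 is the first "name" line; every fifth line after it is the next one.
names = [name_data[i].split(":")[1].strip()
         for i in range(0, len(name_data), 5)]

print(names)  # ['Kelo', 'Miko']
```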
Upvotes: 1