Reputation: 75
Link to the input .txt file
The code searches for the lines starting with "From ", then splits each such line into words and adds the 6th word (i.e. the hrs part from hr:min:sec).
fhand=open("mbox-short.txt")
words=list()
for line in fhand:
    if line.startswith("From "):
        word=line.split()
        words=word.append(word[6])
print(words)
Upvotes: 0
Views: 43
Reputation: 1323
If you just want the time section of each line, you can try the following code.
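words = []
with open('mbox-short.txt') as f:
    for x in f:
        # Only the "From " envelope lines carry the timestamp
        if x.startswith('From '):
            w = x.split()
            # Index 5 is the hh:mm:ss field; skip lines that are too short
            if len(w) > 5:
                words.append(w[5])
print(words)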
It returns the data as follows:
['09:14:16', '18:10:48', '16:10:39', '15:46:24', '15:03:18', '14:50:18', '11:37:30', '11:35:08', '11:12:37', '11:11:52', '11:11:03', '11:10:22', '10:38:42', '10:17:43', '10:04:14', '09:05:31', '07:02:32', '06:08:27', '04:49:08', '04:33:44', '04:07:34', '19:51:21', '17:18:23', '17:07:00', '16:34:40', '16:29:07', '16:23:48']
Hope it helps.
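Since the question mentions the hrs part of hr:min:sec, here is a minimal sketch of one way to go from those time strings to just the hour. The helper name `hours_from_times` is made up for illustration, and the sample list is only the first few values from the output above.

```python
def hours_from_times(times):
    """Return the hour field ('HH') of each 'HH:MM:SS' string."""
    # Splitting on ':' gives ['HH', 'MM', 'SS']; keep the first piece
    return [t.split(":")[0] for t in times]


# First few values from the output above
times = ['09:14:16', '18:10:48', '16:10:39', '15:46:24']
hours = hours_from_times(times)
print(hours)  # ['09', '18', '16', '15']
```

The hours stay as strings here; wrap each one in `int()` if you need numbers.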
Upvotes: 0
Reputation: 1178
I think this is what you wanted. You were appending to word, which is re-created inside the loop on every iteration, and assigning the result back to words. list.append returns None, so words ended up as None instead of a list.
fhand=open("/home/user/Downloads/mbox-short.txt")
words=list()
for line in fhand:
    if line.startswith("From "):
        word=line.split()
        word.append(word[6])
        words.append(word)
print(words)
It prints:
[['From', '[email protected]', 'Sat', 'Jan', '5', '09:14:16', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '18:10:48', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '16:10:39', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '15:46:24', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '15:03:18', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '14:50:18', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:37:30', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:35:08', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:12:37', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:11:52', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:11:03', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '11:10:22', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '10:38:42', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '10:17:43', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '10:04:14', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '09:05:31', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '07:02:32', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '06:08:27', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '04:49:08', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '04:33:44', '2008', '2008'], ['From', '[email protected]', 'Fri', 'Jan', '4', '04:07:34', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', '19:51:21', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', '17:18:23', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', '17:07:00', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', '16:34:40', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', 
'16:29:07', '2008', '2008'], ['From', '[email protected]', 'Thu', 'Jan', '3', '16:23:48', '2008', '2008']]
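To see why the `words=word.append(word[6])` line in the question loses the list, here is a small sketch on a single line. The sample line uses a made-up address (`someone@example.com`) in the same layout as the "From " lines; it is an assumption for illustration, not taken from the file.

```python
# Made-up sample line in the same layout as the "From " lines
line = "From someone@example.com Sat Jan  5 09:14:16 2008"

word = line.split()
# list.append modifies the list in place and returns None
result = word.append(word[6])

print(result)   # None
print(word[5])  # 09:14:16  (index 5 is the time)
print(word[6])  # 2008      (index 6 is the year)
```

This also shows why each inner list above ends with '2008' twice: index 6 is the year, and it is appended to the end of the same list.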
Upvotes: 1