NapoleonB

Reputation: 47

Extracting Symbol line by line

I have a question about extracting a variable-length string from each line of a file whose fields are separated only by '|' and spaces. Take a look at the following link:

http://ftp.nasdaqtrader.com/dynamic/SymDir/nasdaqlisted.txt

I am trying to extract all the company symbols in the first column of the file at the above link. However, I can't think of a loop that will do that and store the symbols in a way that makes them easy to retrieve later.

I was hoping any pr0s may have an opinion!

EDIT:

Hi, I understand some of your reservations. I would be very satisfied with an explanation of how to think about the solution logically.

Upvotes: 1

Views: 82

Answers (2)

teraflik

Reputation: 72

Have a look at the Python csv module:

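import csv

# newline='' is what the csv docs recommend when opening files for csv.reader
with open('nasdaqlisted.txt', 'r', newline='') as csvFile:
    reader = csv.reader(csvFile, delimiter='|')
    for row in reader:
        # row[0] is the first column; this also prints the header and footer rows
        print(row[0])
# no explicit close() needed: the with block closes the file on exit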

You just need to change the delimiter to '|' and it works out of the box.
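If you want the symbols in a list rather than printed, here is a minimal sketch of one way to do it. The helper name read_symbols and the sample lines are made up for illustration (they only mimic the pipe-delimited layout and are not real data), and I am assuming the first and last rows are a header and a footer that should be dropped:

```python
import csv
import io

# Hypothetical sample in the same pipe-delimited layout (not real data).
SAMPLE = """Symbol|Security Name
AAAA|Example Corp A
BBBB|Example Corp B
File Creation Time: (placeholder)|
"""

def read_symbols(lines):
    """Return the first-column values, skipping the header and footer rows."""
    # csv.reader accepts any iterable of lines; drop empty rows defensively.
    rows = [row for row in csv.reader(lines, delimiter='|') if row]
    # Assumption: the first row is the header and the last row is a footer.
    return [row[0].strip() for row in rows[1:-1]]

symbols = read_symbols(io.StringIO(SAMPLE))
print(symbols)
```

With the real file you would pass the open file object instead of the StringIO, e.g. read_symbols(csvFile) inside the with block.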

Upvotes: 0

Devanshu Misra

Reputation: 813

I hope this helps if you want to read the data directly from the text page:

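import requests

response = requests.get('http://ftp.nasdaqtrader.com/dynamic/SymDir/nasdaqlisted.txt')
response.raise_for_status()  # stop early if the download failed
document = response.text.splitlines()

# Slice off the first line (header) and the last line (footer)
for line in document[1:-1]:
    data = line.split('|')  # fields are separated by '|'
    symbol = data[0]        # the symbol is the first field
    print(symbol)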

You can skip the first and last lines of the document, since they do not contain any of the symbols you are looking for. splitlines() returns a list of lines, so the slice [1:-1] drops the first and last entries.
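As for storing the symbols so they are easy to look up later, one option is a dict keyed by symbol. This is only a sketch: parse_symbols is a hypothetical helper name, and the sample text is made up to mimic the layout (it is not real data), so the example runs without a network call:

```python
def parse_symbols(text):
    """Map each symbol to the rest of its fields, skipping header and footer."""
    table = {}
    # Assumption: the first line is a header and the last line is a footer.
    for line in text.splitlines()[1:-1]:
        fields = line.split('|')
        table[fields[0]] = fields[1:]
    return table

# Hypothetical sample text in the same pipe-delimited layout (not real data).
sample_text = (
    "Symbol|Security Name\n"
    "AAAA|Example Corp A\n"
    "BBBB|Example Corp B\n"
    "File Creation Time: (placeholder)|\n"
)

table = parse_symbols(sample_text)
print(sorted(table))   # all symbols
print(table['AAAA'])   # remaining fields for one symbol
```

With the downloaded page you would call parse_symbols(response.text) instead. A dict gives you direct lookup by symbol; if you only need the symbols themselves, list(table) is enough.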

Upvotes: 1
