Amit Pal

Reputation: 11052

Getting "newline inside string" while reading the csv file in Python?

I have this utils.py file in a Django project:

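import csv
import os

from django.conf import settings


def range_data(ip):
    r = []
    f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 
                          'GeoIPCountryWhois.csv'))
    for num,row in enumerate(csv.reader(f)):
        # Columns 0 and 1 hold the first and last IP of the range.
        if row[0] <= ip <= row[1]:
            r.append([row[4]])  # column 4 is the country code
            return r
        else:
            continue
    return r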

Here the ip parameter is just an IPv4 address. I am using the open-source MaxMind GeoIPCountryWhois.csv file.

The first few lines of GeoIPCountryWhois.csv:

"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"

I have also read about the issue, but did not find an explanation I could understand. Would you please help me solve this error?

The method in utils.py looks up the country of the IP address passed to it as a parameter.

Upvotes: 9

Views: 31389

Answers (3)

Jim Geovedi

Reputation: 321

I had a similar problem earlier today: a line was missing its closing quote. The solution was to instruct the reader to perform no special processing of quote characters (quoting=csv.QUOTE_NONE).
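
A minimal sketch of what that looks like, written for Python 3 and run on an in-memory sample rather than the real file. The sample rows (the second one is deliberately missing its closing quote) and the read_unquoted helper are assumptions for illustration. With QUOTE_NONE the quote characters stay in the fields, so the sketch strips them by hand:

```python
import csv
import io

# Sample data; the second row is missing its closing quote on purpose.
sample = (
    '"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"\n'
    '"1.0.1.0","1.0.3.255","16777472","16778239","CN","China\n'
    '"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"\n'
)

def read_unquoted(text):
    # QUOTE_NONE makes the reader split on commas only and treat
    # '"' as an ordinary character, so an unbalanced quote cannot
    # swallow the following line.
    reader = csv.reader(io.StringIO(text), quoting=csv.QUOTE_NONE)
    # Remove the leftover quote characters from each field.
    return [[field.strip('"') for field in row] for row in reader]

rows = read_unquoted(sample)
for row in rows:
    print(row)
```

One caveat with this approach: a quoted field that itself contains a comma will be split into two fields, because the quotes no longer protect it.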

Upvotes: 14

shiva

Reputation: 2770

  1. You can preprocess the CSV file by normalizing its line endings (replacing '\r\n' with '\n'), like below.

    import csv
    
    content = open("GeoIPCountryWhois.csv", "r").read().replace('\r\n','\n')
    
    with open("GeoIPCountryWhois2.csv", "w") as g:
        g.write(content)
    

    Then use GeoIPCountryWhois2.csv with the csv reader.

  2. A wild guess: passing a lineterminator may solve your problem.

    for num,row in enumerate(csv.reader(f,lineterminator='\n')):
    

    See also: http://docs.python.org/lib/csv-fmt-params.html
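
A note on the second suggestion: the csv documentation states that the reader is hard-coded to recognise either '\r' or '\n' as end-of-line and ignores lineterminator, so passing it to csv.reader is unlikely to change anything. A small check, assuming Python 3 and made-up sample data (the parse helper is hypothetical):

```python
import csv
import io

# Two rows with Windows-style line endings (made-up sample data).
sample = '"1.0.0.0","AU"\r\n"1.0.1.0","CN"\r\n'

def parse(text, **fmtparams):
    # newline='' keeps the '\r\n' sequences intact for the csv module.
    return list(csv.reader(io.StringIO(text, newline=''), **fmtparams))

default_rows = parse(sample)
custom_rows = parse(sample, lineterminator='\n')

# Compare the two results; the reader does not use lineterminator.
print(default_rows == custom_rows)
```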

Upvotes: 8

Martijn Pieters

Reputation: 1122452

You must open your files as binary:

def range_data(ip):
    r = []
    f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 
                          'GeoIPCountryWhois.csv'), 'rb')
    for num,row in enumerate(csv.reader(f)):
        # Your things.

Note the 'rb' mode there; otherwise the file could be opened with native line endings, and the CSV reader doesn't handle the various forms very well. Certainly the copy of GeoIPCountryWhois.csv that I downloaded has clean \n line endings.

This is documented for the .reader() method:

If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
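
The 'rb' advice applies to Python 2. In Python 3 the csv module works on text, and its documentation recommends opening the file with newline='' instead. A minimal sketch assuming Python 3; the read_rows helper and the temporary demo file are illustrative and not part of the original code:

```python
import csv
import os
import tempfile

def read_rows(path):
    # newline='' disables newline translation, so the csv module
    # sees the original line endings and handles them itself.
    with open(path, newline='') as f:
        return list(csv.reader(f))

# Demo with a throwaway file that uses '\r\n' line endings.
with tempfile.NamedTemporaryFile('w', newline='', suffix='.csv',
                                 delete=False) as tmp:
    tmp.write('"1.0.0.0","1.0.0.255","AU"\r\n'
              '"1.0.1.0","1.0.3.255","CN"\r\n')
    demo_path = tmp.name

rows = read_rows(demo_path)
os.remove(demo_path)
print(rows)
```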

If, however, your csv file is so corrupted as to still contain unexpected newline characters in unexpected places, use this file subclass instead as a stop-gap measure:

class CleanlinesFile(file):
    def next(self):
        line = super(CleanlinesFile, self).next()
        return line.replace('\r', '').replace('\n', '') + '\n'

This class guarantees there will be no newlines anywhere in the returned results except as the very last character (just the way the csv module wants it). Use it instead of the open call; the 'rb' mode modifier becomes optional in this case:

def range_data(ip):
    r = []
    f = CleanlinesFile(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 
                          'GeoIPCountryWhois.csv'))
    for num,row in enumerate(csv.reader(f)):
        # Your things.
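
The file built-in and the next method used above exist only in Python 2. A rough Python 3 counterpart, assuming a plain generator is acceptable in place of a file subclass (clean_lines is a hypothetical name), wraps any iterable of lines before handing it to csv.reader:

```python
import csv

def clean_lines(lines):
    # Drop every '\r' and '\n' in the line, then append a single '\n',
    # so each line handed to the csv module ends in exactly one newline.
    for line in lines:
        yield line.replace('\r', '').replace('\n', '') + '\n'

# Made-up lines with messy endings, standing in for an open file object.
raw = [
    '"1.0.0.0","1.0.0.255","AU"\r\r\n',
    '"1.0.1.0","1.0.3.255","CN"\r\n',
]

rows = list(csv.reader(clean_lines(raw)))
print(rows)
```

In real use, the raw list would be replaced by a file object opened in text mode, since csv.reader accepts any iterable that yields strings.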

Upvotes: 4
