ThoseKind

Reputation: 804

Finding Row with No Empty Strings

I am trying to determine the type of data contained in each column of a .csv file so that I can make CREATE TABLE statements for MySQL. The program makes a list of all the column headers and then grabs the first row of data and determines each data type and appends it to the column header for proper syntax. For example:

ID   Number   Decimal   Word
0    17       4.8       Joe

That would produce something like CREATE TABLE table_name (ID int, Number int, Decimal float, Word varchar());.

The problem is that in some of the .csv files the first row contains a NULL value, which is read as an empty string and breaks this process. My goal is to search the rows until I find one that contains no NULL values and use that row when forming the statement. This is what I have so far, but it sometimes still returns rows that contain empty strings:

def notNull(p): # where p is a .csv file that has been read in another function
    tempCol = next(p)
    tempRow = next(p)
    col = tempCol[:-1]
    row = tempRow[:-1]
    if any('' in row for row in p):
        tempRow = next(p)
        row = tempRow[:-1]
    else:
        rowNN = row
    return rowNN

Note: The .csv file is read in a different function; this function simply takes the already-read file as its input p. Also, each row ends with a trailing comma, which is read as an extra empty string, so I slice the last value off each row before checking it for empty strings.

Question: What is wrong with the function that I created that causes it to not always return a row without empty strings? I feel that it is because the loop is not repeating itself as necessary but I am not quite sure how to fix this issue.

Upvotes: 0

Views: 114

Answers (1)

totoro

Reputation: 2456

I cannot really decipher your code. Here is how I would get only the rows that contain no empty strings.

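import csv

def g(name):
    # Open the file that was passed in; newline='' is recommended for the csv module
    with open(name, 'r', newline='') as f:
        r = csv.reader(f)
        # Skip the header row
        next(r)

        # Yield only the rows that contain no empty strings.
        # If every row ends with a trailing comma, test row[:-1] instead.
        for row in r:
            if '' not in row:
                yield row

for row in g('file.csv'):
    print('row without empty values: {}'.format(row))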
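
As for why the original function misbehaves: the first data row is never tested for empty strings, `any('' in row for row in p)` consumes the remaining rows of the reader while looking for an empty string anywhere in them, and the `if` runs only once instead of looping. When the `if` branch is taken, `rowNN` is never assigned at all. Below is a minimal sketch of a replacement that returns the header together with the first complete row. It assumes `p` is a `csv.reader` (or any iterator of lists) and that every row ends with a trailing comma, as described in the question; the names `first_complete_row` and `drop_trailing` and the sample data are made up for illustration.

```python
import csv
import io

def first_complete_row(rows, drop_trailing=True):
    """Return (header, row) for the first data row with no empty strings.

    rows is an iterator of lists, e.g. a csv.reader. If drop_trailing is
    True, the last field of every row is discarded, on the assumption that
    each line ends with a trailing comma. Returns (header, None) when no
    complete row exists.
    """
    header = next(rows)
    if drop_trailing:
        header = header[:-1]
    # Check each data row in turn and stop at the first complete one
    for row in rows:
        if drop_trailing:
            row = row[:-1]
        if '' not in row:
            return header, row
    return header, None

# Hypothetical sample data: the first two data rows each have a missing value
sample = io.StringIO(
    "ID,Number,Decimal,Word,\n"
    "0,,4.8,Joe,\n"
    "1,17,,Ann,\n"
    "2,5,1.5,Bob,\n"
)
header, row = first_complete_row(csv.reader(sample))
print(header, row)
```

Because the loop returns as soon as it finds a complete row, the rest of the file is not read. If no row is complete, the caller gets `None` for the row and can fall back to some default column type.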

Upvotes: 2
