All NAN values when filtering pandas DataFrame to one column

Question

I'm importing data from a .csv file into a single DataFrame, where it looks fine.

I then try to store just one column of that DataFrame elsewhere, but the result contains only NaN values.

The exact same code works fine for a .xls file earlier in the same Python script. So I'm unsure of what is happening here. Any clarification would be appreciated. Here is the source code:

    # ------------------------------------------------------------------------------
    # Imports used by this excerpt
    import os
    import time
    import tkinter
    from tkinter import filedialog

    import pandas as pd

    print("\nSELECT Q MEASUREMENT FILE TO FIX: ")
    time.sleep(1)
    # Allow the user to pick the file whose X-Y data needs to be FIXED
    tkinter.Tk().withdraw()  # Hide the root window
    input2 = filedialog.askopenfilename()
    print("\nYou selected file:")
    print(input2)
    print("\n")
    input2 = str(input2)

    # Check to see if directory/file exists
    assert os.path.exists(input2), "File does not exist at, "+str(input2)

    # Import data below and store in df
    print("\nImporting Excel Workbook...")
    time.sleep(1)
    # You can check encoding of file with notepad++
    dfQ = pd.read_csv(input2, encoding="ansi")
    dfQ.values
    print(dfQ)  # This DataFrame (dfQ) contains the entire excel workbook
    print("\n\nWorkbook Successfully Imported")
    time.sleep(.5)
    print("...")

    # Search Q measurements CSV for "Chip ID" and match it to the corresponding
    # "PartID" in the master table created from the manually fixed file.
    print("Matching PartID's to update proper X-Y values")
    time.sleep(.5)
    print("...")
    IDs = pd.DataFrame(dfQ, columns=['Chip ID'])
    time.sleep(.5)
    print(IDs)
    s = IDs.size
    print("\nSuccessfully extracted", s, "Chip ID's!")
    print(dfQ.columns)

Maximilian Janisch · Accepted Answer

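The problem is that your column is actually named " Chip ID" (with a leading space) and not "Chip ID". Because pd.DataFrame(dfQ, columns=['Chip ID']) asks for a column that does not exist, pandas creates it and fills it with NaN instead of raising an error.

So either IDs = dfQ[" Chip ID"] for a Series or IDs = dfQ[[" Chip ID"]] for a DataFrame should work.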
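
As an alternative to typing the leading space every time, the headers could be stripped once right after loading. The following is only a minimal sketch: the CSV text and the "Lot" column are made up for illustration, and it assumes the stray whitespace comes from a space after the comma in the header row.

```python
import io

import pandas as pd

# Hypothetical CSV text: the header row has a space after the comma,
# so the second column is read as " Chip ID" rather than "Chip ID".
csv_text = "Lot, Chip ID\nA1,101\nA2,102\n"
df = pd.read_csv(io.StringIO(csv_text))

# Printing the column names as a list makes the leading space visible.
print(df.columns.tolist())

# Asking the constructor for a column that does not exist raises no error;
# it returns a column of NaN, which is the symptom from the question.
missing = pd.DataFrame(df, columns=["Chip ID"])
all_nan = bool(missing["Chip ID"].isna().all())

# Strip surrounding whitespace from every header, then select normally.
df.columns = df.columns.str.strip()
ids = df[["Chip ID"]]  # double brackets keep the result a DataFrame
print(ids)
```

Selecting with dfQ[["Chip ID"]] rather than through the pd.DataFrame constructor also has the advantage that a misspelled column name raises a KeyError right away instead of silently producing NaN.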
