Reputation: 5664
I'm working on a project using the pandas library, in which I need to read an Excel file that has the following columns:
'invoiceid', 'locationid', 'timestamp', 'customerid', 'discount', 'tax',
'total', 'subtotal', 'productid', 'quantity', 'productprice',
'productdiscount', 'invoice_products_id', 'producttax',
'invoice_payments_id', 'paymentmethod', 'paymentdetails', 'amount'
But when I read this file by using the Python code below:
import pandas as pd
df_full = pd.read_excel('input/invoiced_products_noinvoiceids_inproduct_v2.0.xlsx', sheet_name=0)
df_full.head()
it returns some rows along with 6 Unnamed columns whose values are NaN. I don't know why these columns are showing up.
Below is the link to a sample file as requested:
https://mega.nz/#!0MlXCBYJ!Oim9RF56h6hUitTwqSG1354dIKLZEgIszzPrVpfHas8
Why are these extra columns appearing?
Upvotes: 7
Views: 8423
Reputation: 11032
As discussed in the comments, the problem seems to be that there is extra data after the last named column. That's why you are getting Unnamed columns.
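To see the effect without the original workbook, here is a minimal sketch that reproduces it. It uses an in-memory CSV rather than an Excel file (so it runs without an Excel engine installed), and the data is made up for illustration; pandas applies the same "Unnamed: N" naming to header cells that are empty.

```python
import io
import pandas as pd

# Hypothetical data: two named columns, plus a stray value sitting in a
# fourth column that has no header (like leftover cells in a sheet).
raw = io.StringIO(
    "invoiceid,total,,\n"
    "1,10.0,,\n"
    "2,20.0,,\n"
    "3,30.0,,stray\n"
)
demo = pd.read_csv(raw)

# Columns without a header get placeholder names.
print(demo.columns.tolist())

# The first rows show only NaN in those columns, even though one of them
# holds a value further down.
print(demo.head(2))
```

So the top rows can look completely empty in those columns while a single stray cell lower down is enough to make pandas read them.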
If you want to drop these columns, you can filter them out like this:
df_full = df_full[df_full.filter(regex='^(?!Unnamed)').columns]
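For comparison, here is a rough sketch of two other ways to get rid of such columns, run on a small made-up frame that stands in for df_full (the column names and values are assumptions, not the real file). Which one fits depends on whether you want to drop by name or by content.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for df_full after read_excel.
df = pd.DataFrame({
    "invoiceid": [1, 2, 3],
    "total": [10.0, 20.0, 30.0],
    "Unnamed: 2": [np.nan, np.nan, np.nan],
    "Unnamed: 3": [np.nan, np.nan, "stray"],
})

# Option 1: drop by name, regardless of what the columns contain.
by_name = df.loc[:, ~df.columns.str.startswith("Unnamed")]

# Option 2: drop only the columns that are empty in every row.
# A column holding even one stray value survives this.
all_empty = df.dropna(axis=1, how="all")

print(by_name.columns.tolist())
print(all_empty.columns.tolist())
```

Dropping by name gives the same columns as the regex filter above. dropna(axis=1, how="all") is more conservative, which may help if you first want to inspect what the stray data actually is.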
Upvotes: 8