duffymo

Reputation: 308763

Read a .csv into pandas from F: drive on Windows 7

I have a .csv file on my F: drive on Windows 7 64-bit that I'd like to read into pandas and manipulate.

None of the examples I see read from anything other than a simple file name (e.g. 'foo.csv').

When I try this I get error messages that aren't making the problem clear to me:

import pandas as pd

trainFile = "F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv"
trainData = pd.read_csv(trainFile)

The error message says:

IOError: Initializing from file failed

I'm missing something simple here. Can anyone see it?

Update:

I did get more information like this:

import csv

if __name__ == '__main__':
    trainPath = 'F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv'
    trainData = []
    with open(trainPath, 'r') as trainCsv:
        trainReader = csv.reader(trainCsv, delimiter=',', quotechar='"')
        for row in trainReader:
            trainData.append(row)
    print(trainData)

I got a permission error on read. When I checked the properties of the file, I saw that it was read-only. I was able to read 892 lines successfully after unchecking it.

Now pandas is working as well. No need to move the file or amend the path. Thanks for looking.
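For anyone who wants to check the read-only flag from Python rather than through the file's Properties dialog, here is a minimal sketch. The helper names is_read_only and clear_read_only are made up for illustration. It assumes the flag shows up as a cleared owner write bit in os.stat, which is how Python reports the Windows read-only attribute.

```python
import os
import stat


def is_read_only(path):
    """Return True if the owner write bit is cleared on `path`."""
    return not (os.stat(path).st_mode & stat.S_IWRITE)


def clear_read_only(path):
    """Add the owner write bit, leaving the other mode bits alone."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWRITE)
```

Calling is_read_only(trainFile) before pd.read_csv(trainFile) would show whether the flag is set. Whether that flag is what blocks the read on a given machine is a separate question.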

Upvotes: 12

Views: 86295

Answers (6)

Shtefan

Reputation: 808

Try this:

import os
import pandas as pd


trainFile = os.path.join('F:', os.sep, 'Projects', 'Python', 'coursera', 'intro-to-data-science', 'train.csv')
trainData = pd.read_csv(trainFile)

Upvotes: -1

sheldonzy

Reputation: 5961

This happens to me quite often. Usually I open the CSV file in Excel, save it as an .xlsx file, and then it works.

So instead of:

df = pd.read_csv(r"...\file.csv")

Use:

df = pd.read_excel(r"...\file.xlsx")

Upvotes: 5

user3126530

Reputation: 89

I had the same issue and got it resolved.

Check that the path to your file is correct.

I initially had the path like this:

dfTrain = pd.read_csv("D:\\Kaggle\\labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)

This returned an error because the path was wrong. I then changed the path as shown below, and it works fine:

dfTrain = pd.read_csv("D:\\Kaggle\\labeledTrainData.tsv\\labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)

The error occurred because my earlier path was not correct. I hope you get it resolved.

Upvotes: 4

numbers are fun

Reputation: 473

If you're sure the path is correct, make sure no other programs have the file open. I got that error once, and closing the Excel file made the error go away.
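One way to see what the operating system itself reports is to open the file with plain open() before handing it to pandas. This is only a sketch: explain_open_failure is a hypothetical helper name, and the exact error code for a file held open by another program depends on the platform.

```python
import errno


def explain_open_failure(path):
    """Try to open `path` for reading and describe any OS-level failure.

    Hypothetical helper for illustration only.
    """
    try:
        with open(path, "rb") as handle:
            handle.read(1)  # touch the file so read errors surface too
    except (IOError, OSError) as exc:
        # Map the numeric errno to its symbolic name, e.g. ENOENT or EACCES.
        name = errno.errorcode.get(exc.errno, "unknown")
        return "{0}: {1}".format(name, exc.strerror)
    return "opened fine"
```

A result starting with ENOENT points at the path, while EACCES points at permissions or, possibly, a lock held by another program.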

Upvotes: 2

Hanan Shteingart

Reputation: 9078

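A better solution is to use raw string literals such as r'pathname\filename' rather than 'pathname\filename', so that backslashes are not treated as escape characters. See the Lexical Analysis chapter of the Python language reference for more details.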
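To illustrate the difference, here is a small sketch with a made-up relative path (data\train.csv is just an example, not the asker's actual layout). The backslash followed by t is read as a tab in an ordinary string but kept as-is in a raw one.

```python
# Ordinary string: the two characters '\t' collapse into one tab character.
plain = 'data\train.csv'

# Raw string: the backslash is kept literally.
raw = r'data\train.csv'

# Equivalent spellings that avoid the problem.
escaped = 'data\\train.csv'
forward = 'data/train.csv'

print('\t' in plain)     # the tab sneaked into the path
print(raw == escaped)    # raw and doubled-backslash forms are the same string
print(len(plain), len(raw))
```

Forward slashes, as used in the question, sidestep escaping altogether.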

Upvotes: 5

zwol

Reputation: 140609

I cannot promise that this will work, but it's worth a shot:

import pandas as pd
import os

trainFile = "F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv"

pwd = os.getcwd()
os.chdir(os.path.dirname(trainFile))
trainData = pd.read_csv(os.path.basename(trainFile))
os.chdir(pwd)
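If read_csv raises in the snippet above, the final os.chdir(pwd) never runs and the process stays in the data directory. A possible variation is sketched below: working_directory is a hypothetical helper, not something pandas or the os module provides, and it restores the original directory even on error.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs whether or not the body raised an exception.
        os.chdir(previous)


# Usage, with the names from the snippet above:
# with working_directory(os.path.dirname(trainFile)):
#     trainData = pd.read_csv(os.path.basename(trainFile))
```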

Upvotes: 13
