user9807244

Getting KeyError after reading in pipe-separated CSV

I read in a pipe-separated CSV like this:

import pandas as pd

test = pd.read_csv("http://kejser.org/wp-content/uploads/2014/06/Country.csv")
test.head()

This returns

  SK_Country|"Number"|"Alpha2Code"|"Alpha3Code"|"CountryName"|"TopLevelDomain"
0                    1|20|"ad"|"and"|"Andorra"|".ad"                          
1                 2|4|"af"|"afg"|"Afghanistan"|".af"                          
2        3|28|"ag"|"atg"|"Antigua and Barbuda"|".ag"                          
3                  4|660|"ai"|"aia"|"Anguilla"|".ai"                          
4                     5|8|"al"|"alb"|"Albania"|".al"

When I try to extract a specific column from it, like below:

df = test[["Alpha3Code"]]

I get the following error:

KeyError: ['Alpha3Code'] not in index

I don't understand what is going wrong. I can see the column name when I print the head, and when I open the CSV, everything looks fine.

I've searched around, read several related posts here, and tried different approaches, but nothing has fixed the problem.

Upvotes: 3

Views: 415

Answers (2)

Ivanovitch

Reputation: 368

As pointed out in the comment by @chrisz, you have to specify the delimiter:

test = pd.read_csv("http://kejser.org/wp-content/uploads/2014/06/Country.csv", delimiter='|')
test.head()

#    SK_Country  Number Alpha2Code Alpha3Code          CountryName  \
# 0           1      20         ad        and              Andorra   
# 1           2       4         af        afg          Afghanistan   
# 2           3      28         ag        atg  Antigua and Barbuda   
# 3           4     660         ai        aia             Anguilla   
# 4           5       8         al        alb              Albania   
# 
#   TopLevelDomain  
# 0            .ad  
# 1            .af  
# 2            .ag  
# 3            .ai  
# 4            .al  
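
To check this without downloading the file, here is a minimal sketch that rebuilds the first five rows shown in the question as an in-memory string (the name SAMPLE and the use of io.StringIO are assumptions for illustration, not part of the original file). Once the delimiter is given, the selection from the question should succeed:

```python
import io

import pandas as pd

# First rows of the file as printed in the question (not the full file).
SAMPLE = '''SK_Country|"Number"|"Alpha2Code"|"Alpha3Code"|"CountryName"|"TopLevelDomain"
1|20|"ad"|"and"|"Andorra"|".ad"
2|4|"af"|"afg"|"Afghanistan"|".af"
3|28|"ag"|"atg"|"Antigua and Barbuda"|".ag"
4|660|"ai"|"aia"|"Anguilla"|".ai"
5|8|"al"|"alb"|"Albania"|".al"
'''

# With the pipe delimiter, each header field becomes its own column
# and the surrounding double quotes are stripped.
fixed = pd.read_csv(io.StringIO(SAMPLE), delimiter="|")

# The selection that raised KeyError in the question.
codes = fixed[["Alpha3Code"]]
print(codes)
```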

Upvotes: 0

miradulo

Reputation: 29680

Notice how everything is crammed into a single string column? That's because you didn't tell pd.read_csv which delimiter separates the columns. The default is ',', but this file uses '|'.

test = pd.read_csv("http://kejser.org/wp-content/uploads/2014/06/Country.csv", 
                   sep='|')
test.head()

#    SK_Country  Number Alpha2Code Alpha3Code          CountryName  \
# 0           1      20         ad        and              Andorra   
# 1           2       4         af        afg          Afghanistan   
# 2           3      28         ag        atg  Antigua and Barbuda   
# 3           4     660         ai        aia             Anguilla   
# 4           5       8         al        alb              Albania   
# 
#   TopLevelDomain  
# 0            .ad  
# 1            .af  
# 2            .ag  
# 3            .ai  
# 4            .al 
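
A quick way to spot this kind of problem is to inspect the column labels rather than the printed head. The sketch below is illustrative only: it parses a two-line excerpt of the data from the question with the default settings (the names EXCERPT, raw and has_column are assumptions made for this example) and shows that "Alpha3Code" is only a substring of one long label, not a column of its own:

```python
import io

import pandas as pd

# Header plus one data row, copied from the output in the question.
EXCERPT = '''SK_Country|"Number"|"Alpha2Code"|"Alpha3Code"|"CountryName"|"TopLevelDomain"
1|20|"ad"|"and"|"Andorra"|".ad"
'''

# Default separator is ',', so each line is read as one field.
raw = pd.read_csv(io.StringIO(EXCERPT))

# A single column whose label is the entire header line.
print(raw.shape)
print(list(raw.columns))

# The name appears inside that label, but it is not a label itself,
# which is why test[["Alpha3Code"]] raises KeyError.
has_column = "Alpha3Code" in raw.columns
print(has_column)
```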

Upvotes: 1
