Reputation: 113
I have a data file that looks like this -
[Table 1]
Terms Author Frequency
Hepatitis Christopher 2
Acid Subrata 1
Acid Kal 3
Kinase Pramod 31
Kinase Steve 5
Kinase Sharon 10
Acid Rob 5
Acid Christopher 2
Hepatitis Sharon 3
which I want to convert into a frequency matrix like this -
[Table 2]
Terms Christopher Subrata Kal Pramod Steve Sharon Rob
Hepatitis 2 0 0 0 0 3 0
Acid 2 1 3 0 0 0 5
Kinase 0 0 0 31 5 10 0
I have figured out how to do that, and I am using this code -
import pandas as pd
a = pd.read_csv("C:\\Users\\robert\\Desktop\\Python Project\\Publications Data\\New Merged Title Terms Corrected\\Python generated file\\Terms_Frequency_File.csv")
b = a.groupby(['Terms']).apply(lambda x: x.set_index(['Terms', 'Author']).unstack()['Frequency'])
This worked fine until yesterday. Today I regenerated the [Table 1] data because I had to add one more author, and when I try to build the frequency matrix in [Table 2] again, I get this error -
KeyError: 'Terms'
I am pretty sure this has something to do with the index column in the dataframe or with whitespace in a column name (in this case the 'Terms' column). I read several answers on this, such as "KeyError: 'column_name'" and "Key error when selecting columns in pandas dataframe after read_csv", and tried the methods suggested there, but they did not help.
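One way to check the whitespace hypothesis is to print the column names with repr(), which makes invisible characters visible. The snippet below is only a minimal sketch: the inline sample is a hypothetical stand-in for the real file, with stray spaces and a byte-order mark placed in the header on purpose. Depending on the pandas version, the byte-order mark may already be dropped while reading, so the cleanup handles both cases.

```python
import io
import pandas as pd

# Hypothetical stand-in for the regenerated CSV: the header carries a
# byte-order mark and stray spaces (assumed causes, not confirmed ones).
raw = "\ufeffTerms , Author,Frequency\nAcid,Rob,5\nKinase,Steve,5\n"
df = pd.read_csv(io.StringIO(raw))

# repr() exposes characters that a plain print() would hide.
print([repr(c) for c in df.columns])

# Normalise the header names: drop any byte-order mark, then trim whitespace.
df.columns = df.columns.str.replace("\ufeff", "", regex=False).str.strip()

# After the cleanup, a lookup by the plain name works.
print(df["Terms"].tolist())
```

When reading from a real file, read_csv also accepts encoding="utf-8-sig" (to skip a byte-order mark) and skipinitialspace=True (to drop spaces after each delimiter), which may make the manual cleanup unnecessary.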
Any help on this will be much appreciated! Thanks much!
Upvotes: 0
Views: 6207
Reputation: 11
I had the same problem. I noticed that the error occurs if I edit the .csv data in OpenOffice. When I instead downloaded the data from the Internet and edited it in a plain text editor (Notepad++), it worked normally. This may not help in your case, but you could try a different text editor or another program that supports .csv files.
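I don't know exactly what OpenOffice changes in the file, so the following is only a sketch under one assumption: that the file was re-saved with a different delimiter (a semicolon here), which would leave pandas with a single merged column and no 'Terms' key. The sample text is the [Table 1] data typed inline for illustration, and pivot_table is swapped in for the groupby/unstack code in the question as another way to build the matrix.

```python
import io
import pandas as pd

# Hypothetical sample: the [Table 1] data as it might look after being
# re-saved with ';' as the delimiter (an assumption, not a confirmed cause).
raw = (
    "Terms;Author;Frequency\n"
    "Hepatitis;Christopher;2\n"
    "Acid;Subrata;1\n"
    "Acid;Kal;3\n"
    "Kinase;Pramod;31\n"
    "Kinase;Steve;5\n"
    "Kinase;Sharon;10\n"
    "Acid;Rob;5\n"
    "Acid;Christopher;2\n"
    "Hepatitis;Sharon;3\n"
)

# sep=None with engine="python" asks pandas to detect the delimiter itself.
df = pd.read_csv(io.StringIO(raw), sep=None, engine="python")
df.columns = df.columns.str.strip()

# Build the term-by-author frequency matrix, filling missing pairs with 0.
matrix = df.pivot_table(
    index="Terms", columns="Author", values="Frequency",
    aggfunc="sum", fill_value=0,
)
print(matrix)
```

If the printed column list of your own file shows a single name such as 'Terms;Author;Frequency', the delimiter is the likely culprit.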
Upvotes: 1