Reputation: 41
whenever am reading my csv file am getting an output like this
['At Home', '0.0023042115']
['Family', '0.0001275907']
['Time', '0.0005935242']
['Work', '0.0012768792']
['Past Actions', '0.0001357854']
['Games', '0.0032438747']
['Internet', '0.0008639338']
['Location', '0.0001233796']
['Fun', '0.0035238147']
['Food/Clothes', '0.0080727641']
['Poetic', '4.691183298570359e+AC0-06']
['Books/Movies', '2.1300704813858456e+AC0-06']
['Religion', '0']
['Romance', '0.0005935134']
['Swearing', '3.217031124518803e+AC0-05']
['Politics', '0.0075492962']
['Music', '7.224535286344926e+AC0-05']
['School', '2.0853873920424672e+AC0-06']
['Business', '0.0056130667']
['end+AF8-with+AF8-able', '0.001345825']
['end+AF8-with+AF8-al', '0.0024110161']
['end+AF8-with+AF8-ful', '0.0013767934']
['end+AF8-with+AF8-ible', '0.0022098726']
['end+AF8-with+AF8-ic', '0.0023514306']
['end+AF8-with+AF8-ive', '0.0037701555']
['end+AF8-with+AF8-less', '0.0010593697']
['end+AF8AXw-with+AF8-ly', '7.89403499813603e+AC0-05']
['end+AF8-with+AF8-ous', '9.940547915993254e+AC0-05']
['sorry+AF8-word', '5.662225052323463e+AC0-05']
['Starting+AF8-with+AF8-Apolog', '0.0003042999']
['Help+ACE-', '0.0003773039']
['I understand', '0.0001320813']
['+ACI-Attention', ' please+ACE-+ACI-', '0']
['+ACI-Ok', ' I see+ACI-', '0']
['Damn+ACE-', '0']
['How sweet+ACE-', '2.0201595541210387e+AC0-06']
["That's too bad", '0']
['Come on+ACE-', '0.0014614134']
['Whatever', '0']
["That's bad", '0']
["It's cold", '0']
["That's dumb", '0']
['Help+ACE-', '0.0003773039']
['Oh no+ACE-', '0']
['What?', '0']
['Is that right?', '0']
['Disgusting', '0.0001809821']
['This is hopeless', '0']
['Really?', '0.0004255353']
["I'm angry", '0']
['I wonder', '0']
["I don't like this", '0']
['Really?', '0.0004255353']
["Let's celebrate+ACE-", '0']
['Disgusting', '0.0001809821']
["I don't know", '0']
['Yes', '0']
['Lovely', '0.0001642891']
["I'm so evil+ACE-", '0']
['No', '0.0005265143']
['+ACI-No', " it isn't+ACE-/Did not+ACE-+ACI-", '0']
['I see', '0.0013579047']
['Fancy+ACE-', '0']
['Wonderful+ACE-', '0.0006769606']
["I'm exerting myself", '0']
["I didn't mean to do that", '0']
['That hurts', '0']
['+ACI-Hey', ' you+ACE-+ACI-', '0']
['Oh no...', '0']
['It stinks+ACE-', '0']
["That's nothing", '0']
['That was close+ACE-', '0.0010088511']
['+ACI-Whispering+ACE-Hey', ' you+ACE-+ACI-', '0']
["I can't believe this+ACE-", '0']
['Be quiet', '0']
['Go away', '0']
['Disappointing', '0']
['Yes', '0']
['Oh no+ACE-', '0']
['No', '0.0005265143']
['+ACI-Wait', " I'm thinking+ACI-", '0']
['This is fun+ACE-', '0.0016848004']
['Unbelievable+ACE-', '0']
['Amazing+ACE-', '0.0005387173']
["Let's celebrate+ACE-", '0']
['Yes+ACE-', '0.0003639123']
['Yes+ACE-', '0.0003639123']
["I'm excited+ACE-", '0']
['Hey you+ACE-', '0']
['+ACI-Yes', ' it is+ACE- Or Did so+ACE-+ACI-', '0']
['Disgusting+ACE-', '0']
['+ACI-Haha', ' well said+ACE-+ACI-', '0']
["+ACI-+AFs-'CD'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0003041055']
["+ACI-+AFs-'DT'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0003041055']
["+ACI-+AFs-'EX'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0002694121']
["+ACI-+AFs-'JJ'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0003041055']
["+ACI-+AFs-'JJR'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0003427676']
["+ACI-+AFs-'JJS'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0029483492']
["+ACI-+AFs-'NN'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0001060578']
["+ACI-+AFs-'RBR'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0002081336']
["+ACI-+AFs-'VBD'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0001925713']
["+ACI-+AFs-'VBG'", " 'CC'", " 'CD'+AF0-+ACI-", '9.068096527379936e+AC0-06']
["+ACI-+AFs-'VBN'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0011585203']
["+ACI-+AFs-'VBP'", " 'CC'", " 'CD'+AF0-+ACI-", '0.000277034']
["+ACI-+AFs-'VBZ'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0009891459']
["+ACI-+AFs-'WDT'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0015045239']
["+ACI-+AFs-'WP'", " 'CC'", " 'CD'+AF0-+ACI-", '0.0013222853']
["+ACI-+AFs-'CD'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0002694121']
["+ACI-+AFs-'DT'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0003605792']
["+ACI-+AFs-'EX'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0003605792']
["+ACI-+AFs-'JJ'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0003605792']
["+ACI-+AFs-'JJR'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0003623141']
["+ACI-+AFs-'JJS'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0028113198']
["+ACI-+AFs-'NN'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0002380414']
["+ACI-+AFs-'RBR'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0002524272']
["+ACI-+AFs-'VBD'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0002694121']
["+ACI-+AFs-'VBG'", " 'CD'", " 'DT'+AF0-+ACI-", '6.8763849316489e+AC0-07']
["+ACI-+AFs-'VBN'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0013536177']
["+ACI-+AFs-'VBP'", " 'CD'", " 'DT'+AF0-+ACI-", '0.000228122']
["+ACI-+AFs-'VBZ'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0011060342']
["+ACI-+AFs-'WDT'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0018250935']
["+ACI-+AFs-'WP'", " 'CD'", " 'DT'+AF0-+ACI-", '0.0013798266']
["+ACI-+AFs-'CD'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0005956604']
["+ACI-+AFs-'DT'", " 'DT'", " 'EX'+AF0-+ACI-", '0.000739265']
["+ACI-+AFs-'EX'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0005909166']
["+ACI-+AFs-'JJ'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0005909166']
["+ACI-+AFs-'JJR'", " 'DT'", " 'EX'+AF0-+ACI-", '0.000548832']
["+ACI-+AFs-'NN'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0007274935']
["+ACI-+AFs-'RBR'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0006394694']
["+ACI-+AFs-'VBD'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0003994929']
["+ACI-+AFs-'VBG'", " 'DT'", " 'EX'+AF0-+ACI-", '0.000165679']
["+ACI-+AFs-'VBN'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0007330517']
["+ACI-+AFs-'VBP'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0003164934']
["+ACI-+AFs-'VBZ'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0006952293']
["+ACI-+AFs-'WDT'", " 'DT'", " 'EX'+AF0-+ACI-", '0.0009373422']
["+ACI-+AFs-'CD'", " 'IN'", " 'JJ'+AF0-+ACI-", '2.3855039963693414e+AC0-05']
["+ACI-+AFs-'DT'", " 'IN'", " 'JJ'+AF0-+ACI-", '0.0004035301']
["+ACI-+AFs-'EX'", " 'IN'", " 'JJ'+AF0-+ACI-", '3.612301342137506e+AC0-05']
["+ACI-+AFs-'JJ'", " 'IN'", " 'JJ'+AF0-+ACI-", '0.0001610457']
mycode is as below:
with open("Dataset/MALE_Training/MI_score_Male.csv","r")as finp :
fv_raeder = csv.reader(finp)
heading = fv_raeder.next()
for row in fv_raeder :
print row
what is this +ACI-+AFs
, 'EX'+AF0-+ACI
etc are mean? how can I remove this? Can you suggest some solution for my problem?
csv file contains:
Conversation 0.0010855412
At Home 0.0023042115
Family 0.0001275907
Time 0.0005935242
Work 0.0012768792
Past Actions 0.0001357854
Games 0.0032438747
Internet 0.0008639338
Location 0.0001233796
Fun 0.0035238147
Food/Clothes 0.0080727641
Poetic 4.691183298570359e-06
Books/Movies 2.1300704813858456e-06
Religion 0
Romance 0.0005935134
Swearing 3.217031124518803e-05
Politics 0.0075492962
Music 7.224535286344926e-05
School 2.0853873920424672e-06
Business 0.0056130667
end_with_able 0.001345825
end_with_al 0.0024110161
end_with_ful 0.0013767934
end_with_ible 0.0022098726
end_with_ic 0.0023514306
end_with_ive 0.0037701555
end_with_less 0.0010593697
end__with_ly 7.89403499813603e-05
end_with_ous 9.940547915993254e-05
sorry_word 5.662225052323463e-05
Starting_with_Apolog 0.0003042999
Help! 0.0003773039
I understand 0.0001320813
Attention, please! 0
Ok, I see 0
Damn! 0
How sweet! 2.0201595541210387e-06
That's too bad 0
Come on! 0.0014614134
Whatever 0
That's bad 0
It's cold 0
That's dumb 0
Help! 0.0003773039
Oh no! 0
What? 0
Is that right? 0
Disgusting 0.0001809821
This is hopeless 0
Really? 0.0004255353
I'm angry 0
I wonder 0
I don't like this 0
Really? 0.0004255353
Let's celebrate! 0
Disgusting 0.0001809821
I don't know 0
Yes 0
Lovely 0.0001642891
I'm so evil! 0
No 0.0005265143
No, it isn't!/Did not! 0
I see 0.0013579047
Fancy! 0
Wonderful! 0.0006769606
I'm exerting myself 0
I didn't mean to do that 0
That hurts 0
Hey, you! 0
Oh no... 0
It stinks! 0
That's nothing 0
That was close! 0.0010088511
Whispering!Hey, you! 0
I can't believe this! 0
Be quiet 0
Go away 0
Disappointing 0
Yes 0
Oh no! 0
No 0.0005265143
Wait, I'm thinking 0
This is fun! 0.0016848004
Unbelievable! 0
Amazing! 0.0005387173
Let's celebrate! 0
Yes! 0.0003639123
Yes! 0.0003639123
I'm excited! 0
Hey you! 0
Yes, it is! Or Did so! 0
Disgusting! 0
Haha, well said! 0
['CD', 'CC', 'CD'] 0.0003041055
['DT', 'CC', 'CD'] 0.0003041055
['EX', 'CC', 'CD'] 0.0002694121
['JJ', 'CC', 'CD'] 0.0003041055
['JJR', 'CC', 'CD'] 0.0003427676
['JJS', 'CC', 'CD'] 0.0029483492
['NN', 'CC', 'CD'] 0.0001060578
['RBR', 'CC', 'CD'] 0.0002081336
['VBD', 'CC', 'CD'] 0.0001925713
['VBG', 'CC', 'CD'] 9.068096527379936e-06
['VBN', 'CC', 'CD'] 0.0011585203
['VBP', 'CC', 'CD'] 0.000277034
['VBZ', 'CC', 'CD'] 0.0009891459
['WDT', 'CC', 'CD'] 0.0015045239
['WP', 'CC', 'CD'] 0.0013222853
['CD', 'CD', 'DT'] 0.0002694121
['DT', 'CD', 'DT'] 0.0003605792
['EX', 'CD', 'DT'] 0.0003605792
['JJ', 'CD', 'DT'] 0.0003605792
['JJR', 'CD', 'DT'] 0.0003623141
['JJS', 'CD', 'DT'] 0.0028113198
['NN', 'CD', 'DT'] 0.0002380414
['RBR', 'CD', 'DT'] 0.0002524272
['VBD', 'CD', 'DT'] 0.0002694121
['VBG', 'CD', 'DT'] 6.8763849316489e-07
['VBN', 'CD', 'DT'] 0.0013536177
['VBP', 'CD', 'DT'] 0.000228122
['VBZ', 'CD', 'DT'] 0.0011060342
['WDT', 'CD', 'DT'] 0.0018250935
['WP', 'CD', 'DT'] 0.0013798266
['CD', 'DT', 'EX'] 0.0005956604
['DT', 'DT', 'EX'] 0.000739265
['EX', 'DT', 'EX'] 0.0005909166
['JJ', 'DT', 'EX'] 0.0005909166
['JJR', 'DT', 'EX'] 0.000548832
['NN', 'DT', 'EX'] 0.0007274935
['RBR', 'DT', 'EX'] 0.0006394694
['VBD', 'DT', 'EX'] 0.0003994929
['VBG', 'DT', 'EX'] 0.000165679
['VBN', 'DT', 'EX'] 0.0007330517
['VBP', 'DT', 'EX'] 0.0003164934
['VBZ', 'DT', 'EX'] 0.0006952293
['WDT', 'DT', 'EX'] 0.0009373422
['CD', 'IN', 'JJ'] 2.3855039963693414e-05
['DT', 'IN', 'JJ'] 0.0004035301
['EX', 'IN', 'JJ'] 3.612301342137506e-05
['JJ', 'IN', 'JJ'] 0.0001610457
['JJR', 'IN', 'JJ'] 0.0001610457
['JJS', 'IN', 'JJ'] 0.0033707076
['NN', 'IN', 'JJ'] 0.0007419509
['RBR', 'IN', 'JJ'] 7.329654519463405e-08
['VBD', 'IN', 'JJ'] 5.950344732515763e-05
['VBG', 'IN', 'JJ'] 0.0007109534
['VBN', 'IN', 'JJ'] 0.0006920679
['VBP', 'IN', 'JJ'] 3.6209960493105995e-05
['VBZ', 'IN', 'JJ'] 0.0018789858
['WDT', 'IN', 'JJ'] 0.0037640204
['WP', 'IN', 'JJ'] 0.0011824782
['CD', 'JJ', 'JJR'] 0.003625445
['DT', 'JJ', 'JJR'] 0.0028113198
['VB', 'WDT'] 0.0013284551
['VBD', 'WDT'] 0.0005987498
['VBG', 'WDT'] 0.0017245536
['VBN', 'WDT'] 0.0008969116
['VBP', 'WDT'] 0.0016863023
['VBZ', 'WDT'] 0.0020814437
CC 5.111969137490955e-05
CD 0.0004017943
DT 0
EX 0.0005909166
IN 0
JJ 0.0001610457
JJR 0.003469816
JJS 0.0010446082
CC 5.111969137490955e-05
CD 0.0004017943
DT 0
EX 0.0005909166
IN 0
JJ 0.0001610457
JJR 0.003469816
JJS 0.0010446082
MD 0.0006791731
NN 0
NNS 5.149809586568776e-05
PRP 0.0014294222
RB 0.0004972152
RBR 0.0007214601
RP 0.0001474277
TO 1.8888999939422496e-05
VB 3.6038046948563265e-05
VBD 0.0011007132
VBG 0.0006148222
VBN 2.3115793700478806e-07
VBP 0.0016502422
VBZ 0.0032336121
WDT 0.0011788791
WP 3.35853998744088e-05
WRB 2.4547859289217235e-05
Upvotes: 0
Views: 822
Reputation: 97565
As @tripleee mentioned, that's UTF-7 encoding. By using codecs.open
, you can decode it in the background:
from codecs import open
with open("Dataset/MALE_Training/MI_score_Male.csv", encoding='utf-7') as finp:
csv_reader = csv.reader(finp)
heading = next(csv_reader)
for row in csv_reader:
print row
Upvotes: 1