elahehab

Reputation: 335

Convert utf8 to windows-1256

I have two files: one is in UTF-8 and the other, I think, is in windows-1256. I want to unify their encodings (one is the training set and the other is the test set).

utf-8 file:

سلمانی را به توافق بگیر
وقتی یک مرد محترم شصت ساله ، در یک جامه قهوه‌ای رسمی ، خوش لباس ، ولی خیلی خوب نگه داشته

windows-1256 file:

äÇåí Èå äãÇíÔÇå ÂËÇÑ åäÑí ÇÍãÏ ØÈÇØÈÇíí 
ãæÖæÚ ÂËÇÑ ØÈÇØÈÇíí ãæÑÇä åÓÊäÏ æáí ÏÑ ÈÇØä äíä ÙÇåÑí¡ Çíä 

I tried several online tools, but when I convert the UTF-8 file to windows-1256 the result looks completely different from the other file, and when I convert the windows-1256 file to UTF-8 it doesn't change at all!

Upvotes: 0

Views: 2244

Answers (1)

elahehab

Reputation: 335

The problem is solved. I used this command:

iconv -f UTF-8 -t WINDOWS-1256//TRANSLIT --output=Ham.txt Ham-utf
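For anyone who would rather do this from Python than from iconv, a rough equivalent might look like the sketch below. The function name `convert_utf8_to_cp1256` is made up for illustration, and `errors="replace"` is only an approximation of `//TRANSLIT`: Python has no built-in transliteration, so characters with no windows-1256 mapping are written as `?` here instead of being approximated.

```python
def convert_utf8_to_cp1256(src_path, dst_path, errors="replace"):
    """Re-encode a UTF-8 text file as windows-1256, one line at a time.

    Hypothetical helper, not part of any library. errors="replace" is an
    assumption standing in for iconv's //TRANSLIT: unmappable characters
    become "?" rather than a look-alike character.
    """
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="cp1256", errors=errors, newline="") as dst:
        for line in src:  # streaming keeps memory use flat for big files
            dst.write(line)
```

With the file names from the command above, the call would be `convert_utf8_to_cp1256("Ham-utf", "Ham.txt")`. Passing `errors="strict"` instead makes the conversion fail loudly on the first character that windows-1256 cannot represent.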

The problem was that my windows-1256 file was very large, so I had copied part of it into a separate file named ham-mini. Copying part of it was what damaged the file. When I used the command above with the original file instead, the problem was solved.
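If a smaller sample of a big file is still needed, one way to avoid this kind of damage is to cut the sample at the byte level, so no editor or clipboard gets a chance to re-encode it. The sketch below is only an illustration under that assumption (I never confirmed exactly how the copy got corrupted); `copy_head_bytes` and `is_valid_utf8` are hypothetical helper names.

```python
from itertools import islice


def copy_head_bytes(src_path, dst_path, max_lines=1000):
    """Copy the first max_lines lines byte-for-byte, with no decoding step."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Iterating a binary file yields lines split on b"\n".
        dst.writelines(islice(src, max_lines))


def is_valid_utf8(path):
    """Return True if the whole file decodes as UTF-8.

    A windows-1256 file containing Arabic-script text will normally fail
    this check, so a True result for the sample hints that it was
    re-encoded somewhere along the way.
    """
    try:
        with open(path, "r", encoding="utf-8", newline="") as f:
            for _ in f:
                pass
        return True
    except UnicodeDecodeError:
        return False
```

For example, `copy_head_bytes("Ham.txt", "ham-mini", max_lines=500)` followed by `is_valid_utf8("ham-mini")` should give the same answer as `is_valid_utf8("Ham.txt")`, since the sample holds exactly the same bytes as the start of the original.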

Upvotes: 1
