user2244536

Reputation: 67

Change encoding of a file with grails

In my view I have an upload form:

<input type="file" name="file" value="search file" /><br />

In my controller I load it like this:

def file = request.getFile('file')
def f = file.getInputStream()
def input = f.getText()

So now I have a String called input containing the content of the file.

I want it in UTF-8. How is this possible?

Edit:

My problem is that the file to be uploaded is encoded in "Windows-1252", so German characters like äöü come out wrong in the string called "input". If I convert the file to UTF-8 with "Notepad++" and then upload it, it works. But I can't do that every time.

Edit2:

def file = request.getFile('file')                      //get file from view
def File tmpfile = new File('C:/tmp/tmpfile.txt')       //create temporary file
file.transferTo(tmpfile)                                //copy into tmpfile
CharsetToolkit toolkit = new CharsetToolkit(tmpfile)    //toolkit with tmpfile
def charset = toolkit.getCharset()                      //save charset in a variable
def input = tmpfile.getText(charset)                    //get text with right charset

I tried this with a few different documents, but the variable charset is always UTF_8.

Upvotes: 0

Views: 1546

Answers (2)

user2244536

Reputation: 67

I found a solution:

I used a Java library called jUniversalChardet and wrote the following method:

import org.mozilla.universalchardet.UniversalDetector

String getEncoding(InputStream inputstream) {
    byte[] buf = new byte[4096]
    UniversalDetector detector = new UniversalDetector(null)

    // feed the detector chunk by chunk until it is confident or the stream ends
    int nread
    while ((nread = inputstream.read(buf)) > 0 && !detector.isDone()) {
        detector.handleData(buf, 0, nread)
    }
    detector.dataEnd()

    // charset name as a String, or null if nothing was detected
    return detector.getDetectedCharset()
}

In my code I now have the following:

def file = request.getFile('file')
def f = file.getInputStream()
def encoding = getEncoding(file.getInputStream())
def input = f.getText(encoding)

And it works :)
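One caveat: getDetectedCharset() returns null when the detector cannot decide, so a fallback may be worth having. The following is only a sketch under assumptions: decodeUpload is a hypothetical helper name, and 'windows-1252' is chosen as the fallback only because the question says the uploads come in that encoding.

```groovy
import java.nio.charset.Charset

// Hypothetical helper (not part of Grails or jUniversalChardet):
// decode the bytes with the detected charset, or with a fallback
// when detection returned null or a name this JVM does not support.
String decodeUpload(byte[] data, String detected, String fallback = 'windows-1252') {
    String name = (detected && Charset.isSupported(detected)) ? detected : fallback
    return new String(data, name)
}

// Stand-in for an uploaded file: the text "äöü" stored as Windows-1252 bytes
byte[] sample = '\u00E4\u00F6\u00FC'.getBytes('windows-1252')

// No detection result: the fallback charset is used
String viaFallback = decodeUpload(sample, null)

// Detection result available: it is used as-is
String viaDetected = decodeUpload(sample, 'WINDOWS-1252')
```

In the controller this could be called as decodeUpload(file.bytes, encoding), which also avoids opening the upload's input stream twice.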

Upvotes: 1

dmahapatro

Reputation: 50265

You can use getText(String charset):

def input = f.getText('UTF-8')
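Note that the charset passed to getText names the encoding the bytes are already in, not the encoding you want to end up with. For the Windows-1252 files described in the question, that would be 'windows-1252'. Here is a small sketch of the difference, using an in-memory stream as a stand-in for the upload (the sample bytes are an assumption for illustration):

```groovy
import java.nio.charset.StandardCharsets

// Stand-in for the uploaded file: "äöü" stored as Windows-1252 bytes
byte[] uploaded = '\u00E4\u00F6\u00FC'.getBytes('windows-1252')

// Decoding with the charset the file was written in gives the right text
String input = new ByteArrayInputStream(uploaded).getText('windows-1252')

// Decoding the same bytes as UTF-8 does not, because they are not valid UTF-8
String garbled = new ByteArrayInputStream(uploaded).getText('UTF-8')

// Once decoded correctly, the String can be written out as UTF-8 wherever needed
byte[] utf8 = input.getBytes(StandardCharsets.UTF_8)
```

A String in Groovy has no encoding of its own; UTF-8 only comes into play when the text is turned back into bytes, as in the last line.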

Upvotes: 3
