griboedov

Reputation: 886

Mojibakes in SOAP message

In my Java web service I have implemented a WebServiceProvider and am trying to get the original request sent by the client. The problem is that I get unreadable characters like <Applicant_Place_Born>Ð&#156;Ð¾Ñ&#129;ÐºÐ²Ð°</Applicant_Place_Born> inside the XML tags of the SOAP message body instead of normal Cyrillic letters, so I am looking for a way to fix this. I could probably use Provider<Source> instead of Provider<SOAPMessage>, but I don't know how to turn a Source into bytes.
Q1: Is it possible to get the client's request as the original array of bytes (raw binary data) so that I can decode it manually?
Q2: Is there a direct way to fix the wrong characters by specifying the character set used to decode the SOAP message?

My current code is given below:

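import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPMessage;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.ws.BindingType;
import javax.xml.ws.Provider;
import javax.xml.ws.ServiceMode;
import javax.xml.ws.WebServiceProvider;

@WebServiceProvider(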
    portName="SoaprequestImplPort",
    serviceName="services/soaprequest",
    targetNamespace="http://tempuri.org/soaprequest",
    wsdlLocation="/wsdl/SoaprequestImpl.wsdl"
)
@BindingType(value="http://schemas.xmlsoap.org/wsdl/soap/http")
@ServiceMode(value=javax.xml.ws.Service.Mode.MESSAGE)
public class SoaprequestImpl implements Provider<SOAPMessage> {

    private static final String hResponse = "<soapenv:Envelope xmlns:soapenv=\\";

    public SOAPMessage invoke(SOAPMessage req) {
        getSOAPMessage(req);
        SOAPMessage res = null;
        try {
            res = makeSOAPMessage(hResponse);
        } catch (Exception e) {
            System.out.println("Exception: occurred " + e);
        }
        return res;
    }

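    // Dumps the incoming message to a file and returns it as a string.
    private String getSOAPMessage(SOAPMessage msg) {
        String s = null;
        // try-with-resources closes the file even if writing fails
        try (OutputStream outputStream = new FileOutputStream("/opt/data/tomcat/end.txt")) {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            msg.writeTo(baos);
            baos.writeTo(outputStream);
            // Assumes the message was serialized as UTF-8 (the usual SAAJ default)
            s = baos.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return s;
    }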

    private SOAPMessage makeSOAPMessage(String msg) {
        try {
            MessageFactory factory = MessageFactory.newInstance();
            SOAPMessage message = factory.createMessage();
            message.getSOAPPart().setContent((Source) new StreamSource(new StringReader(msg)));
            message.saveChanges();
            return message;
        } catch (Exception e) {
            return null;
        }
    }
}

Upvotes: 0

Views: 86

Answers (1)

patthoyts

Reputation: 33223

What you have shown is just the UTF-8 encoding of "Москва" with each byte displayed as a separate Latin-1 character. Your SOAP data most likely starts with <?xml version='1.0' encoding='UTF-8' ?>, which declares that the content is encoded in UTF-8. To turn such data back into Unicode you need to decode it. Two of the bytes (0x9C and 0x81) also appear as the HTML-style numeric escapes &#156; and &#129;, so you must unescape those first. I used Tcl to test this:

# The original string as reported
set s "Ð&#156;Ð¾Ñ&#129;ÐºÐ²Ð°"
# Substitute the HTML escapes with the bytes they stand for
set t "Ð\x9cÐ¾Ñ\x81ÐºÐ²Ð°"
# Decode from UTF-8 into Unicode
encoding convertfrom utf-8 $t
# => Москва
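Since the question is about Java, the same two steps can be sketched there as well. This is only an illustration under stated assumptions: the class and method names (MojibakeFix, unescapeNumericRefs, decodeUtf8Mojibake, fix) are made up for this example, only decimal references of the form &#NNN; are handled, and every remaining character is assumed to be in the Latin-1 range (anything above U+00FF would be replaced by "?" in step 2).

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any SOAP library.
class MojibakeFix {

    // Matches decimal numeric character references such as &#156;
    private static final Pattern NUMERIC_REF = Pattern.compile("&#(\\d+);");

    // Step 1: replace each &#NNN; with the single character it stands for.
    static String unescapeNumericRefs(String s) {
        Matcher m = NUMERIC_REF.matcher(s);
        StringBuffer sb = new StringBuffer(); // StringBuffer keeps this Java 8 compatible
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String ch = new String(Character.toChars(codePoint));
            m.appendReplacement(sb, Matcher.quoteReplacement(ch));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Step 2: map every char back to one byte (ISO-8859-1), then decode as UTF-8.
    static String decodeUtf8Mojibake(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String fix(String s) {
        return decodeUtf8Mojibake(unescapeNumericRefs(s));
    }

    public static void main(String[] args) {
        // The reported string, written with Unicode escapes so the source
        // file encoding cannot interfere with the demonstration.
        String reported =
            "\u00D0&#156;\u00D0\u00BE\u00D1&#129;\u00D0\u00BA\u00D0\u00B2\u00D0\u00B0";
        // Yields the six Cyrillic letters of the city name; whether they
        // display correctly depends on the console encoding.
        System.out.println(fix(reported));
    }
}
```

The order matters: if the string is decoded before the references are unescaped, the lead bytes 0xD0 and 0xD1 are left without their continuation bytes and the decoder emits replacement characters instead.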

So your SOAP information is probably fine, but you most likely need to deal with the HTML-style escapes (&#156; and &#129;) before anything tries to decode the string from UTF-8.
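On the first question (turning a Source into bytes), one possible approach is an identity transform from the standard javax.xml.transform API. The sketch below is an assumption about how that could look, not something taken from the question's code: the class name SourceBytes and the method toUtf8Bytes are hypothetical. Note that this re-serializes the XML that has already been parsed, so the result is not guaranteed to match the client's original bytes on the wire.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for a Provider<Source> implementation.
class SourceBytes {

    // Serializes a Source to UTF-8 bytes using an identity transform.
    static byte[] toUtf8Bytes(Source source) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(source, new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws TransformerException {
        // Stand-in for the Source that a Provider<Source> would receive.
        String xml = "<Applicant_Place_Born>"
            + "\u041C\u043E\u0441\u043A\u0432\u0430"
            + "</Applicant_Place_Born>";
        byte[] bytes = toUtf8Bytes(new StreamSource(new StringReader(xml)));
        System.out.println(bytes.length + " bytes");
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Writing to a byte stream with an explicit encoding keeps the charset decision in one place, which makes it easier to see whether the damage happens before or after the service receives the message.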

Upvotes: 1
