Reputation: 886
In my Java web service I have implemented a WebServiceProvider and I am trying to get the original request sent by the client. The problem is that I get unreadable characters such as <Applicant_Place_Born>&#208;&#156;&#208;&#190;&#209;&#129;&#208;&#186;&#208;&#178;&#208;&#176;</Applicant_Place_Born> inside the XML tags of the SOAP message body instead of normal Cyrillic letters, and I am looking for a way to fix this. I could probably use the <Source> generic type instead of <SOAPMessage>, but I don't know how to turn it into bytes.
Q1: Is it possible to get the client's request as the original array of bytes (raw binary data) so that I can decode it manually?
Q2: Is there a direct way to fix the wrong characters by specifying the character set used to decode the SOAP message?
My current code is given below:
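import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPMessage;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.ws.BindingType;
import javax.xml.ws.Provider;
import javax.xml.ws.ServiceMode;
import javax.xml.ws.WebServiceProvider;

@WebServiceProvider(
    portName="SoaprequestImplPort",
    serviceName="services/soaprequest",
    targetNamespace="http://tempuri.org/soaprequest",
    wsdlLocation="/wsdl/SoaprequestImpl.wsdl"
)
@BindingType(value="http://schemas.xmlsoap.org/wsdl/soap/http")
@ServiceMode(value=javax.xml.ws.Service.Mode.MESSAGE)
public class SoaprequestImpl implements Provider<SOAPMessage> {
    private static final String hResponse = "<soapenv:Envelope xmlns:soapenv=\\";

    public SOAPMessage invoke(SOAPMessage req) {
        // Dump the incoming request to a file for inspection
        getSOAPMessage(req);
        SOAPMessage res = null;
        try {
            res = makeSOAPMessage(hResponse);
        } catch (Exception e) {
            System.out.println("Exception: occurred " + e);
        }
        return res;
    }

    // Serializes the message, writes it to a file and returns it as text
    private String getSOAPMessage(SOAPMessage msg) {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             OutputStream outputStream = new FileOutputStream("/opt/data/tomcat/end.txt")) {
            msg.writeTo(baos);
            baos.writeTo(outputStream);
            // Assumes the message was written as UTF-8, the SAAJ default
            return baos.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    // Builds a SOAPMessage from its XML text
    private SOAPMessage makeSOAPMessage(String msg) {
        try {
            MessageFactory factory = MessageFactory.newInstance();
            SOAPMessage message = factory.createMessage();
            message.getSOAPPart().setContent((Source) new StreamSource(new StringReader(msg)));
            message.saveChanges();
            return message;
        } catch (Exception e) {
            return null;
        }
    }
}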
Upvotes: 0
Views: 86
Reputation: 33223
What you have shown is just the UTF-8 encoded representation of "Москва". Your SOAP data is most likely in an XML document that has <?xml version='1.0' encoding='UTF-8' ?> at the top, which indicates that the content is encoded using UTF-8. To turn such data back into Unicode you need to decode it. You also have some HTML escapes in there, so you must unescape those first. I used Tcl to test this:
# The original string reported
set s "&#208;&#156;&#208;&#190;&#209;&#129;&#208;&#186;&#208;&#178;&#208;&#176;"
# substituting the html escapes
set t "Ð\x9cÐ¾Ñ\x81ÐºÐ²Ð°"
# decode from utf-8 into Unicode
encoding convertfrom utf-8 "Ð\x9cÐ¾Ñ\x81ÐºÐ²Ð°"
Москва
So your SOAP information is probably fine, but you most likely need to deal with the HTML escapes before anything tries to decode the string from UTF-8.
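Since the service in the question is written in Java, the same two steps (unescape, then decode as UTF-8) could look roughly like the sketch below. The class name EscapedUtf8Decoder and the method decode are made up for illustration and are not part of any SOAP API. The sketch assumes every escape is a decimal reference in the range 0-255 standing for one raw byte, and that all remaining text is plain ASCII/Latin-1.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JAX-WS or SAAJ.
public class EscapedUtf8Decoder {
    // Matches decimal numeric character references such as &#208;
    private static final Pattern NUMERIC_REF = Pattern.compile("&#(\\d{1,3});");

    /**
     * Turns each &#NNN; reference into one raw byte, copies the text in
     * between as Latin-1 bytes, then decodes the whole byte array as UTF-8.
     */
    public static String decode(String escaped) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Matcher m = NUMERIC_REF.matcher(escaped);
        int pos = 0;
        while (m.find()) {
            // Literal text before this reference (tags, ASCII content)
            copyLatin1(bytes, escaped.substring(pos, m.start()));
            int value = Integer.parseInt(m.group(1));
            if (value > 255) {
                // Assumption of this sketch: references stand for single bytes
                throw new IllegalArgumentException("Not a byte value: " + value);
            }
            bytes.write(value);
            pos = m.end();
        }
        // Trailing literal text after the last reference
        copyLatin1(bytes, escaped.substring(pos));
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    private static void copyLatin1(ByteArrayOutputStream out, String text) {
        byte[] raw = text.getBytes(StandardCharsets.ISO_8859_1);
        out.write(raw, 0, raw.length);
    }

    public static void main(String[] args) {
        String reported = "<Applicant_Place_Born>&#208;&#156;&#208;&#190;&#209;&#129;"
                + "&#208;&#186;&#208;&#178;&#208;&#176;</Applicant_Place_Born>";
        System.out.println(decode(reported));
    }
}
```

Applied to the string from the question, decode returns the element with "Москва" as its content. Whether the console prints it correctly depends on the terminal's own encoding.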
Upvotes: 1