Reputation: 431
I'm using Apache Tika in server mode. I need to develop a Java REST client for parsing files. For PDF file upload I'm using this code:
fileBody = new FileBody(file, "application/pdf");
multiPartEntity.addPart("uploaded_file", fileBody);
pdfPutRequest.setEntity(multiPartEntity);
response = client.execute(pdfPutRequest);
This uses the apache.http library. Now I'm trying to develop the docx part, but I don't know which MIME type I need to provide (application/docx gives me an error). Without a MIME type I receive the exception "Unsupported Media Type" from the Tika server. So which type do I need to provide, and do I need to make any other changes?
Solved!
Upvotes: 0
Views: 1184
Reputation: 431
I found the solution:
HttpPost docxPutRequest = new HttpPost(url);
docxPutRequest.setHeader("Accept", "text/plain");
MultipartEntity multiPartEntity = new MultipartEntity();
FileBody fileBody = new FileBody(file);
multiPartEntity.addPart("uploaded_file", fileBody);
docxPutRequest.setEntity(multiPartEntity);
response = client.execute(docxPutRequest);
Maybe this will help someone.
Upvotes: 1
Reputation: 48346
The official MIME type for .docx files is
application/vnd.openxmlformats-officedocument.wordprocessingml.document
If you use the Tika CLI tool in --detect mode, it can tell you that.
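If you would rather hard-code the type on the client side, a small lookup is enough. The following is only a sketch: the class and method names (OoxmlMimeTypes, forFileName) are hypothetical, and the .xlsx and .pptx entries are added for illustration alongside the .docx type given above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps common Office Open XML extensions to their MIME types.
final class OoxmlMimeTypes {
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");

    // Returns the MIME type for the file name's extension, or null if it is not in the table.
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(extension);
    }
}
```

A null result is a reasonable signal to send the file without a type hint and let Tika detect it.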
Alternatively, the Tika Server has a detection mode available, as documented in the Tika Server wiki.
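As a rough sketch of calling that detection mode from Java, the request below PUTs the raw file bytes to the server. It assumes Java 11+ (java.net.http), a server base URL you supply yourself (for example http://localhost:9998 on a default local install), and that your Tika Server version exposes detection at /detect/stream; check the wiki for your version. The class name TikaDetectRequest is hypothetical.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Path;

// Hypothetical helper: builds a detection request for a Tika Server instance.
final class TikaDetectRequest {
    // serverBase is an assumption, e.g. "http://localhost:9998" (no trailing slash).
    static HttpRequest build(String serverBase, Path file) throws FileNotFoundException {
        return HttpRequest.newBuilder(URI.create(serverBase + "/detect/stream"))
            // The file is sent as the raw request body, not as a multipart form.
            .PUT(HttpRequest.BodyPublishers.ofFile(file))
            .build();
    }
}
```

To run it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the response body should then contain the detected MIME type as plain text.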
Finally, Tika will auto-detect the MIME type for you if none is given. See the text extraction part of the Tika Server docs for info on giving or not giving a MIME type hint with your file.
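Both variants, with and without a hint, can be sketched in one helper. This is not the code from the question: it uses the JDK's java.net.http client (Java 11+) instead of apache.http, sends the file as the raw PUT body to /tika rather than as a multipart form, and the class name TikaTextRequest and the serverBase parameter are hypothetical.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Path;

// Hypothetical helper: builds a text extraction request for a Tika Server instance.
final class TikaTextRequest {
    // mimeTypeHint may be null, in which case no Content-Type is sent
    // and the server is left to detect the type itself.
    static HttpRequest build(String serverBase, Path file, String mimeTypeHint)
            throws FileNotFoundException {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(serverBase + "/tika"))
            .header("Accept", "text/plain") // ask for plain text output
            .PUT(HttpRequest.BodyPublishers.ofFile(file));
        if (mimeTypeHint != null) {
            builder.header("Content-Type", mimeTypeHint);
        }
        return builder.build();
    }
}
```

For a .docx file the hint would be application/vnd.openxmlformats-officedocument.wordprocessingml.document; passing null corresponds to the "no hint" case described above.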
Upvotes: 0