Reputation: 7008
I have a class that detects MIME types with Apache Tika. I am trying to add a new MIME type (for .properties files) and have Tika detect it.
This is my class file:
/*
* To change this license header, choose License Headers in Project Properties.
* To change this template file, choose Tools | Templates
* and open the template in the editor.
*/
package check_mime;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.tika.Tika;
import org.apache.tika.mime.MimeTypes;

public class TikaFileTypeDetector {
    private final Tika tika = new Tika();

    public TikaFileTypeDetector() {
        super();
    }

    public String probeContentType(Path path) throws IOException {
        // Check contents first
        String fileContentDetect = tika.detect(path.toFile());
        if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileContentDetect;
        }
        // Try file name only if content search was not successful
        String fileNameDetect = tika.detect(path.toString());
        if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileNameDetect;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            printUsage();
            return;
        }
        Path path = Paths.get(args[0]);
        TikaFileTypeDetector detector = new TikaFileTypeDetector();
        String contentType = detector.probeContentType(path);
        System.out.println("File is of type - " + contentType);
    }

    public static void printUsage() {
        System.out.print("Usage: java -classpath ... "
                + TikaFileTypeDetector.class.getName()
                + " ");
    }
}
Following the docs, I have created a custom XML file:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
<mime-type type="text/properties">
<glob pattern="*.properties"/>
</mime-type>
</mime-info>
How do I add this file to my program so that Tika reads it? Do I have to create a parser? I'm stuck here.
Upvotes: 9
Views: 7786
Reputation: 1979
In your resources folder, add the package org\apache\tika\mime and create the file custom-mimetypes.xml in it.
Put the following code in that file:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
<mime-type type="custom-mime-type">
<glob pattern="*.custom-extension"/>
</mime-type>
</mime-info>
Replace custom-mime-type with your mime type and custom-extension with your extension.
The resulting directory structure is resources/org/apache/tika/mime/custom-mimetypes.xml.
By the way, you can also load Tika's standard mime types locally by downloading Tika's own mime-types file (tika-mimetypes.xml) and placing it alongside custom-mimetypes.xml. This is only useful when you need to change the standard Tika mime types. One thing to remember: you cannot define the same mime type or extension in both XML files.
Upvotes: 4
Reputation: 1
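TikaConfig config = TikaConfig.getDefaultConfig();
Detector detector = config.getDetector();
Metadata metadata = new Metadata();
// "stream" is the input to inspect; it must support mark/reset (e.g. a TikaInputStream)
MediaType mediaType = detector.detect(stream, metadata);
System.out.println("Detected Media Type: " + mediaType.toString());
MimeType mimeType = config.getMimeRepository().forName(mediaType.toString());
String extension = mimeType.getExtension();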
Upvotes: -3
Reputation: 48326
This is covered in the Apache Tika 5 minute parser instructions. To add support for Java .properties files, you should first create a file called custom-mimetypes.xml and populate it with something like:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
<mime-type type="text/properties">
<_comment>Java Properties</_comment>
<glob pattern="*.properties"/>
<sub-class-of type="text/plain"/>
</mime-type>
</mime-info>
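Before wiring the file into Tika, it can help to confirm that it is well-formed and declares the types and globs you expect. The following is only a sanity-check sketch using the JDK's DOM parser; it is not how Tika itself reads the file, and the class name MimeInfoSummary and its SAMPLE constant are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: summarises a mime-info XML file as "type -> glob patterns".
final class MimeInfoSummary {

    // Same content as the custom-mimetypes.xml shown above.
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<mime-info>"
            + "<mime-type type=\"text/properties\">"
            + "<_comment>Java Properties</_comment>"
            + "<glob pattern=\"*.properties\"/>"
            + "<sub-class-of type=\"text/plain\"/>"
            + "</mime-type>"
            + "</mime-info>";

    /** Parses the XML (throws if it is not well-formed) and collects globs per mime type. */
    static Map<String, List<String>> globsByType(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList types = doc.getElementsByTagName("mime-type");
        for (int i = 0; i < types.getLength(); i++) {
            Element type = (Element) types.item(i);
            List<String> globs = new ArrayList<>();
            NodeList globNodes = type.getElementsByTagName("glob");
            for (int j = 0; j < globNodes.getLength(); j++) {
                globs.add(((Element) globNodes.item(j)).getAttribute("pattern"));
            }
            result.put(type.getAttribute("type"), globs);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        System.out.println(globsByType(in)); // prints {text/properties=[*.properties]}
    }
}
```

To check a real file instead of the embedded sample, pass a stream opened on your own custom-mimetypes.xml to globsByType.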
Next, you need to put that file somewhere Tika can find it, under the right name. It must be stored as org/apache/tika/mime/custom-mimetypes.xml on your classpath. The easiest thing to do is to create that directory structure, move the new file in, then add the root directory to your classpath. For deployment, you should wrap that up into a jar and put it on the classpath.
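If you are unsure whether the file really ends up on the classpath under that name, a plain resource lookup can tell you without involving Tika at all. This is a minimal sketch using only the JDK; CustomMimeTypesLocator is a hypothetical class name, and the lookup only shows what a given class loader can see, not what Tika has actually loaded.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a resource.
final class CustomMimeTypesLocator {

    // Resource name required by Tika, as described above.
    static final String RESOURCE = "org/apache/tika/mime/custom-mimetypes.xml";

    /** Returns all URLs under which the given class loader can see the resource. */
    static List<URL> locate(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = CustomMimeTypesLocator.class.getClassLoader();
        List<URL> hits = locate(loader, RESOURCE);
        if (hits.isEmpty()) {
            System.out.println("Not found on classpath: " + RESOURCE);
        } else {
            // More than one hit means several jars or directories ship the same file.
            for (URL hit : hits) {
                System.out.println("Found: " + hit);
            }
        }
    }
}
```

Run it with the same -classpath value you use for your application; an empty result usually means the directory structure or file name is off.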
You can use the Tika App to check that your mime type file was loaded, if you're careful. With your code packaged as a jar, run it as something like:
java -classpath tika-app-1.10-SNAPSHOT.jar:my-custom-mimetypes.jar org.apache.tika.cli.TikaCLI --list-supported-types | grep text/properties
Alternatively, if you have it in a local directory, try something like:
ls -l org/apache/tika/mime/custom-mimetypes.xml
# Check a file was found, with some content in it
java -classpath tika-app-1.10-SNAPSHOT.jar:. org.apache.tika.cli.TikaCLI --list-supported-types | grep text/properties
If that isn't showing your mime type, then you didn't get the path or filename correct, so double-check them.
(Alternately, upgrade to a newer version of Apache Tika, as since r1686315 Tika has a Java Properties mimetype built in!)
Upvotes: 7
Reputation: 32980
Tika will detect your custom definition via Java resource loading and automatically add it to its own definitions. For that, you need to name it custom-mimetypes.xml and put it into the package org.apache.tika.mime within your codebase.
If you create a jar file from your classes, you also need to include your custom-mimetypes.xml in the jar.
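Build tools sometimes leave non-class resources out of the jar, so it may be worth checking the packaged artifact directly. The sketch below assumes nothing beyond the JDK's java.util.jar API; JarEntryCheck is a hypothetical class name, and the jar path is whatever you pass on the command line.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper: checks whether a jar contains a given entry.
final class JarEntryCheck {

    // Entry name Tika looks for, as described above.
    static final String ENTRY = "org/apache/tika/mime/custom-mimetypes.xml";

    /** Returns true if the jar at the given path contains the named entry. */
    static boolean containsEntry(Path jar, String entryName) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java JarEntryCheck <path-to-jar>");
            return;
        }
        Path jar = Paths.get(args[0]);
        System.out.println(containsEntry(jar, ENTRY)
                ? "Jar contains " + ENTRY
                : "Jar is missing " + ENTRY);
    }
}
```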
Upvotes: 1