Gautam

Reputation: 7958

Parsing YAML Front matter in Java

I have to parse YAML Front Matter in Java the way Jekyll does, so I looked into the source code and found this, but I can't make much sense of it (I don't know much Ruby).

So my question is: how do I parse YAML Front Matter in Java?

I have snakeyaml on my classpath, and I will be parsing the YAML Front Matter from a Markdown file, for which I use pegdown.

Upvotes: 6

Views: 2785

Answers (3)

Skyr

Reputation: 1010

If you are just interested in the front matter, you can use SnakeYaml's loadAll method:

import java.io.InputStream;
import org.yaml.snakeyaml.Yaml;

Object yamlFrontMatter(Yaml yaml, InputStream in) {
    return yaml.loadAll(in).iterator().next();
}

SnakeYaml will only read the first YAML document (the front matter) and ignore the trailing non-YAML text.

Unfortunately, SnakeYaml has no elegant way to output the remaining text, so if you want to parse both the front matter and the body at the same time, there is no advantage in this approach :-(

Upvotes: 2

Cephalopod

Reputation: 15145

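import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

void parse(Reader r) throws IOException {
    BufferedReader br = new BufferedReader(r);

    // detect YAML front matter, skipping leading blank lines
    String line = br.readLine();
    while (line != null && line.isEmpty()) line = br.readLine();
    if (line == null || !line.matches("[-]{3,}")) { // require at least three dashes
        throw new IllegalArgumentException("No YAML Front Matter");
    }
    final String delimiter = line;

    // collect YAML front matter up to the closing delimiter
    StringBuilder sb = new StringBuilder();
    line = br.readLine();
    while (line != null && !line.equals(delimiter)) {
        sb.append(line);
        sb.append("\n");
        line = br.readLine();
    }
    if (line == null) { // end of input reached without a closing delimiter
        throw new IllegalArgumentException("Unterminated YAML Front Matter");
    }

    // parse data; both helpers are placeholders for your own YAML and Markdown handling
    parseYamlFrontMatter(sb.toString());
    parseMarkdownOrWhatever(br);
}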

To obtain a Reader, you will probably need a FileReader or an InputStreamReader.
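One way to get that Reader, as a rough sketch: the Readers class and its two method names are made up for illustration, and UTF-8 is an assumption about how your files are encoded. Passing the charset explicitly avoids depending on the platform default.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
final class Readers {

    // Opens a file as a buffered Reader, decoding it as UTF-8 (assumed encoding).
    static Reader fromFile(Path path) throws IOException {
        return Files.newBufferedReader(path, StandardCharsets.UTF_8);
    }

    // Wraps an arbitrary stream (classpath resource, upload, ...) as a UTF-8 Reader.
    static Reader fromStream(InputStream in) {
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }
}
```

Either result can be passed straight to parse(Reader), which wraps it in a BufferedReader anyway.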

Upvotes: 8

Polygnome

Reputation: 7795

Ok, since your comment clarified what your question is:

The YAML front matter is everything between the two lines of three dashes (---). It is ALWAYS at the beginning of the file.

So you just have to read the file and extract the YAML Front Matter from the start of it. You can do that either with a small automaton or with a regex. It's really up to you. The file is always structured the same way:

---
some YAML here
---
Markdown / textile / HTML contents of file
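For the regex route, here is a minimal sketch using only java.util.regex. The class name FrontMatterSplitter and its split method are made up for illustration. It assumes both delimiter lines are exactly three dashes (optionally followed by spaces or tabs) and that the whole file fits in a String.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of snakeyaml or pegdown.
final class FrontMatterSplitter {

    // \A anchors at the very start of the input.
    // Group 1 is the YAML between the delimiter lines, group 2 is the rest of the file.
    private static final Pattern FRONT_MATTER = Pattern.compile(
            "\\A---[ \\t]*\\R(.*?)^---[ \\t]*(?:\\R|\\z)(.*)\\z",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns {yaml, body}. yaml is null when the input does not start with front matter,
    // in which case body is the unchanged input.
    static String[] split(String content) {
        Matcher m = FRONT_MATTER.matcher(content);
        if (!m.matches()) {
            return new String[] { null, content };
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

The first element can then be handed to snakeyaml (for example Yaml.load) and the second to your Markdown processor. The YAML part keeps its trailing newline.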

Upvotes: 2
