Michael

Reputation: 10303

How to parse a file in INI/JSON-like non-standard format?

Suppose I have a text file in the following (non-standard) format:

xxx { a = v1; b = v2 }
yyy { a = v3; c = v4 } 

I cannot change it to any standard (INI/XML/YAML, etc.) format.

Now I would like to find the value of property a in section xxx (that is v1). What is the simplest way to do it in Java/Groovy?

Upvotes: 1

Views: 2058

Answers (4)

Stephen C

Reputation: 718946

Firstly, you've given an example, not specified a format. Before you go any further, you need to get hold of a complete specification for the format. Or if there isn't one, you need to see the code that generates it, and reverse engineer a specification.

(If you try to implement based on a small example, there's a good chance that your parser will encounter real life examples that don't fit the patterns that you have intuited.)

Having done that, you can look for an off-the-shelf parser that can cope with your format. If you are lucky, it might be close enough to INI, JSON, YAML or something else for the corresponding parser to (mostly) work.

But the chances are that it won't, and that you will need to write your own parser. There are various ways you could do this, for instance:

  • You could split the file into lines and "parse" each line with a regex.
  • You could parse the file using a Scanner with appropriate delimiters.
  • You could use a parser generator to implement a lexer and parser.
  • You could implement a simple lexer and parser by hand.
  • There are probably Groovy specific solutions.

In reality the correct choice(s) depend on how simple or complex the actual format is. We can't tell that from a single example.
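As a rough illustration of the first option (one regex per line), here is a minimal Java sketch. It assumes far more than the question states: one section per line, no nesting, no quoting, and values that never contain ';' or '}'. The class name LineRegexParser is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineRegexParser {
    // Assumed shape of a line: name { key = value; key = value }
    private static final Pattern SECTION =
            Pattern.compile("^\\s*(\\w+)\\s*\\{(.*)\\}\\s*$");
    // A property is a word, '=', then everything up to the next ';'
    private static final Pattern PROPERTY =
            Pattern.compile("(\\w+)\\s*=\\s*([^;]*)");

    public static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            Matcher section = SECTION.matcher(line);
            if (!section.matches()) {
                continue; // skip blank or unrecognised lines
            }
            Map<String, String> props = new LinkedHashMap<>();
            Matcher property = PROPERTY.matcher(section.group(2));
            while (property.find()) {
                props.put(property.group(1), property.group(2).trim());
            }
            sections.put(section.group(1), props);
        }
        return sections;
    }

    public static void main(String[] args) {
        String text = "xxx { a = v1; b = v2 }\nyyy { a = v3; c = v4 } ";
        System.out.println(parse(text).get("xxx").get("a")); // prints v1
    }
}
```

Note that this silently skips lines that don't match, which is exactly the kind of behaviour that goes wrong when real files deviate from the example.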

Upvotes: 2

ataylor

Reputation: 66069

There's unlikely to be an out-of-the-box solution if you're dealing with a non-standard format. Here are a few approaches you might want to look into:

  • if the format is simple, write a custom recursive descent parser
  • write a filter to transform your format into INI, JSON, etc. and use existing libraries
  • create a Groovy DSL that matches your format and execute your file as a Groovy script
  • use a parser generator tool like ANTLR or parboiled to create a parser from a language specification
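To give an idea of the first option, here is a small hand-written recursive descent parser in Java. The grammar is only guessed from the two example lines (section := word '{' property (';' property)* '}', property := word '=' word), and the class name SectionParser is hypothetical, so treat it as a sketch rather than a parser for the real format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SectionParser {
    private final String src;
    private int pos = 0;

    private SectionParser(String src) { this.src = src; }

    // file := section*   where   section := word body
    public static Map<String, Map<String, String>> parse(String text) {
        SectionParser p = new SectionParser(text);
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        p.skipWhitespace();
        while (p.pos < p.src.length()) {
            String name = p.word();
            sections.put(name, p.body());
            p.skipWhitespace();
        }
        return sections;
    }

    // body := '{' [ property ( ';' property )* ] '}'
    private Map<String, String> body() {
        Map<String, String> props = new LinkedHashMap<>();
        expect('{');
        while (!peek('}')) {
            String key = word();      // property := word '=' word
            expect('=');
            props.put(key, word());
            if (!peek('}')) {
                expect(';');
            }
        }
        expect('}');
        return props;
    }

    // word := one or more letters, digits or underscores
    private String word() {
        skipWhitespace();
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a word at offset " + pos);
        }
        return src.substring(start, pos);
    }

    private boolean peek(char c) {
        skipWhitespace();
        return pos < src.length() && src.charAt(pos) == c;
    }

    private void expect(char c) {
        if (!peek(c)) {
            throw new IllegalArgumentException("Expected '" + c + "' at offset " + pos);
        }
        pos++;
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

With the input from the question, SectionParser.parse(text).get("xxx").get("a") would return "v1". Unlike a line-based regex, this version doesn't care about line breaks and fails loudly with an offset when the input doesn't fit the assumed grammar.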

Upvotes: 2

tim_yates

Reputation: 171114

With Groovy, you could leverage the ConfigSlurper.

However, you would first need to hack together a map of valid values, so that it doesn't choke trying to work out what v1, v2, v3, etc. are.

This seems to work:

def input = '''xxx { a = v1; b = v2 }
              |yyy { a = v3; c = v4 }'''.stripMargin()

def slurper = new ConfigSlurper()

// Find all words 'w' and make a map of [ w1:'w1', w2:'w2', ... ]
slurper.binding = ( ( input =~ /\w+/ ) as List ).collectEntries { w -> [ (w):w ] }

def result = slurper.parse( input )
println result

That prints out:

[xxx:[a:v1, b:v2], yyy:[a:v3, c:v4]]

(Groovy 1.8.4)

Upvotes: 3

Chris Cashwell

Reputation: 22859

For a true INI-format file, see the question "What is the easiest way to parse an INI file in Java?"

What you're showing here looks more like JSON than INI format to me. Perhaps look at JSON parsing libraries. The truth here is that you're not using an established format, so you probably won't be using an established format parser. Your best bet is probably to refactor the file you're dealing with (if possible) into a well-known format to begin with. Don't try to reinvent the wheel unless you absolutely have to.

Upvotes: 2
