user3357979
user3357979

Reputation: 627

parse dimacs CNF file python

I have a file in the DIMACS cnf format that I need to manipulate into the necessary format for a SAT Solver.

Specifically, I need to get:

['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']

to

[[4,-18,19,0], [3,18,-5,0],[-5,-8,-15,0],[-20,7,-16,0]]

Thanks for the help!

Upvotes: 1

Views: 6049

Answers (1)

CliffordVienna
CliffordVienna

Reputation: 8255

as a quick hack you can simply use

in_data = ['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']
out_data = [[int(n) for n in line.split()] for line in in_data if line[0] not in ('c', 'p')]
print(out_data)

which will output

[[4, -18, 19, 0], [3, 18, -5, 0], [-5, -8, -15, 0], [-20, 7, -16, 0]]

however, you might want to use something like

out_data = [[int(n) for n in line.split() if n != '0'] for line in in_data if line[0] not in ('c', 'p')]

instead to remove the terminating zeros from the clauses:

[[4, -18, 19], [3, 18, -5], [-5, -8, -15], [-20, 7, -16]]

but a real dimacs parser should actually use the terminating zero, instead of assuming one clause per line. so here is a proper dimacs parser:

in_data = ['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']

cnf = list()
cnf.append(list())
maxvar = 0

for line in in_data:
    tokens = line.split()
    if len(tokens) != 0 and tokens[0] not in ("p", "c"):
        for tok in tokens:
            lit = int(tok)
            maxvar = max(maxvar, abs(lit))
            if lit == 0:
                cnf.append(list())
            else:
                cnf[-1].append(lit)

assert len(cnf[-1]) == 0
cnf.pop()

print(cnf)
print(maxvar)

Upvotes: 3

Related Questions