Shreya Agarwal

Reputation: 716

How to merge three CoNLL-U files with the conllu Python library?

This is my first time working with CoNLL-U files. I can't find a way to merge these files with the conllu Python library. Any leads would be helpful. Thanks.

Upvotes: 0

Views: 434

Answers (1)

Emil Stenström

Reputation: 14126

parse_incr() yields one TokenList per sentence in a file (and parse() returns a list of them), so merging several files comes down to collecting those TokenLists into a single list.

Example:

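from io import open  # only needed on Python 2; harmless on Python 3
from conllu import parse_incr

files = ["file1.conllu", "file2.conllu", "file3.conllu"]

# Collect the sentences (TokenLists) from every file into one list.
merged_tokenlists = []
for file in files:
    # Open each file in turn; the with block closes it when done.
    with open(file, "r", encoding="utf-8") as data_file:
        # parse_incr yields one TokenList per sentence.
        for tokenlist in parse_incr(data_file):
            merged_tokenlists.append(tokenlist)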

Author of the conllu library here, happy to see people using it!
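
If the goal is a single merged file on disk rather than a list in memory, a minimal sketch along these lines may help. It assumes each item is a conllu TokenList, whose serialize() method returns the sentence as CoNLL-U text. The helper name write_tokenlists and the output path are hypothetical, chosen only for illustration.

```python
from typing import Iterable


def write_tokenlists(tokenlists: Iterable, out_path: str) -> int:
    """Write each sentence to out_path in CoNLL-U format; return the count.

    Hypothetical helper: it only assumes every item has a serialize()
    method, as conllu's TokenList does.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out_file:
        for tokenlist in tokenlists:
            # serialize() returns the sentence as CoNLL-U text, including
            # the blank line that terminates it.
            out_file.write(tokenlist.serialize())
            count += 1
    return count
```

With a list of TokenLists collected from the input files, calling write_tokenlists(merged_tokenlists, "merged.conllu") should produce one combined file. Note that this leaves metadata such as sent_id untouched, so ids that repeat across the input files will also repeat in the output.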

Upvotes: 1
