Count Distinct XML Pattern in Notepad++ or in Linux Grep

Question

It's a silly question, but I haven't found the answer yet.

In my XML I've got the lines below:


123456
123456
adfadfaf
gdsgdhghd

The distinct count of the values between the tags is 3.

Basically, I want to count the unique values between the opening and closing tags, and get 3, when I do a find & count in Notepad++ or use the Linux grep command.

Bodo · Accepted Answer

Assuming that the input has a format as shown in the example, you can use the code below.

That means every pair of corresponding opening and closing tags must be on one line, with a text-only value in between.

grep -o '[^<]*' input.xml |sort -u|wc -l

The command may not work if the input is formatted in other ways or if the value between the opening and closing tags contains other tags.
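As a rough sketch of how the pipeline fits together: `grep -o` prints each match on its own line, `sort -u` drops duplicates, and `wc -l` counts what is left. The element name `id`, the temporary sample file, and the helper `count_distinct` below are all assumptions made for illustration; substitute the real tag name from your XML.

```shell
# Hypothetical sample file: the element name "id" is an assumption,
# chosen only so that the example can run.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<root>
<id>123456</id>
<id>123456</id>
<id>adfadfaf</id>
<id>gdsgdhghd</id>
</root>
EOF

# count_distinct TAG FILE
# Prints the number of distinct <TAG>value</TAG> occurrences in FILE.
# Only works when each open/close pair sits on one line and the value
# contains no nested tags.
count_distinct() {
    # grep -o : emit each match on its own line
    # sort -u : keep one copy of each distinct match
    # wc -l   : count the remaining lines
    # tr -d   : strip the padding some wc implementations add
    grep -o "<$1>[^<]*</$1>" "$2" | sort -u | wc -l | tr -d ' '
}

count_distinct id "$sample"
```

Because the match includes the surrounding tags, two elements with different names but the same text would still be counted separately.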

With the example input from the question it will print 3.

It even works when there is more than one pair of opening and closing tags in a line.

With


123456foo1234567
123456
adfadfaf
gdsgdhghd

it prints
