Jeff Sereno

Reputation: 113

Using sed to pull out the last parts of a string

I'm sure there must be a simple answer to this, but I'm having trouble working it out and the similar questions I've found here aren't quite getting me across the line (though they have helped).

I have a text file, TestFile.txt, on Windows with a varying number of OUs per line, as follows:

"CN=John Doe,OU=Milk,OU=Dairy,OU=Food,DC=company,DC=com"
"CN=Jane Doe,OU=Red,OU=Apples,OU=Fruit,OU=Food,DC=company,DC=com"
"CN=Pete Doe,OU=Forks,OU=Cutlery,OU=NotFood,DC=company,DC=com"
"CN=Fred Doe,OU=Spoons,OU=Cutlery,OU=NotFood,DC=company,DC=com"
"CN=Alex Doe,OU=Biscuits,OU=Chocolate,OU=Candy,OU=Food,DC=company,DC=com"
"CN=Peta Doe,OU=Buttons,OU=Chocolate,OU=Candy,OU=Food,DC=company,DC=com"

I want to strip the extraneous data such that I am left with only the last two OUs like this:

OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food

I've stripped out the start and end of each line easily enough with the following using the Windows port of sed:

sed -e "s/[^,]*,//" -e "s/,DC\=.*//" TestFile.txt

...which gives me:

OU=Milk,OU=Dairy,OU=Food
OU=Red,OU=Apples,OU=Fruit,OU=Food
OU=Forks,OU=Cutlery,OU=NotFood
OU=Spoons,OU=Cutlery,OU=NotFood
OU=Biscuits,OU=Chocolate,OU=Candy,OU=Food
OU=Buttons,OU=Chocolate,OU=Candy,OU=Food

So now I just need to isolate the last two OUs on each line and ignore everything else. With a fixed number of OUs per line this would be simple, but how do I write a sed expression that accommodates a varying number of OUs?

Upvotes: 0

Views: 43

Answers (3)

Firefly

Reputation: 459

This assumes the final OU on each line is always followed by exactly two more fields (the two DC components). Under that assumption, AWK alone is enough:

awk -F, '{OU=$(NF-3)","$(NF-2); print OU}' file > outfile

Which outputs:

OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food
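If the number of trailing DC fields is not always two, counting back from NF stops working. A minimal sketch that instead remembers the last two fields starting with OU= is shown here; the function name last_two_ous is hypothetical, and it assumes no OU value contains an escaped comma:

```shell
# last_two_ous: hypothetical helper; reads DN lines on stdin.
# Assumption: fields are plain comma-separated, with no escaped commas inside values.
last_two_ous() {
  awk -F, '{
    prev = ""; last = ""
    # Walk every field and keep the two most recent ones that start with OU=
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^OU=/) { prev = last; last = $i }
    }
    # Print only when at least two OUs were seen on the line
    if (prev != "") print prev "," last
  }'
}

printf '%s\n' '"CN=John Doe,OU=Milk,OU=Dairy,OU=Food,DC=company,DC=com"' | last_two_ous
# prints: OU=Dairy,OU=Food
```

Lines with fewer than two OUs produce no output in this sketch, which may or may not be what you want.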

Upvotes: 0

Cyrus

Reputation: 88756

With GNU sed:

sed -r 's/.*(OU=[^,]*,OU=[^,]*),DC=.*/\1/' file

Output:

OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food
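The expression works because the leading .* is greedy: it swallows as much of the line as it can while still leaving two OU fields directly in front of ,DC=. A possible variant, sketched below, adds -n and the p flag so that lines without two OUs are dropped instead of passed through unchanged. The helper name extract_last_two and the "Solo Doe" line are made up for illustration, and -E is assumed to be accepted by your sed build (recent GNU and BSD sed accept it; a Windows port may only accept -r):

```shell
# extract_last_two: hypothetical helper; reads DN lines on stdin.
# Assumption: this sed accepts -E for extended regular expressions.
extract_last_two() {
  # -n suppresses default output; the trailing p prints only lines where the substitution matched
  sed -n -E 's/.*(OU=[^,]*,OU=[^,]*),DC=.*/\1/p'
}

printf '%s\n' \
  '"CN=Jane Doe,OU=Red,OU=Apples,OU=Fruit,OU=Food,DC=company,DC=com"' \
  '"CN=Solo Doe,OU=Only,DC=company,DC=com"' | extract_last_two
# prints: OU=Fruit,OU=Food
```

The second input line has a single OU, so the pattern does not match and nothing is printed for it.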

Upvotes: 2

Jeff Sereno

Reputation: 113

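OK, thanks to josifoski, I've got it. Setting OFS keeps the comma between the two printed fields (gawk's default output separator is a space):

sed -e "s/[^,]*,//" -e "s/,DC\=.*//" TestFile.txt | gawk -F, -v "OFS=," "{ print $(NF-1), $NF }"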

Perfect, thanks. =)

Upvotes: 1
