Reputation: 113
I'm sure there must be a simple answer to this, but I'm having trouble working it out and the similar questions I've found here aren't quite getting me across the line (though they have helped).
I have a TestFile.txt text file in Windows with a varying number of OUs per line, as follows:
"CN=John Doe,OU=Milk,OU=Dairy,OU=Food,DC=company,DC=com"
"CN=Jane Doe,OU=Red,OU=Apples,OU=Fruit,OU=Food,DC=company,DC=com"
"CN=Pete Doe,OU=Forks,OU=Cutlery,OU=NotFood,DC=company,DC=com"
"CN=Fred Doe,OU=Spoons,OU=Cutlery,OU=NotFood,DC=company,DC=com"
"CN=Alex Doe,OU=Biscuits,OU=Chocolate,OU=Candy,OU=Food,DC=company,DC=com"
"CN=Peta Doe,OU=Buttons,OU=Chocolate,OU=Candy,OU=Food,DC=company,DC=com"
I want to strip the extraneous data such that I am left with only the last two OUs like this:
OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food
I've stripped out the start and end of each line easily enough with the following using the Windows port of sed:
sed -e "s/[^,]*,//" -e "s/,DC\=.*//" TestFile.txt
...which gives me:
OU=Milk,OU=Dairy,OU=Food
OU=Red,OU=Apples,OU=Fruit,OU=Food
OU=Forks,OU=Cutlery,OU=NotFood
OU=Spoons,OU=Cutlery,OU=NotFood
OU=Biscuits,OU=Chocolate,OU=Candy,OU=Food
OU=Buttons,OU=Chocolate,OU=Candy,OU=Food
So now I just need to isolate the last two OUs on each line and ignore everything else. If every line had a fixed number of OUs, that would simplify things a lot, but how do I write a sed expression that accommodates a varying number of OUs?
Upvotes: 0
Views: 43
Reputation: 459
Assuming exactly two more fields (the DC components) always follow the final OU on each line, this can be done in AWK alone, without sed:
awk -F, '{OU=$(NF-3)","$(NF-2); print OU}' file > outfile
Which outputs:
OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food
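If the number of DC components could vary, counting back from NF would pick the wrong fields. A minimal sketch of an alternative, assuming every wanted field starts with OU= (the function name last_two_ous is made up for illustration): scan the fields and remember the last two that look like OUs.

```shell
# Hypothetical helper: print the last two OU= fields of each line,
# no matter how many DC= fields follow them.
last_two_ous() {
  awk -F, '{
    prev = ""; last = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^OU=/) { prev = last; last = $i }   # shift the two-item window
    if (prev != "") print prev "," last             # skip lines with fewer than two OUs
  }' "$@"
}

# Example: read one sample line from stdin
printf '%s\n' '"CN=John Doe,OU=Milk,OU=Dairy,OU=Food,DC=company,DC=com"' | last_two_ous
```

For that sample line it prints OU=Dairy,OU=Food. Lines with fewer than two OUs produce no output in this sketch; whether that is the right behaviour depends on your data.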
Upvotes: 0
Reputation: 88756
With GNU sed:
sed -r 's/.*(OU=[^,]*,OU=[^,]*),DC=.*/\1/' file
Output:
OU=Dairy,OU=Food
OU=Fruit,OU=Food
OU=Cutlery,OU=NotFood
OU=Cutlery,OU=NotFood
OU=Candy,OU=Food
OU=Candy,OU=Food
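Why this copes with a varying number of OUs: [^,]* cannot cross a comma, so the capture group can only match the two OUs sitting directly in front of the first ,DC=, and the leading .* swallows whatever comes before them. Here is a sketch that runs the same expression on two of the sample lines. It uses -E, which GNU sed accepts as an equivalent of -r; the wrapper name ou_pair is hypothetical.

```shell
# Hypothetical wrapper around the same substitution, with -E for extended regex.
ou_pair() {
  sed -E 's/.*(OU=[^,]*,OU=[^,]*),DC=.*/\1/' "$@"
}

# Lines with three and four OUs both reduce to their last two.
printf '%s\n' \
  '"CN=John Doe,OU=Milk,OU=Dairy,OU=Food,DC=company,DC=com"' \
  '"CN=Jane Doe,OU=Red,OU=Apples,OU=Fruit,OU=Food,DC=company,DC=com"' | ou_pair
```

One caveat: a line with fewer than two OUs does not match the pattern at all, so sed prints it unchanged rather than dropping it.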
Upvotes: 2
Reputation: 113
OK, thanks to josifoski, I've got it:
sed -e "s/[^,]*,//" -e "s/,DC\=.*//" TestFile.txt | gawk -F, -v "OFS=," "{ print $(NF-1), $NF }"
Perfect, thanks. =)
Upvotes: 1