OpenDataAlex

Reputation: 1475

RegEx to Remove Unwanted text

I'm still kind of new to RegEx in general. I'm trying to retrieve the names from a field so I can split them for further use (using Pentaho Data Integration/Kettle for the data extraction). Here's an example of the string I'm given:

CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;CN=Name Three/OU=Site/O=Domain

I would like to have the following format returned:

Name One;Name Two;Name Three

Kettle uses Java Regular Expressions.

Upvotes: 1

Views: 5200

Answers (2)

Sec

Reputation: 7352

That sounds like you want a regex-based search and replace. How to do that correctly depends on your language, but with sed I would do it like this:

echo "CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;CN=Name Three/OU=Site/O=Domain" |\
sed 's/CN=\([^\/]*\)[^;]*/\1/g'
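Since the question says Kettle uses Java regular expressions, here is a rough sketch of the same substitution in plain Java. The class and method names (`NameCleaner`, `extractNames`) are made up for illustration, and how the pattern is wired into a Kettle step is not covered here; only `String.replaceAll` from the standard library is used.

```java
public class NameCleaner {
    // Same pattern as the sed command: capture everything after "CN=" up to
    // the first "/", then consume the rest of the entry up to the next ";".
    // Java needs no escaping for "/" and uses $1 instead of \1.
    static String extractNames(String input) {
        return input.replaceAll("CN=([^/]*)[^;]*", "$1");
    }

    public static void main(String[] args) {
        String line = "CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;"
                + "CN=Name Three/OU=Site/O=Domain";
        System.out.println(extractNames(line)); // prints Name One;Name Two;Name Three
    }
}
```

The `;` separators survive because `[^;]*` stops in front of each one, so only the `CN=.../OU=.../O=...` part of every entry is replaced.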

If you intend to split it later anyway, you probably want to just match the names and return them in a loop. Example code in Perl:

#!/usr/bin/perl
use strict;
use warnings;
my $line = "CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;CN=Name Three/OU=Site/O=Domain";
# In list context, a /g match returns every captured name.
for my $match ($line =~ /CN=([^\/]*)/g) {
  print "Name: $match\n";
}
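The same match-in-a-loop idea could look roughly like this in Java, using `java.util.regex.Pattern` and `Matcher`. The names `NameMatcher` and `findNames` are hypothetical, and this is only a sketch of the technique, not a description of how Kettle runs it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameMatcher {
    // Capture the text between "CN=" and the next "/".
    private static final Pattern CN = Pattern.compile("CN=([^/]*)");

    // Collect every captured name, in order of appearance.
    static List<String> findNames(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = CN.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;"
                + "CN=Name Three/OU=Site/O=Domain";
        for (String name : findNames(line)) {
            System.out.println("Name: " + name);
        }
    }
}
```

This skips the intermediate `Name One;Name Two;Name Three` string entirely, which is convenient when the names are going to be split apart anyway.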

Upvotes: 1

Tomasz Kowalczyk

Reputation: 10467

Assuming you have it in file.txt and the /OU=Site/O=Domain suffix is always the same:

sed -e  's/\/OU=Site\/O=Domain//g' -e 's/CN=//g' file.txt
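For comparison, the same literal-removal approach needs no regex at all in Java. This is a minimal sketch with made-up names (`LiteralStripper`, `strip`), and like the sed command it only works when every entry ends in exactly `/OU=Site/O=Domain`.

```java
public class LiteralStripper {
    // String.replace treats its arguments as literal text, not as a regex.
    static String strip(String input) {
        return input.replace("/OU=Site/O=Domain", "").replace("CN=", "");
    }

    public static void main(String[] args) {
        String line = "CN=Name One/OU=Site/O=Domain;CN=Name Two/OU=Site/O=Domain;"
                + "CN=Name Three/OU=Site/O=Domain";
        System.out.println(strip(line)); // prints Name One;Name Two;Name Three
    }
}
```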

Upvotes: 0
