Reputation: 3876
I have a string with emails, some duplicated. For example only:
"[email protected],[email protected],[email protected],[email protected],[email protected]"
I would like the string to contain only unique emails, comma separated. The result should be:
"[email protected],[email protected],[email protected]"
Any easy way to do this?
P.S. The emails vary, and I don't know what they will contain.
Upvotes: 1
Views: 60
Reputation: 39406
Getting the strings in an array:
IFS=',' read -r -a lst <<< "[email protected],[email protected],[email protected],[email protected],[email protected]"
Sorting and filtering:
(IFS=$'\n'; sort <<< "${lst[*]}") | uniq   # set IFS first so the array joins on newlines
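If you need the result back as a single comma-separated string, a minimal sketch building on the lst array above (mapfile needs bash 4+; the uniq_lst name is just for illustration):

# Read the sorted, de-duplicated lines back into an array, then rejoin with commas.
mapfile -t uniq_lst < <(printf '%s\n' "${lst[@]}" | sort -u)
(IFS=','; echo "${uniq_lst[*]}")
[email protected],[email protected],[email protected]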
Upvotes: 0
Reputation: 23677
With perl
$ s="[email protected],[email protected],[email protected],[email protected],[email protected]"
$ echo "$s" | perl -MList::MoreUtils=uniq -F, -le 'print join ",",uniq(@F)'
[email protected],[email protected],[email protected]
Upvotes: 0
Reputation: 85780
Using awk with process substitution, avoiding sort:
awk -vORS="," '!seen[$1]++' < <(echo "[email protected],[email protected],[email protected],[email protected],[email protected]" | tr ',' '\n')
[email protected],[email protected],[email protected]
Or another way, using pure bash to avoid tr completely, would be:
# Read into a bash array, splitting on ',' ('-a' reads the fields into an array)
IFS=',' read -ra myArray <<< "[email protected],[email protected],[email protected],[email protected],[email protected]"
# Print the array elements one per line and feed them to awk
awk -vORS="," '!seen[$1]++' < <(printf '%s\n' "${myArray[@]}") | sed 's/,$//'
[email protected],[email protected],[email protected]
Upvotes: 1
Reputation: 23870
How about this:
echo "[email protected],[email protected],[email protected],[email protected],[email protected]" |
tr ',' '\n' |
sort |
uniq |
tr '\n' ',' |
sed -e 's/,$//'
I convert the separating commas into newlines so that I can then use tools (like sort, uniq, and grep) that work with lines.
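If this comes up more than once, the pipeline also wraps up nicely in a small function. A sketch (the name dedupe_csv is just for illustration; sort -u and paste -sd, stand in for the sort | uniq | tr | sed steps, and the output order is sorted rather than original):

# De-duplicate a comma-separated list.
dedupe_csv() {
    tr ',' '\n' <<< "$1" | sort -u | paste -sd, -
}

dedupe_csv "[email protected],[email protected],[email protected],[email protected],[email protected]"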
Upvotes: 4