Reputation: 115
I have a list of MAC vendors and I need to parse the text to remove the information that is not necessary.
If I have this:
F8FEA8 Technico # Technico Japan Corporation
F8FF5F Shenzhen # Shenzhen Communication Technology Co.,Ltd
FC0012 ToshibaS # Toshiba Samsung Storage Technolgoy Korea Corporation
FC019E Vievu
FC01CD Fundacio # FUNDACION TEKNIKER
FC0647 Cortland # Cortland Research, LLC
FC0877 PrentkeR
FC0A81 Motorola # Motorola Solutions Inc.
I need to delete every [space][word][space][#] to get this:
F8FEA8 Technico Japan Corporation
F8FF5F Shenzhen Communication Technology Co.,Ltd
FC0012 Toshiba Samsung Storage Technolgoy Korea Corporation
FC019E Vievu
FC01CD FUNDACION TEKNIKER
FC0647 Cortland Research, LLC
FC0877 PrentkeR
FC0A81 Motorola Solutions Inc.
Can it be done with grep or sed?
Sorry for my bad English.
Upvotes: 1
Views: 158
Reputation: 41456
More awk
awk -F" [^ ]+ # " '{$1=$1}1' file # more robust
awk -F" [^ ]+ # " '$1=$1' file # risky: a line is dropped if $1 is empty or 0
This sets the field separator to the text we want to remove; assigning $1=$1 rebuilds the line with a single space in its place, and the result is printed.
awk '{sub(/ [^ ]+ #/,"")}1' file
This just removes the part we do not want.
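To try the sub() idea on a couple of lines from the question before running it on the real file, here is a minimal sketch. It removes the short name and the # that follows it; the function name strip_short is made up for the example, and it reads from standard input instead of a file:

```shell
# Hypothetical wrapper: deletes the first " word #" sequence on each input line.
strip_short() {
  awk '{ sub(/ [^ ]+ #/, "") } 1'
}

# Two sample lines from the question: one with a "#", one without.
printf '%s\n' \
  'FC01CD Fundacio # FUNDACION TEKNIKER' \
  'FC019E Vievu' | strip_short
```

Lines without a # pass through unchanged, because sub() simply finds nothing to replace.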
Upvotes: 2
Reputation: 58401
This may work for you (GNU sed):
sed -ri 's/\s\S+\s#//' file
or:
sed -i 's/ [^ ][^ ]* #//' file
This means: look for a space followed by one or more non-spaces, followed by another space, followed by a #, and then delete that match. The file is updated in place, which is what the -i option does. The -r option in the first solution enables extended regular expressions, which in this case lets you write \S+ instead of \S\+ or [^ ][^ ]*.
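To preview the result before letting -i overwrite the file, the second expression can be run as a plain filter. A minimal sketch, assuming the input arrives on standard input; the function name strip_short_sed is made up for the example:

```shell
# Hypothetical helper: the same expression, but without -i,
# so nothing on disk is modified.
strip_short_sed() {
  sed 's/ [^ ][^ ]* #//'
}

# Two sample lines from the question: one with a "#", one without.
printf '%s\n' \
  'FC0647 Cortland # Cortland Research, LLC' \
  'FC0877 PrentkeR' | strip_short_sed
```

Once the output looks right, the same expression can be used with -i on the real file.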
Upvotes: 4
Reputation: 7802
Here is a shell-only solution:
while read -r A B C D; do
[ "$C" = "#" ] && echo "$A $D" || echo "$A${B:+ $B}${C:+ $C}${D:+ $D}"
done < infile.txt > outfile.txt
Upvotes: 4
Reputation: 23364
Assuming # stands by itself in field 3 when it occurs, the following awk solution may work:
awk '$3 == "#"{t=$1; $1=$2=$3=""; sub(/^[[:space:]]+/, ""); $0=t" "$0};
{print}' file.txt
Upvotes: 2
Reputation: 36262
This seems like easy parsing. Here is a solution using perl. It splits each line into fields on whitespace and, if the third one is #, removes it and the previous one:
perl -lane 'if ( $F[2] eq q|#| ) { @F = @F[0,3..$#F] }; print qq|@F|' infile
It yields:
F8FEA8 Technico Japan Corporation
F8FF5F Shenzhen Communication Technology Co.,Ltd
FC0012 Toshiba Samsung Storage Technolgoy Korea Corporation
FC019E Vievu
FC01CD FUNDACION TEKNIKER
FC0647 Cortland Research, LLC
FC0877 PrentkeR
FC0A81 Motorola Solutions Inc.
Upvotes: 2