Reputation: 115
I need some help with a regex that matches this format: the first part of the string is an email address, followed by eight columns separated by `;`.
[email protected];Alex;Test;Alex A.Test;Alex;12;34;56;78
The first part I have is `(.*@.*com)`.
These are also possible source strings:
[email protected];Alex;;Alex A.Test;;12;34;56;78
[email protected];Alex;;Alex A.Test;Alex;;34;;78
[email protected];Alex;Test;;Alex;12;34;56; and so on
Upvotes: 0
Views: 382
Reputation: 307
You can try this regex:
^(.*@.*com)(([^";\n]*|"[^"\n]*");){8}(([^";\n]*|"[^"\n]*"))$
If you have a different number of columns after the address, change the number between `{` and `}`.
For your data, these are the captures:
1. `[email protected]`
2. `56;`
3. `56`
4. `78`
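To see where these values come from, here is a minimal Python sketch that runs the regex above through the `re` module. The address `alex@example.com` is a made-up placeholder (the real addresses are obscured in this thread), and Python is just one possible regex engine. A repeated group such as `(...;){8}` only remembers its last repetition, which is why `56;` appears instead of every column.

```python
import re

# Quote-aware pattern from the answer above, copied unchanged.
PATTERN = re.compile(r'^(.*@.*com)(([^";\n]*|"[^"\n]*");){8}(([^";\n]*|"[^"\n]*"))$')

# Hypothetical sample row; the address is a placeholder, not the OP's data.
row = "alex@example.com;Alex;Test;Alex A.Test;Alex;12;34;56;78"

match = PATTERN.match(row)
# groups() returns one entry per pair of parentheses in the pattern.
captures = match.groups() if match else ()
print(captures)
```

Groups 1 to 4 correspond to the list above. Python also reports a fifth group, the inner parentheses of the last column, which holds the same text as the fourth.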
If you are sure there will be no `"` in your strings, you can use this:
^(.*@.*com)(([^;\n]*);){8}([^;\n]*)$
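If the individual columns are needed and not just a yes/no match, one option is to validate with the simpler regex and then split the remainder on `;`. The sketch below assumes Python; `parse_row` is a hypothetical helper name and the address is a placeholder.

```python
import re

# Simpler pattern from the answer above (no quoted fields), copied unchanged.
SIMPLE = re.compile(r'^(.*@.*com)(([^;\n]*);){8}([^;\n]*)$')

def parse_row(row):
    """Return (email, columns) if the row matches the pattern, else None."""
    match = SIMPLE.match(row)
    if match is None:
        return None
    email = match.group(1)
    # The repeated group keeps only its last pass, so split the rest instead.
    columns = row[len(email) + 1:].split(";")
    return email, columns

print(parse_row("alex@example.com;Alex;;Alex A.Test;;12;34;56;78"))
```

Empty columns come back as empty strings, so the result always has eight entries for a matching row.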
Edit:
OP suggested this usage:
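To use the first regex with `sed`, you need the `-i`, `-n`, and `-E` flags, and you have to escape the `"` characters. Note that with `-i` and `-n` together the file is rewritten in place and only the matching lines are kept.
The result will look like this: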
sed -i -n -E "/(.*@.*com)(([^\";\n]*|\"[^\"\n]*\");){8}(([^\";\n]*|\"[^\"\n]*\"))/p"
Upvotes: 1
Reputation: 785108
Using awk you can do this easily:
awk -F ';' '$1 ~ /\.com$/{print NF}' file
9
9
9
cat file
[email protected];Alex;;Alex A.Test;;12;34;56;78
[email protected];Alex;;Alex A.Test;Alex;;34;;78
[email protected];Alex;Test;;Alex;12;34;56; and so on
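For comparison, the same check can be sketched in Python without any regex: split each line on `;`, keep rows whose first field ends in `.com`, and report the field count. `count_fields` is a hypothetical helper name and the addresses are placeholders.

```python
def count_fields(lines):
    """Yield the number of ';'-separated fields for rows whose first field ends in .com."""
    for line in lines:
        fields = line.rstrip("\n").split(";")
        if fields[0].endswith(".com"):
            yield len(fields)

# Placeholder rows shaped like the sample file above.
sample = [
    "alex@example.com;Alex;;Alex A.Test;;12;34;56;78",
    "alex@example.com;Alex;;Alex A.Test;Alex;;34;;78",
    "alex@example.com;Alex;Test;;Alex;12;34;56; and so on",
]
counts = list(count_fields(sample))
print(counts)
```

A count of 9 means one address field plus eight columns.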
Upvotes: 0
Reputation: 1482
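You can use something like:
".*@.*\.com;[A-Za-z]*;[A-Za-z]*;[A-Za-z .]*;[A-Za-z]*;[0-9][0-9];[0-9][0-9];[0-9][0-9];[0-9][0-9]"
This assumes the numbers always have exactly two digits, so rows with empty number columns (as in the question's other samples) will not match. Commas are not needed inside a character class; there they would match a literal comma.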
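A per-column pattern in this style can also be made to tolerate the empty columns from the question. The sketch below is one possible variant in Python, not the answerer's exact pattern: it assumes the number columns hold at most two digits, and the address is a placeholder.

```python
import re

# One character class per column; "*" and "{0,2}" allow a column to be empty.
STRICT = re.compile(
    r".*@.*\.com;"      # address, same loose check as above
    r"[A-Za-z]*;"       # column 1: letters only
    r"[A-Za-z]*;"       # column 2: letters only
    r"[A-Za-z .]*;"     # column 3: letters, spaces and dots
    r"[A-Za-z]*;"       # column 4: letters only
    r"[0-9]{0,2};"      # columns 5-8: zero to two digits (assumption)
    r"[0-9]{0,2};"
    r"[0-9]{0,2};"
    r"[0-9]{0,2}"
)

def is_valid(row):
    """True if the whole row matches the per-column pattern."""
    return STRICT.fullmatch(row) is not None

print(is_valid("alex@example.com;Alex;;Alex A.Test;Alex;;34;;78"))
```

Using `fullmatch` means trailing text after the last column makes the row invalid, which is stricter than an unanchored search.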
Upvotes: 0