Imad Sid

Reputation: 35

Match the line after delimiter and shorter than 9

Regex and I never get along. Every day I get an email from my supervisor containing 1000+ lines that need to be sorted. Each line looks like:

name|code

The goal is to separate them into 2 files.

example :

what i do

I look at the text after the | character and remove the whole line if the code contains only numbers, contains only letters, or is shorter than 9 characters.

the example list becomes :

Garry Cooper|abc123h1n1

I do these steps daily, and sometimes I get 2000 lines :/ It's a real pain.

I used to work with regex in Notepad++, but I can't find the pattern for this one. I am also not too bad at PHP. Help me please.

UPDATE 01: regex found: (?i)^[^|]*\|\h*[a-z\d]{0,8}$\R? Current question: writing a small PHP script, or maybe reusable classes.

  1. Interface:

Submit the data from a text box (HTML form) or from a txt file.

  2. Processing:

Lines that match the regex go into one downloadable txt file; the others go into a second file.

  3. Output:

2 links to the files.

Thank you all in advance for your help.

Upvotes: 2

Views: 120

Answers (1)

Wiktor Stribiżew

Reputation: 627101

If you just use greedy dot matching with .*, you do not check the length. The length can be checked with a limiting quantifier: to match just 0 to 8 symbols, use {0,8}. Everything but | can be matched with the [^|]* negated character class.
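To make the difference concrete, here is a minimal sketch in Python (not the Notepad++ engine) comparing the greedy dot with the limiting quantifier. The line "John Doe|abc123" is a made-up sample; the other line is taken from the question.

```python
import re

# Greedy dot: accepts a code of any length after the pipe.
greedy = re.compile(r"^[^|]*\|.*$")
# Limiting quantifier: accepts only codes made of 0 to 8 letters or digits.
limited = re.compile(r"^[^|]*\|[a-z\d]{0,8}$", re.IGNORECASE)

short_line = "John Doe|abc123"         # hypothetical sample, 6-character code
long_line = "Garry Cooper|abc123h1n1"  # from the question, 10-character code

# The greedy pattern matches both lines, so it cannot tell them apart.
print(bool(greedy.match(short_line)), bool(greedy.match(long_line)))
# The limited pattern matches only the line with the short code.
print(bool(limited.match(short_line)), bool(limited.match(long_line)))
```

The second pattern rejects the long line because {0,8} can consume at most 8 characters and $ then requires the end of the line.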

Use

(?i)^[^|]*\|\h*[a-z\d]{0,8}$\R?

See regex demo (note that gm flags are used by default in Notepad++ regex-based search and replace).

Explanation:

  • ^ - start of a line
  • [^|]* - zero or more symbols other than a pipe
  • \| - a literal pipe symbol
  • \h* - zero or more horizontal whitespace
  • [a-z\d]{0,8} - letters a to z and A to Z (due to (?i) case insensitive modifier) or digits, zero to 8 occurrences
  • $ - end of line and
  • \R? - one or zero (optional) line break.
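For the script part of the question, the same pattern could be used to split the input into two files. The following is only a rough sketch in Python rather than PHP, under some assumptions: Python's re module does not support \h or \R, so [ \t]* stands in for \h* and the text is split into lines instead of consuming the line break. The function names, file names, and the sample lines other than the one from the question are hypothetical.

```python
import re
from pathlib import Path

# Python port of the pattern: [ \t]* approximates \h*, and \R? is dropped
# because the text is split into lines before matching.
SHORT_CODE = re.compile(r"^[^|]*\|[ \t]*[a-z\d]{0,8}$", re.IGNORECASE)


def split_lines(text):
    """Return (matched, others): lines whose code has 0 to 8 letters or digits, and the rest."""
    matched, others = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if SHORT_CODE.match(line):
            matched.append(line)
        else:
            others.append(line)
    return matched, others


def write_files(text, matched_path="matched.txt", others_path="others.txt"):
    """Write the two groups to two text files (hypothetical file names)."""
    matched, others = split_lines(text)
    Path(matched_path).write_text("".join(l + "\n" for l in matched), encoding="utf-8")
    Path(others_path).write_text("".join(l + "\n" for l in others), encoding="utf-8")


# Hypothetical input; only the Garry Cooper line comes from the question.
sample = "John Doe|abc12\nGarry Cooper|abc123h1n1\nJane Roe| 12345\n"
matched, others = split_lines(sample)
print(matched)
print(others)
```

This covers only the length rule that the regex checks. The digits-only and letters-only rules from the question would need an extra test on the code part, for example with str.isdigit() and str.isalpha().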


Upvotes: 2
