Reputation: 307
I have a string that looks a little like
Name: xxx xxx
Company Name: xxx xxx xx
Company Type: xxxx
Tel: xxxx
Email: xxxxxxx
Postcode: xxxxxx
I am trying to pull out the xxx values.
I am using preg_match_all to do so, but I cannot grasp the regular expression I need :( I have been reading various tutorials around the web, and now I understand it all even less.
I presume I could do something like
find ^Name:(then any amount of words spaces etc till I get to) Company Name$ then ^Company Name:(then any amount of words spaces etc till I get to) Company Type$
If somebody could just start me off, maybe with a small explanation to help me understand things better, I would appreciate it. For example, the term "matches": how do I define what is a match and what is ignored? I just want the xxx parts in an array. So if I used ^Name:[a-zA-Z0-9]$, would all of that be a match, or just the bit in []?
Regards.
Edit: Adding the php code I am using.
foreach( $value as $k => &$v ){
    if( $k == "history_date_created" ){
        $v = date( "D jS M Y @ H:i:s", strtotime($v) );
    }
    if( $k == "history_text" ){
        //Name: xxx xxxx Company Name: xxxx xxxx Company Type: xxxx xxxx Tel: xxxx xxxx Email: xxxx xxxx Postcode: xxxx xxxx To Email: xxxx xxxx Subscription: none
        $pattern = "/Name: (.*) Company Name: (.*) Company Type: (.*) Tel: (.*) Email: (.*)/U";
        preg_match_all( $pattern, $v, $matches, PREG_SET_ORDER );
        print_r( $matches );
    }
}
Basically, I have pulled a row from a database. Unfortunately, "history_text" is a text field that, in my opinion, is stored the wrong way, but I can do nothing to change that now, so I need to pull out the individual values with a regex. The history_text field is created by a form, so "Name:", "Company Name:" etc. will always be the same. The values are entered by users, so they could be anything, including blank.
Edit, my answer:
No regex needed. This is what I did in the end:
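foreach( $value as $k => &$v ){
    if( $k == "history_date_created" ){
        $v = date( "D jS M Y @ H:i:s", strtotime($v) );
    }
    if( $k == "history_text" ){
        // One "Label: value" pair per line.
        $matches = explode("\n", $v);
        foreach( $matches as $match ){
            // A limit of 2 keeps any colons inside the value intact.
            $boom = explode( ":", $match, 2 );
            // Skip lines that contain no colon at all.
            if( count($boom) === 2 ){
                $value[trim($boom[0])] = trim($boom[1]);
            }
        }
    }
}
unset($v); // break the reference left behind by the foreach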
Upvotes: 0
Views: 955
Reputation: 1751
There isn't really a good way to separate your data, because there is no separator between xxxx and Company Name. If the label were company_name instead, this might not be such a problem.
Look into a regex solution, or use the explode function (maybe twice), once with ":" and once with a space " ".
Upvotes: 0
Reputation: 22660
Try this:
preg_match_all("/Name: (.*) Company Name: (.*) Company Type: (.*) Tel: (.*) Email: (.*)/U", $x, $matches, PREG_SET_ORDER);
A few notes about this:
- . (dot) matches any single character, by default excluding newlines.
- * extends that to match multiple characters.
- ( ) captures the enclosed part as a submatch. You can also use other character classes if you want to limit it further.
- The U modifier (after the //) makes the matching non-greedy. This can be helpful to avoid .* matching parts of your "control text", e.g. when you have multiple matches on a single line.
- PREG_SET_ORDER usually makes it more convenient to iterate through the matches array, which you can access e.g. by $matches[4][2] for the Company name of the 5th match, instead of $matches[2][4] with the default pattern ordering.
EDIT: I assume that you know the actual "description terms" such as "Company Name". Otherwise it will be impossible to generally distinguish between "(XXX XXX Company) Name:" and "(XXX XXX) Company Name:".
Also note that you will need only a preg_match to capture a single instance of such a 'line', while preg_match_all will be helpful to capture multiple 'lines'.
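To make these notes concrete, here is a minimal sketch using preg_match on a single record. The sample string and the names $text, $m and $fields are made up for illustration. One caveat with the U modifier: a non-greedy (.*) at the very end of the pattern is satisfied by an empty string, so this sketch assumes the record ends with the Postcode field and anchors the pattern with $ to force the last group to run to the end.

```php
<?php
// Hypothetical sample record; real values would come from history_text.
$text = 'Name: John Smith Company Name: Acme Ltd Company Type: Retail '
      . 'Tel: 01234 Email: john@example.com Postcode: AB1 2CD';

// U makes every (.*) non-greedy. The trailing $ forces the last group
// to extend to the end of the string instead of matching nothing.
$pattern = '/Name: (.*) Company Name: (.*) Company Type: (.*) '
         . 'Tel: (.*) Email: (.*) Postcode: (.*)$/U';

$fields = [];
if (preg_match($pattern, $text, $m)) {
    // $m[0] is the whole matched text; $m[1]..$m[6] are the ( ) groups,
    // numbered in the order their opening parentheses appear.
    $fields = [
        'Name'         => $m[1],
        'Company Name' => $m[2],
        'Company Type' => $m[3],
        'Tel'          => $m[4],
        'Email'        => $m[5],
        'Postcode'     => $m[6],
    ];
}
print_r($fields);
```

This also illustrates what a "match" is: the literal labels take part in the overall match ($m[0]), but only the parts inside ( ) come back as separate entries.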
Upvotes: 1
Reputation: 2212
It looks hard and complex to do this with regex alone, but you can use a regex on the : (colon) symbols.
/[^:]*/
This will give you each string that comes before a colon. Then you can cut the last part off each of those strings. E.g. if strpos() of "Company Name" in a piece is !== FALSE, cut that last part off the string. That gives you the value of Name.
You can use the same logic for the other parts.
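A rough sketch of that idea follows. It uses explode() in place of the /[^:]*/ pattern, since both yield the colon-delimited pieces, and it assumes the labels are known and always appear in this order. The $labels list, the sample $text and the $result name are illustrative only. A value that itself contains a colon would break this approach.

```php
<?php
// Assumed fixed label order, as produced by the form.
$labels = ['Name', 'Company Name', 'Company Type', 'Tel', 'Email', 'Postcode'];

// Hypothetical sample record.
$text = 'Name: John Smith Company Name: Acme Ltd Company Type: Retail '
      . 'Tel: 01234 Email: john@example.com Postcode: AB1 2CD';

// Piece 0 is the first label. Each later piece is "<value> <next label>",
// except the last one, which is just the final value.
$chunks = explode(':', $text);

$result = [];
foreach ($labels as $i => $label) {
    $chunk = trim($chunks[$i + 1] ?? '');
    $next  = $labels[$i + 1] ?? null;
    // Cut the following label off the end of the piece, if it is there.
    if ($next !== null && substr($chunk, -strlen($next)) === $next) {
        $chunk = rtrim(substr($chunk, 0, -strlen($next)));
    }
    $result[$label] = $chunk;
}
print_r($result);
```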
Upvotes: 1