user187580

Reputation: 2315

How to retrieve a substring from a string with variable-length parts in PHP?

I have some data in the following format:

C222 = 50
C1234P687 = 'some text'
C123YYY = 'text'
C444 = 89
C345 = 3
C122P687 = 'some text'
C122YYY = 'text'
....
....

So there are basically three different forms:

  1. "C" number = value, example - C444 = 89
  2. "C" number "P" number = value, example - C123P687 = 'some text'
  3. "C" number "YYY" = value, example - C123YYY = 'text'

On the left side of the = sign, only the numbers vary in length. The values on the right side vary as well.

I want to store the data in db as

INSERT INTO datatable 
    c_id = "number after C"
    p_id = "number after P" // if it exists for a line of data
    value = 'value'
    yyy = 'value'

Any ideas how to retrieve these numbers?

Thanks

Upvotes: 3

Views: 790

Answers (3)

Lizard

Reputation: 45032

I am assuming the data is read in from a file, using something like file_get_contents or fopen.

Regular expressions are your best bet for parsing this data.

Take a look at http://php.net/manual/en/function.preg-match-all.php

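# Update: regex updated
preg_match_all("|C([\d]+)(?:P([\d]+))?(YYY)? = (.+)|", $yourDataLine, $out, PREG_SET_ORDER);

# With PREG_SET_ORDER, $out holds one array per matched line ($i is the match index):
# $out[$i][0] = the full matched string, e.g. C123 = 456
# $out[$i][1] = number after C
# $out[$i][2] = number after P (empty string if absent)
# $out[$i][3] = YYY (empty string if absent)
# $out[$i][4] = value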

Regular expression thanks to Andy Shellam.
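As a rough sketch of how this pattern might be applied line by line, the snippet below maps each line to the columns named in the question. The function name parseLine, the array keys, and the anchored variant of the pattern with optional whitespace around = are assumptions made for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: parse one line into the columns from the question.
// Returns null when the line matches none of the three forms.
function parseLine(string $line): ?array
{
    // Anchored variant of the pattern above; \s* tolerates missing spaces around '='.
    $pattern = '/^C(\d+)(?:P(\d+))?(YYY)?\s*=\s*(.+)$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    return [
        'c_id'  => (int) $m[1],
        'p_id'  => $m[2] !== '' ? (int) $m[2] : null, // null when there is no P part
        'yyy'   => $m[3] === 'YYY',
        'value' => $m[4],                              // kept verbatim, quotes included
    ];
}

// Sample lines taken from the question; in practice they might come from file().
$lines = ["C222 = 50", "C1234P687 = 'some text'", "C123YYY = 'text'", "...."];

$rows = [];
foreach ($lines as $line) {
    $row = parseLine($line);
    if ($row !== null) {
        $rows[] = $row; // lines that do not match are skipped
    }
}

print_r($rows);
```

Each entry in $rows could then be bound to a prepared INSERT statement; the value is left untouched here, so any surrounding quotes would still need to be stripped if they are not wanted in the database.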

Upvotes: 3

Andy Shellam

Reputation: 15545

Use regular expressions in PHP. The following pattern matches all the cases you've provided. Use it with preg_match_all, passing in an array and the PREG_SET_ORDER flag. The array you passed in will then contain one array per row of your data, and each row's array will contain 5 elements.

(C[\d]+)(P[\d]+)?(YYY)? = (.+)

The first element of the array will contain the complete string that was tested.

The second element of the array will contain the "C" number.

The third element of the array will contain the "P" number if present, otherwise it'll be blank.

The fourth element of the array will contain "YYY" if present, otherwise it'll be blank.

The fifth element of the array will contain the value.

In response to an earlier comment, the following regex is a modified version of the above, but will not include the C and P in the matched values:

C([\d]+)(?:P([\d]+))?(YYY)? = (.+)
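To make the difference between the two patterns concrete, here is a small sketch that runs both over a few sample rows from the question. The variable names ($data, $withPrefix, $digitsOnly) and the / delimiters are chosen only for this illustration.

```php
<?php
// Sample rows copied from the question, one per line.
$data = "C222 = 50\nC1234P687 = 'some text'\nC123YYY = 'text'";

// First pattern: the C and P prefixes are part of the captured groups.
preg_match_all('/(C[\d]+)(P[\d]+)?(YYY)? = (.+)/', $data, $withPrefix, PREG_SET_ORDER);

// Second pattern: only the digits after C and P are captured.
preg_match_all('/C([\d]+)(?:P([\d]+))?(YYY)? = (.+)/', $data, $digitsOnly, PREG_SET_ORDER);

// One array per row; index 0 is the full match, 1 to 4 are the groups.
foreach ($digitsOnly as $i => $m) {
    printf(
        "%s | with prefix: %s %s | digits only: %s %s | yyy: %s | value: %s\n",
        $m[0],
        $withPrefix[$i][1],
        $withPrefix[$i][2],
        $m[1],
        $m[2],
        $m[3],
        $m[4]
    );
}
```

Groups that do not take part in a match (a missing P number or YYY) come back as empty strings, which is why every row's array still has 5 elements.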

Upvotes: 5

zaf

Reputation: 23264

A solution without using regular expressions:

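<?php

$s="C23YYY = 'hello world'";

// Split on the first '=' only, so an '=' inside the value is preserved.
$p1=explode('=',$s,2);

$left=trim($p1[0]);
$right=trim($p1[1]);

$value=$right;

// Initialise the optional fields so they are always defined when printed.
$p_id='';
$yyy='';

if(substr($left,-3)!='YYY'){
    $pos=strpos($left,'P');
    if($pos!==false){
        // Form: "C" number "P" number
        $c_id=substr($left,1,$pos-1);
        $p_id=substr($left,$pos+1);
    }else{
        // Form: "C" number
        $c_id=substr($left,1);
    }
}else{
    // Form: "C" number "YYY" (drop the leading C and the trailing YYY)
    $c_id=substr($left,1,strlen($left)-4);
    $yyy='YYY';
}

print("c_id = $c_id <br/>");
print("p_id = $p_id <br/>");
print("value = $value <br/>");
print("yyy = $yyy <br/>");

?>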

Upvotes: 1
