Nemo

Reputation: 513

PHP - Find words of array in strings in array return value in both cases

Problem

I have two arrays: one contains words, the other contains strings. I have written code that looks for each word in each string. If one of the words is found in a string, I return that word. If none of the words in the array is found in the string, I would also like to return a value. I have tried several versions and always end up with the same cases.

The cases are:

Question

I have gone wrong somewhere. How can I store, for each string, the matching word if one is found, and a missing-value marker if none is found?

Code

Input
$csv_specie = array("Mouse","Human");
$CDNA = 'Human interleukin 2 (IL2)a;Ampicillin resistance gene (amp)a;Mouse amp gene';
# Split string by ; symbol into an array
  $CDNA_new = preg_split("/\b;\b/", $CDNA);
Output (I would like to end up with something like this):
foreach ($CDNA_new as $string){
    $specie = $result; ## e.g. Human
    echo $specie."-".$string.  "<br \>\n";
}

Result in web browser:

Human-Human interleukin 2 (IL2)a

NA-Ampicillin resistance gene (amp)a

Mouse-Mouse amp gene

First try

# Go through the string
  foreach($CDNA_new as $t){
# Go through the specie array
    foreach ($csv_specie as $c){ 
# Find specie in string
        if (strpos($t, $c) !== FALSE ){ 
            $match = $c;
            $specie = $c;
        }
    }
# If no match found set values to missing values
    if (isset($specie) !== TRUE){
        $match = "NA";
        $specie = "NA";
        }
    echo "----------------------".  "<br \>\n"; 
    echo '+'.$specie.  "<br \>\n"; 
    echo '+'.$match.  "<br \>\n"; 
    echo '+'.$t.  "<br \>\n";
    # Work further with the values to retrieve gene ID using eSearch

   } 

Second try

# use function to find match
function existor_not($str, $character) {
    if (strpos($str, $character) !== false) {
        return $character;
    }
    return $character = "0";
}
foreach ( $CDNA_new as $string ){
    
    foreach ( $csv_specie as $keyword ){
        
        $test = existor_not($string,$keyword);
    }
    echo "-".$test."|" . $string.  "<br \>\n"; 
    # Work further with the values to retrieve gene ID using eSearch
}

Third try

foreach ( $CDNA_new as $string ){
  foreach ( $csv_specie as $keyword ){
    $result = stripos($string, $keyword);
    if ($result === false) {
        $specie = "NA";
    }
    else {
        $specie = $keyword;
    }
}
if ($specie !== "NA"){
echo "match found";
}else{
   $match = "NA";
   $specie = "NA";
}
    echo $specie. "<br \>\n"; 
    # Work further with the values to retrieve gene ID using eSearch
    }

Upvotes: 1

Views: 1671

Answers (2)

Andreas

Reputation: 23958

You can use preg_grep to match case-insensitively inside a loop over the species.
I then use array_diff to remove the matched items from $cdna, so they are not matched again and later iterations have less to search.
Whatever is left in $cdna after the loop did not match, so I add it under an "N/A" item.

$csv_specie = array("Mouse","Human");
$CDNA = 'Human interleukin 2 (IL2)a;Ampicillin resistance gene (amp)a;Mouse amp gene;Some other stuff unknown to man kind;some other human stuff';

$cdna = explode(";", $CDNA);

foreach($csv_specie as $specie){
    // Whole-word, case-insensitive match of this specie against every item
    $matches[$specie] = preg_grep("/\b" . $specie . "\b/i", $cdna);
    echo $specie . " - " . implode("\n" . $specie . " - " , $matches[$specie]) . "\n";

    // Remove matched items from $cdna.
    // This makes $cdna smaller on each
    // iteration, which makes it faster.
    $cdna = array_diff($cdna, $matches[$specie]);
}

// What is left in $cdna is not matched
$matches["N/A"] = $cdna;

echo "\nN/A - " . implode("\nN/A - ", $matches["N/A"]);

Output:

Mouse - Mouse amp gene
Human - Human interleukin 2 (IL2)a
Human - some other human stuff

N/A - Ampicillin resistance gene (amp)a
N/A - Some other stuff unknown to man kind

https://3v4l.org/64Qmq
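
If you want to reuse this, the same idea can be wrapped in a function. This is only a sketch: the name groupBySpecies is made up for illustration, and I have added preg_quote on the assumption that a species name might one day contain a regex metacharacter. It returns the groups instead of echoing them.

```php
<?php
// Hypothetical helper (not part of the answer above): groups each string
// under the first species name it contains; leftovers go under "N/A".
function groupBySpecies(array $species, array $strings): array
{
    $groups = [];
    foreach ($species as $name) {
        // preg_quote() escapes regex metacharacters; "/" is our delimiter.
        $pattern = '/\b' . preg_quote($name, '/') . '\b/i';
        // preg_grep() keeps the original keys, so reindex with array_values().
        $groups[$name] = array_values(preg_grep($pattern, $strings));
        // Drop matched strings so a later species cannot claim them again.
        $strings = array_diff($strings, $groups[$name]);
    }
    // Whatever is left matched no species at all.
    $groups['N/A'] = array_values($strings);
    return $groups;
}

$groups = groupBySpecies(
    ['Mouse', 'Human'],
    explode(';', 'Human interleukin 2 (IL2)a;Ampicillin resistance gene (amp)a;Mouse amp gene')
);
print_r($groups);
```

Note that the result is grouped by species, so the original order of the strings is not kept. If you need one species per string in input order, a per-string loop is a better fit.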

Upvotes: 1

Nigel Ren

Reputation: 57121

Just using your first version as a basis, there were a couple of problems. You weren't resetting the field you are using to store the match, so the next time round it still had the match from the previous loop.

You were also using $qspecie and you were setting $specie.

foreach($CDNA_new as $t){
    $match = null;    // Reset value for match
    # Go through the specie array
    foreach ($csv_specie as $c){
        # Find specie in string
        if (strpos($t, $c) !== FALSE ){
            $match = $c;
            break;       // Don't carry on if you found a match
        }
    }
    # If no match found set values to missing values
    if ($match == null){
        $match = "NA";
    }
    echo "----------------------".  "<br \>\n";
    echo '+'.$match.  "<br \>\n";
    echo '+'.$t.  "<br \>\n";
    # Work further with the values to retrieve gene ID using eSearch
}

Or you could rely on setting the dummy value and then it's only overwritten if a genuine match is found...

foreach($CDNA_new as $t){
    $match = "NA";
    # Go through the specie array
    foreach ($csv_specie as $c){
        # Find specie in string
        if (strpos($t, $c) !== FALSE ){
            $match = $c;
            break;
        }
    }
    echo "----------------------".  "<br \>\n";
    echo '+'.$match.  "<br \>\n";
    echo '+'.$t.  "<br \>\n";
    # Work further with the values to retrieve gene ID using eSearch
} 
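
If you need the lookup in more than one place, the inner loop could also be moved into a small function, which removes the stale-variable problem altogether because nothing survives between calls. A rough sketch, assuming a case-insensitive match is acceptable (it uses stripos, as in your third try); the name findSpecies and the $default parameter are just illustrative.

```php
<?php
// Hypothetical helper: returns the first species found in $text,
// or $default when none of them occurs.
function findSpecies(string $text, array $species, string $default = 'NA'): string
{
    foreach ($species as $name) {
        // stripos() is the case-insensitive variant of strpos();
        // compare with !== false because a match at position 0 is falsy.
        if (stripos($text, $name) !== false) {
            return $name; // stop at the first match
        }
    }
    return $default; // nothing matched
}

$csv_specie = ['Mouse', 'Human'];
$CDNA_new = explode(';', 'Human interleukin 2 (IL2)a;Ampicillin resistance gene (amp)a;Mouse amp gene');

foreach ($CDNA_new as $t) {
    echo findSpecies($t, $csv_specie) . '-' . $t . "<br />\n";
}
```

With the input from the question this prints Human, NA and Mouse in front of the three strings, in that order.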

Upvotes: 1
