Reputation: 3237
I'm processing records in PHP and was wondering if there is an efficient method to pull out the genre: values from each of the following records. genre: can be anywhere in the string.
In the following string I need to pull out the word "alternative" (last word)
[media:keywords] => upc:00602527365589,Records,mercury,artist:Neon
Trees,Alternative,trees,neon,genre:alternative
In the following string I need to pull out "Latin / Pop,latino,Pop"
[media:keywords] => genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis
Fonsi,luis,universal,Fonsi,Latin
In the following record I need to pull out "other"
[media:keywords] => upc:793018101530,andy,razor,Other,tie,genre:other,artist:Andy
McKee,McKee,&
In the following record I need to pull out "rock,flotsam,jetsam"
[media:keywords] => and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam
And Jetsam,rock,geffen
I'm pulling my hair out on this (what is left anyway).
Upvotes: 1
Views: 169
Reputation: 3280
The problem with parsing this string is that it has no proper delimiter or quoting: a comma separates fields, but it can also appear inside a field. It's the same problem you get with CSV files that have no quotes.
If performance does not matter much to you, I would suggest parsing it in a more bulletproof way: make some assumption about what a key looks like (artist, genre, upc, etc.) and introduce a proper delimiter. The proof-of-concept code would be as follows (I have left the echoes in so you can see what's happening):
$string = "genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis Fonsi,luis,universal,Fonsi,Latin";
// introduce a delimiter in front of every "key:" token
$delimiter = '|';
$withDelimiter = preg_replace('/([a-z]+):/', $delimiter . '$0', $string);
echo $withDelimiter . "\n";
$fields = explode($delimiter, $withDelimiter);
$genre = null; // stays null if no genre: field is found
foreach ($fields as $field) {
    if (strlen($field)) {
        echo $field . "\n";
        // limit of 2 keeps any further ':' inside the value
        list($key, $valueWithPossiblyTrailingComma) = explode(':', $field, 2);
        if ($key === 'genre') {
            $genre = rtrim($valueWithPossiblyTrailingComma, ',');
            break;
        }
    }
}
echo $genre;
You can make it work in nearly all cases, and it lets you find any key, not only genre, but its performance will be low.
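As a rough sketch of the "find any key" idea, the fields could also be collected in one pass with preg_match_all(), without inserting a delimiter first. The function name parseKeywords() is hypothetical, and the sketch assumes that keys are lowercase words followed by a colon, as in the sample records:

```php
<?php
// Hypothetical helper: collect every "key:value" field into an array.
// Assumes keys are lowercase words followed by ':' (upc, genre, artist, ...).
function parseKeywords(string $keywords): array
{
    $fields = [];
    // A value runs lazily until the next ",key:" or the end of the string.
    preg_match_all('~\b([a-z]+):(.+?)(?=,[a-z]+:|$)~', $keywords, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $fields[$m[1]] = $m[2];
    }
    return $fields;
}

$fields = parseKeywords('and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen');
echo $fields['genre'], "\n"; // prints "rock,flotsam,jetsam"
```

Loose words that come before the first key (the leading "and" here) are simply dropped.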
I have made the following assumptions about your string:
Upvotes: 0
Reputation: 28891
Use the following regular expression coupled with preg_match():
~\bgenre:(.+?)(?=(,[^:,]+:|$))~
Your desired result will be in the first capture group, i.e. index 1 of the matches array (parameter 3); index 0 holds the whole match, including the genre: prefix.
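A minimal sketch of how that pattern might be applied to the four records from the question. It assumes each record is a single-line string (the wrapped lines in the question are joined with a space), and the variable names are illustrative:

```php
<?php
$pattern = '~\bgenre:(.+?)(?=(,[^:,]+:|$))~';

// The four sample records, assumed to be single-line strings.
$records = [
    'upc:00602527365589,Records,mercury,artist:Neon Trees,Alternative,trees,neon,genre:alternative',
    'genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis Fonsi,luis,universal,Fonsi,Latin',
    'upc:793018101530,andy,razor,Other,tie,genre:other,artist:Andy McKee,McKee,&',
    'and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen',
];

$genres = [];
foreach ($records as $record) {
    // $matches[0] is the whole match; $matches[1] is the genre value.
    if (preg_match($pattern, $record, $matches)) {
        $genres[] = $matches[1];
    }
}

print_r($genres);
```

The lazy (.+?) stops as soon as the lookahead sees either the next ",key:" token or the end of the string, which is why the value can contain commas.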
Upvotes: 2
Reputation: 6394
$mystring = 'abc';
$findme = 'a';
$pos = strpos($mystring, $findme);
// Note our use of ===. Simply == would not work as expected
// because the position of 'a' was the 0th (first) character.
if ($pos === false) {
    echo "The string '$findme' was not found in the string '$mystring'";
} else {
    echo "The string '$findme' was found in the string '$mystring'";
    echo " and exists at position $pos";
}
From the PHP Documentation for strpos
So you can just use $findme = "alternative"
Upvotes: 0
Reputation: 145482
You can indeed use a bit of pattern detection. You are always looking for the fixed genre: followed by one or more words or phrases, none of which may itself contain a :
So this might suffice:
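// The outer group captures the whole list; a repeated group on its own keeps only its last item.
preg_match('~\bgenre:((?:,?[^:,]+(?=,|$))+)~', $media_keywords, $match);
print $match[1];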
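One detail worth checking with this kind of pattern: in PCRE, a capture group that repeats keeps only its final iteration. The sketch below compares a repeated group on its own with the same repetition wrapped in an outer group, then splits the captured list with explode(). The sample string is the second record from the question, and the variable names are illustrative:

```php
<?php
$media_keywords = 'genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis Fonsi,luis,universal,Fonsi,Latin';

// Repeated group only: group 1 holds just the last iteration.
preg_match('~\bgenre:(,?[^:,]+(?=,|$))+~', $media_keywords, $last);
echo $last[1], "\n"; // prints ",Pop"

// Outer group around the repetition: group 1 holds the whole list.
preg_match('~\bgenre:((?:,?[^:,]+(?=,|$))+)~', $media_keywords, $all);
echo $all[1], "\n"; // prints "Latin / Pop,latino,Pop"

// Split the list into individual terms if they are needed separately.
$terms = explode(',', $all[1]);
```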
Upvotes: 0
Reputation: 124
I would use strpos() to find where the genre starts. The only problem is where to end it, because you do not have a delimiter. I would use the other known keywords, like "upc" and "artist", to check where the string needs to be cut off.
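A rough sketch of that approach, assuming the only other keys are upc and artist (any further keys would have to be added to the list). The function name extractGenre() and its parameters are hypothetical:

```php
<?php
// Hypothetical helper: find "genre:" with strpos() and cut the value
// at the nearest following ",key:" for a list of known keys.
function extractGenre(string $keywords, array $knownKeys = ['upc', 'artist']): ?string
{
    $marker = 'genre:';
    $start = strpos($keywords, $marker);
    if ($start === false) {
        return null; // no genre in this record
    }
    $start += strlen($marker);

    // Default: the value runs to the end of the string.
    $end = strlen($keywords);
    foreach ($knownKeys as $key) {
        $pos = strpos($keywords, ',' . $key . ':', $start);
        if ($pos !== false && $pos < $end) {
            $end = $pos; // keep the closest key after the genre
        }
    }
    return substr($keywords, $start, $end - $start);
}

echo extractGenre('and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen'), "\n";
// prints "rock,flotsam,jetsam"
```

This avoids regular expressions entirely, but a key that is missing from $knownKeys would end up inside the returned value.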
Upvotes: 0