Reputation: 75
I have a list of surnames and initials, separated by commas and stored in a variable called $author:
Shevchuk T.I., Piskun R.P., Vasenko T.B.
I need to split each entry into its initials and its surname and store them in separate variables.
Examples of the name formats that can occur:
Belemets N.I. / N.I. Belemets / N. I. Belemets / Belemets N. I. / Belemets N. / N. Belemets / Nu. Belemets / Belemets Nu.
I am currently trying to do this as follows:
$str_arr1= explode(", ", $author);
$initials= preg_split('([A-Z]\.[A-Z]\.|[A-Z]\.\s+[A-Z]\.|[A-Z][a-z]\.)', $str_arr1);
$surnames= preg_split('\w{3,15}', $str_arr1);
Example output of print_r($str_arr1):
Array
(
[0] => Gunas I. V.
[1] => Babych L. V.
[2] => Cherkasov E. V.
)
But $initials and $surnames do not output anything. What could be the problem? I am using MODX CMS.
Thanks in advance!
UPD:
The code now looks like this:
$str_arr= explode(", ", $author);
foreach($str_arr as $value){
    $preinitial= preg_split('/([A-Z]\.[A-Z]\.|[A-Z]\.\s+[A-Z]\.|[A-Z][a-z]\.\s+[A-Z]\.|[A-Z][a-z]\.)/', $value, -1, PREG_SPLIT_NO_EMPTY);
    $presurname= preg_split('/\w{3,15}/', $value, -1, PREG_SPLIT_NO_EMPTY);
    $initial = implode("", $preinitial);
    $surname = implode("", $presurname);
    echo '<given_name>'.$surname.'</given_name>';
    echo '<surname>'.$initial.'</surname>';
    echo "\r\n";
}
Upvotes: 0
Views: 476
Reputation: 23892
Your implementation has a few issues. preg_split expects a string as its subject, not an array, and the pattern needs delimiters. You should also pass the PREG_SPLIT_NO_EMPTY flag so you don't get empty values back. Your variable names are also inverted: the split removes what is matched, so $initials really holds the surname and $surnames really holds the initials.
$author = 'Shevchuk T.I., Piskun R.P., Vasenko T.B.';
$str_arr1= explode(", ", $author);
foreach($str_arr1 as $str_arr) {
    $initials= preg_split('/([A-Z]\.[A-Z]\.|[A-Z]\.\s+[A-Z]\.|[A-Z][a-z]\.)/', $str_arr, -1, PREG_SPLIT_NO_EMPTY);
    $surnames= preg_split('/\w{3,15}/', $str_arr, -1, PREG_SPLIT_NO_EMPTY);
    print_r($initials);
    print_r($surnames);
}
Demo: https://3v4l.org/1sgmX
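To make the inversion concrete, here is a minimal sketch that reuses the same two patterns but stores each result under the name that matches its content. The $result array, the $surnameParts and $initialParts names, and the trim() calls are my own additions for illustration, not part of the original code.

```php
<?php
$author = 'Shevchuk T.I., Piskun R.P., Vasenko T.B.';

$result = [];
foreach (explode(', ', $author) as $entry) {
    // Splitting on the initials pattern removes the initials,
    // so what remains is the surname.
    $surnameParts = preg_split(
        '/([A-Z]\.[A-Z]\.|[A-Z]\.\s+[A-Z]\.|[A-Z][a-z]\.)/',
        $entry,
        -1,
        PREG_SPLIT_NO_EMPTY
    );

    // Splitting on the surname pattern removes the surname,
    // so what remains is the initials.
    $initialParts = preg_split('/\w{3,15}/', $entry, -1, PREG_SPLIT_NO_EMPTY);

    // trim() drops the space that separated the two parts.
    $result[] = [
        'surname'  => trim(implode('', $surnameParts)),
        'initials' => trim(implode('', $initialParts)),
    ];
}

print_r($result);
```

For the sample string, the first element should come out as surname "Shevchuk" with initials "T.I.".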
I'd recommend ParsCit (https://github.com/knmnyn/ParsCit), a library I've used successfully to parse full references. You can probably pull out just the logic that parses the authors.
The surname check with 3,15 also won't work in all cases. For example, in https://www.ncbi.nlm.nih.gov/pubmed/29052443, Hong Yu won't be matched because the surname is only 2 characters.
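One way to avoid depending on the surname length is to look only for the initials and treat whatever is left as the surname. The sketch below is an assumption on my part, not a tested solution for every format: splitAuthor is a hypothetical helper, and it treats an initial as one uppercase ASCII letter, optionally followed by one lowercase letter, and then a dot.

```php
<?php
// Hypothetical helper: splits one author entry into surname and initials.
// Assumption: an initial looks like "N." or "Nu." and starts at a word boundary.
function splitAuthor(string $entry): array
{
    $pattern = '/\b[A-Z][a-z]?\./';

    // Collect every initial found in the entry.
    preg_match_all($pattern, $entry, $matches);
    $initials = implode('', $matches[0]);

    // Remove the initials, then collapse and trim the leftover whitespace.
    $rest    = preg_replace($pattern, '', $entry);
    $surname = trim(preg_replace('/\s+/', ' ', $rest));

    return ['surname' => $surname, 'initials' => $initials];
}

foreach (['Belemets N.I.', 'N. I. Belemets', 'Nu. Belemets', 'Yu H.'] as $entry) {
    print_r(splitAuthor($entry));
}
```

With this approach a two-character surname such as "Yu" is kept, because nothing is matched by length. An entry with an unabbreviated given name, such as "Hong Yu", contains no initials, so the whole string ends up in the surname field and would need separate handling.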
Upvotes: 2