Reputation: 519
I am trying to write PHP code that goes over a .csv file and identifies the words shared between the first line and the second, the first line and the third, and so on.
My test.csv file looks like the following:
1,"most of the birds lives in the village"
2,"birds in the village are free"
3,"no birds can be trapped in the village"
4,"people of the village are so kind"
5,"these birds are special species where most of them are so small"
Note: the numbers shown in the example above are part of the test.csv file. They are the sentence IDs, and I will use them as the keys for the comparison.
Using the test.csv file above, I want to compare line 1 with line 2 and report how many words they have in common, then compare line 1 with line 3 and report how many words they share, and so on, filling in the following format:
1:2 = ?
1:3 = ?
1:4 = ?
1:5 = ?
and
2:3 = ?
2:4 = ?
2:5 = ?
and
3:4 = ?
3:5 = ?
and
4:5 = ?
Upvotes: 0
Views: 196
Reputation: 1397
Another snippet:
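<?php
// Read every CSV row as [id, sentence]; drop line endings and blank lines.
$csv = array_map('str_getcsv', file('test.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));

foreach ($csv as $key => $row) {
    compareLine($csv, $key);
}

// Compare the row at $key with every row after it and print the shared words.
function compareLine($csv, $key)
{
    // array_unique keeps a repeated word (e.g. "the") from being listed twice
    $firstWords = array_unique(explode(" ", $csv[$key][1]));
    foreach (array_slice($csv, $key + 1) as $row) {
        $common = array_intersect($firstWords, explode(" ", $row[1]));
        // use count($common) here instead if only the number of matches is needed
        echo "{$csv[$key][0]}:{$row[0]} = " . implode(", ", $common) . PHP_EOL;
    }
}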
Upvotes: 1
Reputation: 13442
Try this one, not perfect but it does the job:
<?php
// You can use PHP's array_intersect
function checkSimilarity($string1, $string2) {
    $arr1 = explode(" ", $string1);
    $arr2 = explode(" ", $string2);
    // matched elements with duplicates removed, so each shared word counts once
    $result = array_intersect(array_unique($arr1), array_unique($arr2));
    return count($result); // number of matches
}

$sentences = [
    1 => "most of the birds lives in the village",
    2 => "birds in the village are free",
    3 => "no birds can be trapped in the village",
    4 => "people of the village are so kind",
    5 => "these birds are special species where most of them are so small"
];

// loop through the array
foreach ($sentences as $mainKey => $value) {
    // compare each sentence only with the sentences that have a higher ID
    foreach ($sentences as $key => $v) {
        if ($mainKey < $key) {
            echo $mainKey . ":" . $key . " = " . checkSimilarity($value, $v) . "\n";
        }
    }
}
Sample sandbox: http://sandbox.onlinephpfunctions.com/code/c571beb140a1dc114b42bfd884fbe33e348f76c5
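If the sentences should come from CSV rows rather than a hard-coded array, the same idea can be sketched roughly as below. This is only a sketch: countSharedWords and pairwiseCounts are made-up helper names, the lowercasing and whitespace splitting are my own assumptions about what counts as the "same" word, and the rows are inlined so it runs without the file (PHP 7.4+ assumed for the arrow function).

```php
<?php
// Hypothetical helper: number of distinct words two sentences share.
function countSharedWords(string $a, string $b): int
{
    // Assumption: compare case-insensitively and split on any run of whitespace.
    $wordsA = array_unique(preg_split('/\s+/', strtolower(trim($a)), -1, PREG_SPLIT_NO_EMPTY));
    $wordsB = array_unique(preg_split('/\s+/', strtolower(trim($b)), -1, PREG_SPLIT_NO_EMPTY));
    return count(array_intersect($wordsA, $wordsB));
}

// Hypothetical helper: map "idA:idB" => shared-word count for every pair of rows.
// Each row is expected to be [id, sentence], as produced by str_getcsv.
function pairwiseCounts(array $rows): array
{
    $rows = array_values($rows);
    $total = count($rows);
    $counts = [];
    for ($i = 0; $i < $total; $i++) {
        for ($j = $i + 1; $j < $total; $j++) {
            [$idA, $textA] = $rows[$i];
            [$idB, $textB] = $rows[$j];
            $counts["$idA:$idB"] = countSharedWords($textA, $textB);
        }
    }
    return $counts;
}

// Inline copy of the question's test.csv so the sketch runs on its own.
// For the real file: $lines = file('test.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$lines = [
    '1,"most of the birds lives in the village"',
    '2,"birds in the village are free"',
    '3,"no birds can be trapped in the village"',
    '4,"people of the village are so kind"',
    '5,"these birds are special species where most of them are so small"',
];
$rows = array_map(fn($line) => str_getcsv($line, ',', '"', '\\'), $lines);

foreach (pairwiseCounts($rows) as $pair => $count) {
    echo "$pair = $count" . PHP_EOL;
}
```

With the five sentences from the question this should print 1:2 = 4, 1:3 = 4, 1:4 = 3, 1:5 = 3, 2:3 = 4, 2:4 = 3, 2:5 = 2, 3:4 = 2, 3:5 = 1 and 4:5 = 3, one pair per line. Starting the inner loop at $i + 1 is what keeps each pair from being reported twice.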
Upvotes: 2