Reputation: 141
I'm new to PHP and I'm writing a program that counts the words in a given text file. This is my text file:
Hello Hello Hello Hello
Hello Word array sum
Hello Find
This is my code (PHP):
/*Open file*/
$handle = fopen($_FILES['file']['tmp_name'], 'r');
/*read all lines*/
while (! feof($handle)) {
    $line = fgets($handle);
    /*using array_count_values with str_word_count to count words*/
    $result= (array_count_values(str_word_count(strip_tags(strtoupper($line)), 1)));
    /*sort array*/
    arsort($result);
    /*show the first ten positions and print array*/
    $top10words2 = array_slice($result, 0, 10);
    print "<pre>";
    print_r ($top10words2);
    print "</pre>";
}
fclose($handle);
But my output looks like this:
Array{
[Hello] => 4
}
Array{
[Hello] => 1
[Word] => 1
[array] => 1
[sum] => 1
}
Array{
[Hello] => 1
[Find] => 1
}
I need the output to be like this:
Array{
[Hello] => 6
[Word] => 1
[array] => 1
[sum] => 1
[find] => 1
}
Any tips?
Upvotes: 1
Views: 265
Reputation: 78994
I agree with the file_get_contents() answer from Ayaou; however, for very large files you may need to read line by line, as you've started. You want to build the array of words in the loop, and then count, sort and slice afterward:
$result = array();
while(!feof($handle)) {
    $line = fgets($handle);
    $result = array_merge($result, str_word_count(strip_tags(strtoupper($line)), 1));
}
$result = array_count_values($result);
arsort($result);
$top10words2 = array_slice($result, 0, 10);
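To check this approach against the sample text from the question, here is a minimal runnable sketch. It assumes a php://memory stream in place of the uploaded file (in the original code $handle comes from fopen() on $_FILES['file']['tmp_name']), and it loops on the return value of fgets() rather than on feof(), which is a swapped-in idiom and not part of the code above.

```php
<?php
// Assumption: the sample text is written to an in-memory stream so the
// sketch runs without an upload form.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "Hello Hello Hello Hello\nHello Word array sum\nHello Find\n");
rewind($handle);

$words = array();
// fgets() returns false once there is nothing left to read, so the loop
// condition doubles as the end-of-file check.
while (($line = fgets($handle)) !== false) {
    $words = array_merge($words, str_word_count(strip_tags(strtoupper($line)), 1));
}
fclose($handle);

// Count once over the words of the whole file, then sort and slice.
$result = array_count_values($words);
arsort($result);
$top10words2 = array_slice($result, 0, 10);
print_r($top10words2);
```

HELLO comes out first with a count of 6, followed by the four words that occur once. Note that $words still holds every word of the file at once, so this saves less memory than it may seem.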
Upvotes: 0
Reputation: 975
You're not doing anything to combine the word counts that you calculate for each line. By setting $result = array_count_values(...) you're overwriting the results from the previous iteration. Additionally, because you slice and print inside the loop, you never act on the full result set, so you never see the real top 10 most used words.
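Your code needs two changes: accumulate the counts across lines instead of overwriting them, and move the sort, slice and print to after the loop.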
Using file_get_contents() will work, but depending on how large the file is that you're processing, this can cause memory limit errors. A solution that utilizes your initial method would look like this:
$results = [];
while (!feof($handle)) {
    $line = fgets($handle);
    $line_results = array_count_values(str_word_count(strip_tags(strtoupper($line)), 1));
    foreach ($line_results as $word => $count) {
        if (isset($results[$word])) {
            $results[$word] += $count;
        }
        else {
            $results[$word] = $count;
        }
    }
}
arsort($results);
// etc...
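For reference, the whole thing could be packaged roughly as follows. This is only a sketch: the function name topWords() and its $limit parameter are made up for illustration, and the temporary file stands in for the uploaded file from the question.

```php
<?php
/**
 * Hypothetical helper (not from the original code): count words in a file
 * line by line and return the $limit most frequent, highest count first.
 */
function topWords($path, $limit = 10)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return array();
    }

    $results = array();
    while (($line = fgets($handle)) !== false) {
        $words = str_word_count(strip_tags(strtoupper($line)), 1);
        foreach (array_count_values($words) as $word => $count) {
            // Add this line's count to the running total for the word.
            $results[$word] = (isset($results[$word]) ? $results[$word] : 0) + $count;
        }
    }
    fclose($handle);

    arsort($results);
    // array_slice() keeps string keys, so the words survive the slice.
    return array_slice($results, 0, $limit);
}

// Usage with the sample text from the question, via a temporary file.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "Hello Hello Hello Hello\nHello Word array sum\nHello Find\n");
$top = topWords($path, 3);
unlink($path);
print_r($top);
```

Only one line and the running totals are in memory at any time, which is the point of reading the file with fgets() instead of loading it whole.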
Upvotes: 0
Reputation: 1774
Use file_get_contents() instead:
$fileContent = file_get_contents($_FILES['file']['tmp_name']);
/* using array_count_values with str_word_count to count words */
$result = (array_count_values(str_word_count(strip_tags(strtoupper($fileContent)), 1)));
/* sort array */
arsort($result);
/* show the first ten positions and print array */
$top10words2 = array_slice($result, 0, 10);
print "<pre>";
print_r($top10words2);
print "</pre>";
Here is the output:
Array
(
[HELLO] => 6
[FIND] => 1
[SUM] => 1
[ARRAY] => 1
[WORD] => 1
)
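The four words with a count of 1 are tied, and the order in which arsort() leaves tied entries depends on the PHP version (sorting has only been stable since PHP 8.0), so the tail of the list may come out in a different order on your machine. If a repeatable order matters, one option is to break ties alphabetically. A minimal sketch, where $counts is a hard-coded stand-in for the counted array:

```php
<?php
// Stand-in for the result of array_count_values() on the sample text.
$counts = array('HELLO' => 6, 'WORD' => 1, 'ARRAY' => 1, 'SUM' => 1, 'FIND' => 1);

// Sort by count (descending), then by word (ascending) to break ties.
// The closure works on a copy of $counts, so lookups stay valid while sorting.
uksort($counts, function ($a, $b) use ($counts) {
    if ($counts[$a] !== $counts[$b]) {
        return $counts[$b] - $counts[$a];
    }
    return strcmp($a, $b);
});

print_r(array_slice($counts, 0, 10));
```

This prints HELLO first, then ARRAY, FIND, SUM and WORD, regardless of the PHP version.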
Upvotes: 1