Reputation: 282
Previously I had these two variables, "a" and "total", in two different classes, but I couldn't access the values of one class from the other.
So I put all the code into a single class, but now only one of the two variables gets the correct value at a time:
System.out.println("( "+file1.getName() +" )-" +" Total no of words=" + a +"Total repeated words counted:"+total);
My sample output so far is either:
( Blog 39.txt )-Total repeated words counted:4,total no of words:0
or:
( Blog 39.txt )-Total repeated words counted:0,total no of words:82
The output I need is:
( Blog 39.txt )-Total repeated words counted:4,total no of words:82
When I run it, only one of "a" and "total" is correct; which one depends on the order of the code that computes them.
Can anyone tell me how to get the correct output for both variables? I am a beginner in Java.
Here is my code:
package ramki;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FilenameFilter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
public class newrepeatedcount {
public static void main(String[] args) {
FilenameFilter filter = new FilenameFilter() {
public boolean accept(File dir, String name) {
return name.endsWith(".txt");
}
};
File folder = new File("E:\\testfolder\\");
File[] listOfFiles = folder.listFiles(filter);
for (int i = 0; i < listOfFiles.length; i++) {
File file1 = listOfFiles[i];
BufferedReader ins = null;
try {
ins = new BufferedReader(
new InputStreamReader(new FileInputStream(file1)));
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
String line = "", str = "";
String st = null;
try {
st = IOUtils.toString(ins);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// split text to array of words
String[] words = st.split("\\s");
// frequency array
int[] fr = new int[words.length];
// init frequency array
for (int i1 = 0; i1 < fr.length; i1++)
fr[i1] = -1;
// count words frequency
for (int i1 = 0; i1 < words.length; i1++) {
for (int j = 0; j < words.length; j++) {
if (words[i1].equals(words[j])) {
fr[i1]++;
}
}
}
// clean duplicates
for (int i1 = 0; i1 < words.length; i1++) {
for (int j = 0; j < words.length; j++) {
if (words[i1].equals(words[j])) {
if (i1 != j)
words[i1] = "";
}
}
}
int a = 0;
try {
while ((line = ins.readLine()) != null) {
str += line + " ";
}
} catch (IOException e) {
e.printStackTrace();
}
StringTokenizer st1 = new StringTokenizer(str);
while (st1.hasMoreTokens()) {
String s = st1.nextToken();
a++;
}
int total = 0;
for (int i1 = 0; i1 < words.length; i1++) {
if (words[i1] != "") {
// System.out.println(words[i1]+"="+fr[i1]);
total += fr[i1];
}
}
System.out.println("( " + file1.getName() + " )-"
+ "Total repeated words counted:" + total + ","
+ "total no of words:" + a);
// System.out.println("total no of words:"+a);
}
}
}
Upvotes: 0
Views: 1387
Reputation: 388
I am not sure this is the best solution, but it builds a Map from each word to the number of times it appears in your text (the map's key set gives you the distinct words). You can then use the map as your requirements dictate.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
public class NewRepeatedCount {
public static void main(String... arg0)
{
BufferedReader br = null;
Map<String, Integer> counterMap = new HashMap<String, Integer>();
try {
String sCurrentLine;
br = new BufferedReader(new FileReader("C:\\testing.txt"));
while ((sCurrentLine = br.readLine()) != null) {
String[] words = sCurrentLine.split("\\s");
for(String word : words)
{
int count = 1;
if(counterMap.get(word) != null)
{
count = counterMap.get(word);
count++;
counterMap.put(word, count);
}else{
counterMap.put(word, count);
}
}
}
System.out.println(counterMap.entrySet());
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
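To connect the map back to the two numbers the question asks for, here is a minimal sketch. It assumes that "repeated words" means every occurrence of a word after its first one, which is only one reading of the question. The class name WordTotals and its helper methods are hypothetical, and the text is passed in as a string so the sketch runs without any file.

```java
import java.util.HashMap;
import java.util.Map;

class WordTotals {
    // Builds a word -> occurrence count map, skipping empty tokens.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.split("\\s+")) {
            if (word.isEmpty()) continue; // leading whitespace yields one empty token
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Total number of words: the sum of all counts.
    static int totalWords(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) total += c;
        return total;
    }

    // Assumption: "repeated" means every occurrence after the first one.
    static int repeatedWords(Map<String, Integer> counts) {
        int repeated = 0;
        for (int c : counts.values()) repeated += c - 1;
        return repeated;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("the cat and the dog and the bird");
        // prints: Total repeated words counted:3,total no of words:8
        System.out.println("Total repeated words counted:" + repeatedWords(counts)
                + ",total no of words:" + totalWords(counts));
    }
}
```

Because both totals come from the same map, the text only has to be read once.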
Upvotes: 1
Reputation: 533530
Whenever you have data processing like this, Java 8's Stream API is likely to be the best choice.
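// imports needed: java.io.IOException, java.nio.file.Files, java.nio.file.Paths,
// java.util.Map, java.util.stream.Collectors, java.util.stream.Stream
// get all the files under this folder (Files.walk itself throws IOException,
// so the enclosing method has to declare or catch it)
Files.walk(Paths.get("E:\\testfolder\\"))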
// keep all the files ending in .txt
.filter(p -> p.toString().toLowerCase().endsWith(".txt"))
.forEach(p -> {
try {
// process all the lines of the file.
Map<String, Long> wordCount = Files.lines(p)
// break the lines into words
.flatMap(l -> Stream.of(l.split("\\s")))
// collect the words and count them
.collect(Collectors.groupingBy(w -> w, Collectors.counting()));
// find how many values are more than 1
long wordDuplicates = wordCount.values().stream().filter(l -> l > 1).count();
// get the sum of all the values.
long totalWords = wordCount.values().stream().mapToLong(l -> l).sum();
System.out.println(p + " has " + wordDuplicates + " duplicates and " + totalWords + " words");
// catch the IOException at the end because you can't do anything more with the file if this happens.
} catch (IOException e) {
e.printStackTrace();
}
});
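For anyone who wants to try the counting step in isolation, here is a self-contained sketch of the same pipeline working on in-memory lines. The class name StreamWordCount and the method summarize are hypothetical names, and the extra filter for empty tokens is an addition that is not in the snippet above. With a real file you could pass Files.lines(path) instead, ideally inside a try-with-resources block so the file gets closed.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamWordCount {
    // Returns {number of distinct words occurring more than once, total words}.
    static long[] summarize(Stream<String> lines) {
        Map<String, Long> wordCount = lines
                // break the lines into words
                .flatMap(l -> Stream.of(l.split("\\s+")))
                // drop empty tokens produced by leading whitespace
                .filter(w -> !w.isEmpty())
                // group equal words and count each group
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        long duplicates = wordCount.values().stream().filter(c -> c > 1).count();
        long total = wordCount.values().stream().mapToLong(Long::longValue).sum();
        return new long[] {duplicates, total};
    }

    public static void main(String[] args) {
        long[] r = summarize(Stream.of("the cat and the dog", "and the bird"));
        // prints: 2 duplicates and 8 words
        System.out.println(r[0] + " duplicates and " + r[1] + " words");
    }
}
```

Note that "duplicates" here counts distinct words that occur more than once, which may differ from the "total repeated words" figure in the question.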
Upvotes: 0
Reputation: 1640
Once you have read a stream to its end, you cannot read anything more from it. In your code, IOUtils.toString(ins) already consumes the whole file, so the later readLine() loop gets null immediately and "a" stays 0. Your code could be improved in many ways, but here is a quick and dirty fix to make it work: create a new BufferedReader and assign it to the variable "ins" just before you compute the value of "a".
...
try {
ins = new BufferedReader ( new InputStreamReader(new FileInputStream(file1)));
while ((line = ins.readLine()) != null) {
str += line + " ";
}
} catch (IOException e) {
e.printStackTrace();
}
...
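The effect can be reproduced without any file. The following is a small sketch in which a StringReader stands in for the file; the class name ReadOnceDemo and the helper readAll are hypothetical names used only for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadOnceDemo {
    // Reads everything that is left in the reader into one string.
    static String readAll(BufferedReader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append(' ');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // StringReader stands in for the file so the sketch runs anywhere.
        BufferedReader ins = new BufferedReader(new StringReader("one two\nthree"));
        String first = readAll(ins);   // consumes the whole stream
        String second = readAll(ins);  // nothing left, so this is empty
        System.out.println("first read:  [" + first + "]");   // [one two three ]
        System.out.println("second read: [" + second + "]");  // []
    }
}
```

An alternative to reopening the file is to keep the string from the first read and compute both counts from it.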
Upvotes: 1