Reputation: 27
I have a text file that contains two paragraphs. Some words are followed by periods or commas, and when I read the file that punctuation ends up attached to the words in my list. This is the code that opens the file:
public static Scanner openTextFile(String fileName) {
    Scanner data;
    try {
        data = new Scanner(new File(fileName));
        return data;
    }
    catch (FileNotFoundException e) {
        System.out.println(fileName + " did not read correctly");
    }
    data = null;
    return data;
}
But I want it to read only the words and ignore any commas, periods, or brackets next to them. How can I achieve this?
I tried the replaceAll method, but it didn't work at all:
public static void readOtherFile(Scanner data, int g[][], Key[] hashTable, int[] keyWordCounter, int modValue) {
    int lineCounter = 0, wordCounter = 0;
    String x;
    String[] y;
    while (data.hasNextLine()) {
        lineCounter += 1;
        x = data.nextLine();
        /* The following conditional statement takes care of the issue of there being an
         * entirely blank line encountered before reaching the end of the text file.
         */
        if (x.length() == 0) {
            x = data.nextLine();
        }
        x = x.toLowerCase();
        x = x.replaceAll("\\p{Punct}", "");
        y = x.split(" ");
        wordCounter += y.length;
        // method compares a token to a key word to see if they are identical.
        checkForKeyWord(y, g, hashTable, keyWordCounter, modValue);
    }
    // method prints statistical results
    printResults(lineCounter, wordCounter, hashTable, keyWordCounter);
}
Sample file sample file link
Upvotes: 0
Views: 702
Reputation: 32
To achieve this, you can apply a regular expression to the data you read from the file as a string. Read the file and return its contents first, and only then do the string manipulation. It's bad practice to do the string manipulation inside the while loop.
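import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Reads the whole file into a single String using the given encoding.
static String readFile(String path, Charset encoding) throws IOException {
    byte[] encoded = Files.readAllBytes(Paths.get(path));
    return new String(encoded, encoding);
}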
// d is the String returned by readFile(...); this removes commas and periods only.
String data = d.replaceAll("[,.]", "");
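If brackets and other punctuation should go as well, the same idea can be widened with the \p{Punct} class the question already tried. Below is a minimal sketch, not a drop-in fix: the class name WordCleaner and the method words are made up for illustration. It assumes that splitting on runs of whitespace (rather than a single space) is acceptable, so that double spaces or blank lines do not produce empty tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original code.
final class WordCleaner {

    // Lowercases the text, removes ASCII punctuation, and splits on whitespace.
    static List<String> words(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                             .replaceAll("\\p{Punct}", "")
                             .trim();
        List<String> result = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return result; // avoid the single empty token that "".split(...) yields
        }
        for (String token : cleaned.split("\\s+")) {
            result.add(token);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [hello, world, this, is, a, test]
        System.out.println(words("Hello, world. (This is a test.)"));
    }
}
```

Note that removing punctuation outright also turns a word like "don't" into "dont"; whether that matters depends on the key words being matched.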
Upvotes: 1