Reputation: 19
I am trying to print all the words in a text file in ascending order. When I run my program, the words come out in ascending order, but each word is printed only once. I want every occurrence of each word printed (duplicates wanted), and I am not sure what I'm doing wrong. I would also like to print only the words, not the punctuation marks in the text file. I know I need to use "split", but I'm not sure how to use it properly. I've worked with it once before but cannot remember how to apply it here.
This is the code I have so far:
public class DisplayingWords {
public static void main(String[] args) throws
FileNotFoundException, IOException
{
Scanner ci = new Scanner(System.in);
System.out.print("Please enter a text file to open: ");
String filename = ci.next();
System.out.println("");
File file = new File(filename);
BufferedReader br = new BufferedReader(new FileReader(file));
StringBuilder sb = new StringBuilder();
String str;
while((str = br.readLine())!= null)
{
/*
* This is where i seem to be having my problems.
* I have only ever used a split once before and can not
* remember how to properly use it.
* i am trying to get the print out to avoid printing out
* all the punctuation marks and have only the words
*/
// String[] str = str.split("[ \n\t\r.,;:!?(){}]");
str.split("[ \n\t\r.,;:!?(){}]");
sb.append(str);
sb.append(" ");
System.out.println(str);
}
ArrayList<String> text = new ArrayList<>();
StringTokenizer st = new StringTokenizer(sb.toString().toLowerCase());
while(st.hasMoreTokens())
{
String s = st.nextToken();
text.add(s);
}
System.out.println("\n" + "Words Printed out in Ascending "
+ "(alphabetical) order: " + "\n");
HashSet<String> set = new HashSet<>(text);
List<String> arrayList = new ArrayList<>(set);
Collections.sort(arrayList);
for (Object ob : arrayList)
System.out.println("\t" + ob.toString());
}
}
Upvotes: 0
Views: 206
Reputation: 13066
The problem is here:
HashSet<String> set = new HashSet<>(text);
A Set doesn't contain duplicates, so copying the list into a HashSet discards every repeated word.
You should instead use the following code:
//HashSet<String> set = new HashSet<>(text);
List<String> arrayList = new ArrayList<>(text);
Collections.sort(arrayList);
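To make the difference concrete, here is a minimal sketch that sorts the same words both ways. The class name DuplicatesDemo, the two helper methods, and the sample words are made up for illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical demo class; names and sample data are illustrative only.
public class DuplicatesDemo {

    // Sorts a copy of the list; every occurrence of every word is kept.
    static List<String> sortKeepingDuplicates(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted);
        return sorted;
    }

    // Mirrors the approach in the question: the HashSet collapses repeats
    // before the sort ever runs.
    static List<String> sortDroppingDuplicates(List<String> words) {
        List<String> sorted = new ArrayList<>(new HashSet<>(words));
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("the", "cat", "the", "hat");
        System.out.println(sortKeepingDuplicates(text));  // [cat, hat, the, the]
        System.out.println(sortDroppingDuplicates(text)); // [cat, hat, the]
    }
}
```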
Also, for the split method I would suggest using:
s.split("[\\s\\.,;:\\?!]+");
For example, consider the code below:
String s = "Abcdef;Ad; country hahahahah? ad! \n alsj;d;lajfa try.... wait, which wish work";
String[] sp = s.split("[\\s\\.,;:\\?!]+");
for (String sr : sp )
{
System.out.println(sr);
}
Its output is as follows:
Abcdef
Ad
country
hahahahah
ad
alsj
d
lajfa
try
wait
which
wish
work
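One way this could be wired into the original program is sketched below. Note that split returns a new array and leaves the string untouched, so calling str.split(...) without using the result (as in the question) has no effect. The class name SplitWords and the method sortedWords are hypothetical, and the sketch takes the lines as a List rather than reading a file, so that it stays self-contained; in the real program each line would come from br.readLine().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; a sketch, not the only way to do this.
public class SplitWords {

    // Lower-cases each line, splits it on whitespace and the listed
    // punctuation, and returns all words sorted with duplicates kept.
    static List<String> sortedWords(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            // split returns a new array; the original string is unchanged
            for (String token : line.toLowerCase().split("[\\s\\.,;:\\?!]+")) {
                // a line that is empty or starts with a delimiter
                // produces one empty token, which is skipped here
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        Collections.sort(words);
        return words;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hello, world!", "  hello again.");
        for (String word : sortedWords(lines)) {
            System.out.println("\t" + word);
        }
    }
}
```

With the two sample lines above, this prints again, hello, hello, world, one per line. The character class only covers the punctuation listed in the pattern; other symbols such as parentheses or quotes would need to be added to it.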
Upvotes: 1
Reputation: 31204
Your duplicates are probably being stripped out here:
HashSet<String> set = new HashSet<>(text);
A Set does not contain duplicates, so I'd just sort your text ArrayList directly:
Collections.sort(text);
for (String word : text)
System.out.println("\t" + word);
Upvotes: 1