Reputation:
I'm working on a program that reads Wikipedia page view statistics from a .txt file. So far I have a load method that reads the file as follows:
public void loadPVSF(String x) throws FileNotFoundException, IOException {
    FileInputStream f = new FileInputStream(x); // obtains bytes from the input file
    DataInputStream in = new DataInputStream(f); // reads primitive Java types
    BufferedReader br = new BufferedReader(new InputStreamReader(in));
    String temp;
    while ((temp = br.readLine()) != null) {
        // readLine() already strips the newline, so tempArray holds just this one line
        String[] tempArray = temp.split("\n");
        for (String st : tempArray) { // puts each element of tempArray through String st
            String[] MainArray = st.split(" "); // splits the line into its space-separated fields
            int linecounter = 0; // restart the field numbering for every line
            for (String str : MainArray) {
                if (linecounter < 5) {
                    linecounter++;
                    System.out.println(linecounter + ": " + str);
                }
            }
        }
    }
    br.close();
}
Running this produces command-line output like the following sample:
1: commons.m
2: Category:Gracie_Gold
3: 1
4: 7406
1: commons.m
2: Category:Grad_Maribor
3: 1
4: 7324
1: commons.m
2: Category:Grade_II*_listed_houses_in_Cheshire
3: 1
4: 7781
Basically each set of four lines is:
1 - Language/Project
2 - Article Title
3 - Number of Page views
4 - Size of the Page (bytes)
I need to know how to assign each of these fields correctly. What I need in the end is a hash table that maps each article title to its number of views, so that I can determine which article has the most views.
Any tips or advice would be greatly appreciated.
Sample of the input .txt file:
nl Andreas_(apostel) 7 103145
nl Andreas_Baader 4 46158
nl Andreas_Bjelland 2 28288
nl Andreas_Burnier 2 11545
nl Andreas_Charles_van_Braam_Houckgeest 1 10373
nl Andreas_Eschbach 1 365
nl Andreas_Grassl 1 365
Upvotes: 2
Views: 306
Reputation: 11298
You can have a simple class like
class Page {
    String languageOrProject;
    String articleTitle;
    int views;
    int size;
}
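Then you can sort the pages with a Comparator. Or, if you only need the maximum page views, put each page in a TreeMap with the view count as the key and the article title as the value. At the end, map.firstKey() gives the lowest view count and map.lastKey() the highest; map.firstEntry().getValue() and map.lastEntry().getValue() give the corresponding titles. Note that titles sharing the same view count overwrite each other in such a map, so only one title per count is kept.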
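As a rough sketch of how this could fit together with the hash table the question asks for, the code below parses each line into a Page, stores title and view count in a HashMap, and picks the entry with the highest count using a comparator. The names PageViews, parseLine, viewsByTitle and mostViewed are made up for illustration, and the sketch assumes every valid line has exactly four space-separated fields. It also sums the views when the same title appears more than once (for example under different projects), which may or may not be what you want.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PageViews {

    /** One record of the statistics file: project, title, view count, size in bytes. */
    public static class Page {
        final String languageOrProject;
        final String articleTitle;
        final int views;
        final long size;

        Page(String languageOrProject, String articleTitle, int views, long size) {
            this.languageOrProject = languageOrProject;
            this.articleTitle = articleTitle;
            this.views = views;
            this.size = size;
        }
    }

    /** Parses a line such as "nl Andreas_Baader 4 46158"; returns null if it is malformed. */
    public static Page parseLine(String line) {
        String[] fields = line.trim().split(" ");
        if (fields.length != 4) {
            return null; // assumption: valid records always have exactly four fields
        }
        try {
            return new Page(fields[0], fields[1],
                    Integer.parseInt(fields[2]), Long.parseLong(fields[3]));
        } catch (NumberFormatException e) {
            return null; // view count or size was not a number
        }
    }

    /** Builds a title -> views map, summing the views if a title occurs more than once. */
    public static Map<String, Integer> viewsByTitle(List<String> lines) {
        Map<String, Integer> views = new HashMap<>();
        for (String line : lines) {
            Page p = parseLine(line);
            if (p != null) {
                views.merge(p.articleTitle, p.views, Integer::sum);
            }
        }
        return views;
    }

    /** Returns the title with the most views, or null if the map is empty. */
    public static String mostViewed(Map<String, Integer> views) {
        if (views.isEmpty()) {
            return null;
        }
        return Collections.max(views.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        // Lines taken from the sample input in the question
        List<String> sample = Arrays.asList(
                "nl Andreas_(apostel) 7 103145",
                "nl Andreas_Baader 4 46158",
                "nl Andreas_Bjelland 2 28288");
        Map<String, Integer> views = viewsByTitle(sample);
        String top = mostViewed(views);
        System.out.println(top + " " + views.get(top)); // prints "Andreas_(apostel) 7"
    }
}
```

In your loadPVSF method you would collect the lines from br.readLine() (or call parseLine on each line directly) instead of using a fixed list. Unlike the TreeMap keyed by views, the HashMap keyed by title does not lose articles that happen to share a view count.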
Upvotes: 1