user2483015

Reputation: 19

Assign certain score according to words

Thanks, solved.

My words.txt file looks like the following :

await   -1

awaited -1

award   3

awards  3

The values are tab-delimited. I want to read each word's score (for example, await = -1 point) and then score every sentence in my comment.txt file according to the words.txt file. The output of the program should look like this, for example:

-1.0

2.0

0.0

5.0

I am stuck and do not know what to do next. So far I have only managed to read the words.txt file.

    const char DELIM = '\t'; 
    const string FILENAME = @"words.txt"; 

    string record;  
    string[] fields; 

    FileStream inFile; 
    StreamReader reader; 


    inFile = new FileStream(FILENAME, FileMode.Open, FileAccess.Read);

    reader = new StreamReader(inFile);

    record = reader.ReadLine();

    //Splitting up a string using the delimiter and
    //storing the split strings in a string array
    fields = record.Split(DELIM);

    double values = double.Parse(fields[1]);
    string words = fields[0];

Upvotes: 0

Views: 371

Answers (4)

Rémi

Reputation: 3967

You should have a look at Dictionary. You can map each word you want to score to its value in the dictionary. That way you can just loop over all the words you read and output the value of each one.

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        Dictionary<string, int> dictionary = new Dictionary<string, int>();
        dictionary.Add("await", -1);
        dictionary.Add("awaited", -1);
        dictionary.Add("award", 3);
        dictionary.Add("awards", 3);

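        // Read your file and split its content on the separator into an array.
        // A hard-coded sentence stands in for the file content here.
        string[] array = "they await the award".Split(' ');

        for (int i = 0; i < array.Length; i++)
        {
            // Output the value of each word that is in the dictionary
            int value;
            if (dictionary.TryGetValue(array[i], out value))
                Console.WriteLine("{0}: {1}", array[i], value);
        }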
    }
}

Upvotes: 1

user2237264

A working solution without a dictionary:

using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        foreach (var comment in File.ReadAllLines(@"..\..\comments.txt"))
            Console.WriteLine(GetRating(comment));

        Console.ReadLine();
    }

    static double GetRating(string comment)
    {
        double rating = 0;

        var wordsLines = from line in File.ReadAllLines(@"..\..\words.txt")
                         where !String.IsNullOrEmpty(line)
                         select Regex.Replace(line, @"\s+", " ");

        var wordRatings = from wLine in wordsLines
                          select new { Word = wLine.Split()[0], Rating = Double.Parse(wLine.Split()[1]) };

        // Split the comment into lowercase words once, not once per rating
        var commentWords = comment.ToLower().Split(new Char[] {' ', ',', '.', ':', ';'});

        foreach (var wr in wordRatings)
        {
            // Add the rating of every listed word that occurs in the comment
            if (commentWords.Contains(wr.Word))
                rating += wr.Rating;
        }

        return rating;
    }
}

Upvotes: 0

nahammel

Reputation: 606

Combining both vadz's answer and im_a_noob's answer, you should be able to read your words.txt file and put it into a dictionary.

    Dictionary<string, double> wordDictionary = new Dictionary<string, double>();
    using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read))
    using (StreamReader reader = new StreamReader(fileStream))
    {
        int lineCount = 0;
        int skippedLine = 0;
        while (!reader.EndOfStream)
        {
            string[] fields = reader.ReadLine().Split('\t');
            string word = fields[0];
            double value = 0;
            lineCount++;

            // This check verifies there are two elements, tries to parse the second value,
            // and checks that the word is not already in the dictionary
            if (fields.Length == 2 && double.TryParse(fields[1], out value) && !wordDictionary.ContainsKey(word))
            {
                wordDictionary.Add(word, value);
            }
            else
            {
                skippedLine++;
            }
        }

        Console.WriteLine(string.Format("Total Lines Read: {0}", lineCount));
        Console.WriteLine(string.Format("Lines Skipped: {0}", skippedLine));
        Console.WriteLine(string.Format("Expected Entries in Dictionary: {0}", lineCount - skippedLine));
        Console.WriteLine(string.Format("Actual Entries in Dictionary: {0}", wordDictionary.Count));

        // The using blocks dispose the reader and the stream, so no explicit Close() is needed
    }

To score the sentences you could use something like the following.

    string fileText = File.ReadAllText(COMMENTSTEXT); //COMMENTSTEXT = comments.txt
    // assumes sentences end with a period, won't take into account any other periods in sentence
    var sentences = fileText.Split('.'); 

    foreach( string sentence in sentences )
    {
        double sentenceScore = 0;

        foreach (KeyValuePair<string, double> word in wordDictionary)
        {
            sentenceScore += sentence.Split(' ').Count(w => w == word.Key) * word.Value; 
        }

        Console.WriteLine(string.Format("Sentence Score = {0}", sentenceScore));
    }
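
As a rough, self-contained sketch of the same scoring idea, the version below looks each word of a sentence up in the dictionary instead of looping over every dictionary entry. The names SentenceScorer and Score are made up for illustration, the word list is hard-coded rather than read from words.txt, and the separator list is only an assumption about what the comments contain.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class SentenceScorer
{
    // Hypothetical helper: sums the scores of all known words in one sentence.
    public static double Score(string sentence, IDictionary<string, double> scores)
    {
        // Assumed separators; extend this list to match the real comments.
        char[] separators = { ' ', '\t', ',', ';', ':', '!', '?' };
        double total = 0;

        string[] tokens = sentence.ToLowerInvariant()
            .Split(separators, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            double value;
            // Words that are not in the dictionary contribute nothing.
            if (scores.TryGetValue(token, out value))
                total += value;
        }
        return total;
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for the dictionary loaded from words.txt.
        var scores = new Dictionary<string, double>
        {
            { "await", -1 }, { "awaited", -1 }, { "award", 3 }, { "awards", 3 }
        };

        double result = SentenceScorer.Score("We await the award", scores);

        // The "0.0" format keeps one decimal place, as in the question's sample output.
        Console.WriteLine(result.ToString("0.0", CultureInfo.InvariantCulture));
    }
}
```

A repeated word is counted each time it occurs, which matches the Count-based loop above. Each word costs one dictionary lookup, so the work grows with the length of the sentence rather than with the size of the word list.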

Upvotes: 0

choz

Reputation: 17868

If you would like to use a regex approach, try this:

using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read))
using (StreamReader streamReader = new StreamReader(fileStream)) {
  String record = streamReader.ReadLine();
  foreach (String str in record.Split('\t')) {
    // Strip everything except digits and minus signs, leaving only the score
    Console.WriteLine(Regex.Replace(str, @"[^-\d]", String.Empty));
  }
}

Tested with words.txt

await -1    awaited -1  awaited -1  award 3 award 2 award 1 award 3 awards 3

Upvotes: 1
