SLearner

Reputation: 1439

Unable to store values in a nested Dictionary

I am trying to store movie ratings by users in a Dictionary. The file from which the data is acquired is of the form

UserID | MovieID | Rating | Timestamp

The values are tab-separated.

        //Take the first 100 lines from the file and store each line as a array element of text 
        string[] text = System.IO.File.ReadLines(@File path).Take(100).ToArray();

        //extDic[username] - [moviename][rating] is the structure

        Dictionary<string,Dictionary<string,double>> extDic=new Dictionary<string,Dictionary<string,double>>();
        Dictionary<string, double> movie=new Dictionary<string,double>();
        foreach(string s in text)
        {
            int rating;
            string username=s.Split('\t')[0];
            string moviename=s.Split('\t')[1];
            Int32.TryParse(s.Split('\t')[2], out rating);
            movie.Add(moviename,rating);
            if (extDic.ContainsKey(username))
            {   
                //Error line
                extDic[username].Add(moviename, rating);
            }
            else
            {
                extDic.Add(username, movie);
            }
            movie.Clear();
        }

I get the error "An item with the same key has already been added" on the line marked as the error line. I understand what the error means and have tried to avoid it with the ContainsKey check, but that doesn't solve it.

Also, is there any significance to calling movie.Clear()?

Upvotes: 0

Views: 159

Answers (3)

Lucian

Reputation: 4001

The problem is most likely caused by using the single variable movie as the value for every entry in the extDic dictionary. movie is just a reference to one dictionary object, so every entry in extDic points to that same object, and calling movie.Clear() empties the ratings stored for all users in extDic.
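A minimal sketch of that aliasing, assuming made-up user and movie names and written as top-level statements (which needs C# 9 or later); it is an illustration only, not the asker's actual data:

```csharp
using System;
using System.Collections.Generic;

var extDic = new Dictionary<string, Dictionary<string, double>>();
var movie = new Dictionary<string, double>();

// Store the same dictionary object under two different (hypothetical) users.
movie.Add("movieA", 4);
extDic.Add("user1", movie);
extDic.Add("user2", movie);

// Both entries refer to one object, so they are reference-equal.
bool sameObject = object.ReferenceEquals(extDic["user1"], extDic["user2"]);
Console.WriteLine(sameObject);

// Clearing through the local variable empties what both users hold.
movie.Clear();
int countAfterClear = extDic["user1"].Count;
Console.WriteLine(countAfterClear);

// Adding through the local variable also adds to extDic["user1"],
// so a following extDic["user1"].Add with the same key throws.
movie.Add("movieB", 5);
bool threw = false;
try
{
    extDic["user1"].Add("movieB", 5);
}
catch (ArgumentException)
{
    threw = true;
}
Console.WriteLine(threw);
```

The last step mirrors the loop in the question: movie.Add runs first, and the Add on the error line then hits the key that was just inserted into the very same dictionary.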

You could remove the variable movie entirely and create a fresh new Dictionary<string, double>() for each new user:

string[] text = System.IO.File.ReadLines(@File path).Take(100).ToArray();

//extDic[username] - [moviename][rating] is the structure

Dictionary<string,Dictionary<string,double>> extDic=new Dictionary<string,Dictionary<string,double>>();   
foreach (string s in text)
{
    // split only once
    string[] splitted = s.Split('\t');

    // UPDATE: skip the current line if it has fewer than the three fields we need
    // (the file has four columns: UserID, MovieID, Rating, Timestamp)
    if (splitted.Length < 3)
    {
        continue;
    }

    string username = splitted[0];
    string moviename = splitted[1];

    // UPDATE: skip the current line if the user name or movie name is not valid
    if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(moviename))
    {
        continue;
    }

    // skip the current line if the rating is not a valid integer
    int rating;
    if (!Int32.TryParse(splitted[2], out rating))
    {
        continue;
    }

    if (!extDic.ContainsKey(username))
    {
        // create a new Dictionary for every new user
        extDic.Add(username, new Dictionary<string, double>());
    }

    // at this point we are sure to have all the keys set up
    // let's assign the movie rating
    extDic[username][moviename] = rating;
}

Upvotes: 1

Rune FS

Reputation: 21742

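Your problem is that you are adding the same dictionary instance to all users. Because movie and extDic[username] are then the same object, movie.Add has already inserted the key by the time the error line tries to add it again, so you will see this exception as soon as a user appears a second time.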

int rating = 0;
var result = from line in text
             let tokens = line.Split('\t')
             // rating is assigned by TryParse before the group clause reads it for the same line
             where tokens.Length >= 3 && Int32.TryParse(tokens[2], out rating)
             let username = tokens[0]
             let moviename = tokens[1]
             group new { moviename, rating } by username;

The above code will give you a structure that, from a tree perspective, is similar to your own. If you need the lookup capability, you can simply call .ToDictionary:

var extDic = result.ToDictionary(x => x.Key, x => x.ToDictionary(y => y.moviename, y => (double)y.rating));

The reason I rewrote it in LINQ is that it's a lot harder to make those kinds of mistakes with something that's side-effect free like LINQ.
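As a rough sketch of the same grouping idea in method syntax, the following parses each line inside a Select so that no variable is shared between clauses. The sample lines are invented for illustration, and the top-level-statement form assumes C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample lines in the UserID \t MovieID \t Rating \t Timestamp layout.
string[] text =
{
    "u1\tm1\t4\t1000",
    "u1\tm2\t5\t1001",
    "u2\tm1\t3\t1002",
    "broken line",
};

var extDic = text
    .Select(line => line.Split('\t'))
    // Drop lines that do not have at least the three fields we read.
    .Where(tokens => tokens.Length >= 3)
    .Select(tokens =>
    {
        // Parse here so the result travels with the row itself.
        bool ok = int.TryParse(tokens[2], out int value);
        return new { User = tokens[0], Movie = tokens[1], Ok = ok, Rating = (double)value };
    })
    // Drop lines whose rating is not a valid integer.
    .Where(row => row.Ok)
    .GroupBy(row => row.User)
    .ToDictionary(
        g => g.Key,
        g => g.ToDictionary(row => row.Movie, row => row.Rating));

Console.WriteLine(extDic.Count);
Console.WriteLine(extDic["u1"]["m2"]);
```

Note that the inner ToDictionary still throws if the same user rates the same movie twice, so real data with such duplicates would need an explicit choice about which rating to keep.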

Upvotes: 1

Paul Draper

Reputation: 83253

There must be duplicates of that user and movie.

To fix the error, you can use this for your "error line":

extDic[username][moviename] = rating;

Though there may be other problems afoot.
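To illustrate why the indexer avoids the exception, here is a small sketch contrasting Add with indexer assignment on a Dictionary<string, double>. The movie names are placeholders, and the top-level statements assume C# 9 or later:

```csharp
using System;
using System.Collections.Generic;

var ratings = new Dictionary<string, double>();
ratings.Add("movieA", 3);

// Add refuses to overwrite an existing key and throws ArgumentException.
bool addThrew = false;
try
{
    ratings.Add("movieA", 5);
}
catch (ArgumentException)
{
    addThrew = true;
}

// The indexer inserts the key if it is missing and overwrites it otherwise.
ratings["movieA"] = 5;
ratings["movieB"] = 2;

Console.WriteLine(addThrew);
Console.WriteLine(ratings["movieA"]);
Console.WriteLine(ratings.Count);
```

The trade-off is that the indexer silently keeps only the last rating for a repeated key, which may hide duplicate lines in the input.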

Upvotes: 3
