yossico

Reputation: 3523

Matlab: Load Users-Items-Ratings CSV into Ratings matrix

I have the following CSV format for users rating items:

A1YS,8F20,3.0
A3TS,8320,2.0
A3BU,1905,5.0
A3BU,3574,4.0
A14X,185A,1.0

The columns are UserID,ItemID,Rating

I want to load it into a MATLAB matrix with one row per user and one column per item, where each cell holds the rating (an unknown rating is zero), for example:

      8F20,  8320,  1905,  3574,  185A
A1YS    3 ,    0 ,    0 ,    0 ,    0 
A3TS    0 ,    2 ,    0 ,    0 ,    0 
A3BU    0 ,    0 ,    5 ,    4 ,    0 
A14X    0 ,    0 ,    0 ,    0 ,    1 

The row and column labels are not essential; a plain numeric matrix is fine as well:

3 ,    0 ,    0 ,    0 ,    0 
0 ,    2 ,    0 ,    0 ,    0 
0 ,    0 ,    5 ,    4 ,    0 
0 ,    0 ,    0 ,    0 ,    1 

I'm quite new to Matlab and tried some variations of:

https://stackoverflow.com/a/13775907/1726419 and https://stackoverflow.com/a/19613301/1726419

without much success. I'd be very thankful for any assistance.

EDIT: What I've got so far is:

fid = fopen('ratings_sample.csv');
out = textscan(fid,'%s%s%f','delimiter',',');
fclose(fid);

c1 = out{1};
c2 = out{2};
c3 = out{3};

My problem is that I need to remove the duplicates from both c1 and c2 and then fill in the cells of the matrix correctly. I also don't know whether this is the proper way to load the file.

Upvotes: 1

Views: 109

Answers (1)

EBH

Reputation: 10450

If each (UserID, ItemID) pair appears only once, you can use crosstab (from the Statistics and Machine Learning Toolbox):

% Order the categories by first appearance so rows and columns match the labels used below
UserID = categorical(c1,unique(c1,'stable'));
ItemID = categorical(c2,unique(c2,'stable'));
Rating = crosstab(UserID,ItemID);
% Write each rating into the cell of its own (user,item) pair
Rating(sub2ind(size(Rating),double(UserID),double(ItemID))) = c3;

and get:

Rating =
     3     0     0     0     0
     0     2     0     0     0
     0     0     5     4     0
     0     0     0     0     1

If you want to organize it in a table, you first need to convert the item IDs to valid variable names (ones that start with a letter):

Items = cellfun(@(s) ['Item_' s],c2,'UniformOutput',false);

and then you can use a table to hold all the data:

Tbl = array2table(Rating,...
                  'RowNames',unique(c1,'stable'),...
                  'VariableNames',unique(Items,'stable'))

the result:

Tbl =
  4×5 table
            Item_8F20    Item_8320    Item_1905    Item_3574    Item_185A
            _________    _________    _________    _________    _________
    A1YS    3            0            0            0            0        
    A3TS    0            2            0            0            0        
    A3BU    0            0            5            4            0        
    A14X    0            0            0            0            1        
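
If the toolbox is not available, or a (UserID, ItemID) pair may occur more than once, a minimal sketch with base MATLAB functions might look like the following. It assumes c1, c2 and c3 hold the user IDs, item IDs and ratings as loaded in the question (rebuilt inline here so the snippet runs on its own); the names users, items, u, v and RatingSparse are only illustrative.

```matlab
% Sample data standing in for the textscan output (c1, c2, c3 in the question).
c1 = {'A1YS'; 'A3TS'; 'A3BU'; 'A3BU'; 'A14X'};   % user IDs
c2 = {'8F20'; '8320'; '1905'; '3574'; '185A'};   % item IDs
c3 = [3; 2; 5; 4; 1];                            % ratings

% unique removes the duplicates; its third output maps every CSV row to
% the index of its user (or item). 'stable' keeps first-appearance order.
[users, ~, u] = unique(c1, 'stable');
[items, ~, v] = unique(c2, 'stable');

% Put each rating at (user index, item index); pairs that are not in the
% file stay 0. A pair that occurs more than once is summed by default, so
% pass a function handle such as @max or @mean as a 4th argument if needed.
Rating = accumarray([u(:) v(:)], c3, [numel(users) numel(items)]);

% For large, mostly empty rating data a sparse matrix saves memory.
RatingSparse = sparse(u, v, c3, numel(users), numel(items));
```

Here users and items give the row and column labels of Rating, so they can be passed to array2table in the same way as above (after prefixing the item IDs with a letter).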

Upvotes: 2
