Reputation: 443
I am reading data from a text file. Importing it works fine. The data is imported as: UserID|SportID|Rating
There are many users, and each user can give any sport any rating, for example:
User|SportID|Rating
1 2 10
1 3 5
2 1 10
2 3 2
I am trying to create a new matrix like the one below:
UserID Sport1 Sport2 Sport3
1 (null) 10 5
2 10 (null) 2
I tried to do this with a "for" loop; however, there are almost 2000 users and 1000 sports, and almost 100000 rows of data. How can I do this?
Upvotes: 3
Views: 134
Reputation: 1860
I assume you have already defined Null as a number, for simplicity:
Null = -1; % or any other value which could not be a rating.
Considering:
nSports = 1000; % Number of sports
nUsers = 2000; % Number of users
Pre-allocate the result:
Rating_Mat = ones(nUsers, nSports) * Null; % Pre-allocation
Then use sub2ind (similar to this answer):
Rating_Mat(sub2ind([nUsers nSports], User, SportID)) = Rating;
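As a minimal sketch of the sub2ind approach, here it is applied to the four sample rows from the question. The column vectors User, SportID and Rating are assumed to have been split out of the imported data already, the sizes are taken from the data rather than hard-coded, and Null = -1 is only a placeholder choice.

```matlab
% Sample data from the question, one column vector per field
User    = [1; 1; 2; 2];
SportID = [2; 3; 1; 3];
Rating  = [10; 5; 10; 2];

Null    = -1;              % placeholder for "no rating" (assumption)
nUsers  = max(User);       % 2 for this sample
nSports = max(SportID);    % 3 for this sample

% Start with every cell marked as missing
Rating_Mat = Null * ones(nUsers, nSports);

% Turn each (row, column) pair into a linear index and assign in one step
idx = sub2ind([nUsers nSports], User, SportID);
Rating_Mat(idx) = Rating;

% Rating_Mat is now [-1 10 5; 10 -1 2]
```

No loop is involved, so the cost grows with the number of ratings rather than with nUsers times nSports.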
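Or accumarray:
Rating_Mat = accumarray([User, SportID], Rating);
assuming that User and SportID are Nx1 column vectors. Note that, called this way, accumarray fills entries that receive no data with 0 rather than Null, and sums the ratings of duplicate (User, SportID) pairs; its optional size, function and fill-value arguments change that behaviour.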
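A sketch of accumarray with its optional arguments, again on the question's sample rows. Using NaN as the fill value and @max to resolve repeated pairs are illustrative choices, not something the question specifies.

```matlab
% Sample data from the question, one column vector per field
User    = [1; 1; 2; 2];
SportID = [2; 3; 1; 3];
Rating  = [10; 5; 10; 2];

% Output size: one row per user, one column per sport
sz = [max(User), max(SportID)];

% Fourth argument: how to combine ratings that share a (User, SportID) pair
% Fifth argument: value for cells that receive no rating at all
Rating_Mat = accumarray([User, SportID], Rating, sz, @max, NaN);

% Rating_Mat is now [NaN 10 5; 10 NaN 2]
```

Passing the size explicitly also keeps the matrix shape stable when the highest-numbered user or sport happens to have no ratings in the file.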
Hope it helps.
Upvotes: 1
Reputation: 1241
To do this fast, you can use a sparse matrix with one dimension UserID and the other Sports. The sparse matrix will behave for most things like a normal matrix. Construct it like so:
out = sparse(User, SportID, Rating)
where User, SportID and Rating are the vectors corresponding to the columns of your text file.
Note 1: for duplicate (User, SportID) pairs, the Rating values will be summed.
Note 2: empty entries, written as (null) in the question, are not stored in sparse matrices; only the non-zero ones are (that is the main point of sparse matrices).
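A short sketch of how the sparse result can be used, based on the question's sample rows. Converting back to a full matrix with NaN for missing cells is an optional extra step, and it assumes that no real rating is equal to 0.

```matlab
% Sample data from the question, one column vector per field
User    = [1; 1; 2; 2];
SportID = [2; 3; 1; 3];
Rating  = [10; 5; 10; 2];

out = sparse(User, SportID, Rating);   % 2-by-3 sparse matrix

nStored = nnz(out);        % number of stored (non-zero) ratings
r12     = full(out(1, 2)); % rating of user 1 for sport 2

% Optional: dense copy with NaN where no rating exists
% (assumes a rating is never 0, since zeros are not stored)
rated = full(out ~= 0);
dense = full(out);
dense(~rated) = NaN;

% dense is now [NaN 10 5; 10 NaN 2]
```

At 2000 users by 1000 sports the dense copy is small enough to hold in memory; the sparse form mainly pays off when most cells are empty, as here with roughly 100000 ratings out of 2000000 cells.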
Upvotes: 2
Reputation: 8759
You can do the following:
% Test Input
inputVar = [1 2 10; 1 3 5; 2 1 10; 2 3 2];
% Determine number of users, and sports to create the new table
numSports = max(inputVar(1:end,2));
numUsers = max(inputVar(1:end,1));
newTable = NaN(numUsers, numSports);
% Iterate for each row of the new table (# of users)
for ii = 1:numUsers
    % Determine where the user rated from input mat, which sport he/she rated, and the rating
    userRating = find(inputVar(1:end,1) == ii);
    sportIndex = inputVar(userRating, 2)';
    sportRating = inputVar(userRating, 3)';
    newTable(ii, sportIndex) = sportRating; % Create the new table based on the ratings.
end
newTable
Which produced the following:
newTable =
NaN 10 5
10 NaN 2
The loop only has to run once per user in your input table.
Upvotes: 1