Reputation: 129
I'm trying to figure out how to extract rows from a table if a certain variable of a row has a certain value. I know how to do this in R but can't figure it out in Matlab. For example, let's say this is my table:
    Var1    Var2      Var3
    ____    ____    _________
    1.0     2.0     'class 1'
    1.1     2.1     'class 2'
    1.2     2.2     'class 3'
    1.3     2.3     'class 1'
I'm trying to figure out how to get all the rows where Var3 has the value "class 1". Concretely, I want this:
    Var1    Var2      Var3
    ____    ____    _________
    1.0     2.0     'class 1'
    1.3     2.3     'class 1'
So far, I've tried using a keyword argument, outlined in this post, as well as using matlab rows to try and sort everything. Neither has worked.
Let's say T is my table. First, I tried
T(T.Var3 == 'class 1',:)
but got the error:
Undefined operator '==' for input arguments of type 'cell'.
Then, I decided to get a little creative, and saw you could create row names in the Matlab documentation. So I did this:
A = T{:,{1:2}};
B = T{:,{3}};
B = table2array(B);
A.Properties.RowNames = B;
but I got the error:
Duplicate row name: 'class 1'.
Am I doing something wrong here? Is there an easy way to do this in Matlab?
Any help is appreciated. Thanks.
Upvotes: 1
Views: 2888
Reputation: 12214
You can use findgroups to group your data.
For example:
a = [1.0; 1.1; 1.2; 1.3];
b = [2.0; 2.1; 2.2; 2.3];
c = {'class 1'; 'class 2'; 'class 3'; 'class 1'};
T = table(a, b, c);
[groupidx, group] = findgroups(T.c);
T_class1 = T(groupidx==1, :)
Which returns:
T_class1 =
  2×3 table
     a      b         c
    ___    ___    _________
    1      2      'class 1'
    1.3    2.3    'class 1'
findgroups returns the group index of each row, along with an optional second output listing the unique groups. In my example I'm assuming that 'class 1' is the first group, but you can make the explicit comparison with strcmp for a more robust solution.
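A minimal sketch of that more robust variant, assuming the same example table T as above: instead of hard-coding group 1, look up the position of 'class 1' in the second output of findgroups with strcmp. The variable name target is only illustrative.

```matlab
% Rebuild the example table so the snippet runs on its own
a = [1.0; 1.1; 1.2; 1.3];
b = [2.0; 2.1; 2.2; 2.3];
c = {'class 1'; 'class 2'; 'class 3'; 'class 1'};
T = table(a, b, c);

% groupidx: group number of each row, group: the unique labels
[groupidx, group] = findgroups(T.c);

% Find which group number corresponds to 'class 1' (illustrative name)
target = find(strcmp(group, 'class 1'));

% Keep only the rows that belong to that group
T_class1 = T(groupidx == target, :)
```

If the label does not occur in the table, target comes back empty, so it may be worth checking isempty(target) before indexing.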
Speaking of strcmp, you can perform a similar indexing operation if you're looking for a specific string.
For example, you can do:
T_class1 = T(strcmp(T.c, 'class 1'), :)
Which also returns:
T_class1 =
  2×3 table
     a      b         c
    ___    ___    _________
    1      2      'class 1'
    1.3    2.3    'class 1'
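If you need rows matching any of several strings, the same logical-indexing pattern should work with ismember in place of strcmp. This is a sketch on the example table, not something from the question; the names wanted and T_subset are illustrative.

```matlab
% Rebuild the example table so the snippet runs on its own
a = [1.0; 1.1; 1.2; 1.3];
b = [2.0; 2.1; 2.2; 2.3];
c = {'class 1'; 'class 2'; 'class 3'; 'class 1'};
T = table(a, b, c);

% Labels to keep (illustrative choice)
wanted = {'class 1', 'class 3'};

% ismember returns a logical vector with one entry per row of T.c
keep = ismember(T.c, wanted);

% Logical indexing on rows, all columns
T_subset = T(keep, :)
```

With a single label, ismember(T.c, 'class 1') selects the same rows as the strcmp call above.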
The advantage of findgroups is that it fits into a splitapply workflow, allowing you to group and perform operations on your table's data.
For example, we can find the mean of our a data by class in a few lines:
[groupidx, group] = findgroups(T.c);
mean_a = splitapply(@mean, T.a, groupidx);
outT = table(group, mean_a)
Which gives us:
outT =
  3×2 table
      group      mean_a
    _________    ______
    'class 1'    1.15
    'class 2'    1.1
    'class 3'    1.2
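The same pattern extends to other per-group statistics. As a rough sketch on the example table, here is a row count and the maximum of b per class; the names n_rows, max_b and summaryT are illustrative.

```matlab
% Rebuild the example table so the snippet runs on its own
a = [1.0; 1.1; 1.2; 1.3];
b = [2.0; 2.1; 2.2; 2.3];
c = {'class 1'; 'class 2'; 'class 3'; 'class 1'};
T = table(a, b, c);

[groupidx, group] = findgroups(T.c);

% Number of rows in each group
n_rows = splitapply(@numel, T.a, groupidx);

% Largest value of b in each group
max_b = splitapply(@max, T.b, groupidx);

% One row per class, in the order returned by findgroups
summaryT = table(group, n_rows, max_b)
```

Each function passed to splitapply here returns a scalar per group, so the results line up row for row with group.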
Upvotes: 6