Reputation: 788
I wish to use the makeValidName
function on the following data:
Id Val Random Desc
a 1.1 0.036835624 Bread Cheese
b 2.2 0.020442492 Fish Bread
c -3.3 0.020050676 Cheese Fish
d #N/A 0.017619332 Bread Cheese
e -5.4 0.014973153 Fish Bread
f 6.6 0.014648887 Cheese Fish
g -7.6 0.014071844 Bread Cheese
h 8 0.014013118 Fish Bread
However when I import the table (read from xlsx using readtable) it looks like this:
inputData =
Id Val Random Desc
____ ____________________ ________ ______________
'a ' '1.1' 0.036836 'Bread Cheese'
'b' '2.2' 0.020442 'Fish Bread'
'c' '-3.3' 0.020051 'Cheese Fish'
'd' 'ActiveX VT_ERROR: ' 0.017619 'Bread Cheese'
'e' '-5.4' 0.014973 'Fish Bread'
'f' '6.6' 0.014649 'Cheese Fish'
'g' '-7.6' 0.014072 'Bread Cheese'
'h' '8' 0.014013 'Fish Bread'
How do I prevent it from turning the entries in Val
from numbers to strings? This makes it impossible to use makeValidName
. I need to apply makeValidName
across all rows and columns as the tables are very large and it is not feasible to name the appropriate columns individually. What would then be the most elegant way of achieving this?
Current code:
varnames = inputData.Properties.VariableNames;
for ii = 1:length(varnames)
inputData.(varnames{ii})= matlab.lang.makeValidName(inputData.(varnames{ii}));
end
Produces the error:
Error using matlab.lang.makeValidName (line 72) First input must be string or vector cell array of strings.
and produces undesirable results in columns such as Val
:
inputData =
Id Val Random Desc
___ __________________ ________ _____________
'a' 'x1_1' 0.036836 'BreadCheese'
'b' 'x2_2' 0.020442 'FishBread'
'c' 'x_3_3' 0.020051 'CheeseFish'
'd' 'ActiveXVT_ERROR_' 0.017619 'BreadCheese'
'e' 'x_5_4' 0.014973 'FishBread'
'f' 'x6_6' 0.014649 'CheeseFish'
'g' 'x_7_6' 0.014072 'BreadCheese'
'h' 'x8' 0.014013 'FishBread'
Upvotes: 0
Views: 261
Reputation: 12214
Since it seems like utilizing Excel in the middle is creating more of a headache. I would recommend utilizing basic
mode which will alleviate some of the parsing errors.
From the documentation:
basic
mode is the default for systems without Excel for Windows. Inbasic
mode,readtable
:
- Reads XLS, XLSX, XLSM, XLTX, and XLTM files only.
- Does not support the
'Range'
name-value pair argument when reading XLS files.- Imports all dates as Excel serial date numbers. Excel serial date numbers use a different reference date than MATLAB® date numbers.
This allows us to utilize the TreatAsEmpty
name-value pair argument since it will correctly parse the numeric columns.
inputData = readtable('test.xlsx', 'Basic', 1, 'TreatAsEmpty', '#N/A');
Which returns for the sample case:
inputData =
Id Val Random Desc
___ ____ ________ ______________
'a' 1.1 0.036836 'Bread Cheese'
'b' 2.2 0.020442 'Fish Bread'
'c' -3.3 0.020051 'Cheese Fish'
'd' NaN 0.017619 'Bread Cheese'
'e' -5.4 0.014973 'Fish Bread'
'f' 6.6 0.014649 'Cheese Fish'
'g' -7.6 0.014072 'Bread Cheese'
'h' 8 0.014013 'Fish Bread'
Theoretically this should mean that the numeric data columns are double
arrays and the strings remain in cell
arrays. Therefore, to use matlab.lang.makeValidName
you can test each column with iscell
to see if it's a cell array:
varnames = inputData.Properties.VariableNames;
for ii = 1:length(varnames)
if iscell(inputData.(varnames{ii}))
% If they're strings they're in a cell array
inputData.(varnames{ii})= matlab.lang.makeValidName(inputData.(varnames{ii}));
end
end
Which returns:
inputData =
Id Val Random Desc
___ ____ ________ _____________
'a' 1.1 0.036836 'BreadCheese'
'b' 2.2 0.020442 'FishBread'
'c' -3.3 0.020051 'CheeseFish'
'd' NaN 0.017619 'BreadCheese'
'e' -5.4 0.014973 'FishBread'
'f' 6.6 0.014649 'CheeseFish'
'g' -7.6 0.014072 'BreadCheese'
'h' 8 0.014013 'FishBread'
Upvotes: 1