k.mat27

Reputation: 163

Workaround for dynamically generating a structure - Matlab/Octave issues

I have many, many sets of data in .csv format that I've organized by a file name standard so I can use regular expressions for the second time ever. I have, however, run into a slight problem. My data files are titled things like, "2012001_C335_2000MHZ_P_1111.CSV". There are four years of interest, two frequencies, and four different C335-style labels to describe locations. I have a significant amount of data processing to do on each of these files, so I'd like to read them all into one giant structure and then do my processing on different parts of it. I'm writing:

for ix_id = 1:length(ids)
 for ix_year = 1:2:length(ids_years{ix_id})
  for ix_frq = 1:length(frqs)
   st = [ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '_P_1111.CSV'];
   data.(ids_frqs{ix_id}{ix_frq}).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) =...
        dlmread(st);
  end
 end
end

All ids variables are 1x4 cell arrays where each cell contains strings.

This produces the errors: "Error: a cs-list cannot be further indexed" and "Error: invalid assignment to cs-list outside multiple assignment"

I did an internet search for these errors and found a few posts with dates ranging from 2010 to 2012, such as this one and this one, where the author suggests it's a problem with Octave itself. I can do a workaround that involves defining two separate structures by removing the innermost for loop over ix_frq and replacing the lines beginning with "st" and "data" with

data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_1500MHZ_P_1111.CSV']);
data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_2000MHZ_P_1111.CSV']);

so it seems that the trouble arises when I try to make a more nested structure. I'm wondering if this is unique to Octave or the same in Matlab, and also if there's a slicker workaround than defining two separate structures since I'd like this to be as portable as possible. If you have any insight as to the meaning of the error message, I'm interested in that too. Thanks!

EDIT: Here is the full script - now generates a few dummy .csv files. Runs on Octave v. 3.8

clear all
%this program tests the creation of various structures.  The end goal is to have a structure of the format frequency.beamname.year(1) = matrix of the appropriate file
A = rand(3,2);
csvwrite('2009103_C115_1500MHZ.CSV',A)
csvwrite('2009103_C115_2000MHZ.CSV',A)
csvwrite('2010087_C115_1500MHZ.CSV',A)
csvwrite('2010087_C115_2000MHZ.CSV',A)
csvwrite('2009103_C335_1500MHZ.CSV',A)
csvwrite('2009103_C335_2000MHZ.CSV',A)
csvwrite('2010087_C335_1500MHZ.CSV',A)
csvwrite('2010087_C335_2000MHZ.CSV',A)

data = dir('*.CSV');  %imports all of the files of a directory
files = {data.name};  %cell array of filenames
nfiles = numel(files);

%find all the years
years = unique(cellfun(@(x)x{1},regexp(files,'\d{7}','match'),'UniformOutput',false));  
%find all the beam names
ids = unique(cellfun(@(x)x{1},regexp(files,'([C-I]\d{3})|([C-I]\d{1}[C-I]\d{2})','match'),'UniformOutput',false));
%find all the frequencies
frqs = unique(cellfun(@(x)x{1},regexp(files,'\d{4}MHZ','match'),'UniformOutput',false));

%now, vectorize to cover all the beams
for id_ix = 1:length(ids)
  expression_yrs = ['(\d{7})(?=_' ids{id_ix} ')'];
  listl_yrs = regexp(files,expression_yrs,'match');
  ids_years{id_ix} = unique(cellfun(@(x)x{1},listl_yrs(cellfun(@(x)~isempty(x),listl_yrs)),'UniformOutput',false));  %returns the years for data collected with both the 1500 and 2000 MHZ antennas along each of the beams
  expression_frqs = ['(?<=' ids{id_ix} '_)(\d{4}MHZ)']; 
  listfrq = regexp(files,expression_frqs,'match'); %finds every frequency that was collected for C115, C335
  ids_frqs{id_ix} = unique(cellfun(@(x)x{1},listfrq(cellfun(@(x)~isempty(x),listfrq)),'UniformOutput',false));
end

%% finally, dynamically generate a structure data.Frequency.Beam.Year
%this works
for ix_id = 1:length(ids)
  for ix_year = 1:length(ids_years{ix_id})
    data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{1} '.CSV']);
    data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{2} '.CSV']);
  end
end

%this doesn't work
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
        data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end 
  end
end

Hopefully, that helps clarify the question - I am not sure of the etiquette here with posting edits and code.

Upvotes: 1

Views: 3183

Answers (1)

carandraug

Reputation: 13081

The problem is that when you get to the for loop that is causing a problem, data already exists and is a struct array.

octave> data
data =

  8x1 struct array containing the fields:

    name
    date
    bytes
    isdir
    datenum
    statinfo

When you select a field from a struct array, you get a cs-list (comma-separated list) unless you also index which struct of the array you want. See:

octave> data.name
ans = 2009103_C115_1500MHZ.CSV
ans = 2009103_C115_2000MHZ.CSV
ans = 2009103_C335_1500MHZ.CSV
ans = 2009103_C335_2000MHZ.CSV
ans = 2010087_C115_1500MHZ.CSV
ans = 2010087_C115_2000MHZ.CSV
ans = 2010087_C335_1500MHZ.CSV
ans = 2010087_C335_2000MHZ.CSV
octave> data(1).name
ans = 2009103_C115_1500MHZ.CSV
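As a minimal sketch of the same behaviour without touching the file system, the snippet below builds a small struct array by hand (the variable `s` and its file names are made up for illustration, standing in for what `dir` returns) and shows the two usual ways of turning the cs-list into something usable:

```octave
% Hypothetical 2x1 struct array standing in for the output of dir()
s = struct('name', {'a.csv'; 'b.csv'});

% s.name on its own expands to a cs-list with one value per element.
% Wrapping it in braces collects those values into a cell array:
names = {s.name};

% Indexing one element of the array first gives a single ordinary value:
first = s(1).name;

% Number of structs in the array, i.e. how many values the cs-list holds
n = numel(s);
```

This should behave the same in Matlab, which also expands a field access on a struct array into a comma-separated list.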

So when you do:

data.(...) = dlmread (...);

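the left hand side is not what you were expecting: data is still the 8x1 struct array returned by dir, so data.(...) is a cs-list and cannot be indexed further or assigned to. I'm guessing this is accidental, since data at that point only holds the file listing, so simply create a new empty struct before the loop: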

data = struct (); % this will clear your previous data
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
        data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end 
  end
end

I would also recommend rethinking your current approach. This code looks overcomplicated to me.
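One possible simplification, offered only as a sketch: loop over the files once and pull the year, beam and frequency out of each name with regexp tokens, instead of building the ids_years and ids_frqs lookups first. The names `tmp`, `listing` and `tok`, the scratch directory, and the exact regular expression are my own assumptions based on the dummy file names in the question's script. Keeping the dir output in a variable other than data also avoids the clash described above.

```octave
% Sketch only: writes two dummy files into a scratch directory so it runs anywhere.
tmp = tempname();                 % hypothetical scratch directory
mkdir(tmp);
A = magic(3);
csvwrite(fullfile(tmp, '2009103_C115_1500MHZ.CSV'), A);
csvwrite(fullfile(tmp, '2010087_C335_2000MHZ.CSV'), A);

listing = dir(fullfile(tmp, '*.CSV'));  % not called "data", so nothing is clobbered
data = struct();                        % scalar struct to be filled in

for k = 1:numel(listing)
  fname = listing(k).name;
  % tokens: 7-digit year+day, beam id, frequency (pattern is an assumption)
  tok = regexp(fname, '^(\d{7})_([A-Z0-9]+)_(\d{4}MHZ)', 'tokens', 'once');
  if isempty(tok)
    continue;                     % skip files that do not follow the naming standard
  end
  % field names must start with a letter, hence the 'F' and 'Y' prefixes
  data.(['F' tok{3}]).(tok{2}).(['Y' tok{1}]) = dlmread(fullfile(tmp, fname));
end
```

With the two dummy files above, the matrices end up in data.F1500MHZ.C115.Y2009103 and data.F2000MHZ.C335.Y2010087. Only combinations that actually exist on disk are read, so there is no need to track which years or frequencies were collected for which beam.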

Upvotes: 1
