Reputation: 2613
I have multiple text files that contain data in this format:
File1.txt
subID imageCondition trial textItem imageFile response RT
Participant003 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 1.436971
Participant003 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 0.519301
Participant003 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg m 1.110664
Participant003 images 3 Is there a wine bottle? 03_targetAbsent-winebottle.jpg m 1.278945
Participant003 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg z 2.672123
Participant003 images 5 Is there a blender? 05_targetPresent-blender.jpg m 2.633802
Participant003 images 8 Is there a bucket? 08_targetPresent-bucket.jpg m 2.596154
Participant003 images 4 Is there a surf board? 04_targetAbsent-surfboard.jpg m 1.072850
File2.txt
subID imageCondition trial textItem imageFile response RT
Participant005 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg 0.000000
Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927
Participant005 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 3.569293
Participant005 images 4 Is there a surf board? 04_targetAbsent-surfboard.jpg 0.000000
Participant005 images 3 Is there a wine bottle? 03_targetAbsent-winebottle.jpg m 8.538699
Participant005 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 0.857319
Participant005 images 5 Is there a blender? 05_targetPresent-blender.jpg 0.000000
Participant005 images 8 Is there a bucket? 08_targetPresent-bucket.jpg z 1.967220
I want to read this data into a cell array so that I can access the individual values in it.
I have the following code to read the data, but it doesn't help, because it doesn't store the data in a way that lets me access individual values. For example, I want all the values from the 'trial' or 'response' column.
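function content = load_data(fileName)
  % Read a text file into a cell array with one string per line.
  content = {};
  fid = fopen(fileName, 'r');
  if fid < 0
    error('load_data: could not open %s', fileName);
  endif
  oneline = fgetl(fid);
  while ischar(oneline)        % fgetl returns -1 at end of file
    content{end+1} = oneline;  % store only real lines, not the -1 marker
    oneline = fgetl(fid);
  endwhile
  fclose(fid);
endfunction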
% txtFiles is a 1xN cell array of file names
for i = 1:numel(txtFiles)
  data{i} = load_data(txtFiles{i});
end
for i = 1:numel(data)
  fileLines = data{i};   % cell array of lines from one file
  for j = 1:numel(fileLines)
    currentLine = fileLines{j};
    % Here I can only fetch each line as a single string whose fields are separated by one or more spaces, which makes it hard to access the required data
  endfor
endfor
What I'm looking for is a way to read this data from a text file into a cell array or a matrix so that I can easily access the required values, but I am restricted to the traditional methods of importing data from a text file. Alternatively, I'd appreciate help with parsing the data so that I can access what is required.
Note: There are multiple text files like these. Also, it would be a great help if you could show how to access the values in individual columns, e.g. the 'response' column.
Upvotes: 0
Views: 610
Reputation: 378
This would be easy to do with something like strsplit, splitting the data on spaces, except that your textItem field has spaces in it. So I would suggest using regular expressions. Named tokens are a convenient way to organize the results when you're looking for several separate pieces at a time. I realize that if you're not familiar with regular expressions, it's a tough thing to jump into. Check out regex101.com for information and a very useful online tool for testing your regular expression. See this specific example on regex101. That said, here's my answer, which works on your data:
% Read the whole file as one character vector, then match one record per line.
text = fileread(filename);
data = regexp(text,'^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+(?<textItem>.*?\?)\s+(?<imageFile>[-\.\w]+)\s+(?<response>\w)\s+(?<RT>[\d\.]+)','names','lineanchors')
Or you could turn it into a table:
dataTable = struct2table(data)
Result looks like:
subID                 imageCondition    trial    textItem                        imageFile                                response    RT
__________________    ______________    _____    ____________________________    _____________________________________    ________    ____________
{'Participant003'}    {'images'}        {'7'}    {'Is there a refrigerator?'}    {'07_targetPresent-refrigerator.jpg'}    {'z'}       {'1.436971'}
{'Participant003'}    {'images'}        {'6'}    {'Is there an oven mitt?'  }    {'06_targetPresent-ovenmitt.jpg'    }    {'z'}       {'0.519301'}
{'Participant003'}    {'images'}        {'1'}    {'Is there a toaster?'     }    {'01_targetAbsent-toaster.jpg'      }    {'m'}       {'1.110664'}
{'Participant003'}    {'images'}        {'3'}    {'Is there a wine bottle?' }    {'03_targetAbsent-winebottle.jpg'   }    {'m'}       {'1.278945'}
{'Participant003'}    {'images'}        {'2'}    {'Is there a kettle?'      }    {'02_targetAbsent-kettle.jpg'       }    {'z'}       {'2.672123'}
{'Participant003'}    {'images'}        {'5'}    {'Is there a blender?'     }    {'05_targetPresent-blender.jpg'     }    {'m'}       {'2.633802'}
{'Participant003'}    {'images'}        {'8'}    {'Is there a bucket?'      }    {'08_targetPresent-bucket.jpg'      }    {'m'}       {'2.596154'}
{'Participant003'}    {'images'}        {'4'}    {'Is there a surf board?'  }    {'04_targetAbsent-surfboard.jpg'    }    {'m'}       {'1.072850'}
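One caveat: the pattern above requires exactly one response character, so rows with an empty response (as in File2.txt) would not match and would be skipped without any warning. Below is a minimal sketch of a variant that makes the response optional. It assumes MATLAB's regexp behavior with 'names' (a struct array with one element per match); the variable names and the inline sample text are only for illustration.

```matlab
% Sample text standing in for fileread('File2.txt') (illustration only).
sampleLines = { ...
    'subID imageCondition trial textItem imageFile response RT', ...
    'Participant005 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg  0.000000', ...
    'Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927', ...
    'Participant005 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 3.569293'};
fileText = sprintf('%s\n', sampleLines{:});

% Same pattern as before, except that the response is an optional letter
% ([a-zA-Z]?) and the whitespace after it may be empty (\s*).
% A letter class is used instead of \w? because \w would also match the
% first digit of RT when the response is missing.
pattern = ['^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+' ...
           '(?<textItem>.*?\?)\s+(?<imageFile>[-\.\w]+)\s+' ...
           '(?<response>[a-zA-Z]?)\s*(?<RT>[\d\.]+)'];
data = regexp(fileText, pattern, 'names', 'lineanchors');

% Rows without a response come back with an empty response field.
noResponse = cellfun(@isempty, {data.response});
fprintf('%d of %d trials have no response\n', nnz(noResponse), numel(data));
```

The header line is still skipped, because its third field ("trial") is not numeric.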
If you want to turn the numeric fields into numbers:
dataTable.trial = str2double(dataTable.trial);
dataTable.RT = str2double(dataTable.RT);
Which then gives:
subID                 imageCondition    trial    textItem                        imageFile                                response    RT
__________________    ______________    _____    ____________________________    _____________________________________    ________    ______
{'Participant003'}    {'images'}        7        {'Is there a refrigerator?'}    {'07_targetPresent-refrigerator.jpg'}    {'z'}       1.437
{'Participant003'}    {'images'}        6        {'Is there an oven mitt?'  }    {'06_targetPresent-ovenmitt.jpg'    }    {'z'}       0.5193
{'Participant003'}    {'images'}        1        {'Is there a toaster?'     }    {'01_targetAbsent-toaster.jpg'      }    {'m'}       1.1107
{'Participant003'}    {'images'}        3        {'Is there a wine bottle?' }    {'03_targetAbsent-winebottle.jpg'   }    {'m'}       1.2789
{'Participant003'}    {'images'}        2        {'Is there a kettle?'      }    {'02_targetAbsent-kettle.jpg'       }    {'z'}       2.6721
{'Participant003'}    {'images'}        5        {'Is there a blender?'     }    {'05_targetPresent-blender.jpg'     }    {'m'}       2.6338
{'Participant003'}    {'images'}        8        {'Is there a bucket?'      }    {'08_targetPresent-bucket.jpg'      }    {'m'}       2.5962
{'Participant003'}    {'images'}        4        {'Is there a surf board?'  }    {'04_targetAbsent-surfboard.jpg'    }    {'m'}       1.0729
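Since the question mentions several files, the same regexp call can be run in a loop and the per-file results concatenated. Below is a rough sketch, assuming MATLAB (where 'names' returns a struct array) and that every file contains at least one matching row. The demo files written to the temp folder, and the names txtFiles, contents, parts and allData, are hypothetical and only there to make the example self-contained.

```matlab
% Write two small demo files so the sketch can run on its own.
header = 'subID imageCondition trial textItem imageFile response RT';
rows1 = {'Participant003 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 1.436971'};
rows2 = {'Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927', ...
         'Participant005 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 3.569293'};
txtFiles = {fullfile(tempdir, 'demo_File1.txt'), fullfile(tempdir, 'demo_File2.txt')};
contents = {[{header}, rows1], [{header}, rows2]};
for k = 1:numel(txtFiles)
    fid = fopen(txtFiles{k}, 'w');
    fprintf(fid, '%s\n', contents{k}{:});
    fclose(fid);
end

% Pattern from the answer, split across lines for readability.
pattern = ['^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+' ...
           '(?<textItem>.*?\?)\s+(?<imageFile>[-\.\w]+)\s+' ...
           '(?<response>\w)\s+(?<RT>[\d\.]+)'];

% Parse each file separately, then join the struct arrays into one.
parts = cell(1, numel(txtFiles));
for k = 1:numel(txtFiles)
    parts{k} = regexp(fileread(txtFiles{k}), pattern, 'names', 'lineanchors');
end
allData = [parts{:}];   % one struct element per trial, across all files
```

Because subID is captured on every row, the combined array still records which participant each trial belongs to.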
You also asked how to access it. Get the third "response" from the table:
dataTable.response{3}
Or from the structure:
data(3).response
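The question also asks for every value in a column rather than a single entry. Below is a small sketch of how that could look for the structure. It uses a hand-built struct array in place of the regexp output; the three rows are copied from File1.txt, and the variable names are illustrative.

```matlab
% Stand-in for the struct array returned by regexp(..., 'names').
data = struct('trial',    {'7', '6', '1'}, ...
              'response', {'z', 'z', 'm'}, ...
              'RT',       {'1.436971', '0.519301', '1.110664'});

responses = {data.response};            % 1x3 cell array of character vectors
trials    = str2double({data.trial});   % 1x3 numeric vector
rt        = str2double({data.RT});      % 1x3 numeric vector

% Example use: mean reaction time over the trials answered with 'z'.
meanRTz = mean(rt(strcmp(responses, 'z')));
```

With the table, dataTable.response returns the whole column as a cell array, and dataTable.RT returns a numeric vector once it has been converted with str2double.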
Upvotes: 1