Tim Whitfield
Tim Whitfield

Reputation: 3

Counting text strings within a range created by matching

I am combining different sections of research data (different spreadsheets) into one tab on a single excel.

The criteria used for matching the data together is the research participant's ID (a 4-character string, e.g. RXGJ). Unfortunately, each section of data have different structure. The main data are organised in long format, so there are 3 rows per participant, corresponding to 3 visits. I have successfully matched these to data that were only collected at a single visit (i.e. biological sex) from other spreadsheets.

This is my issue: one of the tabs has 18 rows per participant, recording other life events (coded using standardised text strings). I want to count how many of each type of event occurred based on the research participant's ID. I think I need to create a range of cells/array based on the research ID (common to both the main data and each of the 18 rows of life events on the other tab), then use COUNTIFS to count the occurrence of each text string. The issue I am having is that MATCH stops at the first matching participant ID, whereas I need to cover all cells where ID MATCHES and then count... any ideas?

Many thanks, Tim

Upvotes: 0

Views: 107

Answers (2)

pgSystemTester
pgSystemTester

Reputation: 9897

Why won't CountiF's work? You're looking to count the number of events based on two sets of conditions, correct? This should work.

Alternatively, you could use SumProduct to get some results that were specific.

In the below screenshot I used: =SUMPRODUCT((A1:A10=E1)*(LEFT(B1:B10,1)="T")) to get all values with ID_200 and column b has a value that starts with "T". You should be able to modify this as needed.

enter image description here

Upvotes: 0

Jenn
Jenn

Reputation: 647

This can be done pretty easily with Excel's Get and Transform Data function (aka Power Query).

Format your data as tables if it's not already in table format. To do this, highlight the range of cells and then Ctrl+T. Then a new tab on the ribbon will open and you'll see a box where you can name your table. In this example, I have three tables named 1) t_UniqueParticipants 2) t_VisitsByParticipant 3) t_LifeEventsByParticipant

enter image description here

It sounds like one of your tables has only one row for each participants. You should use that as your basis.

Then select a cell in the first table and then from the ribbon, go to Data > Get and Transform Data > From Table. The query editor will then open in a new window.

The only thing you need to do here so far is go to Close & Load > Close & Load To. Then the Query Editor window will close and a dialogue box will open.

enter image description here

In the Import Data dialogue box, only create a connection, Load to the Data Model, Okay.

enter image description here

Repeat this for each table you need to match.

Then from go to Data > Data Tools > Create Relationships > New. Then select the unique participants table and participant ID (goes on the bottom), and one of the other tables and the participant ID field and click okay.

enter image description here

Repeat this for the other table(s). Remember, the unique participant table goes on the bottom.

enter image description here

Once you are done, it looks like this. Click Close.

enter image description here

Now you can insert a pivot table to aggregate the data in all of your tables. Insert > Pivot Table > From this workbook's data model. Choose where you want your pivot table to go > Okay.

enter image description here

Now you can design your pivot table in whatever way you like. Just be sure to pull fields from the tables that are connected to the data model, which have a little data model icon next to them. This example shows the participant ID from the Unique Participant table with the Life Event field from the Life Event table.

enter image description here

Upvotes: 1

Related Questions