Kenneth

Reputation: 1364

Excel - lookup on one column, result from second column

The first three columns exist. I am trying to create a formula for the fourth (HH_ANALYSIS_FLAG).

ACCOUNT_NUMBER   HOUSEHOLD_NUMBER   ACCOUNT_ANALYSIS_FLAG   HH_ANALYSIS_FLAG
1001             1                  1                       0
1002             2                  0                       0
1003             3                  1                       0
1004             3                  0                       0
1005             3                  0                       0
1006             2                  0                       0
1007             4                  0                       0
1008             1                  1                       0

I have 50,000 accounts. Each is flagged as being under analysis or not by the ACCOUNT_ANALYSIS_FLAG column (1 or 0). Every account belongs to a household, and multiple accounts can belong to the same household. I need the HH_ANALYSIS_FLAG column to evaluate to 1 if any account in the same household is under analysis, and to 0 otherwise. So with the above data and a working formula, my spreadsheet would look like this:

ACCOUNT_NUMBER   HOUSEHOLD_NUMBER   ACCOUNT_ANALYSIS_FLAG   HH_ANALYSIS_FLAG
1001             1                  1                       1
1002             2                  0                       0
1003             3                  1                       1
1004             3                  0                       1
1005             3                  0                       1
1006             2                  0                       0
1007             4                  0                       0
1008             1                  1                       1

Upvotes: 1

Views: 1025

Answers (4)

Tauren

Reputation: 37

Kenneth! Try this one:

=IF(VLOOKUP(B2,$B$2:$C$9,2,0)=1,1,0)

This assumes your table starts at A1 (so the ACCOUNT_NUMBER header is in cell A1) and that your target column, HH_ANALYSIS_FLAG, is column D.

Hope it's helpful

Upvotes: 0

torak

Reputation: 5802

The following formula should do the trick. In fact, it will give you the total number of accounts being analysed per household.

    A        B       C                  D
1   ACC_NUM  HH_NUM  ACC_ANALYSIS_FLAG  HH_ANALYSIS_FLAG
2   1001     1       1                  =SUMIF(B$2:B$50001, B2, C$2:C$50001)
3   1002     2       0                  =SUMIF(B$2:B$50001, B3, C$2:C$50001)
4   1003     3       1                  =SUMIF(B$2:B$50001, B4, C$2:C$50001)

For each row, this selects the set of rows that share the value in the HH_NUM column (based on the row containing the formula) and sums the values in the corresponding ACC_ANALYSIS_FLAG cells. This gives you the total number of accounts under analysis for the given household. Compare the result to 0 if you only need a boolean value.

EDIT:

Apparently the performance of this isn't up to snuff. However, assuming that the rows for each household are all colocated (adjacent to one another), it should be possible to speed things up significantly by changing to something like the following.

2    1001     1       1                  =SUMIF(B2:B5,  B2, C2:C5)
3    1002     2       0                  =SUMIF(B2:B6,  B3, C2:C6)
4    1003     2       0                  =SUMIF(B2:B7,  B4, C2:C7)
5    1004     2       0                  =SUMIF(B2:B8,  B5, C2:C8)
6    1005     2       0                  =SUMIF(B3:B9,  B6, C3:C9)
7    1006     2       0                  =SUMIF(B4:B10, B7, C4:C10)
8    1007     2       0                  =SUMIF(B5:B11, B8, C5:C11)
9    1008     2       0                  =SUMIF(B6:B12, B9, C6:C12)
10   1009     2       0                  =SUMIF(B7:B13, B10, C7:C13)

This assumes that there are at most 4 accounts per household, and thus limits the range of the SUMIF to the current cell +/- 3 rows.

To avoid referencing invalid cells, the first and last rows have to be treated as special cases. If you need a single formula for all of these cells, I think it should be possible to use OFFSET in combination with MAX, MIN and ROW to generate the appropriate ranges with just a little arithmetic.
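
As a rough sketch of that single-formula idea (untested against the real workbook, and assuming the data sits in rows 2 to 50001 with at most 4 adjacent accounts per household), something like this could go in D2 and be filled down:

```
=SUMIF(OFFSET(B$2, MAX(ROW()-3,2)-2, 0, MIN(ROW()+3,50001)-MAX(ROW()-3,2)+1, 1),
       B2,
       OFFSET(C$2, MAX(ROW()-3,2)-2, 0, MIN(ROW()+3,50001)-MAX(ROW()-3,2)+1, 1))
```

MAX(ROW()-3,2) clamps the first row of the window to row 2, and MIN(ROW()+3,50001) clamps the last row to row 50001, so in row 2 the ranges resolve to B2:B5 and C2:C5, and in row 6 to B3:B9 and C3:C9. Keep in mind that OFFSET is a volatile function, so it recalculates on every change to the workbook; whether this actually ends up faster than the full-range SUMIF would need to be measured on the real data.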

Upvotes: 4

Tom

Reputation: 1

Presuming your HOUSEHOLD_NUMBER column is column B and your ACCOUNT_ANALYSIS_FLAG column is column C, in row 2:

=IF(SUMIF(B:B,B2,C:C)>0,1,0)

should do it.

Upvotes: 0

Justin

Reputation: 6711

Insert another column D (you can hide it later), which is equal to the household number if it is being analyzed, and zero if it is not. The formula for D2 can be =B2*C2. Fill column D with this formula.

Then, for your HH_ANALYSIS_FLAG column (now column E), check whether any value in column D matches the household in column B. The formula for E2 would be =IF(COUNTIF(D:D,"="&B2)>0,1,0).

I'm not sure whether this approach is fast enough for the 50,000 accounts, though.

          A                B                    C                    D                 E
1   ACCOUNT_NUMBER  HOUSEHOLD_NUMBER  ACCOUNT_ANALYSIS_FLAG  HH_UNDER_ANALYSIS HH_ANALYSIS_FLAG
2   1001            1                 1                      1 (=B2*C2)        =IF(COUNTIF(D:D,"="&B2)>0,1,0)  
3   1002            2                 0                      0 (=B3*C3)        =IF(COUNTIF(D:D,"="&B3)>0,1,0)
4   1003            3                 1                      3 (=B4*C4)        =IF(COUNTIF(D:D,"="&B4)>0,1,0)        

Upvotes: 0
