Reputation: 55
I currently trying to clean up a large dataset using Excel VBA. The dataset structure looks like this.
However, I would like to make it look like this instead, whereby if the cells in columns A:D all contain the same value, transpose the cells in column E. (And remove the duplicated cells from A:D)
Here is the code I did
Dim ws As Worksheet: Set ws = Sheets("test")
lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
Dim j As Integer
j = 6
For i = 2 To lastrow
If (Range("A" & i).Value = Range("A" & i + 1).Value) And (Range("B" & i).Value = Range("B" & i + 1).Value) And (Range("C" & i).Value = Range("C" & i + 1).Value) Then
Cells(i, j).Value = Cells(i + 1, 5).Value
j = j + 1
End If
'Reset J back to 6 if columns A to D does not match previous
If (Range("A" & i).Value <> Range("A" & i + 1).Value) Or (Range("B" & i).Value <> Range("B" & i + 1).Value) Or (Range("C" & i).Value <> Range("C" & i + 1).Value) Then
j = 6
End If
Next i
How can this be done?
Upvotes: 0
Views: 1483
Reputation: 9932
This will work leveraging the newer function and will by dynamic, no vba required.
=LET(alldata,FILTER(A:E,A:A<>""), zSplitter,"?", zColumnCount,4,
allNames,HSTACK(CHOOSECOLS(alldata,SEQUENCE(1,zColumnCount))),
zNames,UNIQUE(allNames),
keysFiltered, BYROW(zNames,LAMBDA(eRow,TEXTJOIN(zSplitter,TRUE,eRow))),
keysFull,BYROW(allNames,LAMBDA(eRow,TEXTJOIN(zSplitter,TRUE,eRow))),
splitNums,BYROW(keysFiltered,LAMBDA(eRow,TEXTJOIN(zSplitter,TRUE,FILTER(CHOOSECOLS(alldata,zColumnCount+1),keysFull=eRow)))),
zElementCount,BYROW(splitNums,LAMBDA(eRow, LEN(eRow)-LEN(SUBSTITUTE(eRow,zSplitter,""))+1)), zMaxCol,MAX(zElementCount),
zSideArray,MAKEARRAY(ROWS(zNames),zMaxCol,LAMBDA(r,c,LET(tempRng,INDEX(splitNums,r,1),tempSplitRng,TEXTSPLIT(tempRng,zSplitter),IF(c>CHOOSEROWS(zElementCount,r),"",CHOOSECOLS(tempSplitRng,c))))),
HSTACK(zNames,zSideArray))
Agree with the Tim Williams this is tricky. I got sort of close to a solution without using VBA in this worksheet (requires spill range enabled). I didn't get a dynamic formula to spill down for the numeric values, but you could make a macro to drag it or something.
You would need the below formula in cell i1
=UNIQUE(FILTER(A:D,NOT(ISBLANK((A:A)))))
The following formula would be in M1
, and dragged down to match the respective columns to the immediate left. You could setup a macro that actually did this for you on a change event. There's probably a way to make this dynamic with an array formula, but I couldn't assemble it in time I tinkered with it.
=TRANSPOSE(FILTER(E:E,(NOT(ISBLANK(E:E))*(A:A&B:B&C:C&D:D=I1&J1&K1&L1))))
Again if you don't have excel spill range capabilities, this won't work. To view with spill range, checkout the excel file via a web browser so it looks like the below image. The gray cells contain the respective formulas.
Upvotes: 1
Reputation: 60389
You can do this pretty easily using Power Query
For the example, I added some rows where the four columns didn't match
To use Power Query
Data => Get&Transform => from Table/Range
Home => Advanced Editor
Applied Steps
to understand the algorithmM Code
let
Source = Excel.CurrentWorkbook(){[Name="Table17"]}[Content],
//set type for all columns as Text
#"Changed Type" = Table.TransformColumnTypes(Source,List.Transform(Table.ColumnNames(Source), each {_, Text.Type})),
//group by first four columns, then aggregate the 5th column semicolon separated
#"Grouped Rows" = Table.Group(#"Changed Type", {"Column1", "Column2", "Column3", "Column4"}, {
{"ColE", each Text.Combine([Column5],";"), Text.Type}
}),
//split the aggregated text into new columns
//may need to edit this step depending on maximum number in the group
#"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "ColE",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"ColE.1", "ColE.2", "ColE.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
{"ColE.1", Int64.Type}, {"ColE.2", Int64.Type}, {"ColE.3", Int64.Type}})
in
#"Changed Type1"
Upvotes: 1
Reputation: 166790
This ended up more complex than I'd thought but seems to work OK
Sub Compact()
Const KEY_COLS As Long = 4
Dim ws As Worksheet, i As Long, k As String, nextEmpty As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), "~~")
If Not dict.exists(k) Then
'move this row up?
If nextEmpty > 0 Then
ws.Cells(i, 1).Resize(1, KEY_COLS + 1).Cut ws.Cells(nextEmpty, 1)
dict.Add k, nextEmpty 'new key - store row#
nextEmpty = 0
Else
dict.Add k, i 'new key - store row#
End If
Else
'seen this key before - move value to that row and clear
ws.Cells(dict(k), Columns.Count).End(xlToLeft).Offset(0, 1).Value = _
ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents
If nextEmpty = 0 Then nextEmpty = i 'available row
End If
Next i
End Sub
Edit: this is a bit cleaner I think. It's split into separate "read" and "write" parts.
Sub Compact2()
Const KEY_COLS As Long = 4
Const SEP As String = "~~"
Dim ws As Worksheet, i As Long, k, col As Long, v
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
'collect all the unique combinations and associated values
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), SEP)
If Not dict.exists(k) Then dict.Add k, New Collection
dict(k).Add ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents 'clear row
Next i
're-populate the sheet from the dictionary
i = 1
For Each k In dict
ws.Cells(i, 1).Resize(1, KEY_COLS).Value = Split(k, SEP)
col = KEY_COLS + 1
For Each v In dict(k)
ws.Cells(i, col) = v
col = col + 1
Next v
i = i + 1
Next k
End Sub
Upvotes: 4