Reputation: 103
I have a row of data as follows:
header1 header2 header3 header4 header5
row key datavalue1 datavalue2 datavalue3 datavalue4 datavalue5....
Basically, I have a denormalized data set where the data values may or may not be empty on a row-by-row basis, and I need to normalize them.
For example:
12345678 NULL 10 3 NULL 14
would become:
12345678 header2 10
12345678 header3 3
12345678 header5 14
I could do this with a Paste Special transpose, but I have thousands of rows and I'd need to make sure that I get the right row key for each. Furthermore, each row has a bunch of descriptive columns associated with it that I need copied over with each data value.
What is the easiest way to convert each row of columns such that I have multiple rows of a single column with all non-empty datavalues plus the associated datavalue reference? I need to be able to pivot the dataset.
Upvotes: 2
Views: 36872
Reputation: 11
Excel has a transpose feature which may address your needs. It's pretty hidden and a bit clumsy but likely easier than delving into VBA. Here's an excerpt from Excel 2007 Help:
Switch (transpose) columns and rows
If data is entered in columns or rows, but you want to rearrange that data into rows or columns instead, you can quickly transpose the data from one to the other.
For example, the regional sales data that is organized in columns appears in rows after transposing the data.
1. On the worksheet, do the following: To rearrange data from columns to rows, select the cells in the columns that contain the data. To rearrange data from rows to columns, select the cells in the rows that contain the data.
2. On the Home tab, in the Clipboard group, click Copy.
Keyboard shortcut: To copy the selected data, you can also press CTRL+C.
Note: You can only use the Copy command to rearrange the data. To complete this procedure successfully, do not use the Cut command.
3. On the worksheet, select the first cell of the destination rows or columns into which you want to rearrange the copied data.
Note: Copy areas (the cells that you copy when you want to paste data into another location; after you copy cells, a moving border appears around them to indicate that they've been copied) and paste areas (the target destination for data that's been cut or copied by using the Office Clipboard) cannot overlap. Make sure that you select a cell in a paste area that falls outside of the area from which you copied the data.
4. On the Home tab, in the Clipboard group, click the arrow below Paste, and then click Transpose.
5. After the data is transposed successfully, you can delete the data in the copy area.
Tip: If the cells that you transpose contain formulas, the formulas are transposed and cell references to data in transposed cells are automatically adjusted. To make sure that formulas continue to refer correctly to data in nontransposed cells, use absolute references in the formulas before you transpose them.
For more information, see Switch between relative, absolute, and mixed references.
Upvotes: 1
Reputation: 11
Seems to me that part of what you are trying to do is to "de-pivot" a pivot table. I've found this tip to be a tremendous help when I've had to do similar tasks: http://spreadsheetpage.com/index.php/tip/creating_a_database_table_from_a_summary_table/
Note that in Excel 2007, you can get to the old Excel 2003 pivot table wizard using the keystrokes Alt+D, P.
Upvotes: 1
Reputation: 2666
I would create a VBA macro that loops through each row and outputs the data to another sheet. This would let you create your pivot table in the new sheet once the data has been output.
Not sure how familiar you are with VBA, but this could pretty easily be done by loading the data into an array (or collection of objects if you really want to do it correctly) and writing it back out.
Here is a link to a good VBA document.
http://social.msdn.microsoft.com/Forums/en/isvvba/thread/d712dbdd-c876-4fe2-86d2-7d6323b4262c
Edit
Please note this is not meant to be a fully working solution but really a generic framework to help you in the right direction.
As a generic example that does a lot of what you would need to do (not the best way, but probably the easiest for a beginner), something like this should get you started, although it is hard to say without seeing more of your worksheet.
Sub RowsToColumns()
    Dim srcWrkSheet As Worksheet
    Dim destWrkSheet As Worksheet
    Dim srcRowNumber As Long
    Dim srcColNumber As Long
    Dim destRowNumber As Long

    Application.ScreenUpdating = False
    Set srcWrkSheet = Sheets("YourSourceWorkSheetName")
    Set destWrkSheet = Sheets("YourDestinationWorkSheetName")
    srcRowNumber = 1
    srcColNumber = 1
    destRowNumber = 1

    'Loop until a blank cell is encountered in column 1
    Do Until srcWrkSheet.Cells(srcRowNumber, 1).Value = ""
        'Write each source column to its own destination row
        destWrkSheet.Cells(destRowNumber, 1).Value = "Header 1 " & srcWrkSheet.Cells(srcRowNumber, srcColNumber).Value
        destWrkSheet.Cells(destRowNumber + 1, 1).Value = "Header 2 " & srcWrkSheet.Cells(srcRowNumber, srcColNumber + 1).Value
        srcRowNumber = srcRowNumber + 1
        destRowNumber = destRowNumber + 2
    Loop

    'Restore screen updating before leaving
    Application.ScreenUpdating = True
End Sub
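The array approach mentioned above could be sketched roughly as follows. This is only a sketch under assumptions that are not in the question: the key is in column A, the header names are in row 1 starting at column B, missing values are truly empty cells (not the text NULL), and the sheet names are the same placeholders used above. The name UnpivotWithArrays is made up for illustration.

```vba
Sub UnpivotWithArrays()
    Dim src As Variant, out() As Variant
    Dim r As Long, c As Long, n As Long
    Dim lastRow As Long, lastCol As Long

    With ThisWorkbook.Worksheets("YourSourceWorkSheetName")
        'Assumption: column A holds the keys and row 1 holds the headers
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
        If lastRow < 2 Or lastCol < 2 Then Exit Sub 'nothing to unpivot
        'Read the whole block into memory in one call
        src = .Range(.Cells(1, 1), .Cells(lastRow, lastCol)).Value
    End With

    'Worst case: every data cell is filled
    ReDim out(1 To (lastRow - 1) * (lastCol - 1), 1 To 3)

    For r = 2 To lastRow
        For c = 2 To lastCol
            If Not IsEmpty(src(r, c)) Then
                n = n + 1
                out(n, 1) = src(r, 1) 'row key
                out(n, 2) = src(1, c) 'header name
                out(n, 3) = src(r, c) 'data value
            End If
        Next c
    Next r

    'Write only the n rows that were filled
    If n > 0 Then
        ThisWorkbook.Worksheets("YourDestinationWorkSheetName") _
            .Range("A1").Resize(n, 3).Value = out
    End If
End Sub
```

Reading and writing the range in a single call each is usually much faster than touching cells one at a time, which matters with thousands of rows. If every row also carries descriptive columns, they would need extra output columns copied alongside the key.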
Upvotes: 0
Reputation: 33118
Let's look at a possible solution in VBA. I think this will really help. Here are a few things you should know about my code.
The code assumes that missing values contain the literal text NULL. If the cells are actually empty, you'll want to check for If IsEmpty(rngCurrent.Value) Then instead.
Sub NormalizeSheet()
    Dim wsOriginal As Worksheet
    Dim wsNormalized As Worksheet
    Dim strKey As String
    Dim clnHeader As Collection
    Dim lngColumnCounter As Long
    Dim lngRowCounterOriginal As Long
    Dim lngRowCounterNormalized As Long
    Dim rngCurrent As Range
    Dim varColumn As Variant

    Set wsOriginal = ThisWorkbook.Worksheets("Original") 'This is the name of your original worksheet
    Set wsNormalized = ThisWorkbook.Worksheets("Normalized") 'This is the name of the new worksheet
    Set clnHeader = New Collection
    wsNormalized.Cells.ClearContents 'This deletes the contents of the destination worksheet

    lngColumnCounter = 2
    lngRowCounterOriginal = 1
    Set rngCurrent = wsOriginal.Cells(lngRowCounterOriginal, lngColumnCounter)
    'We'll loop through just the headers to get a collection of header names
    Do Until IsEmpty(rngCurrent.Value)
        clnHeader.Add rngCurrent.Value, CStr(lngColumnCounter)
        lngColumnCounter = lngColumnCounter + 1
        Set rngCurrent = wsOriginal.Cells(lngRowCounterOriginal, lngColumnCounter)
    Loop

    'Here we'll reset our row counter and loop through the entire data set
    lngRowCounterOriginal = 2
    lngRowCounterNormalized = 1
    lngColumnCounter = 1
    Do While Not IsEmpty(wsOriginal.Cells(lngRowCounterOriginal, lngColumnCounter))
        Set rngCurrent = wsOriginal.Cells(lngRowCounterOriginal, lngColumnCounter)
        strKey = rngCurrent.Value 'Get the key value from the current cell
        lngColumnCounter = 2
        'This next loop parses the denormalized values for each row.
        'It runs over every header column, so an empty cell in the
        'middle of a row does not end the loop early.
        Do While lngColumnCounter <= clnHeader.Count + 1
            Set rngCurrent = wsOriginal.Cells(lngRowCounterOriginal, lngColumnCounter)
            'We're going to check to see if the current value
            'is equal to NULL. If it is, we won't add it to
            'the Normalized Table.
            If rngCurrent.Value = "NULL" Then
                'Skip it
            Else
                'Add this item to the normalized sheet
                wsNormalized.Range("A" & lngRowCounterNormalized).Value = strKey
                wsNormalized.Range("B" & lngRowCounterNormalized).Value = clnHeader(CStr(lngColumnCounter))
                wsNormalized.Range("C" & lngRowCounterNormalized).Value = rngCurrent.Value
                lngRowCounterNormalized = lngRowCounterNormalized + 1
            End If
            lngColumnCounter = lngColumnCounter + 1
        Loop
        lngRowCounterOriginal = lngRowCounterOriginal + 1
        lngColumnCounter = 1 'We reset the column counter here because we're on a new row
    Loop
End Sub
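If you are not sure whether the missing values are empty cells or the text NULL, one option is a small helper that treats both as missing. This is a sketch only. IsBlankOrNullText is a made-up name, and treating whitespace-only cells as missing is my assumption, not something stated in the question.

```vba
'Returns True when the cell should be skipped: truly empty,
'whitespace only, or the literal text NULL (any letter case).
Function IsBlankOrNullText(ByVal cell As Range) As Boolean
    Dim v As Variant
    v = cell.Value

    If IsEmpty(v) Then
        IsBlankOrNullText = True
    ElseIf IsError(v) Then
        'Error values such as #N/A cannot be converted to text; keep them
        IsBlankOrNullText = False
    Else
        Dim s As String
        s = Trim$(CStr(v))
        IsBlankOrNullText = (Len(s) = 0) Or (UCase$(s) = "NULL")
    End If
End Function
```

The test in the macro would then read If Not IsBlankOrNullText(rngCurrent) Then, with the three write statements inside it. Keep in mind that a column loop which stops at the first empty cell never reaches the values after a gap, so the loop over data columns has to be bounded by the number of headers for this check to be useful.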
Upvotes: 0
Reputation: 33175
If you have five "header" columns, enter these formulas
H1: =OFFSET($A$1,INT((ROW()-1)/5)+1,0)
I1: =OFFSET($A$1,0,IF(MOD(ROW(),5)=0,5,MOD(ROW(),5)))
J1: =INDEX($A$1:$F$9,MATCH(H1,$A$1:$A$9,FALSE),MATCH(I1,$A$1:$F$1,FALSE))
Fill the formulas down until every key/header pair is covered (five rows per data row), then copy H1:J?? and paste special values over the top. Sort on column J and delete anything that's a zero. If you have legitimate zeros in the data, then you first need to replace blank cells with some unique string that you can then delete later.
If you have more columns, then replace the '5' in all the above formulas with whatever number you have, and widen the $A$1:$F$9, $A$1:$A$9 and $A$1:$F$1 ranges to cover your data.
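To see what the ROW() arithmetic in the H and I formulas is doing, here is the same index math written out in VBA. It is only an illustration of the mapping, not a replacement for the formulas. The function names are made up, and it assumes the layout the formulas imply: keys in column A from row 2 down, headers in row 1 from column B across.

```vba
'Sheet row of the key that output row outRow refers to.
'Mirrors OFFSET($A$1, INT((ROW()-1)/n)+1, 0).
Function SourceRow(ByVal outRow As Long, ByVal headerCount As Long) As Long
    SourceRow = ((outRow - 1) \ headerCount) + 2
End Function

'Sheet column of the header that output row outRow refers to.
'Mirrors OFFSET($A$1, 0, IF(MOD(ROW(),n)=0, n, MOD(ROW(),n))).
Function SourceColumn(ByVal outRow As Long, ByVal headerCount As Long) As Long
    SourceColumn = ((outRow - 1) Mod headerCount) + 2
End Function

'Prints output row, source row and source column to the Immediate window
Sub PrintMapping()
    Dim r As Long
    For r = 1 To 10
        Debug.Print r, SourceRow(r, 5), SourceColumn(r, 5)
    Next r
End Sub
```

With five headers, output rows 1 to 5 all point at the key in row 2 and step through columns 2 to 6 (B to F), and output row 6 moves on to the key in row 3 and starts again at column 2. Changing the 5 changes how many output rows each data row produces.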
Upvotes: 3