Reputation: 51
I have been googling at this all day, but as I have zero VBA skills, I can't find anything that works properly and that I can adapt to my needs.
I have a folder with about 4500 text files with the outputs of a hardware test run on computers we process. The text file has a lot of info I don't need. What I do need is to import all these files into a single spreadsheet and then parse them down to 3 desired fields each - and in such a way that is easily readable.
I have had a hard time even finding a VBA macro that import the text files without an error I can't get past. I did find the following one, however it is set up to look for a second column in the file, which in my case screws up the inputted data and throws around values all over the place (think minor formatting issue times 4500 records).
Here is the macro:
Sub test()
Dim myDir As String, fn As String, ff As Integer, txt As String
Dim delim As String, n As Long, b(), flg As Boolean, x
myDir = "c:\test" '<- change to actual folder path
delim = vbTab '<- delimiter (assuming Tab delimited)
Redim b(1 To Rows.Count, 1 To 2)
fn = Dir(myDir & "\*.txt")
Do While fn <> ""
ff = FreeFile
Open myDir & "\" & fn For Input As #ff
Do While Not EOF(ff)
Line Input #ff, txt
x = Split(txt, delim)
n = n + 1
If Not flg Then b(n,2) = fn
If UBound(x) > 0 Then
b(n,1) = x(1)
End If
flg = True
Loop
Close #ff
flg = False
fn = Dir()
Loop
ThisWorkbooks.Sheets(1).Range("a1").Resize(n,2).Value = b
End Sub
And here is a sample of one of the text files:
ILPN Number: I01128204
MAC Address: E0DB55820F85
Hardware:
CPU:Intel(R) Core(TM) i3-2370M CPU @ 2.40GHz
MOTHERBOARD:Dell Inc. - 0G8TPV - A02
BIOS VERSION/DATE:A02 - 08/15/2012
RAM DETECTED:4096 MB (BANK 0: 0/DDR3/1333 - BANK 2: 0/Unknown/0)
MANUFACTURER:Dell Inc.
PRODUCT:Inspiron 3520
SERIAL:B1JW9V1
NIC SPEED/NAME:100 Mbps - Realtek PCIe FE Family Controller, V:8.1
GRAPHIC RES:1366 x 768 (32 bits)
OPTIC DRIVE:HL-DT-ST, DVD+-RW GT80N , A103 (E:)
FIXED DISK:ST500LM012 HN-M500MBB - 465.76 GB (C: - GPT)
- 0.00 GB (D: - MBR)
WINPE:Microsoft Windows 8 Ultimate Edition, 32-bit (build 9200)
Module 126: Result: Ok
Initializing module version='1.0.0.17' with ''
Module Initialization done.
Starting module with: Param1=0x00010010 - Param2=0xFFFF0000
Module start: 14/05/2013 10:38:40
Set language module to: en-US
Found disk: 0 - 'ST500LM012 HN-M500MBB'
All PHYSICALDRIVE will be used as valid target...
Module started properly.
Cleaning element 'Disk ID: 0 - Model: ST500LM012 HN-M500MBB - Size: 465.762 GB' with algorithm 'Basic (random)'...
Clean process Successful
Process took 6099 sec to clean 476940.02MB ~ 78.20 MB/Sec
Closing module with code '1'...
Closing module done with result '0'
Releasing module...
Releasing module done at: 14/05/2013 12:20:19
So, from all that, I need to pull out the "ILPN number", The MAC address, and the line that says "Module 126: Result OK". Everything else can go away. It would be nice if I could get it sorted with columns for the three values, and then each record in its own row.
This is probably a two step process. Either or both solutions would be tremendously helpful. Thank you!
Upvotes: 0
Views: 11652
Reputation: 38745
Restating the task/problem:
Given a lot of structured text files in a folder, extract from each file three (or a multiple of three?) data items (ILPN Number, MAC Address, Test Result) and put them as rows in a .txt/.csv file that can be red by/imported into Excel.
Idea:
Loop over the files, read each into memory, parse/extract the data triple(s) using a RegExp, write them into the output file; use Excel to open the output file, do further manipulation manually.
Plan:
Use a VBScript from the command line to keep things simple.
[If that look good to you and you can answer the question "Does each input file contain one or more info triple(s)?", I'm willing to add some proof of concept code to this concept.]
While you wait:
Given your additional info, I believe that @Ansgar's assumptions about your input data are mostly correct, and I fixed two typos in the code. So give his script (+1) a try by starting a "Command Prompt", creating/changing to a suitable directory, copy the code into a file - say ansgar.vbs
-, adapt the folder/output file specs to your need, and run it via cscript ansgar.vbs
.
Small changes - like putting the whole "Module" line into the 3rd item
If Left(line, 10) = "Module 126" Then
ws.Cells(row, 3).Value = Trim(Split(line, ":")(2))
==>
If Left(line, 7) = "Module " Then
ws.Cells(row, 3).Value = line
are not to hard. If you are lucky, your problem is solved.
Update (wrt comment/subscript):
I think the
ws.Cells(row, 3).Value = Trim(Split(line, ":")(2))
line is to blame. The Split() on ":" should split input lines like
Module 126: Result: Ok
into an array of three elements "Module 126", " Result", and " Ok" numbered/indexed/subscripted from 0 to 2. If Split() does not get two : separators in the line, the resulting array is smaller and the access to element #2 will fail.
You'll have to assign the return value from Split() to a variable, check the size of the array (UBound), look at the faulty lines, and decide whether to ignore them or to change the guarding If condition.
Upvotes: 0
Reputation: 200203
A VBScript for reading the three lines from multiple input files into an Excel sheet could look like this:
Set xl = CreateObject("Excel.Application")
xl.Visible = True
Set wb = xl.Workbooks.Add
Set ws = wb.Sheets(1)
row = 1
ws.Cells(row, 1).Value = "ILPN"
ws.Cells(row, 2).Value = "MAC Address"
ws.Cells(row, 3).Value = "Module 126"
Set fso = CreateObject("Scripting.FileSystemObject")
For Each f In fso.GetFolder("C:\your\folder").Files
If LCase(fso.GetExtensionName(f.Name)) = "txt" Then
row = row + 1
Set stream = f.OpenAsTextStream
ws.Cells(row, 1).Value = Trim(Split(stream.ReadLine, ":")(1))
ws.Cells(row, 2).Value = Trim(Split(stream.ReadLine, ":")(1))
Do Until stream.AtEndOfStream
line = stream.ReadLine
If Left(line, 10) = "Module 126" Then
ws.Cells(row, 3).Value = Trim(Split(line, ":")(2))
Exit Do
End If
Loop
stream.Close
End If
Next
wb.SaveAs "C:\some\folder\output.xls", -4143, , , , False
wb.Close
xl.Quit
As a VBA macro the following should work:
Sub LoadDataFromFiles
row = 1
ActiveSheet.Cells(row, 1).Value = "ILPN"
ActiveSheet.Cells(row, 2).Value = "MAC Address"
ActiveSheet.Cells(row, 3).Value = "Module 126"
Set fso = CreateObject("Scripting.FileSystemObject")
For Each f In fso.GetFolder("C:\your\folder").Files
If LCase(fso.GetExtensionName(f.Name)) = "txt" Then
row = row + 1
Set stream = f.OpenAsTextStream
ActiveSheet.Cells(row, 1).Value = Trim(Split(stream.ReadLine, ":")(1))
ActiveSheet.Cells(row, 2).Value = Trim(Split(stream.ReadLine, ":")(1))
Do Until stream.AtEndOfStream
line = stream.ReadLine
If Left(line, 10) = "Module 126" Then
ActiveSheet.Cells(row, 3).Value = Trim(Split(line, ":")(2))
Exit Do
End If
Loop
stream.Close
End If
Next
ActiveWorkbook.Save
End Sub
Upvotes: 1