Reputation: 55
I am attempting to query a file that contains a list of personnel data using just the identifying number using VBScript. Currently I have a data set file that contains all of the personnel data and a query file that has the ID numbers that I am wanting to query results from the data set. Once the query has resulted in a match, I would like to output the line to a results file.
Here is a generic example of the data contained within the data set file and the query file.
Data set:
ID,Name,City,State,Zipcode,Phone 1885529946,Hall,Louisville,KY,40208,5026366683 1886910320,Brown,Sacramento,CA,95814,5302981550 1953250581,Rios,Sterling,OK,73567,5803658077 1604767393,Barner,Irvine,CA,92714,9494768597 1713746771,Herrera,Stotts City,MO,65756,4172852393 1022686106,Moore,Ocala,FL,34471,3526032811 1579121274,Beyer,Alexandria,MD,22304,3013838430 1288569655,Rondeau,Augusta,GA,30904,7066671404 1954615404,Angel,Los Angeles,CA,90014,5622961806 1408747874,Lagasse,Traverse City,MI,49686,2318182792
Query file:
1885529946 1713746771 1408747874
I am able to read all the lines in the query file and display the ID number using WScript.Echo
. No errors are generated and the script does not end nor does it generate the results file. The results file should only contain the line from the data set that matches on the ID number. For example:
1885529946,Hall,Louisville,KY,40208,5026366683 1713746771,Herrera,Stotts City,MO,65756,4172852393 1408747874,Lagasse,Traverse City,MI,49686,2318182792
Here is the script that I am attempting to use:
Const intForReading = 1
Const intForWriting = 2
Const intForAppending = 8
strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileToRead = objFSO.OpenTextFile(strQueryFile, intForReading)
Set objFileToQuery = objFSO.OpenTextFile(strDataSetFile, intForReading)
Set objFileToWrite = objFSO.OpenTextFile(strOutputFile, intForWriting, True)
Do Until objFileToQuery.AtEndOfStream
Do Until objFileToRead.AtEndOfStream
strNumber = objFileToRead.ReadLine()
WScript.Echo strNumber
strLine = objFileToQuery.ReadLine()
If InStr(strLine,strNumber) > 0 Then strFoundText = strLine
objFileToWrite.WriteLine(strFoundText)
Loop
Loop
objFileToQuery.Close
objFileToRead.Close
objFileToWrite.Close
Upvotes: 1
Views: 66
Reputation: 984
I like to use ADODB for these kind of tasks, and treat the input files as a database. The trick typically is to find the right connection string for your system and use a Schema.ini file where necessary.
option explicit
Const adClipString = 2
dim ado: set ado = CreateObject("ADODB.Connection")
' data files are in this folder
' using the old JET driver
ado.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=.\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
' or maybe use ACE if installed
' ado.ConnectionString = "Driver=Microsoft Access Text Driver (*.txt, *.csv);Dbq=.\;Extensions=asc,csv,tab,txt;"
ado.open
' query is in a CSV too, so we can access as a table
' the column names are given in Schema.ini
const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])"
' or literals
' const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN ('1885529946', '1713746771', '1408747874')"
dim rs: set rs = ado.Execute(QUERY)
' convenient GetString() method allows formatting the result
' this could be written to file instead of outputting to console
WScript.Echo rs.GetString(adClipString, , vbTab, vbNewLine, "[NULL]")
'or create a new table!
'delete results table if exists
' catch an error if the table does not exist
on error resume next
' for some reason you need to use #csv not .csv here
ado.Execute "DROP TABLE result#csv"
if err then
WScript.Echo err.description
end if
on error goto 0
ado.Execute("SELECT * INTO [result.csv] FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])")
rs.close
ado.close
Schema.ini file
[data_test.csv]
Format=CSVDelimited
ColNameHeader=True
Col1=ID Text
Col2=Name Text
Col3=City Text
Col4=Zipcode Text
Col5=Phone Text
[query_test.csv]
Format=CSVDelimited
ColNameHeader=False
Col1=ID Text
Upvotes: 2
Reputation: 338406
The problem in your code is that those files are opened as streams. Once you have reached the end of such a stream (i.e. .AtEndOfStream
becomes true, for example after repeatedly calling .ReadLine()
), it won't magically rewind to the beginning of the file. Your "nested loop" approach would require rewinding of the query file in order to function.
This would be possible by closing and re-opening the stream, but it is not very efficient. It's also not very efficient to compare all numbers against every line from the input file. I recommend you use a Dictionary object to store the numbers in the query file. Dictionaries store key-value pairs and are optimized for very fast key lookup (via .Exists(someKey)
), so they are a perfect fit for this task.
This way you can find out very quickly if a line should be written to the output file:
Const intForReading = 1
Const intForWriting = 2
Const intForAppending = 8
strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"
Set objFSO = CreateObject("Scripting.FileSystemObject")
' first import the query file into a dictionary for easy lookup
Set numbers = CreateObject("Scripting.Dictionary")
With objFSO.OpenTextFile(strQueryFile, intForReading)
Do Until .AtEndOfStream
' we are only interested in the key for this task, the value is completely irrelevant.
numbers.Add .ReadLine(), ""
Loop
.Close
End With
Set objFileToWrite = objFSO.OpenTextFile(strOutputFile, intForWriting, true)
With objFSO.OpenTextFile(strDataSetFile, intForReading)
Do Until .AtEndOfStream
line = .ReadLine()
columns = Split(line, ",")
currentNumber = columns(0)
If numbers.Exists(currentNumber) Then objFileToWrite.WriteLine(line)
Loop
.Close
End With
objFileToWrite.Close
Upvotes: 3