Mr Xaero
Mr Xaero

Reputation: 55

Query data set using number in a file

I am attempting to query a file that contains a list of personnel data using just the identifying number using VBScript. Currently I have a data set file that contains all of the personnel data and a query file that has the ID numbers that I am wanting to query results from the data set. Once the query has resulted in a match, I would like to output the line to a results file.

Here is a generic example of the data contained within the data set file and the query file.

Data set:

ID,Name,City,State,Zipcode,Phone
1885529946,Hall,Louisville,KY,40208,5026366683
1886910320,Brown,Sacramento,CA,95814,5302981550
1953250581,Rios,Sterling,OK,73567,5803658077
1604767393,Barner,Irvine,CA,92714,9494768597
1713746771,Herrera,Stotts City,MO,65756,4172852393
1022686106,Moore,Ocala,FL,34471,3526032811
1579121274,Beyer,Alexandria,MD,22304,3013838430
1288569655,Rondeau,Augusta,GA,30904,7066671404
1954615404,Angel,Los Angeles,CA,90014,5622961806
1408747874,Lagasse,Traverse City,MI,49686,2318182792

Query file:

1885529946
1713746771
1408747874

I am able to read all the lines in the query file and display the ID number using WScript.Echo. No errors are generated and the script does not end nor does it generate the results file. The results file should only contain the line from the data set that matches on the ID number. For example:

1885529946,Hall,Louisville,KY,40208,5026366683
1713746771,Herrera,Stotts City,MO,65756,4172852393
1408747874,Lagasse,Traverse City,MI,49686,2318182792

Here is the script that I am attempting to use:

Const intForReading = 1
Const intForWriting = 2
Const intForAppending = 8

strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileToRead = objFSO.OpenTextFile(strQueryFile, intForReading)
Set objFileToQuery = objFSO.OpenTextFile(strDataSetFile, intForReading)
Set objFileToWrite = objFSO.OpenTextFile(strOutputFile, intForWriting, True)

Do Until objFileToQuery.AtEndOfStream
    Do Until objFileToRead.AtEndOfStream
        strNumber = objFileToRead.ReadLine()
        WScript.Echo strNumber
        strLine = objFileToQuery.ReadLine()
        If InStr(strLine,strNumber) > 0 Then strFoundText = strLine
        objFileToWrite.WriteLine(strFoundText)
    Loop
Loop

objFileToQuery.Close
objFileToRead.Close
objFileToWrite.Close

Upvotes: 1

Views: 66

Answers (2)

I like to use ADODB for these kind of tasks, and treat the input files as a database. The trick typically is to find the right connection string for your system and use a Schema.ini file where necessary.

option explicit

Const adClipString = 2

dim ado: set ado = CreateObject("ADODB.Connection")
' data files are in this folder
' using the old JET driver
ado.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=.\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
' or maybe use ACE if installed
' ado.ConnectionString = "Driver=Microsoft Access Text Driver (*.txt, *.csv);Dbq=.\;Extensions=asc,csv,tab,txt;"
ado.open

' query is in a CSV too, so we can access as a table
' the column names are given in Schema.ini
const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])"
' or literals 
' const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN ('1885529946', '1713746771', '1408747874')"

dim rs: set rs = ado.Execute(QUERY)

' convenient GetString() method allows formatting the result
' this could be written to file instead of outputting to console
WScript.Echo rs.GetString(adClipString, , vbTab, vbNewLine, "[NULL]")

'or create a new table!
'delete results table if exists
' catch an error if the table does not exist
on error resume next
' for some reason you need to use #csv not .csv here
ado.Execute "DROP TABLE result#csv"
if err then
    WScript.Echo err.description
end if
on error goto 0

ado.Execute("SELECT * INTO [result.csv] FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])")

rs.close
ado.close

Schema.ini file

[data_test.csv]
Format=CSVDelimited
ColNameHeader=True

Col1=ID Text
Col2=Name Text
Col3=City Text 
Col4=Zipcode Text
Col5=Phone Text


[query_test.csv]
Format=CSVDelimited
ColNameHeader=False 

Col1=ID Text

Upvotes: 2

Tomalak
Tomalak

Reputation: 338406

The problem in your code is that those files are opened as streams. Once you have reached the end of such a stream (i.e. .AtEndOfStream becomes true, for example after repeatedly calling .ReadLine()), it won't magically rewind to the beginning of the file. Your "nested loop" approach would require rewinding of the query file in order to function.

This would be possible by closing and re-opening the stream, but it is not very efficient. It's also not very efficient to compare all numbers against every line from the input file. I recommend you use a Dictionary object to store the numbers in the query file. Dictionaries store key-value pairs and are optimized for very fast key lookup (via .Exists(someKey)), so they are a perfect fit for this task.

This way you can find out very quickly if a line should be written to the output file:

Const intForReading = 1
Const intForWriting = 2
Const intForAppending = 8

strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"

Set objFSO = CreateObject("Scripting.FileSystemObject")

' first import the query file into a dictionary for easy lookup
Set numbers = CreateObject("Scripting.Dictionary")    
With objFSO.OpenTextFile(strQueryFile, intForReading)
    Do Until .AtEndOfStream
        ' we are only interested in the key for this task, the value is completely irrelevant.
        numbers.Add .ReadLine(), ""
    Loop
    .Close
End With

Set objFileToWrite = objFSO.OpenTextFile(strOutputFile, intForWriting, true)    
With objFSO.OpenTextFile(strDataSetFile, intForReading)
    Do Until .AtEndOfStream
        line = .ReadLine()
        columns = Split(line, ",")
        currentNumber = columns(0)
        If numbers.Exists(currentNumber) Then objFileToWrite.WriteLine(line)
    Loop
    .Close
End With

objFileToWrite.Close

Upvotes: 3

Related Questions