ManishChristian

Compare text files - Ignoring some text using VBScript

First of all, I am not from a programming background and am totally new to VBScript. For various reasons, I have to do this scripting task at the office. I want to use it with Quick Test Professional 11.

I have gone through many posts here as well as on other forums, but have not been able to find the information I need.

Ok so here's what I need to do:

I have to compare two text files and write the differences to a third file. Both files have almost the same content apart from some fields, e.g. Date, Order no and so on.

For example, File-1 has Date: 00/11/1234 and Order no: 1111, and File-2 has Date: 11/00/6789 and Order no: 2222. Is there any way to ignore these fields and their values? Can I create an ignore list that I can add to, and that will be used to skip those fields during comparison? These values will always be different, so they should not show up in my difference file; I only want all the other differences in my result file.

For your reference, here are the sample files.

So far I have compared both files, but only in the simplest way; I don't know how to ignore the fields. I want to turn this task into a function so that I can use it in my function library.

File-1

Date: 00/11/1234 / Order no: 1111

Price 1: $1111.00

Price 2: $2222.00

Price 3: $1234.00

ABC def GHI kjl 1111

Order no: 1111

Term: 2-Year

Date: 00/11/1234

File-2

Date: 11/00/6789 and Order no: 2222

Price 1: $1111.00

Price 2: $2222.00

Price 3: $5678.00

ABC def GHI kjl 1111

Order no: 2222

Term: 3-Year

Date: 11/00/6789

Result file should display:

Differences:

File-1 Line 4: Price 3: $1234.00

File-2 Line 4: Price 3: $5678.00

File-1 Line 7: Term: 2-Year

File-2 Line 7: Term: 3-Year

Thank you very much in advance.

Hi @Ekkehard.Horner, thanks a lot for your help and time, and for tolerating my silly questions. The fact is, the more I tried to understand your code, the more confused I got. When I put the code below into Quick Test Professional 11, it throws a syntax error at "Dim oDiffer : Set oDiffer = New cDiffer.init("C:......". QTP reports "Expected end of statement" between "... New cDiffer" and ".init". QTP shows this error in both the "TrialVersion" function and the "GoldVersion" function.

It would be great if you could shed some light on this. Also, is the "Expected" text file necessary? I would rather not include that part, because otherwise I have to create an "Expected" file for every comparison.

Please pardon my silly questions.

Thanks in advance.

Class cDiffer   
Option Explicit  
Dim goFS : Set goFS = CreateObject("Scripting.FileSystemObject")  
    WScript.Quit TrialVersion() 
    WScript.Quit TinVersion()  
Function TinVersion()   
    WScript.Echo "can't compare files yet."   
    TinVersion = 1 
End Function ' TinVersion  
HERE I'VE COMMENTED TRIALVERSION FUNCTION 
Function TrialVersion()   
Dim oDiffer : Set oDiffer = New cDiffer.init("C:\Documents and Settings\24800\My Documents\PDF comparison\A_30120625003267.TXT", "C:\Documents and Settings\aa24800\My Documents\PDFcomparison\B_30120502002776.TXT", Array("Quote ID:", "Quote Summary for:", "Quote Date:", "Tracking ID (A):", "Tracking ID (Z):", "Tracking ID:")
    ' the differ should be able to return a result - the differences   
    Dim sRes : sRes = oDiffer.diffs()   
    ' check actual vs. expected result   
    Dim sExp : sExp = goFS.OpenTextFile("Expected").ReadAll()  
    WScript.Echo "--------- Res"   
    WScript.Echo sRes   
    If sExp = sRes Then      
        WScript.Echo "ok"      
        ' save result      
        goFS.CreateTextFile("C:\Documents and Settings\aa24800\My Documents\PDF comparison\Result.TXT", True).Write sRes      
        TrialVersion = 0   Else      
        ' show failure      
        WScript.Echo "--------- Exp"      
        WScript.Echo sExp      
        WScript.Echo "not ok"      
        TrialVersion = 1   
    End If 
End Function ' TrialVersion  
'trivial Differ 
'Class cDiffer   
    Dim m_sLFSpec : m_sLFSpec = "C:\Documents and Settings\aa24800\My Documents\PDF comparison\A_30120625003267.TXT"
Dim m_sRFSpec : m_sRFSpec = "C:\Documents and Settings\aa24800\My Documents\PDF comparison\B_30120502002776.TXT"   
    ' "constructor" with params   
    Public Function init(sLFSpec, sRFSpec)     
        Set init  = Me     
        m_sLFSpec = sLFSpec     
        m_sRFSpec = sRFSpec   
    End Function   
    Public Function diffs()     
        diffs = "cDiffer.diffs() not implemented yet."   
    End Function ' diffs 
'End Class ' cDiffer00
'gold Differ 
'Class cDiffer   
'   Private m_sLFSpec   ' file specs   
'   Private m_sRFSpec   
    Private m_sLFiNa    ' file names   
    Private m_sRFiNa   
    Private m_dicLabels ' store and efficiently find selective labels   
    ' "constructor" with params   
    Public Function init(sLFSpec, sRFSpec, aLabels)     
        Set init  = Me     
        m_sLFSpec = sLFSpec     
        m_sRFSpec = sRFSpec     
        m_sLFiNa  = goFS.GetBaseName(sLFSpec)     
        m_sRFiNa  = goFS.GetBaseName(sRFSpec)     
        Set m_dicLabels = CreateObject("Scripting.Dictionary")     
        m_dicLabels.CompareMode = vbTextCompare ' case-insensitive     
        Dim sKey     
        For Each sKey In aLabels         
            m_dicLabels(sKey) = 0     
        Next   
    End Function   
    Public Function diffs()     ' Use ArrayList to collect the results     
        Dim alRes : Set alRes = CreateObject("System.Collections.ArrayList")     
        ' requested title     
        alRes.Add "Differences:"     
        ' open both input files     
        Dim tsL   : Set tsL   = goFS.OpenTextFile(m_sLFSpec)     
        Dim tsR   : Set tsR   = goFS.OpenTextFile(m_sRFSpec)     
        ' loop over lines     
        Do Until tsL.AtEndOfStream        
            Dim sLL : sLL = tsL.ReadLine()        
            Dim sRL        
            ' second file could be shorter        
            If tsR.AtEndOfStream Then           
                alRes.Add "tsR.AtEndOfStream"           
        Exit Do        
            Else           
                sRL = tsR.ReadLine()        
            End If        
            ' no need for work if lines are equal        
            If sLL <> sRL Then           
                If m_dicLabels.Exists(Split(sLL, ":")(0))Then                  
                Dim sLiNo : sLiNo = CStr(tsL.Line - 1)& ":"              
            alRes.Add Join(Array(m_sLFiNa, "Line", sLiNo, sLL))              
            alRes.Add Join(Array(m_sRFiNa, "Line", sLiNo, sRL))           
            End If        
        End If     
    Loop     
    tsL.Close     
    tsR.Close     
    diffs = Join(alRes.ToArray(), vbCrLf) & vbCrLf   
End Function ' diffs 
End Class ' cDiffer

Function GoldVersion()   
   ' the differ should know about the files to compare   
   ' and the info labels to select   
   Dim oDiffer : Set oDiffer = New cDiffer.init("C:\Documents and Settings\aa24800\My Documents\PDF comparison\A_30120625003267.TXT", "C:\Documents and Settings\aa24800\My Documents\PDF comparison\B_30120502002776.TXT", Array("Quote ID:", "Quote Summary for:", "Quote Date:", "Tracking ID (A):", "Tracking ID (Z):", "Tracking ID:")
   ' the differ should be able to return a result - the differences    
   Dim sRes : sRes = oDiffer.diffs()   
   ' check actual vs. expected result   
   Dim sExp : sExp = goFS.OpenTextFile("Expected").ReadAll()   
   WScript.Echo "--------- Res"   
   WScript.Echo sRes   
    If sExp = sRes Then      
       WScript.Echo "ok"      
       ' save result      
       goFS.CreateTextFile("C:\Documents and Settings\aa24800\My Documents\PDF comparison\Result.TXT", True).Write sRes      
       GoldVersion = 0   Else      
       ' show failure      
       WScript.Echo "--------- Exp"      
       WScript.Echo sExp      
       WScript.Echo "not ok"      
       GoldVersion = 1   
    End If 
End Function ' GoldVersion

Answers (1)

Ekkehard.Horner

If you are new to VBScript, you may profit from tackling your current problem in a way that will help you with your next problem.

Start with putting SelFileComp.vbs:

'' SelFileComp.vbs - selective file compare

Option Explicit

Dim goFS : Set goFS = CreateObject("Scripting.FileSystemObject")

WScript.Quit TinVersion()

Function TinVersion()
  WScript.Echo "can't compare files yet."
  TinVersion = 1
End Function ' TinVersion

into some suitable directory. Add your input and expected result files. Start SelFileComp.vbs:

cscript SelFileComp.vbs
can't compare files yet.
echo %ErrorLevel%
1

Add (and call) a TrialVersion that prepares and uses a (trivial) Differ object to do the heavy lifting in a skeleton suitable to sanity check the implementation:

'' SelFileComp.vbs - selective file compare

Option Explicit

Dim goFS : Set goFS = CreateObject("Scripting.FileSystemObject")

WScript.Quit TrialVersion()
WScript.Quit TinVersion()

Function TinVersion()
  WScript.Echo "can't compare files yet."
  TinVersion = 1
End Function ' TinVersion

Function TrialVersion()
  ' the differ should know about the files to compare
  ' we'll worry about the selection later
  Dim oDiffer : Set oDiffer = New cDiffer.init( _
     "File-1", "File-2" _
  )
  ' the differ should be able to return a result - the differences
  Dim sRes : sRes = oDiffer.diffs()
  ' check actual vs. expected result
  Dim sExp : sExp = goFS.OpenTextFile("Expected").ReadAll()
  WScript.Echo "--------- Res"
  WScript.Echo sRes
  If sExp = sRes Then
     WScript.Echo "ok"
     ' save result
     goFS.CreateTextFile("..\data\Result", True).Write sRes
     TrialVersion = 0
  Else
     ' show failure
     WScript.Echo "--------- Exp"
     WScript.Echo sExp
     WScript.Echo "not ok"
     TrialVersion = 1
  End If
End Function ' TrialVersion

' trivial Differ
Class cDiffer
  Private m_sLFSpec
  Private m_sRFSpec
  ' "constructor" with params
  Public Function init(sLFSpec, sRFSpec)
    Set init  = Me
    m_sLFSpec = sLFSpec
    m_sRFSpec = sRFSpec
  End Function
  Public Function diffs()
    diffs = "cDiffer.diffs() not implemented yet."
  End Function ' diffs
End Class ' cDiffer

output:

cscript SelFileComp.vbs
--------- Res
cDiffer.diffs() not implemented yet.
--------- Exp
Differences:
File-1 Line 4: Price 3: $1234.00
File-2 Line 4: Price 3: $5678.00
File-1 Line 7: Term: 2-Year
File-2 Line 7: Term: 3-Year

not ok

echo %ErrorLevel%
1

If you then rename the trivial cDiffer

' trivial Differ
Class cDiffer00

you can re-use the name (and the code in TrialVersion()) to come up with a cDiffer that does at least some comparing:

' simple Differ
Class cDiffer
  Private m_sLFSpec
  Private m_sRFSpec
  ' "constructor" with params
  Public Function init(sLFSpec, sRFSpec)
    Set init  = Me
    m_sLFSpec = sLFSpec
    m_sRFSpec = sRFSpec
  End Function
  Public Function diffs()
    ' Use ArrayList to collect the results
    Dim alRes : Set alRes = CreateObject("System.Collections.ArrayList")
    ' requested title
    alRes.Add "Differences:"
    ' open both input files
    Dim tsL   : Set tsL   = goFS.OpenTextFile(m_sLFSpec)
    Dim tsR   : Set tsR   = goFS.OpenTextFile(m_sRFSpec)
    ' loop over lines
    Do Until tsL.AtEndOfStream
       Dim sLL : sLL = tsL.ReadLine()
       Dim sRL
       ' second file could be shorter
       If tsR.AtEndOfStream Then
          alRes.Add "tsR.AtEndOfStream"
          Exit Do
       Else
          sRL = tsR.ReadLine()
       End If
       ' no need for work if lines are equal
       If sLL <> sRL Then
          alRes.Add "??? " & sLL
          alRes.Add "??? " & sRL
       End If
    Loop
    tsL.Close
    tsR.Close
    diffs = Join(alRes.ToArray(), vbCrLf)
  End Function ' diffs
End Class ' cDiffer

output:

cscript SelFileComp.vbs
--------- Res
Differences:
??? Date: 00/11/1234 / Order no: 1111
??? Date: 11/00/6789 and Order no: 2222
??? Price 3: $1234.00
??? Price 3: $5678.00
??? Order no: 1111
??? Order no: 2222
??? Term: 2-Year
??? Term: 3-Year
??? Date: 00/11/1234
??? Date: 11/00/6789
--------- Exp
Differences:
File-1 Line 4: Price 3: $1234.00
File-2 Line 4: Price 3: $5678.00
File-1 Line 7: Term: 2-Year
File-2 Line 7: Term: 3-Year

not ok

This shows clearly which subtasks are still to be done:

  1. Selection of relevant differences
  2. Output formatting

Let's be optimistic and add and call GoldVersion():

Function GoldVersion()
  ' the differ should know about the files to compare
  ' and the info labels to select
  Dim oDiffer : Set oDiffer = New cDiffer.init( _
      "File-1", "File-2" _
    , Array("Price 3", "Term") _
  )
  ' the differ should be able to return a result - the differences
  Dim sRes : sRes = oDiffer.diffs()
  ' check actual vs. expected result
  Dim sExp : sExp = goFS.OpenTextFile("Expected").ReadAll()
  WScript.Echo "--------- Res"
  WScript.Echo sRes
  If sExp = sRes Then
     WScript.Echo "ok"
     ' save result
     goFS.CreateTextFile("..\data\Result", True).Write sRes
     GoldVersion = 0
  Else
     ' show failure
     WScript.Echo "--------- Exp"
     WScript.Echo sExp
     WScript.Echo "not ok"
     GoldVersion = 1
  End If
End Function ' GoldVersion

with a better cDiffer:

' gold Differ
Class cDiffer
  Private m_sLFSpec   ' file specs
  Private m_sRFSpec
  Private m_sLFiNa    ' file names
  Private m_sRFiNa
  Private m_dicLabels ' store and efficiently find selective labels
  ' "constructor" with params
  Public Function init(sLFSpec, sRFSpec, aLabels)
    Set init  = Me
    m_sLFSpec = sLFSpec
    m_sRFSpec = sRFSpec
    m_sLFiNa  = goFS.GetBaseName(sLFSpec)
    m_sRFiNa  = goFS.GetBaseName(sRFSpec)
    Set m_dicLabels = CreateObject("Scripting.Dictionary")
    m_dicLabels.CompareMode = vbTextCompare ' case-insensitive
    Dim sKey
    For Each sKey In aLabels
        m_dicLabels(sKey) = 0
    Next
  End Function
  Public Function diffs()
    ' Use ArrayList to collect the results
    Dim alRes : Set alRes = CreateObject("System.Collections.ArrayList")
    ' requested title
    alRes.Add "Differences:"
    ' open both input files
    Dim tsL   : Set tsL   = goFS.OpenTextFile(m_sLFSpec)
    Dim tsR   : Set tsR   = goFS.OpenTextFile(m_sRFSpec)
    ' loop over lines
    Do Until tsL.AtEndOfStream
       Dim sLL : sLL = tsL.ReadLine()
       Dim sRL
       ' second file could be shorter
       If tsR.AtEndOfStream Then
          alRes.Add "tsR.AtEndOfStream"
          Exit Do
       Else
          sRL = tsR.ReadLine()
       End If
       ' no need for work if lines are equal
       If sLL <> sRL Then
          If m_dicLabels.Exists(Split(sLL, ":")(0)) Then
             alRes.Add Join(Array(m_sLFiNa, "Line", sLL))
             alRes.Add Join(Array(m_sRFiNa, "Line", sRL))
          End If
       End If
    Loop
    tsL.Close
    tsR.Close
    diffs = Join(alRes.ToArray(), vbCrLf)
  End Function ' diffs
End Class ' cDiffer

output:

cscript SelFileComp.vbs
--------- Res
Differences:
File-1 Line Price 3: $1234.00
File-2 Line Price 3: $5678.00
File-1 Line Term: 2-Year
File-2 Line Term: 3-Year
--------- Exp
Differences:
File-1 Line 4: Price 3: $1234.00
File-2 Line 4: Price 3: $5678.00
File-1 Line 7: Term: 2-Year
File-2 Line 7: Term: 3-Year

not ok

Selection done, formatting still not good. To improve the output:

       If sLL <> sRL Then
          If m_dicLabels.Exists(Split(sLL, ":")(0)) Then
'            alRes.Add Join(Array(m_sLFiNa, "Line", sLL))
'            alRes.Add Join(Array(m_sRFiNa, "Line", sRL))
             Dim sLiNo : sLiNo = CStr(tsL.Line - 1) & ":"
             alRes.Add Join(Array(m_sLFiNa, "Line", sLiNo, sLL))
             alRes.Add Join(Array(m_sRFiNa, "Line", sLiNo, sRL))
          End If
       End If
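
Why `tsL.Line - 1`: as far as I know, a TextStream's Line property holds the number of the line that will be read next, so right after a ReadLine() it is one ahead of the line just read. A minimal sketch to check this on your machine; the function name NumberLines and the temporary file are made up for the demonstration and are not part of SelFileComp.vbs:

```vbscript
Option Explicit

' Hypothetical helper: return every line of a file prefixed with its number,
' using the same "Line - 1 after ReadLine" idea as cDiffer.diffs()
Function NumberLines(sFSpec)
  Dim oFS  : Set oFS = CreateObject("Scripting.FileSystemObject")
  Dim ts   : Set ts  = oFS.OpenTextFile(sFSpec)
  Dim sRes : sRes = ""
  Do Until ts.AtEndOfStream
     Dim sLine : sLine = ts.ReadLine()
     ' Line already points past the line just read
     sRes = sRes & CStr(ts.Line - 1) & ": " & sLine & vbCrLf
  Loop
  ts.Close
  NumberLines = sRes
End Function

' Demo: write a two-line scratch file into the temp folder and number it
Dim goFS  : Set goFS = CreateObject("Scripting.FileSystemObject")
Dim sTmp  : sTmp = goFS.BuildPath(goFS.GetSpecialFolder(2).Path, goFS.GetTempName())
Dim tsOut : Set tsOut = goFS.CreateTextFile(sTmp, True)
tsOut.WriteLine "first"
tsOut.WriteLine "second"
tsOut.Close
WScript.Echo NumberLines(sTmp)
goFS.DeleteFile sTmp
```

The sketch writes its lines with WriteLine, so every line ends with a line break; I have not checked how Line behaves for a last line without one.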

To add the trailing vbCrLf:

'   diffs = Join(alRes.ToArray(), vbCrLf)
    diffs = Join(alRes.ToArray(), vbCrLf) & vbCrLf

Final output:

cscript SelFileComp.vbs
--------- Res
Differences:
File-1 Line 4: Price 3: $1234.00
File-2 Line 4: Price 3: $5678.00
File-1 Line 7: Term: 2-Year
File-2 Line 7: Term: 3-Year

ok

echo %ErrorLevel%
0
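
The question asked for an ignore list, while the cDiffer above selects the labels to report. Inverting the check is a small change; here is a minimal sketch of that idea, not a drop-in replacement. The function name DiffIgnoring and its parameters are made up, the file names are hard-coded for brevity, and the lines are held in arrays instead of files so that the script runs on its own:

```vbscript
Option Explicit

' Hypothetical helper: report differing lines unless their label
' (the text before the first ":") is in the ignore list
Function DiffIgnoring(aLeft, aRight, aIgnore)
  Dim dicIgnore : Set dicIgnore = CreateObject("Scripting.Dictionary")
  dicIgnore.CompareMode = vbTextCompare ' case-insensitive
  Dim sKey
  For Each sKey In aIgnore
      dicIgnore(sKey) = 0
  Next
  Dim sRes : sRes = "Differences:" & vbCrLf
  ' only compare as many lines as the shorter array has
  Dim nMax : nMax = UBound(aLeft)
  If UBound(aRight) < nMax Then nMax = UBound(aRight)
  Dim i, aParts, sLabel
  For i = 0 To nMax
      If aLeft(i) <> aRight(i) Then
         aParts = Split(aLeft(i), ":")
         sLabel = ""
         ' Split returns an empty array for an empty line
         If UBound(aParts) >= 0 Then sLabel = Trim(aParts(0))
         If Not dicIgnore.Exists(sLabel) Then
            sRes = sRes & "File-1 Line " & (i + 1) & ": " & aLeft(i) & vbCrLf
            sRes = sRes & "File-2 Line " & (i + 1) & ": " & aRight(i) & vbCrLf
         End If
      End If
  Next
  DiffIgnoring = sRes
End Function

' Demo with the sample data from the question
Dim aL : aL = Array( _
    "Date: 00/11/1234 / Order no: 1111", "Price 1: $1111.00" _
  , "Price 2: $2222.00", "Price 3: $1234.00", "ABC def GHI kjl 1111" _
  , "Order no: 1111", "Term: 2-Year", "Date: 00/11/1234" _
)
Dim aR : aR = Array( _
    "Date: 11/00/6789 and Order no: 2222", "Price 1: $1111.00" _
  , "Price 2: $2222.00", "Price 3: $5678.00", "ABC def GHI kjl 1111" _
  , "Order no: 2222", "Term: 3-Year", "Date: 11/00/6789" _
)
WScript.Echo DiffIgnoring(aL, aR, Array("Date", "Order no"))
```

With the ignore list "Date" and "Order no", only the Price 3 and Term lines should be reported. The trade-off: a select list stays silent about unexpected differences, while an ignore list reports everything you did not think of.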

Next problem, please!

Update A (wrt file specs/file names)

Move/Copy File-1 to ..\data\, change

  Dim oDiffer : Set oDiffer = New cDiffer.init( _
      "File-1", "File-2" _
    , Array("Price 3", "Term") _
  )

to

  Dim oDiffer : Set oDiffer = New cDiffer.init( _
      "..\data\File-1", "File-2" _
    , Array("Price 3", "Term") _
  )

results will be the same, because cDiffer uses

m_sLFSpec = sLFSpec
  to store the (full) path
m_sLFiNa  = goFS.GetBaseName(sLFSpec)
  to extract the file name for output formatting
Dim tsL   : Set tsL   = goFS.OpenTextFile(m_sLFSpec)
  to open the file
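
To see what GetBaseName() does with a file spec, a small sketch; the function name DisplayName and the second path are invented for the demonstration, and the files do not have to exist:

```vbscript
Option Explicit

Dim goFS : Set goFS = CreateObject("Scripting.FileSystemObject")

' Hypothetical wrapper: the name cDiffer would show in its output
Function DisplayName(sFSpec)
  ' GetBaseName drops the directory part and the extension
  DisplayName = goFS.GetBaseName(sFSpec)
End Function

WScript.Echo DisplayName("File-1")
WScript.Echo DisplayName("..\data\File-1")
WScript.Echo DisplayName("C:\some\dir\Report.TXT")
```

The first two calls give the same name, which is why moving File-1 does not change the result. Because the extension is dropped as well, two files that differ only in their extension would show up under the same name in the output.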

Update B (wrt dictionary)

A dictionary is a collection that stores elements under unique keys (as opposed to an array, which makes its items accessible via numbers). By using the labels to look for as keys of a dictionary, the diffs() function can efficiently (look ma, no loops!) check whether the first part of the line up to the :

  Split(sLL, ":")(0)

is contained in the dictionary

  If m_dicLabels.Exists(Split(sLL, ":")(0)) Then
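
A stand-alone sketch of that lookup, in case you want to play with it outside the class. The function names MakeLabelDict and LabelOf are made up here; LabelOf also guards against empty lines, because Split() returns an empty array for an empty string and indexing it with (0) would fail:

```vbscript
Option Explicit

' Hypothetical helper: put the labels into a case-insensitive dictionary
Function MakeLabelDict(aLabels)
  Dim dicRes : Set dicRes = CreateObject("Scripting.Dictionary")
  dicRes.CompareMode = vbTextCompare ' must be set while the dictionary is empty
  Dim sKey
  For Each sKey In aLabels
      dicRes(sKey) = 0 ' the value does not matter, only the key
  Next
  Set MakeLabelDict = dicRes
End Function

' Hypothetical helper: the part of a line before the first ":"
Function LabelOf(sLine)
  Dim aParts : aParts = Split(sLine, ":")
  If UBound(aParts) >= 0 Then
     LabelOf = aParts(0)
  Else
     LabelOf = "" ' empty line: no label
  End If
End Function

Dim dicLabels : Set dicLabels = MakeLabelDict(Array("Price 3", "Term"))
WScript.Echo CStr(dicLabels.Exists(LabelOf("Price 3: $1234.00")))
WScript.Echo CStr(dicLabels.Exists(LabelOf("term: 2-Year")))
WScript.Echo CStr(dicLabels.Exists(LabelOf("Order no: 1111")))
```

The first two lookups should succeed (the second one only because of vbTextCompare), the third should not.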

Update C (wrt classes/constructors)

A class is the definition/specification of a (set of similar) object(s), that is, a variable holding/combining data (members) and functionality (methods). cDiffer is a class defining objects that 'know' all about the files to compare and the labels to look for (member variables like m_sLFSpec) and can 'do' comparisons (methods/functions like diffs()). The New statement is used to construct/create objects according to specs:

  Dim oDiffer : Set oDiffer = New cDiffer

An object freshly created by New has all its member variables set to Empty, which makes it useless for practical purposes; you can implement a Class_Initialize() Sub (in the Class ... End Class block), but as that code would be used for all objects of the class, the gain is small.

If you look at the example in the Docs for the Class statement, you'll realize that the parameter-less 'constructor' (Class_Initialize) is of little use to programmers who aren't paid by line/hour. The boilerplate code

   Private Sub Class_Initialize
      m_CustomerName = ""
      m_OrderCount = 0
      ... ad nauseam: set all other member data to 'nix'
   End Sub

is especially disgusting, because VBScript

  • executes the equivalent of an empty Class_Initialize automatically as soon as you call New
  • initializes all variables to Empty automatically and Empty will
    work fine for strings and numbers

The remedy is to forget Class_Initialize (except for special cases) and invest some effort in one or more

  Public Function initXXX(p1, p2, ... pn)
    Set initXXX = Me  ' return VBScript's this to caller
    ... use p1 ... pn to initialize member data to useful values
  End Function
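
A complete toy example of the pattern; the class cPair and its members are invented for illustration. It also shows a two-step variant, which may help if your host rejects the chained `New cPair.init(...)` form, as QTP does according to the question; I have only the question's error message to go on and have not tried this in QTP:

```vbscript
Option Explicit

' Hypothetical class with an init() "constructor" that returns Me
Class cPair
  Private m_sLeft
  Private m_sRight
  Public Function init(sLeft, sRight)
    Set init = Me ' hand the initialized object back to the caller
    m_sLeft  = sLeft
    m_sRight = sRight
  End Function
  Public Function describe()
    describe = m_sLeft & " <-> " & m_sRight
  End Function
End Class

' chained form, as used in SelFileComp.vbs
Dim oA : Set oA = New cPair.init("File-1", "File-2")
WScript.Echo oA.describe()

' two-step form: create first, then initialize (return value not needed)
Dim oB : Set oB = New cPair
oB.init "File-1", "File-2"
WScript.Echo oB.describe()
```

Both objects end up in the same state; the chained form just saves a line.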
