pbj

Reputation: 719

Remove double quotes in the content of text files

I am working on a legacy application whose source code is all in VB.NET. I check whether a file exists and, if it does, I want to remove every " from the contents of the file. For instance, "text" should become text. I am using the code below.

vb.net

Dim FileFullPath As String
FileFullPath = "\\Fileshare\text\sample.txt"

If File.Exists(FileFullPath) Then
    Dim stripquote As String = FileFullPath
    stripquote = stripquote.Replace("""", "").Trim()
Else
    '
End If
    

I get no errors, but the " characters are not removed from the content of the file.

Data:

ID, Date, Phone, Comments
1,05/13/2021,"123-000-1234","text1"
2,05/13/2021,"123-000-2345","text2"
3,05/13/2021,"123-000-3456","text2"

Output:

1,05/13/2021,123-000-1234,text1
2,05/13/2021,123-000-2345,text2
3,05/13/2021,123-000-3456,text2

Upvotes: 1

Views: 657

Answers (3)

jmcilhinney

Reputation: 54417

The best way to go about this depends on the potential size of the file. If the file is relatively small then there's no point processing it line by line and certainly not using a TextFieldParser. Just read the data in, process it and write it out:

File.WriteAllText(FileFullPath,
                  File.ReadAllText(FileFullPath).
                       Replace(ControlChars.Quote, String.Empty))

Only if the file is potentially large and reading it all in one go would require too much memory should you consider processing it line by line. In that case, I'd go this way:

'Let the system create a temp file.
Dim tempFilePath = Path.GetTempFileName()

'Open the temp file for writing text.
Using tempFile As New StreamWriter(tempFilePath)
    'Open the source file and read it line by line.
    For Each line In File.ReadLines(FileFullPath)
        'Remove double-quotes from the current line and write the result to the temp file.
        tempFile.WriteLine(line.Replace(ControlChars.Quote, String.Empty))
    Next
End Using

'Overwrite the source file with the temp file.
File.Move(tempFilePath, FileFullPath, True)

Note the use of File.ReadLines rather than File.ReadAllLines. The former reads only one line at a time as you enumerate it, whereas the latter reads every line into an array before you can process any of them.
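As a rough illustration of that difference, here is a minimal sketch that counts the lines containing a double-quote in both ways. The module and function names are hypothetical and exist only for this example.

```vbnet
Imports System.IO

Module ReadLinesSketch 'Hypothetical name, for illustration only.

    'Streams the file: each line is read from disk only when the loop asks for it.
    Function CountQuotedLinesLazy(path As String) As Integer
        Dim count = 0
        For Each line In File.ReadLines(path)
            If line.Contains("""") Then count += 1
        Next
        Return count
    End Function

    'Loads the whole file into a String() first, then loops over the array.
    Function CountQuotedLinesEager(path As String) As Integer
        Dim count = 0
        For Each line In File.ReadAllLines(path)
            If line.Contains("""") Then count += 1
        Next
        Return count
    End Function

End Module
```

Both functions return the same number; only the peak memory use differs, which matters for large files.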

EDIT:

Note that this:

File.Move(tempFilePath, FileFullPath, True)

only works in .NET Core 3.0 and later, including .NET 5.0. If you're targeting .NET Framework then you have three other options:

  1. Delete the original file (File.Delete) and then move the temp file into its place (File.Move).
  2. Copy the temp file over the original (File.Copy with overwrite set to True) and then delete the temp file (File.Delete).
  3. Call My.Computer.FileSystem.MoveFile to move the temp file and overwrite the original file in one go.
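The three .NET Framework options above could be sketched roughly like this. The module and method names are hypothetical, and the sketch assumes a project type where the My namespace is available (for option 3).

```vbnet
Imports System.IO

Module OverwriteSketch 'Hypothetical name, for illustration only.

    'Option 1: delete the original, then move the temp file into its place.
    Sub DeleteThenMove(tempFilePath As String, targetPath As String)
        File.Delete(targetPath) 'Does nothing if the file does not exist.
        File.Move(tempFilePath, targetPath)
    End Sub

    'Option 2: copy over the original (third argument is overwrite), then delete the temp file.
    Sub CopyThenDelete(tempFilePath As String, targetPath As String)
        File.Copy(tempFilePath, targetPath, True)
        File.Delete(tempFilePath)
    End Sub

    'Option 3: move and overwrite in a single call.
    Sub MoveWithOverwrite(tempFilePath As String, targetPath As String)
        My.Computer.FileSystem.MoveFile(tempFilePath, targetPath, True)
    End Sub

End Module
```

Option 1 leaves a short window in which the original file no longer exists and the replacement has not yet arrived, so options 2 or 3 may be preferable if another process could be reading the file.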

Upvotes: 1

Andrew Morton

Reputation: 25023

You can read each line of the file, remove the double-quotes, and write the result to a temporary file. When all the lines are done, delete the original and move the temporary file to the original filename:

Imports System.IO
'...
Sub RemoveDoubleQuotes(filename As String)
    Dim tmpFilename = Path.GetTempFileName()
    Using sr As New StreamReader(filename)
        Using sw As New StreamWriter(tmpFilename)
            While Not sr.EndOfStream
                sw.WriteLine(sr.ReadLine().Replace("""", ""))
            End While
        End Using
    End Using

    File.Delete(filename)
    File.Move(tmpFilename, filename)

End Sub

Add error handling as desired.
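One possible shape for that error handling is sketched below; it is only an assumption about what "as desired" might mean here. The function name and the choice to return a Boolean are hypothetical, and only I/O and permission errors are caught.

```vbnet
Imports System.IO

Module RemoveQuotesSketch 'Hypothetical name, for illustration only.

    'Returns True on success and False if an I/O or permission error occurred.
    Function TryRemoveDoubleQuotes(filename As String) As Boolean
        Dim tmpFilename = Path.GetTempFileName()
        Try
            Using sr As New StreamReader(filename)
                Using sw As New StreamWriter(tmpFilename)
                    While Not sr.EndOfStream
                        sw.WriteLine(sr.ReadLine().Replace("""", ""))
                    End While
                End Using
            End Using

            File.Delete(filename)
            File.Move(tmpFilename, filename)
            Return True
        Catch ex As Exception When TypeOf ex Is IOException OrElse
                                   TypeOf ex Is UnauthorizedAccessException
            'Only discard the temp file if the original is still in place;
            'if the original was already deleted, the temp file is the only copy.
            If File.Exists(filename) AndAlso File.Exists(tmpFilename) Then
                File.Delete(tmpFilename)
            End If
            Return False
        End Try
    End Function

End Module
```

A caller could then log or retry when the function returns False, for example when the file on the share is locked by another process.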

Upvotes: 2

dbasnett

Reputation: 11773

TextFieldParser is probably the way to go.

Your code with a few changes.

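    'A one-character string holding a single double-quote.
    Static doubleQ As String = New String(ControlChars.Quote, 1)

    Dim FileFullPath As String
    FileFullPath = "\\Fileshare\text\sample.txt"

    If IO.File.Exists(FileFullPath) Then
        'Read the file's contents, not its path.
        Dim stripquote As String = IO.File.ReadAllText(FileFullPath)
        stripquote = stripquote.Replace(doubleQ, "").Trim()
        'Write the result back, otherwise the file on disk is unchanged.
        IO.File.WriteAllText(FileFullPath, stripquote)
    Else
        '
    End If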

Note the static declaration. I adopted this approach because it confused the heck out of me.
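For the TextFieldParser route mentioned at the top of this answer, a minimal sketch might look like the following. It is not code from the answer: the module and method names are hypothetical, and it assumes a comma-delimited file like the sample data in the question, with no commas inside the quoted fields (once the quotes are dropped, such a field would no longer be distinguishable from two fields).

```vbnet
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module TextFieldParserSketch 'Hypothetical name, for illustration only.

    Sub RemoveQuotesWithParser(filename As String)
        Dim tmpFilename = Path.GetTempFileName()

        Using parser As New TextFieldParser(filename)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            'Treat "..." as a single field and strip the enclosing quotes.
            parser.HasFieldsEnclosedInQuotes = True

            Using sw As New StreamWriter(tmpFilename)
                While Not parser.EndOfData
                    'ReadFields returns the fields of the next line without their quotes.
                    Dim fields As String() = parser.ReadFields()
                    sw.WriteLine(String.Join(",", fields))
                End While
            End Using
        End Using

        'Replace the original with the rewritten copy.
        File.Delete(filename)
        File.Move(tmpFilename, filename)
    End Sub

End Module
```

Note that TextFieldParser skips blank lines, trims whitespace around fields by default, and throws a MalformedLineException for lines it cannot parse, so the output may differ from a plain character replacement in those cases.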

Upvotes: 1
