LearneR

Reputation: 2531

How to remove the extra commas from a csv file?

I am trying to load a csv file in R with the read.transactions() function from the arules package.

When opened in Notepad++, the csv file shows extra trailing commas in place of the missing values, so I have to delete those commas manually before passing the file to read.transactions(). For example, the actual csv file looks like this in Notepad++:

D115,DX06,Slz,,,,
HC,,,,,,
DX06,,,,,,
DX17,PG,,,,,
DX06,RT,Dty,Dtcr,,

I want it to look like this when it is passed to read.transactions():

D115,DX06,Slz
HC
DX06
DX17,PG
DX06,RT,Dty,Dtcr

Is there any way to make that change within read.transactions() itself, or some other way? Also, the extra commas are not visible in R (the output shown above is from Notepad++).

So how can we remove them in R when we can't even see them?

Upvotes: 0

Views: 5499

Answers (1)

James Trimble

Reputation: 1866

A simple way to create a new file without the trailing commas is:

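# Read the raw file as plain text, one string per line
file_lines <- readLines("input.txt")
# Strip the trailing commas from each line and write the result to a new file
writeLines(gsub(",+$", "", file_lines),
           "without_commas.txt")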

In the gsub command, ",+$" matches one or more (+) commas (,) at the end of a line ($).
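To see the effect without touching any files, here is a minimal sketch that applies the same pattern to the sample lines from the question. The names `cleaned` and `out_path` are made up for illustration, and the temporary file just stands in for your real csv.

```r
# Sample lines copied from the question
file_lines <- c(
  "D115,DX06,Slz,,,,",
  "HC,,,,,,",
  "DX06,,,,,,",
  "DX17,PG,,,,,",
  "DX06,RT,Dty,Dtcr,,"
)

# Remove the run of commas at the end of each line
cleaned <- gsub(",+$", "", file_lines)
print(cleaned)

# Write the cleaned lines to a temporary file (hypothetical path)
# and read them back to confirm nothing else changed
out_path <- tempfile(fileext = ".csv")
writeLines(cleaned, out_path)
stopifnot(identical(readLines(out_path), cleaned))
```

Assuming arules is installed, the cleaned file could then probably be loaded with something like read.transactions(out_path, format = "basket", sep = ","), but I have not run that against your data.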

Since you're using Notepad++, you could also do the substitution there: open Search > Replace, set Search Mode to Regular expression, and replace ,+$ with nothing.

Upvotes: 3
