Reputation: 1822
I'm trying to read data into R from a text file so that I can plot it:
coupling <- read.table("~/table.format",stringsAsFactors = FALSE, sep='\t')
A row from this table looks as follows:
133 0.0116, 0.0226, 0.0236, 0.0244, 0.0264, 0.0124, 0.013, 0.014, 0.0158, 0.034, 0.0348, 0.0356, 0.0372 329777.0, -236464.0, -348470.0, -554708.0, -471896.0, 538782.0, 695291.0, 812729.0, 983141.0, 208212.0, 214012.0, 366636.0, 343232.0
The columns (residue, delay, height) are separated by tabs, and the values within each column are separated by ','. I would now like to plot height vs delay, so I attempt to assign the columns to variables:
xdata <- c(coupling[1,2])
ydata <- c(coupling[1,3])
However, if I try to plot with plot(xdata,ydata), I get the following error and warnings:
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
Printing xdata (and ydata) gives a variable of the form:
xdata
[1] "0.0116, 0.0226, 0.0236, 0.0244, 0.0264, 0.0124, 0.013, 0.014, 0.0158, 0.034, 0.0348, 0.0356, 0.0372 "
Presumably R can't plot this with the quotes in place. I've tried a few alternatives to get around this, but none of them has worked:
newxdata <-as.numeric(xdata)
This returns a warning, and the result is NA:
Warning message:
NAs introduced by coercion
Print gets me close:
print(xdata,quote=FALSE)
This seems to do the trick; the output loses the quotes:
[1] 0.0116, 0.0226, 0.0236, 0.0244, 0.0264, 0.0124, 0.013, 0.014, 0.0158, 0.034, 0.0348, 0.0356, 0.0372
But if I assign it to a variable, the quotes reappear and I still can't plot the data:
newxdata <- c(print(xdata,quote=FALSE))
newxdata
[1] "0.0116, 0.0226, 0.0236, 0.0244, 0.0264, 0.0124, 0.013, 0.014, 0.0158, 0.034, 0.0348, 0.0356, 0.0372 "
How can I get around this problem?
Upvotes: 3
Views: 157
Reputation: 887501
You may also use scan (data from @LyzandeR's post):
scan(text=a, what=numeric(), sep=",", quiet=TRUE)
#[1] 0.0116 0.0226 0.0236 0.0244 0.0264 0.0124 0.0130 0.0140 0.0158 0.0340
#[11] 0.0348 0.0356 0.0372
You could also use scan directly to read it from the file with sep=",":
scan("~/table.format", what=numeric(), sep=",", quiet=TRUE) #not tested
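Since the file in the question also has tab-separated columns, one possible approach is to keep the read.table call and apply scan to each cell afterwards. This is only a sketch: the sample row is shortened, the text= argument stands in for the file path, and parse_cell is a hypothetical helper name, not something from the original posts.

```r
# One sample row in the question's layout: residue <TAB> delays <TAB> heights.
# The values are shortened for illustration.
row_text <- "133\t0.0116, 0.0226, 0.0236\t329777.0, -236464.0, -348470.0"

# Read the tab-separated columns; the comma-separated cells stay as strings.
# (text = stands in for the "~/table.format" path used in the question.)
coupling <- read.table(text = row_text, sep = "\t", stringsAsFactors = FALSE)

# Hypothetical helper: parse one comma-separated cell into a numeric vector.
parse_cell <- function(cell) {
  scan(text = cell, what = numeric(), sep = ",", quiet = TRUE)
}

delay  <- parse_cell(coupling[1, 2])
height <- parse_cell(coupling[1, 3])
```

Here delay and height are ordinary numeric vectors of equal length, so they can be passed to plot as they are.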
Upvotes: 2
Reputation: 37879
You need some modifications first and then it will work. The quotes appear because you have a character vector of length 1 (one long string), which you need to convert into a numeric vector of length 13:
#initial data set: character vector of length 1
a <- "0.0116, 0.0226, 0.0236, 0.0244, 0.0264, 0.0124, 0.013, 0.014, 0.0158, 0.034, 0.0348, 0.0356, 0.0372 "
#function to trim leading and trailing spaces **see bottom of answer
trim <- function (x) gsub("^\\s+|\\s+$", "", x)
#first use strsplit to split the long string into separate string elements
#that are comma separated.
#Then use trim on each element to remove leading and trailing spaces
b <- trim(strsplit(a, ',')[[1]])
#finally use as.numeric to convert to numbers
c <- as.numeric(b)
Variable c can now be used in plot
Output:
> c
[1] 0.0116 0.0226 0.0236 0.0244 0.0264 0.0124 0.0130 0.0140 0.0158 0.0340 0.0348 0.0356 0.0372
Function trim was taken from here
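Putting the split, trim and convert steps above together for both columns, a minimal sketch of the plot from the question could look like this. The two strings are shortened stand-ins for coupling[1,2] and coupling[1,3], to_numeric is a hypothetical helper name, and base R's trimws (available since R 3.2.0) is used in place of the trim function above.

```r
# Shortened stand-ins for coupling[1, 2] and coupling[1, 3] from the question.
delay_str  <- "0.0116, 0.0226, 0.0236, 0.0244 "
height_str <- "329777.0, -236464.0, -348470.0, -554708.0"

# Hypothetical helper: split on commas, trim whitespace, convert to numeric.
to_numeric <- function(s) {
  as.numeric(trimws(strsplit(s, ",")[[1]]))
}

delay  <- to_numeric(delay_str)
height <- to_numeric(height_str)

# Both arguments are now numeric vectors of equal length,
# so plot() can compute finite axis limits.
plot(delay, height, xlab = "delay", ylab = "height")
```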
EDIT
As per @zero323's comment, you don't even need to trim the character vector, because as.numeric accepts leading and trailing whitespace. So, this works fine in one call:
> as.numeric(strsplit(a, ',')[[1]])
[1] 0.0116 0.0226 0.0236 0.0244 0.0264 0.0124 0.0130 0.0140 0.0158 0.0340 0.0348 0.0356 0.0372
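To make the difference with the failed as.numeric(xdata) attempt from the question concrete, here is a small sketch on a shortened string (the variable names are just for illustration). Converting the whole string fails because it is not a single number, while the pieces produced by strsplit convert cleanly even with their surrounding spaces.

```r
whole <- "0.0116, 0.0226 "

# The entire string is not one number, so the result is NA
# (the warning is suppressed here to keep the output quiet).
whole_converted <- suppressWarnings(as.numeric(whole))

# After splitting on commas, each piece is a single number padded with spaces.
pieces <- strsplit(whole, ",")[[1]]

# as.numeric() accepts the surrounding whitespace, so no trimming is needed.
values <- as.numeric(pieces)
```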
Upvotes: 3