yCalleecharan

Reputation: 4704

Trouble reading a data set into R

I am new to R and I am trying to read in a data set. The data set is here:

http://petitlien.fr/myfiles

(The above link expands to a GMX File storage folder link; click on Guest access to retrieve the file.)

The file named mydata.log has 32 entries with no header and it consists of 2 columns which are delimited by spaces.

I first tried the scan command:

test.frame<-scan(file="mydata.log",sep= "", nlines=32,blank.lines.skip=TRUE)

The above seems to have read only the first 3 rows:

head(test.frame)
[1]   0.0000   0.0000 144.3210   0.3400 159.4070   0.8925

I have tried also read.table:

test.frame<-read.table(file="mydata.log",sep= "", nrows=32,blank.lines.skip=TRUE)

This one seems to read only the first 6 lines, as shown below:

names(test.frame)
[1] "V1" "V2"
> head(test.frame)
   V1     V2
1   0.000 0.0000
2 144.321 0.3400
3 159.407 0.8925
4 198.413 0.9450
5 222.557 0.9975
6 235.464 1.0500

Does someone know how to read this data set properly?

A related question: Can I control the number of significant digits or perhaps decimal places in the data being read in?

Thanks a lot...

Upvotes: 1

Views: 175

Answers (2)

jans

Reputation: 265

Instead of nrow() as suggested, I would recommend str() ("structure") that gives you more useful information about your data set (class of variables etc). It's also a bit less cryptic....:)
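As a minimal sketch of the difference, the snippet below reads three rows copied from the question out of an inline string (the object name demo and the inline text are assumptions for illustration, standing in for mydata.log) and then calls both functions on the result:

```r
# Three rows copied from the question, read from an inline string
# instead of mydata.log so the sketch is self-contained.
demo <- read.table(text = "0.000 0.0000
144.321 0.3400
159.407 0.8925", sep = "")

str(demo)    # class, number of observations, and the type of each column
nrow(demo)   # the row count only
```

str() reports that demo is a data.frame with 3 observations of 2 numeric variables, so it confirms both the row count and that the columns were parsed as numbers.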

Upvotes: 1

Xu Wang

Reputation: 10607

This line of your code works perfectly:

test.frame<-read.table(file="mydata.log",sep= "", nrows=32,blank.lines.skip=TRUE)

You see only 6 lines in your output because head shows just the first 6 rows by default; the whole file was read. To view all lines, just enter the name of your object:

> test.frame
           V1     V2
1       0.000 0.0000
2     144.321 0.3400
3     159.407 0.8925
4     198.413 0.9450
5     222.557 0.9975
6     235.464 1.0500
7     296.918 1.1025
8     346.773 1.1550
9     442.955 1.2075
10    694.879 1.2600
11    892.436 1.3125
12   1492.970 1.3650
13   2916.960 1.4175
14   3596.060 1.4700
15   5278.950 1.5225
16   7480.730 1.5750
17  12259.800 1.6275
18  14032.600 1.6800
19  19565.600 1.7325
20  31427.700 1.7850
21  58221.400 1.8375
22  92283.900 1.9900
23 165601.000 1.9425
24 165703.000 1.9950
25 213925.000 2.8750
26 260381.000 2.1000
27 312701.000 2.1525
28 370853.000 2.2050
29 479303.000 2.2575
30 487265.000 2.3100
31 545225.000 2.3625
32 703186.000 2.4150
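The scan attempt in the question most likely read everything as well. scan returns one flat vector rather than a table, so the six values that head printed are just the first three rows laid end to end. Here is a small sketch of that, using four rows from the question in an inline string (the names txt, v, m and df are assumptions for illustration):

```r
# Four rows from the question as inline text, standing in for mydata.log.
txt <- "0.000 0.0000
144.321 0.3400
159.407 0.8925
198.413 0.9450"

v <- scan(text = txt, sep = "", quiet = TRUE)  # flat numeric vector
length(v)                                       # 8 values = 4 rows x 2 columns

m <- matrix(v, ncol = 2, byrow = TRUE)          # restore the row/column layout
df <- read.table(text = txt, sep = "")          # data frame with columns V1, V2

head(df, n = 2)   # head() defaults to n = 6; n sets how many rows are shown
```

For a two-column file like this one, read.table is usually the more convenient choice because it keeps the column structure without the extra matrix step.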

Here is an easy way to see how many rows you have (useful when you have many observations):

> nrow(test.frame)
[1] 32

As for the number of digits, see the round function. To open the documentation for any function, enter a ? followed by its name, in this case: ?round

#note that you do not have to put "digits=2", you can just put "2", but this way is clearer
> rounded_test.frame <- round(test.frame, digits=2)
> rounded_test.frame
          V1   V2
1       0.00 0.00
2     144.32 0.34
3     159.41 0.89
4     198.41 0.94
5     222.56 1.00
6     235.46 1.05
7     296.92 1.10
8     346.77 1.16
9     442.95 1.21
10    694.88 1.26
11    892.44 1.31
12   1492.97 1.36
13   2916.96 1.42
14   3596.06 1.47
15   5278.95 1.52
16   7480.73 1.57
17  12259.80 1.63
18  14032.60 1.68
19  19565.60 1.73
20  31427.70 1.78
21  58221.40 1.84
22  92283.90 1.99
23 165601.00 1.94
24 165703.00 2.00
25 213925.00 2.88
26 260381.00 2.10
27 312701.00 2.15
28 370853.00 2.21
29 479303.00 2.26
30 487265.00 2.31
31 545225.00 2.36
32 703186.00 2.42

Note in the above I created a new object instead of replacing the current one. If you want to replace the current one and lose the data forever (until you reload the dataset of course!), then you can use this line instead:

test.frame <- round(test.frame, digits=2)

If you don't want to change the stored values, you might just be interested in viewing rounded numbers. You can do this with the following command (note that digits here sets the number of significant digits, not decimal places):

print(test.frame,digits=2)
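Since the question asked about both significant digits and decimal places, here is a rough side-by-side sketch of the options. The two-row data frame x is an assumption for illustration, built from values in the listing above:

```r
# Two rows taken from the listing above, typed in directly.
x <- data.frame(V1 = c(144.321, 1492.970), V2 = c(0.3400, 1.3650))

round(x, digits = 1)    # decimal places; returns a changed copy
signif(x, digits = 3)   # significant digits; returns a changed copy
print(x, digits = 3)    # display only; x itself is untouched

format(x$V1, nsmall = 2)  # character strings with at least 2 decimal places
```

round and signif return new values that you would have to assign to keep, while print and format only affect how the numbers are shown. None of these is applied while reading the file; the data is read at full precision and trimmed afterwards.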

Upvotes: 2
