Reputation: 55
I am trying to produce a scatterplot from a .txt file that has 25 rows and is structured like this:
age income weight
33 63 180
25 72 220
However, when I try to convert it to a csv and then produce a scatterplot with the following code:
my_input <- read.csv2('dataInput.txt', sep = '\t', header = T)
plot(x = my_input$ageX, y = my_input$weightY)
I get an error message. I also notice that there is now a period between 'age', 'income' and 'weight', which I don't understand, since I would expect to get a comma between them. The error message is as follows:
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Any ideas on how to actually get a scatterplot of the data?
Edit: executing
head(my_input)
age. income. weight
1 56 63 185
2 38 72 156
3 28 75 178
4 49 59 205
5 69 65 235
6 19 70 195
Edit:
str(my_input)
age.income.weight: Factor w/ 18 levels "56 63 185",..: 1 2 3 4 5 6 7 8 9 10 ...
summary(my_input)
age.income.weight
56 63 185: 1
38 72 156: 1
28 75 178: 1
49 59 205: 1
69 65 235: 1
19 70 195: 1
(Other) :19
Upvotes: 1
Views: 2221
Reputation: 16178
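Based on the edits to your question, the problem is in how the txt file is loaded. The str output shows a single factor column named age.income.weight, so read.csv2 with sep = '\t' found no tabs and read each line as one value. Looking at the structure of the text file, the values are separated by a variable number of spaces, not by a consistent delimiter.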
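To make the diagnosis concrete, here is a minimal sketch that reproduces the symptom. It assumes a small hypothetical file (three rows borrowed from the head() output in the question, with several spaces between values); the real file may differ in detail.

```r
# Hypothetical stand-in for the real file: single-spaced header,
# several spaces between the values, no tabs anywhere.
tmp <- tempfile(fileext = ".txt")
writeLines(c("age income weight",
             "56     63     185",
             "38     72     156",
             "28     75     178"), tmp)

# Same call as in the question: the tab separator never matches,
# so every line ends up in one single column.
bad <- read.csv2(tmp, sep = "\t", header = TRUE)

ncol(bad)    # one column only
names(bad)   # spaces in the header are replaced by periods
bad$ageX     # NULL, because no column has that name
```

Since `bad$ageX` is NULL, plot() has no values to compute axis limits from, which would explain the "need finite 'xlim' values" error. The periods come from R turning the header line into one syntactically valid column name.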
So, one way to get it to work is to build the data frame from scratch by reading the file with readLines:
my_input <- readLines("crime_input.txt")
my_input <- unlist(strsplit(my_input," "))
Now you can see that the file contains many consecutive spaces, which show up as empty strings:
> my_input
[1] "age" "income" "crimes" "16" "" "" "" "" "63" "" "" ""
[13] "" "23" "18" "" "" "" "" "72" "" "" "" ""
[25] "25" "18" "" "" "" "" "75" "" "" "" "" "22"
[37] "19" "" "" "" "" "59" "" "" "" "" "16" "19"
[49] "" "" "" "" "65" "" "" "" "" "19" "19" ""
[61] "" "" "" "70" "" "" "" "" "19" "20" "" ""
[73] "" "" "78" "" "" "" "" "18" "21" "" "" ""
[85] "" "35" "" "" "" "" "11" "21" "" "" "" ""
[97] "53" "" "" "" "" "15" "23" "" "" "" "" "28"
[109] "" "" "" "" "" "9" "27" "" "" "" "" "56"
[121] "" "" "" "" "16" "28" "" "" "" "" "52" ""
[133] "" "" "" "14" "29" "" "" "" "" "63" "" ""
[145] "" "" "25" "30" "" "" "" "" "46" "" "" ""
[157] "" "17" "30" "" "" "" "" "55" "" "" "" ""
[169] "19" "31" "" "" "" "" "29" "" "" "" "" ""
[181] "8" "32" "" "" "" "" "55" "" "" "" "" "22"
[193] "32" "" "" "" "" "62" "" "" "" "" "25"
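As a side note, the empty strings appear because strsplit with " " splits at every single space. A possible variation, sketched here on two hypothetical lines in the same layout, is to split on runs of whitespace instead, so that no empty strings are produced in the first place:

```r
# Hypothetical lines in the same layout as the file above
lines <- c("age income crimes",
           "16     63     23",
           "18     72     25")

# "[[:space:]]+" matches one or more whitespace characters, so a run of
# spaces counts as a single separator; trimws() drops leading/trailing blanks
tokens <- unlist(strsplit(trimws(lines), "[[:space:]]+"))
tokens
```

The rest of the steps work the same way on `tokens`; only the header names still need to be dropped or converted.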
So, we can convert everything to numeric (the column names and the empty strings become NA, with a coercion warning) and then remove the NA values:
my_input <- as.numeric(my_input)
my_input <- my_input[!is.na(my_input)]
This gives:
> my_input
[1] 16 63 23 18 72 25 18 75 22 19 59 16 19 65 19 19 70 19 20 78 18 21 35 11 21 53 15 23 28 9 27 56 16 28 52 14
[37] 29 63 25 30 46 17 30 55 19 31 29 8 32 55 22 32 62 25
Next, we can fill a 3-row matrix with this vector (matrix() fills column by column, so each column holds one record):
my_input <- matrix(my_input, nrow = 3, ncol = length(my_input)/3)
> my_input
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
[1,] 16 18 18 19 19 19 20 21 21 23 27 28 29 30 30 31 32 32
[2,] 63 72 75 59 65 70 78 35 53 28 56 52 63 46 55 29 55 62
[3,] 23 25 22 16 19 19 18 11 15 9 16 14 25 17 19 8 22 25
Now, we can transpose the matrix, convert it to a data.frame and set the column names:
my_input <- as.data.frame(t(my_input))
colnames(my_input) <- c("age","income","crimes")
And finally, you get:
> head(my_input)
age income crimes
1 16 63 23
2 18 72 25
3 18 75 22
4 19 59 16
5 19 65 19
6 19 70 19
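The fill-then-transpose steps can also be written in one go with byrow = TRUE. This is only a sketch of an equivalent route, using the first nine values of the vector above; `values` and `my_df` are names made up for the illustration.

```r
# First three records of the cleaned vector (age, income, crimes)
values <- c(16, 63, 23, 18, 72, 25, 18, 75, 22)

# byrow = TRUE fills the matrix row by row, so every row is one record
# and no transpose is needed afterwards
my_df <- as.data.frame(matrix(values, ncol = 3, byrow = TRUE))
colnames(my_df) <- c("age", "income", "crimes")
my_df
```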
And if you check the structure of my_input:
> str(my_input)
'data.frame': 18 obs. of 3 variables:
$ age : num 16 18 18 19 19 19 20 21 21 23 ...
$ income: num 63 72 75 59 65 70 78 35 53 28 ...
$ crimes: num 23 25 22 16 19 19 18 11 15 9 ...
So, now, you can plot it:
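# Sort by age so the connecting lines drawn by type = "b" run left to right
my_input <- my_input[order(my_input$age), ]
# type = "b" draws points and lines; omit it for a plain scatterplot
plot(x = my_input$age, y = my_input$crimes, type = "b")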
Now you can work with this data. I hope this helps you solve the issue.
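A shorter route that may be worth trying first: read.table uses sep = "" by default, which treats any run of spaces or tabs as a single separator. The sketch below assumes the file has a header line and exactly three values per row; the temporary file is a hypothetical stand-in built from the rows shown in the question.

```r
# Hypothetical stand-in for the real file, with irregular spacing
tmp <- tempfile(fileext = ".txt")
writeLines(c("age income weight",
             "56     63     185",
             "38     72     156",
             "28     75     178"), tmp)

# Default sep = "" splits on any amount of whitespace
my_input <- read.table(tmp, header = TRUE)
str(my_input)
```

If this gives three numeric columns on the real file, `plot(x = my_input$age, y = my_input$weight)` should produce the scatterplot directly.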
Upvotes: 1