Reputation: 35
I have several years of daily discharge data provided in fixed-width ".txt" format as follows:
Year: 1962
Mean daily discharge in m3/s
============================
Day Jan. Feb. Mar. Apr. May Jun. Jul. Aug. Sep. Oct. Nov. Dec. Year
01 NA NA 169 205 458 472 975 2010 1310 755 307 214
02 NA NA 163 205 467 462 952 2590 1230 800 307 211
03 NA NA 160 198 410 415 946 2340 1220 687 303 211
04 NA NA 158 189 396 392 946 2200 1160 675 303 208
05 NA NA 195 180 365 467 1040 2090 1230 652 299 205
06 NA NA 231 192 335 508 1150 2030 1260 634 295 205
07 NA NA 202 195 311 566 1250 1970 1230 628 291 202
08 NA NA 198 192 291 640 1220 1860 1160 611 288 202
09 NA NA 195 195 284 813 1100 1800 1180 588 280 198
10 NA NA 192 192 273 981 1050 1690 1260 577 273 198
11 NA NA 192 195 269 1030 946 1570 1160 555 269 195
12 NA NA 189 189 265 1000 1100 1850 1110 539 265 192
13 NA NA 189 186 262 1050 1040 1830 1090 518 265 192
14 NA NA 186 198 269 1060 934 1790 1180 503 262 189
15 NA NA 192 205 280 1150 1300 1900 1120 492 258 186
16 NA NA 192 211 295 1170 1470 2080 1160 467 255 183
17 NA NA 192 205 315 877 1490 2140 1260 453 255 183
18 NA NA 192 208 311 807 3030 1990 1130 448 251 180
19 NA NA 192 214 284 774 1360 1940 1700 434 248 180
20 NA NA 189 238 277 915 1310 1880 1830 429 244 178
21 NA NA 189 238 303 1020 1240 1780 1640 415 241 175
22 NA NA 186 277 319 1200 1200 1420 1860 406 238 175
23 NA 160 183 269 323 1280 1120 1350 1940 392 234 172
24 NA 163 183 315 357 1360 1160 1340 1650 369 231 169
25 NA 178 186 361 365 1380 1260 1570 1420 357 227 169
26 NA 169 189 361 365 1340 1360 1730 1220 344 224 169
27 NA 169 189 410 365 1300 1810 1880 1130 335 221 166
28 NA 169 192 434 387 1250 2090 1640 1040 327 218 166
29 NA 195 419 419 1210 1800 1550 981 311 218 163
30 NA 202 387 415 1150 1800 1500 969 311 214 163
31 NA 205 438 1850 1690 307 163
Min NA NA 158 180 262 392 934 1340 969 307 214 163 NA
Mean NA NA 189 249 338 936 1330 1840 1290 494 259 186 NA
Max NA NA 231 434 467 1380 3030 2590 1940 800 307 214 NA
My goal is to convert it into time series. I have tried to use read.fwf() to read the file and pivot_longer() function to acquire the time series. But the problem is I couldn't specify the number of days in each month thus got 372 days. Since I am a newbie to r I am not familiar with many r-packages. Could anyone here suggest a simpler way to achieve my goal? Thank you!
Upvotes: 0
Views: 311
Reputation: 26218
Do it like this
df <- read.table(text = "Day Jan. Feb. Mar. Apr. May. Jun. Jul. Aug. Sep. Oct. Nov. Dec.
01 NA NA 169 205 458 472 975 2010 1310 755 307 214
02 NA NA 163 205 467 462 952 2590 1230 800 307 211
03 NA NA 160 198 410 415 946 2340 1220 687 303 211
04 NA NA 158 189 396 392 946 2200 1160 675 303 208
05 NA NA 195 180 365 467 1040 2090 1230 652 299 205
06 NA NA 231 192 335 508 1150 2030 1260 634 295 205
07 NA NA 202 195 311 566 1250 1970 1230 628 291 202
08 NA NA 198 192 291 640 1220 1860 1160 611 288 202
09 NA NA 195 195 284 813 1100 1800 1180 588 280 198
10 NA NA 192 192 273 981 1050 1690 1260 577 273 198
11 NA NA 192 195 269 1030 946 1570 1160 555 269 195
12 NA NA 189 189 265 1000 1100 1850 1110 539 265 192
13 NA NA 189 186 262 1050 1040 1830 1090 518 265 192
14 NA NA 186 198 269 1060 934 1790 1180 503 262 189
15 NA NA 192 205 280 1150 1300 1900 1120 492 258 186
16 NA NA 192 211 295 1170 1470 2080 1160 467 255 183
17 NA NA 192 205 315 877 1490 2140 1260 453 255 183
18 NA NA 192 208 311 807 3030 1990 1130 448 251 180
19 NA NA 192 214 284 774 1360 1940 1700 434 248 180
20 NA NA 189 238 277 915 1310 1880 1830 429 244 178
21 NA NA 189 238 303 1020 1240 1780 1640 415 241 175
22 NA NA 186 277 319 1200 1200 1420 1860 406 238 175
23 NA 160 183 269 323 1280 1120 1350 1940 392 234 172
24 NA 163 183 315 357 1360 1160 1340 1650 369 231 169
25 NA 178 186 361 365 1380 1260 1570 1420 357 227 169
26 NA 169 189 361 365 1340 1360 1730 1220 344 224 169
27 NA 169 189 410 365 1300 1810 1880 1130 335 221 166
28 NA 169 192 434 387 1250 2090 1640 1040 327 218 166
29 NA NA 195 419 419 1210 1800 1550 981 311 218 163
30 NA NA 202 387 415 1150 1800 1500 969 311 214 163
31 NA NA 205 NA 438 NA 1850 1690 NA 307 NA 163
", header = T)
df %>% pivot_longer(!Day, values_drop_na = T) %>%
mutate(date = as.Date(paste0('2017','.', name, Day ), format = '%Y.%b.%d')) %>%
select(-Day, name) %>%
complete(date = seq.Date(as.Date('2017-01-01'), as.Date('2017-12-31'), by = '1 day'))
# A tibble: 365 x 3
date name value
<date> <chr> <int>
1 2017-01-01 NA NA
2 2017-01-02 NA NA
3 2017-01-03 NA NA
4 2017-01-04 NA NA
5 2017-01-05 NA NA
6 2017-01-06 NA NA
7 2017-01-07 NA NA
8 2017-01-08 NA NA
9 2017-01-09 NA NA
10 2017-01-10 NA NA
# ... with 355 more rows
Upvotes: 1
Reputation: 1625
If your data is stored in test.dat
you can try:
library(tidyverse)
df <- read_table("test.dat") %>%
pivot_longer(Jan.:Dec., values_drop = T) %>%
mutate(name = str_remove(name, "[.]")) %>%
mutate(Year = "1962") %>%
mutate(date = dmy(paste(Day, name, Year, sep = "/")))
Upvotes: 0