I'm trying to load this awkwardly formatted dataset into my R session: http://www.cpc.ncep.noaa.gov/data/indices/wksst8110.for
Weekly SST data starts week centered on 3Jan1990
Nino1+2 Nino3 Nino34 Nino4
Week SST SSTA SST SSTA SST SSTA SST SSTA
03JAN1990 23.4-0.4 25.1-0.3 26.6 0.0 28.6 0.3
10JAN1990 23.4-0.8 25.2-0.3 26.6 0.1 28.6 0.3
17JAN1990 24.2-0.3 25.3-0.3 26.5-0.1 28.6 0.3
So far, I can read the lines with:
x = readLines(path)
But the file mixes whitespace with '-' as separators, and I'm not a regex expert. I'd appreciate any help turning this into a nice, clean R data frame. Thanks!
This is a fixed-width file. Use read.fwf() to read it:
x <- read.fwf(
file=url("http://www.cpc.ncep.noaa.gov/data/indices/wksst8110.for"),
skip=4,
widths=c(12, 7, 4, 9, 4, 9, 4, 9, 4))
head(x)
V1 V2 V3 V4 V5 V6 V7 V8 V9
1 03JAN1990 23.4 -0.4 25.1 -0.3 26.6 0.0 28.6 0.3
2 10JAN1990 23.4 -0.8 25.2 -0.3 26.6 0.1 28.6 0.3
3 17JAN1990 24.2 -0.3 25.3 -0.3 26.5 -0.1 28.6 0.3
4 24JAN1990 24.4 -0.5 25.5 -0.4 26.5 -0.1 28.4 0.2
5 31JAN1990 25.1 -0.2 25.8 -0.2 26.7 0.1 28.4 0.2
6 07FEB1990 25.8 0.2 26.1 -0.1 26.8 0.1 28.4 0.3
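To get from the default V1..V9 names to something more usable, one option is to pass col.names and then convert the week column to a Date. This is only a sketch: the column names are hypothetical ones I made up from the file's two header lines, and the three sample rows are rebuilt in memory from the values in the question (padded with sprintf() to match the widths used above) so the snippet runs without a network connection. The month is looked up in month.abb instead of being parsed with %b, which should avoid any dependence on the session locale.

```r
# Rebuild three rows in the file's fixed-width layout (values from the question).
# The field widths in fmt add up to the same 62 characters as the widths vector.
fmt <- "%10s%9.1f%4.1f%9.1f%4.1f%9.1f%4.1f%9.1f%4.1f"
sample_lines <- c(
  sprintf(fmt, "03JAN1990", 23.4, -0.4, 25.1, -0.3, 26.6,  0.0, 28.6, 0.3),
  sprintf(fmt, "10JAN1990", 23.4, -0.8, 25.2, -0.3, 26.6,  0.1, 28.6, 0.3),
  sprintf(fmt, "17JAN1990", 24.2, -0.3, 25.3, -0.3, 26.5, -0.1, 28.6, 0.3)
)

# Hypothetical column names (not part of the original answer).
sst_names <- c("week",
               "nino12_sst", "nino12_ssta",
               "nino3_sst",  "nino3_ssta",
               "nino34_sst", "nino34_ssta",
               "nino4_sst",  "nino4_ssta")

con <- textConnection(sample_lines)
sst <- read.fwf(
  file = con,
  widths = c(12, 7, 4, 9, 4, 9, 4, 9, 4),
  col.names = sst_names,
  stringsAsFactors = FALSE)
close(con)

# Convert "03JAN1990" to a Date without relying on locale-specific month names.
wk  <- trimws(sst$week)
mon <- match(substr(wk, 3, 5), toupper(month.abb))
sst$week <- as.Date(sprintf("%s-%02d-%s", substr(wk, 6, 9), mon, substr(wk, 1, 2)))

str(sst)
```

For the real file, swap the text connection for the url(...) call and add skip=4 as in the answer.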
Update
The package readr (released April 2015) provides a simple and fast alternative.
library(readr)
x <- read_fwf(
file="http://www.cpc.ncep.noaa.gov/data/indices/wksst8110.for",
skip=4,
col_positions=fwf_widths(c(12, 7, 4, 9, 4, 9, 4, 9, 4)))
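With readr, column names can go straight into fwf_widths() through its col_names argument, and col_types fixes the column classes up front. A rough sketch, with some assumptions: the names are hypothetical, the rows are rebuilt from the values in the question rather than downloaded, and the requireNamespace() guard is only there so the snippet does not error where readr is missing. It also relies on readr treating a string that contains newlines as literal data.

```r
# Rebuild three rows in the file's fixed-width layout (values from the question).
fmt <- "%10s%9.1f%4.1f%9.1f%4.1f%9.1f%4.1f%9.1f%4.1f"
sample_lines <- c(
  sprintf(fmt, "03JAN1990", 23.4, -0.4, 25.1, -0.3, 26.6,  0.0, 28.6, 0.3),
  sprintf(fmt, "10JAN1990", 23.4, -0.8, 25.2, -0.3, 26.6,  0.1, 28.6, 0.3),
  sprintf(fmt, "17JAN1990", 24.2, -0.3, 25.3, -0.3, 26.5, -0.1, 28.6, 0.3)
)
txt <- paste0(paste(sample_lines, collapse = "\n"), "\n")

# Hypothetical column names (not part of the original answer).
sst_names <- c("week",
               "nino12_sst", "nino12_ssta",
               "nino3_sst",  "nino3_ssta",
               "nino34_sst", "nino34_ssta",
               "nino4_sst",  "nino4_ssta")

if (requireNamespace("readr", quietly = TRUE)) {
  sst_tbl <- readr::read_fwf(
    file = txt,
    col_positions = readr::fwf_widths(c(12, 7, 4, 9, 4, 9, 4, 9, 4),
                                      col_names = sst_names),
    col_types = "cdddddddd")  # week as character, the eight values as double
  print(sst_tbl)
}
```

The week column stays a character vector here; it can be converted to a Date afterwards in whatever way suits your locale.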
Speed comparison: readr::read_fwf() was ~2x faster than utils::read.fwf().