An example of computing with dates in R
Goal: Plot number of sunlight hours along 2015
require(lubridate)
source("daylength.R") #note the "" beacuse daylength.R is a file name
rwd <- "/Volumes/KINGSTON/Rictja/RLab2" #use getwd() and paste your working directory here
Using class Date
dies2015d <- seq.Date(from=as.Date("2015-01-01"), to=as.Date("2015-12-31"),by=1)
class(dies2015d)
## [1] "Date"
Using class POSIXt
dies2015p <- as.POSIXlt(dies2015d)
class(dies2015p)
## [1] "POSIXlt" "POSIXt"
#or
dies2015p <- seq.POSIXt(from=ymd("2015-01-01"), to=ymd("2015-12-31"),by="1 d")
class(dies2015p)
## [1] "POSIXct" "POSIXt"
Let’s try first the Date object to feed our daylength() function
bcnld2015d <- daylength(dies2015d,lat=41.4479,H="N")
## Warning: Incompatible methods ("-.Date", "-.POSIXt") for "-"
## Warning: Incompatible methods ("-.Date", "-.POSIXt") for "-"
## Warning: Incompatible methods ("-.Date", "-.POSIXt") for "-"
## Warning: Incompatible methods ("-.Date", "-.POSIXt") for "-"
The above Warnings are caused because our function daylength.R calculates the difference between the input date (class Date) and a reference POSIXt object (line 34) that is created within the function (line 28). While Warning messages are not errors and the command actually completes, you must always be very careful and understand the messages before accepting the result. In this case, the Warning message implies that we are computing a wrong result, which is a dangerous situation: no error causing the function to halt, but actually a wrong result that we could have accepted.
The warning message indicates the operation causing the problem:
Incompatible methods (“-.Date”, “-.POSIXt”) for “-”
which in this simple function was enough to locate line 34. In more complex cases, we use an specific tool, the debugger, which we will see later in the talleR.
To fully understand the problem, let’s run an equivalent command interactively:
#First review the class:
class(dies2015d[1])
## [1] "Date"
class(dies2015p[1])
## [1] "POSIXct" "POSIXt"
class(dmy(paste("21","12",2012,sep="-")))
## [1] "POSIXct" "POSIXt"
#Then test the operation:
as.numeric(dies2015d[1] - dmy(paste("21","12",2012,sep="-")))
## Warning: Incompatible methods ("-.Date", "-.POSIXt") for "-"
## [1] -1.356e+09
as.numeric(dies2015p[1] - dmy(paste("21","12",2012,sep="-")))
## [1] 741
Note that the difference between the 2 last commands is not just the presence or absence of the Warning message: most important, the results are different!
Let’s thus use the POSIXt object for our daylength() function
bcnld2015 <- daylength(dies2015p,lat=41.4479,H="N")
There are two POSIXt types, POSIXct and POSIXlt.
class(ymd_hms("1970-01-01 00:00:01"))
## [1] "POSIXct" "POSIXt"
unclass(ymd_hms("1970-01-01 00:00:01"))
## [1] 1
## attr(,"tzone")
## [1] "UTC"
as.numeric(ymd_hms("1970-01-01 00:00:01"))
## [1] 1
class(as.POSIXlt(ymd_hms("1970-01-01 00:00:01")))
## [1] "POSIXlt" "POSIXt"
unclass(as.POSIXlt(ymd_hms("1970-01-01 00:00:01")))
## $sec
## [1] 1
##
## $min
## [1] 0
##
## $hour
## [1] 0
##
## $mday
## [1] 1
##
## $mon
## [1] 0
##
## $year
## [1] 70
##
## $wday
## [1] 4
##
## $yday
## [1] 0
##
## $isdst
## [1] 0
##
## attr(,"tzone")
## [1] "UTC"
as.numeric(as.POSIXlt(ymd_hms("1970-01-01 00:00:01")))
## [1] 1
plot(dies2015p,bcnld2015, type="l",las=2,
xlab="Date",ylab="Length of Day (h)",
main="Daylength in Barcelona")
If we want axis annotated at a different interval, we set xaxt=“n” (meaning “x axis type = none”) and then define our custom positions and a POSIXct axis:
plot(dies2015p,bcnld2015, type="l",las=2, xaxt="n",
xlab="Date", ylab="Length of Day (h)",
main="Daylength in Barcelona")
miposx <- seq.POSIXt(from=ymd("2015-01-01"), to=ymd("2015-12-31"),by="15 day")
axis.POSIXct(1, at = miposx, format= "%b%d", las = 2, cex.axis=0.75)
Note that the increment is not linear
bcnld2015delta <- diff(bcnld2015)
plot(dies2015p,bcnld2015delta, type="l",las=2, xaxt="n",
xlab="Date", ylab="Length of Day (h)",
main="Daily Increase of Daylength in Barcelona")
## Error: 'x' and 'y' lengths differ
The error is caused because the diff() result is shorter than the input vector
length(bcnld2015delta)
## [1] 364
and also note that no plot was produced: the error halts the execution of the command.
We just insert an NA at the begining and plot
bcnld2015delta <- c(NA,diff(bcnld2015))
plot(dies2015p,bcnld2015delta, type="l",las=2, xaxt="n",
xlab="Date", ylab="Length of Day (h)",
main="Daily Increase of Daylength in Barcelona")
axis.POSIXct(1, at = miposx, format= "%b%d", las = 2,cex.axis=0.75)
There is a Catalan proverb for this non-linearity:
Per Santa Llucia, un pas de puça (“13-12-2015”)
per Nadal, pas de pardal (“25-12-2015”)
per Sant Esteve, un de llebre (“26-12-2015”)
per Any Nou, un de bou (“01-01-2015”)
per Sant Julia, un pas de capella (“09-01-2015”)
per Sant Antoni un pas de dimoni (“17-01-2015”)
We must add 10 days because the proverb is older than the Gregorian Calendar, which added 10 days, thus the older 13/12 is the current 23/12
sel <- dmy(c("13-12-2015","25-12-2014","26-12-2014","01-01-2015","09-01-2015",
"17-01-2015")) + 10*24*60^2 #we operate in seconds
Now we must select the lengths for those dates. We have 2 options: “%in% and match(). With %in%
dies2015p[dies2015p%in%sel] #test
## [1] "2015-01-04 UTC" "2015-01-05 UTC" "2015-01-11 UTC" "2015-01-19 UTC"
## [5] "2015-01-27 UTC" "2015-12-23 UTC"
bcnld2015delta.sel <- bcnld2015delta[dies2015p%in%sel]
barplot(bcnld2015delta.sel,names.arg=sel-10*24*60^2,las=2,cex.names=0.75)
Hmmm… this is not what we expected! The problem is in the “%in% operation: the result is a vector of TRUE and FALSE of the length of the first argument (dies2015p in our case), indicating whether there is a match with an element of the second argument (sel in our case). Thus, the result is not ordered according to the second vector, but according to the first one. In a shorter example:
delme <- 1:10
delme%in%c(3,5,7)
## [1] FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
delme[delme%in%c(3,5,7)]
## [1] 3 5 7
which looks correct, but
delme%in%c(7,5,3)
## [1] FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
delme[delme%in%c(7,5,3)]
## [1] 3 5 7
“%in%” cannot guarantee that the result follows the ordering of the second vector. We try with match(), which results in a vector of the length of the first vector with the positions of the first match in the second vector. The order of the arguments in match() is thus reversed by comparison to %in%:
match(c(3,5,7),delme)
## [1] 3 5 7
match(c(7,3,5),delme)
## [1] 7 3 5
delme[match(c(3,5,7),delme)]
## [1] 3 5 7
delme[match(c(7,3,5),delme)]
## [1] 7 3 5
selecting the dates mentioned in the proverb with match() would be:
dies2015p[match(sel,dies2015p)] #test
## [1] "2015-12-23 UTC" "2015-01-04 UTC" "2015-01-05 UTC" "2015-01-11 UTC"
## [5] "2015-01-19 UTC" "2015-01-27 UTC"
bcnld2015delta.sel <- bcnld2015delta[match(sel,dies2015p)]
barplot(bcnld2015delta.sel,names.arg=sel-10*24*60^2,las=2,cex.names=0.75,
ylab="Daily Increase of Daylength")
title(main="Verification of Catalan Proverb \n selected dates")
Simplest but long and inelegant option: create one vector object for each latitude (0, 10, 20, …, 90) and then plot each individual vector:
lat0dl2015 <- daylength(dies2015p,lat=0)
lat10dl2015 <- daylength(dies2015p,lat=10)
lat20dl2015 <- daylength(dies2015p,lat=20)
lat30dl2015 <- daylength(dies2015p,lat=30)
lat40dl2015 <- daylength(dies2015p,lat=40)
lat50dl2015 <- daylength(dies2015p,lat=50)
lat60dl2015 <- daylength(dies2015p,lat=60)
lat70dl2015 <- daylength(dies2015p,lat=70)
lat80dl2015 <- daylength(dies2015p,lat=80)
lat90dl2015 <- daylength(dies2015p,lat=90)
plot(dies2015p,lat0dl2015, type="l", col=1,
ylim=c(0,24), las=2,
main="Daylength at 0 - 90 N")
lines(dies2015p,lat10dl2015, type="l", col=2)
lines(dies2015p,lat20dl2015, type="l", col=3)
lines(dies2015p,lat30dl2015, type="l", col=4)
lines(dies2015p,lat40dl2015, type="l", col=5)
lines(dies2015p,lat50dl2015, type="l", col=6)
# and so on.
More compact way: create an empty matrix for the results (one column per latitude), run the function within loop over the columns and use matplot:
lat0_90dl2015 <- matrix(0,ncol=10,nrow=365)
lats <- seq(from=0,to=90,by=10)
colnames(lat0_90dl2015) <- lats
lat0_90dl2015[1:2,]
## 0 10 20 30 40 50 60 70 80 90
## [1,] 0 0 0 0 0 0 0 0 0 0
## [2,] 0 0 0 0 0 0 0 0 0 0
for (i in 1:ncol(lat0_90dl2015)){
lat0_90dl2015[,i] <- daylength(dies2015p,lat=lats[i])
}
head(lat0_90dl2015)
## 0 10 20 30 40 50 60 70 80 90
## [1,] 12 11.43 10.81 10.10 9.211 7.937 5.670 0 0 0
## [2,] 12 11.43 10.82 10.11 9.222 7.955 5.703 0 0 0
## [3,] 12 11.43 10.82 10.12 9.235 7.974 5.738 0 0 0
## [4,] 12 11.43 10.83 10.13 9.248 7.995 5.775 0 0 0
## [5,] 12 11.44 10.83 10.14 9.262 8.016 5.816 0 0 0
## [6,] 12 11.44 10.84 10.15 9.278 8.040 5.858 0 0 0
matplot(dies2015p,lat0_90dl2015,type="l",lty=1:5,col=1:6,
ylim=c(0,25), las=2,
xlab="Date",ylab="Length of Day (h)",
main="Daylength at 0 - 90 N")
legend("topright",lty=1:5,col=1:6,legend=lats)
matplot() is simple but does not plot meaningful date text, so we define our own x axis as above:
matplot(dies2015p,lat0_90dl2015,type="l",lty=1:5,col=1:6,
ylim=c(0,25), las=2, xaxt="n",
xlab="Date",ylab="Length of Day (h)",
main="Daylength at 0 - 90 N")
axis.POSIXct(1, at = miposx, format= "%b%d", las = 2,cex.axis=0.75)
legend("topright",lty=1:5,col=1:6,legend=lats)
To avoid the legend problem of the RStudio graphic window, we draw the plot in another graphic device.
In linux and windows use
x11()
which works on mac if the X11 system has been installed, but which can crash R otherwise.
In mac use:
quartz()
## Warning: Quartz device is not available on this platform
For all systems:
matplot(dies2015p,lat0_90dl2015,type="l",lty=1:5,col=1:6,
ylim=c(0,25), las=2, xaxt="n",
xlab="Date",ylab="Length of Day (h)",
main="Daylength at 0 - 90 N")
axis.POSIXct(1, at = miposx, format= "%b%d", las = 2,cex.axis=0.75)
legend("topright",lty=1:5,col=1:6,legend=lats)