Monday, September 24, 2018

dealing with NA in seasonal cycle analysis R

Leave a Comment

I have a timeseries of monthly data with lots of missing datapoints, set to NA. I want to simply subtract the annual cycle from the data, ignoring the missing entries. It seems that the decompose function can't handle missing data points, but I have seen elsewhere that the seasonal package is suggested instead. However I am also running into problems there too with the NA.

Here is a minimum reproducible example of the problem using a built in dataset...

library(seasonal)  # set range to missing NA in Co2 dataset c2<-co2 c2[c2>330 & c2<350]=NA seas(c2,na.action=na.omit)  Error in na.omit.ts(x) : time series contains internal NAs 

Yes, I know! that's why I asked you to omit them! Let's try this:

seas(c2,na.action=na.x13)  Error: X-13 run failed  Errors: - Adding MV1981.Apr exceeds the number of regression effects   allowed in the model (80). 

Hmmm, interesting, no idea what that means, okay, please just exclude the NA:

seas(c2,na.action=na.exclude)  Error in na.omit.ts(x) : time series contains internal NAs 

that didn't help much! and for good measure

decompose(c2)  Error in na.omit.ts(x) : time series contains internal NAs 

I'm on the following:

R version 3.4.4 (2018-03-15) -- "Someone to Lean On" Copyright (C) 2018 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) 

Why is leaving out NA such a problem? I'm obviously being completely stupid, but I can't see what I'm doing wrong with the seas function. Happy to consider an alternative solution using xts.

1 Answers

Answers 1

My first solution, simply manually calculating the seasonal cycle, converting to a dataframe to subtract the vector and then transforming back.

# seasonal cycle scycle=tapply(c2,cycle(c2),mean,na.rm=T)  # converting to df df=tapply(c2, list(year=floor(time(c2)), month = cycle(c2)), c) # subtract seasonal cycle for (i in 1:nrow(df)){df[i,]=df[i,]-scycle} # convert back to timeseries anomco2=ts(c(t(df)),start=start(c2),freq=12) 

Not very pretty, and not very efficient either.

The comment of missuse lead me to another Seasonal decompose of monthly data including NA in r I missed with a near duplicate question and this suggested the package zoo, which seems to work really well for additive series

library(zoo) c2=co2 c2[c2>330&c2<350]=NA d=decompose(na.StructTS(ts))  plot(c2) lines(d$x,col="red") 

shows that the series is very well reconstructed through the missing period.

black lines shows Co2 series with missing chunk and the red line is the reconstructed series

The output of deconstruct has the trend and seasonal cycle available. I wish I could transfer my bounty to user https://stackoverflow.com/users/516548/g-grothendieck for this helpful response. Thanks to user missuse too.

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment