**by Willis Eschenbach**

Previously, we discussed the errors in Levitus et al. here in An Ocean of Overconfidence.

Unfortunately, the supplemental information for the new Levitus et al. paper has not been published. Fortunately, WUWT regular P. Solar has located a version of the preprint containing their error estimate, located here. This is how they describe the start of the procedure that results in their estimates:

From every observed one-degree mean temperature value at every standard depth level we subtract off a climatological value. For this purpose we use the monthly climatological fields of temperature from Locarnini et al. [2010].

Now, the “climatology” means the long-term average (mean) of the variable. In this case, it is the long-term average for each 1° X 1° gridcell, at each depth. Being a skeptical type of fellow, I thought, “How much data do they actually have?” It matters because the less data they have, the larger the expected error in the long-term mean, which is called the “standard error of the mean” (SEM).
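To make that concrete, here is the standard SEM calculation in R. The temperature numbers below are made up purely for illustration; the point is just how quickly the SEM grows as the number of observations shrinks:

```r
# standard error of the mean: standard deviation divided by sqrt(N)
sem <- function(x) sd(x) / sqrt(length(x))

# illustrative only: a hypothetical gridcell with 3 obs vs. one with 30 obs
set.seed(42)
few  <- rnorm(3,  mean = 2.5, sd = 0.5)
many <- rnorm(30, mean = 2.5, sd = 0.5)

sem(few)    # large standard error with only three observations
sem(many)   # roughly sqrt(10) times smaller with ten times the data
```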

Regarding the climatology, they say that it is from the World Ocean Atlas 2009 (WOA09), viz: *“… statistics at all standard levels and various climatological averaging periods are available at http://www.nodc.noaa.gov/OC5/WOA09F/pr_woa09f.html”*

So I went there to see what kind of numbers they have for the monthly climatology at 2000 metres depth … and I got this answer:

The temperature monthly climatologies deeper than 1500 meters have not been calculated.

Well, that sux. How do the authors deal with that? I don’t have a clue. Frustrated at 2000 metres, I figured I’d get the data for the standard error of the mean (SEM) for some month, say January, at 1500 metres. Figure 1 shows their map of the January SEM at 1500 metres depth:

*Figure 1. Standard error of the mean (SEM) for the month of January at 1500 metres depth. White areas have no data. Click on image for larger version. SOURCE*

YIKES! In 55 years, only 5% of the 1° X 1° gridcells have three observations or more for January at 1500 metres … and they are calculating averages?

Now, statistically cautious folks like myself would look at that and say “Well … with only 5% coverage, there’s not much hope of getting an accurate average”. But that’s why we’re not AGW supporters. The authors, on the other hand, forge on.

Not having climatological data for 95% of the ocean at 1500 metres, what they do is take an average of the surrounding region, and then use that value. However, with only 5% of the gridcells having 3 observations or more, that procedure seems … well, wildly optimistic. It might be useful for infilling if we were missing say 5% of the observations … but when we are missing 95% of the ocean, that just seems goofy.
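I don’t know the details of their infilling procedure, but the general idea of filling an empty gridcell with the average of its surroundings can be sketched in R. This is a toy version under my own assumptions (a simple 3×3 neighbourhood), NOT their actual method:

```r
# toy infill: replace each NA gridcell with the mean of its 3x3 neighbourhood
# (purely illustrative -- not the actual Levitus et al. procedure)
infill <- function(grid) {
  out <- grid
  nr <- nrow(grid); nc <- ncol(grid)
  for (i in 1:nr) for (j in 1:nc) {
    if (is.na(grid[i, j])) {
      block <- grid[max(1, i - 1):min(nr, i + 1),
                    max(1, j - 1):min(nc, j + 1)]
      out[i, j] <- mean(block, na.rm = TRUE)  # NaN if every neighbour is empty
    }
  }
  out
}
```

Note what happens when 95% of the cells are empty: most neighbourhoods contain no data at all, and widening the window just smears the few real values over enormous areas.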

So how about at the other end of the depth scale? Things are better at the surface, but not great. Here’s that map:

*Figure 2. Standard error of the mean (SEM) for the month of January at the surface. White areas have no data. Click on image for larger version. Source as in Fig. 1.*

As you can see, there are still lots and lots of areas without enough January observations to calculate a standard error of the mean … and in addition, for those that do have enough data, the SEM is often greater than half a degree. When you take a very accurate temperature measurement, and you subtract from it a climatology with a ± half a degree error, you are greatly reducing the precision of the results.
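For independent errors, the uncertainties add in quadrature: subtracting a climatology with a ± half-degree standard error from a measurement of, say, ±0.01°C precision gives an anomaly with an error of √(0.5² + 0.01²) ≈ 0.5°C. A quick simulation check (the individual error sizes here are my assumptions for illustration):

```r
# error propagation when subtracting a noisy climatology from a precise measurement
set.seed(1)
n <- 100000
measurement <- 2.0 + rnorm(n, sd = 0.01)  # very precise thermometer readings
climatology <- 1.5 + rnorm(n, sd = 0.5)   # climatology with half-degree error
anomaly <- measurement - climatology

sd(anomaly)            # ~0.5: the climatology error dominates the result
sqrt(0.01^2 + 0.5^2)   # independent errors add in quadrature
```

In other words, the anomaly can never be more precise than the climatology it is calculated against.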

w.

APPENDIX 1: the data for this analysis was downloaded as a NetCDF file from here (WARNING-570 Mb FILE!). It is divided into 1° gridcells and has 24 depth levels, with a maximum depth of 1500 metres. It shows that some 42% of the gridcell/depth/month combinations have no data. Another 17% have only one observation for the given gridcell and depth, and 9% have two observations. In other words, the median number of observations for a given month, depth, and gridcell is 1 …

APPENDIX 2: the code used to analyze the data (in the computer language “R”) is:

```r
require(ncdf)                                  # NetCDF interface for R

mync      <- open.ncdf("temperature_monthly_1deg.nc")
mytemps   <- get.var.ncdf(mync, "t_gp")        # mean temperatures
tempcount <- get.var.ncdf(mync, "t_dd")        # observations per gridcell
myse      <- get.var.ncdf(mync, "t_se")        # standard error of the mean

# -2147483647 is the fill value flagging missing gridcells
allcells  <- length(which(tempcount != -2147483647))
zerocells <- length(which(tempcount == 0))     # cells with no observations
zerocells / allcells                           # fraction of cells with no data

hist(tempcount[which(tempcount != -2147483647)],
     breaks = seq(0, 6000, 1), xlim = c(0, 40))

tempcount[which(tempcount == -2147483647)] <- NA

whichdepth <- 24                               # deepest level, 1500 metres
zerodata   <- length(which(tempcount[, , whichdepth, 1] == 0))
totaldata  <- length(which(!is.na(tempcount[, , whichdepth, 1])))
under3data <- length(which(tempcount[, , whichdepth, 1] < 3))
length(tempcount[, , whichdepth, 1])           # total gridcells at this depth
1 - under3data / totaldata                     # fraction with 3+ January obs
```

APPENDIX 3: A statistical oddity. In the course of doing this, I got to wondering about how accurate the calculation of the standard error of the mean (SEM) might be when the sample size is small. It’s important since so many of the gridcell/depth/month combinations have only a few observations. The normal calculation of the SEM is the standard deviation divided by the square root of the sample size N.

I did an analysis of the question, and I found that as the sample size N decreases, the normal calculation progressively underestimates the true SEM. At the extreme, with only three data points in the sample, which is the case for much of the WOA09 monthly climatology, the calculation underestimates the actual standard error of the mean by about 12%. This doesn’t sound like a lot, but it means that instead of the true value falling within the nominal 95% confidence interval of ±1.96 × SEM about 95% of the time, it does so only about 80% of the time.
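That coverage figure is easy to check by simulation: draw many samples of size three from a normal distribution, and count how often the true mean falls inside mean ± 1.96 × SEM. A sketch (the exact coverage will wobble a little from run to run):

```r
# coverage of the nominal 95% interval (mean +/- 1.96 * sd/sqrt(N)) at N = 3
set.seed(1)
ntrials <- 100000
N <- 3
covered <- replicate(ntrials, {
  x <- rnorm(N)                     # true mean is 0
  halfwidth <- 1.96 * sd(x) / sqrt(N)
  abs(mean(x)) < halfwidth          # does the interval contain the truth?
})
mean(covered)   # roughly 0.8, nowhere near the nominal 0.95
```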

Further analysis shows that the standard calculation of the SEM needs to be multiplied by

1 + 0.43 N^{ -1.2}

to be approximately correct, where N is the sample size. (At N = 3 this factor is about 1.12, consistent with the 12% underestimate noted above.)

I also tried using [standard deviation divided by sqrt (N-1)] to calculate the SEM, but that consistently overestimated the SEM at small sample sizes.

The code for this investigation was:

```r
# SEM as conventionally calculated: sd / sqrt(N)
sem <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
# or, alternate sem function using N - 1:
# sem <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x) - 1)

nobs <- 30000                    # number of trials per sample size
ansbox <- rep(NA, 20)

for (sample in 3:20) {           # sample sizes from 3 to 20
  # each column of mybox is one random sample of size 'sample'
  mybox <- matrix(rnorm(nobs * sample), sample)
  themeans <- apply(mybox, 2, mean)
  thesems  <- apply(mybox, 2, sem)
  # actual SEM (spread of the sample means) vs. average calculated SEM
  ansbox[sample] <- round(sd(themeans) / mean(thesems) - 1, 3)
}
```