Numerical analysis of the accuracy of bivariate quantile distributions utilizing copulas compared to the GUM supplement 2 for oil pressure balance uncertainties

In the field of pressure metrology the effective area is Ae=A0 (1+ lP) where A0 is the zero-pressure area and l is the distortion coefficient and the conventional practise is to construct univariate probability density functions (PDFs) forA0 and l. As a result analytical generalized non-Gaussian bivariate joint PDFs has not featured prominently in pressure metrology. Recently extended lambda distribution based quantile functions have been successfully utilized for summarizing univariate arbitrary PDF distributions of gas pressure balances. Motivated by this development we investigate the feasibility and utility of extending and applying quantile functions to systems which naturally exhibit bivariate PDFs. Our approach is to utilize the GUM Supplement 1 methodology to solve and generate Monte Carlo based multivariate uncertainty data for an oil based pressure balance laboratory standard that is used to generate known high pressures, and which are in turn cross-floated against another pressure balance transfer standard in order to deduce the transfer standard’s respective area. We then numerically analyse the uncertainty data by formulating and constructing an approximate bivariate quantile distribution that directly couples A0 and l in order to compare and contrast its accuracy to an exact GUM Supplement 2 based uncertainty quantification analysis.


Introduction
In the field of pressure metrology when piston-cylinder operated pressure balances are used to measure applied pressures defined as P = F/A e where F is a generalized force and A e is the effective area, the most widespread model to characterize a pressure balance is that of a two parameter model such that A e = A 0 (1 + lP) where A 0 is the zero-pressure area and l is the distortion coefficient. As a result knowledge of both the expected values as well as the associated covariance of a pressure balance's effective area model parameters A 0 and l are intrinsically necessary in order to fully quantify the uncertainty in A e for subsequent application in pressure generation and measurement tasks. The conventional practise by many metrologists when performing an uncertainty quantification (UQ) analysis of a piston-cylinder operated pressure balance is to characterize the model parameters A 0 and l based on an analysis of the combination of the available experimental data such as cross-floating measurements and theoretical data such as finite element simulations. This analysis is usually performed with the application of the classical sensitivity coefficient based formulation of the GUM [1], or alternatively with the GUM Supplement 1 (GS1) Monte Carlo based UQ technique [2] in cases where the measurand model is either too complex or too nonlinear.
When these UQ techniques are applied the resulting multivariate UQ data for the pressure balance model, either implicitly in terms of a multivariate Gaussian distribution if generated with the GUM or potentially in terms of a multivariate non-Gaussian distribution if generated with the GS1, is then typically approximated with univariate probability density function (PDF) distributions for A 0 and l due to the complexity of the models, whilst the covariance information through the available experimental and theoretical available data in terms of the coupling between A 0 and l is then usually either neglected, estimated or occasionally indirectly approximated.
Univariate PDFs of measurands have traditionally been analytically specified either as Gaussian or Student tdistributions in the case of a GUM based analysis, or alternately through discrete representations of either the underlying distribution function or equivalently the PDF when the uncertainty analysis was performed using the GS1 approach. In GS1 based uncertainty analysis approaches generated Monte Carlo data has in a majority of cases been utilized for subsequent analysis by retrieving previously generated and stored numerical data from electronic files. Recent work by Harris et al. [3] considered the problem of how to analytically represent GS1 numerical data by investigating and studying different families of quantile functions and approaches in order to estimate the respective quantile function parameters. Their investigations demonstrated the utility of extended lambda distribution (ELD) based quantile function representations to accurately summarize arbitrary PDFs generated using Monte Carlo simulation data. This methodology has recently been applied by Ramnath [4] in the field of gas pressure metrology through the use of an ELD based quantile function for the characterization of a pressure balance's effective area in the low pressure range 5 kPa p 7 MPa where the distortion coefficient l is considered negligible and the pressure balance characteristics is fully specified in terms of just a univariate PDF of the zero-pressure area A 0 . In this particular investigation a non-continuum gas flow was present and as a result the GUM was not appropriate due to the presence of a highly non-linear measurand model. The measurand model of the gas pressure balance's effective area which was numerically solved with a corresponding uncertainty PDF determined through a GS1 Monte Carlo methodology demonstrated the feasibility of utilizing quantile functions for characterizing univariate PDFs which may exhibit deviations from more traditional Gaussian shaped deviations such as asymmetry and/or skewness in more complex fluid based primary scientific measurement systems.
Unfortunately for pressure balances operated at higher pressures where the effect of the distortion coefficient is significant due to fluid structural interaction effects the uncertainties and correlation effects between A 0 and l for hydraulic oil operated gauge pressure balances in the medium pressure range 7 MPa P 30 MPa must be considered. These effects are considered particularly critical and essential in the high pressure range 30 MPa P 500 MPa when calculating the uncertainty of a pressure balance's effective area. As a result based on this observation from an UQ perspective it is considered desirable and beneficial if a pressure balance model can be characterized directly in terms of a joint bivariate PDF in order to preserve intrinsic statistical uncertainty analysis information that may utilized be in order to accurately determine and quantify the magnitudes of the correlation effects that are present in physical pressure balances operated at high hydraulic pressures where there is strong coupling between A 0 and l.
Based on these motivating factors our approach in this paper is to firstly utilize an implicit multivariate GUM methodology which is used to solve for and characterize generated applied pressure data values and uncertainties for an oil pressure balance laboratory standard by sampling from appropriate PDF and joint PDF distributions where available. We then utilize the known generated pressure values and uncertainties as inputs into a full GS1 Monte Carlo simulation which is used to numerically generate multivariate data for the cross-floating of a transfer standard. Once the multivariate data for the transfer standard has been generated we then post-process it using the GS2 [5] methodology for further analysis in order to determine the actual ellipsoidal and smallest coverage regions for the coupling of the transfer standard's zeropressure area A 0 and distortion coefficient l in a joint PDF. The exact coverage regions obtained with the GS2 methodology are then compared with results obtained with approximate bivariate quantile distributions that we formulate in this paper in order to investigate and determine the potential accuracy of bivariate quantile distributions for modelling pressure balance characteristics in terms of joint PDFs.

Mathematical models
The mathematical model for a pressure balance using a working fluid in a liquid state may be formulated as a pressure measurand equation through the analysis of a freebody diagram such that the applied pressure P = (p À p a ) is of the form where p is the working fluid pressure, and p a is the surrounding ambient pressure as per the original formulation by Dadson et al. [6]. In this model m i and r i are the actual mass and density values for the system of weights used to generate the respective pressures if there are n total mass pieces, r a is the ambient density, V s is the submerged volume of the piston into the working fluid, r f is the density of the working fluid at an actual pressure of p, g is the local gravitational acceleration constant, s is the surface tension of the working fluid where C is the circumference of the wetted perimeter of the piston in contact with the working fluid, H is a hydrostatic height term to account for any potential differences in elevation between the reference datums of generated pressues and measured pressures, and S is the effective area of the pressure balance. In the absence of additional information a useful approximation that may be used to estimate the circumference is C ¼ ffiffiffiffiffiffiffiffi ffi 4pS p . Due to the fact that physical measurements may occur in which the temperature of the pressure balance may vary through a combination of ambient conditions and operating conditions the effective area is usually adjusted to correspond to a known reference temperature such that S = A 0 [1 + lP] f (t, t ref ) and f (t, t ref ) =1 + a (t À t ref ). As per the discussion by Dadson et al. a is the sum of the thermal expansion coefficients of the piston and cylinder so that a = (a p + a c ), and the reference temperature is fixed to a constant value which is usually just set to t ref = 20°C in many commonwealth countries, and to 23°C in other countries. Following the general approach by many pressure metrologists the ambient air density may be specified using the CIPM-2007 formula as r a ¼ pM a i as discussed by Picard et al. [7] where M a = 28.96546 Â 10 À3 kg mol À1 is the molar mass of dry air, M v = 18.01528 Â 10 À3 kg mol À1 is the molar mass of water, Z is the air compressibility, R = 8.314472 J mol À1 K À1 is the CODATA-2006 recommended value for the universal gas constant, and x v is the mole fraction of water vapour for the corresponding pressure, temperature and relative humidity conditions for the surrounding ambient air. The previous formulae may now be combined and rearranged as a nonlinear function f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ where p is the unknown working fluid pressure p which is to be determined and q ⭑ j ; j ¼ 1; . . . ; m are laboratory standard parameters such that The generated pressure p of the working fluid from a laboratory standard may then be obtained by solving f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ ¼ 0 with any suitable solver by noting that r f , C and S formally depend on the pressure p whilst all the other terms are independent of the working fluid's pressure. The above calculations are performed if we assume that the air density r a = f a (p a , t a , h a ) and fluid density r f = f f (p, t) are calculated by known functions in terms of the independent variables, and where f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ formally has units of applied pressure which allows for the numerical solution to be obtained for any specified applied pressure accuracy level.
A simple approach to solve for the unknown generated pressure p is to assume that p min p p max and search within this pressure range where p min and p max are rough estimates of the nominal pressure. As an example for an applied pressure of 250 MPa with a nominal atmospheric pressure of p a = 101.325 kPa first calculate the approximate pressure as p 0 = P + p a and then for example set p min = 0.95p 0 and p max = 1.05p 0 as specifications to search within this interval or alternately to solve the full non-linear solver of f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ ¼ 0 using p 0 as a starting solution. The next issue which naturally arises is how to solve this particular non-linear equation noting that for the generating pressure equation both the density r f and the wetted circumference C ¼ ffiffiffiffiffiffiffiffi ffi 4pS p depend on the pressure p. Ideally we would use for example Newton's method x n+1 = x n À [f(x n )]/[f0(x n )], n = 0, 1, 2, ... for successive approximations to solve f (x) =0 where f0(x n ) denotes the derivative of the function where the benefits of using an iterative solver is that the generating pressure may be calculated to any desired numerical accuracy level. In our particular case it is however cumbersome to analytically calculate the derivative due to the choice of the fluid's equation of state. Fortunately there are derivative free iterative formulae as reported by Dehghan and Hajarian [8] to solve a non-linear equation f (x) =0 and we may to use a forward difference formula originally developed by Steffensen such that . As a result through the use of the Steffensen formula we may control the numerical precision of the solved for applied pressures P k , k = 1, … , n p where n p are the number of generated pressures by specifying the number of iterations.
In pressure metrology practise there is generally only have a limited indirect level of covariance information available and it is therefore common to disregard covariance terms in the f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ ¼ 0 model by setting cov (q i , q j ) =0, i ≠ j in the generated pressure calculation. The only potential exception where cov (q i , q j ) ≉0 i ≠ j is in terms of the correlation between A 0 and l i.e. for cov (A 0 , l), and in the correlations between the weights and densities used i.e. cov (m i , m j ), cov (r i , r j ) and cov (m i , r j ) respectively for i ≠ j unless otherwise specified.
The issue of how to approximate the correlation for weights was previously investigated by Bich [9] who utilized the multivariate GUM approach of Cox and Harris [10] in a slightly modified equation for a mass measurement as a general implicit equation. Later investigations by Palencar et al. [11] followed a similar multivariate GUM framework but refined the analysis slightly by using a least squares formulation to deduce the optimal values of the parameters in a calibration of mass pieces. Their investigation concluded that covariance terms can actually contribute non-negligible terms in the uncertainty analysis of mass measurements and that ideally covariance information should be included in mass measurement calibration certificates.
Whilst the approach of Palencar et al. [11] does allow for the covariance information to be directly estimated under certain restricting assumptions for mass pieces traceable through a sequence of measurements to a particular country's national kilogram prototype standard or equivalent national metrology institure primary mass standard, in this paper we will disregard the correlation effects between mass and density terms used by the set of weights in the laboratory standard pressure generation model for simplicity. This modelling simplification is due to the fact that in most practical cases calibrated mass certificates do not provide any covariance information for the weights used by a pressure balance. In addition for simplicity we will also disregard the covariance between the laboratory standard zero-pressure area and distortion coefficient cov (A 0 , l) in the pressure balance that is used to generate the known applied pressure due to that fact that many existing pressure calibration certificates do not generally provide any specific explicit covariance information.
Our approach of disregarding the covariance term cov ( A 0 , l) for the generated pressure in the absence of specific known covariance information is consistent with experimental recommendations for national metrology institutes for pressure balances operated in free deformation mode as documented in for example EURAMET CG3 [12], and results from theoretical finite element based studies for pressure balances operated in both free deformation and controlled clearance modes such as the EUROMET Project 463 by Sabuga et al. [13] and in controlled clearance modes only by Dogra et al. [14] respectively. As a result of our choice to disregard covariance effects in the generated pressure calculation when implementing a GS1 Monte Carlo approach as discussed by Cox and Siebert [15] for the pressure generation calculations, it will then usually suffice to sample with appropriate numerical techniques for the inputs q i from univariate PDFs g i (j i ) associated with the corresponding input parameters q i where j i is a random variable of q i unless information of a joint PDF is specifically available. Unless otherwise specified the PDFs for the input parameters will usually be Gaussian distributions however following the more recent work by Harris et al. it is now in principle possible to model the inputs as arbitrary PDFs using a univariate ELD quantile distribution as originally investigated by Willink [16].
We comment that even though we disregard covariance terms in the inputs of the generated pressure balance model in the absence of available information that when these generated pressures are used as inputs in the cross-floated pressure balance model to deduce the corresponding effective areas that due to the complexity of the mathematical model the covariance information in the parameters used to model the effective areas will still be present in an implicit form of the associated Monte Carlo data of the cross-floated effective areas.
Once the generated pressures from the laboratory standard (LS) pressure balance model are obtained and the GS1 information appropriately processed, the generated pressure PDFs g ⭑ i ðj ⭑ i Þ where j ⭑ i is a random variable for generated pressure p i may then used as inputs for when the LS is cross-floated against another pressure balance such as a unit-under-test or transfer standard (TS) in order to determine the other pressure balance's effective area. When two pressure balances are cross-floated against each other the point of equilibrium occurs when the applied pressure P = (p À p a ) in both pressure balances are equal in which case the measurand equation for a cross-floated TS is In the above equation S is the unknown TS effective area which must be determined, and q j , j = 1, … , m are corresponding parameters associated with the cross-floated TS pressure balance measurand equation. The benefit of this particular equation formulation is that f (S, q 1 , … , q m ) formally has units of cross-floated area which then allows for the numerical solution accuracy to be controlled by specifying the number of iterations. Whilst a simple estimate to solve for the generated pressure is to assume a range of possible generated pressure values using for example 0.95p 0 p 1.05p 0 where p 0 is the approximate nominal pressure p 0 = P + p a as previously discussed such a straightforward estimate of the range of possible crossfloated areas of the TS is not possible since we have no prior information in order to determine a possible range of values S min S 0 S max to search within. A simple initial approximation to overcome this difficulty is to ignore the effect of the oil surface tension and TS submerged volume on the TS cross-floated area so that a rough cross-floated area estimate is S 0 ≈ P n i¼1 Â m i g 1 À r a r i =½ðp À p a Þ À Hgðr f À r a Þ since the hydro-static pressure head term would not be negligible and to then use a search range of say 0.95S 0 S 1.05S 0 in order to solve the full non-linear equation f (S, q 1 , … , q m ) =0 or alternately by using for example the Steffensen formula as previously discussed.
In this paper the ⭑ super-scripts signifies terms specific to the LS pressure balance which is used to generate the pressure whilst the unstarred terms represent quantities that are specific to the TS pressure balance which is crossfloated in order to determine corresponding effective areas. The model f (S, q 1 , … , q m ) =0 using the previous generated pressure PDF g P k ðj P k Þ inputs may then be solved in another GS1 simulation to obtain the PDF distribution g S k ðh S k Þ for the cross-floated effective areas S k for each pressure P k where h S k is a random variable of the crossfloated effective area S k .
Since the cross-floated effective areas S k are univariate data it follows that an ELD-QF distribution may also be used to conveniently summarize S k in terms of parameters {a k , b k , c k , d k } for k ∈ [1, … , n P ] for each of the n P crossfloat effective area measurements corresponding to the n P generated pressures where as previously discussed the cross-floated effective area is modelled in terms of a two parameter model such that S = A 0 (1 + lP) where P is the applied pressure.
For our particular pressure balance problem a GS1 based uncertainty analysis is utilized for conceptual simplicity for the LS generated pressures in order to generate data for further analysis when the LS is crossfloated against the TS pressure balance as this avoids some of the subjectivity in certain aspects of the correlation effects modelling in the GUM based matrix analysis.
The GS1 methodology is focused on multivariate input models with an input X ¼ ½X 1 ; . . . ; X N T and a single output Y ¼ ½Y 1 defined in terms of an implicit equation hðy; xÞ ¼ 0 such that the expected valueỹ, variance u 2 ðỹ~Þ and distribution functionG~Y ðhÞ are approximated using M Monte Carlo simulations such that hðy r ; x r Þ ¼ 0 where r = 1, 2, … , M,ỹ~¼ 1 M P M r¼1 y r , u 2 ðỹÞ ¼ 1 Mðy rþ1 Ày r Þ where y r h y r+1 for r = 1, 2, … , M À 1. When the GS1 methodology is implemented for the LS generated pressure P k , k = 1, … , n P there will be a set of Monte Carlo univariate simulation data P k ¼ fP ð1Þ k ; :::; P ðMÞ k g for each generated pressure which may then be used as an input for the cross-float of the TS.
A GS2 methodology on the other hand when contrasted to the GS1 method also considers the case for an input X but where there are now multiple outputs such that Y ¼ ½Y 1 ; :::; Y m T defined in terms of an implicit vector equation hðy; xÞ ¼ 0. This methodology is also implemented by generating Monte Carlo data by first sampling from the input vector PDF for x r ∼ g X ðjÞ or relevant associated joint PDFs if this information is available such that hðy r ; x r Þ ¼ 0 for r = 1, 2, … , M,ỹ~¼ 1 M ½y 1 þ ⋯ þ y M , Uỹ ¼ 1 MÀ1 ðy 1 ÀỹÞ⋅ðy 1 ÀỹÞ T þ:::þðy M ÀỹÞ⋅ðy M ÀỹÞ T Â Ã and G ¼ ½y 1 ; . . . ; y M where M is the number of Monte Carlo simulation events in order to produce a dataset y r . In our particular case when the GS2 methodology is implemented for the TS cross-floated area S k , k = 1, … , n P there will be a set of Monte Carlo bivariate simulation When the GS2 uncertainty analysis method is implemented and the Monte Carlo simulation data postprocessed an implicit multivariate model may be used to construct the average vector valueỹ and corresponding covariance matrix Uỹ . As a result when using a GS2 approach the method as per the officy ∼ N ðm; VÞmplicit assumption that the output follows a multivariate Gaussian distribution such that y ∼ N ðm; VÞ where the expected value is m ≈ỹ and the covariance matrix is P ≈ Uỹ~unless otherwise specified. This feature is common to both the GS1 and GS2 due to the extreme mathematical complexity and numerical challenges in directly solving the Markov integral which provides for the exact PDF or joint PDF without any assumptions for both univariate as well as multivariate models. For the case of the GS2 the joint PDF for a multivariate Gaussian may then be written in matrix notation such that where V À1 is the inverse of the matrix V, and jV j ¼ detðVÞ denotes the determinant of the matrix V. Using this notation in the case of a bivariate distribution with an input X ¼ ½X 1 ; X 2 T and a corresponding random variable j ¼ ½j 1 ; j 2 T the expected vector m and covariance matrix V may then be specified as where s 2 X 1 and s 2 X 2 are the variances for X 1 and X 2 respectively, and r is an indication of the covariance between the inputs X 1 and X 2 . The use of a Pearson r instead of r i.e. Spearman's rho as an indication of the correlation is usually the more common approach in mechanical metrology problems however other approaches such as Kendall's tau are also possible. The main advantage of these approaches are that the quantification of the extent of the correlation between the random variables x 1 and x 2 do not depend of the choice of parametrization for the model. Some additional benefits of using either a Kendall's tau or Spearman's rho instead of the Pearson's correlation coefficient are that these options are usually a more accurate indicator of correlation when the random variables X 1 and X 2 do not follow a Gaussian joint PDF since a Pearson's correlation can sometimes give misleading results if the random variables do not follow multivariate normal distributions as discussed in more detail by Goda [17]. If (x 1 , y 1 ) , … , (x n , y n ) are a set of n observations then Kendall's tau t K may be calculated as where C is the number of concordant pairs and D is the number of discordant pairs for the set of observations where a particular pair (x i , y i ) and (x j , y j ) for i ≠ j is concordant if x i > x j and y i > y j or alternately if x i < x j and y i < y j , whilst the same pair is discordant if x i > x j and y i < y j or if x i < x j and y i > y j . In the special case if x i = x j and y i = y j then the pair is neither concordant nor discordant and does not contribute in the calculation of t K . The use of Kendall's tau t K provides a useful simplification later in the paper for constructing particular formulations of copulas. The benefit of using a GS2 approach to directly calculate the cross-floated effective areas is that any covariance information is immediately present and available from the covariance matrix Uỹ, and as a result this approach avoids any additional modelling assumptions for covariance terms. We will utilize the above S k ¼ f½ðA 0 Þ ð1Þ k ; l ð1Þ k T ; . . . ; ½ðA 0 Þ ðMÞ k ; l ðMÞ k T g multivariate dataset in order to investigate the accuracy of a bivariate quantile distribution in approximating a joint PDF for A 0 and l.
Once the univariate generated pressure Monte Carlo data ℘ k has been generated it is then necessary to summarize them. In the case of a univariate PDF g (h) with a random variable h and output Y one particular definition of defining a quantile function Q (r) following the approach of Harris et al. is by setting the distribution function as G (h) = r where 0 r 1 such that the quantile The practical consequences of this definition is that if r is sampled from the rectangular distribution R (0, 1) then Q (r) corresponds to a sampling from g (h), [Q (p 1 ) , Q (p 2 )] is a probabilistically symmetric coverage interval for p if p 1 ¼ 1 2 ð1 À pÞ and p 2 ¼ 1 2 ð1 þ pÞ, and that the expectation and variance for Y may be calculated as EðY Þ ¼ ∫ 1 0 QðrÞ dr and V ðY Þ ¼ ∫ 1 0 ½QðrÞ À EðY Þ 2 dr respectively. For the univariate case where a quantile function distribution following the approach of Willink [16] utilizing an ELDs is implemented it has been demonstrated by Willink that the corresponding QF and ELD-QF for the PDF may be simply represented in terms of real parameters a, b, c and d. Technical details to determine the values of a, b, c and d for an ELD were discussed in full and complete detail by Willink who compared the method of using four moments, four quantiles, and a modified method using the mean, variance and two quantiles. For pressure metrology purposes the method of using four quantiles is considered relatively simple and straightforward for pressure balances with univariate effective area models such as gas operated pressure balances at low pressure as discussed by Ramnath [4]. The approach of using quantile functions such as ELDs to summarize Monte Carlo uncertainty analysis data was further developed by Harris et al. [3] where they extended the earlier work of Willink for univariate distributions of a quantity Y to summarizing an arbitrary parent probability distributions denoted as P in terms of a generalized quantile function. This approach involved the utilization of the distribution function G P (z) and PDF G P (z) where z ∈ Z ∼ P is a random variable for the parent probability distribution and h ∈ Y is a random variable for the output Y. According to this approach where h = F (z) the random variable is calculated as h ¼ F ðG À1 P ðrÞÞ where r ∈ U ∼ V. Ramnath: Int. J. Metrol. Qual. Eng. 8, 29 (2017) R (0, 1) is a random variable following a rectangular distribution and the PDF of Y is calculated as The key simplification of this approach is that the generalized quantile function defined in terms of F (z) simply requires knowledge of F (z) and a mechanism of sampling random variables from the parent distribution z ∈ Z ∼ P. In this paper we will opt for this general approach of Harris et al. since the ELD approach of Willink using four parameters to model the PDFs of the generated pressures and cross-floated areas may not necessarily be adequate. For simplicity we will use a random variable r ∈ R (0, 1) following a rectangular distribution as this is convenient with readily available random number generators. Although it is possible to use an increasing number of parameters over and above the four parameters in for example an ELD based QF through the calculation of higher order moments by working out for example the coefficients of B-spline approximations for F (z) with open source software routines [18] our approach will simply directly utilize the actual GS1 distribution functions for each of the generated pressures and crossfloated areas when constructing the joint PDF.
Earlier approaches such as that by Ramnath [19] utilized statistical linear regression analysis techniques developed by Krystek and Anton [20] for straight line data with correlation in plots of applied pressure [P ± u (P)] data versus cross-floated effective area [S ± u (S)] data with associated error bars. In this approach the generated pressure and cross-floated areas uncertainties were calculated with the classical sensitivity coefficient based GUM and the straight line fit for the curve S = A 0 (1 + lP) allowed for A 0 and A 0 l to be estimated, and which were then in turn used to estimate the distortion l and indirectly estimate the correlation between A 0 and l. In this paper our approach is different since our objective is to instead construct an approximation to the actual joint PDF g X ðjÞ with X ¼ ½A 0 ; l T so that we may directly sample random values j A 0 and j l from the underlying joint bivariate PDF similar to how univariate random variables j P may be sampled from the generated pressure PDF.
The GS2 documentation specifically focuses on multivariate Gaussian distributions x ∼ N ðm; VÞ for a model with an output x, an expected value m and an uncertainty matrix V. Explicit guidelines for sampling for such a joint PDF are provided such that a sampled draw may be estimated as . . . ; X N T and R is an upper triangular matrix formed from a Cholesky decomposition such that U x ¼ R T R. This focus is achieved through the assumptions that m ≈ỹ~and V ≈ U x , however in general it may not necessarily be the case that a multivariate Gaussian is the best choice of PDF.
Although there may be some merits in particular cases of modelling the multivariate Monte Carlo data in terms of non-Gaussian distributions the benefits of doing so may be outweighed by the necessity of selecting, deciding on and choosing alternative probability distribution information formulations as discussed by Tang et al. [21] who studied these effects in certain applications using the equivalent probability distribution information contained in appropriate choices of copulas. The theoretical approach of constructing bivariate distributions as discussed by Tang et al. for constructing the joint cumulative distribution function (CDF) and PDF of variables X and Y is possible if the marginal distributions of X say F (x) and that of Y say G (y) where x and y are corresponding random variables are both known. If a particular choice of copula function is specified then the application of Sklar's theorem as discussed by Tang et al. allows for the joint PDF f (x, y) to be constructed as fðx; yÞ ¼ F ðxÞGðyÞcðF ðxÞ; GðyÞ; uÞ ð 7aÞ cðF ðxÞ; GðyÞ; uÞ ¼ where u = F (x), v = G (y) and the copula is defined as is known as the copula density and I = [0, 1]. The copula C occasionally written as C u (u, v) where u is a fixed parameter is formally defined as a mapping from the unit square I 2 to the unit interval I. In this formulation u is a parameter associated with the choice of the copula function for the underlying data specific to the particular model data as discussed by Genest and Favre [22]. One particular example of a copula are those belonging to the Farlie-Gumbel-Morgenstern family where the copula is constrained by the Frechet-Hoeffding bounds such that These bounds are used if X and Y are not independent, whilst a choice of C = P = uv may be used if for example X and Y are fully independent where the degree of independence between the variables X and Y may be estimated in terms of either the Spearman's rho or Kendall's tau values as previously discussed. In the particular case of bivariate distributions the formal definition of a copula is that of a mapping such that Cð0; xÞ ¼ Cðx; Cð1; xÞ ¼ Cðx; 1Þ ¼ x ∀x∈I ð8cÞ 0 ½Cðb; dÞ þ Cða; cÞ À Cða; dÞ À Cðb; cÞ ∀a; b; c; d∈I; a b and c d ð8dÞ Different types of copulas are considered later in the paper when we construct the exact joint PDF from a GS2 Monte Carlo simulation and utilize various choices of copulas in approximating the actual bivariate joint PDF in order to investigate the suitability of various choices of bivariate quantile constructions of which copulas are one particular choice amongst several for modelling and summarizing a pressure balance's joint PDF. Regardless of the particular choice of copula C u (u, v) it is seen that the construction of the joint PDF essentially reduces to that of the choice of a mapping function FI 2 →I.
This choice of mapping function is necessary in order to relate how variables 0 p, r 1 may be used as inputs for the mapping function to generate the corresponding random variables consistent with the joint PDF f (x, y). As a result although the absence of guidance from the GS2 for sampling from non-Gaussian multivariate PDFs is potentially problematic this aspect is not an issue since it turns out that sampling from a bivariate quantile function based distribution may still be achieved using sampling from associated univariate PDFs and two dimensional coordinate transformation mappings.
In this paper our main motivation is to summarize the GS2 based joint PDF using a bivariate quantile smoothing spline that was originally developed by He et al. [23] where a response surface Z depends on two variables X and Y where it is assumed that the observations z ij is known at each (x i , y j ) for i = 1, … , m and j = 1, … , n such that x 1 < . . . < x m and y 1 < . . . < y n for convenience. In the original paper by He et al. it was demonstrated that the optimal solution for fitting the surface and they considered the special case where the covariates X 1 and X 2 were in the domain [0, 1] Â [0, 1] which corresponds to our particular problem for random variables 0 p, r 1.
One approach to construct the joint PDF is based on the direct use of the Markov formula as discussed by Cox and Siebert [15] where the PDF denoted as g Y ðhÞ for a model Y ¼ fðXÞ is formally defined in terms of the joint PDF denoted as g X ðjÞ as g Y ðhÞ ¼ ∫ ∞ À∞ g X ðjÞdðh À fðjÞÞ dj 1 ⋯ dj N . For our model the cross-floated area of the pressure balance is simply S = A 0 (1 + lP) so this may be formally expressed as S ∼ g S (h) so that In the above formulation g A 0 ;l ðj A 0 ; j l Þ is the unknown joint PDF which is not necessarily a multivariate Gaussian where the applied pressure is independent since in practical terms it can be varied by simply setting the working fluid pressure. As a result the PDF for the pressure can be uncoupled from the joint PDF for the area. It follows that the above continuous multivariate integral may then be approximated as a system of simultaneous equations such that Whilst the Markov formula may in principle be used to calculate the exact joint PDF it is however not generally feasible in practical terms due to finite arithmetic precision resolution errors. These issues are due to the large difference in magnitudes from the cross-floated area PDF g S (h), the joint PDF g A 0 ;l ðj A 0 ; j l Þ and pressure PDF g P (j P ) which when utilized to build up an equivalent system to solve the Markov integral equation will result in an ill-conditioned linear system of the form Ax ¼ B where the coefficient matrix A is built up in terms of the PDF g P (j P ), the known vector B is built up in terms of the PDF g S (h) and the unknown x is the values of the joint PDF g A 0 ;l ðj A 0 ; j l Þ at fixed coordinates for a chosen grid of j A 0 and j l points. Extensions beyond standard IEEE 32-bit and 64-bit accuracies to variable precision arithmetic (VPA) accuracies using Fortran/C++ based implementations to mitigate against finite arithmetic resolution errors are discussed in more detail by Bailey and Borwein [24]. Our approach in this paper is to instead use the GS2 approach for multivariate measurand models of the form hðy; xÞ ¼ 0 for an input x and output y. If x ∼ g X ðjÞ and y ∼ g Y ðhÞ where j and h are random variables then model must also satisfy hðh; jÞ ¼ 0. This means that if the model undergoes a Monte Carlo simulation then h can also be post-processed in order to determine its probability distribution as per the GS2 documentation if h can be recovered from the equation hðh; jÞ ¼ 0.
The utilization of the GS2 to determine A 0 and l where in our case h ¼ ½A 0 ; l T is slightly more complicated since the pressure balance model for the TS f (S, q 1 , … , q m ) =0 is not in the standard form where there is an explicit system of equations for the parameters A 0 and l. If we however consider the entire set of cross-float measurements as one system we may then construct a new vector equation h (y, x) =0 by employing a unweighted linear least squares which is discussed in more technical detail by White and Saunders [25].
For our purposes we will utilize the well known results for a general linear unweighted least squares problem where the function is constructed as yðxÞ ¼ P N i¼1 a i X i ðxÞ in terms of an independent variable x, and parameters a 1 , … , a N where X i (x) are specified basis functions as discussed by Press et al. [26]. Following this approach a merit function is constructed as 2 and minimized by calculating the parameter values to satisfy ∂x 2 /∂ a i = 0, i = 1, … , N. Implementing this approach to our particular problem where n P is the number of generated pressures and associated cross-floats with S = A 0 (1 + lP) then yields so that the simultaneous system of equations ∂x 2 /∂ A 0 = 0 and ∂x 2 /∂ l = 0 may then be used to implement the GS2 methodology where V. Ramnath: Int. J. Metrol. Qual. Eng. 8, 29 (2017) where in our case y ¼ ½A 0 ; l T and x ¼ ½q 1 ; . . . ; q m T as per our earlier mathematical modelling approach. In practical terms the approach used would be to sample random variables q ⭑ 1 ; . . . ; q ⭑ m in order to solve the LS equation f ⭑ (p, q 1 , … , q m ) =0 for the generated pressure, and to then use this as input to solve the TS equation f (S, q 1 , … , q m ) =0 for the cross-floated area S. As a result there will be a set of cross-floated areas S ¼ ½S 1 ; . . . ; S M T for the M Monte Carlo simulation events where each simulation event will have a known applied pressure. Since for each simulation event the solved for pressures and cross-float areas are statistically valid it follows that the model would also need to be satisfied for these values.
As a result the GS2 Monte Carlo simulation of unweighted least squares of statistically sampled values in accordance with the underlying probablity distributions then becomes equivalent to a conventional multivariate regression practice as discussed by Press et al. [26] since the sampled values are formally a statistically valid possibility based on the underlying probability distribution. This system may then be used to determine a Hyper-ellipsoidal coverage region which in our particular problem will correspond to a two dimensional region for the bivariate joint PDF.
In order to have a better conceptual mathematical understanding of the statistical definition of a quantile we first make the geometric observation that in the case of a univariate PDF g (h) that the corresponding univariate quantile Q 1D (r) function is technically a one dimensional mapping Q 1D : r→h ; s:t: r∈Rð0; 1Þ; h∈gðjÞ that transforms a random variable r sampled from a rectangular distribution into a corresponding random variable j such that the sampled value is h = g (j). We may then using this conceptual tool in this paper then use the approach of Gilchrist [27] to generalize the definition of a bivariate quantile Q 2D ([r 1 , r 2 ] T ) as a corresponding two dimensional mapping In particular if when implementing the previously discussed geometrical mapping mathematical definition for bivariate quantiles by taking h (p, r) =1 where p and r are random variables such that 0 p 1 and 0 r 1 a natural consequence which results is that the bivariate quantile may be considered to be the mapping from the unit rectangle to the corresponding surface of the joint PDF f (x, y) formally defined following the approach by Gilchrist as In the above formula J (p, r|x, y) =1/[J (x, y|p, r)] is the corresponding Jacobian of the transformation from the unit rectangle h (p, r) =1 to the joint PDF surface f (x, y).
The random variables x and y as outlined above may be assumed to be equivalent to x ≡ j A 0 and y ≡ j l for brevity later in the paper where they correspond to random variables associated with the zero-pressure area A 0 and distortion coefficient l respectively for our particular pressure balance cross-floated effective area model. If an appropriate two dimensional mapping is correctly chosen then the variables p and r may be independently sampled from a rectanguar distribution R (0, 1) and then the choice of mapping will allow for random variables to be sampled from the corresponding bivariate joint PDF. This purely mathematical transformation result is consistent with the earlier statistical based observation of the need for a choice of copula that may be used with the marginal distributions in order to construct the joint PDF. As a result of this observation we note that a bivariate quantile is not necessarily unique as there are in principle an arbitary number of mappings between the two surfaces however if we fix the type of mapping which is equivalent to a particular choice of copula then we can construct a unique associated bivariate quantile distribution. Our approach in this paper is to use the multivariate GS1 methodology to generate the actual bivariate joint distribution of the pressure balance TS g X ðjÞ where X ¼ ½A 0 ; l T and j ¼ ½j A 0 ; j l T are associated random variables sampled from a PDF corresponding to X. The actual joint PDF fðj A 0 ; j l Þ surface constructed out of the GS1 Monte Carlo simulation data is then numerically approximated with a bivariate PDF 'ðj A 0 ; j l Þ using a selection of two dimensional transformation mappings. The accuracy of the bivariate quantile distribution may then be investigated by determining how the corresponding effective area uncertainty calculated in terms of 'ðj A 0 ; j l Þ compares to the actual effective area uncertainties obtained through the actual GS1 Monte Carlo simulation cross-floating data.
The previous definition of bivariate quantiles by Gilchrist was developed in more formal mathematical statistical details in an earlier work by Chaudhuri [28] and may be formally extended to systems other than bivariate distributions with potential application to metrology problems involving higher dimensional non-Gaussian multivariate distributions. One particular model discussed by Gilchrist is the generalized circular model of form where 0 p, r 1 as previously discussed and b is an appropriate constant parameter that must be determined. In this formulation the first function x = Q x (p) is a univariate quantile function, whilst in the second function y = Q y (p, r) if x is fixed then p is in turn fixed so that the second function is actually also a univariate quantile function for y for a given choice of x. The practical consequence of choosing to model a bivariate quantile in terms of a marginal/conditional formulation is that the previously developed ELD formulation can in principle also be utilized to construct the associated level curves, however a potential drawback of a conventional marginal distribution approximation of a bivariate joint PDF is that in general only a fixed number of contour curves can be analytically constructed and hence interpolation is required in order to calculate random variables for arbitrary choices of specified values of p and r. Whilst marginal distributions for the zero-pressure area A 0 and distortion coefficient l can be constructed relatively easily using for example extended lambda based univariate quantile distributions a newer approach of fitting the actual bivariate joint PDFs with copulas is now available. In the copula approach one uses known specified marginal distributions which are relatively easy to calculate, and then optimizes parameters u ¼ ½u 1 ; . . . ; u n T associated with the particular choice of copula family C u ðu; vÞ in order more closely match the copula constructed joint PDF with the actual known joint PDF. As a result fitting copulas to joint PDFs can potentially offer considerable algebraic simplifications where for bivariate distributions the main choices that may be investigated are usually elliptical copulas and generalized Archimedean copula families. Whilst these particular copula based approaches have their own respective merits we comment that a generalized multivariate quantile function approach originally developed by Chaudhuri [28] may also be implemented for bivariate distributions. The starting point is to generate multivariate data points X 1 ; X 2 ; . . . ; X n ∈ℝ d ; 2 d∈ℤ in ℝ d which are assumed to be known through for example Monte Carlo simulations. For our purposes this Monte Carlo data will be generated with the GS2 approach as discussed earlier with hðy; xÞ ¼ 0 which will formally provide a sequence of data points y 1 ; . . . ; y M ∈ℝ 2 where M is the number of Monte Carlo simulation events and y q ¼ j . . . ; M and where we make the observation that we do not assume that m ≈ y and P ≈ U y but simply assume that y 1 ; . . . ; y M follows some underlying probability distribution which may not necessarily be Gaussian. By defining the set as B ðdÞ ¼ fu : u∈ℝ d ; kuk < 1g i.e. B (d) is an open unit ball, and by defining a functional as Fðu; tÞ ¼ ktk þ 〈 u; t 〉 where u∈B ðdÞ , t∈ℝ d where 〈⋅ , ⋅ 〉 represents the Euclidean inner product, a multivariate quantile function may then be formally mathematically defined asQ The above is a multivariate generalization for 2 d ∈ ℤ of the univariate case corresponding to d = 1 where the univariate quantile maps values of a for 0 a 1 with an associated parameter u = (2a À 1) to the one dimensional interval (À1, 1).
According to the above multivariate definition extreme quantiles would correspond to ku k ≈ 1 and central quantiles would correspond to ku k ≈ 0 respectively. An iteration algorithm in order to construct the quantile function was originally provided by Chaudhuri where for X 1 ; . . . ; X n the algorithm Step #1 is to compute for all of the corresponding data points X i for 1 i n and check if the degeneracy condition is valid for some 1 i n. If the degeneracy condition is satisfied for some i then just setQ n ðuÞ ¼ X i . Alternately if the degeneracy condition cannot be satisfied for any 1 i n then move to Step 2 by constructingQ n ðuÞ by solving the equation The above equation must be solved by using for example the method of successive approximations of which one possible approach is to use a starting solution composed of a vector of means such that where each mean is simply the mean of the respective set of components. Once this starting value has been constructed successive approximations for m = 2, 3, … may then be generated using iterations so that Q ðmþ1Þ n ðuÞ ¼Q ðmÞ n ðuÞ þ F À1 D where F ¼ X n i¼1 jX i ÀQ ðmÞ n j À1 ½I d À jX i ÀQ ðmÞ n j À2 fX i ÀQ ðmÞ n gfX i ÀQ ðmÞ n g T ð21Þ and I d is the d Â d identity matrix. Further aspects of how this definition of quantile function may be used to quantify and describe skewness and kurtosis in higher dimensional multivariate distributions is discussed in more detail in the original paper by Chaudhuri [28]. This geometric generalization of a quantile function in higher dimensional spaces may thus have some further technical potential for analysing metrology systems than cannot be modelled in terms of either univariate and bivariate probability distributions as a future topic of research study by metrologists particularly when non-Gaussian multivariate distributions may occur. In the context of pressure metrology such potential future research problems might include higher dimensional multivariate joint PDF's such as for pressure balances in free deformation mode or controlled clearance modes. Examples of higher dimensional multivariate probability distributions for pressure balances in free deformation mode operation include situations when non-linear elasticity theory is utilized and two distortion coefficient parameters are then necessary so that the effective area is modelled as A = A 0 (1 + l 1 p + l 2 p 2 ) and a joint PDF g A 0 ;l 1 ;l 2 ðj A 0 ; j l 1 ; j l 2 Þ must be constructed. On the other hand for pressure balances operated in controlled clearance mode such as when the Heydemann-Welch method is implemented for a primary standard scale realization for hydraulic pressures and the pressure balance effective area is modelled as A = A 0 (1 + a (t À t ref ) (1 + lP) [1 + d j (p j0 À p j )] as discussed by Kajikawa et al. [29,30] additional parameters such as the jacket pressure coefficient d j and zero-clearance jacket pressure p j0 are then present in which case a higher dimensional joint probability distribution g A 0 ;l;d j ;p j0 ðj A 0 ; j l ; j d j j p j0 Þ must be approximated to fully characterize a pressure balance.

Numerical simulations
In this paper we perform an analysis by utilizing the experimental data set previously reported by Ramnath [19] as indicated in Appendix A which provides full technical details and supporting data where the working fluid is assumed to be Di(2)-ethyl-hexylsebacate defined in terms of an oil equation of state reported by Kocas et al [31]. Numerical experiments were performed on a Toshiba laptop with an Intel Pentium B950 CPU operating at 2.10 GHz with 2 GB of RAM running on a Microsoft Windows 7 64-bit operating system using Gnu Octave 4.2.0. Simulations were undertaken with M = 10 000 Monte Carlo simulation events for each generated pressure and cross-floated area data point. The full GS1 simulation for ten generated pressures and cross-floats from 50 MPa to 500 MPa therefore consisted of 10 000 Â 10 = 100 000 total simulation events and took approximately 115 minutes to solve. When the GS2 was implemented by post-processing the GS1 data by using the cross-float data to fit a curve through the data points [P 1 , … , P 10 ] T and [S 1 , … , S 10 ] T in order to construct the x 2optimization for the best fit curve S = A 0 (1 + lP) the equivalent number of M = 10 000 simulation events took less than a minute to solve when using the built-in polyfit routine of Gnu Octave. As a result Monte Carlo simulations for 10 Â 10 3 M 25 Â 10 3 simulation events may be is considered a feasible option with single-core laptops/ workstations as these calculations would typically be able to performed in a single working day. On the other hand if higher counts of simulation events in the range 25 Â 10 3 M 100 Â 10 3 are required we comment that this would in most practical circumstances require access to either multicore computers or alternatively high performance computing parallel computing computers or clusters to avoid simulation times of a few weeks.
Our approach in this paper as outlined earlier in the paper first solves the underlying equations for the generated pressure f ⭑ ðp; q ⭑ 1 ; . . . ; q ⭑ m Þ ¼ 0 and cross-floated area f (S, q 1 , … , q m ) =0 using sampled random values q ⭑ and q which are drawn from the respective input PDFs which produce Monte Carlo GS1 data for the generated pressure MCP = [P 1 , … , P 10 ] and cross-floated area MCS = [S 1 , … , S 10 ] which are both M Â 10 matrices since each generated pressure is used to cross-float the TS. The Monte Carlo data of these matrices are built up in terms of column vectors of the form P k ¼ ½j ð1Þ P k ; . . . ; j ðMÞ P k T and S k ¼ ½h ð1Þ S k ; . . . ; h ðMÞ S k T where each column contains the Monte Carlo data for that particular generated pressure or cross-floated area so that the matrix data is then stored in data files for convenience. The distribution functions G P k ðj P k Þ and G S k ðh S k Þmay if necessary be calculated and summarized in terms of ELDs built up in terms of fours parameters a, b, c, and d such that h = Q (r) where and 0 r 1 is a random variable to generate the PDFs as originally discussed by Willink [16]. We comment that the generated pressures and cross-floated areas may if necessary also be calculated directly as m P k ¼ ∫ 1 0 Q P k ðrÞ dr and s 2 P k ¼ ∫ 1 0 ðQ P k À m P k Þ 2 dr where Q P k is the quantile function for pressure P k as summarized in Table 1 with similar expressions for the cross-floated areas as summarized in Table 2.
Once the expected values m P k and m S k are calculated this set of GS1 data which is consistent with the associated underlying PDFs may then be used to estimate the approximate nominal zero-pressure area m A 0 and distortion coefficient m l from the pressure versus area graph as illustrated in Figure 1a which is based on a GS1 simulation with M = 10 000 total simulation events as previously discussed. When this original GS1 data is then further processed in the GS2 simulation using the original GS1 data in the x 2 optimization to extract the values for A 0 and l it then results in the scatter plot as illustrated in Figure 1b which may then be further post-processed in order to construct the bivariate joint PDF.
An approximate visualization of the bivariate joint PDF using a kernal density estimate (KDE) approach along with visualizations of the associated marginal distributions for A 0 and l are illustrated in Figure 1c using a Gaussian kernal.
Our approach to construct the bivariate PDF uses the built-in histogram2d function from the Python numpy and scipy scientific computing libraries for convenience since we wish to fit the actual joint PDF f (x, y) where x and y are random variables. This choice of using histogram2d to explicitly build up a discrete approximation of the actual PDF with explicit bin counts that are appropriately weighted for a two dimensional normalization of the joint PDF is used over a KDE approach since the implementation of a KDE implicitly requires a choice of kernal function as discussed by Diwekar and David [32] where the PDF for univariate data X 1 ; . . . ; X n is constructed such that where h is the bandwidth and K is the kernel function. The use of a kernal function also occurs in the case of multivariate data X i ∈ℝ d with data points X 1 ; . . . ; X n where the multivariate PDF is constructed such that fðxÞ ¼ 1 and as a result a similar issue will arise if we opt to use a KDE formulation to construct the bivariate PDF since the choice of kernal used for the KDE construction will introduce a level of subjectivity in the bivariate joint PDF. Typically when implementing a KDE to construct a univariate PDF a Gaussian kernal function is usually used however other particular choices of kernals such as an Epanechnikov and biweight kernal can also be used. As a result different PDF estimates may occur when the underlying Monte Carlo data is post-processed using a KDE approach.
Since we do not have any a priori knowledge of the most appropriate underlying kernal our approach is to simply construct the actual discrete approximation of the joint PDF using the Python histogram2d function which correctly normalizes the two dimensional PDF such that ∫ ∞ À∞ ∫ ∞ À∞ fðx; yÞ dx dy ¼ 1 where in our particular problem the random variables are chosen such that x ≡ j A 0 =½m 2 and y ≡ j l /[Pa À1 ]. This is performed by using the well known Freedman-Diaconis statistical rule to estimate the bandwidth as where IQR (x) = Q 3 (x) À Q 1 (x) is the interquartile range for the the random variable x where Q 3 (x) and Q 1 (x) are the third and first quartiles. The number of bins h x associated with x using this choice of bandwidth is then calculated using the ceiling function as with similar associated expressions for the other random variable y. The final actual bivariate joint PDF from the GS2 simulation is illustrated in Figure 2 in natural physical units.
Referring to the two dimensional discrete histogram we observe that the limits for a joint PDF g A 0 ;l ðj A 0 ; j l Þ are approximately minðj A 0 Þ ¼ 1:9612 mm 2 , maxðj A 0 Þ ¼ 1:9617 mm 2 , min(j l ) =0.30 ppm/MPa and max(j l ) =1.10 ppm/M. In this joint PDF the random variable for area j A 0 is in units of m 2 so that when the pressure is in units of Pa the random variable for the distortion coefficient j l is in units of Pa À1 for dimensional consistency in order to satisfy the normalization condition ∫ ∞ À∞ ∫ ∞ À∞ g A 0 ;l ðj A 0 ; j l Þ dj A 0 dj l ¼ 1. For brevity let the random variable j A 0 be represented by x and the random variable j l be represented by y so that the joint PDF fðj A 0 ; j l Þ is just f (x, y) for convenience. If we construct variables such that the original random variables are mapped to equivalent scaled variables such that 0 p 1 and 0 r 1. As a result the original joint PDF g A 0 ; ðj A 0 ; jÞ may now be defined in terms of transformed random variables p and r. Carlo data of the cross-floated pressures P k and cross-floated areas S k is first pre-processed by averaging and plotted in order to get a rough qualitative estimate of the model parameters. In (b) the actual GS1 data points are then processed with the GUM Supplement 2 (GS2) using a suitable x 2 optimization to extract the actual model parameters and the discrete model parameter values plotted. In (c) the discrete model parameters are then post-processed using a kernal density estimation scheme in order visualize the mathematically continuously variable joint bivariate probability density function behaviour of the underlying model.
The normalization of variables is utilized due to scaling effects since it will be generally be more convenient to fit the normalized joint PDF such that ' (p, r) = [f (x, y)]/[max (f (x, y)]. As a result by fitting the normalized parameters 0 ' (p, r) 1 with 0 p 1, 0 r 1 and f max = max {f (x, y)} the scaled values can then be recovered such that fðx; yÞ ¼ f max 'ðp; rÞ ð 27cÞ This fit may be implemented using the marginal distributions form of the generalized bivariate quantile distributions for the zero-pressure area designated as g ⭑ A 0 ðj A 0 Þ in Figure 3a and for the distortion coefficient designated as g ⭑ l ðj l Þ in Figure 3b respectively. We comment that due to the shape of the PDFs for these marginal distributions as observed from the histogram for A 0 in Figure 3c and the histogram for l in Figure 3d respectively which are constructed with a Freedman-Diaconis choice of bin size that it would generally be convenient and reasonably accurate to approximate these PDFs in terms of either univariate ELDs as summarized in Table 3 or alternatively in terms of univariate splines where the degree of interpolation can be adjusted to refine the desired accuracy level.
The approach of sampling points x and y from a marginal distribution approximation of a joint PDF f (x, y) is to first generate a point p using a rectangular random number generator distribution which can then be used to calculate x from the corresponding marginal distribution Q x (p). Next by using this known value of x we solve for y from Q y (p, r) using the previously determined value of p however a practical implementation scheme is needed to construct the marginal distribution for Q y (p, r). In this paper for simplicity we opt to perform this implementation by constructing a set of functions for a set of contour curves for specified known values of x so that any arbitrary values can be estimated using for example bilinear or bicubic interpolation for simplicity.
A conventional classical marginal distribution approximation of the joint PDF may therefore be specified by the following three steps where a convenient functional form would in most practical cases be that of a spline fit such as for example a B-spline. The general algorithm would then implement the following general steps such that: Step 1. Fit the marginal distribution Q p (p) for the scaled variate p with any convenient practical functional form such that Q p ðpÞ ¼ Mðp; aÞ where Mðp; aÞ is a marginal distribution function and a ¼ ½a 1 ; . . . ; a m is a corresponding fitted parameter Step 2. Construct a set X = {p 1 , … , p m } for some finite integer m ∈ ℤ where 0 = p 1 < p 2 < . . . < p mÀ1 < p m = 1 Step 3. For each p i ∈ X extract the corresponding contour curve C i = ' (p i , r) from the scaled joint PDF ' (p, r) and fit any convenient practical functional form for each curve C i such that As a result the quantification and specification of the parameter a for the marginal distribution Q x ðp; aÞ and the set of parameters b 1 . . . b m for the contour curves for the for easy visualization so that the joint PDF has the correct magnitude for a two dimensional normalization such that ∫ ∞ À∞ ∫ ∞ À∞ g A0;l ðj A0 ; j l Þ dj A0 dj l ¼ 1 when constructed from the two dimensional histogram data using N A0 ¼ N x ¼ 63 and N l = N y = 58 bins respectively with M = 10 000 Monte Carlo simulation events. In (a) the actual discrete data of the bivariate joint probability density function is first plotted for a qualitative visualization. In (b) the discrete data is post-processed to produce the mathematically continuous joint probability density function in a three dimensional space where each space coordinate corresponds to a random variable. In (c) the joint probability density function is normalized in order to more clearly view the behaviour of the underlying model. Fig. 3. Monte Carlo GUM Supplement 2 marginal distributions. In (a) the probability density function distribution for the continuous random variable j A0 of the zero-pressure area is visualized. In (b) the probability density function distribution for the continuous random variable j l of the distortion coefficient is visualized. In (c) the underlying discrete data from the GUM Supplement 2 analysis that is used to build the PDF for the zero-pressure area is shown. In (d) the underlying discrete data from the GUM Supplement 2 analysis that is used to build the PDF for the distortion coefficient is shown Table 3. Marginal distribution ELD parameters for zero-pressure area A 0 and distortion coefficient l variates. The underling probability density function distribution for the zero-pressure area random variable j A 0 and the distortion coefficient random variable j l is modelled and approximated with an extended lambda distribution based on the discrete data obtained from the GUM Supplement 2 uncertainty analysis. In order to clarify how to sample random variables j A 0 and j l from the marginal distribution approximation of the joint PDF g A 0 ;l ðj A 0 ; j l Þ suppose for illustration purposes that we wish to generate a random variable of the zeropressure area j A 0 as x i for p i = 0.7 and a random variable for the distortion coefficient l as y i for r i = 0.82. We can immediately calculate x as x i = Q x (0.7) but it will be more difficult to calculate y i since in general we would have a set of contour curves for fixed values of p, say p ∈ [0.0, 0.25, 0.5, 0.75, 1.0]. Selecting the nearest p ℓ that is smaller than p i i.e. p ℓ = 0.5 and the nearest p u that is larger than p i i.e. p u = 0.75 allows us to calculate the corresponding y for p ℓ as y ℓ = Q y (p ℓ , r i ), and that for p u as y u = Q y (p u , r i ). Then these data points can be used to interpolate for the required y i corresponding to r i = 0.82 using linear interpolation so that y i = y ℓ + [(y u À y ℓ ) (p i À p ℓ )]/[p u À p ℓ ].
This particular illustrative example for how to perform a two dimensional interpolation to calculate samples of random variables from the underlying joint PDF is mathematically equivalent to a bilinear interpolation, although in principle any particular interpolation scheme such as for example bicubic interpolation may also be used to obtain smoother interpolated fit values. The marginal distribution approach is illustrated by comparing the actual bivariate surface in Figure 4a with the surface approximation as shown in Figure 4b which demonstrates how a set of contour curves can be used to approximate the two dimensional joint PDF surface. In this figure there are a total of N x = 60 curves for evenly spaced values of the random variable x ¼ j A 0 from 1.9612 mm 2 to 1.9617 mm 2 and the y ¼ j l varies from 0.3145 ppm/MPa to 1.1581 ppm/MPa in evenly spaced values using a total of N y = 75 points. Whilst the marginal distribution approach for constructing a bivariate PDF is theoretically possible it does not unfortunately offer a convenient and practical means of modelling and summarizing a bivariate PDF for reporting purposes such as in a pressure balance calibration certificate or inter-comparison report.
An alternative to the marginal distribution approach for arbitrary specifications of p i and r i without any need for additional interpolations in order to immediately directly sample random variables from the joint PDF is to formulate the surface fitting of the joint PDF as a scattered data interpolation problem. This problem which is common in the field of statistics is formally specified as that given a set of discrete sampled data points ðx j ; y j Þ for j = 1, 2, … , N with x j ∈ℝ s and y j ∈ ℝ where 0 < s ∈ ℤ is the dimension of the data, then find a continuous function P f such that P f ðx j Þ ¼ y j ∀j∈½1; 2; . . . ; N. A particular well-posed solution to the scattered data interpolation problem is to use the distance matrix formulation such that Fig. 4. Illustration of comparison between the actual bivariate joint PDF surface with that of a conventional classical marginal distribution approach approximation showing the discrete set of contour curves visualized by three dimensional lines that are used to construct joint PDF surface, and that of a scattered data approximation of the corresponding normalized bivariate surface. In (a) the actual continuous probability density function distribution is shown which must be modelled and approximated through a visualization in a three dimensional space where the x coordinate corresponds to the random variable j A0 , the y coordinate corresponds to the random variable j l and the z coordinate corresponds to the joint PDF g A0l ðj A0 ; j l Þ. In (b) the modelling and approximation is shown where the three dimensional surface is approximated by a series of three dimensional lines within the three dimensional space. In (c) the modelling and approximation is visualized for the equivalent surface that is built up from discrete three dimensional scattered points.
where k ⋅ k 2 is the ℓ 2 -norm defined as kxk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi The coefficients c k for k = 1, 2, … , N where N is the total number of known discrete points fx j ; y j g is obtained by solving the linear system specified as We comment that the scattered data interpolation solution using the distance matrix formulation can be refined for an arbitrarily large number of supplied known data points. A useful benefit of this approach is that it is not strictly necessary to first normalize and scale the random variables to fit x ¼ ½p j ; r j T and y j = ' (p j , r j ) for j = 1, … , N since the scattered data interpolation scheme can be applied directly with x j ¼ ½ðj A 0 Þ j ; ðj l Þ j T and y j ¼ fððj A 0 Þ j ; ðj l Þ j Þ. Nevertheless although a normalization is not theoretically necessary in our simulations we opt to apply the fit to the normalized variables 0 p 1, 0 r 1 and 0 ' (p, r) 1 to avoid ill-conditioning issues when solving the matrix equation Dc ¼ b. Results for a scattered data interpolation scheme using the same underlying data is illustrated by comparing the actual bivariate surface as previously shown in Figure 4a with the scattered data surface approximation as shown in Figure 4c. Whilst this construction of the conventional scattered data two dimensional joint PDF is better than that of a marginal distribution approach which is restricted by the fixed number of discrete contour curves chosen, the main issue is that the number of constants used to construct the surface is still very large since in this case there are N = 3534 constants c 1 , … , c N for the underlying discrete data points fx j ¼ ½p; r T ; y j ¼ 'ðp; rÞg; j ¼ 1; . . . ; N. As a result the main issue with a scattered data interpolation scheme apart from the large number of constants to approximate the surface in our particular physical pressure balance metrological system, is that although it is guaranteed to adequately model the underlying data that the associated system of equations is usually ill-conditioned.
Alternative meshless techniques that avoid ill-conditioning encompass the radial basis function (RBF) approach as discussed by Larsson and Fornberg [33] which may be considered a generalization of the scattered data interpolation scheme P f ðxÞ ¼ P N k¼1 c k B k ðxÞ. In this approach the constants are calculated by solving the matrix equation Ac ¼ y where A ij = f (e, r) where f = r 2 log(r) in the case of a thin plate spline or where f = exp(À (er) 2 ) in the case of a Gaussian RBF where r ¼ kx j À x i k and c ¼ ½c 1 ; . . . ; c N T as previously discussed, however the same issue of a large number of constants is still present.
This problem again occurs when low order spline approximations of bivariate functions are considered where the bivariate function is traditionally approximated as a tensor product spline surface of the form where K is an integer corresponding to the order of the approximation, s k are constants, and u k (x) and v k (y) are univariate functions each with their own individual corresponding parameters particular to the characteristics of the respective basis functions as recently developed by Georgieva and Hofreither [34]. The main issue with such a traditional low rank approximation in our particular problem is that although it can very accurately model and generate a reconstructed surface for suitable choices of s k , u k (x) , v k (y) and K, is that in general a very large number of coefficients are generally still necessary in order to adequately reconstruct the basis functions u k (x) and v k (y). Whilst a large number of coefficients is not an issue in a computer implementation the reporting of a large number ranging from a few hundred to a few thousand coefficient values is nevertheless however not considered practical for a calibration paper certificate or written intercomparison report. Due to these practical implementation issues we therefore consider a copula approach C u (u, v) as previously discussed in order to construct the bivariate distribution for our particular physical metrological measurand system since for many practical copula families only a small number of parameters are required in the term u ¼ ½u 1 ; . . . ; u n T . As a result due to generally small number of parameters to adequately characterize bivariate PDF distributions this renders the reporting of the physically characterized pressure balance joint PDF in calibration certificates and inter-comparison reports a practical and feasible alternative for oil based pressure balances.
In this paper for brevity we restrict our numerical investigation to the normal, Student-t, Gumbel, Frank and Clayton bivariate copulas C u (u, v) as summarized in Table 4 where the respective copula parameter u is obtained from the Kendall tau parameter t K as previously discussed. When constructing the copula we use x as the random variable for the zero-pressure area A 0 and y as the random variable for the distortion coefficient l in order to construct the joint PDF f (x, y) such that fðx; yÞ ¼ uv ∂ 2 Cðu; vÞ ∂u∂v ð31Þ In the above formulae g x (j x ) is the marginal PDF for x as previously approximated with an ELD as illustrated in Figure 3a, whilst g y (j y ) is the marginal PDF for y as illustrated in Figure 3b both of which are fully parame-trized in terms of respective univariate quantile parameters a, b, c, d. By exploiting the properties of ELDs as previously discussed by Willink [16] the distribution function formally defined as F ðxÞ ¼ ∫ x À∞ fðuÞ du for the random variable x with a similar expression for the random variable y may in the special case of an ELD approximation be simplified as As a result if the marginal distribution for the zeropresssure area A 0 random variable x has ELD parameters a x , b x , c x , d x , and the marginal distribution for the distortion coefficient l random variable y has ELD parameters a y , b y , c y , d y it follows that we can immediately utilize the previous analytical expressions in the copula construction for the bivariate joint PDF in terms of the ELD parameters. In order to construct a copula function C u (u, v) for the formulae listed in for example Table 4 we must in the case of the Gaussian copula use F r (•) which denotes the bivariate standard normal distribution with correlation r whilst F À1 (•) denotes the inverse standard normal distribution. On the other hand for the bivariate Student-t copula t r,n denotes the bivariate t-distribution with parameters r and n where the degrees of freedom n must typically be calculated using the Akaike Information Criterion (AIC) where the smallest AIC value (which depends on n) is considered to be the best-fit copula choice, and t À1 n ð•Þ is the corresponding inverse t-distribution value for the associated degrees of freedom n parameter for the copula. Although the conventional t-distribution copula usually has one degree of freedom parameter n more modern alternatives can incorporate multiple degrees of freedom parameters n 1 , … , n n for bivariate models as discussed in Luo and Shevchenko [35].
As a result whilst it is certainly possible to estimate bivariate copula parameters u ¼ ½u 1 ; . . . ; u n T directly with custom written computer code through various parameter optimization approaches for a range of copula families of which Table 4 is only a small selection of the very extensive range of possible copula families this is no longer strictly essential. This is due to the fact that Yan [36] developed a R based package copula which is now readily available to researchers worldwide, and which has subsequently been expanded by Kojadinovic and Yan [37] for multivariate distribution modelling using copulas. As a result the use of R based open source statistical software from the Comprehensive R Archive Network is now commonly available to researchers worldwide and is accepted as a standard statistical tool within the statistics community. In this paper we will therefore utilize the copula [38], copBasic [39], and VineCopula [40]. R packages to simplify the analysis in order to construct the best choice of a copula function for our underlying pressure balance Monte Carlo bivariate data.
The use of the above-mentioned R packages allows us to simply load the bivariate data from our post-processed GS2 Monte Carlo simulations and to then implement the following computer code using the VineCopula library which automatically selects the most appropriate bivariate copula function drawing on a selection of copula families more extensive that the most common families previously outlined in Table 4. When performing the copula selection to avoid numerical ill-conditioning due to scaling effects since Oðj A 0 Þ ¼ 10 À6 and Oðj l Þ ¼ 10 À12 in SI units it may depending on the available computer systems be advantageous when fitting the copula function to utilize the variate data j A 0 for A 0 and j l for l such that they are first converted so that x is in mm and y is in ppm/MPa. We comment that the choice of units is not problematic since our main objective is to model the bivariate joint PDF and the form of the fit should simply allow us to adequately sample values for the A 0 and l variates x and y. Further technical implementation details for the VineCopula library are available in the official documentation however we briefly comment that the user would in most practical situations select from ml for maximum likelihood, mpl for maximum pseudo-likelihood, itau for inversion of Kendall's Table 4. Specification of common parameter based bivariate copula families (Adapted from Goda [17]). In our particular problem the variable u corresponds to the random variable j A 0 and the variable v corresponds to the random variable j l so that the bivariate copula C (u, v) can be used to construct the joint probability density function as g A 0 ;l ðj A 0 ; j l Þ ¼ uv ∂ 2 C ∂u∂v for convenience.

Copula name
Copula function C (u, v) Copula parameter relationship with Kendall's tau Student-t t r;n t À1 n ðuÞ; t À1 n ðvÞ r ¼ sin tau, or irho for inversion of Spearman's rho as choices for the parameter optimization choice for the software code to determine the optimal copula parameter fit. The process to fit the bivariate data starts with the GS2 Monte Carlo data GUMSupplement2.txt that is obtained with any suitable computer code simulation environment that is saved in a neutral ASCII txt file format. Once the data is saved it can then be processed to firstly extract the univariate ELD quantile parameters in for example GNU Octave for the marginal distributions, and then secondly processed in RStudio using the VineCopula library to determine the optimal bivariate copula and associated copula parameters as shown in Figure 5.
Results for the univariate marginal distribution parameter fits with GNU Octave v4.2.0 by running computer program CalcMarginalParameters.m with the m-code shown in Figure 6 are shown below in Listing 1.
Results for the bivariate copula parameter fits with RStudio by running computer program CalcCopulaParameters.r are shown below in Listing 2.
The above computer code console outputs are actually sufficient to fully characterize the ELD marginal distribution and copula bivariate distribution parametrizations. In order to construct the joint PDF the second order mixed partial derivative of the copula C (u, v ; u) must be calculated in order to calculate the copula density c (u, v ; u). Further technical implementation details to directly numerically construct the joint PDF from a general copula that is not amenable to closed form analytical expressions of the copula density are outlined in Appendix B.
For the particular case of a Gaussian copula Meyer [41] provides explicit analytical closed form expressions such that for the univariate case the cumulative distribution function F À1 (•) is defined as Fig. 5. Illustration of how a GUM Supplement 2 (GS2) Monte Carlo uncertainty analysis would be implemented in practice for constructing the marginal distribution and copula distribution parameters. In (a) an extract of the actual M = 10000 data points is shown from the GS2 which were determined with a suitable x 2 optimization. In (b) the computer code implementation of how to process the GS2 data to fit the marginal distributions in terms of an extended lambda distribution is shown. In (c) the computer code implementation of how to determine the copula parameters using open source easily accessible standard statistical software is shown.  The univariate standard normal CDF F (h) is then used to construct the bivariate standard normal CDF F 2 (a, b) such that When the above expressions are combined we then obtain the final expression for a Gaussian copula formulation of the bivariate joint PDF as fðx; yÞ ¼ F ðxÞGðyÞcðu; v; rÞ ð 39Þ In our particular case since the optimal copula is a Gaussian we simply specify the copula as Cðu; vÞ ¼ F 2 ðF À1 ðuÞ; F À1 ðvÞ; rÞ ð 40aÞ The above copula function informations is then completed by also specifying the associated quantile function parameters for F (x) = u and G (y) = v which using our ELD approximation is of the form The above set of four equations for the copula function C (u, v ; u), marginal distributions F (x) and G (y), and parameters associated with the respective marginal distributions {a x , b x , c x , d x } and {a y , b y , c y , d y } are then actually sufficient to model the bivariate joint PDF for the TS pressure balance, by noting the ELD parameter limits are by definition Once the joint PDF has been summarized using f (x, y) = F (x) G (y) c (u, v) with u = F (x) and v = G (y) as indicated above, it can then be used in any further uncertainty analysis numerical simulations for the TS pressure balance. As an example we would in a practical physical pressure balance calibration like to sample random variables j A 0 and j l from g A 0 ;l ðj A 0 ; j l Þ in order to calculate the uncertainty in generated pressure. The general numerical procedure we propose in this paper adapted from the univariate sampling procedure for arbitrary distributions is as follows for bivariate distributions: Step 1. Sample a random variable r x ∼ R from the rectangular distribution so that 0 r x 1 Step 2. Solve the equation F (x ⭑ ) = r x in order to deduce a value for x ⭑ and then simply set Step 3. Construct the contour curve CðyÞ ¼ f X ðx ⭑ ; yÞ by fixing x = x ⭑ in the joint PDF f (x, y) and normalize as appropriate Step 4. Build the CDF such that HðyÞ ¼ ∫ y À∞ CðhÞ dh and normalize as appropriate Step 5. Sample a random variable r y ∼ R from the rectangular distribution so that 0 r y 1 Step 6. Solve the equation H (y ⭑ ) = r y in order to deduce a value for y ⭑ and then simply set j l = y ⭑ If the sampled variates are x ⭑ and y ⭑ , and r x and r y are sampled rectangular random numbers then the above procedure can be summarized as solving the following system of equations such that We comment that the above bivariate sampling scheme is more general than a traditional bivariate Gaussian sampling scheme since it is not restricted to a conventional Gaussian bivariate PDF, and can sample from arbitrary bivariate joint PDFs since the approach will work for arbitrary marginal distributions and arbitrary bivariate copulas. As a result the benefit of using a copula to model a bivariate distribution over a traditional GS2 based Gaussian joint PDF is that one is no longer specifically restricted to bivariate Gaussian distributions as is presently the case according to the official GS2 guidelines.
With the application of copulas as a particular modelling approach for bivariate quantile distributions one may therefore now in principle utilize any other alternative physical or mathematically plausible choice of probability distributions modelled in terms of copulas to summarize bivariate Monte Carlo uncertainty analysis simulation data as outlined in Appendix B.
We can now test the extent to which our bivariate joint PDF for the TS pressure balance can estimate the generated pressures. This is performed by implementing a GS1 simulation of a generating pressure model for the TS pressure balance as where ½j A 0 ;j l T ∼ g A 0 ;l ðj A 0 ; j l Þ is sampled using our bivariate joint PDF modelling approach using the temperature compensation function f (t, When this Monte Carlo simulation for the TS pressure balance is performed using the data in Appendix A and the Monte Carlo generated pressure data is obtained, then this generated pressure data for each of the cross-floated pressure points 50 MPa, … , 500 MPa may be compared to the actual generated pressures from the LS pressure balance. Numerical results were performed using a direct simulation single language implementation in GNU Octave with the above generalized sampling scheme for x ⭑ and y ⭑ as shown in Table 5. These results were obtained on a computer workstation using an Intel Xeon E5-1650 v3 CPU running at 3.50 GHz with 32 GB of RAM for M = 500 Monte Carlo simulation events by directly sampling from the bivariate joint PDF by solving the respective equations for x ⭑ and y ⭑ . Alternatively in the special case where the analytical expression for the copula density c (u, v) is explicitly known beforehand then a mixed language approach using R and Octave can offer considerable computational time savings. This approach involves first using R to sample from the Gaussian copula, or any other suitable copula family available from the RStudio available copula library that adequately approximates the bivariate data Kendall tau value, with the specified parameter value u and then saving the marginal distribution values u and v to a text file as indicated below in Listing 3.
The R computer code shown in Listing 3 writes the u and v values that are consistent with the specified copula density function c (u, v) to a text file CopulaDataUV.txt which is an array of dimension 500 Â 2 where the first column is the u data and the second column is the v data, whilst the corresponding copula density c (u, v) is saved to the text file CopulaDataC.txt. Afterwards the corresponding variate data x and y can then be simply recovered by using one dimensional interpolations from the previously generated ELD based univariate marginal distributions u = F (x) and v = G (y). This mixed language process for n = 10 000 samples using RStudio and GNU Octave Table 5. Extended lambda distribution (ELD) parameters for transfer standard pressure balance generated applied pressures P k /[Pa] using bivariate copula joint PDF. The index k corresponds to the pressures P k for 50 MPa through to 500 MPa and the parameters a k , b k , c k and d k can then be used to construct analytical expressions for the probability density function distribution for each of the P k pressures. takes approximately 85.36 s to generate, post-process and then recover the variate j A 0 and j l sampled data on the previously mentioned Toshiba laptop. As a result the numerical simulations to test the bivariate statistical sampling approach demonstrates that a copula based bivariate quantile joint distribution sampling scheme is computationally feasible for both single computer and well as mixed computer language implementations when used as inputs in subsequent Monte Carlo uncertainty simulations.
In order to verify and validate (V&V) our proposed bivariate joint PDF modelling approach in terms of the standard quality engineering V&V methodology we must compare and contrast these numerical results which utilize the bivariate statistical sampling of the pressure balance',s zero-pressure area and distortion coefficient values with the exact Monte Carlo numerical results previously summarized in Table 1. We can implement this comparison by determining the normalized errors E n defined as for all our generated/cross-floated data points between the LS which generates the known applied pressures and the TS which is cross-floated against the LS. According to this approach x ref would correspond to the LS generated applied pressure P are necessary noting that we have used ELDs to summarize the respective Monte Carlo data for both the LS and TS. We recall that the expected value m and standard uncertainty s for an ELD are specified as As a result knowledge of the ELD parameters immediately gives us the values of the expected values m k for the LS and TS but we do not have direct knowledge of the respective expanded uncertainties which must be calculated from the ELD based distribution function. In general for a specified confidence level p, say p = 0.95 corresponding to a 95% confidence level, for a possibly non-symmetric and/or skew PDF g (h) a value of a is first calculated by minimizing [G À1 (p + a) À G À1 (a)], however if symmetry with an absence of skewness is assumed as a simplification then a ¼ ð1ÀpÞ 2 . Under these circumstances for a measurand y with a distribution function G (h) the confidence interval for the specified confidence level is just Y min = G À1 (a) and Y max = G À1 (p + a). Consequently we may then just approximate the expanded uncertainty as When the above V&V formulation is implemented it results in the data summarized in Figure 7 using M = 500 Monte Carlo simulation events by sampling from the bivariate joint PDF. Referring to the normalized errors E n it is observed that À1 E n 1 for all the applied pressures 50 MPa, … , 500 MPa. As a result of these numerical simulations we therefore conclude that the proposed method of using bivariate quantiles with ELDs for the marginals with parameter based optimized copulas is mathematically and statistically consistent for pressure balance calibrations and intercomparisons at a primary scientific metrology standards level. In this paper we have investigated the feasibility and utility of extending and applying quantile functions to systems which naturally exhibit bivariate PDF's and considered the particular case of oil pressure balances where the area takes the form A = A 0 (1 + lP) and knowledge of the bivariate joint PDF g A 0 ;l ðj A 0 ; j l Þ in terms of the zero-pressure area j A 0 and distortion coefficient j l random variables is necessary for elevated pressures. Our research approach involved the implementation of a GS2 multivariate uncertainty analysis of a TS pressure balance where we mathematically formulated an approach to post-process the original Monte Carlo data in order to recover the underlying j A 0 and j l bivariate random variable statistical data for further analysis.
Numerical simulations were then considered and performed for a variety of mathematical modelling approaches in order to study how to adequately model, summarize and reconstruct the underlying bivariate statistical uncertainty analysis data. Based on these numerical simulations we sought to investigate the extent to which a combination of univariate ELD quantile functions for the marginal distributions u = F (x) and v = G (y) of x ¼ j A 0 and y = j l respectively when coupled with a suitable bivariate copula family C u (u, v) selection with an optimized copula parameter u would be sufficient for pressure calibrations and inter-comparisons.
Results of generated pressures for a variety of conditions were then considered and analysed in order to perform bench-mark studies using conventional metrology statistical tests. When these results were analysed to assess the level of necessary verification and validation measures for the method proposed in this paper it was concluded that a bivariate quantile distribution of a pressure balance PDF of the form g A 0 ;l ðj A 0 ; j l Þ ¼ uvcðu; vÞ is indeed sufficient to accurately model and summarize a pressure balance's metrological characteristics at a primary standards scientific metrology level. As a result we conclude that the measurement modelling technique proposed compares very favourably to an exact GS2 UQ analysis and may thus offer benefits to pressure metrologists at national metrology institutes involved in high accuracy client calibration work and in participation in inter-comparisons at a primary scientific standards metrology level.
A potential future topic of metrology research in the area of uncertainty analysis for pressure balances is therefore the possible application of vine copulas for the modelling of multivariate uncertainty analysis data for summarizing GS2 Monte Carlo for higher dimensions. Alternatives to vine copulas are empirical copulas which are copulas note defined in terms of a parameter u ¼ ½u 1 ; . . . ; u n T but which are defined directly in terms of the underlying data. In the case of two dimensional variates [x i , y j ] T the bivariate empirical copula when there are n variates is defined as C n i n ; j n ¼ def pairs ðx; yÞ s:t:x x i and y y j n ð51Þ The above working definition is usually expressed mathematically as Different types of implementations for an empirical copula are possible as discussed in the copBasic package [39] and include the Hazen, Weibull and Berstein empirical copulas. More recent recent research in the area of empirical copulas by Segers et al. [42] using Berstein polynomials concluded that beta empirical copulas can fully meet the formal mathematical specifications for a copula and that these offer very good performance in terms of estimating bias and variance.
As a result these types of copulas are potentially attractive choices for metrologists for future UQ research in terms of modelling and summarizing higher dimensional . Verification and validation of proposed method using normalized errors of applied pressures demonstrating that kE n k 1 ∀ n ∈ [1, … , 10] laboratory standard pressure balance generated pressures and transfer standard pressure balance cross-floated area measurements. The pressures P 1 through to P 10 correspond to pressures from 50 MPa through to 500 MPa at 50 MPa steps and the E n values for each of these pressures indicates the corresponding normalized error. When calculating the E n value for each pressure the known reference pressure P ref is determined from the laboratory standard pressure balance whilst the calibrated pressure P cal is determined from the transfer standard pressure balance using the bivariate copula that models and summarizes the information of the calibrated pressure balance's zero-pressure area A 0 and distortion coefficient l. Due to the fact that all of the E n values are smaller than unity this then means that the results and uncertainties that were calculated for the calibrated pressure balance when tested against the reference pressure balance are statistically consistent and hence provides proof that the proposed method of using copulas to model and summarize bivariate probability density functions has been verified and validated.
PDFs where multivariate Gaussian PDFs as per the GS2 are problematic due to higher dimensional probability distribution asymmetries and skewness characteristics.

Implications and influences
The main implication of this paper is that we have demonstrated that quantile functions may be used to accurately and completely model the bivariate joint probability density distribution function for a pressure balance's effective area in terms of its zero-pressure area A 0 and distortion coefficient l. As a result the specification of nine numerical parameters, four for each of the marginal distributions plus one for the bivariate copula parameter, now enables pressure metrologists to simply summarize pressure balance bivariate PDF information in calibration certificates and inter-comparison reports, and to have an increased level of confidence in the behaviour and uncertainty of their oil pressure balance laboratory primary standards at elevated pressures which was previously limited due to the complexity of incorporating the uncertainty in the distortion coefficient.
Based on the results reported in this paper the wider influences which are now possible is an increased awareness of the utility of multivariate higher dimensional uncertainty analysis with the GS2, and how copulas may now be used to simply and conveniently summarize higher dimensional uncertainty analysis results in high accuracy calibrations and scientific metrology inter-comparisons in other metrology areas and fields of work.
This work was performed with funds provided by the Department of Higher Education and Training (DHET) on behalf of the South African government for research by public universities.

Appendix B: Parameter fits of specific copulas
To fit specific types of copulas consider for example explicitly using a Clayton distribution. The corresponding R computer code to fit a Clayton copula and extract the associated parameter values would be: The parameter value u associated with a user's particular choice of copula family may then be obtained by simply printing out the value as When the respective fits are completed for a particular choice of copula family we could then compare the Kendall's tau values for a particular fit with that of the optimal copula fit in order to determine how closely the chosen fit matches the actual data correlation by using the following code: 24 V. Ramnath 3. Boolean values to determine pressure balance laboratory standard and transfer standard operating conditions (weights for pressure generation).
In our particular case using Kendall's tau as an indication of the variate correlation from the above VineCopula library computer results we observe that the Clayton copula approximation diverges from the optimal Gaussian copula and therefore it would not be beneficial to use a Clayton's copula to model the pressure balance joint distribution. If simulations are performed on a 64-bit computer using double floats then the practical numerical precision is usually 16 digits unless variable precision arithmetic (VPA) libraries are used as previously discussed.
In the event that the copula function determined either from an optimal fit using the VineCopula optimal fit routine or a custom copula fit based on physical or mathematical requirements is too complicated to algebraically differentiate then a numerical differentiation of a function u (x, y) may be performed so that ∂ 2 uðx i ; y j Þ ∂x∂y ≈ 1 4ðDxÞðDyÞ Â ½u iþ1;jþ1 À u iþ1;jÀ1 À u iÀ1;jþ1 The copula density may then be explicitly calculated directly from the analytical copula function C(u, v ; u) by using the above numerical approximation so that cðu; v; uÞ ¼ ∂ 2 Cðu; v; uÞ ∂u∂v As a result as long as an analytical algebraic expression for the copula function C (u, v ; u) is specified along with the respective value for the parameter u in for example a calibration certificate or regional/international intercomparison report, then the joint PDF distribution information for the measurement instrument is completely specified through the reported copula function and appropriate parametrizations for the associated marginal distributions.
Extensions to non-parametric copulas in terms of empirical copulas are possible using the R package copBasic [39]. More recent research in the area of empirical copulas tends to favour an empirical beta copula as discussed by Segers et al [42] due to certain statistical technical reasons, and the use of empirical beta copulas may therefore in principle be considered for utilization by metrologists for modelling higher dimensional joint probability density distributions where multivariate Gaussian probability density functions are problematic.