Comparison of straight line curve fit approaches for determining parameter variances and covariances

Pressure balances are known to have a linear straight line equation of the form y = ax + b that relates the applied pressure x to the effective area y, and recent work has investigated the use of Ordinary Least Squares (OLS), Weighted Least Squares (WLS), and Generalized Least Squares (GLS) regression schemes in order to quantify the expected values of the zero-pressure area A_0 = b and distortion coefficient λ = a/b in pressure balance models of the form y = A_0(1 + λx). A limitation of conventional OLS, WLS and GLS approaches is that whilst they may be used to quantify the uncertainties u(a) and u(b) and the covariance cov(a, b), it is technically challenging to analytically quantify the covariance term cov(A_0, λ) without additional Monte Carlo simulations. In this paper, we revisit an earlier Weighted Total Least Squares with Correlation (WTLSC) algorithm to determine the variances u²(a) and u²(b) along with the covariance cov(a, b), and develop a simple analytical approach to directly infer the corresponding covariance cov(A_0, λ) for pressure metrology uncertainty analysis work. Results are compared to OLS, WLS and GLS approaches and indicate that the WTLSC approach may be preferable, as it avoids the need for Monte Carlo simulations and additional numerical post-processing to fit and quantify the covariance term, and is thus simpler and more suitable for industrial pressure calibration laboratories. A novel aspect is that a Gnu Octave/Matlab program for easily implementing the WTLSC algorithm to calculate parameter expected values, variances and covariances is also supplied and reported.


Research motivation
The use of least squares based statistical regression analysis is a widely used tool to analyse experimental data when estimating and fitting model parameter values and uncertainties. In the particular case of straight line equations of the form y = ax + b, known input data x_i and output data y_i for data-points i = 1, …, n, inclusive of any available uncertainty information, are used in the regression analysis, where standard uncertainty analysis theory from the GUM [1] specifies the model uncertainty as

u²(y) = a² u²(x) + x² u²(a) + u²(b) + 2x cov(a, b).

In the above formula, since x is a statistically independent random variable, it follows that cov(x, a) = 0 and cov(x, b) = 0, whilst in general cov(a, b) ≠ 0. As a result, the probability density function (PDF) of the output g_y(η_y) may be determined, where η_y is a univariate random variable of the model output y, g_x(ξ_x) is the PDF of a univariate random variable ξ_x of the model input x, and g_{a,b}(ξ_a, ξ_b) is the joint PDF of the bivariate random variable of the model parameters a and b that incorporates the coupling effect between the random variables ξ_a and ξ_b, modelled in terms of cov(a, b).
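The variance formula above can be sketched numerically. The following Python snippet uses purely illustrative values for the slope, intercept, their variances and their covariance (none are taken from the paper's data), and shows how ignoring a non-zero cov(a, b) changes u²(y):

```python
def u2_line(x, u2x, a, u2a, u2b, cov_ab):
    """Variance of y = a*x + b with cov(x, a) = cov(x, b) = 0
    but cov(a, b) generally non-zero (GUM-style propagation)."""
    return a**2 * u2x + x**2 * u2a + u2b + 2.0 * x * cov_ab

# Hypothetical values: x, u2(x), a, u2(a), u2(b), cov(a, b)
u2_with_cov = u2_line(10.0, 1e-4, 2.0, 1e-6, 1e-4, -5e-6)
u2_no_cov = u2_line(10.0, 1e-4, 2.0, 1e-6, 1e-4, 0.0)
# Dropping the covariance term mis-states u2(y); the sign and size of
# the error follow the sign and size of cov(a, b)
```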
The research problem that occurs when OLS, WLS and GLS regressions are used to estimate the parameters a and b to varying degrees of accuracy is that additional Monte Carlo simulations must then be performed in order to determine the variances u²(a) and u²(b) along with the covariance cov(a, b), and this information must then be further processed in order to infer u²(A_0), u²(λ) and cov(A_0, λ). Consequently, if the covariance is unknown or unavailable and only the variances of the model parameters are utilized in uncertainty calculations, the model uncertainty will be systematically under-estimated.
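The kind of Monte Carlo simulation referred to above can be sketched briefly. The snippet below uses synthetic straight line data with an assumed constant y-uncertainty (nothing here is from the paper's measurements) and estimates cov(a, b) from repeated OLS fits:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 11)
y_true = 2.0 * x + 1.0          # hypothetical "true" line
uy = 0.1                        # assumed constant y-uncertainty

fits = []
for _ in range(5000):           # Monte Carlo over perturbed observations
    y = y_true + rng.normal(0.0, uy, x.size)
    a, b = np.polyfit(x, y, 1)  # slope a, intercept b
    fits.append((a, b))
fits = np.asarray(fits)
cov_ab = np.cov(fits.T)[0, 1]   # sample covariance of (a, b)
# For this design cov(a, b) = -mean(x)*uy**2 / sum((x - mean(x))**2) < 0
```

The point is that even this simple case requires thousands of repeated fits plus post-processing just to obtain cov(a, b).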

Research focus
This paper focuses on demonstrating the conceptual ease and functionality of the WTLSC approach, and how it is able to incorporate measurement variance information u²(x_i) and u²(y_i) along with covariance information cov(x_i, y_i) of the data, when available, in order to determine both the variances u²(a) and u²(b) and the covariance cov(a, b) of the model parameters for statistical completeness. A new, simple analytical calculation approach is presented to directly infer both the variances u²(A_0) and u²(λ) as well as the covariance term cov(A_0, λ) from knowledge of u²(a), u²(b) and cov(a, b) as obtained from the WTLSC algorithm, implemented with a Gnu Octave/Matlab program, that does not rely on additional Monte Carlo simulations or statistical post-processing.
Literature review

Physical theory of pressure balances

Pressure balance theory from Dadson [2] for a cross-floated pressure balance in equilibrium may be used to define the applied pressure P in terms of a generalized force F and an effective area S as P = F/S, where F is a generalized force term that accounts for buoyancy effects, and S is a cross-sectional effective area that accounts for pressure induced deformation and temperature expansion/contraction effects, such that F = m(1 − ρ_a/ρ_m)g and S = A_0(1 + λP)f, where A_0 is the effective area extrapolated to zero pressure, λ is a distortion coefficient based on linear elasticity theory, and f is an appropriate temperature compensation function that adjusts the area to an appropriate reference temperature.
In general, since there is a pressure dependence within the effective area term S, the equation P = F/S is usually rearranged as a homogeneous equation h = F/S − P and then solved for specified inputs to obtain a value of P such that h = 0. In the event that the pressure P is known and the area S must be determined, the equation P = F/S is rearranged as S = F/P and again recast as another homogeneous equation h = F/P − S in order to determine the value of S that solves the homogeneous equation for specified inputs.
In the above equations m is the mass on top of the piston, ρ_a is the ambient fluid density in contact with the piston to account for buoyancy forces acting on the piston, ρ_m is the mass density, g is the gravitational acceleration constant, A_0 is the zero-pressure area, λ is the distortion coefficient, P is the applied pressure, and f = 1 + α(t − t_0) is a temperature compensation function to adjust the cross-sectional area to a reference temperature t_0, where α = α_p + α_c is the linear thermal expansion coefficient of the combination of the piston and cylinder, with α_p and α_c the corresponding linear thermal expansion coefficients of the piston and cylinder respectively. When the above system of equations is implemented, the equation for the cross-floated area S may then simply be modelled as

S_i = A_0(1 + λ P_i), i = 1, …, n.

In the above equation the inputs are the expected values and standard uncertainties of the applied pressures (m(P_i) ± u(P_i)), the cross-floated areas (m(S_i) ± u(S_i)), and estimates of the covariances cov(P_i, S_i) for a sequence of measurements i = 1, …, n, where n is the number of measurements, whilst the unknowns are considered to be the expected values of the zero-pressure area A_0, the distortion coefficient λ, and the covariance cov(A_0, λ). This modelling approach is equivalent to fitting a straight line of the form y = ax + b from experimental data, and utilizing a least squares regression to quantify the values and uncertainties of the parameters a and b.
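The forward model described above is straightforward to code. The following Python sketch implements the two equations with purely illustrative parameter values (a hypothetical 5 kg load, air and steel densities, and a notional piston-cylinder area and distortion coefficient; no real instrument data):

```python
def generalized_force(m, rho_a, rho_m, g=9.80665):
    """F = m * (1 - rho_a/rho_m) * g : buoyancy-corrected force."""
    return m * (1.0 - rho_a / rho_m) * g

def effective_area(A0, lam, P, alpha_p, alpha_c, t, t0=20.0):
    """S = A0 * (1 + lam*P) * f with f = 1 + (alpha_p + alpha_c)*(t - t0)."""
    f = 1.0 + (alpha_p + alpha_c) * (t - t0)
    return A0 * (1.0 + lam * P) * f

# Hypothetical values: 5 kg mass, air density 1.2, steel density 7850,
# A0 = 4.9e-6 m^2, lam = 8e-13 /Pa, P = 100 MPa, at reference temperature
F = generalized_force(5.0, 1.2, 7850.0)
S = effective_area(4.9e-6, 8.0e-13, 100e6, 4.5e-6, 4.5e-6, 20.0)
```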

Historical and mathematical context
The fitting of straight lines is a common numerical task in fields such as science, engineering, and economics that require experimental or computational data to be modelled and fitted. Usually a least-squares algorithm is implemented to minimize a merit function, as discussed by Saunders [3], where in general a non-linear calibration equation y = y(x; a_1, …, a_N) is to be fitted from measurement pairs of controlled values x_i and observed values y_i. In the case where the number of measurements is M and the number of parameters to determine is N, a least squares solution is mathematically essential in the general case where M > N. For this situation a merit function is constructed as

χ² = Σ_{i=1..M} w_i [y_i − y(x_i; a_1, …, a_N)]²

and optimized in terms of the parameters a_1, …, a_N by simultaneously solving ∂χ²/∂a_i = 0 for i = 1, …, N, where w_i is a suitable and appropriate weighting function that accounts for the underlying measurement accuracy and available information.
The simplest case involves an optimization where all of the data points have an assumed constant uncertainty/accuracy, such that u(x_i) = u(y_i) = s for i = 1, …, M measurements, which corresponds to the situation where no detailed or specific measurement uncertainty information is available; as a result the optimization of the merit function does not require any specific knowledge of s, as this can be conveniently factored out when optimizing χ². This particular problem thus corresponds to the well known historical total least squares problem as investigated by Van Huffel and Vandewalle [4].
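For this equal-uncertainty case the total least squares line can be obtained from the principal axis of the centred data. The sketch below uses the standard SVD route (a generic illustration, not code from [4]):

```python
import numpy as np

def tls_line(x, y):
    """Total least squares line y = a*x + b assuming equal uncertainty
    in both coordinates: the line is the principal axis of the data."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    X = np.column_stack([x - x.mean(), y - y.mean()])
    # The right singular vector of the smallest singular value is the
    # normal (nx, ny) of the best-fit line nx*dx + ny*dy = 0.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    nx, ny = Vt[-1]
    a = -nx / ny                      # slope (fails only for vertical data)
    b = y.mean() - a * x.mean()       # line passes through the centroid
    return a, b
```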
At the next conceptual level, the problem of fitting a straight line with errors in both coordinates was originally formulated by York [5], and subsequent work involved either attempts to directly solve the original problem posed by York or various approximation methods to infer the variances of the fitting parameters, by researchers working in the area of regression analysis of experimental measurements. For this problem the approach of Press et al. [6] has achieved the most notable recognition amongst many non-mathematicians for the fitting of a straight line where both the x and y coordinates possess some underlying known statistical uncertainty.
In this particular case the model y(x) = a + bx involves a χ² merit function optimization of the form

χ² = Σ_{i=1..M} (y_i − a − b x_i)² / [u²(y_i) + b² u²(x_i)],

where u(x_i) and u(y_i) are the standard deviations of the x_i and y_i data points, which are allowed to vary from point to point. Although conceptually straightforward, the main challenge with directly solving this problem is that the presence of the parameter b in the denominator renders the partial derivative equation ∂χ²/∂b = 0 non-linear, and as a result this problem is considerably more difficult to solve than the earlier total least squares problem, which is amenable to an analytical solution. Consequently, the optimization problem for straight lines with errors in both coordinates is typically solved either with classical optimization techniques such as the Levenberg-Marquardt method or with the Monte Carlo bootstrap technique as outlined by Press et al. [6], and as a result this solution approach is not generally directly accessible to many metrologists working in industrial laboratories due to its mathematical complexity.
It should be noted that a conventional numerical solution of this non-linear optimization problem, using either classical techniques such as Newton's, steepest descent and homotopy methods as discussed by Burden and Faires [7] or newer methods such as the particle swarm optimization (PSO) approach, would still suffer from two limitations: firstly, it would not be able to compute the parameter variances, and secondly, it would not be able to provide any insight into the parameter covariance.
The statistical effects of the choice of OLS, WLS and GLS regressions were subsequently investigated by Duer et al. [8], who determined that whilst the specific choice of OLS, WLS or GLS regression did not generally produce inconsistent estimates for the intercept b and slope a parameter values, the different regression schemes nevertheless did produce considerable variations in the corresponding estimates of the uncertainties u(a) and u(b) of the parameters a and b. As a result, Duer et al. concluded that a simplified OLS regression could only be safely utilized in the very specific and special case where the uncertainties u(x_k) in the x_k data points were negligible.
In the particular case of scientific metrology work an inconsistent or incorrect estimate of a parameter value and variance/covariance could then have a catastrophic effect on high level laboratory inter-comparison work or in safety critical experimental measurements as parameter uncertainties of equipment and instruments are typically utilized in many quality engineering systems to specify tolerances and safety limits for equipment and personnel.
Due to these specific limitations, particularly for scientific measurement work at national laboratories, the total least squares problem was subsequently revisited by Krystek and Anton [9], who simplified the original two-dimensional problem into an equivalent one-dimensional optimization that is amenable to a set of simplified mathematical tools. This initial approach was later generalized to solve the full problem of errors in both coordinates with correlation effects in a later paper by Krystek and Anton [10]. This approach is now termed a weighted total least squares with correlation (WTLSC) analysis, and it assumes that the covariance matrices U_k of the individual experimental data points for k = 1, …, n are available prior to optimizing the merit function, which is explicitly specified as

χ² = Σ_{k=1..n} r_kᵀ U_k⁻¹ r_k,

where r_k is the vector of residuals of the k-th data point in the x and y coordinates. Part of the reason why the WTLSC approach does not appear to have been more widely adopted is that, in the specific area of scientific metrology, the calculation of the covariance matrix is extremely mathematically complicated when using standard calculus based approaches, and is therefore only tractable with a full Monte Carlo numerical simulation approach as discussed by Harris [11]. As a result, if the covariance matrix is unavailable then the WTLSC approach becomes problematic: the covariance must either be calculated or estimated, or alternatively the correlations may be set to zero, in which case the algorithm implementation simplifies to the earlier weighted least squares (WLS) type of algorithm.
Whilst the absence of a covariance matrix may be viewed as a technical limitation when implementing a GLS regression, where the standard statistical formulation comprises determining the value of the regression parameter β that optimizes the residuals e in a linear regression model y = Xβ + e, it has subsequently been determined by researchers that this is not an insurmountable challenge, as mechanisms exist to infer a likely covariance matrix. For this problem the covariance matrix Cov[e|X] = V is known to take the general form of a symmetric positive definite matrix with elements V_{ij} = cov(e_i, e_j). In order to account for the unavailability of the actual covariance matrix V, which may be unknown or impractical to estimate, one may nevertheless obtain a consistent estimate of V, usually written as V̂, that is termed the feasible generalized least squares (FGLS) estimator, using either iteration based commercial routines with software such as Matlab [12], or newer support vector machine (SVM) approaches that are currently under development, such as recently reported results from Miller and Startz [13].

Recent statistical regression developments
Recent work reported in the literature by Wuethrich and Souiyam [14], building on earlier work by Otal and Yardin [15], has investigated the use of Monte Carlo based simulations for Ordinary Least Squares (OLS), Weighted Least Squares (WLS) and Generalized Least Squares (GLS) regression schemes in order to determine the parameter values of a pressure balance model of the form S = bP + a, where P is an input of applied pressure, S is an output of cross-floated area, and a and b are model parameters whose values and uncertainties must be determined. Conventionally the pressure balance model has the form S = A_0(1 + λP), where A_0 is a zero-pressure area and λ is a distortion coefficient, and the regression analysis fits the values of a = A_0 and the product b = A_0λ for i = 1, …, n data-points such that

S_i = b P_i + a + e_i, i = 1, …, n.

For the above regression scheme the distortion coefficient is recovered as λ = b/a after the residuals e_i are minimized in the matrix equation XC = Y + e. This minimization is implemented through the use of a coefficient matrix V that is used to incorporate the known information, such that

C = (Xᵀ V⁻¹ X)⁻¹ Xᵀ V⁻¹ Y.

In an OLS scheme no weighting is applied and the coefficient matrix V consists of just a diagonal matrix with unity on each of the diagonal elements, such that V_{i,i} = 1 for i = 1, …, n; in a WLS scheme the coefficient matrix consists of a diagonal matrix where the elements on the diagonal are proportional to the uncertainties of the areas, i.e.
V_{i,i} ∝ u²(S_i); and finally in a GLS scheme the diagonal elements of V equal the variances of the areas, such that V_{i,i} = u²(S_i), whilst the off-diagonal elements equal the covariances, such that V_{i,j} = cov(S_i, S_j) for i ≠ j. The general approach of Wuethrich and Souiyam [14] is to solve equation (9) with nominal values of the input quantities at specified pressures P_i, for the equivalent homogeneous equation h = F_i/P_i − S_i, in order to determine corresponding values of S_i for a sequence of pressures P_1, …, P_n with i = 1, …, n, and to then use this information to determine the regression parameter C_j = [(A_0)_j, b_j]ᵀ from the matrix equation with an appropriate choice of weighting matrix V. This process is repeated M = 10000 times with slightly perturbed values of the inputs, consistent with the underlying uncertainties, to sequentially generate samples C_j for j = 1, …, M, resulting in a sequence C_1, …, C_M of regression parameters which are then post-processed to calculate expected values, variances, and the covariance of the regression parameters A_0 and b = A_0λ, where the distortion coefficient is extracted as λ_j = b_j/(A_0)_j. The pair of values ((A_0)_j, λ_j) is then further post-processed according to the GUM [1] for two sampled random variables q_j and r_j to calculate the covariance cov(q, r), which is estimated as

cov(q, r) = [1/(M − 1)] Σ_{j=1..M} (q_j − q̄)(r_j − r̄),

where q̄ and r̄ are the corresponding sample means. Investigations performed by Wuethrich and Souiyam [14] concluded that a GLS regression produces high accuracy results, followed by the WLS regression which produces medium accuracy results, and finally by the OLS regression which produces low accuracy results. As a result the GLS approach may be considered the best overall approach; however, the key difficulty with implementing this approach lies in the estimates of the required covariances cov(S_i, S_j), which are either mathematically complex to determine or must be numerically estimated with a large number of Monte Carlo simulations.
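A condensed Python sketch of this Monte Carlo regression procedure is given below. It uses hypothetical pressure balance values, a diagonal V (so effectively WLS weighting) for brevity, and a reduced number of events; it is an illustration of the scheme, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pressures (Pa) and a notional "true" pressure balance
P = np.linspace(50e6, 500e6, 10)
A0_true, lam_true = 4.9e-6, 8.0e-13     # illustrative values only
S_true = A0_true * (1.0 + lam_true * P)
uS = 1e-11 * np.ones_like(P)            # assumed area uncertainties

X = np.column_stack([P, np.ones_like(P)])   # model S = b*P + a
V = np.diag(uS**2)                          # weighting matrix (diagonal here)
Vinv = np.linalg.inv(V)

samples = []
for _ in range(2000):                   # Monte Carlo over perturbed data
    S = S_true + rng.normal(0.0, uS)
    C = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ S)  # C = [b, a]
    b_j, a_j = C
    samples.append([a_j, b_j / a_j])    # (A0, lam) per event
samples = np.asarray(samples)
A0_mean, lam_mean = samples.mean(axis=0)
cov_A0_lam = np.cov(samples.T)[0, 1]    # post-processed covariance
```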
The key difference between the newer approach of Wuethrich and Souiyam [14] and the earlier approach of Ramnath [16] is that in the latter work a multivariate GUM Supplement 2 approach [17] was utilized that modelled the various inputs x by treating the two parameters A_0 and λ as a single vector y = [A_0, λ]ᵀ with an equation h(y, x) = 0, where a multivariate PDF g_y(η_y) of the output was determined from a multivariate input g_x(ξ_x) by numerically solving the equation h(η_y, ξ_x) = 0 for a large number M of Monte Carlo simulation events.

V. Ramnath: Int. J. Metrol. Qual. Eng. 11, 14 (2020)

In this approach sampled multivariate values of the inputs, consistent with the joint PDFs of the inputs x, were used to solve the homogeneous vector equation corresponding to h_j = F_j/P_j − S_j, and a sequence of pressures P_1, …, P_n and areas S_1, …, S_n was generated that was then used with an OLS approach to calculate sampled values of ((A_0)_j, λ_j) from the joint PDF g_y(η_y). The sampled values were then used to construct a bivariate copula that modelled u²(A_0), u²(λ) and cov(A_0, λ); however, a limitation of this approach was that it required a large number of Monte Carlo simulations, was mathematically complex, and was thus more suited to scientific metrology applications in national metrology institutes.
The differences in the two approaches are that, firstly, the method of Wuethrich and Souiyam [14] requires estimates for the uncertainties of the areas u²(S_1), …, u²(S_n) in the case of a WLS approach, and additionally those of the covariances cov(S_i, S_j), 1 ≤ i, j ≤ n, for each Monte Carlo event j, and secondly it makes an assumption of a Gaussian correlation between A_0 and λ. On the other hand the method of Ramnath [16] firstly avoids the explicit need for calculating u²(S_i) and cov(S_i, S_j) at each Monte Carlo event j, as the data is considered as statistical samples, and secondly does not assume a Gaussian correlation between A_0 and λ, since bivariate copulas can model non-Gaussian correlation effects.
It should be noted that the calculation of the covariances cov(S_i, S_j) that are required in a GLS regression, whilst theoretically possible, requires an additional matrix formulation solution as outlined in the GUM Supplement 2 [17], pp. 15-17. For an explicit matrix equation y = f(x) the covariance matrix of the output is U_y = C_x U_x C_xᵀ, and for an implicit matrix equation h(y, x) = 0 the covariance matrix of the output is solved from the matrix equation

C_y U_y C_yᵀ = C_x U_x C_xᵀ,

where U_x is the covariance matrix of the input, and C_x = ∂h/∂x = J_x and C_y = ∂h/∂y = J_y are the corresponding sensitivity or Jacobian matrices.
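When J_y is invertible, the implicit relation above can be solved directly for U_y. A small numerical sketch with generic Jacobians (nothing specific to the pressure balance equations):

```python
import numpy as np

def uy_implicit(Jx, Jy, Ux):
    """Solve J_y U_y J_y^T = J_x U_x J_x^T for U_y, i.e. the implicit
    model case h(y, x) = 0 with invertible J_y."""
    M = np.linalg.solve(Jy, Jx)   # M = J_y^{-1} J_x
    return M @ Ux @ M.T

# Sanity check: for an explicit model y = f(x), writing h = f(x) - y
# gives J_y = -I, and U_y reduces to C_x U_x C_x^T
Jx = np.array([[2.0, 0.0], [0.0, 3.0]])
Jy = -np.eye(2)
Uy = uy_implicit(Jx, Jy, np.eye(2))
```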
Due to the combination of algebraic complexity and numerical ill-conditioning present when determining a multivariate covariance U_y, this calculation is usually only tractable with a purely numerical Monte Carlo simulation using an appropriate homogeneous equation h_i = F_i/P_i − S_i for i = 1, …, n as previously discussed.

Mathematical modelling
We consider the standard case of a straight line equation y = ax + b with inputs x_i, outputs y_i, variances u²(x_i) and u²(y_i), and covariances cov(x_i, y_i) for measurements i = 1, …, n. This system was originally analysed by Krystek and Anton [10], who termed it a weighted total least squares with correlation system and developed a WTLSC algorithm that is able to calculate the variances u²(a) and u²(b) and the covariance cov(a, b). An earlier paper by Ramnath [18] utilized this scheme to quantify the variances u²(A_0) and u²(λ) but did not explicitly report on the covariance term cov(A_0, λ).
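To make the idea concrete, here is a hedged Python sketch of a WTLSC-style fit in the spirit of the Krystek-Anton one-dimensional reduction: for each trial slope the intercept is solved analytically and χ² is scanned over the slope. This is an illustration of the merit function, not their actual algorithm, and it omits the parameter variance calculation:

```python
import numpy as np

def wtlsc_fit(x, y, ux, uy, r, a_lo, a_hi, ngrid=20001):
    """Fit y = a*x + b with per-point uncertainties ux, uy and
    correlations r by minimising chi^2 with the slope-dependent
    effective variance s2_k(a) = uy_k^2 - 2*a*r_k*ux_k*uy_k + a^2*ux_k^2."""
    x, y, ux, uy, r = (np.asarray(v, float) for v in (x, y, ux, uy, r))
    best = (np.inf, None, None)
    for a in np.linspace(a_lo, a_hi, ngrid):      # 1-D scan over the slope
        s2 = uy**2 - 2.0 * a * r * ux * uy + a**2 * ux**2
        w = 1.0 / s2
        b = np.sum(w * (y - a * x)) / np.sum(w)   # analytic intercept
        chi2 = np.sum(w * (y - a * x - b) ** 2)
        if chi2 < best[0]:
            best = (chi2, a, b)
    return best[1], best[2]                        # slope a, intercept b
```

With all correlations r_k set to zero the same code reduces to an uncorrelated errors-in-both-coordinates fit.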
Considering the straight line equation S = aP + b = A_0(1 + λP), it immediately follows that

b = A_0, a = A_0 λ, so that A_0 = b and λ = a/b.

From the above set of equations it follows that the variances are

u²(A_0) = u²(b),
u²(λ) = (∂λ/∂a)² u²(a) + (∂λ/∂b)² u²(b) + 2(∂λ/∂a)(∂λ/∂b) cov(a, b),

with ∂λ/∂a = 1/b and ∂λ/∂b = −a/b². Using a = A_0λ it follows that

cov(a, b) = cov(A_0 λ, A_0) ≈ λ u²(A_0) + A_0 cov(A_0, λ).

Expanding out the partial derivatives in the above equation and solving for cov(A_0, λ) then yields

cov(A_0, λ) = [cov(a, b) − λ u²(A_0)]/A_0 = cov(a, b)/b − (a/b²) u²(b).
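This first-order conversion from the fitted (a, b) to (A_0, λ) amounts to only a few lines of code. A Python rendering is given below (the paper's supplied program is in Octave/Matlab; the values in the test are purely hypothetical):

```python
def pressure_params(a, b, u2a, u2b, cov_ab):
    """Convert fitted slope a and intercept b of S = a*P + b into
    A0 = b and lam = a/b with first-order (GUM) variance propagation,
    including the covariance cov(A0, lam)."""
    A0 = b
    lam = a / b
    dlam_da = 1.0 / b
    dlam_db = -a / b**2
    u2_A0 = u2b
    u2_lam = (dlam_da**2 * u2a + dlam_db**2 * u2b
              + 2.0 * dlam_da * dlam_db * cov_ab)
    cov_A0_lam = cov_ab / b - (a / b**2) * u2b
    return A0, lam, u2_A0, u2_lam, cov_A0_lam
```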

Numerical simulations
Numerical simulations are performed using the summary of statistical data reported in Table 1, which is based on an earlier full M = 1000 Monte Carlo simulation reported in detail in Ramnath [16]. In this table the columns are the values of P_k, u(P_k), S_k, u(S_k) and r(P_k, S_k) for k = 1, …, 10, where r(P_k, S_k) is the dimensionless correlation that is defined in terms of the covariance cov(P_k, S_k) by the formula

r(P_k, S_k) = cov(P_k, S_k) / [u(P_k) u(S_k)].

Results are obtained by running the Octave/Matlab implementation of the WTLSC algorithm reported in Figure 1 with the script in Figure 2, as summarized in Table 2. All of the results reported in this table were directly obtained from the Octave/Matlab program, with the exception of the dimensionless correlation coefficient r(A_0, λ), as the available number of Monte Carlo simulation events at M = 1000 was too small to obtain meaningful statistical results with the use of equation (11).
Instead the correlation was estimated by saving the processed data from the Octave/Matlab program as a txt file and fitting a Gaussian copula, which is defined in terms of the covariance, using RStudio computer code [19]. This approach was utilized since the calculation of a covariance/correlation is mathematically equivalent to the assumption of a Gaussian bivariate copula to model the coupling effect between two random variables, and the VineCopula statistical package [20] uses more advanced optimization algorithms to estimate the corresponding parameters. In general, if a large number of Monte Carlo simulation events, for example M = 10000, is available then equation (11) can be used directly; this observation illustrates the essential need for a large number of simulations to adequately apply a GLS regression in order to obtain a statistically meaningful result, particularly when only a relatively small number of scattered data-points from a small number of Monte Carlo simulations is available for data analysis, as shown in Figure 3.
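The behaviour of the equation (11) style sample estimate at large M can be checked with a synthetic example. The snippet below fabricates Gaussian samples with a known correlation (this is not the paper's data) and recovers it from the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated samples (q_j, r_j), j = 1..M, with a known target correlation
M, rho = 10000, -0.8
z = rng.standard_normal((M, 2))
q = z[:, 0]
r = rho * z[:, 0] + np.sqrt(1.0 - rho**2) * z[:, 1]

qbar, rbar = q.mean(), r.mean()
cov_qr = np.sum((q - qbar) * (r - rbar)) / (M - 1)     # eq. (11) style
corr_qr = cov_qr / (q.std(ddof=1) * r.std(ddof=1))
```

At M = 10000 the estimate sits close to the target; at M in the hundreds its scatter becomes substantial, which is why the copula-based estimate was used above.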
It may be observed that in the case of a bivariate PDF the joint PDF is a three-dimensional function, and as a result it becomes difficult to visualize the three PDFs g_OLS(x), g_WLS(x) and g_GLS(x) corresponding to the OLS, WLS and GLS regressions, as these functions overlap and intersect with one another. As a result, in order to simplify the interpretation of the numerical results, level curves of the PDFs are extracted as shown in Figure 4 so that the relative shape and sharpness of the PDFs may be meaningfully compared later in the paper.
Due to the complexity of the underlying pressure balance equations, a large number of Monte Carlo simulations, such as say M = 10000, is only feasible if the equations are implemented in an appropriate computer programming environment such as a Visual Basic for Applications (VBA) script within an Excel spreadsheet or a Matlab script, and this may present onerous technical computing demands for a pressure metrologist in an industrial calibration laboratory.
The results from Table 2 are visualized in Figure 5, from which it is confirmed that an OLS regression gives the lowest accuracy and a WLS regression a medium accuracy; the relative errors for the zero-pressure area and related quantities are quantified later in the paper.

Table 1 caption: Statistical data for pressure balance cross-floating measurements for pressures P_k and areas S_k in approximately equal steps with k = 1, …, 10 from P_1 = 50 MPa through to P_10 = 500 MPa. The above data is sufficient to use as an input for the WTLSC algorithm and produces essentially the same quality of results as a full GLS Monte Carlo simulation. If the dimensionless correlation term r(P_k, S_k) is unknown or cannot be otherwise estimated, it may be set to zero, and the WTLSC algorithm will then equivalently reproduce the results of a WLS algorithm, which provides conservative estimates for the corresponding parameter expected values and associated variances/covariance. It should be noted that a zero correlation of the input/output data will in general still produce a non-zero correlation in the parameters, to account for and model the coupling effect between the parameter random variables. The data follows the layout of Table 1 and may conveniently be extracted from an Excel spreadsheet of measurement results and saved as a txt file.

Figure 3 caption: Hyper-ellipsoids are generated to indicate the corresponding 95% confidence regions (CRs), and reveal that the highest accuracy scheme is the GLS, which has the smallest CR corresponding to a high degree of certainty in the range of possible values, followed by the WLS, which has a slightly lower accuracy compared to the GLS, and finally the OLS, which has the lowest accuracy as its CR is unduly large. Cuts (indicated in the figure) of the ellipsoid confidence regions are taken, producing level curves to more easily visualize the shape, orientation and size of the different OLS, WLS and GLS regression schemes.
From these level curves it is observed that, along the ellipsoid major axis, the GLS regression (green curve) has the narrowest/sharpest curve, which corresponds to the highest confidence or accuracy in the estimates of the parameters, followed by the WLS regression (blue curve), which has a medium spread/sharpness, and finally the OLS regression (red curve), which is relatively wide and flat, corresponding to less certainty/accuracy of the parameters in the regression analysis. A broadly similar behaviour is observed for level curves cut along the minor ellipsoid axis, and as a result it is concluded that an OLS regression has the least accuracy, a WLS regression has a medium accuracy, and a GLS regression has the highest accuracy. From the above graphs it may be concluded that a straight line regression for pressure metrology work should ideally utilize a GLS regression (green line); however, if no covariance information is available then a WLS regression (blue line) can give a reasonable estimate, particularly for industrial pressure metrology laboratories.
The relative errors of the various techniques are quantified as

e_Q^T = |Q^T − Q^GLS| / |Q^GLS| × 100%,

where Q represents a selection of a quantity and T denotes a selection of a technique. Considering the relative errors for the WTLSC regression when compared to predictions from the GLS, which is considered the most accurate technique, it is observed that the relative error is, for example, e_{u(A_0)} = 0.765% for the representative pressure balance data that was analysed. As a result it may be concluded that a WTLSC algorithm gives essentially the same quality of results as a full GLS Monte Carlo simulation if estimates are available for the correlation terms cov(P_i, S_i) for the data points i = 1, …, n.

Figure 5 caption: Comparison of relative errors of the quantities predicted by OLS, WLS and WTLSC regression approaches relative to predictions from a GLS regression approach. These graphs indicate that the relative error between a WTLSC regression and a GLS regression is relatively insignificant, and as a result a pressure metrologist, instead of applying a more complicated GLS regression, can simply apply a WTLSC regression with the aid of the methodology developed in this paper.
The key benefits of utilizing a WTLSC algorithm are that it is a relatively straightforward implementation, using an Octave/Matlab function file FitWTLSC with about 20 lines of computer code as outlined in Figure 2, and that it does not require extensive or advanced programming skills for a pressure metrologist to implement. Limitations of the method are nevertheless present, in that estimates for the correlation terms r(P_i, S_i), i = 1, …, n, are technically necessary to utilize the WTLSC algorithm.
This issue may be addressed in one of two practical ways: (i) setting zero correlations for all of the cross-float data-points, such that r(P_i, S_i) = 0 for i = 1, …, n, for which the WTLSC algorithm will reproduce the results of a WLS algorithm as a conservative compromise; or (ii) performing a small number of, say, 50 or 100 repeat measurements in an Excel spreadsheet, which is feasible without the need for any specialist VBA scripting within the spreadsheet, and using appropriate expert physical judgement to generate (P_i, S_i) data to fit a Gaussian copula with the 11 lines of RStudio code previously discussed, in order to obtain an approximate estimate for the correlation.
Referring to the results in Figure 3, it is observed that the PDFs of the OLS, WLS and GLS regressions may be compared by examining the behaviour of the PDFs along the semi-major and semi-minor axes of the hyper-ellipsoid of the confidence region. When this analysis was performed, as visualized in Figure 4, it was observed that the WLS regression performed relatively well when compared to the GLS regression. As a result it may be concluded that the limitation of a potentially absent covariance matrix is not excessively detrimental.

Conclusions
Based on the research reported in this paper the following conclusions were determined:
- Numerical simulations have been performed that demonstrate that a weighted total least squares with correlation (WTLSC) algorithm gives essentially equivalent results for straight line parameters as a more advanced generalized least squares (GLS) algorithm with additional Monte Carlo simulations for parameter uncertainty estimation.
- A method has been developed for the WTLSC algorithm that is able to analytically extract the variance/covariance information of the straight line parameters without the need for additional and more advanced Monte Carlo simulations.

Influences and implications
Based on the research reported in this paper the following influences and implications arise:
- Pressure metrologists working in industrial calibration laboratories may now utilize the WTLSC algorithm to conveniently and easily quantify the expected values and variances/covariance of the zero-pressure area A_0 and distortion coefficient λ.
- A computer program FitWTLSC.m is now freely and publicly available to conveniently implement the WTLSC algorithm in Gnu Octave or Matlab for calculating the straight line parameters and their associated variance/covariance information, and may also be used by metrologists from other fields.

This work was performed with funds provided by the Department of Higher Education, Science and Technology (DHEST) on behalf of the South African government for research by public universities.
E = −2σ_1²μ_2 + 2ρσ_1σ_2μ_1 (A16)

F = σ_2²μ_1² + σ_1²μ_2² − 2ρσ_1σ_2μ_1μ_2 − k_p²σ_1²σ_2²(1 − ρ²) (A17)

The corresponding ellipse geometry parameters, i.e. the semi-axes a and b and the ellipse rotation angle θ, may then be algebraically calculated from these coefficients. The hyper-ellipsoid may then be conveniently plotted using the parametrization

x_i = a cos(t), y_i = b sin(t), 0 ≤ t ≤ 2π, (A22)

for the initial boundary points x_i and y_i. Next the boundary points are rotated by the angle θ using a rotation matrix to give x_r and y_r, and finally the points are translated to the ellipse centre. This algorithm is used to produce the hyper-ellipsoids in Figure 3 and may be used to determine the points along the ellipse major and minor axes. As a result, if these lines are used, a "cut" through the joint PDF may be made to study how the level curves of the different OLS, WLS, and GLS regressions behave, as it is too difficult to easily visualize the three-dimensional joint PDFs in Figure 4 even with transparency effects. Once the lines are determined, the x_1 and x_2 points are known from a simple straight line fit, so the x_1 and x_2 points may then be substituted into the bivariate joint PDF in equation (A4), which will produce a density value. The set of points (x_1, x_2, g(x_1, x_2)) may then be plotted as a three-dimensional curve in order to visualize the shape and sharpness of the corresponding level curve of the PDF. This algebraic approach is only suitable for bivariate simulations and becomes impractical in higher dimensions, for which the eigenvector/eigenvalue approach is then necessary.
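The parametrize-rotate-translate construction of the ellipse boundary can be sketched as follows (generic Python, not the paper's Octave code; the semi-axes, angle and centre are taken as given):

```python
import numpy as np

def ellipse_points(a, b, theta, mu1, mu2, n=361):
    """Boundary of a confidence ellipse: parametrise an axis-aligned
    ellipse, rotate it by theta, then translate to the centre (mu1, mu2)."""
    t = np.linspace(0.0, 2.0 * np.pi, n)
    xi, yi = a * np.cos(t), b * np.sin(t)           # axis-aligned boundary
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    xr, yr = R @ np.vstack([xi, yi])                # rotated boundary
    return xr + mu1, yr + mu2                       # translated boundary
```

The extreme points of the returned arrays lie on the (rotated) major and minor axes, which is how the cut lines through the joint PDF can be obtained.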