There are three components to refractive state, namely sphere, cylinder and axis. Similarly, central corneal power is also composed of three components, namely the power along the flat meridian, the power along the steep meridian and the axis of the flat meridian. Most studies that investigate refractive data and corneal power analyse each of the three components individually rather than as a trivariate entity. In doing so, pertinent information may inadvertently be omitted.

The purpose of this review is to provide a brief overview of the multivariate statistics that are available to analyse multivariate data such as dioptric power. This will enable readers to better understand research that is analysed using these methods.

An extensive review of databases such as Google Scholar, Science Direct and ResearchGate was done to gather publications on the topic of multivariate statistical analysis. Keywords such as multivariate statistical analysis, dioptric power, stereo-pairs, polar profiles and hypothesis testing were used to conduct the search.

The debate over the need to analyse dioptric power using multivariate statistical methods has been a long-standing one. For this review, more than 40 publications were analysed to provide a simplified overview of the multivariate statistical methods that can be used to analyse dioptric power.

The use of multivariate statistical methods is a valuable tool in analysing and understanding dioptric power holistically and may provide more insights for research involving refractive error and corneal power.

Multivariate statistical analysis provides a method to analyse the trivariate nature of dioptric power holistically, thus enabling one to evaluate the scalar (spherical) as well as the astigmatic changes that may occur. In earlier literature, the scalar axis was referred to as the stigmatic axis. Treating sphere, cylinder and axis as a univariate entity (spherical equivalent) diminishes the meaningfulness of conclusions drawn because astigmatic changes that may occur are diluted thereby. With the multivariate transformation of dioptric power, standard statistical calculations such as means, standard deviations and variances can be determined thus making hypothesis testing possible. Around the late seventies, the idea of analysing dioptric power with multivariate statistics was conceptualised by researchers such as Long,^{1} Keating^{2,3,4} and Harris.^{5} Over the decades, work done by Harris^{5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25} and others including Thibos et al.,^{26} for example, led to the development of the multivariate methods of analysis that are available today to analyse dioptric power in its entirety. The matrix representation of dioptric power and the ability to add, average and square sphero-cylindrical powers allows for variances and standard deviations, as well as other multivariate statistics, to be calculated,^{6} which were thought to be impossible for such data.^{27} These methods of analysing refractive and keratometric data have been used frequently in research,^{28,29,30,31,32,33,34} and the aim of this review is to provide a simplified overview of the multivariate methods of analysis of dioptric power so that readers of such research can better understand the analysis used in those studies. The data used in the examples provided are measurements taken on patients with keratoconus (KC). There have not been many other studies done on KC patients using this sort of analysis.

According to Harris,^{5} credit should go to H. Fick^{35} and W.F. Long^{1}, who pioneered the idea that dioptric power could be expressed as a 2 × 2 matrix. For sphero-cylindrical power, this 2 × 2 matrix is symmetric; that is, the off-diagonal entries are equal; therefore, there are only three distinct numbers, just as there are three numbers (sphere, cylinder and axis) representing dioptric power in clinical notation. The unit of measurement of the dioptric power matrix is dioptres or inverse metres (m^{−1}), whereas the unit of measurement for a sphero-cylinder is a combination of dioptres (for sphere and cylinder power) and degrees (for axis). Consider the following 2 × 2 matrix:

$$\mathbf{F} = \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix}$$

The positions of the elements, or entries, are described by the subscripts. The first number in the subscript defines the row and the second number defines the column in which the element is positioned. For example, f_{11} is the entry in row 1 and column 1, and f_{21} is the entry in row 2 and column 1. Each entry in the dioptric power matrix represents a particular characteristic of the dioptric power from which it was converted. The entries f_{11} and f_{22} link back directly to the curvital power in the horizontal and vertical meridians, respectively, while f_{12}, which for a symmetric matrix is equal to f_{21}, represents the torsional component of refractive power.^{5,12}

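A set of formulae was derived by Long^{1} to convert sphero-cylindrical powers to a dioptric power matrix:

$$f_{11} = F_s + F_c \sin^2 A, \qquad f_{12} = f_{21} = -F_c \sin A \cos A, \qquad f_{22} = F_s + F_c \cos^2 A$$

where F_{s} is sphere (D), F_{c} is cylinder (D) and A is the cylinder axis (degrees).

Conversion from matrix representation to clinical notation (sphere, cylinder and axis, here in negative-cylinder form)^{2}:

$$F_c = -\sqrt{t^2 - 4d}, \qquad F_s = \frac{t - F_c}{2}, \qquad \tan A = \frac{F_s - f_{11}}{f_{12}}$$

where

$$t = f_{11} + f_{22}$$

and

$$d = f_{11} f_{22} - f_{12} f_{21}$$

are the trace and determinant of the matrix, respectively.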
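The two conversions can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual thin-system relations attributed to Long and Keating and a negative-cylinder convention for the output; the function names `to_matrix` and `to_clinical` are illustrative and do not come from the source. The axis is recovered with `atan2` rather than a tangent ratio so that the case of a zero off-diagonal entry does not cause a division by zero.

```python
import math

def to_matrix(sphere, cyl, axis_deg):
    """Return the distinct entries (f11, f12, f22) of the symmetric power matrix."""
    a = math.radians(axis_deg)
    f11 = sphere + cyl * math.sin(a) ** 2
    f12 = -cyl * math.sin(a) * math.cos(a)
    f22 = sphere + cyl * math.cos(a) ** 2
    return f11, f12, f22

def to_clinical(f11, f12, f22):
    """Return (sphere, cyl, axis_deg) in negative-cylinder form."""
    t = f11 + f22                      # trace
    d = f11 * f22 - f12 * f12          # determinant (symmetric matrix)
    cyl = -math.sqrt(max(t * t - 4.0 * d, 0.0))
    sphere = (t - cyl) / 2.0
    # For a negative cylinder, cos 2A is proportional to (f11 - f22)
    # and sin 2A is proportional to f12.
    axis = 0.5 * math.degrees(math.atan2(2.0 * f12, f11 - f22)) % 180.0
    if axis == 0.0:
        axis = 180.0                   # clinical convention: 180 rather than 0
    return sphere, cyl, axis

# Example round trip for the power -2.00 / -1.50 x 30
entries = to_matrix(-2.0, -1.5, 30.0)
recovered = to_clinical(*entries)
```

The round trip returns the original sphere, cylinder and axis to within floating-point error, which is a quick check that the two sets of formulae are mutually consistent.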

These equations work well for thin systems that meet the requirements when analysing clinical sphero-cylindrical data. However, thick systems, for example, the power of the human eye, are more complicated and require a distinct fourth element in the 2 × 2 matrix thereby rendering it asymmetric. This matter is beyond the scope of this overview and the interested reader is referred elsewhere.^{3,17,20,21,22}

The matrix is an orthodox mathematical concept; therefore, the whole field of linear algebra (matrix algebra) becomes available. With sphero-cylindrical power having a mathematical representation (the dioptric power matrix), any mathematical function is thus possible with refractive or keratometric data including calculating means and variances, which are two paramount statistics when comparing samples of data and making inferences for populations.^{11} All the statistical methods discussed in this review are based on the dioptric power matrix. Harris, Malan and Rubin^{36,37,38,39,40,41} contributed to the development of mathematical, statistical and software methods that were specifically designed to convert such data into matrix representations for multivariate statistical analyses. Refractive powers and corneal powers can be converted into matrices so that they can be plotted in three-dimensional symmetric dioptric power space. Thereafter all the statistical functions and methods required to analyse the data are carried out on the matrix equivalents. For corneal power, raw keratometric data (radii of curvature along principal meridians) must first be converted into conventional powers using a nominal refractive index (usually 1.3375) and then into matrix representations. For the purposes of demonstrating the usefulness of the multivariate methods of analysis, the first author has used figures generated from data that were collected during her postgraduate doctoral studies.^{41} Central corneal power measurements were obtained for a sample of keratoconic and healthy control eyes and thereafter compared.
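The keratometric conversion described above can be illustrated as follows. This is a minimal sketch, assuming the single-surface relation F = (n − 1)/r with the nominal index 1.3375 and treating the flat-meridian power as the sphere with a plus cylinder whose axis lies along the flat meridian; the function names and the example radii are hypothetical.

```python
import math

NOMINAL_INDEX = 1.3375  # nominal keratometric index quoted in the text

def radius_to_power(radius_mm, n=NOMINAL_INDEX):
    """Corneal power in dioptres from a radius of curvature in millimetres."""
    return (n - 1.0) * 1000.0 / radius_mm

def keratometry_to_matrix(r_flat_mm, r_steep_mm, flat_meridian_deg):
    """Return (f11, f12, f22) for keratometric readings along principal meridians."""
    k_flat = radius_to_power(r_flat_mm)
    k_steep = radius_to_power(r_steep_mm)
    # Plus-cylinder form: sphere = flat power, cylinder axis along the flat meridian.
    sphere, cyl = k_flat, k_steep - k_flat
    a = math.radians(flat_meridian_deg)
    f11 = sphere + cyl * math.sin(a) ** 2
    f12 = -cyl * math.sin(a) * math.cos(a)
    f22 = sphere + cyl * math.cos(a) ** 2
    return f11, f12, f22

# Hypothetical readings: flat 8.00 mm along 180, steep 7.50 mm along 90
entries = keratometry_to_matrix(8.0, 7.5, 180.0)
```

With the flat meridian horizontal, the first diagonal entry equals the flat power and the second equals the steep power, as expected from the meaning of the diagonal entries of the matrix.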

Multivariate statistical analysis of dioptric power is based on assumptions such as data normality and equality of variances, and if these assumptions are violated for a particular sample, then the inferences made on such data need to be treated with caution. However, if the data are also represented graphically, the statistical inferences can be checked against the plots so that the conclusions drawn are more meaningful.^{15}

Stereo-pair scatter plots provide a visual representation of dioptric power in its entirety without any underlying assumptions, thus providing graphical substantiation of all statistical assertions made in a study. Each point in a scatter plot represents one refractive power or one corneal power that was converted from its sphero-cylindrical form to a matrix and plotted in three-dimensional Euclidean space.^{16} Thibos and others^{26} use vectors to perform similar analyses but without stereo-pairs, which can be useful in enhancing data visualisation and analysis.

Work by Harris on representing dioptric power graphically evolved over the years to overcome shortcomings he found along the way.^{15,16,21,22} This led to the development of a four-dimensional space called dioptric power space for thick optical systems, which also contains the symmetric dioptric powers representing thin optical systems.^{16,21} Symmetric dioptric power space is the three-dimensional sub-space that represents sphero-cylindrical powers such as those commonly analysed in optometric and ophthalmologic research.

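Harris^{22,23,24,25} represents symmetric dioptric power as:

$$\mathbf{F} = F_I \mathbf{I} + F_J \mathbf{J} + F_K \mathbf{K}$$

also written as

$$\mathbf{F} = F_I \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + F_J \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + F_K \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

or as a coordinate vector

$$\mathbf{f} = (F_I \;\; F_J \;\; F_K)^{\prime}$$

where F_{I} (= F_{st}), F_{J} (= F_{or}) and F_{K} (= F_{ob}) are scalars defined by:

$$F_I = \tfrac{1}{2}(f_{11} + f_{22}), \qquad F_J = \tfrac{1}{2}(f_{11} - f_{22}), \qquad F_K = \tfrac{1}{2}(f_{12} + f_{21})$$

The unit of measurement for the scalars F_{st}, F_{or} and F_{ob} is dioptres; here the subscripts I, J and K, which refer to the basis matrices I, J and K, are used in place of st, or and ob, respectively. The scalar quantities here are the same as M, J_{0} and J_{45} of Thibos and others.^{26}

The basis matrices I, J and K (with coefficients F_{I}, F_{J} and F_{K}, respectively) can be graphed along three mutually orthogonal axes (in three-dimensional dioptric power space), that is, axes along which any dioptric power can be plotted, as in the stereo-pair scatter plots that follow.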
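The stated equivalence between these coordinates and the power vector of Thibos and others can be checked numerically. The sketch below assumes the coordinates are half the sum and half the difference of the diagonal entries plus the off-diagonal entry, and uses the commonly quoted expressions M = S + C/2, J_{0} = −(C/2) cos 2A and J_{45} = −(C/2) sin 2A; the function names are illustrative only.

```python
import math

def coordinates(sphere, cyl, axis_deg):
    """Return (F_I, F_J, F_K) via the dioptric power matrix."""
    a = math.radians(axis_deg)
    f11 = sphere + cyl * math.sin(a) ** 2
    f12 = -cyl * math.sin(a) * math.cos(a)
    f22 = sphere + cyl * math.cos(a) ** 2
    f_i = (f11 + f22) / 2.0   # scalar (stigmatic) coordinate
    f_j = (f11 - f22) / 2.0   # ortho-antistigmatic coordinate
    f_k = f12                 # oblique-antistigmatic coordinate (f12 = f21)
    return f_i, f_j, f_k

def power_vector(sphere, cyl, axis_deg):
    """Return (M, J0, J45) computed directly from clinical notation."""
    a = math.radians(axis_deg)
    m = sphere + cyl / 2.0
    j0 = -(cyl / 2.0) * math.cos(2.0 * a)
    j45 = -(cyl / 2.0) * math.sin(2.0 * a)
    return m, j0, j45

# Example: -2.00 / -1.50 x 30 gives the same triple by both routes
via_matrix = coordinates(-2.0, -1.5, 30.0)
via_thibos = power_vector(-2.0, -1.5, 30.0)
```

Both routes give the same three numbers, which is why either description can serve as the coordinates of a point in symmetric dioptric power space.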

Stereo-pair scatter plots with 95% distribution ellipsoids of (a) 40 consecutive central corneal power measurements taken on a single keratoconic eye and (b) 40 consecutive central corneal power measurements taken on a single control eye.^{41} The stereo-pairs have an axis length of 2 D and a tick interval of 1 D. The origin is placed at the sample mean for each stereo pair.

An ellipsoid is the three-dimensional equivalent of an ellipse. For the purposes of this review, α = 0.05 therefore samples of dioptric power data can be used to generate 95% ellipsoids of constant probability density and 95% confidence ellipsoids for means. Other values can be used for α, which result in change in size and volume; however, the shape and orientation of the ellipsoid are maintained. The principal diameters and principal radii of an ellipsoid are measured along its three mutually orthogonal principal axes. The directions of these axes provide useful information when making comparisons between samples or populations. Every ellipsoid has a centroid or centre (the sample mean). The position of the centroid is maintained regardless of the α used.^{15,16}

In multivariate statistics, analysis done on a random and normally distributed sample can be used to make inferences on the population from which the sample was obtained. A sample of dioptric powers can be used to generate ellipsoids of constant probability density (also referred to as distribution ellipsoids), which provide a graphical representation of the spread of dioptric power in a sample. The size, shape and orientation of these distribution ellipsoids characterise the nature of the variation of the population and provide a visual aid in making comparisons between populations.^{16} As can be seen from ^{41}

While ellipsoids of constant probability density describe the distribution of the population of, for example, dioptric power measurements, confidence ellipsoids are confidence regions centred on the sample mean.^{15,16} Confidence ellipsoids also provide an estimation of the mean of the population. Therefore, for example, one can assume at a 95% level of confidence that the mean of a particular population of dioptric power measurements will lie within the respective 95% confidence ellipsoid. Confidence ellipsoids also demonstrate the accuracy of the mean, that is, the smaller the 95% confidence ellipsoid, the less variation is exhibited by the sample and the more confident one can be about the accuracy of the mean. If the confidence ellipsoids of two samples being compared do not intersect (^{17} The opposite applies when the confidence ellipsoids do intersect (^{15,17,19} are used to compare the variances and also means for the two samples concerned.

Stereo-pair scatter plots for central corneal power with 95% confidence ellipsoids for a single (a) keratoconic eye and a single (b) control eye. The axis length is 1 D, tick interval is 0.25 D and the origin is placed at the mean of the sample concerned. There were two measurement sessions for each participant in the study. Black represents Session one and red Session two. (Points are omitted to allow greater clarity with the small ellipsoids.) The intersection of confidence ellipsoids (or lack thereof) provides an indication of a change in the means over two measurement sessions.

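Adapted from Harris^{11,12,13,14,15,16,17}: the average of a sample of N dioptric power matrices F_{i} is given by:

$$\bar{\mathbf{F}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{F}_i$$

The transpose of the coordinate vector f is

$$\mathbf{f}^{\prime} = (F_I \;\; F_J \;\; F_K)$$

and therefore the mean coordinate vector can be given by:

$$\bar{\mathbf{f}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{f}_i$$

and based on vector f, the sample variance–covariance matrix is

$$\mathbf{S} = \frac{1}{N-1} \sum_{i=1}^{N} (\mathbf{f}_i - \bar{\mathbf{f}})(\mathbf{f}_i - \bar{\mathbf{f}})^{\prime}$$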

According to Harris,^{11}: ‘…the mean is a value around which the sample clusters while the variance is a measure of the spread or dispersion of the cluster around the mean’. The mean and variance of a sample are statistical characteristics that are pertinent to finding correlations in data and drawing conclusions about the populations that the samples represent.
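The mean and the variance–covariance of a sample of coordinate vectors can be computed directly, as in the following sketch. It assumes each power has already been converted to a coordinate triple and uses the usual unbiased divisor N − 1; the function names and the three sample values are made up for illustration and are not data from the study.

```python
def mean_vector(sample):
    """Component-wise mean of a list of (F_I, F_J, F_K) triples."""
    n = len(sample)
    return [sum(f[k] for f in sample) / n for k in range(3)]

def covariance_matrix(sample):
    """3 x 3 sample variance-covariance matrix with divisor N - 1."""
    n = len(sample)
    m = mean_vector(sample)
    return [
        [sum((f[i] - m[i]) * (f[j] - m[j]) for f in sample) / (n - 1)
         for j in range(3)]
        for i in range(3)
    ]

# Hypothetical coordinate vectors in dioptres
sample = [(43.0, 0.5, 0.1), (44.0, 0.7, -0.1), (45.0, 0.9, 0.0)]
f_bar = mean_vector(sample)          # centre of the cluster
s = covariance_matrix(sample)        # spread of the cluster, in D^2
```

The diagonal of the result holds the three variances and the off-diagonal entries hold the covariances, so the matrix is symmetric by construction.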

Harris^{6,9} discussed some of the methods used previously to average refractive error. He pointed out that while some of these methods led to the correct answer most times, there were instances where these methods were found to be incomplete or blatantly incorrect. An example provided by Harris^{9} is as follows: the naive mean (obtained by averaging each component of a sphero-cylindrical power individually) of 1 −1 × 1 and 1 −1 × 179 is 1 −1 × 90, which is clearly wrong because both powers have axes within 1° of the horizontal. With the use of matrices,^{4,15} an explicit method of finding the mean of refractive errors was developed (see the expressions for the mean given earlier).
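The failure of the naive mean in this example can be reproduced numerically. The sketch below assumes the thin-system conversion formulae in negative-cylinder form and averages the matrix entries before converting back to clinical notation; the helper names are illustrative and not from the source.

```python
import math

def to_matrix(sphere, cyl, axis_deg):
    a = math.radians(axis_deg)
    return (sphere + cyl * math.sin(a) ** 2,
            -cyl * math.sin(a) * math.cos(a),
            sphere + cyl * math.cos(a) ** 2)

def to_clinical(f11, f12, f22):
    t = f11 + f22
    d = f11 * f22 - f12 * f12
    cyl = -math.sqrt(max(t * t - 4.0 * d, 0.0))
    sphere = (t - cyl) / 2.0
    axis = 0.5 * math.degrees(math.atan2(2.0 * f12, f11 - f22)) % 180.0
    return sphere, cyl, (180.0 if axis == 0.0 else axis)

def naive_mean(powers):
    """Average sphere, cylinder and axis separately (the flawed approach)."""
    n = len(powers)
    return tuple(sum(p[k] for p in powers) / n for k in range(3))

def matrix_mean(powers):
    """Average the matrix entries, then convert back to clinical notation."""
    mats = [to_matrix(*p) for p in powers]
    n = len(mats)
    return to_clinical(*(sum(m[k] for m in mats) / n for k in range(3)))

powers = [(1.0, -1.0, 1.0), (1.0, -1.0, 179.0)]
naive = naive_mean(powers)     # axis lands at 90 degrees
proper = matrix_mean(powers)   # axis lands at the horizontal (180 or within rounding of it)
```

The matrix mean is very nearly 1 −1 × 180, with sphere and cylinder magnitudes fractionally below 1 D, which agrees with the intuition that two almost identical powers should average to something close to both of them.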

Stereo-pair scatter plots of comets joining the means of the measurements taken over two sessions: (a) for a group of eyes with keratoconus and (b) for a group of healthy control eyes. For each comet, the dot represents the mean of a set of 40 measurements taken in Session one for the eye concerned, and the comet ends at the mean of the measurements taken in Session two for the same eye. The origin is placed at the respective sample means. The axis lengths and tick intervals are 4 D and 1 D, respectively.

In each of the comets in

Saunders^{26} asserted that dioptric power could not be squared, and hence the variance of a sample could not be calculated. Harris^{6,7,13,15} found this to be incorrect and showed for the first time that variance–covariance matrices could be calculated for dioptric power. The variance–covariance matrix of a sample of coordinate vectors is symmetric and has units of D^{2}:

$$\mathbf{S} = \begin{pmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{pmatrix}$$

There are six distinct entries in this matrix that describe the variances and covariances of a sample. The diagonal entries s_{11}, s_{22} and s_{33} characterise the variances for F_{I}, F_{J} and F_{K}, respectively. The off-diagonal entries s_{12} = s_{21}, s_{13} = s_{31} and s_{23} = s_{32} characterise the covariances between F_{I} and F_{J}, F_{I} and F_{K} and F_{J} and F_{K}, respectively. The variances and covariances across the meridians of one or more eyes can be graphically represented by means of a polar profile.^{42}


Polar profiles of variation in central corneal power of (a) a keratoconic eye and (b) a control eye. The solid curve is curvital and the dashed curve is torsional. The radial scale and outer circle (dotted) are set at 0.5 D^{2} and 0.025 D^{2} for (a) and (b), respectively. For the keratoconic eye, the meridian of greatest curvital variance is near 70° and torsional variance is maximal near 115°, whereas the control eye displays maximum curvital variation near 80° and maximum torsional variation near 120°. The variances for the control eye are much smaller than those of the keratoconic eye.

If variation is purely scalar (spherical), then the scaled torsional variance profile will be reduced to a point at the origin of the polar plot, and the curvital variance profile becomes a semi-circle of constant radius. When variation is uniform across all meridians of the eye, both profiles take on a semi-circular shape. If variation is non-uniform, then the profiles depart from being semi-circles. The torsional variance profile often assumes the shape of a pair of ‘rabbit ears’, as described by Van Gool^{42} and seen in the polar profiles above. As noted by Van Gool^{42} and also shown by Gillan,^{33} there appears to be a strong correlation between the position of the rabbit ears and the major axis of the corresponding distribution ellipsoid. If, for example, the rabbit ears orientate symmetrically about the 90° axis, then one can expect to find the major axis of the corresponding distribution ellipsoid orientated parallel to the antistigmatic plane when viewed straight down the stigmatic axis. The following two examples are provided to illustrate how central corneal power varies across all the meridians of the eye and not just within the principal meridians, as indicated by many researchers who analyse this sort of data only along the horizontal and/or vertical meridians.
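The behaviour of the two profiles can be explored with a short calculation. This is a sketch under stated assumptions: the curvital power along meridian θ is taken as F_{I} + F_{J} cos 2θ + F_{K} sin 2θ and the torsional power as −F_{J} sin 2θ + F_{K} cos 2θ, so each meridional variance is a quadratic form in the 3 × 3 variance–covariance matrix. No scaling of the torsional variance is applied, so the published profiles of Van Gool may differ by a constant factor; the function name is hypothetical.

```python
import math

def meridional_variances(s, theta_deg):
    """Return (curvital, torsional) variance along a meridian, in D^2.

    s is a 3 x 3 variance-covariance matrix of (F_I, F_J, F_K).
    """
    t = math.radians(2.0 * theta_deg)
    a = (1.0, math.cos(t), math.sin(t))    # curvital power = a . f
    b = (0.0, -math.sin(t), math.cos(t))   # torsional power = b . f

    def quad(w):
        return sum(w[i] * s[i][j] * w[j] for i in range(3) for j in range(3))

    return quad(a), quad(b)

# Purely scalar variation: only the F_I variance is non-zero
scalar_only = [[0.25, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
profile = [meridional_variances(scalar_only, th) for th in range(0, 180, 10)]
```

For the purely scalar case every meridian returns the same curvital variance and zero torsional variance, which matches the description of a semi-circle of constant radius and a point at the origin.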

For

Note that the scales are different for the two polar plots, and thus they cannot be so easily compared directly in terms of magnitude of variation. However, on closer inspection of the figures, one would note that the keratoconic eye exhibits the most variation and the control eye exhibits the least variation, irrespective of type of variation (i.e. curvital or torsional). The use of polar plots highlights clearly that dioptric power variation is not isolated to only principal meridians or the horizontal and vertical meridians (180° and 90°) and that there is also a torsional component of variation (albeit small in healthy eyes) that accompanies the curvital variation. This is an important finding and this method of analysis (i.e. polar plots of variance) tends to be ignored or misunderstood in more traditional dioptric power analysis.

Hypothesis testing forms an integral part of statistical analysis. It allows one to make assumptions at a certain level of confidence regarding a population. The multivariate test statistic ^{2} statistic. Details on these statistics and their respective definitions are provided elsewhere^{15} for a more comprehensive understanding. Hypothesis tests are conducted on the variance–covariances and means for all sets of measurements. For example, the equality of variance–covariances and means for a set of dioptric power measurements taken in a first session can be tested against the variance–covariances and means for the dioptric power measurements taken in a second session. For _{v} and _{vv} representing the mean and the variance–covariances of a population of dioptric power data, the hypothesis tests are as follows:^{15,19}

For testing means, the null hypothesis

is tested against the alternative hypothesis

and the test statistic is given by

The null hypothesis is rejected if the test statistic is greater than or equal to the critical value, that is,

in which

For example, α = 0.05,

For testing variance–covariances, the null hypothesis

is tested against the alternative hypothesis

and the test statistic is given by

The null hypothesis is rejected if the test statistic is greater than the critical value, that is,

where the critical value of χ^{2} can be found in the chi-square distribution table. For example, for α = 0.05 and six degrees of freedom, the critical value is 12.592.

The underlying assumption when conducting hypothesis tests on means is that Σ = Σ_{0}. An adequate alternative for cases where Σ ≠ Σ_{0} has not been found, and this is referred to as the Behrens–Fisher problem.^{19}
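As an illustration of a test on means under the equal-covariance assumption, the following sketch computes a generic two-sample Hotelling T^{2} statistic with a pooled variance–covariance matrix and its conversion to an F value. This is a standard textbook form and is an assumption here: it may differ in detail from the statistic defined in the cited references. The function names and the two small samples are hypothetical, and the critical value must still be looked up in an F table.

```python
def mean_vector(sample):
    n = len(sample)
    return [sum(f[k] for f in sample) / n for k in range(3)]

def covariance_matrix(sample):
    n, m = len(sample), mean_vector(sample)
    return [[sum((f[i] - m[i]) * (f[j] - m[j]) for f in sample) / (n - 1)
             for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve a z = b for a 3 x 3 matrix by Cramer's rule."""
    d = det3(a)
    return [det3([[b[i] if j == k else a[i][j] for j in range(3)]
                  for i in range(3)]) / d for k in range(3)]

def hotelling_two_sample(x, y):
    """Return (T2, F, (df1, df2)) assuming equal population covariances."""
    n1, n2, p = len(x), len(y), 3
    s1, s2 = covariance_matrix(x), covariance_matrix(y)
    pooled = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
               for j in range(3)] for i in range(3)]
    diff = [u - v for u, v in zip(mean_vector(x), mean_vector(y))]
    z = solve3(pooled, diff)
    t2 = n1 * n2 / (n1 + n2) * sum(d * w for d, w in zip(diff, z))
    f_value = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_value, (p, n1 + n2 - p - 1)

# Hypothetical sessions: the second is the first shifted by 1 D in F_I
session1 = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
session2 = [(a + 1, b, c) for a, b, c in session1]
t2, f_value, dof = hotelling_two_sample(session1, session2)
```

The null hypothesis of equal means would be rejected when the F value reaches the tabulated critical value for the returned degrees of freedom at the chosen α.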

Multivariate statistical analysis is generally based on the assumptions of random selection and multivariate normality of samples. Multivariate normality is modelled with respect to the bell-shaped curve and the associated assumptions that the distribution is continuous, perfectly symmetrical, unskewed and mesokurtic and that the mean, median and mode are all equal.^{43} Because of the nature of sphero-cylindrical data, one or more of these underlying assumptions are not always met. Because no solution to this problem has yet been found, it cannot be circumvented; one therefore continues with the statistical analysis but interprets the results with prudence.

Multivariate normality can be investigated with the aid of various tools such as skewness and kurtosis, identification of possible outliers (using Mahalanobis distances; ^{44} can be used to calculate the skewness and kurtosis for multivariate samples such as dioptric power. The developers of this programme discuss how the calculations are done to arrive at a ^{45} Possible outliers are assumed atypical in the sample and may be subjectively identified as one or more data points that appear to be far removed from the cluster of data. The Mahalanobis distance is an objective method to quantify the distance so that one can identify possible outliers more accurately.
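The Mahalanobis distance mentioned above can be sketched as follows. The code assumes the sample mean and variance–covariance matrix are already available, and flags points using a chi-square cut-off with three degrees of freedom (7.815 at the 95% level); that cut-off is an illustrative assumption and not necessarily the criterion used in the study. All names and numbers are hypothetical.

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve a z = b for a 3 x 3 matrix by Cramer's rule."""
    d = det3(a)
    return [det3([[b[i] if j == k else a[i][j] for j in range(3)]
                  for i in range(3)]) / d for k in range(3)]

def mahalanobis_sq(point, mean, s):
    """Squared Mahalanobis distance of a coordinate vector from the mean."""
    diff = [p - m for p, m in zip(point, mean)]
    z = solve3(s, diff)                      # z = S^-1 (x - mean)
    return sum(d * w for d, w in zip(diff, z))

def possible_outliers(sample, mean, s, cutoff=7.815):
    """Indices of points whose squared distance exceeds the cut-off (assumed)."""
    return [i for i, f in enumerate(sample)
            if mahalanobis_sq(f, mean, s) > cutoff]

# Hypothetical mean (D) and variance-covariance matrix (D^2)
mean = (44.0, 0.5, 0.0)
s = [[0.25, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 0.01]]
d2 = mahalanobis_sq((44.5, 0.5, 0.1), mean, s)
flagged = possible_outliers([(44.5, 0.5, 0.1), (46.0, 0.5, 0.0)], mean, s)
```

Because the distance is scaled by the variance–covariance matrix, a deviation of 0.1 D along a tightly clustered coordinate counts as much as a deviation of 0.5 D along a widely spread one.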

The Mahalanobis distances for central corneal power measurements.^{41} None of the 40 measurements for a single eye reach the critical distance (or percentage) that would imply possible outliers (≤ 90%).

Multivariate methods of analysis are especially pertinent when investigating keratometric and refractive power data, which are fundamentally trivariate in nature. With the aid of multivariate methods of analysis, keratometric and refractive data can be represented in their entirety; that is, all three components of sphere, cylinder and axis are used to plot data on three-dimensional stereo-pairs, which have distinct advantages for data visualisation and quantitative analysis in comparison to other methods that use vector analysis in two-dimensional space only. Although some of these methods may seem complicated at first, this review offers a simplified overview of some multivariate methods of analysis that are available to analyse dioptric power holistically.

The authors declare that they have no financial or personal relationships that may have inappropriately influenced them in writing this article.

E.C. and A.R. contributed equally to this work.

Approval to conduct the study was obtained from the Research Ethics Committee of the Faculty of Health Sciences of the University of Johannesburg, South Africa (ethical clearance number: REC-241112-035).

This study was funded by the Thuthuka grant (TTK160519165562) from the National Research Foundation in South Africa.

Data sharing is not applicable to this article as no new data were created or analysed in this study.

The views and opinions expressed in this article are those of the authors and do not necessarily reflect the official policy or position of any affiliated agency of the authors.