QUALITATIVE ANALYSIS

by

Per-Erik E. Bergner

per-erik.bergner@comhem.se

 

 

 

A fundamental procedure of science is comparison. Perhaps one could go so far as to say that the very purpose of science is comparison of systems; it could be the same “system” on two different occasions (e.g. a lake in summer and in winter) or it could just be two different systems, like two rabbits. By the term quantitative analysis I mean that the comparison is in terms of numerical estimates of some parameter(s). An example would be a simple irreversible first-order chemical reaction of the type $a \rightarrow \text{products}$ taking place in a small well-stirred vessel. If the volume concentration C of a is observed as a function of time, which according to classic theory should be of the form $C(t) = C_0 e^{-kt}$ (with $C_0$ the concentration at time zero), the so-called rate constant k can be estimated. A rather popular project has been to study such a reaction at different temperatures in order to see how the rate of reaction depends on temperature: one compares the same system (reaction) at different temperatures, and those comparisons are made in terms of the parameter k. That is, for each selected value of the temperature the value of k is estimated from the above function and, hence, one can see how the rate constant depends on the temperature. Thus, one can say that the reaction goes so and so much faster at one temperature than at another: the value of k allows for a quantitative measure of “how much faster”. From such a temperature dependency one might estimate another parameter, namely the activation energy (the Arrhenius equation), and then compare quantitatively different but similar reactions, different systems that is, in terms of that parameter (yes, the temperature behaviour of each such reaction must be determined separately, so we are considering an investigation that usually takes a little more time than just a few hours).
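The two estimation steps described here can be sketched in a few lines. This is only an illustration under stated assumptions: the concentration is taken to follow the classic first-order form C(t) = C0·exp(−kt), the Arrhenius relation is used in its usual form ln k = ln A − Ea/(RT), and both parameters are obtained by ordinary least squares on the log-transformed values. The function names and the synthetic, noise-free numbers are hypothetical and do not come from any study.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_rate_constant(times, concentrations):
    """Estimate k from ln C = ln C0 - k t (assumed first-order decay)."""
    slope, _ = fit_line(times, [math.log(c) for c in concentrations])
    return -slope


def estimate_activation_energy(temperatures_kelvin, rate_constants):
    """Estimate Ea from ln k = ln A - Ea / (R T): slope of ln k vs 1/T is -Ea/R."""
    slope, _ = fit_line(
        [1.0 / t for t in temperatures_kelvin],
        [math.log(k) for k in rate_constants],
    )
    return -slope * R_GAS


# Synthetic, noise-free illustration (all numbers are made up).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
concentrations = [2.0 * math.exp(-0.3 * t) for t in times]
print("estimated k:", estimate_rate_constant(times, concentrations))

temperatures = [290.0, 300.0, 310.0, 320.0]
rate_constants = [1e10 * math.exp(-50000.0 / (R_GAS * t)) for t in temperatures]
print("estimated Ea (J/mol):", estimate_activation_energy(temperatures, rate_constants))
```

With real data the scatter of the estimates of k around the fitted line is what decides whether a comparison between temperatures, or between reactions, is meaningful.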

There are parameters that can be numerically estimated directly, as it were. As we are here dealing with a well-stirred system, the simple notion of temperature is an example of such a quantity. In principle it is just a matter of placing a thermometer in the mixture and reading it, which in grander language is called “making an empirical observation”.

Of importance is that the numerical estimates are sufficiently good, in the sense that a comparison of parameters is feasible in terms of the estimates. Almost invariably this means that the statistical variance is reasonably small and that the sample size is sufficiently large.

It is this demand that often causes difficulties, especially in biomedical work, where the variances are often annoyingly large and the sample sizes frustratingly small, to the extent that the usual quantitative comparison of parameter values becomes meaningless. An example is when the data are the result of an accident and the amount of data is quite limited but can nevertheless be expected to contain valuable information (e.g. in the past, radiation accidents have been a dominant source of information for understanding man’s response to radioactive irradiation). For such situations I have earlier suggested a form of data analysis that I call qualitative analysis. As discussed by Gradijan and Bergner (1972), it is somewhat difficult to describe in reasonably simple terms (there is as yet no general formalism), and therefore a description of the methodology must be somewhat heuristic.

 

Perhaps it would be correct to identify quantitative analysis with what might be called classic analysis. An aspect of this kind of data analysis would then be that it employs, in my personal terminology, “Fisher statistics”, a characteristic feature of which is that terms such as level of confidence and level of significance appear in the basic terminology. A typical feature of qualitative analysis should thus be that such notions appear nowhere, simply because there is no attempt at parameter estimation.

 

For instance, it could be that one is able to state “data are consistent with the hypothesis that if the quantity X increases also the quantity Y increases”, where one should note the use of the term “consistent” and, at the same time, that nothing is said about how fast the increases are. If we first consider this notion of consistency, one could perhaps say that in qualitative analysis one tries to collect independent consistencies (e.g. a number of independent sets of data all pointing in the same direction), and the larger that number of consistencies is, the stronger is the empirical support for the contemplated hypothesis; in a way the number of consistencies replaces the common notion of significance level in quantitative analysis.
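Counting consistencies over independent data sets might be sketched as follows. The criterion used here is an assumption made purely for illustration, since no general formalism exists: a data set is taken to be consistent with “Y increases when X increases” if more pairs of observations move in the same direction than in opposite directions. No slope is estimated at any point. The function names and the data are hypothetical.

```python
def is_consistent_with_increase(xs, ys):
    """True if concordant pairs (X and Y move the same way) outnumber discordant ones.

    Illustrative criterion only; ties in X or Y are ignored.
    """
    concordant = 0
    discordant = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            product = (xs[j] - xs[i]) * (ys[j] - ys[i])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return concordant > discordant


def count_consistencies(datasets):
    """Number of data sets (assumed mutually independent) consistent with the hypothesis."""
    return sum(1 for xs, ys in datasets if is_consistent_with_increase(xs, ys))


# Three small, made-up data sets; independence between them is assumed, not checked.
datasets = [
    ([1, 2, 3, 4], [2.0, 1.8, 3.1, 3.5]),
    ([0.5, 1.0, 1.5], [10.0, 12.0, 11.0]),
    ([1, 2, 3], [5.0, 4.0, 3.0]),
]
print(count_consistencies(datasets), "of", len(datasets), "data sets are consistent")
```

Whether the data sets really are independent is the hard part, and it is not something the code can decide.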

If we then turn to the fact that nothing is said about the magnitudes of the increases of the involved quantities, one may observe that a formal expression of this aspect would be:

$$\frac{dY}{dX} > 0$$

Personally I feel that this expression illustrates pretty well not only the qualitative character of this particular hypothesis but also the very essence of the term “qualitative” as such.

It could perhaps be tempting to refer to the present notion of qualitative analysis as nonparametric analysis. However, that term has for many decades been occupied by a type of data analysis that should rather have been named “distribution-free analysis” or, why not, “robust analysis”. An old and almost timeless manual of this form of “classic statistics” is the monograph by S. Siegel (1956), which has appeared in quite a number of editions.

Another aspect would be that, whereas in classic quantitative analysis one might use statements like “in agreement with data”, a corresponding statement in qualitative analysis would be “consistent with data”. Often this means that in qualitative analysis equalities are replaced by inequalities.

To illustrate this point let us assume that we observe two time processes for which we have constructed the models  and  and that the observations have resulted in the data  and  corresponding to the time points . If then for all values of i we have  and  we might say that the models are consistent with data; yes, it would be too much to say that “the models agree with data”, and at least to me the phrase “the models are consistent with data” sounds more appropriate (more modest as it were).

 

It would perhaps be tempting here to say that there are n consistencies in this example, but (and this cannot be stated too often) it is typical of time processes that consecutive values are not mutually independent and, therefore, consecutive observations are not mutually independent either. In fact, it is occasionally pretty difficult to say what the number of independent consistencies actually is.

 

Unquestionably there is commonly a need for as many independent consistencies as possible, and application of the notion of stochastic process might then be called for. For instance, should the time processes in the preceding paragraph be stochastic, it could be that the quantities a and b are mean values of the observed quantities and, hence, it would be meaningful and possible to consider the corresponding standard deviations $s_a$ and $s_b$ as functions of time: should inequalities similar to those above hold also for $s_a$ and $s_b$, one could rather safely speak of at least two independent consistencies (with respect to the mean values a and b, and with respect to the standard deviations $s_a$ and $s_b$). This aspect has been explored by Gradijan and Bergner (loc. cit.): the model is “fitted” to data with respect to the behaviour of both the mean values and the standard deviations as time functions; the variance is then not merely an annoyance but in fact a property of the investigated system.
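A small sketch may make the idea of treating the mean value and the standard deviation as two separate time functions more concrete. It rests on assumptions that are not in the text: several replicate trajectories are observed at common time points, and the hypothetical model predicts only that both the mean and the standard deviation do not decrease with time. That inequality is a placeholder chosen for illustration; it is not the one used by Gradijan and Bergner. All names and numbers are invented.

```python
from statistics import mean, stdev


def summarize_by_time(replicates):
    """Return (means, standard deviations) per time point.

    replicates: list of trajectories, each a list of values at the same time points.
    """
    columns = list(zip(*replicates))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


def count_moment_consistencies(replicates):
    """Count how many of the two features (mean, sd) obey the assumed inequality."""
    means, sds = summarize_by_time(replicates)
    return int(is_nondecreasing(means)) + int(is_nondecreasing(sds))


# Three made-up replicate trajectories observed at three common time points.
replicates = [
    [1.0, 2.0, 3.0],
    [1.2, 2.6, 4.0],
    [0.8, 1.6, 2.3],
]
means, sds = summarize_by_time(replicates)
print("means:", means)
print("standard deviations:", sds)
print("consistencies:", count_moment_consistencies(replicates))
```

The point of the sketch is only that the standard deviation is examined as a feature of the system in its own right, alongside the mean, rather than being treated as noise to be minimized.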

 

To view statistical variance as a system property is old hat in today’s physics but not so in biology and medicine, where the methodology has for a long time been focused on methods for making that data feature as “small” as possible. As a matter of fact, when such variance appears as a significant system property in a biomedical investigation conceptual problems arise. For instance, imagine an animal study where the toxicity of two drugs is investigated and that the drugs have the same  but fundamentally different variance. To the best of my knowledge the average pharmacologist of today does not have an appropriate conceptual apparatus for dealing with that kind of situation; i.e. to meaningfully discriminate between the two drugs with respect to that particular data feature.

 

Perhaps it is correct to say that one of the purposes of qualitative analysis is to save data, in the sense of preventing data from being discarded as “insufficient” (i.e. insufficient when viewed in terms of quantitative analysis: too large a variance and too small a sample). An example of this is a dose-response study where it could be demonstrated, from hopeless-looking data, that a change in dose had not only a quantitative effect but a qualitative effect as well: a low dose could be more harmful than the data revealed to the naked eye (Bergner, 1969).

But qualitative analysis is seemingly more subjective than is quantitative analysis, though the term “seemingly” should be carefully noted: for instance, often statistical tests in quantitative analysis require conditions that are assumed (Sic!) to be valid, and frequently such assumptions are quite subjective indeed.

In summary, a basic difference between quantitative and qualitative analysis is that in the latter no attempts are made to estimate parameter values. In qualitative analysis one frequently experiences this as some extra freedom in the selection of models and, technically, as equalities (in quantitative analysis) being replaced by inequalities.

REFERENCES

Bergner, P.-E. E. (1969): Theory of quantitative radiation-response time-data. USAEC Report ORAU-109, 25 pp. Available through the U.S. Clearinghouse, Springfield, Virginia.

Gradijan, J. R. & Bergner, P.-E. E. (1972): Qualitative consequences of randomness in a linear kinetic system. Biometrics 28, 313-328.

Siegel, S. (1956): Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, Inc.