QUALITATIVE ANALYSIS
by
Per-
A fundamental procedure of science is comparison. Perhaps one could go so far as to
say that the very purpose of science is comparison of systems; it could be the same
“system” on two different occasions (e.g. a lake in summer and in winter) or it
could just be two different systems, like two rabbits. By the term quantitative analysis I mean that the comparison is in
terms of numerical estimates of some parameter(s). An example would be a simple
irreversible first-order chemical reaction, in which a substance a is consumed,
taking place in a small well-stirred vessel. If the volume concentration C of a is
observed as a function of time, which according to classic theory should be of
the form
$C(t) = C(0)\,e^{-kt}$,
the so-called rate constant k can be estimated. What has been a rather
popular project is to study such a reaction at different temperatures in order
to see how the rate of reaction depends on temperature: one compares
the same system (reaction) at different temperatures, and those
comparisons are made in terms of the parameter k. That is, for each selected value of
the temperature the value of k is estimated from the above
function and, hence, one can see how the rate constant depends on the
temperature. Thus, one can say that the reaction goes so and so much faster at
one temperature than at another temperature: the value of k allows for a quantitative measure of “how much faster”. From such a
temperature dependency one might estimate another parameter, namely the activation
energy (the Arrhenius equation), and then compare quantitatively different but similar reactions
(different systems, that is)
in terms of that parameter (yes, the
temperature behaviour of each such reaction must be determined
separately, so we are considering an investigation that usually takes a little
more time than just a few
hours).
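The two estimation steps just described can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the procedure of any particular study: it assumes the first-order form C(t) = C(0) exp(-kt), so that ln C is a straight line in t with slope -k, and the Arrhenius relation k = A exp(-Ea/(RT)), so that ln k is a straight line in 1/T with slope -Ea/R. The function names and all numerical values are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol K)

def slope_intercept(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def estimate_rate_constant(times, concentrations):
    """Estimate k, assuming C(t) = C(0) * exp(-k t)."""
    slope, _ = slope_intercept(times, [math.log(c) for c in concentrations])
    return -slope

def estimate_activation_energy(temperatures_kelvin, rate_constants):
    """Estimate Ea in J/mol, assuming k = A * exp(-Ea / (R T))."""
    inverse_t = [1.0 / t for t in temperatures_kelvin]
    log_k = [math.log(k) for k in rate_constants]
    slope, _ = slope_intercept(inverse_t, log_k)
    return -slope * R

# Synthetic, noise-free observations (hypothetical values, illustration only).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
true_k = {290.0: 0.10, 300.0: 0.20, 310.0: 0.38}  # temperature (K) -> k

k_hat = {}
for temperature, k in true_k.items():
    concentrations = [2.0 * math.exp(-k * t) for t in times]
    k_hat[temperature] = estimate_rate_constant(times, concentrations)

Ea = estimate_activation_energy(list(k_hat), list(k_hat.values()))
print(k_hat)
print(Ea)
```

With real data the interesting question is of course how the scatter in the observations propagates into k and then into the activation energy, which is exactly where the demands on variance and sample size enter.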
There are parameters that can be numerically estimated directly, as it were. Since we are here dealing with a well-stirred system, the simple notion of temperature is an example of such a quantity. In principle it is just a matter of placing a thermometer in the mixture and reading it, which in grander language is called “making an empirical observation”.
Of importance is that the numerical estimates are sufficiently good in the sense that a comparison of parameters is feasible in terms of the estimates. Almost invariably this means that the statistical variance is reasonable and that the sample size is large.
It is this demand that often causes difficulties, especially in bio-medical work, where the variances are often annoyingly large and the sample sizes frustratingly small, to the extent that the usual quantitative comparison of parameter values becomes meaningless. An example is when data are the result of an accident, so that the amount of data is quite limited but can nevertheless be expected to contain valuable information (e.g. in the past, radiation accidents have been a dominant source of information for the understanding of man’s response to radioactive irradiation). For such situations I have earlier suggested a form of data analysis that I name qualitative analysis. As discussed by Gradijan and Bergner (1972), it is somewhat difficult to describe in reasonably simple terms (there is as yet no general formalism), and therefore a description of the methodology must be somewhat heuristic.
Perhaps it would be correct to identify quantitative analysis with what might be called classic analysis. An aspect of this kind of data analysis should then be that it employs, in my personal terminology, “Fisher statistics”, a characteristic feature of which is that terms such as level of confidence and level of significance appear in the basic terminology. A typical feature of qualitative analysis should thus be that such notions appear nowhere, simply because there is no attempt at parameter estimation.
For instance, it could be that one is able to state “data are consistent with the hypothesis that if the quantity X increases also the quantity Y increases”, where one should note the use of the term “consistent” at the same time as nothing is said about how fast the increases are. And if we first consider this notion of consistent, one could perhaps say that in qualitative analysis one tries to collect independent consistencies (e.g. a number of independent sets of data all pointing in the same direction), and the larger that number of consistencies is the stronger is the empirical support for the contemplated hypothesis; in a way the number of consistencies replaces the common notion of significance level in quantitative analysis.
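The idea of collecting consistencies can be made concrete with a small sketch. The criterion used here is an assumption chosen for illustration, since the text does not fix an operational definition: a data set is called consistent with “if X increases, Y also increases” when no pair of observations shows X and Y moving in opposite directions. Only orderings are used, never magnitudes. The function names and the data sets are hypothetical, and the data sets are simply assumed to be mutually independent.

```python
def is_consistent_with_increase(xs, ys):
    """True if no pair of observations contradicts 'larger X goes with larger Y'.

    Only the signs of the differences are used; ties in X or Y are
    treated as uninformative rather than as contradictions.
    """
    pairs = list(zip(xs, ys))
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            dx = pairs[j][0] - pairs[i][0]
            dy = pairs[j][1] - pairs[i][1]
            if dx * dy < 0:  # X and Y moved in opposite directions
                return False
    return True

def count_consistencies(datasets):
    """Number of data sets that are consistent with the hypothesis."""
    return sum(1 for xs, ys in datasets if is_consistent_with_increase(xs, ys))

# Three hypothetical data sets, assumed to be independent of each other.
datasets = [
    ([1, 2, 3], [0.5, 0.7, 4.0]),    # consistent
    ([10, 20, 15], [3, 9, 5]),       # consistent (order of observation is irrelevant)
    ([1, 2, 3, 4], [2, 1, 3, 5]),    # not consistent: Y drops from 2 to 1
]

n_consistent = count_consistencies(datasets)
print(n_consistent, "of", len(datasets), "data sets are consistent")
```

Note that the first data set counts as one consistency whether Y rises slowly or abruptly; the rate of increase never enters.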
If we then turn to the matter that nothing is said about the magnitudes of the increase of the involved quantities, one may observe that a formal expression of this aspect would be:
$\frac{dY}{dX} > 0$
Personally I feel that this expression illustrates pretty well not only the qualitative character of this particular hypothesis but also the very essence of the term “qualitative” as such.
It could perhaps be tempting to refer to the present notion of qualitative analysis as nonparametric analysis. However, that term has for many decades been occupied by a type of data analysis that should rather have been named “distribution free analysis” or, why not, “robust analysis”. An old and almost timeless manual of this form of “classic statistics” is the monograph by S. Siegel (1956), which has appeared in quite a number of editions.
Another aspect would be that whereas in classic quantitative analysis one might use statements like “in agreement with data” a corresponding statement in qualitative analysis would be “consistent with data”. Often this means that in qualitative analysis equalities are replaced by inequalities.
To illustrate this point let us assume that we observe
two time processes for which we have constructed the models and
and that the observations have resulted in the
data
and
corresponding to the time points
.
If then for all values of i we have
and
we might say that the models are consistent
with data; yes, it would be too much to say that “the models agree with data”,
and at least to me the phrase “the models are consistent with data” sounds more
appropriate (more modest as it were).
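How such a check against inequalities might look in practice is sketched below. The particular inequalities are assumptions made purely for illustration (one model is treated as an upper bound for its data, the other as a lower bound); any other relation can be passed in instead. The models, the data and all names are hypothetical.

```python
import math

def consistent(model, times, data, relation):
    """True if relation(model(t_i), x_i) holds at every observed time point."""
    return all(relation(model(t), x) for t, x in zip(times, data))

# Hypothetical models of the two time processes.
def model_a(t):
    return 10.0 * math.exp(-0.5 * t)  # assumed to bound the first data set from above

def model_b(t):
    return 1.0 + 0.2 * t              # assumed to bound the second data set from below

# Hypothetical observations at the time points t_1, ..., t_n.
times = [0.0, 1.0, 2.0, 3.0]
data_a = [9.1, 5.2, 3.0, 2.0]
data_b = [1.4, 1.3, 1.9, 2.5]

a_ok = consistent(model_a, times, data_a, lambda m, x: x <= m)
b_ok = consistent(model_b, times, data_b, lambda m, x: x >= m)

if a_ok and b_ok:
    print("the models are consistent with data")
else:
    print("the models are not consistent with data")
```

Nothing is fitted here and no parameter is estimated: the models either survive the comparison at every time point or they do not.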
It would perhaps be alluring here to say that there are n consistencies in this example,
but (and this cannot be stated too often) it is typical of time processes that
consecutive values are not mutually independent and, therefore, consecutive
observations are not mutually independent. In fact, it is occasionally pretty
difficult to say what the number of independent consistencies actually is.
Unquestionably there is commonly a need for as many independent consistencies as possible, and application of the notion of stochastic process might then be called for. For instance, should the time processes in the preceding paragraph be stochastic it could be that the quantities a and b are mean values of the observed quantities and, hence, it would be meaningful and possible to consider the corresponding standard deviations sa and sb as functions of time: should inequalities similar to those above hold also for sa and sb, one could rather safely speak of at least two independent consistencies (with respect to the mean values a and b, and with respect to standard deviations sa and sb). This aspect has been explored by Gradijan and Bergner (loc cit.): the model is “fitted” to data with respect to the behaviour of the mean values and the standard deviations as time functions; the variance is then not merely an annoyance but in fact a property of the investigated system.
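The treatment of the standard deviation as a time function in its own right can be illustrated as follows. This is a sketch under assumptions and not the method of Gradijan and Bergner: several replicate runs of the same stochastic process are assumed to be available at common time points, and the model is assumed to supply a lower and an upper bound at each time point, both for the mean and for the standard deviation. All names and numbers are hypothetical.

```python
from statistics import mean, stdev

def summarize(replicates):
    """Mean and sample standard deviation across replicates, per time point."""
    columns = list(zip(*replicates))  # one tuple of observations per time point
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def within(values, lower, upper):
    """True if lower_i <= value_i <= upper_i at every time point."""
    return all(lo <= v <= hi for v, lo, hi in zip(values, lower, upper))

# Three hypothetical replicate runs observed at the same three time points.
replicates = [
    [10.0, 6.0, 4.0],
    [14.0, 7.0, 3.5],
    [12.0, 8.0, 4.5],
]
means, sds = summarize(replicates)

# Hypothetical bounds implied by the model for the mean and for the spread.
mean_lower, mean_upper = [10.0, 6.0, 3.0], [13.0, 8.0, 5.0]
sd_lower, sd_upper = [1.5, 0.5, 0.2], [2.5, 1.5, 1.0]

checks = {
    "mean values": within(means, mean_lower, mean_upper),
    "standard deviations": within(sds, sd_lower, sd_upper),
}
n_consistencies = sum(checks.values())
print(checks)
print(n_consistencies, "consistencies")
```

The spread is tested against the model in the same way as the mean, which is the sense in which the variance is handled as a property of the system rather than as a nuisance.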
To view statistical variance as a system property is old hat
in today’s physics but not so in biology and medicine, where the methodology
has for a long time been focused on methods for making that data feature as
“small” as possible. As a matter of fact, when such variance appears as a
significant system property in a biomedical investigation conceptual problems
arise. For instance, imagine an animal study where the toxicity of two drugs is
investigated and that the drugs have the same mean toxicity but fundamentally different variances. To the
best of my knowledge the average pharmacologist of today does not have an
appropriate conceptual apparatus for dealing with that kind of situation; i.e.
to meaningfully discriminate between the two drugs with respect to that
particular data feature.
Perhaps it is correct to say that one of the purposes of
qualitative analysis is to save data, in the sense of preventing data from
being discarded as “insufficient” (i.e. insufficient when viewed in terms
of quantitative analysis: too large a variance and too small a sample). An
example of this is a dose-response study where it could be demonstrated from hopeless-looking data
that a change in dose had not only a
quantitative effect but a qualitative effect as well: a low dose could be more
harmful than the data revealed to the naked eye (Bergner, 1969).
But qualitative analysis is seemingly more subjective than is quantitative analysis, though the term “seemingly” should be carefully noted: for instance, often statistical tests in quantitative analysis require conditions that are assumed (Sic!) to be valid, and frequently such assumptions are quite subjective indeed.
In summary, a basic difference between quantitative and qualitative analysis is that in the latter no attempts are made to estimate parameter values. In qualitative analysis one frequently experiences this as some extra freedom in the selection of models, and technically it means that the equalities of quantitative analysis are replaced by inequalities.
REFERENCES
Bergner, P.-E. E. (1969): Theory of
quantitative radiation-response time-data. USAEC Report ORAU-109, 25pp. Available
through the
Gradijan, J. R. & Bergner, P.-E. E. (1972): Qualitative consequences of randomness in a linear kinetic system. Biometrics 28, 313-328.
Siegel, S. (1956): Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, Inc.