Equation 0

The following is an excerpt from The Boxes Paper (see the bottom-most entry in the Library). It lays out a very simple idea regarding how we interpret the world around us. As simple as it is, I have found that keeping the basic idea in mind has helped me be clear about what I think I know and what I am uncertain about.

The job of the scientist is to make sense of the world. A common way to further this effort is to make analogies between things which are understood and those which are not. The analogy is often in the form of a physical model and we can always write:

data = model + residual     (0)

The terms in Eqn. 0 can be interpreted on a number of levels. At the most general level, data are observations, model is an analog to the processes being studied, and residual is that which the model cannot explain.

In this interpretation, the form of Eqn. 0 is slightly odd; one might expect something more like:

data - model = residual     (0a)

The form of Eqn. 0 was chosen to illustrate an often overlooked aspect of the observation process. When we collect data, we have in mind, consciously or unconsciously, a model of what we expect to find. It is that model which guides the experiment design process; we design experiments and measurements with expected results in mind. We do not design equipment to measure things which are not expected, but this does not mean that the unexpected would not be measured if it were looked for. In this way, what we actually do find is determined in large part by our expectations (Kuhn 1970). In this abstraction of the data gathering process, residual is a nagging sense that something is not entirely right.

An individual model is identified by a set of equations whose free variables, the model parameters, take on specific values. A class of models is a set of models which are defined by the same equations, but whose model parameters are unspecified. A class of models (e.g., y = ax + b) is a subset of all the possible models, and an individual model (e.g., a = 1; b = 0) is a member of that subset.
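The distinction between a class of models and an individual model can be sketched in a few lines of Python. This is only an illustration: the names `linear_class` and `identity_model` are hypothetical and do not come from the paper. The function stands in for the class y = ax + b, and calling it with specific parameter values yields one member of that class.

```python
from typing import Callable


def linear_class(a: float, b: float) -> Callable[[float], float]:
    """Return one member of the class of models y = a*x + b.

    The function itself plays the role of the class (parameters unspecified);
    each call with concrete values of a and b produces an individual model.
    """
    def model(x: float) -> float:
        return a * x + b
    return model


# The individual model from the text: a = 1, b = 0.
identity_model = linear_class(a=1.0, b=0.0)

# Another member of the same class, with different parameter values.
other_model = linear_class(a=2.0, b=1.0)

xs = [0.0, 1.0, 2.0]
identity_predictions = [identity_model(x) for x in xs]
other_predictions = [other_model(x) for x in xs]
```

Both models are defined by the same equation; they differ only in the values their parameters take.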

As with boxes, models divide the world into two bits: the bit which the model explains (that which is understood) and the bit which the model fails to explain (that which is still a mystery). The objective in a modeling effort is to maximize the portion of the data which is understood and to minimize that which is not. To do this we look for model parameters which minimize the residual in Eqn. 0a. This is equivalent to choosing the member of a class of models which most resembles the data on hand. An unavoidable part of the modeling process is selecting the class of models from which the best representative will be chosen. If an inappropriate class of models is chosen, the best representative will still be an inadequate analog for the process being studied. Following this, another approach to minimizing the residual is to choose another class of models; thus the modeling process has two levels: 1) selecting the class of models to be considered; and 2) choosing the best model from within the selected class. The method of multiple working hypotheses of Chamberlin (1897) implies that we should always use at least two classes of models in our attempts to understand our data. There is no guarantee that the “best” (meaning True) model is a member of any class of models.
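The two levels of the modeling process can be illustrated with a minimal sketch. Everything specific here is an assumption made for the example: the data values are invented, the two candidate classes (a constant, y = c, and a line, y = ax + b) are chosen only for simplicity, and "minimizing the residual" is taken to mean minimizing the sum of squared residuals, which is one common choice among several.

```python
def fit_constant(xs, ys):
    """Best member of the class y = c in the least-squares sense (c = mean of ys)."""
    c = sum(ys) / len(ys)
    residual = [y - c for y in ys]          # Eqn. 0a: data - model
    return (c,), residual


def fit_line(xs, ys):
    """Best member of the class y = a*x + b in the least-squares sense."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    residual = [y - (a * x + b) for x, y in zip(xs, ys)]
    return (a, b), residual


def sum_of_squares(residual):
    """Size of the residual, measured as a sum of squares (an assumed metric)."""
    return sum(r * r for r in residual)


# Hypothetical observations, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.2, 2.9, 4.1]

# Level 2: within each class, choose the member that minimizes the residual.
const_params, const_residual = fit_constant(xs, ys)
line_params, line_residual = fit_line(xs, ys)

# Level 1: compare the best representatives of the two classes.
const_size = sum_of_squares(const_residual)
line_size = sum_of_squares(line_residual)
best_class = "line" if line_size < const_size else "constant"

print("constant:", const_params, const_size)
print("line:    ", line_params, line_size)
print("preferred class:", best_class)
```

Even the preferred class leaves a nonzero residual for these data. One caution about the comparison: the class of constants is contained in the class of lines (set a = 0), so the line can never fit worse by this measure, and a smaller residual alone does not show that the richer class is the more appropriate one.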

Models which are interesting and add significantly to our understanding have far fewer adjustable parameters than there are data which need explaining. In such a situation the residual will always be nonzero and the possibility of finding a better model will always exist.
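A small sketch makes the role of the parameter count concrete. It assumes a least-squares fit of the class y = ax + b and uses invented data points; the helper name `line_residual_for` is hypothetical. With two parameters and two data points the line passes through both points and nothing is left over, whereas adding a third point that is not collinear with the first two leaves a residual that no choice of a and b can remove.

```python
def line_residual_for(xs, ys):
    """Least-squares fit of y = a*x + b; returns the residual at each data point."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return [y - (a * x + b) for x, y in zip(xs, ys)]


# As many data as parameters: the model reproduces the data exactly.
two_point_residual = line_residual_for([0.0, 1.0], [1.0, 3.0])

# More data than parameters (and not collinear): some residual remains.
three_point_residual = line_residual_for([0.0, 1.0, 2.0], [1.0, 3.0, 4.0])

print(two_point_residual)
print(three_point_residual)
```

The exact fit in the first case explains nothing, since any two points define a line. It is the second case, where the model is constrained by more data than it has parameters, that leaves a residual to be examined.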