C. West Churchman's Systems Epistemology and LLMs
Inquiry as a generative process and as cultural technology
I recently finished reading The Design of Inquiring Systems by C. West Churchman. This book, published in 1971, examines the problem of designing automated systems that conduct inquiry, which the author defines as “an activity which produces knowledge,” from the viewpoint of epistemology, cybernetics (or, more properly, general systems theory), and artificial intelligence. When the book was written, artificial intelligence was firmly grounded in the symbolic information processing paradigm[1], but the questions it considers take on new relevance given the proliferation of Large Language Models (LLMs). Another interesting aspect of the book is the way Churchman reinterprets the ideas of 17th-century rationalists (Spinoza, Leibniz, Descartes), empiricists (Locke, Hume), German idealists (Kant, Hegel), and American pragmatists (John Dewey and Edgar Singer, who was Churchman’s PhD advisor) using the systems theory framework. What follows is not really a book review, but more of a summary of key ideas and how they make contact with current developments in AI.
A philosopher in a business school
First, a few words on Churchman himself. As noted above, his PhD advisor was Edgar A. Singer Jr., who himself had studied with William James at Harvard. So, Churchman was trained as a philosopher, yet he spent most of his career as a professor of business administration at Berkeley. He brought his philosophical training into his research interests and wrote books on formal logic, operations research, ethics, and the philosophy of science.
Inquiring systems as cultural technologies
As Churchman describes it, an inquiring system is a very broad concept that includes entities like universities, corporations, markets, the scientific enterprise as a whole, political arrangements, etc. All of these complex assemblages produce knowledge in one way or another. In other words, it is convenient to think of Churchman’s inquiring systems in the same vein as Alison Gopnik’s concept of cultural technologies, which she defines as tools that allow individuals to take advantage of vast amounts of collective knowledge and to transmit and amplify the knowledge generated in the process. Henry Farrell applies this idea specifically to LLMs and frames them as engines of cultural transmission, just like libraries and just like language itself. Likewise, Cosma Shalizi argues that one should view LLMs as vast library catalogues. On the other hand, as Farrell and Shalizi write elsewhere, LLMs are more than just vast repositories of knowledge; they are closer to complex assemblages like markets and bureaucracies, i.e., exactly the kinds of inquiring systems Churchman had in mind.
In fact, the idea of libraries as depositories of knowledge is discussed by Churchman early on in the book. He takes a pragmatist-inflected view, which has a lot in common with the ideas of Clarence I. Lewis:
To conceive of knowledge as a collection of information seems to rob the concept of all of its life. Knowledge is a vital force that makes an enormous difference in the world. Simply to say that it is a storage of sentences is to ignore all that this difference amounts to.
In other words, knowledge resides in the user and not in the collection. It is how the user reacts to a collection of information that matters. Hence we should turn to the other concepts of knowledge, action and potential action. The action conception of knowledge is pragmatic; knowledge is an ability of some person to do something correctly. The person exhibits a form of knowledge if he can perform an assigned task correctly. … Thus, knowledge is a potential for a certain type of action, by which we mean that the action would occur if certain tests were run.
For example, a library plus its user has knowledge if a certain type of response will be evoked under a given set of stipulations, e.g., a correct sentence given a certain type of question. This way of conceptualizing a collection of information is far more useful from the design point of view than thinking of a library alone as a collection; a library so designed that the retrieval of information is either impossible or much too time-consuming is not a collection of information, no matter how many correct sentences are stored there. It is not a collection, because it fails to provide the correct response, given a query.
Thus, both Gopnik’s cultural technologies and Churchman’s inquiring systems are open systems that reside in an environment and interact with it reciprocally. They interface with their environment, receive inputs from it, process them in conjunction with their own internal state, and generate outputs. This now brings us to Churchman’s interesting taxonomy of inquiring systems, which is meaningful both chronologically and conceptually:
Leibnizian inquiring systems: fact nets
Lockean inquiring systems: consensus
Kantian inquiring systems: representation
Hegelian inquiring systems: dialectic
Singerian inquiring systems: progress
I will offer only a very schematic description of the above notions. But first we need to take up an important question that, in this form, goes back at least to Descartes.
Descartes: The guarantor problem
Descartes famously inquired into the reliability of our senses. What guarantee do we have that our sensory experience comes from a reliable source, that it is not supplied by a deceitful demon? His own “resolution” was unsatisfactorily circular: the reliability of our perception is guaranteed by God and hence implies the existence of God. Nevertheless, the guarantor problem is quite real if we view inputs in the system-theoretic sense as external influences that are treated as free, i.e., unexplained by our model of the system.
Leibniz: monads, fact nets, internalism
Churchman’s definition of the Leibnizian inquirer is a gloss on Leibniz’s monadology. There is no need of external inputs here (“monads have no windows”), and the entire assemblage exists in a pre-established harmony. Thus, the system in question is a collection of facts and propositions relating these facts and arising from them through valid rules of logical inference. Some of these propositions are logical tautologies and others are contradictory or false, and, crucially, the inquirer can establish this using formal reasoning and logic. The propositions that are neither are viewed as contingent truths and are added to the fact net. In a sense, this is similar to Chomsky’s “internalist” view of linguistics: It is an innate generative system that requires only minimal external input (just enough to set some global parameters) and will, nevertheless, eventually arrive at the truth. The process of arriving at the truth is represented as a rank ordering of fact nets, such that eventually we converge to the global optimum. This eventual convergence is, in a nutshell, Leibnizian optimism. However, this is a closed system for symbol manipulation, governed only by its axioms and rules of inference, so any semantic notion of truth is excluded at the outset. Of course, from Leibniz’s point of view, truth is secured by pre-established harmony, but we can’t exactly say this about LLMs. In fact, as Farrell points out (with reference to the work of Eunice Yiu, Eliza Kosoy, and Alison Gopnik):
LLMs operate in a space of information that is disconnected from base reality. An LLM doesn’t ‘know’ (to the extent that it ‘knows’ anything) that the phrase ‘soft drink stand’ refers to something that exists in the physical universe. For it, soft, drink, and stand is a series of “tokens,” individual strings of letters that don’t refer to, or have relationships with anything except other tokenized strings of letters. What the LLM ‘knows’ is the statistical weights associated with each of these tokens, which summarize its relationship with other tokens, rather than the world we live in.
A Leibnizian inquirer is just such a statistical token manipulator, no more and no less. It has an internal syntax, but where does the semantic grounding come from?
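To make the purely syntactic character of the fact net concrete, here is a minimal Python sketch of the sorting step described above: propositions are classified as tautologies, contradictions, or contingent truths by formal means alone, and only the contingent ones are added to the net. This is an illustration under my own assumptions, not Churchman's design; the names `classify` and `FactNet` are hypothetical, and a brute-force truth table stands in for whatever inference machinery a real Leibnizian inquirer would use.

```python
from itertools import product


def classify(proposition, variables):
    """Classify a propositional formula by exhaustive truth-table evaluation.

    `proposition` is a function from an assignment (dict of variable -> bool)
    to bool. This is a stand-in for formal inference, chosen for brevity.
    """
    values = [
        proposition(dict(zip(variables, row)))
        for row in product([False, True], repeat=len(variables))
    ]
    if all(values):
        return "tautology"
    if not any(values):
        return "contradiction"
    return "contingent"


class FactNet:
    """Hypothetical fact net: keeps only the contingent propositions."""

    def __init__(self, variables):
        self.variables = variables
        self.contingent = {}

    def consider(self, name, proposition):
        kind = classify(proposition, self.variables)
        if kind == "contingent":
            self.contingent[name] = proposition
        return kind


net = FactNet(["p", "q"])
results = {
    "excluded_middle": net.consider("excluded_middle", lambda v: v["p"] or not v["p"]),
    "absurdity": net.consider("absurdity", lambda v: v["p"] and not v["p"]),
    "p_implies_q": net.consider("p_implies_q", lambda v: (not v["p"]) or v["q"]),
}
print(results)
```

Nothing in this procedure ever consults the world: `p` and `q` are bare symbols, and the net has no way of telling whether `p_implies_q` is actually true of anything.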
Locke: consensus
John Locke’s ideas are often crudely viewed as a ‘blank slate’ theory: An inquirer has no innate ideas, just a minimal mechanism for assimilating sensory inputs and synthesizing them into the overall experience. However, as Churchman points out, this is not at all accurate. Locke’s ideas should be properly viewed as a design improvement on the Leibnizian inquirer, who has no mechanism for ascertaining the reliability of the strings of tokens it receives and operates on. The key point here is that inquiry is not only a logical process, but also a social process. The inquirer is embedded in a community of inquirers, the Lockean community, whose members form agreements about the responses of their sensory systems to stimuli. On the blank slate issue, Churchman writes:
The question of design is whether a Lockean inquirer … can legitimately be considered as doing anything significant. The answer might be that the Lockean system is at least a filing system that can grow its own categories. Thus every item is given an elementary label (code) which indicates the elementary properties of the item; furthermore, the use of a given label will evoke an item with a certain Boolean compounding of properties. If we now assume that Lockean inquirers have a memory, then a label will evoke the response that the associated item is stored in memory, e.g., has been observed.
Also, we can assume that if exactly the same item is received on two different occasions, the second item will arrive at the same “station” as the first, and that the inquiring system thus has the capability of recognizing that the two are identical and simply making a note of two instances of the same item having been received. Even at this stage it can be seen that the Lockean inquiring system is far more than a “blank tablet.” It needs considerable processing power to enable it to store different items in different places and to recognize that two items of like kind are to be regarded as two instances of the same input.
Thus, Lockean inquirers are equipped with feature detectors and can form judgments about co-occurrences and repeated occurrences of various stimuli. The semantic grounding comes from the environment, which contains other inquirers:
Nothing has been said as yet about the nature of the elementary labeling process; how do Lockean inquirers come to have common labels? For example, if one asks a child, “What color is this?” how does the child learn to respond correctly? Evidently, the correctness of the response is judged by the adults and in general by a group of “normal” observers. Hence we must somehow design what we can call a “community” of Lockean systems having the same basic set of property labels, as well as the same labels for compounds. The community becomes the basis for judging whether a specific inquiring system is responding correctly.
Thus one teaches a child to join the community by showing the child a yellow object and repeating the label “yellow.” If later when the child is shown a yellow object, he repeats the label, then we feel some assurance that he has “joined” our Lockean community of inquirers.
This is already reminiscent of something like semantic bootstrapping. Moreover, in Churchman’s schema, Lockean inquirers are capable of reflection, i.e., they not only receive inputs, but can also recognize the act of receiving an input and thus they “can act on [their] own activities and label them in very much the same manner as [they label] received inputs.” We can think of this as hierarchical processing in neural nets, where hidden layers receive as their sensory inputs the outputs generated by earlier layers. This can be viewed as a basis of abstraction and generalization.
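The filing system and the community check in the passages quoted above can be sketched in a few lines of Python. This is a toy under my own assumptions, not something Churchman specifies: `LockeanInquirer`, `community_label`, and `has_joined` are hypothetical names, stimuli are plain strings, and each inquirer's elementary labeling is a simple lookup table.

```python
from collections import Counter


class LockeanInquirer:
    """Toy inquirer: labels stimuli and files items by their label compound."""

    def __init__(self, name, labeler):
        self.name = name
        self.labeler = labeler    # maps a raw stimulus to an elementary label
        self.memory = Counter()   # "station" (set of labels) -> instances received

    def receive(self, item):
        """File an item (a tuple of stimuli) at the station for its labels."""
        station = frozenset(self.labeler(s) for s in item)
        self.memory[station] += 1
        return station

    def has_observed(self, *labels):
        """Does a given compound of labels evoke something stored in memory?"""
        return self.memory[frozenset(labels)] > 0


def community_label(inquirers, stimulus):
    """Most common label for a stimulus, and the share of members using it."""
    votes = Counter(inq.labeler(stimulus) for inq in inquirers)
    label, count = votes.most_common(1)[0]
    return label, count / len(inquirers)


def has_joined(newcomer, community, stimuli):
    """The newcomer 'joins' if its labels match the community's on every stimulus."""
    return all(
        newcomer.labeler(s) == community_label(community, s)[0] for s in stimuli
    )


adult_words = {"580nm": "yellow", "530nm": "green", "round": "round"}
child_words = {"580nm": "yellow", "530nm": "yellow", "round": "round"}

adults = [LockeanInquirer(f"adult{i}", adult_words.get) for i in range(3)]
child = LockeanInquirer("child", child_words.get)

adult = adults[0]
adult.receive(("580nm", "round"))
adult.receive(("round", "580nm"))  # same item again: arrives at the same station

print(adult.memory[frozenset({"yellow", "round"})])   # 2
print(adult.has_observed("green"))                    # False
print(community_label(adults, "530nm"))               # ('green', 1.0)
print(has_joined(child, adults, ["580nm"]))           # True
print(has_joined(child, adults, ["580nm", "530nm"]))  # False
```

Note that `has_joined` tests agreement and nothing else: a community whose members all shared a mistaken label would certify a newcomer in exactly the same way.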
There is, however, still the issue of grounding because agreement or consensus between members of a Lockean community can only be used as a necessary condition for justifying the inquirer’s epistemic statements. Both Leibnizian and Lockean inquirers operate under the coherence theory of truth, but the key difference is that, for Leibniz, coherence is an internalist notion embodied in a closed system performing logical inference against the background of pre-established harmony, whereas in the Lockean case it takes on an externalist hue since now it is not just a matter of logical consistency, but rather an agreement or consistency of multiple judgments. The relevance for LLMs is obvious, with the important caveat that they should be viewed as embedded in a Lockean community that will contain people and other LLMs. Here, though, it is also important to note Churchman’s notion of conventional Lockean inquirers, i.e., when “the basis of their agreements is a choice of the designer and depends solely on his personal values.” This is the issue of conventionality of language as well, as Churchman notes:
If the inquirers have a Leibnizian capability, they may be speaking in different formal languages, but it may be possible to find a dictionary for each pair of inquirers such that the assertions of one become truths for the other (e.g., “When I say ‘green’ it’s the same thing as ‘yellow’ in your language”). Any apparent disagreement might thus be removed by translation. Would any community of inquirers prove a satisfactory set of empirical inquirers?
Now language is itself a cultural technology; moreover, it is generative in a strong sense as a generator of both genuine novelty and persuasive nonsense. Used by a community of inquirers, it can produce surprising new discoveries or we can end up with QAnon. Again, this is what Churchman says:
At this point, suppose we study the suggestion that the essence of the nonconventional is the human quality of agreement. While a group of computers could be “tuned in” to agreement or disagreement about their inputs, as the designer wishes, no designer can capture that immense force of feeling that takes place when a group of people recognize their complete agreement about the properties of an event. The “objectivity” of their experience rests in its clear, inevitable, unchangeable character, without a hint of the conventional or arbitrary. Of course, the rational designer will want to know whether this feeling tone of the common experience was “built into” the community or comes to it unaltered from nature. After all, a group of computers can also smile and frown and otherwise reinforce each other. And a group of humans can be as silly in their common agreement as any contrived group.
Kant: a priori as world model
The next step in the hierarchy of inquirers takes us from Locke to Kant. As Churchman puts it, Kant
attempts to find the particular set of a priori assertions that are absolutely necessary in order that an inquiring system be capable of receiving inputs. … This capability can be generated provided the inquiring system presupposes a geometry, kinematics and mechanics as well as a logic. One might mention that in the case of the Lockean inquiring system the attempt was made to reduce this list to logic alone, so that the only a priori science for the pure empiricist is the science capable of generating analytic sentences, i.e., sentences whose truth depends solely on logic.
Actually, Kant’s own theory, as outlined in the “Schematism of Pure Reason” in the first Critique, does not seem to go far beyond the Lockean. The data gatherer must presuppose certain properties of a clock, i.e., the clock-events must obey certain laws. Hence there is an exact prediction from a given clock-event to a future clock-event. Beyond the clock, however, is the rest of reality. How does the inquirer decide that there is an objective causal connection between events that are not themselves part of the mechanism of the clock? Also, how does the clock work in assigning time to events, and finally, can the inquirer choose its own clock?
This is yet another step beyond Leibnizian monadology. Not only do the inquirers have to be capable of receiving sensory inputs and acting on both these inputs and the abstractions and generalizations formed on their basis; here we are talking about world models in the same sense that researchers like Yann LeCun talk about them. The Kantian view is also elaborated in the writings of French neuroscientists like Jean Petitot and Alain Berthoz, who view world models not as features of disembodied cognition but as built-in at the “hardware” level of the whole brain-body system.
Hegel: objectivity, dialectic, observers
Now, both Locke and Kant operate with the concept of the “given.” What is the nature of the given? Churchman, echoing Edgar Singer’s ideas, says that “what is ‘given’ to an inquiring system is a problem of another inquiring system observing the first in its problem-solving activities. The ‘given’ is a concept of the observing inquiring system.” This brings Churchman to Hegel, but in a very unusual manner. First, early on in the book he says that “it is really not until the time of Hegel that it occurred to philosophers that external vs. internal only makes sense to a third mind observing and/or controlling the process of learning. The three minds then become parts of the total inquiring system.” This is the starting point of his discussion of objectivity, which begins with the dichotomy between the mechanist and the teleological approaches. The mechanist approach relies on the notion of information:
The essence of “mechanical” observation is alienation: the observed subject is opposed to the observer. Either the subject is passive and the observer active, or else the observer receives “inputs” (and hence is passive) while the subject creates “outputs” (and hence is active). The observed and the observer cannot be the same mind, and must be two opposing aspects of a process.
…
The flavor of the opposition between the observer and the subject seems to be well captured by the term “information.” The inquirer is “formed” by a certain type of input, much as a computer is formed by a program. Hence the “information” that is stored in an inquirer is taken to be the set of all reactions of the inquirer to inputs of a certain type. Specifically, we imagine an observer-of-the-subject who can identify an input as an accurate sentence that describes some aspect of the natural world. If this sentence is received and stored by the subject, then the subject has reliable information. The mechanist theory of information goes on to say that a “state of the world” is simply a conjunction of sentences about the properties of objects in the world. The mechanist has an answer to the question: What set of representations capture the essence of an object? The set is comprised of all sentences that accurately describe the object, i.e., all sentences that ascribe all the correct properties to the object.
This is stipulated as “value-free;” as Churchman writes next, “the experts can tell us ‘facts’ but they can’t tell us what our ultimate values should be.” By contrast,
teleological observation, on the other hand, is a way of observing the world so that the resulting information is useful to a purposive being. To know that a subject has observed “objectively” we need to know the total system in which the subject acts. We can justify the appointment of the master observer-of-the-subject by means of a teleological argument, i.e., the master is the appointed servant of the teleological subject. But this justification simply complicates the relationship because the subject cannot decide without teleological information, and yet he cannot acquire objective teleological information without knowing the whole system.
This is the matter of judgment, in particular among competing values or worldviews (Weltanschauungen, if we want to sound like fancy German idealists). This is how the Hegelian dialectic (thesis, antithesis, synthesis) is refracted through Churchman’s systems-theoretic lens:
Can we design judgment in the inquiring system? ... The general idea is to design the class of models (Weltanschauungen) in such a way that each model can be expanded into a more general model, or else can be made more refined by introducing finer distinctions. The straight-faced inquiring system that has created a thesis and an antithesis … now searches for an expanded Weltanschauung which, when conjoined with the data, makes both the thesis and the antithesis maximally irrelevant in the teleological sense. Neither is important relative to the broader objectives of the inquirer. Simultaneously, the broader and/or deeper Weltanschauung maximizes the credence of the “super-proposal” or synthesis. The inquirer can also work on the data bank, either expanding it or making it more precise, and search for the optimal change in the data bank that will maximize the irrelevance of the thesis and antithesis and maximize the credence in the synthesis.
I’m going out on a bit of a limb here, but it seems like these ideas could conceivably be formalized using recent work on self-supervised learning (in fact, Churchman’s “Sketch of an explicit Hegelian inquiring system” bears certain similarities with it, e.g., maximizing the “credence” of the current synthesis).
Singer: measurements, pragmatism, progress
We now come to the final item in Churchman’s hierarchy: Singerian inquirers, named after his PhD advisor Edgar Singer. The Hegelian idea of progress as recursive self-improvement, where the inquirer’s perspective is encompassed in progressively wider Weltanschauungen, seems rather like the old Leibnizian wine in new bottles. Is this idea of progress at all justified?
Interestingly, Churchman presents Singer’s program as an attempt to settle this by a closer examination of what constitutes measurement. Indeed, we can’t really get away from the question of measurement as long as we insist on revising our Weltanschauung by maximizing the credence of our current synthesis. Here, Churchman takes a decidedly un-idealist turn, reminiscent more of pragmatism and of the operationalism espoused by philosophically inclined experimental scientists like Percy Bridgman:
Singer chose as his starting point metrology, a science which has been remarkably neglected by philosophers. Metrology is the science of measurement. Now philosophers have shown an interest in the formal language of measurement (transitivity, asymmetry, etc.), but language is only a part of the story. The really fascinating aspect of metrology from a philosophical point of view is the operational design of measurement, i.e., the steps that must be performed to produce measurements, and the justification that the produced readings accurately describe some aspect of reality.
This, at once, gets at some issues that were left unexplored in the context of Lockean communities, for example, how do communities agree on common standards or units of measurement? From here, Churchman follows Singer in the discussion of replication as approximate agreement of measurements made according to the same protocol, with due acknowledgment of the fact that replicability is only an indicator of coherence among the local measurements but not of correspondence, i.e., agreement, with “the real world.” Interestingly, Singer used the term “natural image” as a rough analog of Kant’s a priori or of Hegel’s Weltanschauung, and this concept seems to sit on the boundary between Wilfrid Sellars’ “manifest image” and “scientific image.” Churchman now synthesizes these ideas in the notion of Singerian inquiring systems:
We can now appreciate the most subtle and difficult design problem of Singerian inquiring systems, which, in honor of its originator, might be called Kant’s problem. It is the problem of revision of the a priori (Kant) or Weltanschauung (Hegel) or natural image (Singer): when and how to revise? The design problem depends on the response to the teleological question—why revise?—which in turn depends on the purpose and measure of performance of the system.
In Singer’s view, the process of revision never stops, and, even though Churchman uses the word “progress,” it is not the cumulative progress of betterment, as envisioned by Hegel. Instead, there is a drive, a restlessness pushing the inquirer onward. Yet, it is not aimless but imbued with the telos of local improvement, similar to gradient descent, trial and error, successive approximation, experimentation (which can, paradoxically, be the basis of revolutionary change as James C. Scott pointed out in Seeing Like a State). This is how Churchman phrases it:
Singer’s theory of progress is far more subtle than the theory of “linear progress” which was popular in the nineteenth century. To understand it, one needs to adopt a dialectical point of view. On one side, call it the light side, is production-science-cooperation, the trilogy of nineteenth-century optimism. The progress toward this trilogy is toward a world of enlightenment, where men have the means to live out their individual lives in their own unique ways, without having to disrupt the lives of others, or, more strongly, with the natural urge to help others to enrich their lives. But the lessons of history tell us that when production and science begin to dominate, then society becomes fragmented; only some men reap the benefits and they do so by exploiting the environment and their fellow man.
This is taking us into the domain of some of the more perceptive critics of science and technology, people like Lewis Mumford or Ivan Illich. Indeed, as Churchman goes on to say,
“Oh,” says the scientist, “then we must use our science to see how we can get men to cooperate more, to reduce population growth rates, air-water pollution, labor exploitation. The measure of progress must include cooperation, which cannot be separated from production-science. Refining our measures and producing more effective machines is not progress if thereby more conflict occurs. In other words, progress is not linear, but a very complicated nonlinear relationship between the enabling forces of production, science, and cooperation.”
This is all very well, but one cannot help noting who is speaking: the scientist. He wants to make science, i.e., the inquiring system, the leading edge of progress because for him there can be no progress without understanding. Even if we grant him his premise that science has created more and more knowledge, why should we also grant him his other premise—that the net benefit has been positive? Why not simply say that making knowledge is like any other form of life: it happens and it is neither good nor bad. You make knowledge, he makes love; you both simply live out an existence.
According to Churchman, scientific inquiry was, for Singer, a hero’s journey in Joseph Campbell’s sense. And yet, the idea is democratic, not elitist:
It is very important to note that the hero’s journey is not restricted to great men or to semi-gods. The hero is in every one of us, and it is impossible to say whether a Newton or Theseus is a greater hero than the individual who risks his security in the quest for self-knowledge. To be sure, the heroic mood is often suppressed by other emotions and thoughts; to free it in every man is an ideal.
Does this have any bearing on LLMs as inquiring systems? Perhaps, but, to see it, we have to get back to the right notion of what they are: cultural technologies, a means of exteriorizing knowledge and memory, the record of a historical narrative of both progress and change. The tokens we feed to LLMs are produced by measurements made on the world, the way we interpret it through our current framework of prediction and control. I am reminded here of Yuk Hui’s take on technology expressed as a Kantian antinomy:
Thesis: Technology is an anthropological universal, understood as an exteriorization of memory and the liberation of organs, as some anthropologists and philosophers of technology have formulated it;
Antithesis: Technology is not anthropologically universal; it is enabled and constrained by particular cosmologies, which go beyond mere functionality or utility. Therefore, there is no one single technology, but rather multiple cosmotechnics.
His proposed synthesis is a pluralistic one, influenced by Taoist process philosophy. Interestingly, Churchman closes the discussion of Singerian inquirers with this perspective, which is also Taoist, or possibly Feyerabendian, in spirit:
But what about the question: is there progress or merely process? … The response is: it depends on where you are. If you are at home, in the status quo, there is a kind of quiet progress, an orderliness, cleanness, comfort, in which little discoveries here and there push back the decimal places and provide better ways of doing things. If you are on the road, then there is no progress, just change, which can be bright or dark, funny or sad, tragic or comic. The rules are gone, laws make no sense. If you are fighting the battle, or whatever the mission may be, you are risking your soul for something overwhelmingly important and central. Progress is no longer diffuse, but here and now in your actions; revolution is one word for it. If you are on the way back, you may be disillusioned, angry, dead in spirit, or playful, or senile.
[1] In fact, Churchman was involved in the design of Dendral, one of the first expert systems for organic chemistry applications, together with Ed Feigenbaum, Bruce Buchanan, Joshua Lederberg, and Carl Djerassi.