

The cognitive functions of language


Peter Carruthers

Department of Philosophy,

University of Maryland,

College Park, MD 20742.


Abstract: This paper explores a variety of different versions of the thesis that natural language is involved in human thinking. It distinguishes amongst strong and weak forms of this thesis, dismissing some as implausibly strong and others as uninterestingly weak. Strong forms dismissed include the view that language is conceptually necessary for thought (endorsed by many philosophers) and the view that language is de facto the medium of all human conceptual thinking (endorsed by many philosophers and social scientists). Weak forms include the view that language is necessary for the acquisition of many human concepts, and the view that language can serve to scaffold human thought processes. The paper also discusses the thesis that language may be the medium of conscious propositional thinking, but argues that this cannot be its most fundamental cognitive role. The idea is then proposed that natural language is the medium for non-domain-specific thinking, serving to integrate the outputs of a variety of domain-specific conceptual faculties (or central-cognitive ‘quasi-modules’). Recent experimental evidence in support of this idea is reviewed, and the implications of the idea are discussed, especially for our conception of the architecture of human cognition. Finally, some further kinds of evidence which might serve to corroborate or refute the hypothesis are mentioned. The overall goal of the paper is to review a wide variety of accounts of the cognitive function of natural language, integrating a number of different kinds of evidence and theoretical consideration in order to propose and elaborate the most plausible candidate.


Keywords: cognitive evolution, conceptual module, consciousness, domain-general, inner speech, logical form (LF), language, thought.


1 Introduction

Natural language looms large in the cognitive lives of ordinary folk. Although proportions vary, many people seem to spend a good deal of their waking activity engaged in ‘inner speech’, with imaged natural language sentences occupying a significant proportion of the stream of their conscious mentality.

This bit of folk-wisdom has been corroborated by Hurlburt (1990, 1993), who devised a method for sampling people’s inner experience. Subjects wore headphones during the course of the day, through which they heard, at various intervals, a randomly generated series of bleeps. When they heard a bleep, they were instructed to immediately ‘freeze’ what was passing through their consciousness at that exact moment and then make a note of it, before elaborating on it later in a follow-up interview. Although frequency varied widely, all normal (as opposed to schizophrenic) subjects reported experiencing inner speech on some occasions – with the minimum being 7% of occasions sampled, and the maximum being 80%. Most subjects reported inner speech on more than half of the occasions sampled. (The majority of subjects also reported the occurrence of visual images and emotional feelings – on between 0% and 50% of occasions sampled in each case.) Think about it: more than half of the total set of moments which go to make up someone’s conscious waking life occupied with inner speech – that’s well nigh continuous!

Admittedly, the sample-sizes in Hurlburt’s studies were small; and other interpretations of the data are possible. (Perhaps the reports of linguistically-clothed thoughts occurring at the time of the bleep were a product of confabulation, for example, reflecting people’s naïve theory that thought must be in natural language. If so, this should be testable.) But let us suppose that inner verbalization is as ubiquitous as common-sense belief and Hurlburt’s data would suggest. Just what would all this inner verbalization be doing? What would be its function, or cognitive role? The naïve common-sense answer is that inner verbalization is constitutive of our thinking – it is that we think by talking to ourselves in inner speech (as well as by manipulating visual images etc.). Anyone who holds such a view endorses a version of what I shall call ‘the cognitive conception of language’, which maintains that, besides its obvious communicative functions, language also has a direct role to play in normal human cognition (in thinking and reasoning).

            Quite a different answer would be returned by most members of the cognitive science community, however. For they endorse what I shall call ‘the (purely) communicative conception of language’, according to which language is but an input–output system for central cognition. Believing that language is only a channel, or conduit, for transferring thoughts into and out of the mind, they are then obliged to claim that the stream of inner verbalization is more-or-less epiphenomenal in character. (Some possible minor cognitive roles for inner speech, which should nevertheless be acceptable to those adopting this perspective, will be canvassed later.) The real thinking will be going on elsewhere, in some other medium of representation.

One reason for the popularity of the communicative conception amongst cognitive scientists is that almost all now believe that language is a distinct input–output module of the mind (at least in some sense of ‘module’, if not quite in Fodor’s classic – 1983 – sense). And they find it difficult to see how the language faculty could both have this status and be importantly implicated in central cognition. But this reasoning is fallacious. For compare the case of visual imagination. Almost everyone now thinks that the visual system is a distinct input-module of the mind, containing a good deal of innate structure. But equally, most cognitive scientists now accept that visual imagination re-deploys the resources of the visual module for purposes of reasoning – for example, many of the same areas of the visual cortex are active when imagining as when seeing. (For a review of the evidence, see Kosslyn, 1994.)

What is apparent is that central cognition can co-opt the resources of peripheral modules, activating some of their representations to subserve central cognitive functions of thinking and reasoning. The same is then possible in connection with language. It is quite consistent with language being an innately structured input and output module, that central cognition should access and deploy the resources of that module when engaging in certain kinds of reasoning and problem solving.

Note, too, that hardly anyone is likely to maintain that visual imagery is a mere epiphenomenon of central cognitive reasoning processes, playing no real role in those processes in its own right. On the contrary, it seems likely that there are many tasks which we cannot easily solve without deploying a visual (or other) image. For example, suppose you are asked (orally) to describe the shape which is enclosed within the capital letter ‘A’. It seems entirely plausible that success in this task should require the generation of a visual image of that letter, from which the answer (‘a triangle’) can then be read off. So it appears that central cognition operates, in part, by co-opting the resources of the visual system to generate visual representations, which can be of use in solving a variety of spatial-reasoning tasks. And this then opens up the very real possibility that central cognition may also deploy the resources of the language system to generate representations of natural language sentences (in ‘inner speech’), which can similarly be of use in a variety of conceptual reasoning tasks.

There is at least one further reason why the cognitive conception of language has had a bad press within the cognitive science community in recent decades. (It continues to be popular in some areas of the social sciences and humanities, including philosophy.) This is that many of the forms of the thesis which have been defended by philosophers and by social scientists are implausibly strong, as we shall see in section 3 below. The unacceptability of these strong views has then resulted in all forms of the cognitive conception being tarred with the same brush.

A crucial liberalizing move, therefore, is to realize that the cognitive conception of language can come in many different strengths, each one of which needs to be considered separately on its own merits. In this paper I shall distinguish between some of the many different versions of the cognitive conception. I shall begin (in section 2) by discussing some weak claims concerning the cognitive functions of language which are largely uncontroversial. This will help to clarify just what (any interesting form of) the cognitive conception is, by way of contrast. I shall then (in section 3) consider some claims which are so strong that cognitive scientists are clearly right in rejecting them, before zeroing in on those which are both interesting and plausible (in sections 4 and 5). I shall come to focus, in particular, on the thesis that natural language is the medium of inter-modular integration. This is a theoretical idea which has now begun to gather independent empirical support. Finally (in sections 6 and 7) some additional implications, elaborations, and possible further empirical tests of this idea are discussed.

I should explain at the outset, however, that the thesis I shall be working towards is that it is natural language syntax which is crucially necessary for inter-modular integration. The hypothesis is that non-domain-specific thinking operates by accessing and manipulating the representations of the language faculty. More specifically, the claim is that non-domain-specific thoughts implicate representations in what Chomsky (1995) calls ‘Logical Form’ (LF). Where these representations are only in LF, the thoughts in question will be non-conscious ones. But where the LF representation is used to generate a full-blown phonological representation (an imaged sentence), the thought will generally be conscious.[1]

I should emphasize that I shall not be claiming that syntax is logically required for inter-modular integration, of course. Nor shall I be claiming that only natural language syntax – with its associated recursive and hierarchical structures, compositionality, and generativity – could possibly play such a role in any form of cognition, human or not. (In fact it is the phrase-structure element of syntax which does the work in my account; see section 6.1 below.) Rather, my claim will be that syntax does play this role in human beings. It is a factual claim about the way in which our cognition happens to be structured, not an unrestricted modal claim arrived at by some sort of task-analysis.

I should also declare at the outset how I shall be using the word ‘thought’ in this paper. Unless I signal otherwise, I intend all references to thought and thinking to be construed realistically. Thoughts are discrete, semantically-evaluable, causally-effective states, possessing component structure, and where those structures bear systematic relations to the structures of other, related, thoughts. So distinct thoughts have distinct physical realizations, which may be true or false, and which cause other such thoughts and behavior. And thoughts are built up out of component parts, where those parts belong to types which can be shared with other thoughts. It is not presupposed, however, that thoughts are borne by sentence-like structures. Although I shall be arguing that some thoughts are carried by sentences (viz. non-domain-specific thoughts which are carried by sentences of natural language), others might be carried by mental models or mental images of various kinds.

It is hugely controversial that there are such things as thoughts, thus construed, of course. And while I shall say a little in defense of this assumption below (in section 3.3), for the most part it is just that – an assumption – for present purposes. I can only plead that one can’t do everything in one paper, and that one has to start somewhere. Those who don’t want to share this assumption should read what follows conditionally: if we were to accept that there are such things as realistically-construed thoughts, then how, if at all, should they be seen as related to natural language sentences?

Finally, a word about the nature of the exercise before we proceed further. This paper ranges over a great many specialist topics and literatures in a number of distinct disciplines. Of necessity, therefore, our discussion of any given subject must be relatively superficial, with most of the detail, together with many of the required qualifications and caveats, being omitted. Similarly, my arguments against some of the competitor theories are going to have to be extremely brisk, and some quite large assumptions will have to get taken on board without proper examination. My goal, here, is just to map out an hypothesis space, using quite broad strokes, and then to motivate and discuss what I take to be the most plausible proposal within it.


2 Weak claims

Everyone will allow that language makes some cognitive difference. For example, everyone accepts that a human being with language and a human being without language would be very different, cognitively speaking. In this section I shall outline some of the reasons why.


2.1 Language as the conduit of belief

Everyone should agree that natural language is a necessary condition for human beings to be capable of entertaining at least some kinds of thought. For language is the conduit through which we acquire many of our beliefs and concepts, and in many of these cases we could hardly have acquired the component concepts in any other way. So concepts which have emerged out of many years of collective labor by scientists, for example – such as electron, neutrino, and DNA – would de facto be inaccessible to someone deprived of language. This much, at any rate, should be obvious. But all it really shows is that language is required for certain kinds of thought; not that language is actually involved in or is the representational vehicle of those thoughts.

It is often remarked, too, that the linguistic and cognitive abilities of young children will normally develop together. If children’s language is advanced, then so will be their abilities across a range of tasks; and if children’s language is delayed, then so will be their cognitive capacities. To cite just one item from a wealth of empirical evidence: Astington (1996) and Peterson and Siegal (1998) report finding a high correlation between language-ability and children’s capacity to pass false-belief tasks, whose solution requires them to attribute, and reason from, the false belief of another person. Do these and similar data show that language is actually involved in children’s thinking?

In the same spirit, we might be tempted to cite the immense cognitive deficits which can be observed in those rare cases where children grow up without exposure to natural language. Consider, for example, the cases of so-called ‘wolf children’, who have survived in the wild in the company of animals, or of children kept by their parents locked away from all human contact (Malson, 1972; Curtiss, 1977). Consider, also, the cognitive limitations of profoundly deaf children born of hearing parents, who have not yet learned to sign (Sachs, 1989; Schaller, 1991). These examples might be thought to show that human cognition is constructed in such a way as to require the presence of natural language if it is to function properly.

            But all that such data really show is, again, that language is a necessary condition for certain kinds of thought and types of cognitive process; not that it is actually implicated in those forms of thinking. And this is easily explicable from the standpoint of someone who endorses the standard cognitive science conception of language, as being but an input–output system for central cognition, or a mere communicative device. For language, in human beings, is a necessary condition of normal enculturation. Without language, there are many things which children cannot learn; and with delayed language, there are many things which children will only learn later. It is only to be expected, then, that cognitive and linguistic development should proceed in parallel. It does not follow that language is itself actually used in children’s central cognition.

Another way of putting the point is that this proposed cognitive function of language is purely developmental – or diachronic – rather than synchronic. Nothing is said about the role of language in the cognition of adults, once a normal set of beliefs and concepts has been acquired. And the evidence from aphasia suggests that at least many aspects of cognition can continue to operate normally once language has been removed.

            Aphasias come in many forms, of course, and in many different degrees of severity. And it is generally hard to know the extent of any collateral damage – that is, to know which other cognitive systems besides the language faculty may have been disabled as a result of the aphasia-causing brain-damage. But many patients with severe aphasia continue to be adept at visuo-spatial thinking, at least (Kertesz, 1988), and many continue to manage quite well for themselves in their daily lives.

Consider, for example, the agrammatic aphasic man studied in detail by Varley (1998, 2002). He is incapable of either producing or comprehending sentences, and he also has considerable difficulty with vocabulary, particularly verbs. He has lost all mentalistic vocabulary (‘belief’, ‘wants’, etc.), and his language system is essentially limited to nouns. Note that there is not a lot of explicit thinking that you can do using just nouns! (It should also be stressed that he has matching deficits of input and output, suggesting that it is the underlying system of linguistic knowledge which has been damaged.) Yet he continues to drive, and to have responsibility for the family finances. He is adept at communicating, using a mixture of single-word utterances and pantomime. And he has passed a range of tests of theory of mind (the standard battery of false-belief and deception tasks, explained using nouns and pantomime), as well as various tests of causal thinking and reasoning. It appears that, once language has done its developmental work of loading the mind with information, a good deal of adult cognition can thereafter survive its loss.

            Since natural language is the conduit for many of our beliefs and for much of our enculturation, everyone should accept that language is immensely important for normal cognitive development. That language has this sort of cognitive function should be no news to anyone.[2]


2.2 Language as sculpting cognition

A stronger and more controversial thesis has been proposed and defended by some researchers over recent decades. This is that the process of language acquisition and enculturation does not merely serve to load the mind with beliefs and concepts, but actually sculpts our cognitive processes to some degree (Lucy, 1992a, 1992b; Nelson, 1996; Bowerman and Levinson, 2001).[3]

For example, acquisition of Yucatec (as opposed to English) – in which plurals are rarely marked and many more nouns are treated grammatically as substance-terms like ‘mud’ and ‘water’ – leads subjects to see similarities amongst objects on the basis of material composition rather than shape (Lucy, 1992b; Lucy and Gaskins, 2001). And being brought up speaking Korean (as opposed to English) – in which verbs are highly inflected and massive noun ellipsis is permissible in informal speech – leads children to be much weaker at categorization tasks, but much better at means–ends tasks such as using a rake to pull a distant object towards them (Choi and Gopnik, 1995; Gopnik et al., 1996; Gopnik, 2001).

            Fascinating as these data are, they do not, in themselves, support any version of the cognitive conception of language. This is because the reported effects of language on cognition are still entirely diachronic and developmental, rather than synchronic. The fact that acquiring one language as opposed to another causes subjects to attend to different things and to reason somewhat differently doesn’t show that language itself is actually involved in people’s thinking. Indeed, on the hypothesis proposed by Gopnik (2001), language-acquisition has these effects by providing evidence for a pre-linguistic theorizing capacity, which operates throughout development to construct children’s systems of belief and inference.


2.3 Language as a cognitive scaffold

Other claims can be extracted from the work of Vygotsky (1934/1986), who argues that language and speech serve to scaffold the development of cognitive capacities in the growing child. Researchers working in this tradition have studied the self-directed verbalizations of young children – for example, observing the effects of their soliloquies on their behavior (Diaz and Berk, 1992). They have found that children tend to verbalize more when task demands are greater, and that those who verbalize most tend to be more successful in problem-solving.

This claim of linguistic scaffolding of cognition admits of a spectrum of readings, however. At its weakest, it says no more than has already been conceded above, that language may be a necessary condition for the acquisition of certain cognitive skills. At its strongest, on the other hand, the idea could be that language forms part of the functioning of the highest-level executive system – which would then make it a variant of the ideas to be discussed in sections 4 and 5 below.

            Clark (1998) argues for a sort of intermediate-strength version of the Vygotskian idea, defending a conception of language as a cognitive tool. (Chomsky, too, has argued for an account of this sort. See his 1976, ch.2.) According to this view – which Clark labels ‘the supra-communicative conception of language’ – certain extended processes of thinking and reasoning constitutively involve natural language. The idea is that language gets used, not just for communication, but also to augment human cognitive powers.

Thus by writing an idea down, for example, I can off-load the demands on memory, presenting myself with an object of further leisured reflection; and by performing arithmetic calculations on a piece of paper, I may be able to handle computational tasks which would otherwise be too much for me (and my short-term memory). In similar fashion, it may be that inner speech serves to enhance memory, since it is now well-established that the powers of human memory systems can be greatly extended by association (Baddeley, 1988). Inner speech may thus facilitate complex trains of reasoning (Varley, 1998).

Notice that on this supra-communicative account, the involvement of language in thought only arises when we focus on a process of thinking or reasoning extended over time. So far as any given individual (token) thought goes, the account can (and does) buy into the standard input–output conception of language. It maintains that there is a neural episode which carries the content of the thought in question, where an episode of that type can exist in the absence of any natural language sentence and can have a causal role distinctive of the thought, but which in the case in question causes the production of a natural language representation. This representation can then have further benefits for the system of the sort which Clark explores (off-loading or enhancing memory).

            According to stronger forms of the cognitive conception to be explored in later sections, in contrast, a particular tokening of an inner sentence is (sometimes) an inseparable part of the mental episode which carries the content of the thought-token in question. So there is no neural or mental event at the time which can exist distinct from that sentence, which can occupy a causal role distinctive of that sort of thought, and which carries the content in question; and so language is actually involved in (certain types of) cognition, even when our focus is on individual (token) thinkings.

In this section I have discussed two weak claims about the role of language (that language is necessary for the acquisition of many beliefs and concepts; and that language may serve as a cognitive tool, enhancing the range and complexity of our reasoning processes). These claims should be readily acceptable to most cognitive scientists. In addition, I have briefly introduced a more controversial thesis, namely that the acquisition of one or another natural language can sculpt our cognitive processes, to some degree. But this thesis relates only to the developmental, or diachronic, role of language. It says nothing about the role of language in adult cognition. From here on I shall focus on more challenging versions of the cognitive conception of language.


3 Strong claims

As is starting to emerge, the thesis that language has a cognitive function admits of a spectrum of readings. In this section I shall jump to the other end of that spectrum, considering forms of the cognitive conception of language which are too strong to be acceptable.


3.1 Language as necessarily required for thought

When the question of the place of natural language in cognition has been debated by philosophers the discussion has, almost always, been conducted a priori in universalist terms. Various arguments have been proposed for the claim that it is a conceptually necessary truth that all thought requires language, for example (Wittgenstein, 1921, 1953; Davidson, 1975, 1982; Dummett, 1981, 1989; McDowell, 1994). But these arguments all depend, in one way or another, upon an anti-realist conception of the mind – claiming, for instance, that since we cannot interpret anyone as entertaining any given fine-grained thought in the absence of linguistic behavior, such thoughts cannot even exist in the absence of such behavior (Davidson, 1975). Since the view adopted in this paper – and shared by most cognitive psychologists – is quite strongly realist about thought, I do not propose to devote any time to such arguments.

            Notice, too, that Davidson et al. are committed to denying that any non-human animals can entertain genuine thoughts, given that it is very doubtful whether any such animals are capable of understanding and using a natural language (in the relevant sense of ‘language’, that is; see Premack, 1986; Pinker, 1994). This conclusion conflicts, not just with common-sense belief, but also with what can be discovered about animal cognition, both experimentally and by observation of their behavior in the wild (de Waal, 1982, 1996; Walker, 1983; Gallistel, 1990; Savage-Rumbaugh and Lewin, 1994; Byrne, 1995; Dickinson and Shanks, 1995; Allen and Bekoff, 1997; Hauser, 2000; Povinelli, 2000). So not only are the arguments of Davidson et al. unsound, but we have independent reasons to think that their conclusion is false.

            Dummett (1994) makes some attempt to accommodate this sort of point by distinguishing between concept-involving thoughts (which are held to be necessarily dependent upon language) and what he calls ‘proto-thoughts’, which are what animals are allowed to possess. Proto-thoughts are said to consist of ‘visual images superimposed on the visually perceived scene’, and are said to be possible only when tied to current circumstances and behavior. But such an account vastly under-estimates the cognitive capacities of non-human animals, I believe. If an animal can decide whom to form an alliance with, or can calculate rates of return from different sources of food, or can notice and exploit the ignorance of another, then these things cannot be accounted for in Dummett’s terms. And given that conceptual thinking of this sort is possible for animals, then he will be left without any principled distinction between animal thought and human thought.

            I do not expect that these brief considerations will convince any of my philosophical opponents, of course; and they aren’t meant to. Given the intended readership of this target-paper, their position is not really one that I need to take seriously. It is mentioned here just to set it aside, and (most importantly) in order that other, more plausible, versions of the cognitive conception of language shouldn’t be confused with it.

I propose, therefore, to take it for granted that thought is conceptually independent of natural language, and that thoughts of many types can actually occur in the absence of such language. But this leaves open the possibility that some types of thought might de facto involve language, given the way in which human cognition is structured. It is on this – weaker but nevertheless still controversial – set of claims that I shall focus. Claims of this type seem to me to have been unjustly under-explored by researchers in the cognitive sciences; partly, no doubt, because they have been run together with the a priori and universalist claims of some philosophers, which have been rightly rejected.


3.2 The Joycean machine

Another overly-strong form of cognitive conception of language – which has been endorsed by some philosophers and by many social scientists – is that language is, as a matter of fact, the medium of all human conceptual thinking. Most often it has been associated with a radical empiricism about the mind, according to which virtually all human concepts and ways of thinking, and indeed much of the very structure of the human mind itself, are acquired by young children from adults when they learn their native language – these concepts and structures differing widely depending upon the conceptual resources and structures of the natural language in question. This mind-structuring and social-relativist view of language is still dominant in the social sciences, following the writings early in the twentieth century of the amateur linguist Whorf (many of whose papers have been collected together in his 1956); indeed, Pinker (1994) refers to it disparagingly as ‘the Standard Social Science Model’ of the mind.

Perhaps Dennett (1991) is one of the clearest exponents of this view. He argues that human cognitive powers were utterly transformed following the appearance of natural language, as the mind became colonized by memes (ideas, or concepts, which are transmitted, retained and selected in a manner supposedly analogous to genes; see Dawkins, 1976). Prior to the evolution of language, on this picture, the mind was a bundle of distributed connectionist processors – which conferred on early hominids some degree of flexibility and intelligence, but which were quite limited in their computational powers. The arrival of language then meant that a whole new – serial and compositionally structured – cognitive architecture could be programmed into the system.

This is what Dennett calls the Joycean machine (named after James Joyce’s ‘stream of consciousness’ writing). The idea is that there is a highest-level processor which runs on a stream of natural-language representations, utilizing learned connections between ideas, and patterns of reasoning acquired in and through the acquisition of linguistic memes. On this account, then, the concept-wielding mind is a kind of social construction, brought into existence through the absorption of memes from the surrounding culture. And on this view, the conceptual mind is both dependent upon, and constitutively involves, natural language.

Admittedly, what Dennett will actually say is that animals and pre-linguistic hominids are capable of thought, and engage in much intelligent thinking. But this is because he is not (in my sense) a realist about thoughts. On the contrary, he (like Davidson) is what is sometimes called an ‘interpretationalist’ – he thinks that there is nothing more to thinking than engaging in behavior which is interpretable as thinking. Yet he does seem committed to saying that it is only with the advent of natural language that you get a kind of thinking which involves discrete, structured, semantically-evaluable, causally-effective states – that is, thoughts realistically construed.

            Bickerton’s proposals (1990, 1995) are somewhat similar, but more biological in flavor. He thinks that, before the evolution of language, hominid cognition was extremely limited in its powers. On his view these early forms of hominid cognition consisted largely of a set of relatively simple computational systems, underpinning an array of flexible but essentially behavioristic conditioned responses to stimuli. But then the evolution of language some 100,000 years ago involved a dramatic re-wiring of the hominid brain, giving rise to distinctively human intelligence and conceptual powers.[4]

Bickerton, like Dennett, allows that subsequent to the evolution of language the human mind would have undergone further transformations, as the stock of socially transmitted ideas and concepts changed and increased. But the basic alteration was coincident with, and constituted by, a biological alteration – namely, the appearance of an innately-structured language-faculty. For Bickerton is a nativist about language. (Indeed, his earlier work on the creolization of pidgin languages – 1981 – is often cited as part of an argument for the biological basis of language; see Pinker, 1994.) And it is language which, he supposes, conferred on us the capacity for ‘off-line thinking’ – that is, the capacity to think and reason about topics and problems in the abstract, independent of any particular sensory stimulus.

            These strong views seem very unlikely to be correct. This is so for two reasons. First, they undervalue the cognitive powers of pre-linguistic children, animals, and earlier forms of hominid. Thus Homo erectus and archaic forms of Homo sapiens, for example, were able to survive in extremely harsh tundra environments, presumably without language (see below). It is hard to see how this could have been possible without a capacity for quite sophisticated planning and a good deal of complex social interaction (Mithen, 1996). Second, the views of Dennett and Bickerton are inconsistent with the sort of central-process modularism which has been gaining increasing support in recent decades. On this account the mind contains a variety of conceptual modules – for mind-reading, for doing naïve physics, for reasoning about social contracts, and so on – which are probably of considerable ancestry, pre-dating the appearance of a modular language-faculty.[5] So hominids were already capable of conceptual thought, and of reasoning in a complex, and presumably ‘off-line’, fashion before the arrival of language.

In sections 3.3 and 3.4 which follow I shall elaborate briefly on these points. But first, I want to consider a potential reply which might be made by someone sympathetic to Bickerton’s position. For Bickerton actually thinks that earlier hominids probably used a form of ‘proto-language’ prior to the evolution of syntax, similar to the language used by young children and to pidgin languages. (This is, in fact, a very plausible intermediate stage in the evolution of natural language.) It might be claimed, then, that insofar as hominids are capable of intelligent thought, this is only because those thoughts are framed in proto-language. So the view that thought is dependent upon language can be preserved.

Such a reply would, indeed, give Bickerton a little extra wiggle-room; but only a little. For as we shall see in section 3.3 below, a good deal of the evidence for hominid thinking is provided by the capacities of our nearest relatives, the great apes, who are known to lack even a proto-language (without a good deal of human enculturation and explicit training, at any rate; Savage-Rumbaugh and Lewin, 1994). And some of the other evidence – e.g. provided by hominid stone knapping – is not plausibly seen as underpinned by proto-language. Moreover, the various thought-generating central modules, to be discussed in section 3.4 below, are almost surely independent both of language and proto-language. So it remains the case that much hominid thought is independent even of proto-language.


3.3            Hominid intelligence

Since social intelligence is something which we share with the other great apes (especially chimpanzees), it is reasonable to conclude that the common ancestor of all apes - and so, by implication, all earlier forms of hominid - will also have excelled in the social domain. While it is still disputed whether chimpanzees have full-blown mind-reading, or ‘theory of mind’, abilities, of the sort attained by a normal four-year-old child, it is not in dispute that the social behavior of great apes can be extremely subtle and sophisticated (Byrne and Whiten, 1988, 1998; Byrne, 1995; Povinelli, 2000).

Two points are worth stressing in this context. One is that it is well-nigh impossible to see how apes can be capable of representing multiple, complex, and constantly changing social relationships (who is friends with whom, who has recently groomed whom, who has recently fallen out with whom, and so on) unless they are capable of structured propositional thought.[6] This is a development of what Horgan and Tienson (1996) call ‘the tracking argument for Mentalese’ (i.e. an argument in support of the claim that thoughts are structured out of recombinable components). Unless the social thoughts of apes were composed out of elements variously representing individuals and their properties and relationships, then it is very hard indeed to see how they could do the sort of one-off learning of which they are manifestly capable. This surely requires separate representations for individuals and their properties and relations, so that the latter can be varied while the former are held constant. So (contra Dennett and Bickerton) we have reason to think that all earlier forms of hominid would have been capable of sophisticated conceptual thought (realistically construed), at least in the social domain.[7]
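The structural point behind the tracking argument can be made concrete with a small sketch. It is purely illustrative and makes no claim about how ape cognition is actually implemented: the names `Thought` and `revise` and the individuals `"A"`, `"B"`, `"C"` are hypothetical. It shows only that when a representation is built from separate components for individuals and for relations, a single observation can change the relation while the individuals are held constant.

```python
from typing import NamedTuple


class Thought(NamedTuple):
    """A structured thought: a relation holding between two individuals.

    The three fields are separate, recombinable components (hypothetical
    names, chosen only for illustration).
    """
    relation: str
    agent: str
    patient: str


def revise(thoughts, agent, patient, new_relation):
    """One-off learning: replace whatever relation was represented as
    holding between `agent` and `patient` with `new_relation`, leaving
    every other thought untouched."""
    kept = {t for t in thoughts if (t.agent, t.patient) != (agent, patient)}
    return kept | {Thought(new_relation, agent, patient)}


# A small stock of social beliefs about three individuals.
beliefs = {
    Thought("friends_with", "A", "B"),
    Thought("recently_groomed", "A", "C"),
}

# A single observation that A and B have fallen out updates one relation
# while the representations of A, B and C themselves stay fixed.
beliefs = revise(beliefs, "A", "B", "fallen_out_with")
```

An unstructured representation, with one atomic symbol per social situation, would offer no such way of varying the relation alone; each new situation would need to be learned afresh.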

            The second point to note is that the social thinking of apes seems sometimes to be genuinely strategic in nature, apparently involving plans which are executed over the course of days or months. Consider, for example, the way in which a band of male chimpanzees will set out quietly and in an organized and purposive manner towards the territory of a neighboring group, apparently with the intention, either of killing some of the males of that group, or of capturing some of its females, or both (Byrne, 1995). Or consider the way in which a lower-ranking male will, over the course of a number of months, build up a relationship with the beta male, until the alliance is strong enough for them to co-operate in ousting the alpha male from his position (de Waal, 1982). Presumably the thinking which would generate such long-term plans and strategies would have to be ‘off-line’, in the sense of not being tied to or driven by current perceptions of the environment.

            We can conclude, then, that all of our hominid ancestors would have had a sophisticated social intelligence. In addition, the stone-tool-making abilities of later species of Homo erectus indicate a sophisticated grasp of fracture dynamics and the properties of stone materials. Making stone tools isn’t easy. It requires judgment, as well as considerable hand-eye co-ordination and upper-body strength. And since it uses a reductive technology (starting from a larger stone and reducing it to the required shape) it cannot be routinized in the way that (presumably) nest-building by weaver birds and dam-building by beavers can be. Stone knappers have to hold in mind the desired shape and plan two or more strikes ahead in order to work towards it using variable and unpredictable materials (Pelegrin, 1993; Mithen, 1996). Moreover, some of the very fine three-dimensional symmetries produced from about half-a-million years ago would almost certainly have required significant capacities for visual imagination - in particular, an ability to mentally rotate an image of the stone product which will result if a particular flake is struck off (Wynn, 2000). And this is surely ‘off-line’ thinking if anything is!

            We can also conclude that early humans were capable of learning and reasoning about their natural environments with a considerable degree of sophistication. They were able to colonize much of the globe, ranging from Southern Africa to North-Western Europe to South-East Asia. And they were able to thrive in a wide variety of habitats (including extremely harsh marginal tundra environments), adapting their life-style to local - and sometimes rapidly changing - circumstances (Mithen, 1990, 1996). This again serves as a premise for a version of the ‘tracking argument’, suggesting that early humans were capable of compositionally-structured thoughts about the biological as well as the social worlds.


3.4       The modular mind

The above claims about the cognitive powers of our early ancestors both support, and are in turn supported by, the evidence of modular organization in the minds of contemporary humans. On this account, besides a variety of input and output modules (including early vision, face-recognition, and language, for example), the mind also contains a number of innately channeled conceptual modules, designed to process conceptual information concerning particular domains. Although these would not be modules in Fodor’s classic (1983) sense, in that they wouldn’t have proprietary transducers, might not have dedicated neural hardware, and might not be fully encapsulated, they would still be innately channeled dedicated computational systems, generating information in accordance with algorithms which are not shared with, nor accessible to, other systems.

Plausible candidates for such conceptual modules might include a naïve physics system (Leslie, 1994; Spelke, 1994; Spelke et al., 1995; Baillargeon, 1995), a naïve psychology or ‘mind-reading’ system (Carey, 1985; Leslie, 1994; Baron-Cohen, 1995), a folk-biology system (Atran, 1990, 1998, 2002), an intuitive number system (Wynn, 1990, 1995; Gallistel and Gelman, 1992; Dehaene, 1997), a geometrical system for re-orienting and navigating in unusual environments (Cheng, 1986; Hermer and Spelke, 1994, 1996) and a system for processing and keeping track of social contracts (Cosmides and Tooby, 1992; Fiddick et al., 2000).

Evidence supporting the existence of at least the first two of these systems (folk-physics and folk-psychology) is now pretty robust. Very young infants already have a set of expectations concerning the behaviors and movements of physical objects, and their understanding of this form of causality develops very rapidly over the first year or two of life. And folk-psychological concepts and expectations also develop very early, and follow a characteristic developmental profile. Indeed, recent evidence from the study of twins suggests that three-quarters of the variance in mind-reading abilities amongst three-year-olds is both genetic in origin and largely independent of the genes responsible for verbal intelligence, with only one quarter of the variance being contributed by the environment (Hughes and Plomin, 2000).[8]

            Now, of course the thesis of conceptual modularity is still highly controversial, and disputed by many cognitive scientists. And I cannot pretend to have said enough to have established it here; nor is there the space to attempt to do so. This is going to be one of the large assumptions which I need to ask my readers to take on board as background to what follows. However, there is one sort of objection to conceptual modularity which I should like to respond to briefly here. This is that there simply hasn’t been time for all of these modular systems to have evolved (or at any rate, not those of them that are distinctively human – geometry and folk-physics might be the exceptions).

            Tomasello (1999) argues that the mere six million years or so since the hominid line diverged from the common ancestor of ourselves and chimpanzees is just too short a time for the processes of evolution to have sculpted a whole suite of conceptual modules. He thinks that explanations of distinctively-human cognition need to postulate just one – at most two – biological adaptations, in terms of which all the other cognitive differences between us and chimpanzees should be explained. His preferred option is theory of mind ability, which underpins processes of cultural learning and cultural accumulation and transmission. Others might argue in similar fashion that the only major biological difference is the language faculty (Perner, personal communication).

            The premise of this argument is false, however; six million years is a lot of time, particularly if the selection pressures are powerful ones. (Only 10,000 years separate polar bears and grizzlies, for example.) And this is especially so when, as in the present case, many of the systems in question don’t have to be built ab initio, but can result from a deepening and strengthening of pre-existing faculties. Thus theory of mind would surely have developed from some pre-existing social-cognition module; folk-biology from a pre-existing foraging system; and so on. In order to reinforce the point, one just has to reflect on the major, and multiple, physical differences between ourselves and chimpanzees – including upright gait, arm-length, physical stature, brain size, nasal shape, hairlessness, whites of eyes, and so on and so forth. These, too, have all evolved – many of them independently, plainly – over the last six million years.


3.5       Taking stock

What has happened in the cognitive sciences in recent decades, then, is this. Many researchers have become increasingly convinced, by neuropsychological and other evidence, that the mind is more or less modular in structure, built up out of isolable, and largely isolated, components (Fodor, 1983; Sacks, 1985; Shallice, 1988; Gallistel, 1990; Barkow et al., 1992; Hirschfeld and Gelman, 1994; Sperber et al., 1995; Pinker, 1997). They have also become convinced that the structure and contents of the mind are substantially innate (Fodor, 1981, 1983; Carey, 1985; Spelke, 1994), and that language is one such isolable and largely innate module (Fodor, 1983; Chomsky, 1988; Pinker, 1994). There has then been, amongst cognitive scientists, a near-universal reaction against the cognitive conception of language, by running it together with the Whorfian hypothesis. Most researchers have assumed, without argument, that if they were to accept any form of cognitive conception of language, then that would commit them to Whorfian linguistic relativism and radical empiricism, and would hence be inconsistent with their well-founded beliefs in modularity and nativism (Pinker, 1994).

            It is important to see, however, that someone endorsing the cognitive conception of language does not have to regard language and the mind as cultural constructs, either socially determined or culturally relative. In fact, some form of cognitive conception of language can equally well be deployed along with a modularist and nativist view of language and mind. There is a range of positions intermediate between the input–output conception of language on the one hand, and Whorfian relativism (the Standard Social Science Model) on the other, which deserve the attention of philosophers and cognitive scientists alike. These views are nativist as opposed to empiricist about language and much of the structure of the mind, but nevertheless hold that language is constitutively employed in many of our thoughts.


4            Language and conscious thinking

What is at stake, then, is the question whether language might be constitutively involved in some forms of human thinking. But which forms? In previous work I suggested that language might be the medium in which we conduct our conscious propositional thinking - claiming, that is, that inner speech might be the vehicle of conscious-conceptual (as opposed to conscious visuo-spatial) thinking (Carruthers, 1996). This view takes seriously and literally the bit of folk-wisdom with which this paper began - namely, that much of our conscious thinking (viz. our propositional thinking) is conducted in inner speech.

            Now, if the thesis here is that the cognitive role of language is confined to conscious thinking, then it will have to be allowed that much propositional thinking also takes place independently of natural language - for it would hardly be very plausible to maintain that there is no thinking but conscious thinking. And there are then two significant options regarding the relations between non-conscious language-independent thought, on the one hand, and conscious language-involving thinking, on the other. For either we would have to say that anything which we can think consciously, in language, can also be thought non-consciously, independently of language; or we would have to say that there are some thought-types which can only be entertained at all, by us, when tokened consciously in the form of an imaged natural language sentence.

            Suppose that it is the first - weaker and more plausible - of these options which is taken. Then we had better be able to identify some element of the distinctive causal role of an imaged sentence which is sufficiently thought-like or inference-like for us to be able to say that the sentence in question is partly constitutive of the (conscious) tokening of the thought-type in question, rather than being merely expressive of it. For otherwise - if everything which we can think consciously, in language, we can also think non-consciously, without language - what is to block the conclusion that inner speech is merely the means by which we have access to our occurrent thoughts, without inner speech being in any sense constitutive of our thinking? (On this, at length, see Carruthers 1998b.)

            There would seem to be just two distinct (albeit mutually consistent) possibilities here. One (implicit in Carruthers, 1996) would be to propose a suitably weakened version of Dennett’s Joycean machine hypothesis. While allowing (contra Dennett) that much conceptual thinking (realistically construed) and all conceptual thought-types are independent of language (in the sense of not being constituted by it), we could claim that there are certain learned habits and patterns of thinking and reasoning which are acquired linguistically, and which are then restricted to linguistic (and conscious) tokenings of the thoughts which they govern. It is surely plausible, for example, that exact long-division or multiplication can only be conducted consciously, in imaged manipulations of numerical symbols. Similarly, it may be that the result of taking a course in logic is that one becomes disposed to make transitions between sentences, consciously in language, where one would otherwise not have been disposed to make the corresponding transitions between the thoughts expressed. If these sorts of possibilities are realized, then we would have good reason to say of a token application of a particular inference-form, that the imaged natural language sentences involved are constitutive of the inference in question, since it could not have taken place without them.

A second possibility is proposed and defended by Frankish (1998a, 1998b, and forthcoming; see also Cohen, 1993). This is that the distinctive causal role of inner speech is partly a function of our decisions to accept, reject, or act on the propositions which our imaged sentences express. I can frame a hitherto unconsidered proposition in inner speech and decide that it is worthy of acceptance, thereby committing myself to thinking and acting thereafter as if that sentence were true. Then provided that I remember my commitments and execute them, it will be just as if I believed the proposition in question. (In his published work Frankish describes this level of mentality as the ‘virtual mind’ and the beliefs in question as ‘virtual beliefs’.) But, by hypothesis, I would never have come to believe what I do, nor to reason as I do reason, except via the tokening of sentences in inner speech. Frankish argues, in effect, that there is a whole level of mentality (which he now dubs ‘supermind’) which is constituted by our higher-order decisions and commitments to accept or reject propositions; and that language is constitutive of the thoughts and beliefs which we entertain at this level.

Such views have considerable plausibility; and it may well be that one, or other, or both of these accounts of the causal role of inner speech is correct. Indeed, the dual process theory of human reasoning developed over the years by Evans and colleagues (Wason and Evans, 1975; Evans and Over, 1996), and more recently by Stanovich (1999), combines elements of each of them. On this account, in addition to a suite of computationally powerful, fast, and implicit reasoning systems (from our perspective, a set of conceptual modules), the mind also contains a slow, serial, and explicit reasoning capacity, whose operations are conscious and under personal control, and which is said (by some theorists at least; e.g. Evans and Over, 1996) to involve natural language. The emphasis here on learned rules in the operations of the explicit system is reminiscent of Dennett’s ‘Joycean machine’, whereas the stress on our having personal control over the operations of that system seems very similar to Frankish’s conception of ‘supermind’.

Not only is some form of dual-process theory plausible, but it should also be stressed that these accounts are independent of central-process modularism. Those who deny the existence of any conceptual modules can still accept that there is a level of thinking and reasoning which is both language-involving and conscious. It is surely plain, however, that none of the above accounts can amount to the most fundamental cognitive function of language once conceptual modularity is assumed.

Given conceptual modularity, unless the above views are held together with the thesis to be developed in section 5 below – namely, that language provides the medium for inter-modular communication and non-domain-specific thinking – we can set their proponents a dilemma. Either they must claim that a domain-general architecture was in place prior to the evolution of language. Or they must allow that there was no significant domain-general cognition amongst hominids prior to the appearance of language and language-involving conscious thinking; and they must claim that such cognition still evolved as a distinct development, either at the same time or later. Since contemporary humans are manifestly capable of conjoining information across different domains in both their theoretical thought and their planning, then either pre-linguistic humans must already have had domain-general theoretical and practical reasoning faculties, or they must have evolved them separately at the same time or after the evolution of the language faculty (that is, if it isn’t language itself which enables us to combine information across modules).

The problem with the first alternative, however – namely, that domain-general reasoning capacities pre-dated language – is that the evidence from cognitive archaeology suggests that this was not the case. For although the various sub-species of Homo erectus and archaic forms of Homo sapiens were smart, they were not that smart. Let me briefly elaborate.

As Mithen (1996) demonstrates at length and in detail, the evidence from archaeology is that the minds of early humans were in important respects quite unlike our own. While they successfully colonized diverse and rapidly changing environments, the evidence suggests that they were incapable of bringing together information across different cognitive domains. It seems that they could not (or did not) mix information from the biological world (utilized in hunting and gathering) with information about the physical world (used in tool making); and that neither of these sorts of information interacted with their social intelligence. Although they made sophisticated stone tools, they did not use those tools for specialized purposes (with different kinds of arrow-head being used for different kinds of game, for example); and they did not make tools out of animal products such as antler and bone. There is no sign of the use of artifacts as social signals, in the form of body ornaments and such-like, which is so ubiquitous in modern human cultures. And there is no indication of totemization or other sorts of linkages between social and animal domains, such as lion-man figurines, cave-paintings, or the burying of the dead with (presumably symbolic) animal parts - which all emerge onto the scene for the first time with modern humans. As Mithen summarizes the evidence, it would appear that early humans had sophisticated special intelligences, but that these faculties remained largely isolated from one another.

The problem with the second horn of the dilemma sketched above is just that it is hard to believe, either that a domain-general reasoning faculty might have evolved after the appearance of language some 100,000 years ago (in just the 20,000 years or so before the beginning of the dispersal of modern humans around the globe), or that language and domain-general capacities might have co-evolved as distinct faculties. For as we shall see in section 5, the evolution of language would in any case have involved the language faculty taking inputs from, and sending outputs to, the various modular systems, if there wasn’t already a domain-general system for it to be linked to. And it is hard to discern what the separate selection pressures might have been, which would have led to the development of two distinct faculties at about the same time (language and domain-general thought), when just one would serve.


5            Language as the medium of non-domain-specific thinking

The hypothesis which I particularly want to explore, then, is that natural language is the medium of non-domain-specific thought and inference. Versions of this hypothesis have been previously proposed by Carruthers (1996, 1998a), by Mithen (1996), and by Spelke and colleagues (Hermer-Vazquez et al., 1999; Spelke and Tsivkin, 2001; Spelke, forthcoming). I shall sketch the thesis itself, outline the existing experimental evidence in its support, and then (in the section following) consider some of its ramifications and possible elaborations. Finally (in section 7) I shall discuss what further evidence needs to be sought as a test of our thesis.


5.1       The thesis

The hypothesis in question assumes a form of central-process modularism. That is, it assumes that in addition to the various input and output modules (vision, face-recognition, hearing, language, systems for motor-control, etc.), the mind also contains a range of conceptual modules, which take conceptual inputs and deliver conceptual outputs. Evidence of various sorts has been accumulating in support of central-process modularism in recent decades (some of which has already been noted above). One line of support is provided by evolutionary psychologists, who have argued on both theoretical and empirical grounds that the mind contains a suite of domain-specific cognitive adaptations (Barkow et al., 1992; Sperber, 1996; Pinker, 1997). But many who would not describe themselves as ‘evolutionary psychologists’ have argued for a modular organization of central cognition, on developmental, psychological, and/or neuro-pathological grounds (Carey, 1985; Shallice, 1988; Gallistel, 1990; Carey and Spelke, 1994; Leslie, 1994; Spelke, 1994; Baron-Cohen, 1995; Smith and Tsimpli, 1995; Hauser and Carey, 1998).

            What cognitive resources were antecedently available, then, prior to the evolution of the language faculty? Taking the ubiquitous laboratory rat as a representative example, I shall assume that all mammals, at least, are capable of thought – in the sense that they engage in computations which deliver structured (propositional) belief-like states and desire-like states (Dickinson, 1994; Dickinson and Balleine, 2000). I shall also assume that these computations are largely carried out within modular systems of one sort or another (Gallistel, 1990) – after all, if the project here is to show how cross-modular thinking in humans can emerge out of modular components, then we had better assume that the initial starting-state was a modular one. Furthermore, I shall assume that mammals possess some sort of simple non-domain-specific practical reasoning system, which can take beliefs and desires as input, and figure out what to do.

            I shall assume that the practical reasoning system in animals (and perhaps also in us) is a relatively simple and limited-channel one. Perhaps it receives as input the currently-strongest desire and searches amongst the outputs of the various belief-generating modules for something which can be done in relation to the perceived environment which will satisfy that desire. So its inputs have the form DESIRE [Y] and BELIEF [IF X THEN Y], where X should be something for which a motor-program already exists. I assume that the practical reasoning system is not capable of engaging in other forms of inference (generating new beliefs from old), nor of combining together beliefs from different modules; though perhaps it is capable of chaining together conditionals to generate a simple plan – e.g. BELIEF [IF W THEN X], BELIEF [IF X THEN Y] → BELIEF [IF W THEN Y].
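The limited-channel system just described can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a model drawn from the literature: the names `Conditional`, `plan` and `motor_programs` are hypothetical, and beliefs are reduced to bare conditionals over atomic labels. The sketch takes the currently-strongest desire and a stock of conditional beliefs, and chains conditionals backwards from the desired outcome until it reaches something for which a motor-program exists. It performs no other inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditional:
    """BELIEF [IF antecedent THEN consequent] (hypothetical encoding)."""
    antecedent: str
    consequent: str


def plan(desire, beliefs, motor_programs):
    """Search backwards from DESIRE [desire] through conditional beliefs.

    Returns a list of steps beginning with an executable action and ending
    with the desired outcome, or None if no such chain can be found.
    """
    frontier = [[desire]]      # partial chains, each ending in the desire
    seen = {desire}
    while frontier:
        path = frontier.pop(0)
        head = path[0]
        for belief in beliefs:
            if belief.consequent == head and belief.antecedent not in seen:
                extended = [belief.antecedent] + path
                if belief.antecedent in motor_programs:
                    return extended   # reached something that can be done
                seen.add(belief.antecedent)
                frontier.append(extended)
    return None


# BELIEF [IF W THEN X], BELIEF [IF X THEN Y], DESIRE [Y]; only W is executable.
beliefs = [Conditional("W", "X"), Conditional("X", "Y")]
steps = plan("Y", beliefs, motor_programs={"W"})
```

Note what the sketch leaves out, in keeping with the assumption in the text: it never conjoins the contents of two beliefs, and so it could not by itself combine information delivered by different modules.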

            The central modules will take inputs from perception, of course. And my guess is that many of the beliefs and desires generated by the central modules will have partially indexical contents – thus a desire produced as output by the sex module might have the form, ‘I want to mate with that female’, and a belief produced by the causal-reasoning module might have the form, ‘That caused that’. So if the practical reasoning system is to be able to do anything with such contents, then it, too, would need to have access to the outputs of perception, to provide anchoring for the various indexicals. The outputs of the practical reasoning system are often likely to be indexical too, such as an intention of the form, ‘I’ll go that way’.

The inputs to central-process modules can presumably include not only conceptualized perceptions but also propositional descriptions (in the latter case deriving from linguistic input - for we surely use our mind-reading system, for example, when processing a description of someone’s state of mind as well as when observing their behavior). And in some cases, too, the inputs to a module will include the outputs of other central-process modules; for we might expect that there will be cases in which modules are organized into some sort of hierarchy. But what of the outputs from central-process modules? Besides being directed to other modules (in some instances), and also to the practical reasoning system, where is the information which is generated by central-process modules normally sent? And in particular, is there some non-domain-specific central arena where all such information is collated and processed?

            The hypothesis being proposed here is that there is such an arena, but one which crucially implicates natural language, and which cannot operate in the absence of such language. Moreover, the hypothesis is not just that our conscious propositional thinking involves language (as sketched in section 4 above), but that all non-domain-specific reasoning of a non-practical sort (whether conscious or non-conscious) is conducted in language. And as for the question of what a non-conscious tokening of a natural language sentence would be like, we can propose that it would be a representation stripped of all imagistic-phonological features, but still consisting of natural language lexical items and syntactic structures. (The role of syntax in the present account will be further explored in section 6.1 below.)

Chomsky (1995) has maintained, for example, that there is a level of linguistic representation which he calls ‘Logical Form’ (LF), which is where the language faculty interfaces with central cognitive systems. We can then claim that all cross-modular thinking consists in the formation and manipulation of these LF representations. The hypothesis can be that all such thinking operates by accessing and manipulating the representations of the language faculty. Where these representations are only in LF, the thoughts in question will be non-conscious ones. But where the LF representation is used to generate a full-blown phonological representation (an imaged sentence), the thought will normally be conscious. And crucially for my purposes, the hypothesis is that the language faculty has access to the outputs of the various central-process modules, in such a way that it can build LF representations which combine information across domains.

Let me say just a little more about the conscious / non-conscious distinction as it operates here. As I shall mention again in a moment (and as I shall return to at some length in section 6.2) language is both an input and an output module. Its production sub-system must be capable of receiving outputs from the conceptual modules in order to transform their creations into speech. And its comprehension sub-system must be capable of transforming heard speech into a format suitable for processing by those same conceptual modules. Now when LF representations built by the production sub-system are used to generate a phonological representation, in ‘inner speech’, that representation will be consumed by the comprehension sub-system, and made available to central systems. One of these systems is a theory of mind module. And on the sort of higher-order theory of consciousness which I favor (Carruthers, 2000), perceptual and imagistic states get to be phenomenally conscious by virtue of their availability to the higher-order thoughts generated by the theory of mind system (i.e. thoughts about those perceptual and imagistic states). So this is why inner speech of this sort is conscious: it is because it is available to higher-order thought.

            The hypothesis, then, is that non-domain-specific, cross-modular, propositional thought depends upon natural language - and not just in the sense that language is a necessary condition for us to entertain such thoughts, but in the stronger sense that natural language representations are the bearers of those propositional thought-contents. So language is constitutively involved in (some kinds of) human thinking. Specifically, language is the vehicle of non-modular, non-domain-specific, conceptual thinking which integrates the results of modular thinking.

            Before moving on to discuss the evidence in support of our thesis, consider one further question. Why does it have to be language, and not, for example, visual imagery which serves the integrative function? For visual images, too, can carry contents which cross modular domains. But such visual thinking will access and deploy the resources of a peripheral input module. It cannot, therefore, play a role in integrating information across conceptual modules, because the latter exist down-stream of the input-systems. Vision provides input to conceptual modules, and doesn’t receive output from them. The language faculty, in contrast, while also ‘peripheral’, has both input and output functions. (I shall return to this point again in section 6.2 below.) I would hypothesize, therefore, that in cases where visual images have cross-modular contents (and aren’t memory images), they are always generated from some linguistic representation which originally served to integrate those contents.


5.2       The evidence

What evidence is there to support the hypothesis that natural language is the medium of inter-modular communication, or of non-domain-specific integrated thinking? Until recently, the evidence was mostly circumstantial. For example, one indirect line of argument in support of our thesis derives from cognitive archaeology, when combined with the evidence of contemporary central-process modularism (Mithen, 1996). For as we noted above, it seems that we only have significant evidence of cross-modular thought following the emergence of contemporary humans some 100,000 years ago; whereas independent evidence suggests that language, too, was a late evolutionary adaptation, only finally emerging at about the same time (perhaps from an earlier stage of ‘proto-language’ - Bickerton, 1990, 1995). So the simplest hypothesis is that it is language which actually enables cross-modular thinking.

            Another strand of indirect evidence can be provided if we take seriously the idea that the stream of inner verbalization is constitutive of (some forms of) thinking (Carruthers, 1996). For as we saw in section 4 above, such views can only plausibly be held (given the truth of central-process modularism) together with the present hypothesis that language is the main medium of inter-modular communication.

            Much more importantly, however, direct tests of (limited forms of) our hypothesis have now begun to be conducted. The most important of these is Hermer-Vazquez et al. (1999), which provides strong evidence that the integration of geometric properties with other sorts of information (color, smell, patterning, etc.) is dependent upon natural language. The background to their studies with human adults is the apparent discovery of a geometric module in rats by Cheng (1986), as well as the discovery of a similar system in pre-linguistic human children (Hermer and Spelke, 1994, 1996).

            Cheng (1986) placed rats in a rectangular chamber, and allowed them to discover the location of a food source. They were then removed from the chamber and disoriented, before being placed back into the box with the food now hidden. In each case there were multiple cues available - both geometric and non-geometric - to guide the rats in their search. For example, the different walls might be distinctively colored or patterned, one corner might be heavily scented, and so on. In fact in these circumstances the rats relied exclusively on geometric information, searching with equal frequency, for example, in the two geometrically-equivalent corners having a long wall on the left and a short wall on the right. Yet rats are perfectly well capable of noticing and remembering non-geometric properties of the environment and using them to solve other tasks. So it appears that, not only are they incapable of integrating geometric with non-geometric information in these circumstances, but that geometric information takes priority.

            (This makes perfectly good ecological-evolutionary sense. For in the rat’s natural environment, overall geometrical symmetries in the landscape are extremely rare, and geometrical properties generally change only slowly with time; whereas object-properties of color, scent-markings, and so on will change with the weather and seasons. So a strong preference to orient by geometrical properties is just what one might predict.)

            Hermer and Spelke (1994, 1996) found exactly the same phenomenon in pre-linguistic human children. Young children, too, rely exclusively on geometric information when disoriented in a rectangular room, and appear incapable of integrating geometrical with non-geometrical properties when searching for a previously seen but now-hidden object. Older children and adults are able to solve these problems without difficulty - for example, they go straight to the corner formed with a long wall to the left and a short blue wall to the right. It turns out that success in these tasks isn’t directly correlated with age, nonverbal IQ, verbal working-memory capacity, vocabulary size, or comprehension of spatial vocabulary. In contrast, the only significant predictor of success in these tasks which could be discovered was spontaneous use of spatial vocabulary conjoined with object-properties (e.g. ‘It’s left of the red one’). Even by themselves, these data strongly suggest that it is language which enables older children and adults to integrate geometric with non-geometric information into a single thought or memory.
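            A toy sketch may make the logic of the reorientation task vivid. It is offered purely as an illustration under stated assumptions, not as a model drawn from the studies themselves: the room layout (long walls to the north and south, a single blue short wall to the east), the `Corner` record, and both function names are hypothetical. The sketch shows only that wall-length information by itself leaves two candidate corners, whereas the conjunction of geometry with a color property picks out just one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corner:
    name: str
    left_wall: str    # length of the wall on the left when facing the corner
    right_wall: str   # length of the wall on the right
    right_color: str  # color of the wall on the right


# Hypothetical rectangular room: long walls north and south, short walls
# east and west, with only the east wall painted blue (an assumed layout).
ROOM = [
    Corner("NE", "long", "short", "blue"),
    Corner("SE", "short", "long", "white"),
    Corner("SW", "long", "short", "white"),
    Corner("NW", "short", "long", "white"),
]


def geometric_matches(room, left_wall, right_wall):
    """Corners compatible with wall-length information alone."""
    return [c.name for c in room
            if c.left_wall == left_wall and c.right_wall == right_wall]


def integrated_matches(room, left_wall, right_wall, right_color):
    """Corners compatible with the conjunction of geometry and color."""
    return [c.name for c in room
            if (c.left_wall, c.right_wall, c.right_color)
            == (left_wall, right_wall, right_color)]


geometry_only = geometric_matches(ROOM, "long", "short")
integrated = integrated_matches(ROOM, "long", "short", "blue")
print(geometry_only)  # two geometrically equivalent corners
print(integrated)     # a single corner
```

            On this toy layout the geometry-only search returns the two diagonally opposite corners, which is the pattern of equal-frequency searching described above; only the conjoined description isolates the target.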

            Hermer-Vazquez et al. (1999) set out to test this idea with a series of dual-task experiments with adults. In one condition, subjects were required to solve one of these orientation problems while shadowing (i.e. repeating back) speech played to them through a set of headphones. In another condition, they were set the same problems while shadowing (with their hands) a rhythm played to them in their headphones. The hypothesis was that speech-shadowing would tie up the resources of the language faculty, whereas the rhythm-shadowing tasks would not; and great care was taken to ensure that the latter tasks were equally if not more demanding of the resources of working memory.

            The results of these experiments were striking. Shadowing of speech severely disrupted subjects’ capacity to solve tasks requiring integration of geometric with non-geometric properties. In contrast, shadowing of rhythm disrupted subjects’ performance relatively little. Moreover, a follow-up experiment demonstrated that shadowing of speech didn’t disrupt subjects’ capacities to utilize non-geometric information per se - they were easily able to solve tasks requiring only memory for object-properties. So it would appear that it is language itself which enables subjects to conjoin geometric with non-geometric properties, just as the hypothesis that language is the medium of cross-modular thinking predicts.

            Of course, this is just one set of experiments - albeit elegant and powerful - concerning the role of language in enabling information to be combined across just two domains (geometrical, and object-properties). In which case, little direct support is provided for the more-demanding thesis that language serves as the vehicle of inter-modular integration in general. But the evidence does at least suggest that the more general thesis may be well worth pursuing.


5.3       Challenging the data

The position taken by Hermer-Vazquez et al. (1999) has come under pressure from two different directions. First, there are claims that other species (chickens, monkeys) can integrate geometric and landmark information when disoriented (Vallortigara et al., 1990; Gouteux et al., in press). And second, there is the finding that success in these tasks amongst young children is somewhat sensitive to the size of the room – in a larger room, significantly more young (4-year-old) children make the correct choice, utilizing both geometric and landmark information; and even more 5- and 6-year-old children are also able to make the correct choice (Learmonth et al., 2001, in press).

To begin unpicking the significance of these new results, we need to return to some of the original claims. It is too strong to say that the original data with rats (Cheng, 1986) showed the existence of a geometric module in that species. For rats can use landmark information when navigating in other circumstances. The fact is just that they don’t use such information when disoriented. Nor is it established that rats cannot integrate geometric with landmark information. The fact is just that they do not utilize both forms of information when disoriented. So the data are consistent with the following model: there are no modules; rather, geometric and landmark information are both processed according to general-purpose algorithms and made available to some sort of practical reasoning system. But when disoriented, rats only pay attention to, and only make use of, the geometric information.

Even if one thinks (as I do) that other forms of evidence and other arguments make some sort of modularist architecture quite likely, the following proposal is still consistent with the data: both the geometric and landmark modules normally make their information available to some sort of practical reasoning system; but when disoriented, rats show a strong preference to make use only of the geometric information.

Equally, however, the fact that other species are able to solve these problems doesn’t show that members of those species can integrate geometric with landmark information into a single belief or thought. For it is possible to solve these tasks by making use of the information sequentially. The problems can be solved by first re-orienting to the landmark, and then using geometric information to isolate the correct corner. So the data are consistent with the following modularist model: both the geometric and landmark modules make their outputs available to a limited-channel practical reasoning system, where the latter doesn’t have the inferential resources to integrate information from different modules; rather, it can only utilize that information sequentially, using a variety of heuristics (both innate and learned) in selecting the information to be used, and in what order. On this view, the difference between monkeys and rats is just that the former utilize landmark information first, before using geometry; whereas the latter use geometry exclusively in these circumstances. Neither species may in fact be capable of integrating geometrical with landmark information.

(It is tempting to seek an adaptationist explanation of these species differences. Open-country dwellers such as rats and pre-linguistic humans may have an innate pre-disposition to rely only on geometric information when disoriented because such information alone will almost always provide a unique solution (given that rectangular rooms don’t normally occur in nature!). Forest dwellers such as chickens and monkeys, in contrast, have an innate pre-disposition to seek for landmark information first, only using geometric information to navigate in relation to a known landmark. This is because geometric information is of limited usefulness in a forest – the geometry is just too complex to be useful in individuating a place in the absence of a landmark such as a well-known fruit-tree.)

What of the new data concerning the effects of room size? Well, the first thing to say is that this data leaves intact the finding by Hermer-Vazquez et al. (1999) that the best predictor of success in children (in small room experiments) is productive use of left-right vocabulary. This suggests both that language has something to do with their success, and that it is specifically syntax (the capacity to integrate different content-bearing items into a single thought) which is required. For if the role of language were simply to help fix the salience and importance of landmark information, one would expect that it should have been productive use of color vocabulary, rather than spatial vocabulary, which was the best predictor of success. For by hypothesis, after all, children are already disposed to use geometric information in reorienting; their problem is to make use of color information as well.

Equally untouched are the experiments with adults involving speech shadowing and rhythm shadowing, which found that the former greatly disrupts the capacity to use geometric and landmark information together, whereas the latter does not. These results, too, suggest that it is language which enables adults to integrate the two forms of information.

Why should room size have any effect upon children’s performance, however? Here is one testable possibility, which is consistent with the theoretical framework of Hermer-Vazquez et al. (1999) and the present author. In a small room (4 feet by 6 feet) it requires but very little time and energy to select a corner and turn over a card. In a larger room (8 feet by 12 feet), in contrast, children have to take a few steps in order to reach a selected corner, giving them both a motive, and the time, to reflect. It may then be that the children who were able to succeed in the large-room condition were on the cusp of having the linguistic competence necessary to integrate geometric and landmark information. Perhaps they could do this, but only haltingly and with some effort. Then it is only to be expected that such children should succeed when given both the time and the motive to do so.


5.4       More data: language and arithmetic

We have examined one set of data which provides strong support for a limited version of our thesis. Data is now available in one other domain - that of number. This comes from a recent bilingual training study conducted by Spelke and Tsivkin (2001). The background to this study is the discovery of numerical capacities in animals and human infants, of two different sorts. One is the capacity possessed by many different kinds of animal (including birds and fish) to represent the approximate numerosity of largish sets of items (Gallistel, 1990; Dehaene, 1997). This capacity is utilized especially in foraging, enabling animals to estimate rates of return from different food sources. The other numerical capacity is possessed by monkeys and human infants, at least. It is a capacity to represent the exact number of small sets of items (up to about four), keeping track of their number using simple forms of addition and subtraction (Gallistel and Gelman, 1992; Hauser and Carey, 1998).

            The developmental hypothesis which forms the backdrop to Spelke and Tsivkin’s study is that language-learning in human children - specifically, learning to pair number words with items in a set through the process of counting - builds upon these two pre-linguistic numerical capacities to enable humans to represent exact numbers of unlimited magnitude. But the developmental hypothesis could be interpreted in two ways. On one interpretation, the role of language is to load the child’s mind with a set of language-independent exact numerical concepts - so that, once acquired, the capacity to represent exact large magnitudes is independent of language. The other interpretation is that it is the numerical vocabulary of a specific natural language which forms the medium of exact-magnitude representation, in such a way that natural language is the vehicle of arithmetic thought. It is this latter interpretation which Spelke and Tsivkin (2001) set out to test.

            They conducted three different bilingual arithmetic training experiments. In one experiment, bilingual Russian-English college students were taught new numerical operations; in another, they were taught new arithmetic equations; and in the third, they were taught new geographical and historical facts involving both numerical and non-numerical information. After learning a set of items in each of their two languages, subjects were tested for knowledge of those and of new items in both languages. In all three studies subjects retrieved information about exact numbers more effectively in the language in which they were trained on that information, and they solved trained problems more effectively than new ones. In contrast, subjects retrieved information about approximate numbers and about non-numerical (geographical or historical) facts with equal ease in their two languages, and their training on approximate number facts generalized to new facts of the same type.[9] These results suggest that one or another natural language is the vehicle of thought about exact numbers, but not for representing approximate numerosity (a capacity shared with other animals).

            What we have, then, is the beginnings of evidence in support of our general thesis that natural language is the medium of inter-modular non-domain-specific thinking. In section 7 I shall briefly consider where and how one might search for yet more evidence. But first I shall look at some of the questions raised by our thesis.


6            Ramifications and implications

In this section I shall consider - very briefly - some of the implications of the hypothesis just proposed, as well as discussing a number of outstanding questions.


6.1       Speech production

It is plain that the present hypothesis commits us to a non-classical account of speech production. Classically, speech begins with thought - with a mental representation of the message to be communicated; and then linguistic resources (lexical and phonological items, syntactic structures and so on) are recruited in such a way as to express that thought in speech. (See, e.g., Levelt, 1989.) But of course this is a picture which we cannot endorse. We cannot accept that the production of the sentence, ‘The toy is to the left of the blue wall’ begins with a tokening of the thought, the toy is to the left of the blue wall (in Mentalese), since our hypothesis is that such a thought cannot be entertained independently of being framed in natural language.[10] How, then, does the sentence get assembled? I have to confess that I don’t have a complete answer to this question in my pocket at the moment! But then this need be no particular embarrassment, since classical theorists don’t have an account of how their initial Mentalese thoughts are assembled, either.

            In fact our hypothesis enables us to split the problem of speech production / domain-general thought-generation into two, each of which may prove individually more tractable. For we are supposing that central modules are capable of generating thoughts with respect to items in their domain. Thus the geometrical module might build a thought of the form, the toy is in the corner with a long wall on the left and a short wall on the right, whereas the object-property system might build a thought of the form, the toy is by the blue wall. Each of these thoughts can be taken as input by the language faculty, we may suppose, for direct translation into natural language expression. It is, then, not so very difficult to suppose that the language faculty might have the resources to combine these two thoughts into one, forming a representation with the content, The toy is in the corner with a long wall on the left and a short blue wall on the right.[11] Nor is this such a very large departure from classical accounts. (Certainly it is much less radical than Dennett’s endorsement of pandemonium models of speech production; see his 1991.)

            The two tasks facing us in explaining speech production, then, are first, to explain how the thoughts generated by central modules are used to produce a natural language sentence with the same content; and second, to explain how the language faculty can take two distinct sentences, generated from the outputs of distinct conceptual modules, and combine them into a single natural language sentence. There is some reason to hope that, thus divided, the problem may ultimately prove tractable. In responding to the first part of the problem we can utilize classical accounts of speech production. Here just let me say a brief word about the second of the above problems, by way of further explaining the role of syntax in my model.

            Two points are suggestive of how distinct domain-specific sentences might be combined into a single domain-general one. One is that natural language syntax allows for multiple embedding of adjectives and phrases. Thus one can have, ‘The food is in the corner with the long wall on the left’, ‘The food is in the corner with the long straight wall on the left’, and so on. So there are already ‘slots’ into which additional adjectives – such as ‘blue’ – can be inserted. The second point is that the reference of terms like ‘the wall’, ‘the food’, and so on will need to be secured by some sort of indexing to the contents of current perception or recent memory. In which case it looks like it would not be too complex a matter for the language production system to take two sentences sharing a number of references like this, and combine them into one sentence by inserting adjectives from one into open adjective-slots in the other. The language faculty just has to take the two sentences, ‘The food is in the corner with the long wall on the left’ and, ‘The food is by the blue wall’ and use them to generate the sentence, ‘The food is in the corner with the long blue wall on the left’, or the sentence, ‘The food is in the corner with the long wall on the left by the blue wall’.
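            The combination step just described can be given a toy rendering. What follows is a minimal sketch under loudly stated assumptions, and makes no claim about how the language faculty actually operates: the `NounPhrase` record, the referent labels (‘food-1’, ‘wall-1’, and so on), the `combine` function, and the fixed sentence frame are all hypothetical devices. The sketch shows only that, once two sentences index the same referents, inserting the adjectives of one into the open adjective-slots of the other is a simple operation.

```python
from dataclasses import dataclass, field


@dataclass
class NounPhrase:
    referent: str   # hypothetical index to an item in current perception
    head: str
    adjectives: list[str] = field(default_factory=list)

    def render(self):
        return " ".join(["the", *self.adjectives, self.head])


def combine(host, donor):
    """Copy adjectives from donor phrases into co-referring host phrases.

    Returns new phrases; neither input list is modified.
    """
    merged = []
    for np in host:
        extra = [adj
                 for d in donor if d.referent == np.referent
                 for adj in d.adjectives if adj not in np.adjectives]
        merged.append(NounPhrase(np.referent, np.head, np.adjectives + extra))
    return merged


# 'The food is in the corner with the long wall on the left'
geometric = [
    NounPhrase("food-1", "food"),
    NounPhrase("corner-1", "corner"),
    NounPhrase("wall-1", "wall", ["long"]),
]
# 'The food is by the blue wall' (assumed to index the same wall)
object_property = [
    NounPhrase("food-1", "food"),
    NounPhrase("wall-1", "wall", ["blue"]),
]

food, corner, wall = combine(geometric, object_property)
sentence = (f"{food.render().capitalize()} is in {corner.render()} "
            f"with {wall.render()} on the left")
print(sentence)
```

            The sketch yields the first of the two combined sentences mentioned above. Everything of interest is carried by the shared referent index: if the blue wall were indexed to a different wall from the long one, no adjective would be inserted, and the two sentences would simply remain distinct.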


6.2       Cycles of LF activity

The present proposals may enable us to rescue one other aspect of Dennett’s (1991) ‘Joycean machine’ hypothesis (in addition to the ‘learned linguistic habits’ idea, discussed in section 4 above), again without commitment to his claim that language is the medium of all (realistically-construed and structured) conceptual thought. This is the suggestion that by asking ourselves questions we can initiate searches in a number of different modular systems, perhaps generating new information which in turn generates new questions, and so on. In general this cycle of questions and answers will go on consciously, in inner speech (as Dennett supposes), but it might also be conducted non-consciously (either below the threshold of attention or perhaps in LF – I shall not pursue this suggestion here), through over-learning. Let me elaborate.

            The crucial point for these purposes is that natural language is both an input and an output system. It is the output sub-system of the language faculty which will initially play the role of conjoining information from different conceptual modules, since it is this sub-system which will have been designed to receive inputs from those modules. This is because the evolution of a language system would already have required some sort of interface between the Mentalese outputs of the conceptual modules and the speech-production sub-system of the language faculty, so that those thoughts could receive expression in speech. And we are supposing that this interface became modified during the evolution of language so that thoughts deriving from distinct conceptual modules could be combined into a single natural language sentence. But when the resulting LF representation is used to generate a phonological representation of that sentence, in inner speech, this might normally co-opt the resources of the input sub-system of the language faculty, in such a way as to generate a ‘heard’ sentence in auditory imagination. By virtue of being ‘heard’, then, the sentence would also be taken as input to the conceptual modules which are down-stream of the comprehension sub-system of the language faculty, receiving the latter’s output. So the cycle goes: thoughts generated by central modules are used to frame a natural language representation, which is used to generate a sentence in auditory imagination, which is then taken as input by the central modules once again.

A comparison with visual imagination may be of some help here. According to Kosslyn (1994), visual imagination exploits the top-down neural pathways (which are deployed in normal vision to direct visual search and to enhance object recognition) in order to generate visual stimuli in the occipital cortex, which are then processed by the visual system in the normal way, just as if they were visual percepts. A conceptual or other non-visual representation (of the letter ‘A’, as it might be) is projected back through the visual system in such a way as to generate activity in the occipital cortex, just as if a letter ‘A’ were being perceived. This activity is then processed by the visual system to yield a quasi-visual percept.

            Something very similar to this presumably takes place in auditory (and other forms of) imagination. Back-projecting neural pathways which are normally exploited in the processing of heard speech will be recruited to generate a quasi-auditory input, yielding the phenomenon of ‘inner speech’. In this way the outputs of the various conceptual modules, united into a sentence of LF by the production sub-system of the language faculty, can become inputs to those same modules by recruiting the resources of the comprehension sub-system of the language faculty, in inner speech.

            But now, how would a sentence which combines information across a number of distinct central-modular domains have its content ‘split up’ so as to be taken as input again by those modules, with their proprietary and domain-specific concepts? One plausible suggestion is that this is one of the primary functions of so-called ‘mental models’ - non-sentential, quasi-imagistic, representations of the salient features of a situation being thought about (Johnson-Laird, 1983). For it is now well-established that mental models play an indispensable role in discourse comprehension (see Harris, 2000, for reviews). When listening to speech, what people do is construct a mental model of the situation being described, which they can then use to underpin further inferences. The reason why this may work is that mental models, being perception-like, are already of the right form to be taken as input by the suite of conceptual modules. For of course those modules would originally have been built to handle perceptual inputs, prior to the evolution of language.

            So the suggestion is that language, by virtue of its role in unifying the outputs of conceptual modules, and by virtue of our capacity for auditory imagination, can be used to generate cycles of central-modular activity, hence recruiting the resources of a range of specialized central-modular systems in seeking solutions to problems. This may be one of the main sources of the cognitive flexibility and adaptability which is so distinctive of our species. But how, exactly, are the LF questions which are used in such cycles of enquiry to be generated? How does the language system formulate interrogative sentences which are both relevant and fruitful? I do not have an answer to this question. But I am not embarrassed by this lack, since I suspect that no one has, as yet, a worked-out story about how interrogative thoughts are formed.


6.3       LF consumers

What are the consumer systems for the LF sentences generated from the outputs of the central modules? What can be done with an LF sentence, once it has been formulated? One thing which it can be used for, obviously, is to generate an imaged natural language sentence with the same content, thereby rendering the thought in question conscious, and triggering the kinds of mental activity and further consequences distinctive of conscious thinking. Specifically, it may make possible sequences of thought in accordance with learned habits or rules (see section 4 above); it will make that sentence and its content accessible to a variety of central-process systems for consideration, and for acceptance or rejection (see section 4 above); and it may make possible cycles of LF activity involving central modules, generating new thought-contents which were not previously available (see section 6.2 above).

            In addition, I can think of two plausible special-purpose systems which may have been designed to consume LF representations. First, it may be that there is a domain-general factual memory system. (It is already known that there is a factual–semantic memory system which is distinct from the experiential–personal memory system, and that the latter is experience-driven whereas the former is not. See Baddeley, 1988.) This system would either store domain-general information in the form of LF sentences, or (more plausibly) in some other format (mental models?) generated by the LF sentences which it takes as input. (Recall the data from Spelke and Tsivkin, 2001, that geographical and historical information is recalled equally readily whether or not the language of learning is the same as the language of testing.)

Second, it may be that there is, in addition, some sort of innately-channeled abductive reasoning faculty, which places constraints upon sentence acceptance (Carruthers, 1992, 2002). This would be a domain-general reasoning system, taking LF sentences as input and generating LF sentences as output. The reasons for believing in a faculty of ‘inference to the best explanation’ are two-fold. First, there are certain very general constraints on theory-choice employed in science which are equally valid in other areas of enquiry, and which appear to be universal amongst humans. While no one any longer thinks that it is possible to codify these principles of abductive inference, it is generally agreed that the good-making features of a theory include such features as: accuracy (predicting all or most of the data to be explained, and explaining away the rest); simplicity (being expressible as economically as possible, with the fewest commitments to distinct kinds of fact and process); consistency (internal to the theory or model); coherence (with surrounding beliefs and theories, meshing together with those surroundings, or at least being consistent with them); fruitfulness (making new predictions and suggesting new lines of enquiry); and explanatory scope (unifying together a diverse range of data).

Essentially these same principles are employed in many contexts of everyday reasoning. Most strikingly for our purposes, however, such principles are employed by human hunter-gatherers, especially when tracking prey. Successful hunters will often need to develop speculative hypotheses concerning the likely causes of the few signs available to them, and concerning the likely future behavior of the animal; and these hypotheses are subjected to extensive debate and further empirical testing by the hunters concerned. When examined in detail these activities look a great deal like science, as Liebenberg (1990) demonstrates.

First, there is the invention of one or more hypotheses concerning the unobserved (and now unobservable) causes of the observed signs, and the circumstances in which they may have been made. These hypotheses are then examined and discussed for their accuracy, coherence with background knowledge, and explanatory and predictive power.[12] One of them may emerge out of this debate as the most plausible, and this can then be acted upon by the hunters, while at the same time searching for further signs which might confirm or count against it. In the course of a single hunt one can then see the birth, development, and death of a number of different ‘research programs’ in a manner which is at least partly reminiscent of theory-change in science (Lakatos, 1970).

The second point supporting the existence of an abductive ‘consumer system’ is this. Not only are abductive principles universal amongst humans, but it is hard to see how they could be other than substantially innate (Carruthers, 1992). For since these principles are amongst the basic principles of learning, they cannot themselves be learned. And neither are they explicitly taught, at least in hunter–gatherer societies. While nascent trackers may acquire much of their background knowledge of animals and animal behavior by hearsay from adults and peers, very little overt teaching of tracking itself takes place. Rather, young boys will practice their observational and reasoning skills for themselves, first by following and interpreting the tracks of insects, lizards, small rodents, and birds around the vicinity of the camp-site, and then in tracking and catching small animals for the pot (Liebenberg, 1990). Nor are abductive principles taught to younger school-age children in our own society, in fact. Yet experimental tests suggest that children’s reasoning and problem-solving is almost fully in accord with those principles, at least once the tests are conducted within an appropriate scientific-realist framework (Koslowski, 1996). This is in striking contrast with many other areas of cognition, where naïve performance is at variance with our best normative principles. (For reviews see Evans and Over, 1996; Stein, 1996.)

In addition to a domain-general factual memory system, then, I have suggested that there may well be a domain-general faculty of abductive inference. So there are, it seems, at least two likely domain-general consumer-systems for LF representations, which either co-evolved with language, or which were specially designed by evolution at some point after language had taken on its role as the medium of inter-modular integration.


6.4       LF and mind-reading

There are a number of reasons for thinking that the language faculty and our mind-reading (or ‘theory of mind’) faculty will be intimately connected with one another. First, there is no question but that mind-reading is vitally implicated in the processing and interpretation of speech, especially in its pragmatic aspects, including such phenomena as metaphor and irony (Sperber and Wilson, 1986/1995). Second, there is good reason to think that the evolution of the two faculties will have been intertwined in a kind of evolutionary ‘arms race’ (Gómez, 1998), and that language is one of the crucial inputs for normal mind-reading development in young children (Harris, 1996; Peterson and Siegal, 1998). Finally, our mind-reading faculty - specifically our capacity for higher-order or meta-representational thought - will be crucial to the operations of the sort of serial, conscious, language-using level of mentality discussed in sections 4 and 6.2 above (Sperber, 1996; Perner, 1998).

            Almost everyone now accepts that our mind-reading capacity comes in degrees, developing in stages between nine months (or earlier) and around four years of age; and a number of related proposals have been made concerning the stages which young children pass through (Wellman, 1990; Perner, 1991; Gopnik and Meltzoff, 1997). Those who want to claim that language is implicated in mind-reading capacities would, I think, restrict their claims to the sort of meta-representational mind-reading of which children become capable at about four (Segal, 1998; de Villiers, 2000). That is, what they find attractive is the suggestion that two- and three-year-old psychology (of the sort which we may well share with chimpanzees, perhaps) is independent of language, whereas full-blown theory of mind (four-year-old psychology or ToM) partly depends upon it. But we should distinguish between two different claims here, one of which is very likely true, yet the other of which is probably false.

The first (weaker) suggestion is just that full-blown ToM needs to access the resources of the language faculty in order to describe the contents of (some of) the thoughts being attributed to self or other. (I say this is weaker because the concept of thought in general, as a representational state of the agent which can represent correctly or incorrectly, can still be held to be independent of language - see below.) And if you think about it, something like this has got to be true if any version of the present proposal about the role of language in linking together different modules is correct. For if geometry and color (say) can’t be combined in a single thought without language, then one could hardly expect the mind-reading faculty to be able to attribute a thought to another (or to oneself) which conjoins geometry with color without deploying language! This would be to give that faculty almost-magical super-properties possessed by no other module. No, if entertaining the thought that the object is to the left of the blue wall requires tokening the LF representation, ‘The object is to the left of the blue wall’, then ascribing to someone the belief that the object is to the left of the blue wall would similarly require the use of that LF sentence.

There may be a more general point here, which gives the element of truth in so-called ‘simulationist’ theories of our mind-reading abilities (Gordon, 1986, 1995; Heal, 1986, 1995; Goldman, 1989, 1993). For in order for you to know what someone is likely to infer from a given thought, you will have to deploy your general - non-ToM - inferential resources, including any that are modular in nature. Otherwise we will have to think of the mind-reading system as somehow encompassing all others, or as containing a meta-theory which describes the operations of all others. This is in fact the reason why many - including Nichols and Stich, 1998 and forthcoming; Botterill and Carruthers, 1999; and others - are now defending a sort of mixed theory / simulation view of our mind-reading abilities.

The second, stronger, hypothesis would be that the very concept of thought is dependent upon language, and requires an LF vehicle. On this view, you can only have the concept of belief, say, as a representational and potentially-false state of an agent, if you are a language-user - specifically, if you have mastered some version of the that-clause construction made available in all natural languages (Segal, 1998). It is this hypothesis which I would be inclined to deny. For there is evidence from aphasic adults, at least, that people who have lost their capacity for mentalistic vocabulary can nevertheless pass false-belief tasks of various sorts (Varley, 1998). So I think that the full, four-year-old, ToM system is a language-independent theory which comes on line at a certain stage in normal development (albeit with that development being especially accelerated by the demands of interpreting linguistic input - Harris, 1996; Peterson and Siegal, 1999), which nevertheless has to access the resources of other systems (including the language faculty) in order to go about its work of deducing what to expect of someone who has a given belief, and so on. But of course the question is an empirical one, and the stronger hypothesis may well turn out to be right.

One way of pointing up the difference, here, is that on the weaker hypothesis you can do false-belief without language (at least in intra-modular contexts), although there will be many thoughts which you will be incapable of attributing (viz. those which are dependent upon language); whereas on the stronger hypothesis the capacity to solve false-belief tasks will be language-dependent across all contexts. Another empirical difference between the two accounts would appear to be that on the weaker hypothesis false-belief tasks which deal with contents drawn from a single module should be easier than those dealing with cross-modular contents - for the latter but not the former will need to operate on an LF representation. But on the stronger hypothesis there should be less or no difference, since all higher-order thoughts will deploy LF representations. These predictions cry out for experimental investigation.


7          Future evidence

What further sorts of evidence would either confirm, or disconfirm, the hypothesis that natural language is the medium of cross-modular thinking?

            One obvious way forward would be to undertake many more dual-task studies of the sort conducted by Hermer-Vazquez et al. (1999). Subjects might be asked to solve problems which require information to be conjoined from two or more central-modular domains - for example: geometrical and folk-biological, or geometrical and folk-physical, or folk-biological and folk-physical. They might be asked to solve these tasks in two conditions – in one of which they are asked to shadow speech (hence tying up the resources of their language faculty), and in the other of which they are asked to shadow a rhythm (tying up their working memory to an equal or greater degree). If they fail on the task when shadowing speech but not when shadowing rhythm then this would be further evidence in support of the thesis that it is language which is the medium for integrating knowledge from different conceptual modules.

            If it were to turn out that subjects perform equally well in the two conditions (either succeeding in both or failing in both), then would this be evidence against our thesis? Perhaps. But some caution would need to be shown before drawing any such conclusion. For one difficulty standing in the way of developing such studies is that of ensuring that the tasks genuinely do involve the conjoining of information across domains, and that they cannot be solved by accessing that information sequentially. So if subjects succeed under each of the conditions (whether shadowing speech or shadowing rhythm) in a task requiring them to tap into both folk-physical and folk-biological information, say, this might be because they first use information from one domain to make progress in the task and then use information from the other to complete it. (One of the distinctive features of the problems chosen for study by Hermer-Vazquez et al., is that it was known independently that in conditions of spatial disorientation, geometric information is relied upon exclusively in the absence of language.) Some care will therefore need to be taken in designing the relevant experiments.

            In addition, of course, it is still to some degree controversial whether (and if so which) central modules exist. So a negative result in a dual-task experiment might be because one or more of the supposed conceptual modules chosen for the study doesn’t really exist, not because language isn’t the medium of inter-modular communication. But as often has to happen in science, we can regard such dual-task studies as jointly testing both the thesis of conceptual modularity and the claim that language is the medium of inter-modular integration. A positive result will count in favor of both claims; a negative result will count against one or other of them (but not both).

            In principle, dual-task studies might fruitfully be devised wherever researchers have independent reason to believe in the existence of a conceptual module. For example, those who believe that there is a special system for recognizing and figuring out the degree of relatedness of kin, or those who believe that there is a special system for processing social contracts and detecting cheaters and free-riders, might construct a dual-task study to test whether the conjoining of information from these domains with others requires language. But I want to stress that such studies should not be conducted in the domain of mind-reading (despite the fact that many believe in its central-modular status), because of the points made in section 6.4 above. Since there is reason to think that mind-reading routinely co-opts the resources of the language faculty in any case, failure in a speech-shadowing task but not in a rhythm-shadowing task would not necessarily count in favor of the thesis that language is the medium of cross-modular thinking.[13]

            The other obvious place to look for evidence for or against our thesis is in connection with either global or agrammatic aphasia. If subjects who are known to lack any capacity for formulating natural language sentences fail at tasks requiring them to conjoin information from different conceptual-modular domains, but can pass equivalently demanding tasks within a given domain, then this would be strong evidence in support of our claim. (This is a big ‘if’, of course, given the difficulties of discriminating between patients who have lost all grammatical competence and those who only have problems with linguistic input and output.) And here, as before, if such subjects should turn out to pass both types of task, then this will count against either the thesis that language is the medium of inter-modular integration or the existence of one or other of the supposed modules in question (but not both).


8            Conclusion

This paper has reviewed a wide range of claims concerning the cognitive functions of language. At one extreme is the purely communicative (or input-output) conception of language, and at the other extreme is the claim that language is required for all propositional thought as a matter of conceptual necessity, with a variety of positions in between these two poles. Section 2 discussed some versions of the cognitive conception of language which are too weak to be of any deep interest; and section 3 considered some claims which are too strong to be acceptable. Section 4 expressed sympathy for a variety of ‘dual process’ models of cognition, especially the claim that language is the vehicle of conscious-conceptual thinking. But it pointed out that to be plausible (given the truth of central-process modularism) such views must depend on the prior and more fundamental claim that language is the medium of cross-modular thought. This then became the focus of our enquiries in sections 5 through 7, where evidence was adduced in its support, its implications discussed, and a call for further experimental testing was posted.

            In closing, however, let me provide a reminder of the character of the exercise we have undertaken. Almost every paragraph in this paper has contained claims which are still controversial to some degree, and yet there hasn’t been the space to pursue those controversies or to defend my assumptions. This has been inevitable, given the array of theories we have considered, and the range of considerations and types of evidence which are relevant to their truth, drawn from a variety of academic disciplines. But then the task has only been to survey those theories, and to show that some of them are well enough motivated to warrant further investigation - not to nail down and conclusively establish a precisely formulated thesis. And in that task, I hope, the paper has succeeded.[14]



Allen, C. and Bekoff, M. (1997). Species of Mind: the philosophy and psychology of cognitive ethology. MIT Press.

Astington, J. (1996). What is theoretical about the child’s theory of mind? In: Theories of Theories of Mind, eds. P. Carruthers and P.K. Smith. Cambridge University Press.

Atran, S. (1990). Cognitive Foundations of Natural History: Towards an anthropology of science. Cambridge University Press.

Atran, S. (1998). Folk biology and the anthropology of science: cognitive universals and cultural particulars. Behavioral and Brain Sciences, 21:547-568.

Atran, S. (2002). An experimental approach to the cognitive basis of science: universal and cultural factors in biological understanding. In: The Cognitive Basis of Science, eds. P. Carruthers, S. Stich and M. Siegal. Cambridge University Press.

Baddeley, A. (1988). Human Memory. Erlbaum.

Baillargeon, R. (1995). Physical reasoning in infancy. In: The Cognitive Neurosciences, ed. M. Gazzaniga. MIT Press.

Barkow, J., Cosmides, L. and Tooby, J. eds. (1992). The Adapted Mind. Oxford University Press.

Baron-Cohen, S. (1995). Mindblindness. MIT Press.

Bickerton, D. (1981). Roots of Language. Ann Arbor.

Bickerton, D. (1990). Language and Species. University of Chicago Press.

Bickerton, D. (1995). Language and Human Behavior. University of Washington Press. (UCL Press, 1996.)

Botterill, G. and Carruthers, P. (1999). The Philosophy of Psychology. Cambridge University Press.

Bowerman, M. and Levinson, S. eds. (2001). Language Acquisition and Conceptual Development. Cambridge University Press.

Byrne, R. (1995). The Thinking Ape. Oxford University Press.

Byrne, R. and Whiten, A. eds. (1988). Machiavellian Intelligence. Oxford University Press.

Byrne, R. and Whiten, A. eds. (1998). Machiavellian Intelligence II: Evaluations and extensions. Cambridge University Press.

Carey, S. (1985). Conceptual Change in Childhood. MIT Press.

Carey, S. and Spelke, E. (1994). Domain-specific knowledge and conceptual change. In: Mapping the Mind, eds. L. Hirschfeld and S. Gelman. Cambridge University Press.

Carruthers, P. (1992). Human Knowledge and Human Nature. Oxford University Press.

Carruthers, P. (1996). Language, Thought and Consciousness. Cambridge University Press.

Carruthers, P. (1998a). Thinking in language?: evolution and a modularist possibility. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Carruthers, P. (1998b). Conscious thinking: language or elimination? Mind and Language, 13:323-342.

Carruthers, P. (2000). Phenomenal Consciousness: a naturalistic theory. Cambridge University Press.

Carruthers, P. (2002). The roots of scientific reasoning: infancy, modularity and the art of tracking. In: The Cognitive Basis of Science, eds. P. Carruthers, S. Stich and M. Siegal. Cambridge University Press.

Chater, N. (1999). The search for simplicity: A fundamental cognitive principle? Quarterly Journal of Experimental Psychology, 52A:273-302.

Cheng, K. (1986). A purely geometric module in the rat’s spatial representation. Cognition, 23:149-178.

Choi, S. and Gopnik, A. (1995). Early acquisition of verbs in Korean: a cross-linguistic study. Journal of Child Language, 22:497-529.

Chomsky, N. (1976). Reflections on Language. Temple Smith.

Chomsky, N. (1988). Language and Problems of Knowledge. MIT Press.

Chomsky, N. (1995). The Minimalist Program. MIT Press.

Clark, A. (1998). Magic words: how language augments human computation. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Cohen, L.J. (1993). An Essay on Belief and Acceptance. Oxford University Press.

Cosmides, L. and Tooby, J. (1992). Cognitive adaptations for social exchange. In: The Adapted Mind, eds. J. Barkow, L. Cosmides and J. Tooby. Oxford University Press.

Curtiss, S. (1977). Genie: a psycholinguistic study of a modern-day wild child. Academic Press.

Davidson, D. (1975). Thought and talk. In: Mind and Language, ed. S. Guttenplan. Oxford University Press.

Davidson, D. (1982). Rational Animals. In: Actions and Events, eds. E. Lepore and B. McLaughlin. Blackwell.

Dawkins, R. (1976). The Selfish Gene. Oxford University Press.

de Villiers, J. (2000). Language and theory of mind: what is the developmental relationship? In: Understanding Other Minds: perspectives from autism and developmental cognitive neuroscience, eds. S. Baron-Cohen, H. Tager-Flusberg and D. Cohen. Cambridge University Press.

de Waal, F. (1982). Chimpanzee Politics. Jonathan Cape.

de Waal, F. (1996). Good Natured. Harvard University Press.

Dehaene, S. (1997). The Number Sense. Oxford University Press.

Dennett, D. (1991). Consciousness Explained. Penguin Press.

Diaz, R. and Berk, L. eds. (1992). Private Speech: from social interaction to self-regulation. Erlbaum.

Dickinson, A. (1994). Instrumental conditioning. In: Animal Learning and Cognition, ed. N. Mackintosh. Academic Press.

Dickinson, A. and Balleine, B. (2000). Causal cognition and goal-directed action. In: The Evolution of Cognition, eds. C. Heyes and L. Huber. MIT Press.

Dickinson, A. and Shanks, D. (1995). Instrumental action and causal representation. In: Causal Cognition, eds. D. Sperber, D. Premack and A. Premack. Oxford University Press.

Dummett, M. (1981). The Interpretation of Frege’s Philosophy. Duckworth.

Dummett, M. (1989). Language and communication. In: Reflections on Chomsky, ed. A. George. Blackwell.

Dummett, M. (1994). Origins of Analytical Philosophy. Harvard University Press.

Evans, J. and Over, D. (1996). Rationality and Reasoning. Psychology Press.

Fiddick, L., Cosmides, L. and Tooby, J. (2000). No interpretation without representation: the role of domain-specific representations and inferences in the Wason selection task. Cognition, 77:1-79.

Fodor, J. (1981). The present status of the innateness controversy. In his: RePresentations. Harvester Press.

Fodor, J. (1983). The Modularity of Mind. MIT Press.

Frankish, K. (1998a). Natural language and virtual belief. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Frankish, K. (1998b). A matter of opinion. Philosophical Psychology, 11:423-442.

Frankish, K. forthcoming. Mind and Supermind. Book typescript.

Gallistel, R. (1990). The Organization of Learning. MIT Press.

Gallistel, R. and Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44.

Goldman, A. (1989). Interpretation psychologized. Mind and Language, 4:161-185.

Goldman, A. (1993). The psychology of folk-psychology. Behavioral and Brain Sciences, 16:15-28.

Gómez, J. (1998). Some thoughts about the evolution of LADS, with special reference to TOM and SAM. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Gopnik, A. (2001). Theories, language and culture. In: Language Acquisition and Conceptual Development, eds. M. Bowerman and S. Levinson. Cambridge University Press.

Gopnik, A. and Meltzoff, A. (1997). Words, Thoughts and Theories. MIT Press.

Gopnik, A., Choi, S. and Baumberger, T. (1996). Cross-linguistic differences in early semantic and cognitive development. Cognitive Development, 11:197-227.

Gordon, R. (1986). Folk psychology as simulation. Mind and Language, 1:158-171.

Gordon, R. (1995). Simulation without introspection or inference from me to you. In: Mental Simulation: evaluations and applications, eds. T. Stone and M. Davies. Blackwell.

Gouteux, S., Thinus-Blanc, C. and Vauclair, S. (in press). Rhesus monkeys use geometric and non-geometric information during a reorientation task. Journal of Experimental Psychology: Gen. Proc., 130:505-519.

Harris, P. (1996). Desires, beliefs and language. In: Theories of Theories of Mind, eds. P. Carruthers and P.K. Smith. Cambridge University Press.

Harris, P. (2000). The Work of the Imagination. Blackwell.

Harris, P. (2002). What do children learn from testimony? In: The Cognitive Basis of Science, eds. P. Carruthers, S. Stich and M. Siegal. Cambridge University Press.

Hauser, M. (2000). Wild Minds. Penguin Press.

Hauser, M. and Carey, S. (1998). Building a cognitive creature from a set of primitives. In: The Evolution of Mind, eds. D. Cummins and C. Allen. Oxford University Press.

Heal, J. (1986). Replication and functionalism. In: Language, Mind and Logic, ed. J. Butterfield. Cambridge University Press.

Heal, J. (1995). How to think about thinking. In: Mental Simulation: evaluations and applications, eds. T. Stone and M. Davies. Blackwell.

Hermer, L. and Spelke, E. (1994). A geometric process for spatial reorientation in young children. Nature, 370:57-59.

Hermer, L. and Spelke, E. (1996). Modularity and development: the case of spatial reorientation. Cognition, 61:195-232.

Hermer-Vazquez, L., Spelke, E., and Katsnelson, A. (1999). Sources of flexibility in human cognition: Dual-task studies of space and language. Cognitive Psychology, 39:3-36.

Hirschfeld, L. and Gelman, S. eds. (1994). Mapping the Mind: domain specificity in cognition and culture. Cambridge University Press.

Horgan, T. and Tienson, J. (1996). Connectionism and Philosophy of Psychology. MIT Press.

Hughes, C. and Plomin, R. (2000). Individual differences in early understanding of mind: genes, nonshared environment and modularity. In: Evolution and the Human Mind, eds. P. Carruthers and A. Chamberlain. Cambridge University Press.

Hughes, L. (1993). ChimpWorld: a wind-tunnel for the social sciences. Ph.D. Thesis, Yale University.

Hurlburt, R. (1990). Sampling Normal and Schizophrenic Inner Experience. Plenum Press.

Hurlburt, R. (1993). Sampling Inner Experience with Disturbed Affect. Plenum Press.

Johnson-Laird, P. (1983). Mental Models. Cambridge University Press.

Koslowski, B. (1996). Theory and Evidence. MIT Press.

Kosslyn, S. (1994). Image and Brain. MIT Press.

Lakatos, I. (1970). The methodology of scientific research programmes. In: Criticism and the Growth of Knowledge, eds. I. Lakatos and A. Musgrave. Cambridge University Press.

Learmonth, A., Newcombe, N. and Huttenlocher, J. (2001). Toddlers’ use of metric information and landmarks to reorient. Journal of Experimental Child Psychology, 80:225-244.

Learmonth, A., Nadel, L. and Newcombe, N. (in press). Children’s use of landmarks: implications for modularity theory. Psychological Science.

Leslie, A. (1994). ToMM, ToBY and Agency: Core architecture and domain specificity. In: Mapping the Mind, eds. L. Hirschfeld and S. Gelman. Cambridge University Press.

Levelt, W. (1989). Speaking: from intention to articulation. MIT Press.

Liebenberg, L. (1990). The Art of Tracking: the origin of science. David Philip Publishers.

Lucy, J. (1992a). Language Diversity and Thought: a reformulation of the linguistic relativity hypothesis. Cambridge University Press.

Lucy, J. (1992b). Grammatical Categories and Cognition: a case-study of the linguistic relativity hypothesis. Cambridge University Press.

Lucy, J. and Gaskins, S. (2001). Grammatical categories and classification preferences. In: Language Acquisition and Conceptual Development, eds. M. Bowerman and S. Levinson. Cambridge University Press.

Malson, L. (1972). Wolf Children and the Problem of Human Nature. Monthly Review Press.

McDowell, J. (1994). Mind and World. MIT Press.

Mithen, S. (1990). Thoughtful Foragers: A study of prehistoric decision making. Cambridge University Press.

Mithen, S. (1996). The Prehistory of the Mind. Thames and Hudson.

Nelson, C. (1996). Language in Cognitive Development. Cambridge University Press.

Nichols, S. and Stich, S. (1998). Theory-theory to the max. Mind and Language, 13:421-449.

Nichols, S. and Stich, S. (forthcoming). Mindreading. Oxford University Press.

Pelegrin, J. (1993). A framework for analyzing prehistoric stone tool manufacture and a tentative application of some early stone industries. In: The Use of Tools by Human and Non-human Primates, eds. A. Berthelet and J. Chavaillon. Oxford University Press.

Perner, J. (1991). Understanding the Representational Mind. MIT Press.

Perner, J. (1998). The meta-intentional nature of executive functions and theory of mind. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Peterson, C. and Siegal, M. (1998). Representing inner worlds: theory of mind in autistic, deaf and normal hearing children. Psychological Science, 9:117-133.

Pinker, S. (1994). The Language Instinct. Penguin Press.

Pinker, S. (1997). How the Mind Works. Penguin Press.

Povinelli, D. (2000). Folk Physics for Apes. Oxford University Press.

Premack, D. (1986). Gavagai! Or the future history of the ape language controversy. MIT Press.

Sacks, O. (1985). The Man who Mistook his Wife for a Hat. Picador.

Sacks, O. (1989). Seeing Voices. Picador.

Savage-Rumbaugh, S. and Lewin, R. (1994). Kanzi, the ape at the brink of the human mind. John Wiley.

Schaller, S. (1991). A Man Without Words. Summit Books.

Segal, G. (1998). Representing representations. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Shallice, T. (1988). From Neuropsychology to Mental Structure. Cambridge University Press.

Smith, N. and Tsimpli, I-M. (1995). The Mind of a Savant: language-learning and modularity. Blackwell.

Spelke, E. (1994). Initial knowledge: six suggestions. Cognition, 50:433-447.

Spelke, E. and Tsivkin, S. (2001). Language and number: A bilingual training study. Cognition, 57:45-88.

Spelke, E. (in press). Developing knowledge of space: Core systems and new combinations. In: Languages of the Brain, eds. S. Kosslyn and A. Galaburda. Harvard University Press.

Spelke, E., Vishton, P. and von Hofsten, C. (1995). Object perception, object-directed action, and physical knowledge in infancy. In: The Cognitive Neurosciences, ed. M. Gazzaniga. MIT Press.

Sperber, D. (1996). Explaining Culture: a naturalistic approach. Blackwell.

Sperber, D. and Wilson, D. (1986). Relevance: communication and cognition. Blackwell. (Second edition, 1995.)

Sperber, D., Premack, D., and Premack, A. eds. (1995). Causal Cognition. Oxford University Press.

Stanovich, K. (1999). Who is Rational? Studies of individual differences in reasoning. Lawrence Erlbaum.

Stein, E. (1996). Without Good Reason. Oxford University Press.

Tomasello, M. (1999). The Cultural Origins of Human Cognition. Harvard University Press.

Vallortigara, G., Zanforlin, M. and Pasti, G. (1990). Geometric modules in animals’ spatial representations: a test with chicks. Journal of Comparative Psychology, 104:248-254.

Varley, R. (1998). Aphasic language, aphasic thought. In: Language and Thought, eds. P. Carruthers and J. Boucher. Cambridge University Press.

Varley, R. (2002). Science without grammar: scientific reasoning in severe agrammatic aphasia. In: The Cognitive Basis of Science, eds. P. Carruthers, S. Stich, and M. Siegal. Cambridge University Press.

Vygotsky, L. (1934). Thought and Language. Trans. Kozulin, MIT Press, 1986.

Walker, S. (1983). Animal Thought. Routledge.

Wason, P. and Evans, J. (1975). Dual processes in reasoning? Cognition, 3:141-154.

Wellman, H. (1990). The Child’s Theory of Mind. MIT Press.

Whorf, B. (1956). Language, Thought, and Reality. Wiley.

Wittgenstein, L. (1921). Tractatus Logico-Philosophicus. Routledge.

Wittgenstein, L. (1953). Philosophical Investigations. Blackwell.

Wynn, K. (1990). Children’s understanding of counting. Cognition, 36:155-193.

Wynn, K. (1995). Origins of mathematical knowledge. Mathematical Cognition, 1:35-60.

Wynn, T. (2000). Symmetry and the evolution of the modular linguistic mind. In: Evolution and the Human Mind, eds. P. Carruthers and A. Chamberlain. Cambridge University Press.


[1] Philosophers and logicians should note that Chomsky’s LF is very different from what they are apt to mean by ‘logical form’. In particular, sentences of LF don’t just consist of logical constants and quantifiers, variables, and dummy names. Rather, they are constructed from lexical items drawn from the natural language in question. They are also syntactically structured, but regimented in such a way that all scope-ambiguities and the like are resolved, and with pronouns cross-indexed to their referents and so on. Moreover, the lexical items will be semantically interpreted, linked to whatever structures in the knowledge-base secure their meanings.

                Note, too, that an appeal to LF isn’t strictly necessary for the purposes of the main thesis of this paper. I use it more by way of illustration, and for the sake of concreteness. All that is truly essential is that there should exist a separate mental faculty for processing natural language, with both input and output functions, and that this faculty should deal in structured representations.


[2] Admittedly, developmental psychologists have - until very recently - tended to down-play the significance of testimony (and hence of language) in child development. Following Piaget, they have mostly viewed children as individualistic learners - acquiring information for themselves, and developing and testing theories in the light of the information acquired (e.g. Gopnik and Meltzoff, 1997). See Harris (2002), who makes a powerful plea for the role of testimony to be taken much more seriously in accounts of child development.


[3] This is, in fact, a weak version of the Whorfian hypothesis, to be discussed in its strongest form in section 3.2 below.

[4] This date for the first appearance of fully-syntactic natural language seems to be quite widely adopted amongst cognitive archaeologists - see Mithen, 1996 - so I, too, propose to accept it (albeit tentatively) in what follows. But it is, of course, still highly controversial. And it should be noted that at least some of the evidence for it turns on assumptions about the cognitive role of language.


[5] Here and throughout the remainder of this paper I shall use the term ‘module’ loosely (following Smith and Tsimpli, 1995, and others) especially when talking about central-process, or conceptual, modules. (Another option would have been to use the stylistically-barbaric term ‘quasi-module’ throughout.) While these systems might not be modular in Fodor’s classic (1983) sense – they will not have proprietary inputs, for example, and might not be fully encapsulated – they should be understood to conform to at least some of the main elements of Fodorian modularity. As I shall henceforward understand it, modules should be innately channeled (to some significant degree) and subject to characteristic patterns of breakdown; their operations might be mandatory and relatively fast; and they should process information relating to their distinctive domains according to their own specific algorithms.

[6] Note that the computer programme ChimpWorld, which successfully simulated chimpanzee behaviors and social structures without deploying higher-order thoughts, nevertheless did employ structured propositional representations. (Hughes, 1993, reported in Povinelli, 2000.)

[7] What is the status of arguments which take the form, ‘It is very hard to see how otherwise’? Do they merely reflect a lack of imagination on our part? Perhaps. But a more sympathetic gloss is that these are just standard arguments to the best available explanation. All theorizing, of course, in whatever discipline, has to work with those theories which can be imagined, or thought of, to explain the data. And it is often the case that there is only one theory which can be thought of to explain a given set of data.

[8] How does this square with the data mentioned earlier, that there is nevertheless a substantial correlation between language ability and theory of mind in young children? Well, first, the Hughes and Plomin finding is that the genes for theory of mind and for verbal intelligence are not wholly independent of one another. And second, a quarter of the variance in theory of mind ability comes from the environment: and this may well be linguistically mediated in one way or another.

[9] Note that geographical information isn’t the same as geometric information; and nor do the kinds of fact in question require integration with geometry. (Knowing that Paris is the capital of France doesn’t need geometry.) So the finding that recall of geographical information is independent of language isn’t inconsistent with the thesis that language is necessary to integrate geometric information with information of other kinds.


[10] I follow the usual convention of using small capitals for Mentalese expressions, quotation-marks for natural language expressions, and italic to designate contents (as well as for emphasis).

[11] In order to do this, the language faculty would need the resources to know that the short wall on the right is the blue wall. This might be possible if the outputs of the central modules are always partly indexical, as suggested in section 5.1 above, with referring elements tied to the contents of current perception. For we can surely suppose that the contents of perception are integrated - one’s current perceptual state will contain a representation of the spatial layout of the room together with the colors of the walls. See Carruthers (2000, ch.11) for an argument that the demands of practical reasoning in relation to the perceived environment require an integrated perceptual field.

[12] I haven’t been able to find from my reading any direct evidence that trackers will also place weight upon the relative simplicity and internal consistency of the competing hypotheses. But I would be prepared to bet a very great deal that they do. For these are, arguably, epistemic values which govern much of cognition in addition to hypothesis selection and testing (Chater, 1999).


[13] Other sorts of dual-task study can be imagined which would test the thesis that language is implicated in the normal operations of the mind-reading faculty. For example, subjects might be asked to solve a pictorial version of one of the standard false-belief tasks while shadowing speech (in one condition) and while shadowing rhythm (in the other). But note that, given the different versions of the proposal discussed in section 6.4 above, care would need to be taken in choosing a mind-reading task. To test the stronger of those proposals (that language is required for the very idea of false belief), any sort of false-belief task would do; but to test the weaker proposal (that language is involved in drawing inferences from attributed beliefs) a more complex series of pictures might be required.

[14] My thanks to Colin Allen, José Bermudez, George Botterill, Daniel Dennett, Edouard Machery, David Papineau, Josef Perner, Michael Siegal, Liz Spelke and an anonymous referee for BBS for their comments on an earlier draft.