
Adaptation and Representation

Description: The notion of representation plays a central role in philosophy, biology, ethology, linguistics and neuroscience. However, despite its importance, it has not yet received an entirely satisfactory definition. One crucial question is whether it can be given a biological basis or, in more philosophical terms, whether it can be naturalized. If there are innate representations, are they adaptations, i.e., are they optimal relative to the environmental pressures that are supposed to have triggered them? If, as seems possible, the notion of adaptation should be modified, what would the consequences be for an adaptation-based notion of representation? If one adopts an externalist view of adaptation, in as much as representational systems are optimal responses to external (environmental) pressures, does that automatically lead to an externalist semantic view according to which the content of a representation is determined by the external object being represented?

Given all these questions, it is high time to synthesize the point of view of biologists and their work on the notion of adaptation with the considerations of the scientists who work on evolved knowing systems. It is at least relevant, if not necessary, to redefine the notions of 'evolving' system and 'cognitive' system. It is the goal of the present web conference to engage these pressing questions and to bring together the points of view of theoretical biologists and philosophers on the notion of adaptation and of the scientists who use it in everyday practice to formulate experimental protocols which, hopefully, will lead to explanations of animat or animal behavior.

Moderators:
Gloria Origgi (CNRS, Institut Jean-Nicod), Anne Reboul (CNRS - Institut des Sciences Cognitives, Lyon), Adrianna Wozniak (School of Computer Science, University of Windsor, Canada)

Guest Panel: André Ariew (University of Missouri-Columbia), Teresa Bejarano (University of Sevilla, Spain), Andrew Brook (The Institute of Cognitive Science at Carleton University in Ottawa, Canada), Peter Carruthers (University of Maryland), Valérian Chambon (Institute for Cognitive Science), Nicolas Claidière (Institut Nicod), Andy Clark (University of Edinburgh, Scotland), John Collins (University of East Anglia, Norwich), Stephen Cowley (University of Hertfordshire, United Kingdom), Daniel Dennett (Center for Cognitive Studies at Tufts University), Ophelia Deroy (Institut Jean Nicod), Gordana Dodig-Crnkovic (Department of Computer Science and Electronics, Mälardalen University, Sweden), John Dupré (University of Exeter), Keith Frankish (Open University), Hajo Greif (Interuniversity Research Centre for Technology, Work and Culture, IFZ), Jose Luis Guijarro (Department of Philosophy and Literature, University of Cadiz, Spain), Benoît Hardy-Vallée (University of Waterloo, Ontario), Philippe Huneman (Institut d'Histoire et de Philosophie des Sciences et des Techniques), Frédéric Kaplan (École Polytechnique Fédérale de Lausanne), Alexander Kravchenko (Baikal National University of Economics and Law, Russia), Ignazio Licata (Institute for Basic Research, Florida), Françoise Longy (Institut d'Histoire et de Philosophie des Sciences et des Techniques), Marie-Claude Lorne (Institut d'Histoire et de Philosophie des Sciences et des Techniques), Edouard Machery (University of Pittsburgh), Marek McGann (Department of Psychology, University of Limerick, Ireland), Christophe Menant (IBM, France), Hugo Mercier (Institut Nicod), David Meunier (Institute for Cognitive Sciences, Lyon, France), Marcin Milkowski (Institute of Philosophy and Sociology, Polish Academy of Sciences, Section for Logic & Cognitive Science), Olivier Morin (Institut Nicod), Gualtiero Piccinini (University of Missouri), Georges Rey (University of Maryland, College Park, USA), Dan Ryder (Department of Philosophy, University of Connecticut), Colin Schmidt (Institute of Informatics, Université du Maine), Chris Sinha (University of Portsmouth), Barry Smith (University of Buffalo), Dan Sperber (Institut Nicod, CNRS), Paola Zizzi (Department of Pure and Applied Mathematics at the University of Padova).

INTERDISCIPLINES.ORG
Adaptation and Representation

In partnership with: the Institut des Sciences Cognitives and the Université de Genève


- Natural Intensions
Ron Chrisley (Department of Informatics, University of Sussex, UK)

- On the cultural adaptation of linguistic representations: functional, environmental, and cognitive constraints.
Pierre-Yves Oudeyer (Sony Computer Science Lab, Paris) and Frédéric Kaplan (École Polytechnique Fédérale de Lausanne)

- The empirical content of the notion of linguistic mental representations.


Wolfram Hinzen (Durham University)

- Representation in digital systems.


Vincent C. Müller (American College of Thessaloniki, Greece)

- Population Thinking, Darwinism, and Cultural Change


Peter Godfrey-Smith (Harvard University)

- Representational Requirements for Evolving Cultural Evolution


Joanna Bryson (Department of Computer Science, University of Bath, UK)

- Content From Development


Nicholas Shea (Faculty of Philosophy, University of Oxford)

- An Evolutionary Solution to the Radical Concept Nativism Puzzle


Murray Clarke (Concordia University in Montreal, Canada and Carleton University's Institute of Cognitive Science in Ottawa, Canada)

- Ideas that stand the [evolutionary] test of time


Frédéric Bouchard (Université de Montréal)

- The complex vehicles of human thought and the role of scaffolding, internalisation and semiotics in human representation
Robert Clowes (University of Sussex)

- The Theory of Biological Adaptation and Function


Robert Brandon (Duke University)

- The Evolution of Misbelief


Ryan McKay (School of Social Sciences and Liberal Studies, Charles Sturt University, Australia). This paper is co-authored by Daniel Dennett.

- Functions, Modules and Dissociation: A Quibble


Bruce Glymour (Kansas State University)

- The Dynamic Nature of Representation


Mark Bickhard (Lehigh University)

- Adaptation and representation: an introduction


Anne Reboul and Adrianna Wozniak (Institute for Cognitive Sciences, CNRS)


Natural Intensions
Ron Chrisley (Department of Informatics, University of Sussex, UK)
(Published: 29 October 2007)

Abstract: There is an attractive way to explain representation in terms of adaptivity: roughly, an item R represents a state of affairs S if it has the proper function of co-occurring with S (that is, if the ancestors of R co-occurred with S and this co-occurrence explains why R was selected for, and thus why R exists now). Although this may be an adequate account of the extension or reference of R, what such explanations often neglect is an account of the intension or sense of R: how S is represented by R. No doubt such an account, if correct, would be complex, involving such things as the proper functions of the mechanisms that use R, the mechanisms by which R fulfills its function, and more. But it seems likely that an important step toward such an account would be the identification of the norms that govern this process. The norms of validity and Bayes' Theorem can guide investigations into the actual inferences and probabilistic reasoning that organisms perform. Is there a norm that can do the same for intension-fixing? I argue that before this can be resolved, some problems with the biosemantic account of extension must be resolved. I attempt to do so by offering a complexity-based account of the natural extension of a representation R: for a given set of ancestral co-occurrences Z, the natural extension is the extension of the least complex intension that best covers Z. Minimum description length is considered as a means for measuring complexity. Some advantages of and problems with the account are identified.

Paper: There is an attractive way to explain representation in terms of adaptivity; roughly:

An item R represents a state of affairs S if it has the proper function of co-occurring with S (that is, if the ancestors of R co-occurred with S and this co-occurrence explains why R was selected for, and thus why R exists now; cf., e.g., Millikan 1984).

Call any such explanation a biosemantic account. Although biosemantic accounts may be adequate explanations of the extension or reference of R, what they often neglect is an account of the intension or sense or aspectual shape of R: how S is represented by R. As Fodor puts it, "Darwin cares how many flies you eat, but not what description you eat them under." (Fodor 1990:73, original emphasis). [Philosopher's nit-pick: In offering this slogan Fodor assumes, incorrectly, and possibly unintentionally, that all senses are descriptive. To avoid leaving open the possibility of a teleological account of non-descriptive (e.g. demonstrative) sense, we can modify the slogan to be: "Darwin cares how many flies you eat, but not what mode of presentation you eat them under."] If Fodor is right, a full account of representation needs to address this neglect.

For some people, that's a big "if". Millikan herself, for example, thinks that the biosemantic approach already has the materials necessary to give an account of intensions. Call it the "consumer is always right" story: roughly, the way R represents S depends on the proper functions of the processes that use R. So a state of the visual system in a frog that detects and represents flies represents them as flies (and not black dots, say) because the mechanism that uses these representations, the frog's tongue-snapping mechanism, only fulfills its proper function of getting food into the frog's stomach when it snaps at flies.

The "consumer is always right" story is not without its own problems:

1. Circular: The "consumer is always right" story seems to beg, rather than answer, the question of aspectual shape. Fodor's slogan applies even more directly to the case of the tongue-snapping consumer of R than it does to R itself. Snapping at "black dots" will reliably allow the mechanism to fulfill its proper function in any environment where the cost of snapping at non-fly black dots is outweighed by the benefit of snapping at black dots that are flies. Explaining the intensions of representations in terms of the intensions of the proper functions of consumers of those representations is circular.

2. Non-deterministic I: The story massively underdetermines the intension of R, because there are an infinite number of intensions that correctly characterize the things that allow, e.g., the tongue-snapping mechanism to fulfill its proper function: flies, yes, but also flies smaller than a house, flies larger than an atom, fly-sized organisms, fly-sized animals, fly-sized insects, digestible fly-sized stuff, etc.

3. Non-deterministic II: Representations typically have more than one consumer. Or they may exist for some time without having any consumers. In either case, which consumer's proper function are we to appeal to in determining the intension of the representation in question? It would seem that there is no fact of the matter in these cases as to how the representation is representing the world. Why, then, should we expect there to be a fact of the matter in cases where, by chance, there happens to be only one consumer? You don't turn the duck-rabbit into an unambiguous rabbit figure by killing off every person except those who have no concept of duck.

4. Facile: The story makes it an a priori truth that there is a match between the intension of R and the proper functions of the consumers of R; the consumer is always right, we are told. This is convenient; too convenient.
A more satisfying account would give independent accounts of R's intension on one hand, and the proper functions of the consumers of R on the other, and explain the fitness of some organisms in terms of a match between these, by way of contrast with the low fitness of those organisms that did not benefit from such a match. But this can only be done if one's account does not render such a contrast conceptually impossible. Millikan and the other biosemanticists may have successfully addressed most or all of these problems. On the other hand, there may be problems with the story, not listed above, that they have not dealt with adequately. In any case, what follows will assume an interest in the general biosemantic approach to providing an adaptivity-based explanation of representation, but also an interest in finding an alternative to the "consumer is always right" account of intension.

One might start out by trying to apply the notion of proper function to intension in a way similar to how it was applied to extension. Having this or that intension is just another property of R, so one could say:

Direct biosemantic account of intension (first try): R has the proper function of having intension I just in case the ancestors of R had intension I and their having that property explains why R was selected for, and thus why R exists now.

This suggestion fails to reproduce an important feature of the direct biosemantic account of extension. The power of that account lies in the fact that it promises to explain an intentional relation in terms of non-intentional (or non-problematically intentional) relations, by reducing "referring to S" to "having the proper function of co-occurring with S". Reference is replaced with mere co-occurrence and the (purportedly) naturalistically acceptable notion of a proper function.

The first try at a direct biosemantic account of intension does not reproduce this feat. It fails to reduce the having of an intension to the proper function of having some other property. It gives us an account of R having the proper function of having intension I, in terms of R having intension I, when what we want is an account of R having intension I in the first place. A better attempt would be of the form:

Direct biosemantic account of intension (second try): A representation R has the intension I if it has proper function X.

The problem is that no one to my knowledge has any idea of what could be non-circularly substituted for X.

Perhaps such a close parallel with reference is unnecessary. The reason for invoking proper functions was to allow us to explain an intentional relation in terms of naturalistic ones. But once that feat is achieved, why must it be repeated? Once we have broken into the intentional circle via extension, it may be possible to explicate other intentional notions, especially intension, in terms of extension and other purely naturalistic notions, without needing to invoke proper functions again. No doubt such an account would be complex, appealing to the mechanisms by which R fulfils its extension-fixing proper function, and perhaps even the mechanisms by which the consumers of R fulfill their proper functions.

On the other hand, once one has an account of reference in place, it might be that giving an account of the intension of a particular representation is a relatively straightforward affair, having only to do with its causal role. Given a representation and its referent, there is a set of intensions that determine that referent; the intension of the representation is the member of this set that best matches the causal role of that representation in the cognitive economy of the organism. If one wants to add a biosemantic twist, one can deem these causal roles to be the missing X in the second try, above, and deem the actual intension of R to be whatever causal role explains the success of R's ancestors.

II

Beyond the suggestions just given, I will not try to give a complete account of intension here. However, an important step toward such an account, whether or not it is of the suggested form, may be made by identifying the norms that govern intensions. Just as the norms of validity and Bayes' Theorem can guide investigations into the actual inferences and probabilistic reasoning, respectively, that organisms perform, so also there may be a norm or set of norms that can do the same for an investigation into intension-determination. That is, answering the question "what intension is this organism using?" may be made easier, especially in an adaptationist context, by first answering the question "what intension should this organism be using?".

Some might balk at the idea of an intension being right or wrong. As long as you represent true things about an object, isn't how you represent that object a matter of preference? But examples of norms on intension use are easy to find:

Communication: If Tom asks you "Will Mike be at the seminar today?", it is in some sense incorrect to answer "Lefty will be there", even if Mike is Lefty, if you have reason to believe that Tom doesn't know Mike is Lefty. If you don't think proper names have intensions, substitute "the smartest guy in town" and "the tallest guy on campus" for "Mike" and "Lefty".

Deductive inference: On traditional views, the inference:

The Morning Star is bright
The Morning Star is far away
Therefore, there is something that is both bright and far away

is valid, whereas the inference:

The Morning Star is bright
The Evening Star is far away
Therefore, there is something that is both bright and far away

is not. The choice of the intension of "The Evening Star" is inappropriate in this context. A similar point can be made for the case of inductive inference.

These examples are offered to establish that the idea of uses of intensions being governed by a norm makes sense. With this established, one can revisit the suggestions at the end of the previous section in a different light: instead of an account of the actual intension of a representation, perhaps the suggested accounts provide us with a norm to be used in providing such an account. That is, intensions are causal roles, and R should have the causal role that explains why R's ancestors were selected for (although it may not actually have that intension).

Cognitive scientists may have their worries. It might be that temporal externalism (historical determination) of reference isn't a problem for psychological explanation, since many cognitive scientists (especially computationalists) lean toward a kind of internalism in which the extension of a representation plays no causal (and therefore no explanatory) role anyway. But if a biosemantic account is given of intension as well, then it might be thought that this poses a threat to causal/computational psychological explanation, since intensions will be individuated temporally externally (historically), while only temporally local distinctions can make a causal difference in the here and now. (As far as I am aware, the earliest use of the phrase "temporal externalism", at least in this sense, is in (Chrisley 1993).) Millikan (Millikan 1993) considers this worry, but rejects the causal model of explanation that fuels it, preferring instead a biological model of explanation, in which history can and does play a role. But for those who think assimilation into biology isn't natural enough for true naturalization, the worry may remain. It needn't: the proposal here is only that biosemantics (with its temporal externalism) provides the norm for what counts as the best (or most natural) intension in a given context; the current causal powers of R themselves determine what the actual intension of R is. The possibility of a computational psychology is restored.

III

In the midst of this apparently happy situation, a problem lurks. I have until now assumed that the biosemantic account of extension was fine, and that all that was needed was a supplemental account of intension, be it biosemantic or otherwise. But upon further reflection, the direct biosemantic account of extension cannot be what we want.

Consider the case of a hypothetical biological representation T: a tiger detector. The direct account would have it that T represents the presence of tigers because it was the co-occurrence of the ancestors of T with tigers that explains why T was selected for and is thus here today. Hence no matter what the status of the intension of T may be, the extension of T is clear: tigers.

But something funny has happened here. Suppose there had been 100 ancestors of T, T1 to T100, and each of them enabled the organism it had been in to survive by being tokened in the presence of a particular tiger: Tony1 to Tony100. Then surely the set Z = {Tony1, Tony2, ..., Tony100} is the true extension of T, according to the biosemantic account. For it was the co-occurrence of the ancestors of T with the members of Z that explains why T is around today. But if Z is the extension of T, then T can never be true of any tiger in existence today; thus, all T-tokenings are misrepresentations (unless, by chance, Tony100 still happens to be alive and in the vicinity; but you get my point). Further, the intension of T cannot be "tigers" or anything usefully general like that, as the extension of that intension includes, obviously, all tigers that have been, are now, or are yet to come. And that extension is a lot bigger than Z (to put it mildly). This can be seen as an extensional version of the problem of reduced content (Peacocke 1992, pp. 129-132).

So if Z isn't the extension of T, what is? And what is the relation, if any, between it and Z? Let's call Z the proto-extension of T. The challenge is to use Z to find something that determines a more plausible extension of T (such as the set of all tigers). Once we have the extension of T, we can then worry about its intension. Yes, it could very well be that the proto-extension Z determines the intension of T directly, and this in turn is what determines the true extension of T. But we shouldn't assume this restriction at the outset. In particular, I want to leave room for the possibility that while the proto-extension of T, Z, determines a "natural" intension I that in turn determines the true extension of T, the actual intension of T may be distinct from I (even though it will necessarily have the same extension as I).

It may be hard to imagine how one can get from Z to I: clearly, I cannot be one of the intensions that have Z as their extension, since we want I's extension to be something much bigger (and more useful) than Z, such as the set of all tigers. On the other hand, we don't want Z to be extensionally irrelevant to I; as the proto-extension, Z should play some role in determining I and thus T's true extension. How are we to find this middle way?

By way of closing, I offer an answer to this question based on the notion of complexity. There may be other, better, ways of doing it, but the following can at least serve as an example of what a solution looks like. The proposal is this:

Natural extension: Given a representation T with ancestral proto-extension Z, the natural extension of T is the extension of whatever intension I makes the best trade-off between covering Z on the one hand, and being of low complexity on the other.

Before discussing the trade-off itself, let me clarify each of the quantities being held in the balance. By covering Z, I just mean how well the extension of I matches Z. There are two kinds of extensional error possible here: inclusive and exclusive. The inclusive error of I relative to a proto-extension Z is the set of objects that are in I's extension but are not in Z; the exclusive error is the set of objects that are not in I's extension but are in Z. A rough measure of the total extensional error of I relative to Z is the size of the union of these inclusive and exclusive error sets (the case of intensions with infinite extensions requires a more sophisticated treatment, but there is no space to develop that here).

Clearly, extensional error is minimized when the extension of I just is Z: no inclusive and no exclusive error. So any intension that has Z as its extension would do. This would be useless for our purposes, however, since we want a principled way of establishing the extension of T to be something much larger than Z.

That is the point of the trade-off. What prevents I from being an intension with Z as its extension is the need to balance covering Z well against optimizing something else. In my proposal, that something else is simplicity (low complexity). For example, one could designate a canonical language L within which to express any given intension, and deem the complexity of an intension I to be the minimum description length of I in L: the smallest number of words it takes to express I in L.


To see how this would work, consider the above example of T, its 100 ancestors T1 to T100, and its ancestral proto-extension Z = {Tony1, Tony2, ..., Tony100}. Let L be English. Suppose, for the sake of explication:

- There have only ever been, and will only ever be, 150 tigers;

- There are only two candidate intensions, I1 = "Tony1 or Tony2 or ... or Tony100" and I2 = "tigers";

- The total cost of an intension I relative to Z is just the extensional error of I relative to Z plus the minimum description length of I in L.

Then the cost of I1 is:

0 extensional error + at least 100 (the 100 Tonys in its canonical expression in L) = at least 100.

But the cost of I2 is:

50 (inclusive) extensional error (the 50 tigers in the extension of I2 that are not in Z) + 1 (the length of I2's canonical expression in L) = 51.

So the natural extension of T is the extension of I2, that is, the set of all 150 tigers. This is so even though the proto-extension Z contained only 100 of those tigers.
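The cost comparison above can be sketched in a few lines of Python. This is an illustrative toy, not part of the paper's formal apparatus: the set names are invented, and description length is crudely approximated as a word count of a canonical expression.

```python
def extensional_error(extension, Z):
    """Total error: size of the union of the inclusive and exclusive error
    sets, i.e. the symmetric difference between I's extension and Z."""
    return len(extension ^ Z)

def natural_extension(candidates, Z):
    """Pick the intension minimizing extensional error plus description
    length. `candidates` maps an intension's canonical expression (a tuple
    of words of L, here English) to its extension (a set of objects)."""
    def cost(expr):
        return extensional_error(candidates[expr], Z) + len(expr)
    best = min(candidates, key=cost)
    return best, cost(best)

# The worked example: 150 tigers in all, 100 ancestral co-occurrences in Z.
all_tigers = {f"Tony{i}" for i in range(1, 151)}
Z = {f"Tony{i}" for i in range(1, 101)}
I1 = tuple(sorted(Z))   # "Tony1 or Tony2 or ... or Tony100": 100 content words
I2 = ("tigers",)        # a single word
best, c = natural_extension({I1: set(Z), I2: all_tigers}, Z)
# cost(I1) = 0 + 100 = 100; cost(I2) = 50 + 1 = 51, so I2 wins
```

Note that the free-parameter worry discussed below appears here as the implicit 1:1 exchange rate between one object of error and one word of description length.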

In addition to being a solution to our problem, this account has some salutary side-effects. There is only room enough for one example here. Sceptics of the biosemantic approach often wonder: how many generations does an item T have to enable avoiding a tigery death before it becomes a tiger detector? This account gives a definite answer: at whatever point the tiger intension becomes the intension with the best trade-off between simplicity and covering the things that co-occurred with the ancestors of T.

On the other hand, there are some obvious difficulties with the simplistic version of the account presented here. For starters:

- Relativity: Since complexity is relative to the canonical language L being used, so also is the natural extension of a representation.

- Free parameter: Which extension turns out to be natural depends on how one calculates the trade-off between extensional error and simplicity. How many objects in the error set is a one-word reduction in description length worth? What is the proper exchange rate between these quantities? Is there a fact of the matter about this? How could we possibly come to know it?

- Complexity is not the same as length: Some concepts seem more costly than others. If so, an expression of an intension might be quite short (one word, even) but intuitively complex in that it uses an expensive concept, whereas the expression of another, possibly even co-extensive, intension might be longer but only use simple, cheap concepts.

- Whose complexity?: The account as expressed above uses complexity in the theorist's language L as a determinant of the natural extension. But surely it is complexity in the organism's own language that matters? If so, what about organisms that do not possess a language? How are we to measure the complexity of their intensions? Does it make sense to talk about the complexity of an intension within a conceptual (or non-conceptual) scheme?

- Non-determinism: What about cases in which there is more than one intension with the same overall score but different extensions? Which extension is the natural extension?

All of these are fair points, and deserve attention, response and rebuttal that cannot be given here; they should, to be sure, be the jumping-off point for further development of the account. However, once the notion of a natural extension is in place, the question of what the natural or right intension is, as well as the possibly distinct question of what the actual intension is, may finally be addressed.

References

Chrisley, R. (1993) "Externalism before language: The real reason why 'thoughts ain't in the head'". Paper read to the University of Sussex Philosophy Society, March 12th, 1993. Available at http://www.cogs.susx.ac.uk/users/ronc/papers/externalism.pdf.
Fodor, J. (1990) A Theory of Content and Other Essays. Cambridge: MIT Press.
Millikan, R. (1984) Language, Thought and Other Biological Categories: New Foundations for Realism. Cambridge: MIT Press.
Millikan, R. (1993) "White Queen Psychology; or, The Last Myth of the Given", in Millikan, R., White Queen Psychology and Other Essays for Alice. Cambridge: MIT Press.
Peacocke, C. (1992) A Study of Concepts. Cambridge: MIT Press.


On the cultural adaptation of linguistic representations: functional, environmental, and cognitive constraints.
Pierre-Yves Oudeyer (Sony Computer Science Lab, Paris) and Frédéric Kaplan (École Polytechnique Fédérale de Lausanne)
(Published: 15 October 2007)

Abstract: The analogy between language evolution and biological evolution has been proposed many times since Darwin's theory of natural selection. Through the review of several computational models, we argue in this paper that this analogy must be brought down to the details, rather than remaining at a general verbal level; otherwise misconceptions may be formed. We also illustrate how operational conceptualizations of this analogy can set the ground for a refoundation of linguistics.

Paper:

1. The evolution of linguistic representations as a cultural Darwinian process

Ever since the elaboration of the theory of natural selection by Charles Darwin, researchers have proposed that the mechanisms of language evolution may have strong similarities with the mechanisms of biological evolution (Schleicher, 1863). More recently, several kinds of analogies have been proposed. A first kind tries to map units and structures in the genetic space directly to units and structures in the linguistic space (Berlinski, 1972; Searls, 2002; Stegmann, 2004). A second kind of parallel was developed in which the focus was on the Darwinian process of evolution rather than on the units themselves (Mufwene, 2005; Croft, 2000, 2002; Steels, 2004). The process common to genome and language evolution is here the following: 1) there exists a population of units capable of replication; 2) replication is not perfect: modifications can appear; 3) the units have different levels of efficiency in replication, which produces differential replication. This high-level formulation, sometimes conceptualized as a generalization of Darwin's theory of natural selection (Hull, 1988), has the advantage of not specifying the structure of the units or the mechanisms of replication and variation. And indeed, researchers found ways to instantiate it in biological or language evolution by filling in those missing slots with the corresponding specific structures and mechanisms (Croft, 2000). As far as biology is concerned, the units are genes, the mechanisms of replication are those associated with meiosis/mitosis, and the mechanisms of variation are mutation and cross-over. As far as language is concerned, a wide variety of instantiations have been proposed.
The units of replication were conceived as ideas, mnemotypes, idene, culturetype, socio-genes, tuition (van Driem, 2003), ranging from simple abstract concepts like words or expressions to complex neural structures implementing associations between phonological forms and meaning. Perhaps the most well-known notion of cultural unit of replication is the meme introduced by (Dawkins, 1976). Linguistic memes, sometimes called linguemes (Croft, 2000), are themselves a population of very diverse kinds of units: phonological features, phonemes, syllables, rules of phoneme sequencing, lexicons, rules of syntax, semantic categories, systems of world categorization, constructions mapping combinations of words and complex meanings, prosodic structures, social conventions involving gestures and gaze to coordinate linguistic interactions, etc. Dawkins gives imitation as an example of mechanism of replication for units for language evolution. As a matter of fact, all kinds of linguistic activities, which can be much more complex than just imitation, like conversation or reading, provoke the replication of linguistic units. The consequence is that leaping from brain to brain is a very complex process that can happen through a variety of mechanisms. What provokes variation is therefore also very diverse:

bad perception, erroneous interpretation, exaggeration, etc.

This shows that the conceptualization of language evolution as a Darwinian process may take quite different forms for different authors and is often presented only at a rather general level, especially in the memetics literature. Yet we argue in this paper that, in order to be useful, this conceptualization must be precise, detailed and operational. Indeed, if language evolution is a Darwinian process, then many of its features are systemic: they are the outcome of the complex interactions between replicators, replicating mechanisms and various kinds of constraints (e.g. learning biases, function or environment). Depending on each particular mechanism and on the particular ecological constraints, very different cultural dynamics may arise, and the adaptation of linguistic representations may (or may not) happen in many different manners. To make this point, we will review a number of computational models of the origins and evolution of language, showing the variety of replicating units and replication mechanisms that one can encounter at various levels, and the consequences they have on actual language evolution.

The common point of all the computational experiments we present is that they consist of populations of agents initially devoid of linguistic conventions, which progressively and culturally build, in each case, a new (simple) linguistic system. A first series of examples will focus on the functional constraints imposed on linguistic replicators by their use as communication systems. In particular, we will see that only very specific mechanisms of replication allow for the efficient formation of shared linguistic conventions. Then we will present an experiment showing how linguistic replicators can adapt and evolve under the specific constraints imposed by the external environment. Finally, we will review an experiment studying the role of learning biases in the replication process and see how they can influence the evolution of linguistic representations.

2. Functional constraints on Darwinian dynamics

Linguistic replicators have specific properties compared to biological replicators: they form a system that permits communication. For instance, in a vocabulary in which words are associated with concepts/meanings and are used to draw the attention of other speakers towards a particular referent in a given context, synonymy and homonymy tend to be reduced, ensuring efficient communication. We will see that not every differential replication process permits the emergence of such communication systems. Typically, each linguistic interaction involves the semiotic triangle: there is a form (e.g. a word), an associated meaning, and an associated referent in a particular context. This entails that three kinds of entities can be replicated through communication: forms, meanings, and associations between certain forms and certain meanings. As a matter of fact, each of these kinds of entities consists itself of a variety of entities which are also replicators. For example, words are composed of sounds like vowels and consonants, which can be grouped into syllables through sets of phonotactic rules, which can themselves be sequenced according to certain rules to build up words, and all these hierarchically organized entities can replicate differentially. Although all these replicators constantly interact, the experiments we will now present make a number of simplifications that allow us to develop a better understanding of the fundamental dynamics associated with various functional, internal and environmental constraints. For example, in the first experiment we will describe, we will suppose that there is only one meaning in the world that the agents inhabit, and that two possible words can be associated with this meaning. This experiment will show which basic properties a replication mechanism must have for a simple convention to be adopted by a population, i.e. for linguistic coherence to be reached (speakers associate the same word with the same meaning). This experiment will then be made more complex in the following sections, allowing us to study progressively
more complex phenomena like linguistic distinctiveness (speakers associate different words with different meanings; next section).

2.1 Linguistic coherence and replication mechanisms: simple epidemiologic models are not enough

Let us consider a simple problem: N agents have to choose between two conventional names c1 and c2. We will consider three simple models, representative of many more complex ones studied in the field. The first model is imitation-based (model A) and the two others are frequency-based (models B and C). In model A, the speaker simply produces the conventional name he last heard as a listener. In model B, the speaker produces the name that he has heard most frequently as a listener. In model C, the speaker produces any name that he has heard as a listener, with a probability proportional to its frequency. These three types of replication processes could all be seen as plausible models of how cultural replication occurs. Yet, results show that the dynamics they entail are quite different (for details, see Kaplan (2005)). With the imitation-based model A, the population eventually converges to a state of complete coordination. However, convergence typically happens only after a very long series of oscillations. With model B, convergence towards a single conventional name also occurs, but the oscillations observed are much smaller. As soon as one convention spreads more widely in the population than the other, its domination amplifies over time and convergence happens quickly. With model C, on the contrary, the dynamics tend to maintain the distribution of c1 and c2 over time, after an initial drift, and so there is no convergence. An in-depth study of these three models reveals that, despite their apparent similarity, the types of dynamics they create are extremely different. Among the three models studied, only model B creates self-reinforcing dynamics that permit a fast coordination of the entire population towards the use of a single conventional name. Model A is approximately equivalent to a random walk, converging in quadratic time, and is thus impractical in realistic settings. Finally, the dynamics of model C tend to maintain the distribution of the conventions at a fixed, non-convergent level.
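The three models can be compared directly in simulation. The sketch below is a hypothetical reconstruction (the agent count, step budget and consensus criterion are our own choices, not those of Kaplan (2005)):

```python
import random

def simulate(model, n_agents=20, max_steps=200000, seed=3):
    """Random pairwise interactions; returns the number of interactions
    after which every agent strictly prefers the same name, or None."""
    rng = random.Random(seed)
    names = ["c1", "c2"]
    counts = [{n: 0 for n in names} for _ in range(n_agents)]
    last = [rng.choice(names) for _ in range(n_agents)]
    for i, n in enumerate(last):
        counts[i][n] += 1

    def produce(i):
        if model == "A":    # model A: produce the name last heard
            return last[i]
        if model == "B":    # model B: produce the most frequently heard name
            return max(names, key=lambda n: counts[i][n])  # ties: fixed order
        # model C: sample a name with probability proportional to its frequency
        return rng.choices(names, weights=[counts[i][n] for n in names])[0]

    def consensus():
        prefs = [max(names, key=lambda n: counts[i][n]) for i in range(n_agents)]
        strict = all(counts[i][prefs[i]] > min(counts[i].values())
                     for i in range(n_agents))
        return strict and len(set(prefs)) == 1

    for step in range(1, max_steps + 1):
        speaker, listener = rng.sample(range(n_agents), 2)
        word = produce(speaker)
        counts[listener][word] += 1
        last[listener] = word
        if step % 50 == 0 and consensus():
            return step
    return None

steps_b = simulate("B")   # frequency-based winner-take-all reaches consensus
```

With this kind of set-up, model A also reaches consensus eventually (it behaves like a voter model), while model C typically keeps both names alive; only model B combines convergence with speed.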

What we must remember from these results is that not all cultural transmission systems create a differential replication process that ensures the domination of some linguemes over others in a reasonable time span. Simple models of cultural transmission solely based on imitation are not sufficient to permit linguistic coordination. In that sense the dynamics of linguistic replication are likely to be different from the ones characterizing epidemiological processes, which have sometimes been presented as a possible metaphor for cultural transmission (Sperber, 1984).

2.2 Linguistic distinctiveness and the adaptation of form-meaning pairs

For efficient communication, it is better that different words be associated with different meanings and vice versa. This obvious remark actually constrains systems of linguistic replicators in many important ways. Indeed, the replication process must not only address the issue of linguistic coherence but also permit linguistic distinctiveness. Let us consider a model in which individuals have to establish conventionalized associations between several words and several meanings. Each agent is now equipped with an associative memory, which is a list of word-meaning pairs with numeric scores. It is used to find the best word associated with a given meaning and, conversely, the best meaning associated with a given word. As in model B of the previous section, the agents choose the association with the highest score when several solutions are possible. The associative memories of the agents are initially empty. Associations are progressively created as each agent interacts with other agents.

INTERDISCIPLINES.ORG
Adaptation and Representation

Studies of such systems were initiated by Steels in the mid 1990s (Steels, 1996). Several other experiments rapidly showed how collective dynamics could ensure that each name eventually becomes associated with a single context and each context with a single convention (Hutchins and Hazlehurst, 1995; Oliphant, 1997; Arita and Koyama, 1998; De Jong and Steels, 2003; Vogt, 2005; Baronchelli et al., 2006; Kaplan, 2001). Some of the most interesting dynamics of such self-organizing lexicons are obtained in the presence of noise. Let us consider that each word/replicator ci is modelled as an integer value between 0 and 1000. Each time a word/replicator is transmitted, a random number between -B/2 and +B/2 is added to its value. Thus, B is a measure of the global noise level. Each agent is equipped with a filter permitting it to select all the words/replicators in its associative memory whose values lie within a distance D = B of a given value. The structure of an interaction is the following. Agent 1 randomly chooses a meaning s1 among the different meanings available and uses a word c1 to express this meaning; if it has no word associated with this meaning, the agent creates a new one (a random integer between 0 and 1000). Then, c1 is transmitted to agent 2 with an alteration between -B/2 and +B/2. Agent 2 then selects all the possible associations with a word close to the integer received (at a distance less than B). If several associations are possible, agent 2 chooses the one with the highest score: (c2, s2). If s1 = s2 the interaction is a success; otherwise it is a failure. If no association is close enough in agent 2's memory, the agent creates a new association between the received integer and the meaning s1. In case of success, agent 2 increases the score of the association (c2, s2) by a fixed increment and decreases by the same increment the scores of the competing associations (c2, *) and (*, s2), where * is any meaning or word in the memory of the agent. In case of failure, agent 2 decreases the score of (c2, s2) by the same increment (see (Kaplan, 2001; De Jong and Steels, 2003; Vogt, 2005) for discussions of the importance of such forms of lateral inhibition). Associations are initially created with a score of 0.
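The score dynamics just described (reinforce the winning association, laterally inhibit its competitors) can be written compactly. This is a hypothetical sketch: the dictionary representation and the value of `delta` are our own choices:

```python
def update_scores(memory, word, meaning, success, delta=0.1):
    """Lateral inhibition update on an associative memory mapping
    (word, meaning) pairs to scores."""
    key = (word, meaning)
    memory.setdefault(key, 0.0)   # associations start with a score of 0
    if success:
        memory[key] += delta
        # inhibit every competitor sharing the same word or the same meaning
        for (w, m) in memory:
            if (w, m) != key and (w == word or m == meaning):
                memory[(w, m)] -= delta
    else:
        memory[key] -= delta
    return memory

memory = {}
update_scores(memory, "wa", "m1", success=True)   # (wa, m1) reinforced
update_scores(memory, "wa", "m2", success=True)   # competitor inhibits (wa, m1)
```

After the second call, the successful competitor (wa, m2) has pushed the score of (wa, m1) back down: homonymy is actively suppressed rather than merely unrewarded.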

Figure 1: Evolution in the word space (idealized acoustic space) of the words associated with 5 meanings in an experiment involving 10 agents with a noise level B = 100. After a first period of ambiguity, five well separated bands appear associated with each meaning. Evolution of the average values (in the population) of the words associated with each meaning is highlighted in the middle of each band.


Can collective dynamics lead to the choice of the best words/replicators? A good word/replicator is one that an agent will not confuse with another one that has a different usage. A good lexical system should have sets of words clearly distinct from one another depending on the meanings they are associated with. Results show that, indeed, distinctive sets of word-meaning associations are progressively formed and selected through cultural interaction. Figure 1 shows the evolution in the word space (i.e. an idealized acoustic space) of the words associated with 5 meanings in an experiment involving 10 agents with a noise level B = 100. After an initial period of ambiguity, five well separated bands in the word space are clearly identifiable. Agents do not converge towards a unique value for each context: each agent uses a different one, but these values tend to be very close. The band for one context is clearly distinct from the bands associated with the other contexts; no confusion is possible. Figure 1 also plots the average value of each band, which makes it easier to see the collective optimization of distinctivity leading to a solution compatible with the level of noise present in the environment. We also see on this graph that, with this level of noise, we approach the limit of expressiveness possible in this medium: if the agents had to communicate about a larger number of distinct meanings, ambiguity would inevitably arise.

2.3 Neutral drift

External factors like language contact between populations are often cited as a major cause of language evolution. But it is also known that language can change spontaneously based on internal dynamics (Labov, 1994). We have seen with the previous model that, in a noisy environment, agents can converge on a stable system in which distinct bands in the word/acoustic space are associated with distinct contexts. As we see in figure 1, this partition into separate bands no longer evolves once a stable solution has been found. Figure 2 shows the evolution of the average word in the presence of an agent flux, defined by a probability Pr = 0.01 of replacing an old agent by a new one, for a population of 20 agents and 2 contexts. The centres of the bands spontaneously evolve as new agents enter the system. This is an example of a neutral drift.

Figure 2: Example of a neutral drift: spontaneous evolution of the average forms in the presence of an agent flux.

This effect is easily understandable. A new agent tends to converge on words belonging to the existing band for each meaning it has to express. But within this band, it does not converge towards the exact centre. Thus the centre moves as the flux of new agents enters the system. The higher the agents' tolerance to noise, the higher the amplitude of this drift (see (Steels and Kaplan, 1998) for a first description of this phenomenon).
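A one-dimensional sketch reproduces this drift (hypothetical parameters: the band width, alignment rate and replacement probability below are illustrative, not those of the original experiment):

```python
import random

rng = random.Random(42)
n_agents, B, p_replace = 20, 100.0, 0.01
values = [500.0] * n_agents       # a converged band centred on 500
centres = []

for _ in range(50000):
    # ordinary interaction: the listener aligns with a noisy copy of the speaker
    s, l = rng.sample(range(n_agents), 2)
    heard = values[s] + rng.uniform(-B / 2, B / 2)
    values[l] += 0.1 * (heard - values[l])
    # agent flux: occasionally a newcomer replaces an old agent and adopts
    # a value anywhere inside the tolerated band of a random teacher
    if rng.random() < p_replace:
        newcomer, teacher = rng.randrange(n_agents), rng.randrange(n_agents)
        values[newcomer] = values[teacher] + rng.uniform(-B / 2, B / 2)
    centres.append(sum(values) / n_agents)

# the band stays cohesive, but its centre wanders with no functional drive
```

The alignment step keeps the agents clustered, yet because newcomers never land exactly at the centre, the average value performs a slow random walk, exactly the non-functional drift of Figure 2.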

In this experiment, words/replicators drift spontaneously without any functional drive. However, external pressures can direct these dynamics in one direction or another. This neutral drift provides novelty and can thus lead to a more efficient reorganization if needed. In some way, this effect is similar to the role of neutral mutations in biological evolution (Kimura, 1983).

Experiments on computational models of phonological systems have shown how similar collective dynamics in the presence of noise lead a population of agents to converge towards a set of vowels optimally distributed in the phonological space so as to favour distinctiveness between them (de Boer, 1997, 1999; Oudeyer, 2005b). Such emerging phonetic systems show strong similarities with the vowel systems observed in natural languages.

To summarize, noise during word transmission favours sets of words that are clearly distinct from one another when they are associated with different meanings. Moreover, in the presence of noise and agent flux, we experimentally observe a spontaneous, non-functional evolution. This continuous exploration can lead to a more efficient reorganization of the replicator system if needed.

3. Environmental constraints and Darwinian dynamics

Until now, we have only considered simple models of linguistic replication. Linguistic phenomena are obviously more complex. The previous experiments focused on the replication of words, but their associated meanings and semantic categories are also entities that can replicate from brain to brain through linguistic interactions. In many computational models, these categories are modelled as points in a category space, but more complex systems of meanings, also referred to as categories or concepts, have also been investigated (Steels and Belpaeme, 2005). The Talking Heads experiment conducted between 1999 and 2000 by Steels and co-workers provided a large set of data on how systems of categories can adapt to particular environments (Steels and Kaplan, 2002; Kaplan, 2001). In this experiment, robotic agents were capable of segmenting the image perceived through their cameras into objects and of collecting various sensory data about each object, such as colour, position, size, shape, etc. A pair of robots was placed in front of a white board on which various types of objects were placed. At each interaction, the speaker chose one object from this context, reused or constructed a category that would distinguish this object from the other objects present in the background, and uttered a word associated with that category. Based on this word, the other robot had to guess which object was named (Figure 3).


Figure 3: The Talking Heads set-up. Two robotic cameras are placed in front of a white board on which objects of various shapes and colours are placed. The robots have to construct categories and words to name the objects on the board and have the other agent guess the right object based on that word. Categories referring to colour, position, size or shape can be used.

In the first run of the experiment, a total of 8000 words and 500 concepts were created, with a core vocabulary consisting of 10 basic words expressing concepts like up, down, left, right, green, red, large, small, etc. The dynamics that pushed the population towards coherence and distinctivity ensured the collective choice of a set of word-category associations adapted to the environment that the robots were perceiving. Interestingly, although no internal cognitive bias favoured any perceptual feature over the others, some features like shape were used very rarely, whereas position, colour and size categories were preferred. This can be explained by the properties of the environment in which the language games took place. Indeed, the kinds of objects placed on the white board happened to be very similar in shape, but much more distinctive in terms of colour and position.
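The effect of the environment on category selection can be sketched with a toy discrimination game (entirely hypothetical: the feature ranges, distinctiveness threshold and context size are our own choices, not those of the Talking Heads experiment):

```python
import random

rng = random.Random(7)
channels = ["shape", "colour", "position", "size"]

def make_object():
    # objects on the board: nearly identical shapes, varied colour/position/size
    return {"shape": 0.5 + rng.uniform(-0.02, 0.02),
            "colour": rng.random(),
            "position": rng.random(),
            "size": rng.random()}

usage = {c: 0 for c in channels}
for _ in range(2000):
    context = [make_object() for _ in range(4)]
    topic = context[0]
    # pick any channel on which the topic clearly differs from all other
    # objects; channels are tried in random order, so the agent itself has
    # no built-in preference for one channel over another
    for c in rng.sample(channels, len(channels)):
        if all(abs(topic[c] - other[c]) > 0.1 for other in context[1:]):
            usage[c] += 1
            break

print(usage)  # "shape" is almost never discriminative in this environment
```

Even with a perfectly unbiased agent, the usage counts end up skewed: the environment alone decides which perceptual channels are worth lexicalizing.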

Such types of indirect competition between perceptual categories were observed during the whole experiment. Some categories were general and others specific (e.g. one was used to describe a
particular shade of green, and another one to describe green contexts in general). Usually, general categories were preferred because they were both easier for the agents to learn and adapted to a larger number of contexts (see also Smith (2005) for another series of experiments along this line). However, in several cases, a precise category adapted to a recurring specific context survived, as other categories were present to back it up. Therefore, when analyzing these types of complex dynamics, considering competition between isolated categories is not always sufficient. The quality of a category needs to be evaluated relative to the category set to which it belongs and the adaptivity of the whole set to particular environments.

Another interesting phenomenon was observed. Most of the words of the core vocabulary were coherently interpreted as having distinct meanings. However, in some cases, two competing meanings co-occurred for a long time. For instance, the word bozopite was concurrently associated with two types of categories: large area (large) and large width (wide). This co-occurrence was due to the fact that, in the types of environments that the robotic agents encountered, most objects that were large in area were also large in width. This is an example of residual polysemy.

This brings us to a remark. As collective dynamics select sets of replicators that are well adapted to the environment in which the agents are communicating, we might be tempted to say that the quality of the replicators increases. But, as for species in natural evolution, optimization stops once adaptation is reached. We have seen in the previous section that, in the presence of noise, well separated bands of replicators emerged; however, once a stable solution was found, this optimization of distinctivity stopped. The same effect occurs in more complex architectures where residual polysemy is observed. In all these situations, there is no absolute optimization, only the search for stable adapted solutions.

4. Cognitive constraints on Darwinian dynamics

When linguistic replicators leap from brain to brain, they in fact do so through perpetual cycles of encoding, production, perception, decoding and learning. Whatever these replicators are, they need at some point to be incorporated into internal representations in the brains of speakers and hearers. The process of updating one's brain to incorporate some new information defines learning. Learning theory, and in particular machine learning theory (Mitchell, 1997), has shown that all learning systems are characterized by a number of biases, which means that any given system will be good at learning certain things and bad at learning others. For example, learning algorithms such as recurrent neural networks are good at learning to predict complex time series but quite inefficient at learning fine categorical distinctions in high-dimensional static spaces, whereas support vector machines perform well in high-dimensional static spaces but rather poorly on time-dependent phenomena (Duda et al., 2001). Learning biases also apply to human brains. For example, when the human brain learns a new concept or a new sound, it typically does so by taking the representation of an already known concept or sound and modifying it a little. The consequence is that learning a new concept or a new sound will only be effective if the corresponding brain already knows not-too-dissimilar concepts or sounds. This imposes strong constraints on the replication of linguistic memes, which are defined not only by the generic cognitive constraints of all human brains, but also by the particular cognitive structures that were built during the ontogeny of each of them. This means that for a given brain, some linguistic memes will be easily learnt and replicated, but other linguistic memes will be strongly deformed, often to the point that no replication at all takes place.
And the linguistic memes which are easy to learn and replicate for a given brain may prove to be
difficult for another brain which had a different history.

What then is the consequence of all this on the dynamics of language evolution? We will now present the outline of a computational model of the origins of syllable systems which sketches an answer (this model is described in detail in another article (Oudeyer, 2005a)). This model involves a population of agents which can produce, hear, and learn syllables, based on an auditory and a motor apparatus that are linked by abstract neural structures. These abstract neural structures are implemented as a set of prototypes or templates, each of them being an association between a motor program that has been tried through babbling and the corresponding acoustic trajectory. Thus, agents store in their memory only acoustic trajectories that they have already managed to produce themselves. The set of these prototypes is initially empty for all agents, and grows progressively through babbling. The babblings of each agent can be heard by nearby agents, and this influences their own babbling. Indeed, when an agent hears an acoustic trajectory, this activates the closest prototype in its memory and triggers some specific motor exploration of small variations of the associated motor program. This means that if an agent hears a syllable S that it does not already know, two cases are possible: 1) it already knows a quite similar syllable and has a good chance of stumbling upon the motor program for S when exploring small variations of the known syllable; 2) it does not already know a similar syllable, and so there is little chance that it incorporates into its memory a prototype corresponding to S. This process means that if several babbling agents are put together, islands of prototypes, i.e. networks of very similar syllables, will form in their memories, and they will develop a shared skill corresponding to the perception and production of the syllables in these networks.
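The two cases can be captured in a few lines (a hypothetical sketch: syllables are reduced to points on a line and the similarity tolerance is an arbitrary value, abstracting away from the actual motor and auditory apparatus of the model):

```python
def try_learn(repertoire, heard, tolerance=0.15):
    """A heard syllable (a point in an abstract acoustic space) triggers
    exploration only around the closest known prototype; it is acquired
    only if that prototype is similar enough for exploration to reach it."""
    if not repertoire:
        return False
    closest = min(repertoire, key=lambda p: abs(p - heard))
    if abs(closest - heard) > tolerance:
        return False          # case 2: nothing similar known, no replication
    repertoire.append(heard)  # case 1: small variations reach the new syllable
    return True

repertoire = [0.20, 0.25, 0.30]       # an "island" of similar syllables
near = try_learn(repertoire, 0.34)    # close to the island
far = try_learn(repertoire, 0.80)     # far from every prototype
```

Run over a population of mutually listening agents, this similarity-gated replication is what makes repertoires cluster into shared islands of prototypes.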
Nevertheless, the space of possible syllables was large in these experiments, and so the first thing studied was whether agents in the same simulation could develop a large and shared repertoire of syllables. This was shown to be the case (Oudeyer, 2005a). Interestingly, if one runs two separate simulations, each population of agents ends up with its own particular repertoire of syllables.

Then, a second experiment was run: some fresh agents were tested on learning syllable systems that had been formed by another population of interacting agents, and other fresh agents were tested on learning a syllable system generated artificially as a list of random syllables. The results, illustrated in figure 4, show that the fresh agents were always good at learning the syllable systems developed by other similar agents, but, on the contrary, rather bad at learning the random syllable systems. In other words, the syllable systems developed culturally by agents were adapted to their cognitive biases, and the random systems were not. Thus, the replicators constituted by syllables evolved and were selected in a cultural Darwinian process so as to fit the pre-existing ecological niche defined by the cognitive structures of the agents, fitness being here learnability. It is particularly interesting to note that the pre-existing learning/cognitive biases are here not language-specific: the same architecture could be used to learn hand-eye coordination, for example. The linguistic structures that adapt to these biases seem idiosyncratic from an external point of view, and might be thought to rely on language-specific cognitive modules, but this experiment shows that this is not necessarily the case.


Figure 4: Evolution of the rate of successful imitations for a child agent which learns a syllable system established by a population of agents (top curve), and for a child agent which learns a syllable system established randomly by the experimenter (bottom curve). The child agent can only learn perfectly the vocalization systems which evolved in a population of agents; such vocalization systems were selected for learnability. (Reprinted from Oudeyer, 2005a.)

Several other computational systems have been developed to study the mechanisms that allow the cultural selection for learnability of linguistic replicators. Zuidema presented abstract simulations of the formation of syntactic structures and detailed the influence of cognitive constraints upon the generated syntax (Zuidema, 2003). Brighton et al. presented a thorough study of several simulations of the origins of syntax (Kirby, 2001), re-described in the light of this paradigm of cultural selection for learnability (Brighton et al., 2005).

5. Conclusion

We have shown in this paper that conceptualizing language evolution in analogy to biological evolution, as has been proposed in the field of memetics in particular, requires going down to the details of the replication mechanisms as well as to the precise definition of the replicating units and of the ecological constraints in which their evolution takes place. Otherwise, there is a risk that this analogy may lead to misconceptions (such as the idea that simple epidemiologic models might be a good model of the formation of a consensus on the use of a (new) linguistic representation). However, we also think that the different examples we presented illustrate the fact that when Darwinian concepts are applied operationally to the evolution of linguistic representations, they can provide important new insights. This conceptualization may participate in the refoundation of linguistics (Croft, 2000). Indeed, in the last fifty years language has mainly been considered as a fixed and idealized system which could be studied independently of its use and of its users. In this traditional body of theories, individual variation, and more broadly language evolution, were either ignored or left unexplained, and biological evolution was used to explain the particularly good adaptation of our brains to the learning of today's idiosyncratic languages. On the contrary, viewing language as a system of replicators constantly replicating from particular brains to particular brains, with variation as a central
concept, allows one to understand how languages change over time and why there is so much linguistic diversity, and provides a different account of the ease with which children learn languages. Indeed, as we showed in our last example, this framework allows us to understand that languages themselves probably evolved in a cultural Darwinian manner so as to become easily learnable by their users. And the peculiarities of the pre-existing learning systems of these users can explain the apparently idiosyncratic properties of languages. The paradigm shift induced by viewing language evolution as a Darwinian process also sets up new problems to be solved. In particular, it highlights the fact that the sharing of complex and intricate linguistic conventions must be explained: how can a system of competing replicators interacting at the level of individuals converge to a coherent and distinctive system adopted by the whole population? We have shown with several computational experiments how this problem can be solved for lexical systems, thanks to the use of specific replication mechanisms based on positive feedback loops and self-organization. Yet, future work will have to show how several interacting levels of conventions, ranging from phonology to grammar and pragmatics, can be formed through a cultural Darwinian process.

Acknowledgements

This research has been partially supported by the ECAGENTS project funded by the Future and Emerging Technologies programme (IST-FET) of the European Community under EU R&D contract IST-2003-1940.

References

Arita, T., Koyama, Y., 1998. Evolution of linguistic diversity in a simple communication system. In: Adami, C., Belew, R., Kitano, H., Taylor, C. (Eds.), Proceedings of Artificial Life VI. The MIT Press, Cambridge, MA, pp. 9-17.
Baronchelli, A., Felici, M., Loreto, V., Caglioti, E., Steels, L., 2006. Sharp transition towards shared vocabularies in multi-agent systems. Journal of Statistical Mechanics, P06014.
Berlinski, D., 1972. Philosophical aspects of molecular biology. Journal of Philosophy 69 (12), 319-335.
Brighton, H., Kirby, S., Smith, K., 2005. Cultural selection for learnability: Three hypotheses underlying the view that language adapts to be learnable. In: Tallerman, M. (Ed.), Language Origins: Perspectives on Evolution. Oxford University Press, Oxford.
Croft, W., 2000. Explaining Language Change: An Evolutionary Approach. Longman Linguistics Library. Longman, London.
Croft, W., 2002. The darwinization of linguistics. Selection 3, 75-91.
Dawkins, R., 1976. The Selfish Gene. Oxford University Press, Oxford.
de Boer, B., 1997. Generating vowel systems in a population of agents. In: Husbands, P., Harvey, I. (Eds.), Proceedings of the Fourth European Conference on Artificial Life. The MIT Press, Cambridge, MA.
de Boer, B., 1999. Self-organizing phonological systems. Ph.D. thesis, Vrije Universiteit Brussel, Brussels.
De Jong, E., Steels, L., 2003. A distributed learning algorithm for communication development. Complex Systems 14 (4-5), 315-334.
Duda, R., Hart, P., Stork, D., 2001. Pattern Classification. John Wiley and Sons.
Hull, D., 1988. Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science. University of Chicago Press, Chicago.
Hutchins, E., Hazlehurst, B., 1995. How to invent a lexicon: the development of shared symbols in interaction. In: Gilbert, N., Conte, R. (Eds.), Artificial Societies: The Computer Simulation of Social Life. UCL Press, London, pp. 157-189.
Kaplan, F., 2001. La naissance d'une langue chez les robots. Hermès Science, Paris.
Kaplan, F., 2005. Simple models of distributed co-ordination. Connection Science 17 (3-4), 249-270.

INTERDISCIPLINES.ORG
AdaptationandRepresentation
Kimura, M., 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge. Kirby, S., 2001. Spontaneous evolution of linguistic structure: an iterated learning model of the emergence of regularity and irregularity. IEEE Transactions on Evolutionary Computation 5 (2), 102 110. Labov, W., 1994. Principles of Linguistic Change. Volume 1: Internal Factors. Blackwell, Oxford. Mitchell, B., Weinmann, L., 1997. Creative design for the www. Lecture Notes, SIGGRAPH 1997. Mufwene, S. S., 2005. Language evolution: The population genetics way. Vol. 29. Gene, Sprachen, und ihre Evolution, pp. 3052. Oliphant, M., 1997. Formal approaches to innate and learned communicaton: laying the foundation for language. Ph.D. thesis, University of California, San Diego. Oudeyer, P.-Y., 2005a. How phonological structures can be culturally selected for learnability. Adaptive Behavior 13 (4), 269280. Oudeyer, P.-Y., 2005b. The self-organization of speech sounds. Journal of Theoretical Biology 233 (3), 435449. Schleicher, A., 1863. Die darwinsche Theorie und die Sprachwissenschaft : Offenes Sendschreiben an Herrn Dr. Ernst Hkel. Weimar : Bhlau. Searls, D. B., 2002. The language of genes. Nature (420), 211217. Smith, A. D. M., 2005. The inferential transmission of language. Adaptive Behavior 13 (4), 311324. Sperber, D., 1984. Anthropology and psychology : towards and epidemiology of representations (the malinowsjumemorial lecture 1984). Man 20, 7389.Steels, L., 1996. A self-organizing spatial vocabulary. Artificial Life Journal 2 (3), 319332. Steels, L., 2004. Analogies between genome and language evolution. In: Pollck, J. e. a. (Ed.), Artificial LifeIX: Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems. The MIT Press, Cambridge, MA, pp. 200207. Steels, L., Belpaeme, T., 2005. Coordinating perceptually grounded categories through language: A case study for colour. Behavioral and Brain Sciences 28, 469529. 
Steels, L., Kaplan, F., August 1998. Spontaneous lexicon change. In: Proceedings of COLING-ACL 1998. Morgan Kaufmann, San Francisco, CA, pp. 12431250. Steels, L., Kaplan, F., 2002. Bootstrapping grounded word semantics. In: Briscoe, T. (Ed.), Linguistic evolution through language acquisition: formal and computational models. Cambridge University Press, Cambridge, UK, pp. 5374. Stegmann, U. E., 2004. The arbitrariness of the genetic code. Biology and Philosophy 19 (2), 205 222. van Driem, G., 2003. The language organism: The leiden theory of language evolution. In: Mrovsky, J., Kotesovcova, A., Hajicova, E. (Eds.), Proceedings of the XVIIth International Congress of Linguists. Prague: Matfyzpress vydavatelstv Matematicko-fyzikaln fakulty Univerzity Karlovy, Prague. Vogt, P., 2005. The emergence of compositional structure in perceptually grounded language games. Artificial Intelligence 167 (12), 206242. Zuidema, W., 2003. How the poverty of the stimulus solves the poverty of the stimulus. In: Becker, S.,Obermayer, K. (Eds.), Advances in Neural Information Processing 15. Cambridge MA: MIT Press, pp. 5168.

INTERDISCIPLINES.ORG
Adaptation and Representation

The empirical content of the notion of linguistic mental representations.


Wolfram Hinzen (Durham University)
(Date of publication: 9 July 2007)

Abstract: If we look at the empirical basis for positing mental representations in the study of human linguistic competence, we find, firstly, that the assumption that there are innate representations essentially amounts to the refutation of a radical externalism. But the latter has no significance, at least in the case of the study of linguistic competence, so nativism in this sense seems rather uncontentious; secondly, it is arguable that such representations are necessarily as 'natural' as anything that fits into our best available explanatory schemes. In other words, it is not clear whether a question of naturalism arises for them. Furthermore, the idea that such representations are actually optimal responses to 'external pressures' has no empirical basis, since these representations have no 'relational' content at all. I argue that, despite all that, the notion of a linguistic representation within the language faculty is quite spectacularly interesting in many epistemological and metaphysical respects.

Paper:

In using the term language, I shall be writing about the human language faculty: a set of mechanisms implemented in the brain and specific to our species, which generates expressions. These, intuitively speaking, are pairings of phonetic structures with semantic ones. This set of mechanisms, let us call it a generative algorithm, needs to exist: expressions are highly structured, and some procedure needs to generate that complexity.1 We may think of the faculty as another organ of the body, in the informal sense of a distinctive subsystem of our organism that interfaces with other such subsystems and emerges in ontogeny due to three factors: (i) genetic factors and laws of organismic development, (ii) external input, and (iii) general (e.g. physical) laws not specific to either language, our species, or the organic world (Chomsky 2005). A form of nativism follows, albeit in a rather trivial sense, once the impact of (i) is non-negligible, which I will here take for granted for a trait such as language. I shall argue that there is no sufficient empirical basis for positing referential relations between organism-internal representations generated in this organ, on the one hand, and external entities, on the other, if the relations and external entities in question are thought to determine or explain the meaning, content, or function of these representations. It is not the existence of mental representations that I will question, which, I will take it, simply follows from the fact that neither the parts of the relevant generated structures nor their combinatorial laws are physical (in any other than the trivial sense of belonging to nature, the one natural universe).

II Human linguistic expressions form a system, in a similar sense as the natural numbers do. If there is one expression, there are innumerably many others, and understanding the former entails understanding the latter (Fodor and Pylyshyn 1988). If language consisted only of interjections, which are non-structured at the relevant levels of representation, this would not be the case: no interjection makes a prediction for the existence of any other interjection. Not so in the case of structured

1 A generative grammar, an explicit theory of this algorithm, is for this very reason not a (controversial) theory, but a research programme.

expressions. Thus, understanding the expression John loves Mary is not to understand merely that some relation holds specifically between John and Mary. It is to understand that the very same relation can hold between any entities that form the subject and object of this expression, in the same way that a value can be inserted for an abstract variable. Similarly, the relation plus 2 in the equation 1+2=3 is not a relation specifically between 1 and 3. It is a relation between any values that instantiate the relevant variables in the equation x+2=y. In the same way, then, understanding John loves Mary is to understand algebraic relations between variables for which John and Mary are values. Hence, as a matter of algebraic fact, Mary loves John is an expression, a part of the system, too. There is, moreover, no finite bound to the expressions that are part of the system in question, again not unlike the case of the natural numbers.

This is a tale, not merely of structures, but of meanings. Meaning is systematically aligned with structure in this system, as a matter of contingent fact. The architecture of the language faculty might have been different: syntax and semantics could have been entirely independent computational modules with no implications of one for the other. But whatever human language we look at, the facts tell otherwise. Thus, in English, too, a decision over which variable the value John instantiates has systematic consequences for how the resulting complex expression is semantically interpreted. Depending on where in the underlying hierarchical structure of the expression this lexical item is inserted, it comes out as either the agent or the patient of the event in question. That, if Mary loves John, Mary is in love, is not a conjecture, but a structural necessity. Any denial of this entailment is a violation of an algebraic law, not of logic but of theta-theory, a module of Universal Grammar (UG). If the interpretation is to be different, the structure needs to be changed. By contrast, interpreting Mary loves John to entail that John is in love is to draw a risky inductive inference: it is a conjecture that the structure does not license. In an older terminology, the first entailment is analytic; the latter is synthetic.

This explanation of analytic entailment (structure-driven meaning facts in a biological organ) has not invoked any relation of representation or reference at any point (nor has it invoked truth). Whatever John refers to, in particular, the fact that if that thing loves Mary, it is in love (and Mary is not necessarily), will hold. The explanation has invoked the structure of the language faculty, an organism-internal dynamical system that is not, apparently, experientially driven or stimulus-bound: excepting cases of pathology, the generation of an expression in a particular moment of time is not a function of what internal state we are in or which external state of the world we confront. As Quine (1960) put a related point, we cannot tell, from the features of the situations in which expressions are used, what they mean. A naïve objection might be that all normal children turn this trick when acquiring a language. But their accomplishment is wrongly described in these terms: what narrows down their search space when hearing unfamiliar sounds is what meanings they know; and as Lila Gleitman has shown, it is in fact the structures of expressions that their minds can generate which help with the mapping task (Gleitman 2005).

By possessing the relevant generative algorithm, then, we can freely generate thoughts, merely by instantiating variables, and in doing so, we are moving around (mentally) in an infinitary system. Does that system represent anything? Its existence is due to the generative algorithm. Without it, there wouldn't be any such system. The algorithm, as far as we know, exclusively exists in (is implemented in) one place in the universe, namely the human brain. Hence, without the human brain, there wouldn't be any such system. The external environment, in particular, exhibits no traces of it. The sense in which language is external at all remains unclear: the written aspect of language is irrelevant for understanding the universal basis of our linguistic ability (most languages alive on this planet are not written). The acoustic aspect of language is close to irrelevant as well, as acoustics looked at as such,

i.e. in physical terms, exhibits none of the features that we associate with language: morphological structure (words), syntactic structure (phrases), and semantic structure (meanings). If we wish to understand something about language, we don't ask an acoustics expert or look at spectrograms; we ask a linguist.

What use then is the assumption that the system itself or any of its productions (structured expressions) represent the external environment, in some explanatory sense? Again, the linguistic environment described in physical as opposed to linguistic terms does not exhibit any of the algebraic structures that we find in the system and that systematically account for their semantic properties (to the extent that these are systemic: idioms or irony indicate that they need not be). Positing such structures out there, to then say linguistic structures represent them, seems circular. We need an empirical benefit for positing such representation relations. Saying mental representations in our head represent some external content (referent, semantic value, etc.) is an empirical claim, not a philosophical hypothesis, and should be tested like other empirical claims. What is the relevant evidence?

III One story is this. There is this generative algorithm, and it generates structures, but the CONTENT of these structures is something else (a worldly physical entity, perhaps an abstract entity). It is what the structures relate to. This relation to an external object is what their MEANING consists in. Before they relate to it, there is no meaning. The structures in question are therefore not just structures, but mental REPRESENTATIONS. They are relationally understood. Philosophy in the 20th century has largely been about the question of how this reference relation gets into place, and thereby determines the mystery notions I have here capitalized. Its perhaps deepest anxiety has been to eradicate the case where meaning is plainly there but the external referent is missing, as in the case of the infamous current king of France. Russell's method of eradication was to simply deny what is plainly the case from a linguistic point of view, namely that the king of France is a referential expression.

This externalist view runs counter to what is probably the most distinctive evolutionary design feature of human language: that reference in human language is intentional and free, and in particular physically unconstrained. Unlike in any other known animal communication system, the referent can be arbitrarily remote in space and time, it can be abstract, and it certainly need not exist. We have the privilege to refer to what we like. The meanings of any one in the infinity of complex expressions are not what they are because these expressions relate to anything external. They are strictly a function of the system's internal combinatorics: John loves Mary means that John loves Mary and entails that John is in love not because of the world but because of the expression's hierarchical structure, its determining principles, and the lexical concepts it contains. An explanation of this meaning begins from items of the mental lexicon stored in long-term memory, each a pairing of a sound with a meaning (a mental concept, such as my concept of love). A retrieval procedure calls these items and inserts them into a derivation, which structures them in line with the laws of theta-theory and other modules of UG. The result of this process (all of it head-internal) is a complex expression with a new sound and a new meaning, which can then be externalized acoustically (though it need not be). This meaning does not exist without the structure, I will assume: not in any Platonic heaven, nor in any external physical world (as far as we know). Square circle means what it does, not because there are any, but because we have combined concepts in a particular way, generating a phrase structure of a particular kind. Combining concepts is free: we can do it to make worlds, rather than to mirror or represent the one that's physically there.

On this view, given the structure, the meaning, if its systematic, will follow by necessity. Mapping this very structure to something else external to it some physical conglomerate, some object posited in a non-physical realm does not help to explain why it has the meaning it does. An appeal to the internal structure of the expression as revealed in linguistic analysis is needed to explain a semantic fact such as that if John loves Mary, John is in love but Mary not necessarily. Hence, whatever we map this expression with its internal linguistic structure to either needs to recapitulate this internal structure, in which case it is redundant, or it does not, in which case we need to ask why it is needed in the first place, since the linguistic structure does account for the meaning.

Here is an answer to the question of why the different structure is needed. Linguistic expressions do not just have a meaning, which may follow from their internal structure. They are also true or false. And this indicates they relate to an environment, at least to a physical one (and one somehow hopes the idea carries over to abstract meanings). Features inherent to the expression could not explain that. Hence a different structure is needed, which does explain it. As I have indicated, that it could be the same structure would indeed be a very strange proposal: there are no verb phrases, relations of predication, subjects, lexical concepts, etc. out there in the physical world. But if it is a different structure, not isomorphic to the syntactic one that we find in the relevant production of the linguistic system (the expression), we don't know how or by what principles we map the one to the other, or how the external structure would explain the internal one. A fully transparent mapping between a syntactic structure and its presumed semantic correlate does not encounter this problem. Ideally, indeed, we would have an identity mapping here (Hinzen 2006). Be that as it may, let us turn to the proposal that truth and reference necessitate the idea that linguistic expressions intrinsically relate to the environment (as mind-independently and non-linguistically described), and that this accounts for their semantic properties.

IV Some things we know: we know there is a word true/truth, which is associated lexically with a human concept stored in long-term memory, the concept of truth. We also know that all human languages exhibit a very special relation, the relation of predication. Predication is not quite like any other relation. It has been studied in linguistic terms for a long time (for recent accounts see Moro 2000; den Dikken 2006). Truth enters into this very special relation. Various other things can be predicated in this way, too, such as existence. If truth is chosen for purposes of this special relation, it is necessarily predicated of full (declarative) sentences as opposed to other kinds of syntactic structures, as (1) suggests, an ancient observation:

(1) a. *[John] is true
    b. *[the Eiffel tower] is true
    c. *[my sister] is true
    d. *[in the garden] is true
    e. *[John nice] is true

    f. *[walk quickly] is true
    g. *[kill Bill] is true
    h. *[John walk] is true

In each example of (1), truth is predicated of a syntactic object that is formally of the wrong kind to enter into such a predication: in (1a-c) the bracketed structures are Noun Phrases (NPs), in (1d) it is a Prepositional Phrase (PP), in (1e) it is a small (verbless, untensed) clause, as in Mary considers [John nice], in (1f) it is a verb to which an adjunct has been adjoined, in (1g) it is a verb that has taken an argument (as opposed to an adjunct), in (1h) it is a clause with a non-finite verb. What do all of these various phrases have in common, structurally? The empirical answer is that they are too impoverished to bear a predication of truth: in particular, they all lack finite Tense. Specification for tense, these data tell us, is something that truth needs. A syntactic object that has finiteness specified in it is a Complementizer Phrase (CP), as seen in (2):

(2) [CP That Caesar conquered Gaul] is true

Before we have reached that level of structural complexity (a CP, which necessarily contains phrases of the type bracketed in (1) and is more complex than any of them), truth is not an option. Similar claims hold for other second-order notions like existence, though existence is predicated of NPs. What (1-2) suggest, then, is that truth is inherently a structural phenomenon. It is woven into the architecture of the full human clause. How we relate to the world explains nothing of this.

Quite the contrary: predicating truth is a very special way of relating to the world; and we are very likely alone, at least on this planet, in carrying out such mental feats. It would be surprising to be told that relating to the world or a causal relation of reference explains how this feat is possible. It appears to be the other way around: our mind builds a very specific and contingent kind of hierarchical structure: the rich fabric of the human clause (Hinzen 2003), with the verbal domain at the bottom, and the CP-TP (finiteness) layer on top. Once that structure is in place, a mental object exists that we can use to relate to the world in the specific way of carrying an alethic force. It provides what we may call a challenge for how the world is. Is it that way? The answer we do not find out by building more structure in our minds, or by more thinking. We only find it out by leaving language behind, and doing experiments and interacting with the world instead. We go to a laboratory if we have one, or to a mundane substitute for it in our more everyday truth judgements (Is this really Tom? Are these pants the best purchase?). Nothing of this necessitates or even suggests the empirical assumption that when we have predicated truth of a proposition, there is a structure out there, which is not quite that of the linguistic expression that contains our truth predication, but slightly different, and yet similar enough to relate to it inherently and explain it.

As far as I can see, the assumption is baseless. Different kinds of minds confront the physical world and its adaptive challenges. They meet them as well as they can with the structures that they find in their minds. The structures that we find in some of these minds, in particular our own, are not predicted by the structures that we find out there, or the adaptive challenges they pose. Perhaps we are even alone in doing something as seemingly simple as asking a question, or what I called posing a challenge. Yet the physical environment is something that all minds share. Perhaps there is a circle of

causality where mental structures feed into an environment, which feeds back into these mental structures, and adaptivity is enhanced as a result (McGonigle & Chalmers 2001). None of this accounts for what I have emphasized here, the origin of structure, as a presupposition for judgments of truth.

V The challenge posed remains: to show that we can or need to account for the productivity of human thought, and the infinitary system of meanings in which we can move around at will when engaging in thought, by appeal to structures or entities outside the human organism. If it cannot be met, where would this leave us? It would mean that while there are mental structures produced by some generative engine, they are not mental representations, in the relational sense commonly presupposed in philosophical discussions. As I suggested, though, the structures themselves need to exist. There is moreover all the justification in the world for calling them mental: for, at physical levels of description, we can throw little light on them. If we could, we would not have to go to the linguists and listen to the abstract categories that they invoke to describe the brain's processing of language. But we do. So we are stuck with all kinds of structures that need to be non-physically described, hence are non-physical, if we interpret, as we should, our best available empirical theories of the world realistically.

Calling mental structures that underlie human language use unnatural, on the grounds that they are not describable in a physical vocabulary and share few if any of the features of physical objects, is as sensible today as it would have been for scientists in the 19th century to reject a realist interpretation of chemistry on the grounds that the chemical could not be cashed out in physical terms. Science does not come with an ontology: it invents one. A linguistic ontology is no less natural today than a chemical one should have been in the 19th century. But can mental structures enter causally into an act of language use in which a thought is conveyed in a physical medium (causing air vibrations)? The suggestion that they do is no more surprising than the suggestion that a material object can act where it is not, or the idea of non-physical action on a physical body. And yet, science has had to incorporate that very notion, that causation does not work mechanically, in terms of contact.

So, we have not merely structures, but mental ones. But are they representations? I have suggested they enter into the way we relate to the world, constitutively. They construct a human environment (or better, a number of possible worlds) more than they reflect or mirror one. Not even the phonological structures do: they relate to all kinds of acoustic external patterns for sure, but they do not represent them, in the sense that there are phonemes and prosodic contours out there (some concoction of molecules, perhaps) to which such internal structures which phonologists posit relate or refer. As Chomsky (2000) emphasizes, such entities are not posited, and we expect this equally, or perhaps more so, on the semantic side. No doubt we do not want to say that a mental concept is some concoction of molecules outside. Again, no doubt there are various relations that connect the underlying semantic structure of an expression as generated in a human mind to physical structures in the environment. There is a myriad of causal connections in particular, in each and every case of creative language use. If I am right, none of this explains what the internal structures mean or how they are used, nor why they have come into existence. The environment certainly doesn't cause such structures, nor do the regimes of adaptation enforce or explain them. The structures make great adaptive sense: they are usable. But none of this means that their use explains why they have come to exist or how they work (can be put to use). Structure-building in the organism (organismic morphology and its laws of development, here at the level of abstract syntactic and algebraic structures processed by a brain) is an independent explanatory factor in evolution, interacting with the laws of adaptation yet not reducible to them (Müller and Newman 2005; Hinzen 2006).

The adaptive challenge that the evolutionary engineer is supposed to have answered by fabricating language, on functionalist views, is usually identified as communication. As Pinker and Jackendoff (2005) put it: language evolved for the communication of complex propositions. But there is a radical difference, as noted above, between linguistic and non-linguistic communication. Any program explaining the features of human language from its use as a communication system will have to explain or predict its features, not from linguistic communication (which would be circular), but from non-linguistic communication. But non-linguistic communication comes cheap in evolutionary terms: every species does it, often more efficiently than we do, by some reasonable standards (for example, in the case of ants). Looking at its features (Hauser 1996) explains next to nothing about the structural features of human linguistic communication, or what makes this system unique.

As a natural object, language may only contingently relate to the communicative use to which we put it: that we not only produce recursive structures in our thought, but also externalize them, may well be a happy accident of evolution. The more we find quasi-linguistic structure-building processes in the non-human animal mind (hierarchical organization, categorization, perhaps recursion; McGonigle and Chalmers 2007), the more it becomes clear that having a quasi-linguistic mind (a mind generating the abstract structures underlying spoken linguistic expressions) does not require an ability to externalize it.

VI While I have mainly exploited the syntactic-structural aspects of language to draw these conclusions, I don't think that the empirical study of the meaning and reference of words would yield any other result. As I have argued elsewhere (Hinzen 2007), positing a reference relation for words or the parts of speech does not illuminate their meaning either. It presupposes them. It does not illuminate our ways of referring to a person or a city to say that there is a perhaps causal relation to cities or persons. No standing in causal relations will explain why a creature that has a concept of a person has it, or why a creature that lacks such a concept lacks it. No causal relation to a referent will explain why a child thinks a wizard (qua person) doesn't change when he transforms himself into a lion, or a mouse; or why a city can stay the same, in our judgement, when it is destroyed and rebuilt elsewhere (Chomsky 2000). It is our concepts of persons and cities that determine these identities, or what the world can be like. Intentional reference needs to exploit the specific perspectives on the world that human concepts provide, just as a judgement of truth needs to exploit the configuration of a full proposition in abstract thought to which the truth value is assigned, in this judgement. Reference is a function of what concepts we possess, just as truth is a function of which structures our mind can generate (section 3). This cannot be so if reference explains concepts, or what they mean. It is the other way around. There might be a creature that has concepts, but cannot refer: probably most concept-using creatures on Earth are like this, if indeed intentional (as opposed to functional) reference is humanly unique and if at least non-human primates have a mental life as rich as current evidence suggests.
There might even be a creature that can combine concepts recursively and compositionally at will (to generate blue circle, etc.), without ever intentionally referring or pointing to any one such thing (a blue circle, or this blue circle), simply because it lacks the necessary computational resources: determiners such as a or this, which localize a given, conceptually configured object in space.

VII I have been defending a thoroughly internalist perspective on the human faculty of language, opposing an externalist preference that is prevalent today (and has been so since the 1950s). I contend that the deepest motive for positing a relation of reference, which I have argued is non-explanatory, or for individuating mental structures functionally, has been not empirical but metaphysical: physicalism has exerted a pressure to eradicate mental entities as primitives of nature. If I am right, there is no such thing as a physical individuation of language anywhere in view; nor does science support in any way the underlying aprioristic ontological ideas. And the functional perspective on mental structures as representations of an external environment does not sufficiently respect the most distinctive design feature of language: its freedom from external control, its lack of situation-specificity, and the apparent lack of physical constraints on its referential use.

References

Chomsky, N.: 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chomsky, N.: 2005. Three factors in language design. Linguistic Inquiry 36 (1), 1-22.
den Dikken, M.: 2006. Relators and Linkers. Cambridge, MA: MIT Press.
Fodor, J. and Z. Pylyshyn: 1988. Connectionism and cognitive architecture: a critical analysis. Cognition 28, 3-71.
Gleitman, L., K. Cassidy, R. Nappa, A. Papafragou, and J. C. Trueswell: 2005. Hard words. Language Learning and Development 1 (1), 23-64.
Hauser, M. D.: 1996. The Evolution of Communication. Cambridge, MA: MIT Press.
Hinzen, W.: 2006. Mind Design and Minimal Syntax. Oxford: Oxford University Press.
Hinzen, W.: 2007. An Essay on Names and Truth. Oxford: Oxford University Press.
McGonigle, B., and M. Chalmers: 2001. Circular causality comes to cognition. In: Spatial Schemas and Abstract Thought. Cambridge, MA: MIT Press.
McGonigle, B., and M. Chalmers: 2007. Ordering and executive functioning as a window on the evolution and development of cognitive systems. International Journal of Comparative Psychology, in press.
Moro, A.: 2000. Dynamic Antisymmetry. Cambridge, MA: MIT Press.
Müller, G. B. and S. A. Newman: 2005. The innovation triad: an EvoDevo agenda. Journal of Experimental Zoology (MDE) 304, 487-503.
Pinker, S. and R. Jackendoff: 2005. What's special about the human language faculty? Cognition 95, 201-263.
Quine, W. v. O.: 1960. Word and Object. Cambridge, MA: MIT Press.


Representation in digital systems.


Vincent C. Müller (American College of Thessaloniki, Greece)
(Date of publication: 25 June 2007)

Abstract: There is much discussion about whether the human mind is a computer, whether a computer can have mental states at all, and whether all physical entities are computers (pancomputationalism). I propose a criterion for which entities in this world are in digital states, and which of these are part of digital computers. A proper resolution requires a distinction between three levels of description (physical, syntactic, semantic) and a specification of what constitutes a digital state on the syntactic level. On this basis, the proposed analysis is that a state is digital if and only if it is a token of a type that serves a particular function, typically but not necessarily a representational function for the system.

Paper:

1. Motivations: The Computationalist Program, Artificial Intelligence, Pancomputationalism

Given that the ontology of digital states is hardly an established philosophical problem, it will be useful to briefly motivate its discussion. A clarification of what constitutes a digital state is needed primarily in the context where digital states are part of a certain kind of digital system, namely digital computers. There is significant confusion over which objects in the world are computers because there is no agreement on the criteria; Shagrir calls this the problem of "physical computation" (Shagrir 2006, 394ff). On the one hand, there are the proponents of a computational representational theory of mind (CRM or computationalism) who believe that the human mind is a functional computational mechanism operating over representations. These representational abilities are then to be explained naturalistically, either as a result of information-theoretical processes (Dretske 1981; 1995), or as the result of biological function in a teleosemantics (Millikan 2005; Macdonald and Papineau 2006).
On the other hand, the opponents of computationalism divide into two camps: those who think that some natural mechanisms may be computers, but the human mind is not one of them; and those who think that all systems can be interpreted as computers, so the human mind is just a computer like everything else: "every natural process is computation in a computing universe" (Dodig-Crnkovic 2007, 10). This position is now often called pancomputationalism. Finally, the question to what extent artificial intelligence (AI) is possible requires an explanation of what kinds of machines computers are and what they can do in principle, given that digital computers are currently the main kind of mechanism used for AI. Several further problems for a computational theory of the mind would benefit from a resolution of what constitutes a digital state. Within the discussion of computationalism, mental processes are traditionally understood as information processing through computational operations over representations. Is representation a necessary feature of computing? If so, perhaps something can be called a digital state only on the presupposition of mental processes in the system, so there is a threat of circularity here (unless we are looking at a feedback circle). Another is the problem of grounding for computational systems: "How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols?" (Harnad 1990, 335). We have argued in recent papers (Raftopoulos and Müller 2006a; 2006b) that a nonconceptual phenomenal content should be at the base of such grounding. If it were to turn out that such content is necessary but is analogue and cannot be present in purely digital systems, this would show that human cognition is not purely digital and that AI on purely digital computers is impossible.
In separate work, we argue that nonconceptual content is precisely nondigital content.

I will argue in the following that being a digital state is to be a state of a type or category, but that we should not conclude from this that being a digital state is description-dependent. In particular, a state can be digital if it fulfills a particular function in a system of which it is a part, e.g. a representational function.

2. Discrete vs. Continuous

In a first approximation, being digital means being in a discrete state, a state that is strictly separated from any other, not on a continuum. Prime examples of digital states are the states of a digital speedometer or watch (with numbers as opposed to an analogue hand moving over a dial), the digital (binary) states in a conventional computer, the states of a warning light, or the states in a game of chess. Some digital states are binary, having only two possible states, but some have many more discrete states, such as the ten digits of a digital counter or the 26 letters of the standard English alphabet. Goodman had pointed out in his early theory of representation that digital marks (physical entities) are "differentiated", as he called it, precisely if they can have an exact replica: one can write the same letter "A" twice, since "A" is differentiated from any other letter. Analogue marks, in contrast, are "dense", meaning that for any pair of similar but non-identical marks, there is space for another mark in between (Goodman 1968; cf. Lewis 1971). So, the states of an analogue speedometer with a hand moving in analogy to the speed of the vehicle are continuous, just like the speed it represents, and for any two places where the hand can be, there is a third in between. But which of the two characteristics is crucial for an analogue state: the analogy to what is represented, or the continuous movement? This question becomes relevant in the case of analogue representations that proceed in steps, e.g. a clock whose hands jump from one discrete state to another.
Zenon Pylyshyn argues that the underlying process is analogue, and this is what matters: an analogue watch does not cease to be analogue even if its hands move in discrete steps (Pylyshyn 1984, 200; Shagrir 1997, 332 agrees). James Blachowicz also thinks that being on a continuum is sufficient for being analogue, taking the view that differentiated representations may also be analogue as long as they remain serial; his example is a slide rule with clicks for positions (Blachowicz 1997, 71). (Note how these authors assume a functional description, an issue to which we shall return later.) These views ultimately fail to differentiate between analogue and digital representations. Note that the very same underlying mechanism could give a signal to a hand to move one step and to a digit to go one up (this is actually how clocks are controlled in centralized systems, e.g. at railway stations). In any case, some classic examples of digital states are clearly in a series, indeed a series of infinitely many steps: the series of the natural numbers. These two points rule out Blachowicz's proposal to take being serial as a criterion. Pylyshyn, on the other hand, would presumably say that the underlying mechanism is already digital, so the clock is digital in this case; but surely there are systems where a digital signal is converted into an analogue one (the speedometer in most modern cars) and where an analogue signal is converted into a digital one (an analogue central clock that controls several digital clocks), so we should then say that the system has digital and analogue parts. I conclude that the first crucial feature of a digital state is indeed that of being a discrete state, which does not exclude being in a series, even a series that is analogous to what is represented.

3. Multiple Realization

As we already pointed out with reference to Goodman, it is characteristic of a digital mark that it can be realized several times.
So, one can write the same word twice, even if one cannot make exactly the same mark on paper twice. John Haugeland usefully explains this phenomenon with games: chess is a digital game because we can reproduce an earlier position precisely, and we can resume the same game with different pieces. Billiards, on the other hand, is analogue, because we can reproduce an earlier position only to a certain degree of precision, and if we were to reproduce the same position with different physical objects, it would not be the same position (Haugeland 1985, 57; earlier in Haugeland 1981). The possibility of multiple realization is a result of the discrete states: since a white pawn in chess can be precisely on square c3, we can put it back on c3, or replace it with a different pawn; it does not matter that it is not identical to the earlier one, provided it is clearly a white pawn on c3.
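Haugeland's chess/billiards contrast can be put in code. The following is a toy illustration, not from the paper; the piece names, squares, and coordinate values are invented. A chess position is a configuration of typed tokens, so two independently produced realizations can be exactly the same position; a billiards position lives on a continuum, so a careful reconstruction is at best similar to some degree of precision.

```python
# A chess position as a set of typed tokens: (piece type, square type).
# Two realizations with different physical pieces, built in a different
# order, are *exactly* the same position.
chess_a = frozenset({("white_pawn", "c3"), ("black_king", "e8")})
chess_b = frozenset({("black_king", "e8"), ("white_pawn", "c3")})
assert chess_a == chess_b  # positive re-identification succeeds absolutely

# A billiards position as continuous coordinates (centimetres, invented).
# A careful reconstruction differs slightly and is therefore a
# *different* position, merely close to the original.
billiards_a = {"cue_ball": (41.73, 102.55)}
billiards_b = {"cue_ball": (41.74, 102.54)}
assert billiards_a != billiards_b
deviation = abs(billiards_a["cue_ball"][0] - billiards_b["cue_ball"][0])
assert deviation < 0.1  # similar only to a degree of precision
```

The design choice that does the work is the type system: equality of frozensets of discrete labels is all-or-nothing, whereas equality of floats is effectively never achieved by independent physical reconstruction.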

4. Everything Is Analogue, and Digital too?

A given blob of ink on a piece of paper might be in a particular digital state, but it has several analogue properties, too, such as a color, a shape, a history, a value, etc. In fact, all digital states we have seen so far are states of physical entities, and thus have analogue properties as well. (For our purposes, we can leave aside the question whether abstract objects can be digital.) Being digital is a property of certain physical entities that are also analogue, though they might not be analogue representations. But of which entities? Negroponte puts it nicely: "A bit has no color, size or weight. It is a state of being: on or off, true or false, up or down, in or out, black or white." (Negroponte 1995, 14) But which of the black things are bits? What determines whether something is a bit? It may seem that we can just define what counts as digital as we please, so everything is digital. Say, for example, the two of us agree that if I light a fire on a particular hill, that means the King is out of town. Is the hill henceforth in a binary digital state? Is there anything, then, that is not in any number of digital states? For a given physical thing (say, my desk lamp), there are descriptions as continuous (where is the light, what is its shape, what is its color?) and as digital (is it on or off?), so a natural response is to say that being a digital state is relative to a particular description: under one description the light is digital, under another it is not, so we have at least a relativity of descriptions (Boghossian 2006, 29). This consequence seems very tempting for digital computation and its algorithmic procedures. Alan Turing already seems to have gone in this direction: "The digital computers [...] may be classified amongst the 'discrete state machines', these are the machines which move by sudden jumps or clicks from one quite definite state to another. [...] Strictly speaking there are no such machines. Everything really moves continuously. But there are many kinds of machine which can profitably be thought of as being discrete state machines." (Turing 1950, 439). John Searle takes it one step further: "The electrical state transitions are intrinsic to the machine, but the computation is in the eye of the beholder." (Searle 2004, 64). Oron Shagrir concurs: to be a computer "is not a matter of fact or discovery, but a matter of perspective" (Shagrir 2006, 393), and about algorithms: "whether a process is algorithmic depends on the way we describe the process"; "processes are not really step-satisfaction [algorithmic]. It is simply useful to describe them this way"; "[w]hether a system is digital depends not only on its natural properties, but chiefly on the context in which it is described" (Shagrir 1997, 321, 331, 335). It now seems that we have not only a relativity of descriptions but a description-dependence of facts: it would then be constitutive of being a digital state that its existence depends on contingent social interests, namely the interest in the particular feature that makes a digital state. To illustrate this with a classical example: being digital is more like the word "constellation" than the word "star". What is part of a stellar constellation depends on what we make part of it; what is a star depends on the world and is a matter of astronomical discovery (cf. Boghossian 2006, 18, 28; McCormack 1996).

5. Clarification I: Type/Token

Understanding the true nature of the relativity here requires some further clarifications. Haugeland defines a digital system as "a set of positive and reliable techniques (methods, devices) for producing and reidentifying tokens, or configurations of tokens, from some prespecified set of types", where a positive technique is one that "can succeed absolutely, totally, and without qualification". Many techniques are positive and reliable: shooting a basketball at the basket is a positive method (for getting it through), for it can succeed absolutely and without qualification (Haugeland 1985, 53f). Demopoulos accordingly calls a digital mechanism's being of a certain type its being a member of an equivalence class (Demopoulos 1987). Harnad talks about symbol tokens, but not of types (Harnad 1990, 1.2). The characteristic of multiple realization (see section 3 above) is crucial, so there must be a positive technique to produce perfect realizations that are clearly of this digital state.

Multiple realization, however, is not a feature of certain special types; it is a feature of types quite generally. For example, a transistor can be in a voltage state that is clearly of the type "on" or "off", but it can also be on the borderline between the two; it just so happens that our computing machines are made with systems that do not usually get stuck in intermediate states. Every digital state is also on a continuum: the digital speedometer might change quickly from one number to another, but it does have intermediate states; it is just that these are not states of numbers, not states of these types, of these descriptions. What is crucial here, therefore, is that a digital state is of a type. If it is of a type, then there can be multiple perfect realizations of it: no matter how many borderline cases a type happens to have (some have many, some have none), there is always the possibility of clear cases, and that is what is needed for being a digital state; we need to fulfill the implied semantic normativity of the token of a type. So, we require in a first instance that a digital state is a state that is a token of a type. What we need to see now is which tokens of a type are the digital states.

6. Clarification II: Levels of Description

In a next step, it is helpful to differentiate at least three levels of description of a proposed candidate for being in digital or digital computational states: (a) the physical, (b) the syntactic, and (c) the semantic level, something only very few people do, despite the tradition of functionalism (Pylyshyn 1989, 57; Harnish 2002, 402f). The physical level (a) is that of the physical realization of the computation; this is presumably what Searle had in mind with his "electrical state transitions" (above). That physical state is (b) in a particular digital state on the syntactic level (a binary state, or a number, a letter, a word).
It is at this level that a particular mathematical function is computed; the function is fully specified by specifying it on this level. (c) That digital state may in turn represent something else, e.g. a truth value, a time, or a color; let us call this the semantic level. What is represented at this level may, again, have representational functions on several further levels (the color can represent a political opinion, etc.). A digital computer works because it is constructed in such a fashion that its physical states cause other physical states in a systematic way, and these physical states are also digital states on the syntactic level. The semantic level is not necessarily present and is not necessary for the digital system or digital mechanism. Contrary to popular belief (e.g. Boden 1990; 2006, 1414ff), a computer does not require semantic content to function (Haugeland 1985, 66; 2002, 385). Given this clarification of levels, we can re-evaluate the understanding of digital states. The semantic level allows for a true relativity of facts, not just of descriptions: the same computer following the same algorithm can be said to compute different things. This is hardly surprising. For example, it may well be that what a computer does with the same binary sequence is to add two numbers or to change one letter into another. Whether we want to regard the binary sequence as the one or the other will depend on the context. So, these binary states can have any content, just as "2 + 2 = 4" can add apples or pears. This, however, does not make the addition itself stand in need of interpretation. Contrary to popular belief, it does not show that there is a relativity of facts on the syntactic level, on the level of digital states.

7. Clarification III: A Digital State Is a Token of a Functional Type

It is useful to note that not all systems that have digital states are digital systems.
We can, for example, consider the male and female humans entering and leaving a building as digital states, even as a binary input and output, but in a typical building these humans do not constitute a digital system because a relevant causal interaction is missing. In the typical digital system, there will thus be a digital mechanism, i.e. a causal system with a purpose, with parts that have functions. Digital mechanisms in this sense may be artifacts (computing machines) or natural objects (perhaps the human nervous system). However, it seems clear that not all digital states are parts of computational systems: the words in this paper are digital states, but their function is not computational.
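The point made in section 6, that the same binary sequence may be read as adding numbers or as changing letters, can be made concrete. In this sketch the byte values are arbitrary choices of mine; the fixed bit pattern is the syntactic level, and the two readings are two semantic descriptions of it.

```python
# One and the same syntactic state: a fixed two-byte bit pattern.
bits = bytes([0x48, 0x49])

# Semantic description 1: the pattern represents one 16-bit integer.
as_number = int.from_bytes(bits, "big")

# Semantic description 2: the same pattern represents two ASCII letters.
as_letters = bits.decode("ascii")

assert as_number == 18505   # 0x4849 read as a number
assert as_letters == "HI"   # 0x48, 0x49 read as characters
```

Nothing about the bits themselves settles which reading is correct; that is the relativity of facts on the semantic level. But that both readings operate on the very same, determinate token of a syntactic type is not relative at all.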

If being of a type were the criterion for being digital, then everything would be in any number of digital states, depending on how it is described. What we really should say, however, is that something is digital because that is its particular function. My desk lamp is always in a digital state, because being on or off is part of its function. The first letter of this sentence is in the digital state of being a "T" because that is its function; it is not an accidental orientation of ink or black pixels. The sun, on the other hand, is not in a digital state at present, though it can be shining or not shining at some place. We make artifacts in which some physical states cause other physical states such that these are physical states of the same set of types, e.g. binary states. (Note that one machine might produce binary states in several different physical ways, e.g. as voltage levels and as magnetic fields.) If someone failed to recognize that my laptop computer has binary digital states, they would have failed to recognize the proper function of these states for the purpose of the whole system, namely what it was made for. The fact that a logic gate in my laptop is in a binary state depends on whether it has that function and is not description-dependent. (And the fact that it computes is crucial to its function, but not to that of, say, my shaving brush, so pancomputationalism seems misleading here.) I conclude that we should say a state is digital if and only if it is a token of a type that serves a particular function.

8. Which States Are Digital?

At this point, it is clear that the description-dependence of being digital depends on that of having a function. Functions are a very large issue; let me just indicate why one might think that there are some facts here that are not description-dependent. In the case of an artifact, we assume a functional description. If the oil-warning light on a car dashboard is off, is it in a digital state?
Yes, if its function is to indicate that nothing is wrong with the oil level. (It may serve all sorts of other accidental functions for certain people, of course.) But if the light has no electricity (the ignition is off), or if it was put there as a decorative item, then the lamp is not in a digital state of "off". It would still be off, but this state would not be digital; it would not be a token of the same functional kind. In the case of a natural object, the allocation of proper function depends on a teleological and normative description of systems (cf. Krohs 2007, esp. 2.2), a problematic but commonplace notion. The function of a human's legs seems to be locomotion (and kicking balls), but we are not tempted to say that the legs are in digital states, while perhaps the muscle cells are, with respect to their function. Whether or not the legs are digital, they can be simulated (to an arbitrary degree of precision) on digital systems; only the simulation will not walk, it will just walk in the simulation. In the case of the human nervous system, there are the questions whether it is a digital system on the level of mental functions, and whether it is a digital system on the level of cell properties and interactions. Many neuroscientists think of the latter in digital computational terms. Computationalists think that representational function makes the mental level a digital computational system as well; reproducing it in an AI computer system would thus yield mental properties. What I have tried to show here is that these are questions that deserve answers, not just decisions.

Acknowledgements: I am grateful to audiences at the universities of Tübingen and Mälardalen for useful comments, especially to Alex Byrne and Kurt Wallnau. My thanks to Gordana Dodig-Crnkovic and Luciano Floridi for comments on an earlier draft. Thanks to Bill Demopoulos also.

References

Blachowicz, James (1997) "Analog Representation Beyond Mental Imagery", The Journal of Philosophy 94 (2): 55-84.
Boden, Margaret A. (1990) "Escaping from the Chinese Room", in Boden, Margaret A. (ed.), The Philosophy of Artificial Intelligence. Oxford: Oxford University Press, 89-104. (Original publication 1988.)
Boden, Margaret A. (2006) Mind as Machine: A History of Cognitive Science. 2 vols. Oxford: Oxford University Press.
Boghossian, Paul A. (2006) Fear of Knowledge: Against Relativism and Constructivism. Oxford: Oxford University Press.
Demopoulos, William (1987) "On Some Fundamental Distinctions of Computationalism", Synthese 70: 79-96.
Dodig-Crnkovic, Gordana (2007) "Epistemology Naturalized: The Info-Computationalist Approach", APA Newsletter on Philosophy and Computers 6 (2): 9-14.
Dretske, Fred (1981) Knowledge and the Flow of Information. Cambridge, Mass.: MIT Press.
Dretske, Fred (1995) Naturalizing the Mind. Cambridge, Mass.: MIT Press.
Goodman, Nelson (1968) Languages of Art. Indianapolis: Bobbs-Merrill.
Harnad, Stevan (1990) "The Symbol Grounding Problem", Physica D 42: 335-346.
Harnish, Robert M. (2002) Minds, Brains, Computers: An Historical Introduction to the Foundations of Cognitive Science. Oxford: Blackwell.
Haugeland, John (1981) "Analog and Analog", Philosophical Topics 12: 213-226.
Haugeland, John (1985) Artificial Intelligence: The Very Idea. Cambridge, Mass.: MIT Press.
Haugeland, John (2002) "Syntax, Semantics, Physics", in Preston, John and Mark Bishop (eds.), Views into the Chinese Room: New Essays on Searle and Artificial Intelligence. Oxford: Oxford University Press, 379-392.
Krohs, Ulrich (2007) "Der Funktionsbegriff in der Biologie", in Bartels, Andreas and Martin Stöckler (eds.), Wissenschaftstheorie: Texte zur Einführung. Paderborn: Mentis, forthcoming.
Lewis, David (1971) "Analog and Digital", Nous 5 (3): 321-327.
Macdonald, Graham, and David Papineau, eds. (2006) Teleosemantics: New Philosophical Essays. Oxford: Oxford University Press.
McCormack, Peter, ed. (1996) Starmaking. Cambridge, Mass.: MIT Press.
Millikan, Ruth Garrett (2005) Language: A Biological Model. Oxford: Oxford University Press.
Negroponte, Nicholas (1995) Being Digital. New York: Vintage.
Pylyshyn, Zenon W. (1984) Computation and Cognition. Cambridge, Mass.: MIT Press.
Pylyshyn, Zenon W. (1989) "Computing in Cognitive Science", in Posner, Michael I. (ed.), Foundations of Cognitive Science. Cambridge, Mass.: MIT Press, 49-91.
Raftopoulos, Athanassios, and Vincent C. Müller (2006a) "Nonconceptual Demonstrative Reference", Philosophy and Phenomenological Research 72 (2).
Raftopoulos, Athanassios, and Vincent C. Müller (2006b) "The Phenomenal Content of Experience", Mind and Language 21 (2): 187-219.
Searle, John R. (2004) Mind: A Brief Introduction. Oxford: Oxford University Press.
Shagrir, Oron (1997) "Two Dogmas of Computationalism", Minds and Machines 7: 321-344.
Shagrir, Oron (2006) "Why We View the Brain as a Computer", Synthese 153 (3): 393-416.
Turing, Alan (1950) "Computing Machinery and Intelligence", Mind LIX: 433-460.


Population Thinking, Darwinism, and Cultural Change


Peter Godfrey-Smith (Harvard University)
(Date of publication: 11 June 2007)

Abstract: The application of Darwinian ideas to culture is discussed. I distinguish between populational, Darwinian, and replicator-based views of cultural change. Darwinian and non-Darwinian forms of social learning are discussed in the context of game-theoretic models of behavior. I also defend the importance of reproduction to Darwinism, responding to arguments by Bouchard.

Paper: I. Introduction Recent years have seen a resurgence of interest in evolutionary models of culture. The picture envisaged in much recent work is something like this: the general capacity for culture presumably has a genetic basis and is an adaptation. The capacity for learning by imitation is crucial here (Tomasello 1999). But once this capacity is in place, cultural change tends to acquire its own dynamic, which has a partially Darwinian character (Richerson and Boyd 2004, Dennett 1995, Mesoudi et al. 2004). The most contentious versions of this idea posit discrete cultural replicators, which Dawkins (1976) called memes. These ideas have been discussed at several points in the webconference, for example in Bouchard's and Bryson's papers. Here I will offer some general ideas on the relation between evolutionary theory and cultural change. I argue for a framework that recognizes three nested categories: (i) theories that apply population thinking, (ii) theories that apply the concept of evolution by natural selection (a Darwinian dynamic), and (iii) theories that look for replicators. In the final section I will discuss some arguments in Bouchard's contribution to the webconference, as he sets things up quite differently. II. Population Thinking, Darwinism, and Replication The general picture I will defend can be represented in terms of three nested categories, describing theories of change.

Figure 1: Three categories

In setting up the broadest of the categories I draw on a concept due to Ernst Mayr (1959). He argued that a subtle but important innovation that can be associated with Darwin and his time is population thinking. This involves approaching a domain (the living world, in Darwin's case) in a way that recognizes the reality and causal importance of variation within populations, and avoids treating such
variation as imperfection in the worldly realization of ideal types. This is useful when thinking about the particular contrasts between Darwin and his precursors in biology, but the populational approach I have in mind here has some more concrete features as well. When we embark on population thinking, we think of a system as an ensemble of components that each have a degree of autonomy, a life (or something like a life) of their own, and a significant number of properties in common. Change at the level of the ensemble is a consequence of interactions within the population. I will use the term populational for a framework that applies ideas of that kind. So it contrasts not only with what Mayr called typological or Platonic approaches, but with various other explanatory strategies as well.

The second category I label Darwinian. Here I mean the Darwinian dynamic associated with change within a population by means of natural selection. Darwin described this process in fairly concrete terms, assuming units that were (almost always) organisms in the usual sense. Since then, there have been two ways of trying to abstract the core Darwinian idea, so it can be applied more broadly. One approach I call the classical tradition. The other is the replicator view.

The classical tradition dates at least to Weismann (1909), has perhaps its most-cited formulation in Lewontin (1970), and has also been expressed by many others [1]. The main idea is that we expect evolutionary change whenever we have a population in which there is (i) variation, (ii) which is responsible for differences between individuals in reproductive output, and (iii) which is heritable to some extent. The population here need not consist of ordinary biological individuals. It could consist of any entities at all for which the notion of reproduction is well-defined [2].

Heritability, in the relevant sense, is a statistical and comparative matter. We do not need copying or preservation of structure. Everyone in the population could be unique with respect to the evolving trait. Imagine an asexually reproducing population in which everyone has a different height, and these heights are evenly spaced without even a clumping into rough types. There is no copying of the distinctive properties of an individual in reproduction. But if parents and offspring are more similar than randomly chosen individuals, we have heritability and evolutionary change via selection is possible.
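The thought experiment above can be simulated. In this minimal sketch, which is my own illustration and not from the paper, every individual's height is unique (no discrete types and no copying), each offspring merely resembles its single parent statistically, and taller individuals leave more offspring; the population size, noise level, and generation count are arbitrary choices.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# An asexual population with evenly spaced, all-unique heights (cm).
pop = [150.0 + i * 0.01 for i in range(1000)]  # mean is about 155

def reproduce(parent):
    # Resemblance without replication: the offspring is not a copy of
    # its parent, only statistically similar to it.
    return parent + random.gauss(0.0, 1.0)

for generation in range(20):
    # Differential reproduction: only the taller half leaves offspring.
    parents = sorted(pop)[len(pop) // 2:]
    pop = [reproduce(random.choice(parents)) for _ in range(1000)]

mean = sum(pop) / len(pop)
assert mean > 160.0  # the population mean has evolved upward
```

No structure is ever preserved intact here, so there is nothing that counts as a replicator; yet because parents and offspring are more similar than randomly chosen individuals, selection produces cumulative evolutionary change.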

The second tradition of abstract description of Darwinism uses the idea of a replicator (Dawkins 1976, Hull 1988). Many definitions of a replicator take some notion of copying as a primitive. Dawkins said "[w]e may define a replicator as any entity in the universe which interacts with its world, including other replicators, in such a way that copies of itself are made" (1978: 132). Hull defined a replicator as an entity that "passes on its structure largely intact in successive replications" (1988: 408). This is somewhat metaphorical, but what is being described is in effect a special case of the phenomena recognized by the classical view. We have replicators when we have a population featuring high-fidelity reproduction, and a single parent for each individual. That is indeed how genes are made; in the copying of DNA there is a parent molecule and an offspring molecule, and the copying is very high-fidelity. In many organisms, including us, low-fidelity sexual reproduction at the level of organisms is made possible by high-fidelity asexual reproduction at the level of genes. The defenders of the replicator framework often seem to suppose that something like this is essential to evolution, so that all evolution by natural selection has to involve replicators somewhere. But this is not true. What is needed for evolutionary change is heritability, which is a comparative matter, need not be high-fidelity, and can, in principle, involve contributions from many parents. Evolution involving replicators is a special case [3].

If we apply this framework to the case of cultural change, we see that there are three questions to ask. To what extent should cultural change be treated as a populational phenomenon? To what extent should it be treated as a Darwinian phenomenon? Is it possible and useful to recognize cultural replicators?

It might initially seem obvious that cultural change is a populational phenomenon, but in the present sense this claim is far from trivial. Although cultural phenomena of all kinds clearly depend on the activities of individuals who make up populations, it is possible for a population to generate products that are not best treated in populational terms. Many cultural phenomena are like this. Persisting community-level artifacts like buildings and roads are the consequences of activities of a population, but once they exist their ongoing role is not populational in character. In fact, structures like these may affect behavior in ways that reduce the populational character of social life. Culture is populational to the extent that it can be modeled informatively as interaction between partially autonomous individuals who share a significant number of properties. A highly structured network with heterogeneous and non-interchangeable parts is a different thing from a population. This is part of what Fracchia and Lewontin are pointing at in their vigorous rejection of evolutionary models of culture (1999), though they express the point in different terms. But these anti-populational observations should not be made in a wholly general way. Simpler forms of culture may have a more populational character than more complex and refined forms. An initial populational mode of interaction may give rise to something else.

There is also a related point that is simpler. Societies with top-down control are less amenable to a populational treatment. The ideas that proliferate are those that come from a certain location in the society, regardless of their content and local consequences. In many cultures, personal fashion choice is a fairly populational phenomenon, but in a sufficiently autocratic society it will not be. (Gold teeth are presently banned in Tajikistan, solely because of the whim of an autocrat.) Ken Reisman argued in his PhD dissertation (2005) that Darwinian models of culture become inapplicable to the extent that power relations are asymmetric. I am broadening that claim to one about populational models in general.

But suppose that some cultural phenomena are populational. This might include changes in patterns of individual everyday behavior (eating, cooperating, communicating). Are such processes also Darwinian? There are two ways that such processes might be treated in a Darwinian manner.

First, we might treat the individuals in the population as ordinary biological individuals, and treat cultural properties as aspects of phenotype. This will work easily to the extent that cultural transmission is vertical, from parent to offspring. The reproduction relation remains an ordinary biological parenting relation. It will often be that offspring culturally resemble their parents more than they resemble randomly chosen members of the population, and differential reproduction can then yield evolutionary change. Things get much more complicated to the extent that cultural transmission is not vertical, but I won't worry about those problems here.

The second option is to treat the instances of cultural traits as making up their own Darwinian population. The reproduction relation is now between the instances of a cultural trait; your best friend's Catholicism might be said to be the parent of your Catholicism. Does it make sense to use the concept of reproduction in this way? I think it makes some sense in some cases. In general, the notion

of reproduction is a vague and gradient one. Paradigm Darwinian processes involve populations with clear reproduction relations between individuals, but there are also processes that have some Darwinian characteristics because the population exhibits something like a reproduction relation. That applies to both biological and cultural cases; the lines in Figure 1 should be understood as somewhat blurred, rather than sharp. Reproduction involves the generation of a new entity of the same kind as the parent(s), with a certain kind of causal responsibility from parent to offspring. Some ways in which a cultural variant can be passed on or reappear in a new individual are reproduction-like, but this depends on the cognitive processes involved. The simplest kinds of imitation have some of this character, though there are interesting disanalogies even then [4]. For example, in imitation learning, it is the recipient's dispositions that are causally responsible for there being a similarity between a parental and an offspring instance of a trait, though the particular features of the offspring are then a function of the parent. And as Sperber (1996) and others have argued, most social learning is not passive in the ways that I here associate with a reproduction relation. But some simple forms of social learning might have enough of a reproduction-like relation between instances of a trait for a Darwinian framework to be applied. As emphasized earlier, the claim that there are cultural replicators is a stronger one again.

In assessing these ideas it is useful to focus on recent formal models of behavioral change via social learning (Skyrms 2003, Nowak 2006). These are models of behavioral change in which individuals interact locally, receive payoffs, and update their behaviors as a function of their experience. These models make use of a number of different update rules and dynamics. They are all rules in which an individual derives its phenotype from local influences, via a function of some kind. But having one's behavior be a function of the attributes of one's neighbor(s) is not the same thing as having one's behavior be a copy of some neighbor. The latter is a special case of the former.

We might represent some of this structure as follows. Suppose that via observation an individual is to set the value of some behavioral characteristic Z for the next time step, Z(t+1). Z is a continuous variable (though it might be the probability of making some binary choice, such as cooperation as opposed to defection). Assume the individual has n neighbors, where neighbor i's behavior at t is represented as Xi(t). Z(t+1) will be a function of the phenotypes of the neighbors, their payoffs (Wi) at t, and the individual's previous state Z(t) and payoff V(t). So in general, Z(t+1) is some function of the following variables: (X1(t), X2(t), ... Xn(t), W1(t), W2(t), ... Wn(t), Z(t), V(t)). There are many possible rules, but some of them can be represented like this:

(1)

Z(t+1) = αZ(t) + βX* + (1 − α − β)(Σi Xi(t))/n

Here X* is the behavior of the neighbor with the highest payoff at t, or the behavior of the focal individual at t if its payoff was higher than any neighbor's. The weights α and β sum to no more than one. The idea is that any individual's new behavioral choice can be sensitive to (i) what it did last time, (ii) the recent success of behaviors exhibited by neighbors, and (iii) the local prevalence of those behaviors. An individual can give some role to inertia, some to tracking what has recently worked, and some to doing what is common. The α and β parameters reflect how much weight is given to each factor. So when β = 1 we have an "imitate your best neighbor" dynamic; when α = 1 the individual never changes.
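As a rough sketch, update rule (1) can be written out in code. The function and variable names below are mine, not drawn from the Skyrms or Nowak models themselves:

```python
def update(z, v, xs, ws, alpha, beta):
    """One application of update rule (1) for a single individual.

    z, v   -- the focal individual's behavior and payoff at time t
    xs, ws -- the neighbors' behaviors and payoffs at time t
    alpha  -- weight on inertia (keep doing what you did last time)
    beta   -- weight on imitating the most successful behavior
    The remaining weight, 1 - alpha - beta, goes to conformity
    (matching the local average behavior).
    """
    assert 0 <= alpha and 0 <= beta and alpha + beta <= 1
    # X*: the behavior of the highest-payoff neighbor, or the focal
    # individual's own behavior if its payoff beats every neighbor's.
    best_payoff, x_star = max(zip(ws + [v], xs + [z]))
    conformist = sum(xs) / len(xs)
    return alpha * z + beta * x_star + (1 - alpha - beta) * conformist
```

Setting beta = 1 recovers the pure "imitate your best neighbor" dynamic, and alpha = 1 gives an individual that never changes, matching the limiting cases described in the text.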

It would be interesting to see what the consequences are of various intermediate values of α and β, in different contexts. Equation (1) also omits another way for success to figure in behavioral updating, which is via a best response rule. An individual can produce on the next time step the behavior that would have been the best overall response to the behaviors produced by its neighbors last time. This is another way in which behavior can be a function of what was done earlier, but it is certainly not intrinsically Darwinian. In some cases the best response to X is X (coordination games), but in others the best response to X is Y (e.g., the hawk-dove game).
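To make the contrast concrete, here is a minimal sketch of a best-response rule in a hawk-dove game. The payoff numbers are illustrative choices of mine (benefit 4, fight cost 6), not values from the text:

```python
# Illustrative hawk-dove payoffs for the row player.
PAYOFF = {
    ("hawk", "hawk"): (4 - 6) / 2,  # split the benefit, split the fight cost
    ("hawk", "dove"): 4,            # hawk takes everything
    ("dove", "hawk"): 0,            # dove retreats
    ("dove", "dove"): 2,            # share the benefit peacefully
}

def best_response(neighbour_moves):
    """Return the move that would have maximised total payoff against
    the neighbours' moves in the previous round."""
    return max(("hawk", "dove"),
               key=lambda m: sum(PAYOFF[m, n] for n in neighbour_moves))
```

Against an all-hawk neighbourhood the best response is dove, and against all doves it is hawk: unlike imitation, the rule need not reproduce the behaviours it observes, which is why best response is not intrinsically Darwinian.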

This process of behavioral change by individuals could operate on top of an ordinary Darwinian process involving biological reproduction. An individual will then be born with an initial behavior, and will update it according to some specific rule. The learning rules (e.g., α and β) could then evolve. When does an individual do best to stick with the behavior it inherited from its (evidently) successful parent, and when does it do best to adjust in the light of experience? If it adjusts, how quickly should it do so? And should it adjust by simple imitation, by success-modulated imitation (imitate your best neighbor), by an even smarter best response rule, or by a non-social rule such as trial-and-error? [5] This gives the model a link to the larger literature on the evolution of learning (Stephens 1991, Godfrey-Smith 1996, Kerr forthcoming), and on the evolution of imitation learning as opposed to other kinds (Richerson and Boyd 2004). In this scenario, Darwinian processes give rise to a rule for social learning, and this rule may or may not have a partially Darwinian character itself. The rules that produce a partially Darwinian dynamic will be, as argued above, the simple forms of imitation. [6] Darwinian processes may produce non-Darwinian forms of social learning.

III. Origin Explanations and Persistence

In this final section I will contrast part of the discussion above with some claims made in an interesting earlier paper from this webconference, by Frédéric Bouchard.

Bouchard claims that it is a mistake to see all Darwinian processes as involving reproduction or replication. Differential persistence is sufficient for Darwinian change, and once the notion of persistence is suitably broadened, it may be the key to understanding all cases of change by the Darwinian mechanism. Bouchard then notes that this move may help us understand cultural change, because many cultural cases are like the biological ones in which reproduction is problematic but persistence is not. To assess reproductive output, we must be able to count offspring, which is hard in some biological cases and in many cultural ones. It is more straightforward to assess differential persistence.

In the framework outlined above, pure cases of differential persistence count as populational phenomena, but not Darwinian ones. This is because I required reproduction for Darwinian change. (Note that when Bouchard claims that some of his phenomena do not involve populations, this is because he is assuming that all populations include reproduction. So his population concept is narrower than mine.)

I think that Bouchard is right to press the importance of cases (both in biology and culture) in which reproduction is a difficult concept to apply. So I will say something about why I set things up differently from him.

If we have a set of objects which do not reproduce in any sense, and some process then acts as a filter that eliminates some and retains others, this is certainly a Darwin-like process. Standard definitions of evolution by natural selection handle the case awkwardly. Some (Lewontin 1980) explicitly require reproduction. Some (Lewontin 1970) are ambiguous. And others, such as a definition in terms of change in gene frequencies, would include such cases. As a matter of terminology, both broader and narrower uses of the key terms could be defended. On the narrower use, differential persistence is only part of a genuine Darwinian process, not sufficient alone.

But the matter is not merely terminological. Here is one argument for resisting Bouchard's proposal to make persistence primary, and treat reproduction as optional or as a special case of persistence. Let us distinguish two kinds of explanations in which natural selection can figure: Distribution explanations and Origin explanations. (These terms are modified from Karen Neander (1995), who distinguished Creation and Persistence explanations in a debate with Elliott Sober. I also broaden both her categories here.) In a Distribution explanation, we assume the presence of some range of variants, and explain how they came to be distributed as they are: why some are common and others rare, why one has gone to fixation, or why an equilibrium is being maintained. In an Origin explanation, we explain how some particular variant came to exist in the population at all, regardless of its frequency and regardless of which individuals bear the relevant trait. It is obvious that natural selection figures in Distribution explanations; it is less obvious but very important that it can (in some circumstances) figure in Origin explanations. Suppose we want to explain how some novel adaptive characteristic came to appear. Novelty arises proximally, in an evolutionary context, via mutation and recombination. But natural selection can reshape a population in a way that makes a given combination of characteristics much more likely to be produced via mutation and recombination than it would otherwise be. It does this by making intermediate stages common rather than rare, thus increasing the number of ways in which a given mutational event (or similar) will suffice to produce the combination in question. (See Forber 2005 for a discussion that clarifies and isolates this role particularly well.)

We can then note that if we have no reproduction in a population, then although there can be the kind of filtering or culling that has a partially Darwinian character, we cannot have the kind of natural selection that is involved in Origin explanations. All that can happen is that pre-existing types are retained, or not retained. The long-lived entities may still change via developmental processes; a population that has no reproduction need not be static. (This is what Lewontin 1983 called a transformational mode of evolution.) But the particular way in which natural selection can reshape a population in a way that makes otherwise improbable new variants accessible is a process that requires reproduction.

I can think of some replies that might be available to Bouchard here. He might argue that even if novelty is only arising by developmental processes that do not involve reproduction, still differential persistence may be important in Origin explanations. The longer an entity persists, the better chance it has of taking a developmental path that leads to a given novel state. That effect does seem real, but less powerful than the distinctive way in which selection is relevant to Origin events in the normal Darwinian cases. In those cases, the idea of reproduction of countable offspring is essential, because the role of selection is to increase the number of independent slots, often by a very large factor, at which a novel variant can arise. (And I think that Bouchard's interesting example involving mud does have reproduction in this sense.)

What does this mean for culture? It might turn out that real cases of cultural change are only Darwinian in the (for me) very marginal sense that does not involve reproduction. If it is also true that this precludes selection from figuring in Origin explanations, then selection would have a much less important role in cultural cases than it has in biological ones. The other possibility, discussed in the previous section, is that some cases of cultural phenomena can be usefully treated as Darwinian in the richer sense that does involve reproduction.

References

Bouchard, F. 2007. Ideas that Stand the [Evolutionary] Test of Time. A&R webconference: http://www.interdisciplines.org/adaptation/papers/12
Bryson, J. 2007. Representational Requirements for Evolving Cultural Evolution. A&R webconference: http://www.interdisciplines.org/adaptation/papers/13
Dawkins, R. 1976. The Selfish Gene. Oxford: Oxford University Press.
Dawkins, R. 1978. Replicator Selection and the Extended Phenotype. Zeitschrift für Tierpsychologie 47: 61-76. Reprinted in E. Sober (ed.), Conceptual Issues in Evolutionary Biology. Cambridge MA: MIT Press.
Dennett, D. 1995. Darwin's Dangerous Idea. New York: Simon and Schuster.
Forber, P. 2005. On the Explanatory Roles of Natural Selection. Biology and Philosophy 20: 329-342.
Fracchia, J. and Lewontin, R. C. 1999. Does Culture Evolve? History and Theory 38: 52-78.
Godfrey-Smith, P. 1996. Complexity and the Function of Mind in Nature. Cambridge: Cambridge University Press.
Godfrey-Smith, P. 2000. The Replicator in Retrospect. Biology and Philosophy 15: 403-423.
Godfrey-Smith, P. (forthcoming). Conditions for Evolution by Natural Selection. Journal of Philosophy.
Hull, D. 1988. Science as a Process. Chicago: University of Chicago Press.
Kerr, B. (forthcoming). Niche construction and cognitive evolution.
Lewontin, R. 1980. Adaptation. Reprinted in R. Levins and R. Lewontin, The Dialectical Biologist. Cambridge MA: Harvard University Press, pp. 65-84.
Lewontin, R. C. 1970. The Units of Selection. Annual Review of Ecology and Systematics 1: 1-18.
Lewontin, R. C. 1983. The Organism as the Subject and Object of Evolution. Scientia 118: 63-82.
Mayr, E. 1959. Typological Versus Population Thinking. In Evolution and Anthropology: A Centennial Appraisal. Washington: Anthropological Society of Washington.
Mesoudi, A., A. Whiten and K. Laland. 2004. Perspective: Is Human Cultural Evolution Darwinian? Evidence Reviewed from the Perspective of the Origin of Species. Evolution 58: 1-11.
Neander, K. 1995. Pruning the Tree of Life. British Journal for the Philosophy of Science 46: 59-80.
Nowak, M. 2006. Evolutionary Dynamics: Exploring the Equations of Life. Cambridge MA: Harvard University Press.
Reisman, K. 2005. Conceptual Foundations of Cultural Evolution. PhD Dissertation, Philosophy Department, Stanford University.
Richerson, P. and R. Boyd. 2004. Not By Genes Alone: How Culture Transformed Human Evolution. Chicago: University of Chicago Press.
Skyrms, B. 2003. The Stag Hunt and the Evolution of Social Structure. Cambridge: Cambridge University Press.
Sperber, D. 1996. Explaining Culture: A Naturalistic Approach. Oxford: Blackwell.
Stephens, D. 1991. Change, Regularity, and Value in the Evolution of Animal Learning. Behavioral Ecology 2: 77-89.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge MA: Harvard University Press.
Weismann, A. 1909. The Selection Theory. In A. C. Seward (ed.), Darwin and Modern Science. Cambridge: Cambridge University Press.
[1] Weismann: "We may say that the process of selection follows as a logical necessity from the fulfillment of the three preliminary postulates of the theory: variability, heredity, and the struggle for existence" (1909:50). Thanks to Lukas Rieppel for this reference.

[2] For a detailed discussion of these summaries, see Godfrey-Smith (forthcoming). There are a number of exceptions to the general claim that change can be expected when those three conditions hold.

[3] For more detail here, see Godfrey-Smith (2000).

[4] What I call simple imitative learning here counts as a fairly sophisticated skill in the sense of Bryson's taxonomy.

[5] Bryson emphasizes the downside of simple imitative learning in her paper: "where cultural evolution exists, it must coevolve with a set of constraints that damp its effects on the society and its ecosystem.... [I]f you are in a room with other people, look around yourself. Would it be a good idea if all of you converged on identical behaviour right now?"

[6] There is also a non-social sense in which trial-and-error learning has a Darwinian character, as has long been acknowledged.


Representational Requirements for Evolving Cultural Evolution


Joanna Bryson (Department of Computer Science, University of Bath, UK)
(Date of publication: 28 May 2007)

Abstract: Why are humans the only species exhibiting exponentially accumulative culture? Language obviously facilitates this process, but language is itself an example of an accumulated cultural artifact, one far more elaborate and complex than any other evolved signalling system. We now know that other species regularly exploit culturally-transmitted behaviours, so the basic capacity for social learning is present in these species, and has further proved adaptive at least in limited forms. In this article I propose that for most species the adaptive rate of cultural evolution is bounded by ecological pressures, but that in the case of humans a uniquely rich representational substrate allowed the evolution of intricate norms and behaviours. This allows cultural evolution to find even complex sustainable behavioural strategies.

I. Introduction

Why are humans the only species exhibiting exponentially accumulative culture? Language obviously currently facilitates this process, but language is also an example of an accumulated cultural artifact. At best, the development or evolution of language co-evolved with our cultural acquisition capacity; more probably, this capacity preceded it. We know other species regularly exploit socially-transmitted behaviour (Franks and Richardson, 2006; van Schaik et al., 2003; Perry et al., 2003; Galef Jr. and Laland, 2005; de Waal and Johanowicz, 1993; Whiten et al., 1999). Thus the basic capacity for social learning is present in these species, and has further proved adaptive at least in limited forms. In this article I review the representations underlying social learning. I then propose that for most species the adaptive rate of cultural evolution is bounded by ecological pressures, but that in the case of humans a uniquely rich representational substrate has allowed the coevolution of intricate norms and behaviours. This allows human cultural evolution to find sustainable behavioural strategies.

II. Representations Underlying Social Learning

Learning is easy, assuming that all you mean by `learning' is changing values inside a representation. Constraining learning so that it does something useful is the hard problem. This was the conclusion of Marler (1991) after he examined the surprising diversity of mechanisms that have evolved to satisfy one relatively simple problem: the transmission of birdsongs between individuals of a species. This has also been the experience of artificial intelligence (Bishop, 2006). The lesson from the current emphasis in machine learning on Bayesian statistics is that the hardest part of learning is establishing a characterisation of the learning space (Wolpert, 1996; Chater et al., 2006).
In Bayesian terms, the problem is selecting an appropriate class of models for the learning domain, part of the task of establishing an appropriate set of prior probabilities. These mathematical results on the complexity of the learning problem help explain why the vast majority of learning in nature is carefully specialised to task (Gallistel et al., 1991; Roper, 1983). The results further suggest that, for species that do possess some general learning capacity, the probability of an individual stumbling across a useful piece of knowledge within its lifetime is not necessarily high. Where an individual agent, either animal or artificial, does have the capacity for general learning, it may very well be in its interest to learn knowledge that has proven useful to other similar agents. This line of reasoning has led to the recent surge in interest in culture and social learning.

This section begins with a basic taxonomy of social learning. From this I will derive which representations are needed to explain the various components of such learning. This will begin to differentiate the capabilities of different species, and help explain what determines the fidelity or granularity of behaviour replication. This will in turn help return us to the question of what makes human culture different.

Decomposing Social Learning

This is a simplified taxonomy of the forms of generic social learning exhibited in nature. For more elaborate taxonomies and more complete descriptions, see Zentall (2001) or Whiten (2006). In the descriptions below, the model is another agent that already holds and expresses a behaviour being socially learned.

Social facilitation: The increased propensity to express an already known behaviour when others express it. The classic example is yawning. However, this can also lead to learning to express a behaviour in a particular context.

Local enhancement: An agent acquires a propensity to be in a particular area, which in turn (and in combination with other species-specific biases) leads to its displaying a similar behaviour. For example, an agent that follows another into a patch of novel food may subsequently discover that the food is edible just through random exploration. This is an example of social learning where a new behaviour is learned, but not directly from the model agent. Rather, a small amount of information from the model facilitates individual learning by the agent.

Stimulus enhancement: An agent becomes interested in an object another agent has acted upon, and in the course of exploring that object discovers affordances known to the previous agent, thus now expressing a similar behaviour.

Goal emulation: An observing agent notices a model has accomplished something interesting, and acquires the goal of accomplishing the same thing. Again, with enough species-specific and/or environmental constraints, the end behaviour itself may be quite similar, or the agent may find quite a different way of achieving the same goal. But the observing agent's new behaviour would have been very unlikely to be expressed without the observation of the model's achievement.

Program-level imitation: Postulated originally by Byrne (1995) and supported by Byrne and Russon (1998), program-level imitation is the acquisition of sequential or even hierarchical `plans' organising actions into complex behaviours. Byrne and Russon (1998) give the example of an orangutan living near a camp that begins doing laundry. This is also sometimes referred to as `staged emulation', because the individual actions are not necessarily learned anew; rather, the combination of the actions is associated with each other and with a set of stimuli.

Gestural or vocal imitation: Precise imitation of continuous manual or vocal gestures. This is closest to the ordinary-language meaning of imitation, such as copying an accent or repeating an exact verbal phrase or posture.

I have not addressed here the highly specialised, species-specific, evolutionarily `ritualised' forms of learning, such as tandem running in ants or imprinting in hatchling birds. In reality, though, the extent of general-purpose learning is often overestimated (Gallistel et al., 1991; Roper, 1983). Even simple stimulus-response conditioning does not work for all stimuli to all responses. Pigeons can learn to peck for food, but cannot learn to peck to avoid a shock. They can, however, learn to flap their wings to avoid a shock, but not for food (Hineline and Rachlin, 1969). Similarly, rats presented with `bad' water learn different cues for its badness depending on the consequences of drinking it. If drinking leads to shocks, they condition to visual or auditory cues, but if drinking leads to poisoning they learn

taste or smell cues (Garcia and Koelling, 1966). Such examples indicate learning biases in the brain, at the level of associative learning. Also often overlooked are biases provided in terms of motor and perceptual capacities. If an animal cannot perceive something (e.g. colour), it may be because no chance mutation has ever led to that capacity, but it may also be because colour perception adds no net value to that animal's behaviour repertoire, and might distract it with irrelevant detail.

Primitive Elements of Social Learning

Computer scientists -- including those who build Artificial Life (ALife) and Artificial Intelligence (AI) -- often speak of `primitives'. Primitives are the fundamental components, the building blocks of behaviour. Like atoms, these primitives are themselves constructions (e.g. of neural coding), but any particular discussion requires basic units at some level of abstraction. In ALife or AI the primitives are built in conventional computer code. The intelligence of the system must then express them in reasonable contexts and orders. There is some evidence that brains also work this way, with complex gestures and stimuli being represented (and even generated) by single nerve cells (Perrett et al., 1987; Rizzolatti et al., 2000; Graziano et al., 2002). In an attempt to understand the representations underlying social learning, I will begin by defining a few primitive elements or actions. Notice that not all of these primitives will necessarily appear in the final theory. Rather, I am starting with a set of primitives I believe underlie common theories of social learning, but I will not necessarily support them.

Context Identification: The learning necessary to recognise a particular stimulus or, more likely, stimulating situation. This is a form of perceptual memory. It cannot be a simple retinotopic map (e.g. to remember an image), since exact visual context matches are exceedingly rare. Rather, it must be sufficiently abstract to generalise.

Goal Mapping: The attribution to another agent of a particular aim, desire or intent. It is generally believed that such goals can only be identified through being mapped to a similar sort of aim, desire or intent of the observing agent. E.g.: "Maybe she did that because she was hungry (like me)."

Action Mapping: The association of a behaviour or behaviour element of the observed animal with a similar behaviour within the repertoire of the observer. To keep things simple, we take `behaviour' in a very general sense here, including perceptual acts such as focusing the attention necessary to a task, as well as gross motor movement.

Body Mapping: The identification of a particular body part of an observed agent with the corresponding body part of the observer.

Coordinate Mapping: The identification of a particular location in space with respect to the observed agent with the equivalent egocentric-space coordinate for the observer.

Notice that the preceding definitions of elements specific to social learning necessarily imply a set of representational primitives: contexts, goals, actions, body parts, and coordinates. In addition, we might also assume the presence of several more general-purpose abilities: the ability to associate two primitives (for example a context to a goal), the ability to chain two items (for example two sequential steps in a procedure), the ability to heighten attention to a particular context, and the ability to desire (acquire) a new goal.

Again, I am not proposing that all these capacities are available in all (or even any) agents capable of social learning. I am claiming that these capacities are needed in order to display all the forms of social learning mentioned in the original taxonomy.

Analysing the Taxonomy

To begin with, there is no social learning without individual learning. In fact, social learning can be seen as a special case of individual learning -- a set of evolved biases for acquiring information by exploiting the knowledge of others (Bryson and Wood, 2005; Wood and Bryson, 2007). Social facilitation, local enhancement and stimulus enhancement are very little more than individual learning. Local and stimulus enhancement assume context identification, plus either an association with an established behaviour or the individual learning of a new behaviour, either of which occurs as a consequence of being attentive to the location or stimulus. These forms of social learning in no way assume goal, action or body mapping. Social facilitation requires no learning at all, although it may result in learning that increases the probability of associating some context with some known action, in the case where the social facilitation keeps happening in the same context. An example of learning resulting from social facilitation might be the gradual social tuning of the context in which innate warning cries are expressed by vervet monkeys (Seyfarth et al., 1980). Goal acquisition through emulation might seem as simple as stimulus enhancement, since it too might require the acquisition of only a single primitive element, the goal. However, motivations are fundamental to an agent's intelligence, and it is not easy to see how a totally new goal, with its associated drives and emotions, would be incorporated into an agent. Goal emulation may be more like operant conditioning.
An action or a perceptual context might become identified with a pre-existing drive, and thus become desirable itself. This reduction can be applied to simplify or eliminate goal mapping as a primitive. Goal emulation could be accounted for through action mapping, with the additional recognition or association of the observer's own desire with its perception of the target's action. If the two animals are in a similar state, whether due to shared history (e.g. a troop hasn't eaten yet today) or shared responsiveness to a perceptual context, then the probability of sharing a drive may be high enough for reasonably accurate learning to occur. At its simplest, then, goal emulation might be viewed as the association of a behaviour to a context, where that context is some combination of a perceptual context and an internal drive. Put even more simply, it is socially acquired stimulus and response. Program-level imitation (or staged emulation) is essentially a set of goal emulations -- or a structured association of contexts to actions. The extent of this structure is much debated. It is tempting to take what appears to be the simplest explanation, and assume that simply associating sufficient perceptual context (perhaps including recent memory of prior events) to action responses will allow an otherwise undifferentiated set of stimulus-response pairs to form the representation for learning a new task. However, this is not what humans or animals appear to do. In extensive experimentation with modelling human learning, Anderson et al. (1997) determined that intelligence driven by sense-action pairs requires specification of a subset of pairs to be active in a particular task context. Even within the task-specific subset, they also require each pair to be associated with a probability of being useful, referred to as a utility value. In my own research, I have found evidence that even this amount of information is not sufficient.
Rather than probabilities of success, accurate representations of priority of one task-element over another are needed to guarantee task consummation. We have evidence that this better describes the behaviour of monkeys at least (Bryson and Leong, 2006; Wood et al., 2004), as well as being a useful representation for organising artificial intelligence (Bryson and Stein, 2001; Bryson, 2003). There is also evidence of neural representations for meta-level task information such as order in a

INTERDISCIPLINES.ORG
Adaptation and Representation
sequence (Tanji, 1996). Whiten (1998) has reported that chimpanzees not only imitate hierarchical behaviour, but do so more accurately on subsequent trials if the demonstration is repeated. This increase in fidelity -- essentially moving from goal emulation to program-level imitation -- might result from better learning of the task's affordances, facilitating a lower-cost representation and thus easier learning. Or there may be a social drive to emulate with more care when prompted by a repeated demonstration. However, these increasing-fidelity results have not yet been well-supported through replication, although the hierarchical structure of social task learning has (Whiten et al., 2006). If there are limits to the number of discrete task steps that can be imitated programmatically, then this indicates that gesture imitation (by which I include vocal-gesture imitation) may require a completely different representation. One could imagine that gestures could be extended sequences of many body or coordinate mappings. However, it is well-established that there is no neurological means by which rapid sequences of action expression can be launched independently, each in response to the other (Lashley, 1951; Henson and Burgess, 1997; Davelaar, 2007). In other words, rapid action sequences cannot be produced by chaining, with each muscle firing triggered by feedback from the one before. I believe the capability for high-fidelity, temporally-accurate gesture imitation may be the key to the puzzle posed in the introduction to this article -- why human culture is different, at least from other primates'. I will explain how and why below. But for now, I will stick to the issue of representations. The relevant evidence here is that there is no sign that primates other than humans have the representation necessary for gesture or vocal imitation (cf. Fitch, 2000, for the vocal case in particular). But if apes are not capable of full gesture imitation, how can they perform `do as I do' tasks?
These involve imitating the gestures of a demonstrator (normally human) such as clasping one's self, or jumping up and down (Custance et al., 1995). These sorts of imitation certainly do require some kind of body mapping, and a process of action sequencing. But because chimpanzee and human bodies are similar, it may be that a very low-resolution representation of the body configuration at the start and end points of the demonstration is sufficient to generate comparable actions within the tolerance required by those coding this research (see Custance et al., 1995, for further discussion). When species are less closely related, less careful body mapping is sometimes demonstrated (Custance et al., 1999). Even in human children, precise body mapping is only followed when the children assess it to be an important part of the demonstration (Gergely et al., 2002).

III. What makes humans different?

I now return to the question of why humans are unique in having exponentially accumulating culture. My explanation hinges on a difference in representational capacities.

Underlying Representations of Social Learning

The evidence of the previous section leads to the following conclusions about representation. The majority of social learning observed in nature does not require complex information such as exact locations, temporal scripts, or even the number of iterations involved in steps with distinct cycles. Rather, it can be summarised as learning salient contexts, optionally paired with learning actions appropriate to those contexts. Species that perform precise vocal (or other gestural) imitation may require a different, specialist representation to encode temporal `scripts' with rich information. As discussed above, research into sense-action pairings as a basis for intelligent action is extensive in both AI and Cognitive Science. Forming and ordering these pairings may be a function of the
hippocampus (Bryson and Leong, 2007). The special case of vocal imitation in songbirds (and also parrots) has been the subject of extensive neuroscience research (see e.g. Leonardo, 2004). The upshot seems to be that a special neurological substrate is required, and it is not capable of learning and production at the same time. Since vocal imitation is not a common trait in nature, we must assume it evolves independently where it emerges (Marler, 1991), so bird results cannot necessarily be generalised to species like humans. I am not aware of similar neuroscience explanations of human vocal imitation. But because this representation may be a key to our cultural difference, I will review what information I have been able to find, which is largely due to Pöppel. Pöppel (1994) documents a privileged representation of phrases, within which humans are capable of precise temporal memory and replication. These have a maximum duration of two to three seconds -- the exact duration seems to be under intelligent (though not deliberate) control and is situation-appropriate. That is, we tend to remember salient phrases of speech, music or gesture with a memory span of appropriate length. The maximum possible duration of such episodes is presumably a cognitive constraint. Pöppel draws attention to the fact that most poetry and music consists of phrases of this length.

Implications for Cultural Evolution

That humans have this extra capacity while other primates do not is probably an accident of sexual selection (Vaneechoutte and Skoyles, 1998). As mentioned earlier, this accident may have provided us with a representation suitable for a memetic cultural-evolution explosion. Because so much more information is stored in the three seconds of detailed transcription than in the simple context-action pairs underlying program-level imitation, knowledge represented in this domain can be highly redundant.
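One way to see how much richer a temporally precise script is than a context-action pair is to count the choices each must specify. A back-of-the-envelope sketch, in which every number is an arbitrary illustration rather than an empirical estimate:

```python
# Back-of-envelope comparison: information in a single context-action pair
# vs. a ~3-second temporally precise script. ALL numbers are arbitrary
# illustrations, chosen only to make the orders of magnitude visible.
import math

def bits(choices):
    return math.log2(choices)

# A context-action pair: pick one of 100 contexts and one of 50 actions.
pair_bits = bits(100) + bits(50)            # about 12 bits

# A 3-second script with 30 elements, each one of 50 actions placed at
# one of 20 distinguishable timings.
script_bits = 30 * (bits(50) + bits(20))    # about 300 bits

assert script_bits > 10 * pair_bits         # over an order of magnitude richer
```

Under these (invented) parameters the script carries roughly 25 times the information of the pair, which is the headroom that makes redundancy, and hence robust quasi-Darwinian transmission, possible.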
This redundancy in turn can provide robustness where important data is stored, protecting it from an unsupervised process like evolution, and enabling operations analogous to cross-over and mutation to begin a full Darwinian process. I discuss the support for and implications of this idea elsewhere (Bryson, 2008). In the introduction to this article I mentioned another, currently unsupported hypothesis I am just beginning to research. This is that cultural learning is rare not because the mechanisms of learning required for an individual learner are difficult to evolve in themselves, but because of the impact on the ecological and social system supporting the learners. I believe that while cultural evolution has the potential to be a powerful means of searching for new and more optimal behaviour, where cultural evolution exists it must co-evolve with a set of constraints that damp its effects on the society and its ecosystem. In the basic cases, this is obvious if you think about it -- if you are in a room with other people, look around yourself. Would it be a good idea if all of you converged on identical behaviour right now? The crossover equivalent in social learning, the mechanism of recombining good tricks (Dennett, 1995) from other conspecifics, must include a mechanism for maintaining diversity and supporting individual survival. One can also think about this in terms of longer-term consequences, such as population bubbles when a new, rich food source is discovered and then driven to extinction.

Open Research Problems

Let us take the perspective that social learning is a risky strategy prone to positive feedback cycles that could result in a population's extinction, and that can only be stabilised with carefully co-evolved limits and damping mechanisms. Given this perspective, we can argue that we do see cumulative cultural evolution in a number of species (Franks and Richardson, 2006; van Schaik et al., 2003; Perry et al., 2003; Galef Jr.
and Laland, 2005; de Waal and Johanowicz, 1993; Whiten et al., 1999). It just isn't accumulating as quickly as human culture, partially because the rate of change is actively
damped by biological evolution. This perspective leads to a number of open research questions, including the following.

Why do species capable of cultural evolution have such extended periods of development? I suspect that a long development period is necessary for any species that learns novel behaviour (and so is a candidate for cultural evolution) because the individual experiences must be carefully integrated back into the general knowledge set. Development, with its different phases of learning, may be a key form of biological damping for cultural evolution.

Why do primates learn to recognise behavioural patterns more quickly than they learn to express them? This phenomenon, also described as looking vs. knowing, has been well-documented in infants (e.g. Hood et al., 2000; Spelke et al., 1992) and monkeys (e.g. Santos and Hauser, 2002). If my hypothesis is correct, then the explanation here is that new, relatively uncertain knowledge can be used to inform choices in observation, but should not be used to inform action until properly processed and integrated.

Why are humans the only species that has language and rapidly accumulating cultural evolution? My hypothesis here is that because we are the only primate capable of transmitting precise temporal scripts (e.g. through vocal imitation), we are the only species likely to transmit sufficiently rich information socially to keep rapid cultural change relatively stable. Thus the limits and damping systems can be customised and transmitted memetically, with the behaviour, rather than having to be entirely biological or genetic.

IV. Summary

In this article I have proposed a very simple taxonomy of representational substrates for animal social learning: contexts, context-action pairings, and (for a very few species) short temporally-precise scripts of actions.
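The three-level taxonomy can be written down directly as data types. The type and field names below are my own labels for the article's three levels, not terms from the literature.

```python
# The proposed taxonomy of representational substrates, as plain data types.
# Type and field names are my own labels for the three proposed levels.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Context:                      # level 1: a salient context alone
    features: Tuple[str, ...]

@dataclass
class ContextAction:                # level 2: a context paired with an action
    context: Context
    action: str

@dataclass
class TemporalScript:               # level 3: short, temporally precise script
    steps: List[Tuple[float, str]]  # (time in seconds, action)
    MAX_DURATION = 3.0              # the ~2-3 s phrase limit (Poppel, 1994)

    def valid(self):
        return all(t <= self.MAX_DURATION for t, _ in self.steps)

# Level 2 suffices for most social learning...
pair = ContextAction(context=Context(features=("fruit-visible",)), action="reach")
# ...while vocal/gestural imitators additionally need level 3:
song = TemporalScript(steps=[(0.0, "note-A"), (1.2, "note-B"), (2.8, "note-A")])
assert song.valid()
```

The structural point is that each level strictly extends the one below: a pair adds an action to a context, and a script adds precise timing to a sequence of actions.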
I have also looked at the implications of this, both in relating social learning to well-known individual task-learning representations in AI and Cognitive Science, and for explaining why the sort of exponential growth in culture seen in human society is not witnessed so far in other species.

Acknowledgements: Thanks to Andy Whiten and Mark Wood for many discussions of social learning and its representations.

References

Anderson, J. R., Matessa, M. P., and Lebiere, C. 1997. ACT-R: A theory of higher level cognition and its relation to visual attention. Human Computer Interaction, 12(4):439-462.
Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer, London.
Bryson, J. J. 2003. Action selection and individuation in agent based modelling. In Sallach, D. L. and Macal, C., editors, Proceedings of Agent 2003: Challenges in Social Simulation, pages 317-330, Argonne, IL. Argonne National Laboratory.
Bryson, J. J. 2008. Embodiment vs. memetics. Mind & Society. Accepted for publication.
Bryson, J. J. and Leong, J. C. S. 2007. Primate errors in transitive `inference': A two-tier learning model. Animal Cognition, 10(1):1-15.
Bryson, J. J. and Stein, L. A. 2001. Architectures and idioms: Making progress in agent design. In Castelfranchi, C. and Lespérance, Y., editors, The Seventh International Workshop on Agent Theories, Architectures, and Languages (ATAL 2000). Springer.
Bryson, J. J. and Wood, M. A. 2005. Learning discretely: Behaviour and organisation in social learning. In Demiris, Y., editor, Third International Symposium on Imitation in Animals and Artifacts, pages 30-37, Hatfield, UK. The Society for the Study of Artificial Intelligence and the Simulation of Behaviour.
Byrne, R. W. 1995. The Thinking Ape. Oxford University Press.
Byrne, R. W. and Russon, A. E. 1998. Learning by imitation: a hierarchical approach. Behavioral and Brain Sciences, 21(5):667-721.
Chater, N., Tenenbaum, J. B., and Yuille, A. 2006. Probabilistic models of cognition: Conceptual foundations. Trends in Cognitive Sciences, 10(7):287-291.
Custance, D. M., Whiten, A., and Bard, K. A. 1995. Can young chimpanzees (Pan troglodytes) imitate arbitrary actions? Hayes and Hayes (1952) revisited. Behaviour, 132:837-859.
Custance, D. M., Whiten, A., and Fredman, T. 1999. Social learning of an artificial fruit task in capuchin monkeys (Cebus apella). Journal of Comparative Psychology, 113(1):13-23.
Davelaar, E. J. 2007. Sequential retrieval and inhibition of parallel (re)activated representations: A neurocomputational comparison of competitive queuing and resampling models. Adaptive Behavior, 15(1):51-71.
de Waal, F. B. M. and Johanowicz, D. L. 1993. Modification of reconciliation behavior through social experience: An experiment with two macaque species. Child Development, 64:897-908.
Dennett, D. C. 1995. Darwin's Dangerous Idea. Penguin.
Fitch, W. T. 2000. The evolution of speech: A comparative review. Trends in Cognitive Sciences, 4(7):258-267.
Franks, N. R. and Richardson, T. 2006. Teaching in tandem-running ants. Nature, 439(7073):153.
Galef Jr., B. G. and Laland, K. N. 2005. Social learning in animals: Empirical studies and theoretical models. BioScience, 55(6):489-499.
Gallistel, C., Brown, A. L., Carey, S., Gelman, R., and Keil, F. C. 1991. Lessons from animal learning for the study of cognitive development. In Carey, S. and Gelman, R., editors, The Epigenesis of Mind, pages 3-36. Lawrence Erlbaum, Hillsdale, NJ.
Garcia, J. and Koelling, R. A. 1966. The relation of cue to consequence in avoidance learning. Psychonomic Science, 4:123-124.
Gergely, G., Bekkering, H., and Király, I. 2002. Rational imitation in preverbal infants. Nature, 415:755.
Graziano, M. S. A., Taylor, C. S. R., Moore, T., and Cooke, D. F. 2002. The cortical control of movement revisited. Neuron, 36:349-362.
Henson, R. N. A. and Burgess, N. 1997. Representations of serial order. In Bullinaria, J. A., Glasspool, D. W., and Houghton, G., editors, Proceedings of the Fourth Neural Computation and Psychology Workshop: Connectionist Representations, London. Springer.
Hineline, P. N. and Rachlin, H. 1969. Escape and avoidance of shock by pigeons pecking a key. Journal of Experimental Analysis of Behavior, 12:533-538.
Hood, B., Carey, S., and Prasada, S. 2000. Predicting the outcomes of physical events: Two-year-olds fail to reveal knowledge of solidity and support. Child Development, 71(6):1540-1554.
Lashley, K. S. 1951. The problem of serial order in behavior. In Jeffress, L. A., editor, Cerebral Mechanisms in Behavior. John Wiley & Sons, New York.
Leonardo, A. 2004. Experimental test of the birdsong error-correction model. Proceedings of the National Academy of Science, 101(48):16935-16940.
Marler, P. 1991. The instinct to learn. In Carey, S. and Gelman, R., editors, The Epigenesis of Mind, pages 37-66. Lawrence Erlbaum, Hillsdale, NJ.
Perrett, D. I., Mistlin, A. J., and Chitty, A. J. 1987. Visual neurones responsive to faces. Trends in Neurosciences, 9:358-364.
Perry, S., Baker, M., Fedigan, L., Gros-Louis, J., Jack, K., MacKinnon, K., Manson, J., Panger, M., Pyle, K., and Rose, L. 2003. Social conventions in wild white-faced capuchin monkeys: Evidence for traditions in a neotropical primate. Current Anthropology, 44:241-268.
Pöppel, E. 1994. Temporal mechanisms in perception. International Review of Neurobiology, 37:185-202.
Rizzolatti, G., Fogassi, L., and Gallese, V. 2000. Cortical mechanisms subserving object grasping and action recognition: A new view on the cortical motor functions. In Gazzaniga, M. S., editor, The New Cognitive Neurosciences, chapter 38, pages 538-552. MIT Press, Cambridge, MA, second edition.
Roper, T. J. 1983. Learning as a biological phenomenon. In Halliday, T. R. and Slater, P. J. B., editors, Genes, Development and Learning, volume 3 of Animal Behaviour, chapter 6, pages 178-212. Blackwell Scientific Publications, Oxford.
Santos, L. R. and Hauser, M. D. 2002. A non-human primate's understanding of solidity: Dissociations between seeing and acting. Developmental Science, 5. In press.
Seyfarth, R. M., Cheney, D. L., and Marler, P. 1980. Monkey responses to three different alarm calls: Evidence of predator classification and semantic communication. Science, 210:801-803.
Spelke, E. S., Breinlinger, K., Macomber, J., and Jacobson, K. 1992. Origins of knowledge. Psychological Review, 99:605-632.
Tanji, J. 1996. Involvement of motor areas in the medial frontal cortex of primates in temporal sequencing of multiple movements. In Caminiti, R., Hoffmann, K., Lacquaniti, F., and Altman, J., editors, Vision and Movement: Mechanisms in the Cerebral Cortex, volume 2, pages 126-133. Human Frontier Science Program, Strasbourg.
van Schaik, C. P., Ancrenaz, M., Borgen, G., Galdikas, B., Knott, C. D., Singleton, I., Suzuki, A., Utami, S. S., and Merrill, M. 2003. Orangutan cultures and the evolution of material culture. Science, 299(5603):102-105.
Vaneechoutte, M. and Skoyles, J. 1998. The memetic origin of language: modern humans as musical primates. Journal of Memetics -- Evolutionary Models of Information Transmission, 2(2).
Whiten, A. 1998. Imitation of the sequential structure of actions by chimpanzees (Pan troglodytes). Journal of Comparative Psychology, 112:270-281.
Whiten, A. 2006. The dissection of imitation and its `cognitive kin' in comparative and developmental psychology. In Rogers, S. J. and Williams, J. H. G., editors, Imitation and the development of the social mind: Lessons from typical development and autism, pages 227-250. Guilford Press, New York.
Whiten, A., Flynn, E., Brown, K., and Lee, T. 2006. Imitation of hierarchical action structure by young children. Developmental Science, 9(6):574-582.
Whiten, A., Goodall, J., McGrew, W. C., Nishida, T., Reynolds, V., Sugiyama, Y., Tutin, C. E. G., Wrangham, R. W., and Boesch, C. 1999. Cultures in chimpanzees. Nature, 399:682-685.
Wolpert, D. H. 1996. The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7):1341-1390.
Wood, M. A. and Bryson, J. J. 2007. Skill acquisition through program-level imitation in a real-time domain. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, 37(2):272-285.
Wood, M. A., Leong, J. C. S., and Bryson, J. J. 2004. ACT-R is almost a model of primate task learning: Experiments in modelling transitive inference. In The Annual Meeting of the Cognitive Science Society (CogSci 2004), pages 1470-1475, Chicago. Lawrence Erlbaum Associates.
Zentall, T. R. 2001. Imitation in animals: Evidence, function, and mechanisms. Cybernetics & Systems, 32(1):53-96.


Content From Development


Nicholas Shea (Faculty of Philosophy, University of Oxford)
(Date of publication: 14 May 2007)

Abstract: Most human mental representations arise as a result of development. These developmental processes depend on rich interactions with the thinker's environment. Are the details of those interactions part of what makes it the case that a particular representation has the content it does? For example, humans probably have a face recognition mechanism that allows us, on seeing some person X for a short time, subsequently to recognise X by their face. Plausibly, this piece of development results in a representation R which is about X (and not look-alikes) in virtue of it being X (rather than look-alikes) who interacted causally with the developmental mechanism that produced R. If that is right, we can ask why. Millikan appeals to derived proper functions; Papineau to development being a selectional process similar to evolution. This web paper examines whether content is indeed fixed by circumstances of development and, if so, whether we should be applying an adaptation-based framework.

I.

Introduction

Most human mental representations arise as a result of development. These developmental processes depend on rich interactions with the thinker's environment. This paper argues that the details of those interactions are often part of what makes it the case that a particular representation has the content it does. Section 2 motivates the idea with examples from the development of relatively low-level psychological capacities. Section 3 considers human learning, using examples to draw out the reason why content is partly fixed by circumstances of development. The final section relates this claim to Laurence & Margolis's important paper arguing for the importance of development (Laurence & Margolis 2002).

II. Examples: low-level development

This subsection gives examples of four learning systems in animals where, intuitively, the end state representation refers to the thing encountered during the development of that end state. These mechanisms are also likely found in relatively low-level human psychology. The first is imprinting. That is the process in which a newly born animal learns to behave in a special way towards a parent: to follow it around, demand food, etc. Lorenz famously demonstrated the phenomenon by leaving his rubber boots for young geese to see as they hatched. They would then faithfully follow him around the town [1]. The circumstances in which imprinting will occur, and its behavioural consequences, have been extensively described in chicks (Bateson 1966). The mechanism seems to give rise to a new representation: the chick comes to identify and keep track of something new, and behave in various ways in relation to it. The object first presented is clearly part of the cause of this representational development. The representation also seems, intuitively, to refer to that object: it is supposed to keep track of the object first seen [2]. A second example is provided by the cognitive maps that some animals develop as a result of experiencing a local environment (Pearce 1997, pp. 203-214). For example, rats can learn the layout of a maze of platforms hidden underwater, or an array of objects hidden around a room. There is good evidence that this representation is stored in so-called place cells in the hippocampus (O'Keefe & Nadel 1978). The new representation is caused by the spatial layout of the environment in which it developed; and, intuitively, represents it. A particular rat's cognitive map seems to be about its learning environment, and not other places that happen to have the same geography, or in which the rat's map-guided actions would turn out to be successful.
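The content assignment in the imprinting case can be sketched as a write-once mechanism: the first object encountered fixes the referent, and later behaviour is evaluated against that fixed referent. This is an illustrative toy, not a model of actual chick cognition; all names are invented.

```python
# Toy sketch of imprinting: the first object encountered fixes the referent
# once and for all; later encounters cannot overwrite it.

class ImprintingMechanism:
    def __init__(self):
        self.referent = None           # fixed by the first encounter

    def encounter(self, obj):
        if self.referent is None:
            self.referent = obj        # write-once: development fixes content

    def follows(self, obj):
        # The chick follows whatever its representation refers to.
        return obj == self.referent

chick = ImprintingMechanism()
chick.encounter("lorenz-boots")        # first thing seen at hatching
chick.encounter("goose-mother")        # too late: the referent is already fixed
assert chick.follows("lorenz-boots")
assert not chick.follows("goose-mother")
```

The sketch captures the intuition in the text: what the representation is about is settled by the developmental cause (the first-seen object), not by whatever would make the chick's subsequent behaviour most successful.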

A third illustration is aversion learning. This is the striking phenomenon in which an animal will avoid a food if the taste of it is followed by sickness [3]. As in classical conditioning, an unconditioned stimulus (sickness) comes to be associated with a conditioned stimulus (the taste). However, unlike classical conditioning, the learning occurs after only one trial, and the aversive stimulus need not be paired in time with the taste, but may occur several hours later. The substance with that taste is part of the cause of the new aversion [4]. And the new disposition is, intuitively, an aversion to that substance. To test this, consider an animal with its taste buds subsequently reversed by some physiological re-wiring. The animal would then avoid the wrong things; that is, its aversive representation would continue to refer to that which originally caused it. Finally, consider a regular case of classical conditioning: learning to identify a foodstuff by sight as well as taste. In primates, it seems this is achieved in part by the development of neurons with finer sensitivity in the orbitofrontal cortex (Rolls & Treves 1998, pp. 155-159). The new representation is caused to develop by the foodstuff with that smell and taste. Plausibly, the referent of the representation is that foodstuff [5]. None of these examples is revolutionary. Doubtless, many other theories could make good claims for rival content assignments. However, the examples have a common thread, which suggests a special role for an ontogenetic factor, both as the causal source of a new representation, and as its referent. More modestly, they illustrate that it is at least plausible that the circumstances in which a representation developed constrain the content that is to be ascribed to it.

III. Examples: human learning

There is strong evidence that humans have a specialised capacity for recognising faces. [6] The first indications came from the existence of patients with a selective deficit in the ability to recognise faces, prosopagnosia (Sacks 1985). There are now several converging lines of evidence that face recognition is performed by a dedicated system in the brain. Neuropsychological studies show that damage to a specific brain area is associated with severe prosopagnosia. That has been confirmed with functional imaging, and by electrical measurement and stimulation inside the brains of epileptic patients. [7] The area specific to faces is called the fusiform face area, located near the junction of the brain's occipital and temporal lobes (although many other brain areas are also involved in processing faces, including prefrontal areas). On experiencing a novel face a person develops the ability to produce a new representation (which is at least partly located in the fusiform face area, and is distributed across that area), which she employs in recognising that face in the future. It seems obvious that this mechanism's function is to enable people to recognise each other by their faces. It is part of the way that humans keep track of conspecific individuals. So the representation refers to an individual: the person who caused that representation to develop (call him S, for source). A different individual, experienced in unusual visual conditions, could later cause the same representation to be tokened. It would then misrepresent (that is, the thought "that is S" would be false). Similarly with look-alikes. We use face recognition to build up a body of knowledge about how we should act towards a person, and about what he will do. It would be a mistake to project these expectations across to a different individual who happened to look very similar. It is not superficial similarity that grounds the projection of attributes from occasion to occasion.
It is the fact of encountering the same individual on each occasion (since many attributes of an individual person are stable over time). And the source of that mistake would be a false representation. The error would start when seeing the look-alike and thinking "that is S". The content of that thought is false because the face-tracking representation refers to the original individual, and not anyone else. Similar considerations have been used in the broader context of the philosophy of language to argue that causal history partly determines the content of proper names (Kripke 1972). However, the conclusion is more compelling in the case of face recognition, both because the phenomenon is simpler and better-described, and because the correct answer is more obvious. The evidence is overwhelming that the ability to recognise a particular face arises only as a result of experience, and is implemented by means of an internal representation. It is then hard to resist the conclusion that the particular circumstances in which one of those abilities develops (the person you see when you learn to recognise someone new) partly determine the content of the resultant representation.
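The look-alike argument can be put schematically: the representation is tokened by perceptual similarity, but its truth conditions are fixed by its developmental cause. A toy sketch, with invented names throughout:

```python
# Toy sketch: a face representation is TOKENED by similarity of appearance,
# but its REFERENT is fixed by the individual who caused it to develop.

class FaceRepresentation:
    def __init__(self, source_person, appearance):
        self.referent = source_person   # fixed by the developmental cause
        self.template = appearance      # what triggers tokening later

    def tokened_by(self, appearance):
        return appearance == self.template

    def veridical_for(self, person_seen):
        return person_seen == self.referent

R = FaceRepresentation(source_person="S", appearance="round-face-freckles")

# Seeing S again: the representation is tokened, and the thought is true.
assert R.tokened_by("round-face-freckles") and R.veridical_for("S")
# Seeing a look-alike: tokened by the same appearance, but FALSE --
# a misrepresentation, because the referent is still S.
assert R.tokened_by("round-face-freckles") and not R.veridical_for("twin-of-S")
```

The two methods separate exactly what the argument separates: the conditions under which the representation fires, and the conditions under which it is correct.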

The same thing occurs in higher-level cognitive systems. Since these systems are less well described and understood, the content ascription is correspondingly more contentious. I will take as an example the acquisition of concepts of natural kinds. There is good evidence that the way children categorise changes dramatically as they grow up. Even when newborn they can keep track of objects, by trajectory and number.[8] Ingenious experiments based on violation of expectancy [9] show that babies soon come to differentiate solid objects from portions of stuffs [10], and then begin to track objects by category (e.g., animate vs. inanimate) until they can eventually differentiate objects at the level of natural kind terms: by species, etc. [11] By the age of 2-3 years children can categorise a wide range of objects on the basis of what they look like and what they do: their characteristic features. But then there is a dramatic shift. Children stop relying upon a wide range of characteristic features and shift to a smaller core of defining features as the basis for their category judgments (Keil, 1989). This shows up in overt category judgements, and in the range of new exemplars to which children will project existing known properties. It is also found implicitly in the way that children project what they learn about things one can do with members of the category [12]. By 4-5 years old children are very good at penetrating beneath surface appearances (Gelman & Wellman, 1991). Their judgements come to be based more on objects' insides [13] or, for animals, their lineage [14]. Most strikingly, this characteristic-to-defining shift [15] is much more pronounced in relation to natural kinds than artefacts [16]. With artefacts, there is a more subtle shift towards greater reliance on an object's function for categorisation.
For my purposes, the importance of the developmental studies is to show that an explanation of children's deployment of concepts must advert to more than surface appearances. (There is a separate debate about whether children have essentialist beliefs, which is not directly relevant except if one holds that the means of identification associated with a concept determine its content.) If a theorist is to explain the patterns of behaviour of older children and adults, she cannot base her explanation only upon the ways that objects appear. As the experiments show, it is the reidentification of something underlying that explains how the subject will act on a new instance, and which further properties they will project it to (for example, if the original instance tasted sweet, when a further sample is classified under the same concept C the subject may project the property of sweetness and so think "that is a C, that is sweet" in relation to the new instance). What is the referent of such a concept? Let's answer that by asking what it takes for a subject to be getting it right when he uses the concept in relation to a new instance. The answer is that he must be right that the new object has the property which he projects to it, or affords the action that he performs on it. For such projections to be justified, there must be something in virtue of which the instance shares those properties with the original samples that he learnt about. Notice that, to be useful, the property or affordance projected must go beyond the way that the new instance is identified as falling under the category. Suppose you had to check that a fruit was red, round, crisp and tasty before classifying it as an apple. Then inferring from "that's an apple" to "that's tasty", while justified, would not tell you anything new. So the relevant underlying feature must give rise to both the properties used to identify instances and to the non-apparent properties that can thereby usefully be projected.
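The apple example can be made concrete in a sketch: an instance is identified by surface features, but the projection of a non-apparent property succeeds only if the instance shares the underlying kind of the learning samples. All features, kinds and property names here are invented for illustration.

```python
# Toy sketch of natural-kind projection: instances are IDENTIFIED by surface
# features, but the properties PROJECTED are grounded in the underlying kind
# the learning samples belonged to. All names are made up.

concept_apple = {
    "identify_by": {"red", "round"},   # surface features learnt from samples
    "learned_kind": "apple",           # underlying ground of the samples
    "projected": {"tasty"},            # non-apparent property to project
}

def classify(instance):
    # Identification looks only at surface features.
    return concept_apple["identify_by"] <= instance["surface"]

def projection_correct(instance):
    # Projection succeeds only if the instance shares the samples' kind.
    return instance["kind"] == concept_apple["learned_kind"]

real_apple = {"surface": {"red", "round"}, "kind": "apple"}
imitation  = {"surface": {"red", "round"}, "kind": "wax"}

assert classify(real_apple) and projection_correct(real_apple)
# The imitation is classified (surface match) but projecting "tasty" fails:
assert classify(imitation) and not projection_correct(imitation)
```

The mismatch in the last line is the interesting case: the concept is tokened, yet the use is a misrepresentation, because the instance does not share a projective ground with the original learning samples.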
So here is the picture: concepts of natural kinds are employed to project useful properties and affordances from learning samples to novel instances. For that to work, novel instances must fall within the same category as the learning samples, where membership of that category is the causal source both of the properties the thinker relies upon to identify an instance as falling under the concept, and of the properties a thinker thereby projects to those new instances [17]. Reference depends upon the samples the thinker experienced when he originally developed a concept for the category. The reference is some feature of those samples which allows him to project knowledge about the original samples to new instances. Given original samples of a different kind, but with the same surface features, the causal basis for the projection of properties would be different, so the referent would be different. When he uses the new representation in respect of an instance of a different category (e.g., he thinks "that is an apple, eat it" on seeing a wax apple), then the error consists in identifying the wrong thing: something that does not share a projective ground with the original learning samples. The fact that this new use is a misrepresentation shows that the natural kind concept is tied to the learning samples. Thus, its referent depends in part upon the particular circumstances in which the concept developed. The picture I have painted is closely related to Millikan's (2000) theory of substance concepts. In particular, I draw from her the idea that use of these sorts of concepts depends upon projecting learned properties to novel instances. That entails that members of the category share some

INTERDISCIPLINES.ORG
Adaptation and Representation
underlying ground that is the causal source of the projected properties. It is these underlying grounds which Millikan calls substances: they are the causal source of the co-projection of a variety of properties over instances. However, I rely on developmental considerations more explicitly than Millikan does. I use the idea that a new substance concept will develop as a result of experience of samples of the substance. Then the reference of the concept will depend upon the identity of those learning samples. Millikan can allow something similar. Her substance concepts are abilities to identify substances. The reference of the concept is given by its natural purpose: the function of the ability is to identify some particular substance, and that substance is the referent of the concept. Millikan's natural purposes are given by natural selection. However, most identification abilities have not evolved directly, but are produced in the course of experience by relational mechanisms which have evolved to produce such abilities [18]. Their functions derive from the function of the learning mechanism. The function of the learning mechanism is relational: to produce new abilities that function thus and so. The new abilities so produced derive their functions thereby. I emphasise a further feature which Millikan can readily accept: when a general learning mechanism operates in a particular situation to produce a new substance concept, features of that situation determine the function of that ability, and hence the content of the concept. That is perfectly compatible with Millikan's theory of relational and derived functions, and may even follow from it. If so, Millikan's theory of substance concepts also supports my claim that the samples which are experienced when a new natural kind concept develops partly constrain the content of that concept.

IV. Development constraining content

In an important paper, Laurence & Margolis (2002) argued for a weaker claim: that an adequate theory of the content of mental representations must be compatible with plausible accounts of how those mental representations arose in psychological development. Their project is to reject Fodor's strong concept nativism. They start by re-construing Fodor's nativism as a challenge: how can primitive (i.e., unstructured) representations be learned? Lacking an answer, Fodor concludes they must be innate. Laurence & Margolis disagree. They argue that there are plausible theories of the acquisition of new primitive representations, i.e., accounts that do not require the new representations to be structured out of existing ones. They take the learning of new natural kind concepts as an example, and work through an empirically justified account of their acquisition [19]. The challenge is to fit the development with the theory of content. Laurence & Margolis take content to be determined by informational connections (they work with Fodor's asymmetric dependence theory of content). The developmental account must show how the end state comes to display the appropriate features, so that content is appropriately determined by the theory of content. The challenge is to demonstrate compatibility between the means of acquisition and the theory of content. It is not simply that it would be nice to have an account of how the representational states are acquired. The constraint is stronger: an adequate theory of content must be compatible with the appropriate content-determining factors being acquirable, according to plausible accounts of development, based on the best empirical evidence. Of course, one way that the theory of content could be compatible with the developmental story is if developmental circumstances partly determine content. That is my claim.
The thrust of Laurence & Margolis's argument comes close to that stronger suggestion: "For the present purposes, however, the crucial point we want to emphasize is that questions about the nature of concepts are intimately bound up with questions about how they are acquired. (...) So even with primitive concepts, an investigation into how they are acquired seems likely to say quite a lot about their nature." (Laurence & Margolis 2002, p. 50.) I agree that the nature of representations is intimately bound up with how they are acquired. That intimacy, I suggest, may be reflected in their contents, such that a representation would not have the content it does if it had not been acquired in the circumstances it was. Where I disagree with Laurence & Margolis, however, is with their assumption that the development of the vehicles of content is less problematic. They assume that potential vehicles of content are

available, the properties of which can be adjusted in content-relevant ways, so that a vehicle comes to have the features which determine its content appropriately (in Laurence & Margolis's case, standing in the appropriate informational relations). Thus, as part of their account of the acquisition of natural kind concepts, they say: "She sees a new object that has features that suggest that it is a natural object of some sort. ... upon encountering the item, the child releases a new mental representation and begins accumulating information about the object and linking this to the representation." (Laurence & Margolis 2002, p. 42, italics added.) More likely, part of the process of developing a new concept is to develop a new item which can be the vehicle of that content. Laurence & Margolis agree that the representation has to develop properties appropriate to its content. What they miss is that this very process may be what differentiates the representation into a new vehicle type. If so, there is good reason to add to the scope of Laurence & Margolis's claim. Not only must a theory of content be consistent with a semantic account of representational development. It must also be consistent with an account of the development of the vehicles of content. Indeed, the two may be inseparable. In cases where content is partly determined by the circumstances of development of a new representation type, the two together furnish a substantive developmental constraint on an adequate theory of content.

References

Bateson, P. P. G. 1966. The characteristics and context of imprinting. Biological Reviews 41: 177-220.
Carey, S. and F. Xu. 2001. Infants' knowledge of objects: beyond object files and object tracking. Cognition 80: 179-213.
Cohen, J. and F. Tong. 2001. The face of controversy. Science 293: 2405-2407.
Gelman, S. and H. Wellman. 1991. Insides and essences: early understandings of the non-obvious. Cognition 38: 213-244.
Huntley-Fenner, G., S. Carey and A. Solimando. 2002. Objects are individuals but stuff doesn't count: perceived rigidity and cohesiveness influence infants' representations of small groups of discrete entities. Cognition 85: 203-221.
Kanwisher, N. 2000. Domain specificity in face perception. Nature Neuroscience 3: 759-763.
Keil, F. C. 1989. Concepts, Kinds and Cognitive Development. Cambridge, MA: MIT Press.
Kripke, S. 1972. Naming and Necessity. Oxford: Blackwell.
Laurence, S. and E. Margolis. 2002. Radical concept nativism. Cognition 86: 25-55.
Mandler, J. M. and L. McDonough. 1998. Studies in inductive inference in infancy. Cognitive Psychology 37: 60-96.
Millikan, R. 1984. Language, Thought and Other Biological Categories. Cambridge, MA: MIT Press.
Millikan, R. 2000. On Clear and Confused Ideas. Cambridge: Cambridge University Press.
Millikan, R. 2002. Biofunctions: two paradigms. In: Cummins, Ariew and Perlman (eds.), Functions: New Readings in the Philosophy of Psychology and Biology. Oxford: Oxford University Press.
O'Keefe, J. and L. Nadel. 1978. The Hippocampus as a Cognitive Map. Oxford: Clarendon Press.
Pearce, J. M. 1997. Animal Learning and Cognition. Hove: Psychology Press.
Rolls, E. and A. Treves. 1998. Neural Networks and Brain Function. Oxford: Oxford University Press.
Rose, S. 1992. The Making of Memory. London: Bantam Press.
Sacks, O. 1985. The Man Who Mistook His Wife for a Hat. New York: Summit Books.
Shepherd, G. M. 1994. Neurobiology. 3rd edition. Oxford: Oxford University Press.
Soja, N., S. Carey and E. Spelke. 1991. Ontological categories guide young children's inductions of word meaning: object terms and substance terms. Cognition 38: 179-211.
[1] Rose, 1992, p. 58.

[2] Those in a theoretical frame of mind might dispute this. Doesn't the representation refer to the chick's mother, whatever it was hapless enough actually to imprint on? My use of the example relies on a more naïve intuition.
[3] Shepherd, 1994, pp. 633-634.
[4] If the sickness were paired with no CS, then no new aversion would arise.
[5] This example is more controversial. Perhaps the animal has an existing representation of that foodstuff, and has simply learnt to distinguish it in a greater variety of circumstances. That interpretation is resisted if several different foods share the associated taste, since the new representation will be specifically sensitive to the food with the relevant appearance. Even so, this is a case where different theoretical perspectives will motivate different content assignments. It is less clear here that one option is more intuitive than all the others.
[6] Kanwisher, 2000.
[7] Cohen & Tong, 2001, summarises the evidence.
[8] Carey & Xu, 2001.
[9] This is operationalised as looking time, graded from videos by naïve independent observers. Some critics object to the assumption that increased looking time implies violation of expectancy. However, what is important is the existence of statistically significant differences in looking time, demonstrating that the babies differentiate the situations, however we choose to describe it.
[10] Soja, Spelke and Carey, 1991; Huntley-Fenner, Carey & Solimando, 2002.
[11] Carey & Xu, 2001.
[12] Mandler & McDonough, 1998.
[13] Gelman & Wellman, 1991.
[14] Keil, 1989.

[15] Keil & Batterman, 1984.
[16] Keil, 1989.
[17] These properties need not apply to all category members, but only to arise reliably enough from category membership to be useful.
[18] For more detail on the theory of relational and derived functions see Millikan 1984, pp. 39-50, and 2002.
[19] Their account relies upon the kind of evidence mentioned in the previous subsection, so I largely agree with it. However, they suggest that natural kind concepts require essentialist conceptions. They need the essentialism because of their commitment to Fodor's asymmetric dependence theory of content. The essentialist disposition makes it the case that causal relations between non-referents and the concept are asymmetrically dependent on the causal relation between the referent and the concept. The position presented here differs in three respects: the chosen theory of content, the reliance on thinkers' conceptions as content-determining, and the resultant view that essentialism is indispensable.


An Evolutionary Solution to the Radical Concept Nativism Puzzle


Murray Clarke (Concordia University in Montreal, Canada, and Carleton University's Institute of
Cognitive Science in Ottawa, Canada)

(Date of publication: 23 April 2007)

Abstract:


In The Language of Thought, Jerry Fodor infamously argued for radical concept nativism by suggesting that all of our primitive lexical concepts are innate. In Concepts, he defends informational atomism and rescinds radical concept nativism by offering a noncognitivist, metaphysical argument that is intended to show that we acquire, or lock to, concepts that are neither learned nor innate. I offer an evolutionary version of informational atomism in the context of Cosmides and Tooby's evolutionary psychology and the massive modularity hypothesis. Like Fodor, I argue that primitive concepts are neither learned nor innate, but 'acquired.' Unlike Fodor, I argue that such terms were 'acquired' in the Pleistocene environment of evolutionary adaptation (EEA).

In The Language of Thought (1975), Fodor infamously argued for radical concept nativism by suggesting that all of our primitive lexical concepts are innate. Call this the Radical Concept Nativism Puzzle. I will offer an evolutionary solution to this puzzle as it applies to natural and perceptual kind concepts. In Radical Concept Nativism, Laurence and Margolis describe Fodor's acquisition puzzle this way:

1) Apart from miracles of futuristic super-science, all concepts are either learned or innate.
2) If they're learned, they are acquired by hypothesis testing.
3) If they're acquired by (non-trivial) hypothesis testing, they're structured.
4) Lexical concepts are not structured.
5) So lexical concepts aren't acquired by hypothesis testing.
6) So lexical concepts aren't learned.
7) Therefore, lexical concepts are innate.
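Rendered as a propositional derivation (my own reconstruction, not Laurence and Margolis's notation), the argument is a chain of two modus tollens steps followed by a disjunctive syllogism, for an arbitrary lexical concept c:

```latex
% L(c): c is learned        H(c): c is acquired by hypothesis testing
% S(c): c is structured     I(c): c is innate
\begin{align*}
&\text{(1)}\;\; L(c) \lor I(c)          && \text{premise}\\
&\text{(2)}\;\; L(c) \rightarrow H(c)   && \text{premise}\\
&\text{(3)}\;\; H(c) \rightarrow S(c)   && \text{premise}\\
&\text{(4)}\;\; \neg S(c)               && \text{premise}\\
&\text{(5)}\;\; \neg H(c)               && \text{from (3), (4), modus tollens}\\
&\text{(6)}\;\; \neg L(c)               && \text{from (2), (5), modus tollens}\\
&\text{(7)}\;\; I(c)                    && \text{from (1), (6), disjunctive syllogism}
\end{align*}
```

So stated, the argument is valid; any escape must reject a premise. Laurence and Margolis resist the learning premise (2), while the evolutionary proposal defended in this paper can be read as rejecting the learned/innate dichotomy in (1).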

The thought that lies behind the third premise of the argument is that in typical cases of concept learning, the experimenter has a concept in mind and the subject is asked to sort objects in terms of whether they are instances of a novel concept, say, a flurg. The subject subsequently frames inductive hypotheses concerning the objects and might decide that a flurg is a circle, and only later discover that flurgs possess the individuating conditions of objects that are green. But if this is the case, Fodor notes, then the subject is not learning what a flurg is, since the subject already possesses the concept GREEN. Hence, concept learning is not possible. An alternative is to conclude that concept learning involves complex concepts, or concepts with internal structure, that are constructed out of primitive concepts. The result is that genuine learning can take place, since the complex concept was not represented in the evidential base but was assembled out of primitive constituent concepts. According to Fodor, the sort of structure found in such complex concepts is definitional structure. But the history of attempts to define lexical concepts shows that they are not characterizable in terms of necessary and sufficient conditions. Second, non-definitional accounts of internal structure, such as Prototype Theory, fail because their constituents fail to compose. According to Prototype Theory, concepts have statistical structure. Hence, a complex concept A has prototype structure if its constituents express properties that things that fall under A tend to have. Take the complex concept PET FISH, e.g., goldfish. Its constituents, PET and FISH, have prototypes, dog and trout, that do not produce the goldfish prototype. Hence, a prototype theory of complex conceptual structure fails to satisfy the compositionality constraint, a constraint that Fodor has repeatedly, and I think persuasively, claimed is nonnegotiable for any adequate theory of concepts.
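The compositionality constraint behind the pet fish objection can be put schematically (my notation, not Fodor's): the content of any complex concept must be determined by the contents of its constituents and their mode of combination.

```latex
% Compositionality: for every mode of combination *, there is some
% function f such that, for all concepts A and B,
\mathrm{content}(A \ast B) \;=\; f\bigl(\mathrm{content}(A),\, \mathrm{content}(B),\, \ast\bigr)
% Prototype theory would therefore need, for some such f,
%   prototype(PET FISH) = f(prototype(PET), prototype(FISH))
% but no such f is available: a prototypical pet (a dog) and a
% prototypical fish (a trout) determine nothing like the prototypical
% pet fish (a goldfish), and goldfish are neither prototypical pets
% nor prototypical fish.
```

If contents were prototypes, no such f would exist, which is why prototype structure cannot serve as the internal structure that concept learning requires.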
It follows that lexical concepts are not learned by virtue of their structure, because concepts have no structure, non-definitional or definitional. If that is so, then concepts are not acquired by hypothesis testing, and so not learned. If primitive concepts are not learned, then they are innate. That is the concept acquisition puzzle. The argument is plausible, yet the conclusion is deeply counterintuitive, so few accept it. In his more recent book, Concepts (1998), Fodor rescinded radical concept nativism.

Fodor wants to avoid an inductivist, cognitivist solution to the concept acquisition puzzle because that would be circular, in the sense that hypothesis testing requires that one already have the concept in question if the concept is a primitive one. To avoid that result, Fodor offers a noncognitivist, metaphysical solution. As Fodor says: "My story says that what doorknobs have in common is being the kind of thing that our kind of minds (do or would) lock to from experience with instances of the doorknob stereotype" [Fodor, 1998, 137]. For Fodor, there is nothing cognitive, intentional, or evidential about the locking relation. One simply resonates to the doorknob property by virtue of having the sort of mind that we do. Fodor's claim is that primitive concepts are acquired due to a metaphysical locking relation that is noninductivist, noncognitive, and nonpsychological. Such primitive concepts as DOORKNOB are not innate. What is innate are the nonpsychological mechanisms that cause us to experience things the way we do. Hence, the sensorium is innate. So, "there is no reason to think that the acquisition story is in the domain of cognitive neuropsychology (as opposed, as it were, to neuropsychology tout court)" [Fodor, 1998, 143]. The problem with Fodor's position, as Laurence and Margolis rightly point out, is that it is entirely unenlightening to cite an unknown neurological mechanism to explain mental phenomena, and so Fodor has no adequate concept acquisition story. He rejects an inductivist account, and the metaphysical story only provides the logical preconditions for an account of concept acquisition. More recently, Margolis (1999) and Laurence and Margolis (2002) have provided an alternative account. Laurence and Margolis provide a cognitivist, inductivist, causal-theory-of-content solution to the concept acquisition puzzle.
They believe that Fodor's original asymmetric dependence version of the causal theory of content can provide the proper starting point, ironically, to develop an account of concept acquisition. The causal theory of content has it that in cases where a predicative expression ('deer') is thought of an object of predication (a deer), the symbol tokenings denote their causes, and the symbol types express the property whose instantiations reliably cause their tokenings [Fodor, 1987, 99]. In successful cases, my uttering 'deer' says of a deer that it is one. The idea was that reliable causation would be counterfactual supporting, in the sense that the property deer does, and would, cause the tokening of 'deer'. The idea behind the causal theory is that such nomological relations determine the semantic interpretation of mental symbols. The central problem for the causal theory of representation was to give an account of misrepresentation. A problem for many solutions to the misrepresentation problem is the disjunction problem. According to the crude causal theory of representation, Ds reliably cause tokenings of 'D'. Hence, the condition governing what it means for Ds to be represented by 'D' is identical to the condition for such a token being true. If so, it is impossible to get falsity into the picture. One might think that D-caused 'D' tokenings are veridical and E-caused 'D' tokenings are unveridical. But this fails, since the existence of E-caused 'D' tokenings establishes the fact that the causal dependence of 'D' tokenings on Ds is imperfect. It follows that 'D' tokenings are reliably caused by (Ds or Es). But if 'D' expresses the property (D or E), then E-caused 'D' tokenings are veridical and we have no account of misrepresentation. That is the disjunction problem. Fodor thinks that the right approach to the disjunction problem involves the counterfactual properties of the causal relations between different mind/world tokenings.
He argues that falsehoods are ontologically dependent on truths, but not vice versa. That is, one can only confuse a deer with an elk once one has the concept of a deer. Hence, since 'deer' does mean deer, the fact that deer cause one to say 'deer' does not depend on any semantic relation between 'deer' tokenings and elks. False or wild tokenings can now be picked out in terms of the necessary condition: E-caused 'D' tokenings are wild only if they are asymmetrically dependent upon non-E-caused 'D' tokenings. Fodor's asymmetric dependence version of the causal theory of content was specifically designed to accommodate his view that primitive lexical concepts have no internal structure. That is, concepts are not definitions or prototypes, but are the result of having a representation that stands in a certain causal mind-world dependency relation. No specific piece of information that people associate with dogs via the concept DOG, say, is actually constitutive of the concept DOG, though much information may be associated with our concepts. All that is essential to a concept's content are the dependency relations that the concept bears to things in the world. Fodor takes it to be a cardinal virtue of his account that people may associate wildly different, false, or incomplete information with a concept and yet possess the same concept. According to Laurence and Margolis (2002), the key to concept acquisition is the notion of a sustaining mechanism. As they say: "A sustaining mechanism is a mechanism in virtue of which a concept stands in the mind-world relation that a causal theory of content, like Fodor's, takes to be constitutive of content" [p. 37]. For Fodor, the relevant sustaining mechanisms are those that underpin the asymmetric dependence causal relations between concepts and the properties they express in the world.
There may be different sustaining mechanisms between one person's concept and the property that it expresses, or between different people's identical concept and the property that it expresses.
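The asymmetric dependence condition described above can be stated a little more explicitly. The following is my own schematic rendering of the standard formulation, not a quotation of Fodor:

```latex
% Asymmetric dependence (my paraphrase).
% Let  D -> 'D'  abbreviate: instances of D cause tokenings of 'D';
% let  E -> 'D'  abbreviate: instances of E cause tokenings of 'D'.
% (\boxright, from the stmaryrd package, is the counterfactual conditional.)
% E-caused 'D' tokenings count as wild (misrepresentations) only if:
\begin{align*}
\text{(i)}\;\;  & \neg(D \to \text{`D'}) \;\boxright\; \neg(E \to \text{`D'})
  && \text{breaking the D-law would break the E-law,}\\
\text{(ii)}\;\; & \neg\bigl(\neg(E \to \text{`D'}) \;\boxright\; \neg(D \to \text{`D'})\bigr)
  && \text{but not vice versa.}
\end{align*}
```

Because the elk-to-'deer' connection holds only in virtue of the deer-to-'deer' connection, and not conversely, 'deer' expresses deer rather than the disjunctive property deer-or-elk; this is how the account is meant to escape the disjunction problem.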

Laurence and Margolis illustrate the idea of a sustaining mechanism by focusing on concepts for natural kinds. The key sustaining mechanism for a natural kind concept is one that implicates a kind syndrome along with a more general disposition to treat paradigmatic exemplars of the syndrome [p. 38]. A kind syndrome is a collection of properties that is highly indicative of a kind yet is accessible in perceptual encounters [p. 38]. For instance, this may include the shape, motions, markings, sounds, and colors of a kind. Concept learning then becomes a matter of accumulating contingent perceptual information about a kind. This information, once coupled with an essentialist disposition, establishes an inferential mechanism that tokens the natural kind concept. For instance, Landau et al. (1994) found that shape is an important cue that children use to determine that two objects are members of the same kind. A new word used in the presence of a cup could refer to a myriad of possible concepts, yet children employ a shape bias to fix on the kind at issue. The shape bias, together with other biases and children's understanding of their relative importance, enables children to represent kind syndromes. In order to avoid fakes, children show a tendency to look for essential properties in order to fix a natural kind. Hence, perceptual properties are only a rough guide for them. In addition, they have an essentialist predisposition that has them, for instance, responding differently to objects' insides as opposed to their outsides, as a function of expecting essential properties to be constitutive of the insides of, say, dogs. Gelman and Wellman (1991), for instance, found that four- and five-year-olds displayed an essentialist predisposition.
Laurence and Margolis think that this essentialist disposition, together with the acquisition of a natural kind syndrome-based mechanism, constitutes a learning model and, as such, a cognitivist and inductivist (or empiricist) model of the sort that Fodor rejects. This is so despite the fact that they posit innate structure in the form of biases and inferential mechanisms of various sorts. I think Fodor needs an acquisition story. But I think Fodor's mistake was not that he failed to provide an inductive learning theory for primitive term acquisition, but that he failed to provide an evolutionary account of primitive term acquisition. Humans do have an innate predisposition to sort objects into natural kinds due to an essentialist predisposition. The perceptual features of objects are clues to an underlying world of essential properties that we home in on from a very early age. But Laurence and Margolis need to say how their inductivist learning model avoids Fodor's objection that inductivist accounts presuppose knowledge of primitive concepts in order to explain the acquisition of primitive concepts, e.g., as in the flurg example. But how do we acquire primitive concepts? We must be careful here to avoid the circularity error of presupposing a concept while providing a learning model of concepts. It is also important to focus on primitive concepts and not complex concepts. In the past I argued that the gap between the domain in which a module was selected for (its Environment of Evolutionary Adaptation, EEA) and the actual domain (or Actual Environment, AE) can lead to misrepresentation. Now suppose that misrepresentation is part of the process that leads to the malfunction of a certain proper function of a module. It follows that certain accurate representations have historically been important for the execution of the proper functions of particular modules.
This will not be universally true, nor necessarily true, but it will, often enough, be a contingent fact about the history of our species that true beliefs were important for our survival and reproduction. Suppose that Laurence and Margolis, Landau et al., and others are correct to suppose that humans from an early age possess an essentialist bias and a shape bias, among other perceptual biases. Also, we know that humans display prototypical, statistical response patterns to the phenomena that they confront. Humans will acquire the concept of a dog and a name, 'dog', for it, as opposed to the more abstract term 'animal' or the more specific term 'Labrador Retriever'. That is, we tend to acquire terms for objects categorized in ways that are convenient for our purposes. On the savannah, presumably, those purposes would include hunting prey and avoiding one's own predators, among other activities. The acquisition of perceptual terms eventually leads to the acquisition of kind syndromes. One wants Adam's Ale because it quenches one's thirst, and we identify it as that liquid that is tasteless, colourless, odorless, and so forth. But so far we have no object identified as having any essential properties. Of course, that does not mean that we will not have a primitive essentialist predisposition to go along with our primitive concept of Adam's Ale. We know that Adam's Ale tastes better than that horrid stuff in the ocean that is exceedingly salty. We eventually get very good indeed at picking out the stuff we like and that truly quenches our thirst. In fact, the better we get at picking it out, the more we acquire a low-level theory about Adam's Ale. At any rate, mass terms, such as 'water', and count nouns, such as 'wildebeest',1 are added to our primitive lexicon as they are called for in the execution of the proper functions of one's modules. Wildebeests are identified with the benefit of our shape bias, as they presumably are by the predators of wildebeests.
My suggestion is that the terms used to identify such animals, such as 'brown' and 'four-legged', were acquired (not learned) during the Pleistocene period of our development. Later, those that were able

to pick out the relevant properties in the perceptual space, hunt, survive, and reproduce were selected for. Such individuals acquired what I will call PERCEPTUAL KINDS in the Proper Domain or the Environment of Evolutionary Adaptation (EEA). Such PERCEPTUAL KINDS included color terms like 'brown' and shape terms like 'four-legged'. Of course, it is the concept BROWN that constitutes such a PERCEPTUAL KIND, not the actual term 'brown'. Such PERCEPTUAL KINDS became our perceptually primitive lexical concepts. In the Actual Domain or Actual Environment (AE) in which we now exist, we acquire, but do not learn, what brown is. We use our innate perceptual ability to isolate brown amid the flux of the passing show. For our ancestors, misidentifying a black bear as a wildebeest could spell disaster. The utilization of appropriate perceptual terms that helped to fix the meaning of natural kind terms was no accident. For instance, that it is a good bet to hunt in the south valley, because there is a statistically good chance that lots of prey will be there, is a statistical fact that evolution, but not learning, will provide for a species. Nevertheless, we acquire PERCEPTUAL KINDS so that they will work alongside an evolutionarily selected hunting instinct or module. We possess an innate predisposition to avoid snakes, and the PERCEPTUAL KINDS, e.g., long, thin, round object, that help us to identify snakes are made possible by the shape bias. The prototypical snake, then, emerges as a generalization of the statistical data that our ancestors faced. Prototypes represent the results of categorization strategies developed to aid our ancestors in fight and flight. As such, perceptual cues acquired as perceptually primitive concepts are acquired in the fulfillment of the proper functions of our innate modules. But this story will not work. Prototypes do not compose, as Fodor taught us using the pet fish example.
As such, prototypes cannot be part of the primitive concepts that constitute our informational atomist foundation. Something else is needed here, but what? To answer this question, consider another question: why do humans possess an essentialist predisposition? It's a fact that young children expect objects to have insides that are constitutive of what those objects are. My suggestion is that such a disposition is an evolutionary adaptation. As such, it has a function, just as all modules that are selected for have functions. Acquiring good clean water, and the natural kind term 'water', is, in this sense, not just an abstract, theoretical exercise. Rather, it becomes a biological imperative where reproduction, the goal of natural selection, and its necessary precondition, survival, are at issue. Hence, while prototypes do not compose and so fail to provide the primitives necessary for a compositional semantics, natural kind terms succeed. On my view, natural kind terms and perceptual kind terms are acquired as the result of innate mechanisms. If that is true, we do not learn natural kind terms or perceptual kind terms, and so there is no inductivist circularity objection to face from Fodor. But such natural kind and perceptual kind terms are not, strictly speaking, innate either; rather, the concepts that such terms token are acquired as a contingent consequence of innate predispositions, such as the essentialist and shape biases. Secondly, there is no problem about natural kind terms or perceptual kind terms being compositional, as there is with prototype theory, connectionist theory, and so forth. Moreover, as Fodor (1998) has emphasized, we do not acquire natural kind terms or perceptual kind terms as such, i.e., within a rich theoretical context; we acquire only natural kind terms and perceptual kind terms. We do not require a sophisticated theoretical picture in order to grasp natural or perceptual kind terms, though we may acquire such a theory in the fullness of time.
The view I am defending might be called PlaceHolder Essentialism, because my view is that natural kind terms have no structure; they are informationally atomic terms.2 That is why it is possible to track such kinds without benefit of theory; we lock to natural kinds by virtue of innate mechanisms that hook us up to perceptual kinds. And those perceptual kinds are themselves informationally atomic. Natural kind terms and perceptual kind terms are not innate, nor are they learned. If I am right, one can have one's natural kinds à la Laurence and Margolis without learning (and its attendant Fodorian learning problem) and still retain informationally atomic natural kinds. To conclude:

1) Fodor lacks an acquisition story for concepts but is correct to think that concepts are unstructured symbols or indicators.
2) Fodor is correct to think that we cannot learn primitive terms inductively, on pain of circularity.
3) Laurence and Margolis are correct to think that Fodor has offered no concept acquisition story, but they are wrong to think that primitive lexical terms can be learned, given Fodor's circularity objection.

INTERDISCIPLINES.ORG
Adaptation and Representation
Conclusion: We need to provide an evolutionary account of the primitive terms of conceptual atomism. It should be noted that the sketch of concept acquisition that I have provided deals only with natural kind and perceptual kind terms. It remains an open question whether the account generalizes to other sorts of lexical concepts. Clearly, much work needs to be done to articulate a complete account of evolutionary informational atomism.

References

Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, Mass.: The MIT Press.
Bekoff, M. and Lauder, G., eds. (1998). Nature's Purposes: Analyses of Function and Design in Biology. Cambridge, Mass.: The MIT Press.
Bogdan, R., ed. (1986). Belief. Oxford: Oxford University Press.
Clarke, Murray (2004). Reconstructing Reason and Representation. Cambridge, Mass.: The MIT Press.
Clarke, Murray (with Fred Adams) (2005). Resurrecting the Tracking Theories. Australasian Journal of Philosophy 83(2): 207-221.
Cosmides, L. and Tooby, J. (1994). Origins of Domain-Specificity: The Evolution of Functional Organization. In Hirshfield and Gelman (1994), 85-116.
Cosmides, L. and Tooby, J. (1995). Foreword to Baron-Cohen (1995).
Dretske, F. (1986). Misrepresentation. In Bogdan (1986), 17-36.
Fodor, Jerry (1975). The Language of Thought. Cambridge, Mass.: The MIT Press.
Fodor, Jerry (1987). Psychosemantics. Cambridge, Mass.: The MIT Press.
Fodor, Jerry (1998). Concepts: Where Cognitive Science Went Wrong. New York: Oxford University Press.
Hirshfield, L. and Gelman, S., eds. (1994). Mapping the Mind: Domain-Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Laurence, S. and Margolis, E. (2002). Radical Concept Nativism. Cognition 86: 25-55.
LePore, E. and Pylyshyn, Z., eds. (1998). Rutgers Invitation to Cognitive Science. Oxford: Blackwell.
Margolis, E. (1999). How to Acquire a Concept. In Margolis and Laurence (1999), 549-568.
Margolis, E. and Laurence, S., eds. (1999). Concepts: Core Readings. Cambridge, Mass.: The MIT Press.
Samuels, R., Stich, S., and Tremoulet, P. (1998). Rethinking Rationality: From Bleak Implications to Darwinian Modules. In LePore and Pylyshyn (1998), 130-160.
Wright, L. (1998). Functions. In Bekoff and Lauder (1998), 51-78.

[1] The Concise Oxford English Dictionary defines the Wildebeest or Gnu this way: 'any antelope of the genus Connochaetes, native to S. Africa, with a large erect head and brown stripes on the neck and shoulders' (Concise OED, p. 503).
[2] Note that Medin and Ortony use the term 'essence placeholder' in a similar way in their defence of the Theory-Theory. As Laurence and Margolis suggest: 'This isn't to say that the Theory-Theory requires that people have a detailed understanding of genetics and chemistry. They needn't even have clearly developed views about the specific nature of the property. As Medin and Ortony put it: people may have little more than an essence placeholder' (Laurence and Margolis, 1999, 46).


Ideas that stand the [evolutionary] test of time


Frédéric Bouchard (Université de Montréal)
(Date of publication: 26 March 2007)

Abstract: Evolutionary psychology, memetics and models of cultural evolution focus on reproductive success. I will argue that fitness should in fact be understood in terms of the differential persistence of entities rather than the differential reproductive success of replicators. Understanding evolution in this way shifts the nature of adaptation from reproduction to persistence, changing the means by which representational powers would be selected for in biological systems.

I. Introduction

Evolutionary psychology, memetics and models of cultural evolution focus on adaptationist explanations and their appeal to replicators. This view fleshes out the notion that fitness should be understood in terms of differential reproductive success. I will argue that fitness should in fact be understood in terms of the differential persistence of entities rather than the differential reproductive success of replicators. Understanding evolution in this way shifts the nature of adaptation from reproduction to persistence, changing the means by which representational powers would be selected for in biological systems. After briefly explaining why replication is the Achilles' heel of evolutionary explanations of mind and culture, I will argue that the focus on replication in evolutionary thinking in general is the problem. I will do so by describing some experiments on artificial ecosystem selection in which replicators are not an explicit part of the explanation of adaptations. By going beyond the focus on replicators, we will not only get novel predictions in biology, but also an understanding of adaptation that solves some difficulties for our understanding of the evolution of mind and culture. Because of the format, I can only provide a brief sketch of this view, but hopefully the programmatic description offered here will be suggestive and thought-provoking.

II. Context

Gould and Lewontin (1979) famously argued against an exclusively adaptationist heuristic, showing how many biological traits and their current states could be better explained by appeal to other biological processes that are relatively independent of natural selection (e.g. developmental constraints). However, most attempts to give an evolutionary account of cognition adopt a strongly adaptationist angle. As Buller (2005) argues, 'evolutionary psychology' refers both to a general broad research interest and to a specific narrow research programme (defended among others by Buss, Pinker, Cosmides and Tooby[1]). For Evolutionary Psychology (EP), as Pinker puts it (Pinker, 1997, p. 21): 'The mind is organized into modules or mental organs, each with a specialized design that makes it an expert in one arena of interaction with the world. The modules' basic logic is specified by our genetic program. Their operation was shaped by natural selection to solve the problems of the hunting and gathering life led by our ancestors in most of our evolutionary history.' If we adopt Gould and Lewontin's analysis, we can see that Evolutionary Psychology is an extreme case of adaptationism. Memetics, a distinct theory (Dawkins 1976) geared towards explaining the evolution of cultural entities named memes (i.e. slogans, ads, songs, etc.), adopts a similar adaptationist heuristic. Aside from arguments against adaptationism in general, the projects of EP and of memetics are unsavoury for distinct but related reasons. 1- The human experience of behaviour and culture seems to contradict the genetic hardwiring described by EP. The effects of learning on our behaviour weaken the appeal to an innatist framework. The means of inheritance (i.e. genetics) assumed by EP seem too rigid and too slow to account for the diversity, fluidity and apparent adaptiveness of human behavioural responses. 2- Memetics has an even more fundamental problem: the analogy between genes and memes breaks down when one tries to explain what actually reproduces and how it actually reproduces. These two difficulties are serious flaws (or challenges, for the more generous reader[2]) for two of the more credible evolutionary accounts of mind and culture[3]. Examining these flaws highlights two major building blocks of evolution by natural selection: heritability and differential reproductive success (Brandon 1990, reprising Lewontin's 1978 articulation, which of course also adds variation). Evolutionary accounts of mind and culture wrestle with the difficulties of how heritable differences in behaviour could be passed on from one generation to the next, and of what would constitute the differential success that is necessary to obtain adaptation by natural selection. Difficulties in offering satisfactory explanations for these have hurt evolutionary accounts of mind and culture. What has rarely been suggested, however, is that the problem lies not with explaining the adaptation of behaviour but with how we understand the process of evolution by natural selection in general. The most plausible answers to our queries about mind and culture widen evolutionary theory to include some sort of genetic accommodation, or complex feedback interactions between organisms and their ecological niches (see for example Boyd and Richerson 2005 and Odling-Smee et al. 2003 for an articulation of these promising ideas). I will now give a rough sketch of another alternative. I will briefly describe why evolutionary theory needs to widen its approach in a way that recasts differential success in terms of differential persistence (as opposed to differential reproductive success). The example I will use to show this will purposely not be a cognitive animal.
Hopefully this will show that it is not merely an ad hoc argument to satisfy our desire to provide an evolutionary explanation of behaviour, but rather a necessary change to explain actual cases of adaptation. The upshot for any evolutionary account of mind and culture will be that the lack of credible replicators need not be seen as an absolute obstacle to an explanation of the evolution of mind and culture.

III. Fitness: population size mattered, but what do you do when you don't have populations?

Fitness since Darwin has been understood in terms of survival and reproduction. Because of population genetics and its centrality in contemporary evolutionary biology, fitness now often refers to the frequency of alleles, but in a more general fashion fitness refers to the differential reproductive success of any entity, be it a gene, an organism, or even, for some, a group or species. Dawkins (1976) translated this into terms of the differential success of replicators. Dawkins makes the case that genes are the best replicators around; although he allows that other entities could act as replicators, this theoretical possibility is in fact only actualized by genes and (surprisingly) memes. In any case, an entity with a higher probability of leaving more copies of itself than its competition is fitter than its competition. Although fitness has been understood in terms of survival and reproduction, the reproduction story has overshadowed the survival story. That is why EP and memetics focus their explanations on replicators and how they lead to the phenotypes we are interested in here (i.e. individual and social behaviours). But there are good reasons to believe that this replicator-centered story is not the only game in town. Some biological systems (e.g. some clonal species, certain colonial organisms and symbiotic communities) appear to be evolving; by that I mean that they display adaptive change as a response to their selective environments, and these changes accumulate and are fine-tuned over time in ways that increase the system's capacity to survive. This adaptive change occurs in response to selection on the parts of the system. However, I argue, these systems' evolution is not adequately captured by a concept of evolutionary fitness that is defined solely in terms of differential reproductive success or change in gene frequencies. Let us briefly examine such a case and the insights it gives us into fitness.
We will later see how a new definition of fitness might offer fertile ground for an evolutionary understanding of mind and culture.

The focus on persistence has been around for a long time in ecology (often under the guise of stability). Most advocates of the idea that whole ecosystems could evolve quickly realize that persistence, not reproduction, will be the way to go. Ecosystems obviously do not reproduce, but they do persist, some better than others. Theoretically the idea of ecosystem evolution is interesting, but the problem has always been to identify real cases of it. Ecosystem evolution had until very recently not been identified as a likely evolutionary process (although many believed it was at least a theoretical possibility). Most believed such evolution to be epiphenomenal (Hoffman 1979) or at least very unlikely (Hull 1980). Aside from the theoretical difficulties with this hypothesis, an operational difficulty in testing the ecosystem evolution hypothesis was one of physical scale. How can one go about measuring the evolutionary fate of a whole ecosystem? Ecosystems are relatively large, and it is very difficult to account for all the species constituting one and the interactions between them. But once one realizes that ecosystems or communities do not have to be large relative to human scale, testing evolutionary hypotheses becomes much more manageable. Recent artificial selection experiments provide a good case of artificial ecosystem selection. Swenson and others[4] (2000a, 2000b) describe three experiments in which artificial selection is used to shape the phenotype of whole ecosystems. Let me briefly describe one of their experiments. They take 2 ml of sediment (dirt, bacteria, etc.) and 28 ml of water from a pond for each of 72 test tubes; the tubes are then incubated.
Each tube is then measured for pH level (which was the arbitrary trait they decided to select on, but a good trait for measuring phenotypic change in ecosystems, since the pH level is a feature of the physical substrate, the dirt and the water, and not only of the micro-organisms living in the dirt). They then take the 6 test tubes with the highest pH. From each of these they take 5 ml of mud and add 25 ml of autoclaved pond mixture. And repeat. They did observe an increase in pH level in the winning test tubes. As strange as it seems, the mud samples produced the phenotype that enabled them to survive in this artificial selective environment. More importantly, they were stable enough that the increase in pH level was actually retained across 'generations'[5] and amplified across time. By showing how small malleable ecosystems can be artificially selected for a desired trait, they show that, at least in theory, we could observe the same thing in nature. To make sense of ecosystem evolution, defining fitness in terms of offspring numbers will only take us so far. Microsystems with higher pH persisted better than microsystems with lower pH. The pH level is a trait of the whole ecosystem. The only way for the mud to persist is to change its pH. It does so without reproducing. But its phenotype changes thanks to environmental pressures, and this change persists and increases over time. There are no populations of ecosystems. Again, I am not claiming that reproduction is not involved at all here, but I am claiming that it is not the salient feature in explaining the transformation of the phenotype of the ecosystem as a whole. Extend the experiments above into a thought experiment. Let's say that a higher pH leads to slower erosion. The patches of mud with a higher pH would persist whereas the ones with lower pH would erode. There is natural selection here. But is there evolution? If the patch only gets smaller and smaller, there is just natural selection.
Van Valen (1989) makes a similar point: erosion may be seen as a selective process without there being adaptation; one must not confuse selection and response to selection (see Brandon 1990 for a detailed analysis of this). The latter is what we need in order to have adaptations. As Van Valen also argued, in purely abiotic cases there is likely no response to selection (and therefore no adaptation or evolution). But our thought experiment is not like this. If the patch eventually stabilizes, and moreover grows, thanks in part to the reproductive success of some of its microorganisms but possibly also to the chemical reactions of the physical substrates, AND if the pH increases (leading to less erosion), then it seems we have evolution by natural selection, even though offspring contribution might not be the best way to describe the evolutionary change. To understand the fitness of the ecosystem, one will have to understand how components of that ecosystem (and selection on these components) contribute to the capacity of the system to persist[6].
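The tube-level selection scheme described above can be mimicked in a toy simulation. The sketch below is illustrative only: it compresses each tube into a single heritable 'pH' trait with arbitrary noise parameters of my own choosing, and it is not a model of the actual Swenson et al. protocol. Its point is simply that selecting on a whole-system property, with imperfect transmission, can shift that property over rounds of selection without tracking any replicators.

```python
import random

def run_ecosystem_selection(n_tubes=72, n_selected=6, n_generations=20, seed=0):
    """Toy model of tube-level artificial ecosystem selection.

    Each 'tube' is reduced to one trait (its pH). Each round, the
    n_selected highest-pH tubes seed the next batch of n_tubes, with
    Gaussian noise standing in for ecological and microbial variation.
    All parameter values are hypothetical, not taken from the experiments.
    """
    rng = random.Random(seed)
    tubes = [7.0 + rng.gauss(0, 0.2) for _ in range(n_tubes)]  # start near neutral pH
    history = []
    for _ in range(n_generations):
        winners = sorted(tubes, reverse=True)[:n_selected]  # highest-pH tubes survive
        # Each new tube inherits a winner's pH plus noise (imperfect transmission).
        tubes = [rng.choice(winners) + rng.gauss(0, 0.2) for _ in range(n_tubes)]
        history.append(sum(tubes) / n_tubes)
    return history

means = run_ecosystem_selection()
print(f"mean pH: round 1 = {means[0]:.2f}, round 20 = {means[-1]:.2f}")
```

Under these assumptions the mean pH of the winning lineages drifts steadily upward, even though nothing in the bookkeeping is a population of replicators: what is selected and retained is the whole-tube phenotype.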

Thoday in 1953 suggested that to be fitter is to have a higher propensity to leave even just one offspring in 10^8 years. But why should we talk about offspring at all? If we wish to examine two ecosystems, couldn't we compare their relative fitness in terms of their capacity to still be there in x number of years? Couldn't we say that if this propensity (which will fluctuate over time) is the result of environmental pressures, then what we have is evolution by natural selection? Ecologists have been suggesting concepts like differential persistence for ecosystems for many years, but in fact many cases of evolution below this level of organisation demand such a view. Elsewhere (Bouchard 2004, Bouchard in preparation) I argue that many clonal species (e.g. quaking aspen), colonial organisms (termite colonies) and many cases of symbiosis show responses to selection that can only be explained by appeal to differential persistence, not differential reproduction. This doesn't mean that there are no replicators, but rather that, at least in some cases of evolution, replication is but a means to increased persistence, not the sine qua non condition for adaptation to selective pressures.

IV. From mud to mind

How does this muddying of the waters help us understand the evolution of mind and culture? What were the two difficulties of evolutionary accounts of mind and culture highlighted in the introduction? Heritability and replication. Evolutionary accounts of mind and culture have given explanations that were tied to the arguably implausible efficacy of specific replicators (genes for EP and memes for memetics). In the context of the case we have just briefly examined, we can see that we may not have to identify replicators at all in order to identify adaptations. I am obviously not providing a full story here, but one may assess this possibility in light of its potential advantages. Let's apply the persistence model to our question of interest. How could memetics benefit from a persistence model?

One of the difficulties with the memetics project was in explaining the process of replication of memes. Imitation is often invoked, but one does not get a convincing story as to what is actually being copied and how. Even if we charitably let go of this shortcoming, one would still need a metric to compare the relative success of various memes[7]. A population of memes would be composed of what? In a replication-based view of adaptation, their numbers become the salient feature, but it seems perverse (and unhelpful) to compare the 'wazzup' meme with the swoosh logo on a running shoe on the basis of their numbers. Using probability of persisting as a metric, we could compare the fitness of these very different memes without having to identify population sizes. The number of occurrences is of course linked to increased persistence, but it is not a necessary condition for it (just as in nature, where the number of individuals is only weakly inversely related to the risk of extinction). It may be better, for persisting a long time, to be a rare meme than a very numerous one (think of a super-luxury item like a Fabergé egg). This hints at why reproductive success may not be the best way to understand evolution in all cases. How could Evolutionary Psychology benefit from a persistence model?

The short answer is that in a strict sense it cannot: as mentioned previously, Buller 2005 distinguished between a narrow view of evolutionary psychology (EP) and a broader view. EP, based exclusively on genes to pass on adaptive behavioural capacities, is not obviously compatible with a persistence model of fitness and adaptation. But again, one should realise that reproductive success is a good means for a lineage to persist. Certain human behaviours increase the likelihood that Homo sapiens will be around for some time. These behaviours are passed on in the lineage, hypothetically genetically and probably culturally as well. If EP can extend its research programme to include these non-genetic means of inheritance (to include aspects of memetics or other theories of cultural evolution), it will become a more plausible theory of the evolution of mind and culture. If it accomplishes this openness, a persistence story will become useful for the reasons highlighted above. Much (much) more needs to be said to flesh out this suggestion, but the idea remains: evolutionary psychology and memetics pin their explanatory hopes on finding plausible replicators for the explananda they are interested in, namely human capacities for behaviour and the apparent evolution of culture. I have argued here that the relative failure of these two projects should not be surprising, since evolution by natural selection may not rely exclusively on replicators. As long as you have parts of a system that react differentially to pressures from the environment, and the winners (the ones still around) remain there (retained, inherited, etc.) for future selection events, one gets evolution by natural selection. Most of the time in nature these parts are organisms (or genes, depending on your point of view) and the systems are the species. Most of the time species persist by producing more offspring rather than fewer. But for many biological systems, success lies in the persisting, not the reproducing. In many cases of evolution by natural selection, there are no actual populations. This is potentially positive news for evolutionary accounts of mind and culture. Persistence is an intuitive notion when thinking about culture (think of traditions, for instance), and culture is an important aspect of our behaviours. What I am hinting at here is that persistence may be necessary to understand the adapted nature of our minds.

References

Blackmore, Susan. Imitation and the Definition of a Meme. Journal of Memetics - Evolutionary Models of Information Transmission 2, no. 2 (1998).
Bouchard, Frédéric. Causal Processes, Fitness and the Differential Persistence of Lineages. In Philosophy of Science Association meeting. Vancouver, 2006.
Bouchard, Frédéric. Evolution, Fitness and the Struggle for Persistence. Duke University, 2004.
Bouchard, Frédéric. Fitness. In The Philosophy of Science: An Encyclopedia, edited by Jessica Pfeifer and Sahotra Sarkar, 310-15. Routledge, 2006.
Brandon, Robert N. Adaptation and Environment. Princeton, NJ: Princeton University Press, 1990.
Dawkins, Richard. The Selfish Gene. New York: Oxford University Press, 1976.
Gould, S.J.; Lewontin, R.C. The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme. Proceedings of the Royal Society of London, Series B, Biological Sciences 205, no. 1161 (1979): 581-98.
Lewontin, Richard C. Adaptation. Scientific American 239, no. 3 (1978): 156-69.
Odling-Smee, F.; Laland, K.N.; Feldman, M.W. Niche Construction. Princeton University Press, 2003.
Pinker, Steven. How the Mind Works.
New York: Norton, 1997.
Richerson, Peter J., and Robert Boyd. Not by Genes Alone: How Culture Transformed Human Evolution. Chicago: University of Chicago Press, 2005.
Sober, Elliott, and David Sloan Wilson. Unto Others: The Evolution and Psychology of Unselfish Behavior. Cambridge, Mass.: Harvard University Press, 1998.
Swenson, W.; Wilson, D.S.; Elias, R. Artificial Ecosystem Selection. PNAS 97, no. 16 (2000): 9110-14.
Swenson, W.; Arendt, J.; Wilson, D.S. Artificial Selection of Microbial Ecosystems for 3-Chloroaniline Biodegradation. Environmental Microbiology 2, no. 5 (2000): 564-71.
Thoday, J.M. Components of Fitness. In Symp. Soc. Exptl. Biol. 7, 96-113. Cambridge, 1953.
Van Valen, Leigh M. Biotal Evolution: A Manifesto. Evolutionary Theory 10 (1991): 1-13.
Van Valen, Leigh M. Three Paradigms of Evolution. Evolutionary Theory 9 (1989): 1-17.
Wilson, D.S. Introduction: Multilevel Selection Theory Comes of Age. The American Naturalist 150, Supplement: Multilevel Selection (1997): S1-S4.
Wilson, D.S., and E. Sober. Reviving the Superorganism. Journal of Theoretical Biology 136, no. 3 (1989): 337-56.
Wilson, David Sloan. Evolutionary Biology: Struggling to Escape Exclusively Individual Selection. Quarterly Review of Biology 76, no. 2 (2001): 199-206.

[1] Reprising Buller's distinction (2005, p. 12), capitals will be used for the latter sense.
[2] See Blackmore 1998 for an informative discussion of how to limit the use of the concept of meme to a more helpful context.
[3] In this paper I will treat the evolution of mind and the evolution of culture as one and the same problem, even though they are two distinct explananda (as the distinct projects of Evolutionary Psychology and memetics can attest). However, as I will show, a revised understanding of fitness may benefit both projects.
[4] Note that David Sloan Wilson was the senior researcher on these studies. He has worked extensively on issues surrounding complex biological organisation, such as groups and multi-level selection.
[5] The quote signs are used here since we do not get real generations, but only forced selective events on separate test tubes.

[6] Van Valen (1989, 1991) suggests that biotas do evolve but that their fitness is to be understood in terms of energy control. I will not go through this account here, but elsewhere I describe some of the limitations of that account (see Bouchard 2004 for a detailed analysis of his views).
[7] Is it better (fitness-wise) to be a meme in as many heads as possible, or in fewer heads filled with fewer memes? Is it better to be a visual meme or an audio meme?


The complex vehicles of human thought and the role of scaffolding, internalisation and semiotics in human representation
Robert Clowes (University of Sussex)
(Date of publication: 26 February 2007)

Abstract: In his article Language, embodiment, and the cognitive niche (2006) philosopher Andy Clark presents a picture of the human mind as supervening upon a complex material substrate composed in part of straightforwardly biological systems but also of animal-built structures. Clark's picture is compelling, but if we accept it, the task of understanding human cognition becomes somewhat different to that classically understood. Rather than needing to understand merely the principles of organisation of, say, a language of thought or even the means of neural representation, a proper science of the human mind becomes the search for the principles of operation of a motley assortment of representational vehicles and their processes of organisation, both within and across the border of the organism and through a variety of time scales. Central to this picture of niche-enhanced cognition is the intersection of language and mind. But what are the principles of organisation of this intersection? While there is no wide agreement on an overarching theory, a rough outline of some of the central means of organisation can be depicted. Toward this understanding I focus on three principles of the organisation of human representation which work at different time-scales - across the interface of organism and its social embedding. These are semiotic dynamics, scaffolding and internalization.

The means of the organisation of the human mind: social, biological and technological.

What sort of science should a science of human representation be? Will the main principles and regularities turn out to be continuous with those of the biological sciences, or will we need to look to additional principles? This is rather too large a question, and so my primary purpose here is to draw attention to some of the specifics of the development of human minds and the laws and regularities governing their construction, across various time-scales, in human social systems. I will contrast this approach with two competitors: the first sees our minds primarily as grounds for pseudo-biological replication, while the other sees our mental powers as embedded in (perhaps even composed of) a mesh of technological devices.

I will attempt an alternative path that seeks to understand the systems of representation that govern human minds as being social in origin, formed around our tendency toward sign-mediated interpretation (and, indeed, over-interpretation). By examining some of the principles underlying this form of interpretation, I assess the claim that human representation is structured by, and pivots around, material symbol systems (Clark 2006). While I think there is something basically correct about this claim, I shall argue that the principles by which these systems are structured and maintained are semiotic and can only partially be characterised in a narrowly technological way. Our minds are born embedded in a web of social interpretation, and this allows us to develop unique cognitive abilities. The distinctive character of human cognitive processes may indeed pivot around material symbol systems, but what material symbol systems are is not exhausted by a technological description.

The nature of this embedding is, however, controversial and, I'll argue, often misconceived. Human minds are not merely embedded in a world of technological systems and processes, even when broadly conceived, but are embedded in systems of interpretation. These systems of interpretation are organised at a variety of time-scales, and it is through the co-ordination of interpretational means that our minds come to embody some of their unique powers. In what follows, I will discuss how semiotic systems co-ordinate and construct minds across the organismic boundary, at least as classically understood. Semiotic systems structure minds through many different processes, but we can broadly categorise them through their organisation at three different time-scales. First is the time-scale of semiotic sedimentation: the cultural time-scale over which sign systems are produced, structured and maintained as an emergent property of all of the interpretational activity in a system of communicating beings. Next is the time-scale of semiotic induction, when a child is taken up by a process that I call constructive semiosis and through this inducted into normative social practices, especially auto-interpretation. Finally, there is the time-scale of self-regulation via signs, which encompasses the processes through which we regulate ourselves with semiotically produced means, striving to make ourselves coherent and predictable.

Selectionist theories of culture: an ecology of memes, or semiotic sedimentation.

Memetics proposes a quasi-biological theory of culture, i.e., that there are units of cultural selection that are literally propagated between minds. The propagation of these units is held to be in some sense analogous to the way genes are propagated by the copying of DNA. Tightly defining what precisely is being replicated under the memetic approach is not an easy task, as habits, trends and fads all seem to be equally well captured. However, central to the idea is that beliefs and desires, those old targets of folk psychology, can be copied. Memetics, then, is controversially an 'epidemiology of beliefs' (Sperber 1996).
It proposes a selectionist perspective on how beliefs and desires are propagated through minds. In order to better understand this perspective, we are enjoined by Dennett, among others, to take the meme's-eye view (Dennett 1998). Such a view demands that we see minds not as active agents whose motivational structure can be understood in terms of beliefs and desires, i.e., via the folk-psychological or intentional stance, but rather as receptacles for colonisation by these 'ideas with attitude'.

Space does not allow a detailed critique of memetics, so I'll limit myself to a couple of comments and then contrast it with an alternative account of the dynamics of cultural systems. Perhaps the central problem with memetic theory is that it does not give us any mechanisms for understanding the processes by which memes are replicated. Indeed, the mechanisms by which memes appropriate minds seem quite mysterious. Having cast mind as the medium through which memes are selectively propagated, the memes theorist should surely attempt to identify some of the principles by which this propagation takes place. A review of some of the things that have at one time or another been claimed to be memetically reproduced (e.g. the wearing of baseball caps, belief in God, the first couple of bars of Beethoven's Fifth) might make us suspicious as to whether there are any strong principles to be found in this area.

These memetic principles should also, ex hypothesi, be at least similar to those we find in natural selection. But what are the phenotypes of memes, what are their genotypes, and what are their units and mechanisms of selection? Moreover, there is little indication that much progress has been made in identifying these entities since the original development of the idea (Dawkins 1976). While the possibility of finding respectable analogues for such entities in a memetic science cannot be ruled out at this time, there is a suspicion that the lack of progress hereabouts points toward fundamental difficulties with the idea.

INTERDISCIPLINES.ORG
Adaptation and Representation
There are, I think, competitor theories in this area that make much better sense of the sorts of regularities memetic theory hopes to account for and that retain a selectionist element, but that target as the units of selection not ideas and attitudes but their occasional vehicles and mediators: words and sentences. In addition, we are already starting to understand some of the selectional mechanisms and principles of organisation that stand behind signs and their role in the organisation of mind. These principles of semiotic organisation have little to do, except by very rough analogy, with the principles of operation of DNA and RNA and the way these are transcribed and used to construct and regulate cells and their assemblies and, ultimately, bodies and brains. The argument for memes proceeds by analogy, but I think it is blocked by a much better developed and better targeted theory in this area. I now want to contrast the memes theory with an alternative approach to cultural selection, one that identifies regularities and mechanisms operating not on beliefs themselves but on their intermediaries, i.e., signs.

Semiotics is the theory of sign systems: what signs are, how they work, and how they evolve. Semiotics may once have seemed a purely descriptive or hermeneutic science, a long way from naturalism, but at least since Millikan's (1984) work it has seemed possible to reconcile theories of signs with a naturalist epistemology [1]. However, studying the propagation and maintenance of systems of signs on a mechanistic basis has only recently become practicable, largely through the use of multi-agent simulation systems.

Of great interest in this area is the work of Luc Steels and his collaborators who have used multi-agent systems to model the dynamics of language and other sign systems [2]. I take these studies not to address the actual processes by which signs are propagated and maintained among human beings, but rather how a population of agents can construct and maintain a coherent system of signs using some quite simple interpretational mechanisms. This work has shown how a semiotic system can be regarded as an emergent property of the individual acts of interpretation of an ecology of agents coordinating their activities through a system of shared communicational means, (cf. Steels and Kaplan 1999; Kaplan 2000; Steels 2000).

At the heart of these language-game experiments is not just a selectionist framework but also acts of proto-interpretation. One episode of such interpretational activity can be characterised as follows:

1. A speaking agent formulates a prospective utterance based on an association between a word-form and a meaning-form.
2. The speaker utters its chosen word-form.
3. The listener attempts to map the chosen word-form onto a referent in the environment.
4. Either all is well, and in the event of communicational success both parties strengthen the connection between meaning-form and word-form,
5. Or else the communication is unsuccessful, and both participants weaken the weightings of their previous associations between word-form and meaning-form.

The above description delineates the bare outline of an approach to the emergence of sign-systems garnered from the study of simple multi-agent systems [3]. It allows us the following insights. The dynamics that maintain a semiotic system can arise from the collective activities of interacting agents. They are emergent processes which allow complex systems of sign-organisation to become established from very minimal interpretational activities. They also require no spooky mechanisms by which beliefs are propagated between minds. Signs and their systems are maintained by the process of agents attempting to guess at the referencing acts of other agents [4]. Semiotic sedimentation, as I intend the idea here, is the process by which systems of sign interpretation are not just produced, co-ordinated and maintained between minds, but also accreted into established systems of practice that regulate those minds. Such processes can construct and lay down systems of signs very rapidly [5], but more typically work over timescales longer than the lives of individual agents.
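The five steps above are easy to render concrete. The sketch below is not Steels' actual model, just a minimal naming-game toy in its spirit; the word and meaning inventories, the update increments, the lateral-inhibition damping and the corrective-feedback assumption in step 5 are all invented parameters of my own.

```python
import random

WORDS = ["ba", "do", "ki", "lu", "me", "ta"]   # hypothetical word-forms
MEANINGS = ["sun", "tree", "rock", "bird"]     # hypothetical meaning-forms

class Agent:
    """An agent is just a table of (meaning-form, word-form) association weights."""
    def __init__(self, rng):
        # Tiny random initial weights stand in for each agent's idiosyncratic history.
        self.w = {(m, t): rng.random() * 0.01 for m in MEANINGS for t in WORDS}

    def speak(self, meaning):
        # Steps 1-2: formulate and utter the word most associated with the meaning.
        return max(WORDS, key=lambda t: self.w[(meaning, t)])

    def interpret(self, word):
        # Step 3: map the heard word-form onto a referent.
        return max(MEANINGS, key=lambda m: self.w[(m, word)])

    def reinforce(self, meaning, word, amount):
        self.w[(meaning, word)] += amount
        if amount > 0:  # damp competing words for this meaning (lateral inhibition)
            for t in WORDS:
                if t != word:
                    self.w[(meaning, t)] *= 0.9

def naming_game(n_agents=5, rounds=3000, seed=1):
    rng = random.Random(seed)
    agents = [Agent(rng) for _ in range(n_agents)]
    outcomes = []
    for _ in range(rounds):
        speaker, listener = rng.sample(agents, 2)
        topic = rng.choice(MEANINGS)
        word = speaker.speak(topic)
        guess = listener.interpret(word)
        if guess == topic:
            # Step 4: success strengthens the meaning-word link on both sides.
            speaker.reinforce(topic, word, 0.1)
            listener.reinforce(topic, word, 0.1)
            outcomes.append(1)
        else:
            # Step 5: failure weakens the associations just used; here we also
            # assume corrective feedback, so the listener learns the intended pair.
            speaker.reinforce(topic, word, -0.05)
            listener.reinforce(guess, word, -0.05)
            listener.reinforce(topic, word, 0.1)
            outcomes.append(0)
    return outcomes
```

Run with settings like these, communicative success climbs from near chance toward a shared lexicon: the emergent co-ordination described in the text, arising from nothing but individual acts of guessing and weight adjustment.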

This then is a substantial point of contrast between memetic theory and a theory based on the dynamics of semiotic systems. According to the semiotic approach, it is not beliefs and desires themselves that are refined but, rather, the systems of interpretation of their occasional vehicles: words, sentences and other linguistic devices. These structures are then employed to co-ordinate beliefs and desires.

I claimed above that the memes theory as it stands has not very much by way of proprietary mechanisms with which it could be properly explicated. It is merely the claim that culture is organised by mechanisms that bear a rough analogy to natural selection. By contrast, a theory of semiotic organisation provides potential proprietary mechanisms by which sign systems are organised and, through them, the representational systems of mind. However, the mechanisms by which such sign systems are really structured between minds have so far been sketched only in their barest outline. Such a theory needs to be connected with mechanisms that explain the development of individual minds as sign interpreters. To understand these processes we need to look at a different time-scale.

Scaffolding and constructive semiosis.

In the study of the development of the child, we find a plethora of mechanisms dedicated to the propagation of signs [6]. What we find here are systems of complex processes by which the developing child's activities are regulated, first by interaction with the mother and then by interaction with a much broader society of caregivers and others.

One of the most salient features of this scaffolding arrangement is that mothers do a lot of work, albeit some of it unconscious, to introduce children into the cultural world. A central way in which this is accomplished is by literally interpreting minds into existence, i.e. by applying an interpretational skein of beliefs and desires and their supporting mechanisms. To put this another way, human minds are produced by applying the intentional stance. While I use Dennett's terminology, I want to put a slant on it which, although not entirely absent from Dennett's work, is perhaps not emphasized as much as it should be. Applying the intentional stance to at least a certain class of beings is not merely an interpretational mechanism. Rather, it serves to render their minds explainable and open to prediction. Intentional interpretation, or rather over-interpretation, is a mechanism that makes at least a certain class of systems more felicitously encompassed by the intentional stance in the future. Not all systems are, of course, so affected by interpretation. Thermostats, paramecia, geckos and most primates do not seem to be much affected by being subjected to systematic interpretation. Human infants, on the other hand, are.


Stephen Cowley makes this apparent in research into the way that children's minds are interpreted and transformed by embedding in a particular socio-cultural niche. He shows that there are important and measurable effects on the child's behaviour even in the early weeks of life. In one nice example, Cowley discusses how speakers of isiZulu teach a baby to show ukuhlonipha (glossed as respect). This method of calming babies appears to be quite different from that used by Indian and White populations but is effective from as early as 14 weeks. Moreover, as Cowley shows, the way in which babies take up these cultural patterns is controlled by semiotic means and embedded in an intentionally-inflected interpretational framework [7].

Human infants, virtually from the moment they are born, are caught up in this framework of intentional interpretation. Their mothers almost immediately set about capturing their gestures and actions in a range of activities - descriptions, labelling and pointings-out - that are all intentionally structured.

Further, these systems of interpretation are not very tightly constrained across cultures. Although many philosophers once held that folk psychology is a largely cross-cultural and ahistorical beast, empirical evidence has demonstrated the limitations of this thought (Lillard 1998). Folk psychologies may turn out to be much more variable than was once understood, and they will not necessarily turn out to be intentional in the precise sense of the Western model. However, interpretation into culturally expected standards of mind is something we do find across all cultures. We need to understand the mechanisms by which minds like ours move from being systems that are only poorly captured by the standard belief/desire theory to being consummately covered by it. If I am right, this is another reason (apart from the point that it is a mistake to see folk psychology as theoretical rather than practical) why calling folk psychology a theory is to make a kind of category error, or at least why interpretation is not all that taking the intentional stance consists in. Taking the intentional stance is not just a means of interpretation but a means of transformation [8].

But are these processes mediated by signs, as I have previously claimed they are? To see how, I'll discuss a related theory of how language is involved in cognition which I think bears closely on our discussion, i.e. the theory of material symbols developed by Andy Clark (Clark 2006) [9].

Material Symbols and their semiotic embedding.

According to Andy Clark, language is to be understood not just as a biological adaptation but as a species of external artifact whose current adaptive value is partially constituted by its role in re-shaping the kinds of computational space that our biological brains must negotiate (Clark 1998). Words compose material symbol systems: structures in which we become embedded and which complement the native representational systems of our brains. Through interacting with these systems, we come to develop and sustain the sorts of cognitive powers that are most characteristic of the human species [10].

For Clark, material symbols play a unique role in ongoing human cognition in virtue of their material form. Material symbols function in part by providing pared-down targets for cognition, without some of the rich and distracting concreteness of whatever they stand in for. In one of Clark's illustrative examples, the symbol-trained ape Sheba is able to gain a subtler mode of control over her own activities precisely because she only shallowly interprets the symbol. The original task confronting Sheba is to point to one of two piles of treats. Sheba should point to the smaller pile in order for the experimenters to give her the larger. However, she fails to do this in the face of the rich sensual presence of the treats. Sheba can, however, point to a symbol representing the smaller pile of food and so get the larger pile. Clark argues this demonstrates a central way in which material symbols complement brains, i.e. by providing manipulable but shallowly interpreted stand-ins [11].

Perhaps because Clark is concerned to give an account of the cognitive powers conferred by material symbol systems, he has much less to say about the systems of interpretation on which those symbol systems depend. What could be missed on this account is that the social embedding of symbols not only confers their unique cognitive powers but also makes them symbol systems in the first place. What seems obscured on the material symbol systems view is the way that symbols arise from and are embedded in a system of interpretation, i.e., the way in which symbols are signs.

The importance of semiotic embedding over too extreme a focus on the materiality of symbols becomes clear, I think, when we consider some of the variety in the instantiation of human languages. Sign- or gesture-based languages seem to be every bit as cognitively felicitous as vocally instantiated ones; although there may be some relevant differences between the sorts of sign systems which could be built in either medium, the same underlying mind-producing regularities arise. Minds which are able to manipulate and interpret systems of signs seem to have remarkably similar powers whether the symbols are instantiated in fluttering vocal cords or in gesticulating fingers. In part Clark has accounted for this by arguing that it is because symbols are abstract that they have the cognitive properties they do. Yet this seems to cast doubt on the very materiality claim. It seems that the powers of material symbols are actually multiply realizable, and their cognitive effects can be found quite broadly, whenever the symbols are held in the right sort of interpretational framework.

Clark makes a strong case that our minds have the character they have because of their embedding in a world of linguistic technology. However, it is the interpretational nexus in which this technology is framed that is the real source of some of our most subtle and complex cognitive powers. While the so-called material dimension of symbols is undoubtedly important, it is only by paying attention to their semiotic, interpretational embedding that we can understand their central cognitive properties.

From semiotic coordination to self-interpretation.

We are inveterate self-interpreters, using a complex scheme of hermeneutic means to examine and motivate our mental lives. While some of us are more reflexive in our thinking than others, it seems likely that it is also definitive of being human that we at least some of the time reflect upon our own motivations and desires. Does this reflexive capability rely on semiotic means? I think there is good evidence that it does.

One place to look for clarification of the role of auto-interpretation is in developmental psychology, which pays close attention to micro-genesis, i.e., the transitory episodes by which mind is interpreted into life. What we find in microdevelopmental episodes is the autonomous, creative and second-by-second adoption of semiotic props to structure activities. In a recent paper, Sinha (2005) discusses the way in which Brazilian children appropriate a television character, Beto Carrero, into an episode of play. The television character has a characteristic cowboy hat. Sinha analyses how, in a play episode, this cowboy hat is taken up and re-interpreted in a blend which incorporates some of the characteristics of the TV character but is also feminised and transformed in a way that is appropriate to the gendered characteristics of the children playing the game. This re-interpretation itself pivots around an actual cowboy hat with which the girls are playing. Here we find not a shallow interpretation but a deep and subtle re-interpretation of the material signs available [12].

The use here of semiotic systems and their moment-by-moment incorporation into structured activities is not an exception but the norm of everyday child development and of course of adult life. It is a process of self-construction, of bringing a complex agent to life through self-interpretation according to social norms and regularities.

Through folk practices, children are inducted into a world of intentional ascriptions by which they come to practice interpreting both the minds of others and themselves. But achieving such a feat is not just a matter of conforming to outside pressures to play language games; it is a matter of coming to consider oneself as the proper place where intentional ascriptions find their focus: not merely as a confluence of beliefs and desires but as an agent governed by reasons and socially prescribed norms. Such a view of norm-governed auto-interpretation gives us the key to why beings like us are the proper province of the intentional stance, and not other animals or artefacts that may be derivatively captured with more or less accuracy or under a wider or narrower range of circumstances. We, unlike any other actually existing system, internalise the interpretative systems of folk psychology and use them to govern our actions. Self is the outcome of auto-interpretation mediated by the interpretative system provided by folk psychology.

I have tried to outline the role of semiotic interpretation in the human mind and the part it plays in determining our unique cognitive skills. I have emphasized how this process is organised over different time-scales and how the semiotic approach is at least consistent with the move toward a science of embedded cognitive systems and mind. Semiotic systems are shaped over cultural time, are drawn upon in developmental time, when we are literally interpreted into mindfulness, and are then used by human beings to organise their moment-by-moment ongoing activities. Understanding the nature of representation in human minds will require paying attention to the multiple time-scales over which the means of representation are organised.

References

Bickerton, D. (1990). Language and Species. University of Chicago Press.
Boysen, S. T., Berntson, G., et al. (1996). Quantity-based inference and symbolic representation in chimpanzees (Pan troglodytes). Journal of Experimental Psychology: Animal Behavior Processes 22: 76-86.
Clark, A. (1998). Magic Words: How Language Augments Human Computation. In P. Carruthers and J. Boucher (eds.), Language and Thought: Interdisciplinary Themes. Oxford, Oxford University Press: 162-183.
Clark, A. (2006). Material Symbols. Philosophical Psychology 19(3): 291-307.
Cowley, S. J., Moodley, S., et al. (2004). Grounding Signs of Culture: Primary Intersubjectivity in Social Semiosis. Mind, Culture and Activity 11(2): 109-132.
Dawkins, R. (1976). The Selfish Gene. Oxford, Oxford University Press.
Deacon, T. W. (1997). The Symbolic Species: The Co-Evolution of Language and the Human Brain. The Penguin Press, Penguin Books Ltd.
Deacon, T. W. (1999). Memes as Signs - The trouble with memes (and what to do about them). The Semiotic Review of Books 10(3): 1-3.
Dennett, D. C. (1998). Memes: Myths, Misunderstandings and Misgivings. Manuscript presented at The Chapel Hill Colloquium, October.
Gallagher, S. (2001). The Practice of Mind. Journal of Consciousness Studies 8(5-7): 83-108.
Kaplan, F. (2000). Semiotic schemata: Selection units for linguistic cultural evolution. Proceedings of Artificial Life 7. Cambridge, MA, MIT Press.
Lillard, A. (1998). Ethnopsychologies: Cultural variations in theories of mind. Psychological Bulletin 123: 3-32.
McGeer, V. (2001). Psycho-practice, psycho-theory and the contrastive case of autism: How practices of mind become second-nature. Journal of Consciousness Studies 8(5-7): 109-132.
Millikan, R. G. (1984). Language, Thought and Other Biological Categories. Cambridge, MA, MIT Press.
Sinha, C. (1988). Language and Representation: A socio-naturalistic approach to human development. Harvester Wheatsheaf.
Sinha, C. (2005). Blending out of the background: Play, props and staging in the material world. Journal of Pragmatics 37(2): 1537-1554.
Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. Oxford, Blackwell Publishers.
Steels, L. (1999). The Talking Heads Experiment: Volume I. Words and Meanings. Antwerpen, Laboratorium.
Steels, L. (2000). Language as a Complex Adaptive System. In M. Schoenauer (ed.), Proceedings of PPSN VI. Springer-Verlag.
Steels, L. and Kaplan, F. (1999). Collective learning and semiotic dynamics. In D. Floreano, J. D. Nicoud and F. Mondada (eds.), Advances in Artificial Life (ECAL 99), Lecture Notes in Artificial Intelligence. Berlin, Springer-Verlag: 679-688.
Steels, L., Kaplan, F., et al. (2002). Crucial factors in the origins of word-meaning. In A. Wray (ed.), The Transition to Language. Oxford, UK, Oxford University Press.

1. A naturalist approach to signs can also be found in the work of one of the founders of the field and an influence on Millikan: Charles Sanders Peirce. Several other works (Sinha 1988; Deacon 1997) have also proposed a rapprochement between semiotics and cognitive science on a naturalistic basis.
2. As some of Steels' principal collaborators will later be a part of this interdisciplines conference, and for reasons of space, I will keep my comments here to a minimum.
3. The interested reader is invited to consult Steels, Kaplan, McIntyre and Van Looveren (2002) or Steels (1999).
4. This co-ordination between the producers of signs and their consumers is perhaps best described in the philosophical literature by Ruth Millikan (1984).
5. Examples of the rapid establishment of semiotic systems can be found in the histories of some Creole languages (Bickerton 1990).
6. For some related arguments on the proper understanding of signs and their role in cultural dynamics, see Deacon's (1999) 'Memes as Signs - The trouble with memes (and what to do about them)'.
7. See also Cowley, Moodley and Fiori-Cowley (2004).
8. Some of the motivation behind these comments can be linked to the idea (Gallagher 2001; McGeer 2001) that what we need is not just an account of theories of mind, but of the practices of mind.
9. Not to be confused with the 'Physical Symbol Systems' of Newell and Simon.
10. Clark further argues that the representational powers of the human brain are not substantially changed by this new environment. The brain goes on doing the same old pattern-completing work it has always done, but now magnified by its embedding in external symbolic structures. This view is sometimes contrasted with the one ascribed to Daniel Dennett (1991), on which a virtual machine comes to be installed in the brain by interactions with the world of language.
11. See Clark's (2006) discussion of Boysen, Berntson, Hannan and Cacioppo (1996).
12. I have only scratched the surface of Sinha's (2005) treatment of this episode and the complex processes which lie behind it.


The Theory of Biological Adaptation and Function


Robert Brandon (Duke University)
(Date of publication: 29 January 2007)

Abstract: Modern evolutionary biology provides a naturalistic account of adaptation and function. That account is sketched here. The basic concepts are reasonably clear. The devil is in the details.

The phenomena of adaptation, both the process of adaptation and the products thereof, are central to, if not definitive of, biology. The explanatory problem of adaptation was a major part of Darwin's revolutionary project. Prior to Darwin, the only systematic explanation of design in nature was a designer. (Although Lamarckian explanations of ontogenetic adaptation and transmission were thought by Darwin and others to be responsible for a small fraction of adaptive structures in biology.) Darwin's theory of evolution by natural selection changed all of that. This paper focuses exclusively on that theory. I will not consider extensions of that theory to culture, nor analogues to human artifacts. We will see that Darwin's theory of evolution by natural selection underwrites a thoroughly naturalistic account of functions in the biological world. The extensions of this account to the cultural and artifactual domains are both tempting and perfectly reasonable, but for reasons of space I leave them for others.

1. The Theory of Evolution by Natural Selection.

Consider a simple case of evolution by natural selection. In a population of annual plants there is variation in the height of the plants. Let us say the mean value is 1 meter (other statistical properties of the distribution will not be relevant in this simple example). Taller plants out-reproduce shorter plants, i.e., reproductive output is a (probabilistic) increasing function of height. The ecological reason for this in our scenario is that the taller plants are out-competing the shorter plants for sunlight: because they grow close together in dense stands, the taller ones shade the shorter ones. But let us note now that the same result, talls out-reproducing shorts, could occur for any number of ecological reasons. This is a point to which we shall return.
Finally, height is, to some degree, heritable in this population, i.e., taller than average plants tend to produce taller than average offspring, and similarly for shorter than average plants. Although my description of heritability is mechanism-neutral, and the process of evolution requires no particular mechanism of heritability, let us suppose that in our plants multiple genes at multiple loci control height. Statistically, though there will be exceptions, taller than average plants will have different alleles than shorter plants, and these different genes will be passed on to their offspring. Thus the offspring of tall parents will tend to have tall alleles and the offspring of short parents will have short alleles. Has evolution thus occurred? I would say no. We could take a purely genetic point of view and think of evolution as change in allele frequencies over generational time, but notice that is not how we started our account. We started with a phenotypic description. And, I claim, we need to complete a full generational cycle to give an evolutionary account; thus we have one further step, namely the development of these genotypes into mature plants with heights. Then we have a second-generation height distribution which, given all that has been said, will differ from the previous generation in that its mean will be greater than 1 meter. How much greater depends on quantitative aspects of the account that have not been specified: the strength of selection and the degree of heritability. But the points we are interested in do not depend on making this simple example quantitative.

From this example it is easy to extract what have come to be known as Darwin's Three Conditions. These conditions are necessary, but not sufficient, for evolution by natural selection. The conditions are:

1. Variation. There is variation in phenotypic traits among members of a population.
2. Inheritance. These traits are heritable to some degree, meaning that offspring are more like their parents with respect to the traits than they are like the population mean.
3. Differential reproductive success. Different variants leave different numbers of offspring in succeeding generations. (Lewontin 1978, Brandon 1990)
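The plant example, the three conditions, and the insistence on completing a full generational cycle can be put into a toy simulation. This sketch is my own, not Brandon's: the fecundity function, the heritability of 0.5 and the developmental noise term are arbitrary illustrative choices.

```python
import random

def next_generation(heights, rng, heritability=0.5):
    """One full generational cycle for the plant example: selection on height,
    inheritance with regression toward the mean, then development into new heights."""
    mean_h = sum(heights) / len(heights)
    offspring = []
    for h in heights:
        # Condition 3 (differential reproductive success): expected offspring
        # number is an increasing, probabilistic function of height.
        expected = max(0.0, 2.0 + 10.0 * (h - mean_h))
        n = int(expected) + (1 if rng.random() < expected - int(expected) else 0)
        for _ in range(n):
            # Condition 2 (inheritance): offspring resemble their parent more
            # than the population mean, plus some developmental noise.
            offspring.append(mean_h + heritability * (h - mean_h) + rng.gauss(0.0, 0.05))
    return offspring

rng = random.Random(0)
# Condition 1 (variation): heights vary around a mean of 1 meter.
gen0 = [rng.gauss(1.0, 0.1) for _ in range(1000)]
gen1 = next_generation(gen0, rng)
```

Because all three conditions hold, the second-generation mean height comes out above the first-generation mean; set heritability to 0, or make expected offspring number independent of height, and the response to selection disappears.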

I will not dwell on the necessity of these conditions, since I think this has been explained adequately elsewhere and is fairly obvious, but for our purposes I need to make one point about the lack of sufficiency. In finite populations (and, of course, all biological populations are finite), differential reproduction is still expected in the absence of selection. Take two coins fresh off the mint, as close to physically identical as can be. Toss them both 100 times. It is highly likely that one will yield more heads than the other. Similarly, in the absence of selection, drift will occur. So change, in gene frequencies or in phenotypic distributions, is by no means indicative of selection. We need more than change to invoke selection. What we need is differential adaptedness, or, more specifically, differential adaptedness to a common selective environment.

The Darwinian explanation of condition 3 is the Principle of Natural Selection. I have stated this principle as follows:

PNS: If a is better adapted than b in environment E then (probably) a will have greater reproductive success than b in E. (Brandon 1990)

For this to be an explanatory law, relative adaptedness needs to be appropriately defined. For instance, if we were to define relative adaptedness in terms of actualized reproductive success, then the PNS would be a tautology and explanatorily empty. Elsewhere I have argued for the propensity interpretation of adaptedness (also known as the propensity interpretation of fitness), arguing that such an interpretation renders the PNS explanatory (see Brandon 1978, 1990, Mills and Beatty 1979, Brandon and Beatty 1984). I will simply assume the correctness of that interpretation in the discussion that follows.

2. Adaptation and Environment.

The PNS compares entities a and b in a common environment E. Why? And what does this mean?
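Before turning to that question, the coin-toss point about drift deserves a number. The quick simulation below (the trial count and seed are arbitrary) estimates how often two physically identical fair coins, each tossed 100 times, end up with unequal head counts:

```python
import random

def coins_differ(n_tosses=100, trials=10000, seed=7):
    """Fraction of paired experiments in which two identical fair coins,
    each tossed n_tosses times, yield unequal numbers of heads."""
    rng = random.Random(seed)
    differ = 0
    for _ in range(trials):
        a = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
        b = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
        if a != b:
            differ += 1
    return differ / trials
```

The exact tie probability is C(200,100)/2^200, about 0.056, so the two identical coins differ roughly 94% of the time: differential "reproductive" outcomes with no selection anywhere in sight.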
The basic idea behind the PNS is to localize the adaptive differences in the organisms (or whatever entities we are talking about; for now let us focus on organisms). The contrast would be the following: suppose we take two genetically identical seeds and plant them in two pots, one containing fertile soil, which we water regularly and place in a well-lit location. The other pot contains soil mixed with arsenic; we place it in a dark closet and never water it. Now the plant in the first pot grows well and produces many seeds; the second dies shortly after germination. Do we say the first plant was better adapted than (or fitter than) the second? No. That would be to confuse a within-environment comparison of two different organisms with a comparison of two different environments. Now if I change the case and use two different genotypes to start with, but keep everything else the same, then I have just succeeded in constructing a worthless experiment. Again, almost certainly, the plant in pot one will do well and produce many seeds and the plant in pot two will die before producing any seeds. In the language of experimental design, I would have confounded genotype and environment, and so would not have any meaningful interpretation of the result. To meaningfully compare genotypes, I must do so in a common environment. That idea is the foundation of the longstanding practice in biology of common garden experiments, which are attempts to compare different organisms within a common environment. But all of this sounds epistemological, and I am not here interested in making an epistemological point. Rather, I want to argue that ontologically, evolution by natural selection works by comparing organisms within common selective environments. The point cannot be made fully here, but two facts are pertinent. First, only differences localized within organisms have the possibility of being heritable in the normal sense. This point is more complicated than it might seem.
Good and bad luck can be transmitted across the generations. For instance, if a plant lands in a bad habitat that is large relative to its range of seed dispersal, then that bad luck will be transmitted to its offspring. Suppose another genotype in the same species had the good fortune to land in a good habitat that is again large relative to its dispersal range. This second genotype will increase relative to the first over generational time. But this is not natural selection; rather it is a form of genetic drift, based on a chance distribution event. There is no differential adaptedness involved. (For further discussion of habitat choice vs. chance distribution see Brandon 1990, pp. 60-64.) Second, cumulative adaptive evolution requires the existence, indeed the persistence, of common selective environments. More on this shortly. What is a selective environment? How are such environments individuated? The concept of fitness or adaptedness has received much attention from philosophers and biologists alike. But adaptation, as I have just argued, is always to an environment. One cannot meaningfully speak of adaptation simpliciter.

INTERDISCIPLINES.ORG
Adaptation and Representation

Thus the concept of the environment is the dual of the concept of adaptation. As such it should have received equal attention from philosophers and biologists, but until quite recently it received none at all. The result was not just a lack of conceptual clarity, but completely misguided experimental research programs (see Antonovics, Ellstrand and Brandon 1988 and Brandon and Antonovics 1996). Fortunately, progress has been made. Interestingly, the problem is strictly analogous to the reference class problem in probability theory, and the approach I have taken draws directly on Wes Salmon's solution to the reference class problem in terms of homogeneous reference classes (Salmon 1984). I have argued that a selective environment is homogeneous with respect to types (genotypes or phenotypes) T1, T2, ..., Tn if and only if the relative adaptedness values of those types are constant in that region. (I am intentionally using the vague term "region" here. We might wish to apply this to a spatial transect, a temporal slice, or even something else.) First, notice that this notion is explicitly comparative and relative. One must have at least two types to compare in order to have a selective environment; otherwise there is no relative adaptedness, only absolute adaptedness. And the scale of environmental homogeneity and heterogeneity for one set of genotypes may differ dramatically from that of a different set. The idea just articulated is radically different from more traditional ways of thinking of the environment. The above conception takes an organism-centric point of view. That is, we look at the environment not through our eyes, not by guessing what might be of relevance to the organisms, but by using the organisms as our measuring instruments (Antonovics et al. 1988). We put the organisms out in the environment, see how they do relative to each other, and repeat.
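The homogeneity criterion just stated can be given a small numerical sketch. Everything below (the function names, the tolerance, the fitness figures) is invented for illustration, not drawn from the text; the point is only that what individuates a selective environment is constancy of relative, not absolute, adaptedness:

```python
# Sketch of the homogeneity criterion: a set of regions belongs to ONE
# selective environment for types T1..Tn iff the types' *relative*
# adaptedness values are constant across those regions.
# All numbers and names here are invented for illustration.

def relative_adaptedness(absolute_fitnesses):
    """Scale each type's fitness by the best type's fitness."""
    best = max(absolute_fitnesses)
    return [w / best for w in absolute_fitnesses]

def homogeneous(regions, tol=1e-9):
    """True iff relative adaptedness is the same in every region."""
    baseline = relative_adaptedness(regions[0])
    return all(
        all(abs(a - b) < tol for a, b in zip(relative_adaptedness(r), baseline))
        for r in regions[1:]
    )

# Two genotypes measured in three regions (absolute fitness = seed counts).
# Absolute output differs (lush vs. poor regions), but the 2:1 ratio is
# constant, so these regions form one selective environment for this pair.
same_env = [[100, 50], [40, 20], [10, 5]]

# Here the ranking flips in the last region: a different selective environment.
diff_env = [[100, 50], [40, 20], [5, 10]]

print(homogeneous(same_env))   # True
print(homogeneous(diff_env))   # False
```

Note that the three regions of the first case differ greatly in absolute seed output; what makes them a single selective environment is that the organisms, used as measuring instruments, keep reporting the same relative performance.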
As long as we keep getting the same pattern, we have the same selective environment; a change in relative performance indicates a change in selective environment. This conception of the selective environment has two primary virtues. First, it is operational (see e.g. Brandon and Antonovics 1996). Second, it is this notion, rather than the more simple-minded notion of heterogeneity of the external environment, that is relevant to the important areas of population genetic theory in which the notion of environmental heterogeneity has been invoked, e.g. the evolution of sex and the maintenance of genetic polymorphisms. Finally, let me return to a point mentioned earlier in this section. Some (e.g. Sterelny and Kitcher 1988) have complained that my notion of selective environment is not really relevant to evolutionary biology because it is too narrow: that if applied literally it would break up the biological world at much too fine a scale to be useful for the purposes of evolutionary biology. My reply is that the existence of stable adaptations, whether at the phenotypic or the genetic level, very strongly implies the long-term persistence of selectively homogeneous environments, or at least environments homogeneous with respect to the trait variants in question. Otherwise there is no explanation of the long-term persistence of the trait. Turtles-with-folding-necks is a good evolutionary trick. It has persisted for tens of millions of years (an interesting and complicated story; see Rosenzweig and McCord 1991). The PAX6 gene has persisted in more or less the form we see in fruit flies, zebrafish, mice and humans for 400 million years or so. The only way to account for this impressive long-term stability is very strong stabilizing selection, meaning that any variant off the standard gets strongly selected against.

3. Adaptation and Function.
The Darwinian theory of evolution by natural selection that we have sketched above underwrites a thoroughly naturalistic account of functions in the biological world. But the account has a number of complications that must be recognized in order to apply it properly. We have already pointed to one serious epistemological problem in applying this theory: change, that is, increase or decrease in frequency, is by no means sufficient to indicate that selection has occurred. Drift is constantly changing frequencies in natural populations (Brandon 2006). Differentiating selection from drift raises both interesting conceptual and methodological problems (on the former see Millstein 2002 and Brandon 2005). Space limitations preclude any detailed discussion of that here. But it is worth pointing out that a great deal of practical methodological progress has been made on that point, especially at the molecular level (see, e.g., Bamshad and Wooding 2003). We will focus on a problem that remains even after we have a clear separation of drift from selection. As mentioned above, the PAX6 gene is highly conserved across a broad swath of animal phylogeny. Given the constancy of mutation, the only explanation of this stasis is very strong stabilizing selection. Thus the presence of PAX6, in the form it has and across the broad phylogenetic distribution it has, is due, without any doubt, to selection. Consider the other example mentioned above: turtles that can flex their necks (by bending them sideways, the Pleurodira, or in an S-curve, the Cryptodira) gradually replaced straight-necked turtles (Amphichelydia) four or five times in different regions of the globe, as long ago as the Cretaceous in Eurasia and as recently as the Pleistocene in Australia (Rosenzweig and McCord
1991). The consistency with which this happened strongly implicates selection on this particular trait. That is, the bending neck is adaptively superior to the phylogenetically prior character state of a straight neck. In both cases we can be confident selection has occurred, and that we know the target of selection (thus, to use the language of Sober 1984, there has been selection for PAX6, not merely selection of PAX6; likewise for the bending necks of turtles). And so if an adaptation is a product of the process of evolution by natural selection (Brandon 1978, 1990), then these things are adaptations. And so, I claim, they have functions. Their functions are those effects that make them adaptively superior to the trait variants with which they compete. But knowing that they are adaptations does not automatically allow us to identify their functions. To do that, we must have the ecological explanation of the adaptive superiority of the trait variant in question relative to its competition and relative to the relevant selective environment. That ecological understanding does not follow either directly or indirectly from the statistical methods used to detect selection. Thus, especially at the molecular level, it is quite possible to know that something is an adaptation, and so that it has a function, while being quite ignorant of its function. Neither of the two cases I have discussed is such a case, though PAX6 was one where there was initial misunderstanding of its function. When it was initially discovered that a mouse PAX6 gene could induce an ectopic eye in a fruit fly, two erroneous conclusions were drawn: first, that the fly eye and the mammalian eye were homologues, which is certainly false; and second, that the function of PAX6 was to induce eye development. The latter is not flatly false, but incomplete, like saying that the function of a refrigerator is to chill Champagne. That is an incomplete description of the function of a refrigerator.
Likewise, PAX6 turns out to be a general-purpose developmental regulatory gene that is involved in much more than eye development (see Brandon 2005 for discussion and references). Now we know its function much more fully. The turtle case is one where we have a very plausible ecological explanation of the adaptive superiority of bending necks: it allows turtles to retract their heads under their shells for protection from predators. Although we are dealing with selection in the distant past in many different habitats, the assumption is that predation was a constant problem and that, everything else being equal, a bending neck is therefore superior to a non-bending one. Thus the function of the bending neck is to provide protection for the turtle's head from predators. (See Brandon 1990, chap. 5 for a discussion of the general difficulties of constructing the necessary ecological explanations to support attributions of function.)

4. The Necessity of a Hierarchical Point of View.

Strictly speaking, this section is unnecessary, given a correct ecological account of selection. But it is worth being explicit on the point that the only possibly adequate general account of selection and adaptation must be hierarchical. A single-level account, for example a purely genic account, could not possibly account for adaptations in nature. Although this refutes some philosophical positions, once understood, there can be no residual controversy. Ironically, looking back at the discovery of one of the first genuine cases of genic selection nicely illustrates this point. When Doolittle and Sapienza (1980) and Orgel and Crick (1980) first discovered what we now know to be the common phenomenon of repetitive genomic sequences, they were initially puzzled. They asked how the organism could benefit from these repetitive sequences. No plausible answer emerged. In the words of Doolittle and Sapienza, they had to reject "the phenotype paradigm" in order to finally understand this phenomenon.
Benefit to the organism was not the issue. Why? Because the process that produced these repetitive sequences was not a selection process among organisms. Rather, the process was a within-cellular process of a bit of DNA copying itself and inserting that copy elsewhere in the genome, thus out-competing bits of DNA not doing that, or doing it at a slower rate. In other words, this was another, lower, level of selection, to be understood in terms of what benefits accrue to the entities competing at that level. Organismic benefit is irrelevant (in the first instance; of course, selection can, and often does, occur simultaneously at multiple levels). What is good for a sequence of DNA may, or may not, be good for the organism in which it is housed. Genuine genic selection is indifferent to organismic benefit. The classic case of meiotic drive in mice is one where what is good at the genic level is clearly deleterious at the organismic level (males homozygous for the t-allele are sterile). There is always a potential for conflict of interest among different levels of selection (e.g., cell lineage in cancer vs. organism). Any adequate theory of adaptation must recognize this. And so it must be hierarchical.
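The two-level process described here, in which a DNA element spreads within genomes while imposing a cost on the organisms that house it, can be sketched as a toy simulation. All rates, starting numbers and names below are invented for illustration; this is not a model from the selfish-DNA literature:

```python
# Toy two-level model of "selfish DNA": an element replicates within
# genomes, while each copy slightly lowers the fitness of its host
# organism.  All rates and starting numbers are invented for illustration.

DUP = 0.20    # per-generation within-genome growth rate of the element
COST = 0.01   # organismal fitness lost per element copy

def step(pop):
    """One generation: within-genome replication, then organismal selection.

    `pop` maps copy-number classes to their population frequencies.
    """
    # Within-genome replication: each element lineage grows by (1 + DUP).
    pop = {copies * (1 + DUP): freq for copies, freq in pop.items()}
    # Between-organism selection: weight each class by host fitness.
    weighted = {c: f * max(0.0, 1 - COST * c) for c, f in pop.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}

# Start: 10% of organisms carry 5 copies, 90% carry 1 copy.
pop = {5.0: 0.1, 1.0: 0.9}
for _ in range(20):
    pop = step(pop)

mean_copies = sum(c * f for c, f in pop.items())
print(round(mean_copies, 1))  # mean copy load has risen despite the host cost
```

In this sketch, organismal selection slows but does not stop the element's spread (the heavily loaded class is eventually eliminated, yet mean copy number still climbs), mirroring the point that what is good for a sequence of DNA need not be good for the organism housing it.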

5. Summary.

Modern evolutionary biology does provide a naturalistic account of adaptation and function. That account has been briefly outlined here. The process of adaptation is simply the process of evolution by natural selection. The products of that process we label adaptations. That concept, then, is explicitly historical. To say something is an adaptation is to say something about its causal history, just as to label a mountain volcanic is to say something about its causal history. Adaptations are adaptations to specific environments, and they are adaptations in virtue of the specific effects that made the trait variant in question adaptively superior to the variants with which it has competed. These effects can be referred to as the function of the adaptation. Given that there are conflicts of interest among different levels of biological organization, the theory of adaptation must be explicitly hierarchical. Conceptually, I think this is reasonably clear. But as we have seen, there are considerable difficulties lurking in the details.

References

Antonovics, J., Ellstrand, N. C. and Brandon, R. N. (1988) Genetic variation and environmental variation: Expectations and experiments. In Plant Evolutionary Biology (ed. by L. D. Gottlieb and S. K. Jain), pp. 275-303. Chapman and Hall.
Bamshad, M. and Wooding, S. P. (2003) Signatures of natural selection in the human genome. Nature Reviews Genetics 4: 99-111.
Brandon, R. N. (1978) Adaptation and evolutionary theory. Studies in History and Philosophy of Science 9: 181-206.
Brandon, R. N. (1990) Adaptation and Environment. Princeton University Press.
Brandon, R. N. (2005) Evolutionary modules: Conceptual analyses and empirical hypotheses. In Modularity: Understanding the Development and Evolution of Natural Complex Systems (ed. by W. Callebaut), pp. 51-60. The MIT Press.
Brandon, R. N. (2005) The difference between selection and drift: A reply to Millstein. Biology and Philosophy 20: 153-170.
Brandon, R. N. (2006) The principle of drift: Biology's first law. Journal of Philosophy 103 (7): 319-335.
Brandon, R. N. and Antonovics, J. (1996) The coevolution of organism and environment. In R. Brandon, Concepts and Methods in Evolutionary Biology. Cambridge University Press.
Brandon, R. N. and Beatty, J. (1984) The propensity interpretation of fitness: No interpretation is no substitute. Philosophy of Science 51: 342-347.
Doolittle, W. F. and Sapienza, C. (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284: 601-603.
Lewontin, R. C. (1978) Adaptation. Scientific American 239: 212-230.
Mills, S. K. and Beatty, J. (1979) The propensity interpretation of fitness. Philosophy of Science 46: 263-286.
Millstein, R. (2002) Are random drift and natural selection conceptually distinct? Biology and Philosophy 17: 33-53.
Orgel, L. E. and Crick, F. H. C. (1980) Selfish DNA: The ultimate parasite. Nature 284: 604-607.
Rosenzweig, M. L. and McCord, R. D. (1991) Incumbent replacement: Evidence for long-term evolutionary progress. Paleobiology 17: 202-213.
Salmon, W. (1984) Scientific Explanation and the Causal Structure of the World. Princeton University Press.
Sober, E. (1984) The Nature of Selection. MIT Press.
Sterelny, K. and Kitcher, P. (1988) The return of the gene. Journal of Philosophy 85 (7): 339-361.


The Evolution of Misbelief


Ryan McKay (School of Social Sciences and Liberal Studies, Charles Sturt University, Australia) This paper is co-authored by Daniel Dennett.
(Date of publication: 15 January 2007)

Abstract: A prevailing assumption is that those beliefs which maximise survival will be those which best approximate reality. True beliefs are seen as adaptive beliefs, and false beliefs, or misbeliefs, are seen as dysfunctional and maladaptive. Contra this assumption, we explore the extent to which certain misbeliefs might in fact be adaptive, in which case we may have an evolved predisposition to form them. A misbelief is simply a false belief, or at least a belief that is not correct in all particulars. We can see this metaphorically: if truth is a kind of target that we launch our beliefs at, then misbeliefs are to some extent wide of the mark. Of course, there is no philosophical consensus about just what a belief actually is. In most of what follows we intend to avoid this question, but we offer here the following working definition of belief, hopefully general enough to cover most representationalist and dispositional accounts: a belief is a functional state of an organism that represents that organism's endorsement of a particular state of affairs as actual. A misbelief, then, is a belief that to some degree departs from actuality, i.e. it is a functional state endorsing a particular state of affairs that happens not to obtain. A prevailing assumption is that those beliefs which maximise survival will be those which best approximate reality (D. C. Dennett 1987, Millikan 1984a, 1984b, 1993). Humans are thus assumed to have been biologically engineered by evolution to form true beliefs. The notion that true beliefs are adaptive is excellently summarized by M. Scott Peck, author of The Road Less Traveled: The more clearly we see the reality of the world, the better equipped we are to deal with the world. The less clearly we see the reality of the world, the more our minds are befuddled by falsehood, misperceptions and illusions, and the less able we will be to determine correct courses of action and make wise decisions.
Our view of reality is like a map with which to negotiate the terrain of life. If the map is true and accurate, we will generally know where we are, and if we have decided where we want to go, we will generally know how to get there. If the map is false and inaccurate, we generally will be lost (1978, p. 44). Peck's book, of course, is a work of pop psychology rather than cognitive science, and as such is more concerned with what is psychologically adaptive than with what is evolutionarily adaptive. Nevertheless, his map analogy resonates with the earlier view of Ramsey that beliefs are maps by which we steer (1931, p. 238). According to Paglieri (2006), this is a view shared by most contemporary philosophers interested in practical reasoning, although others have emphasised that talk of beliefs as maps and true beliefs as accurate maps is just metaphorical: I believe that Interstate 5 runs from Solana Beach to La Jolla, but there is nothing in my brain that has the shape of the southern California freeway system (Stich 1990, p. 102). In any case, it remains a widely held assumption that organisms that operate with accurate notions about how the world is structured, and about how it works, are better suited to navigating the world and to satisfying their needs for survival. Shifting metaphorical gears, our beliefs about the world (about what is or isn't true) are essentially tools that enable us to act effectively in the world. So beliefs are tools, and true beliefs, it is assumed, are effective tools. A corollary of this assumption is that we tend to adopt an alethic construal of proper belief-formation-system functioning: we consider belief-formation systems to be functioning properly when belief formation is predicated upon truth-aiming, alethic reasons (Mele 1993, Millikan 1993). If, however, evolution has designed us to accurately appraise the world and to form true beliefs, how are we to account for the routine exceptions to this rule, namely instances of misbelief?
After all, no-one can deny that our beliefs do often miss the mark, alethically speaking. Most of us believe propositions that
end up being disproven, many of us produce beliefs that others consider obviously false to begin with, and some of us form beliefs that are not just manifestly but bizarrely false. How can this be? The prevailing answer to this question is that misbeliefs result from glitches or breakdowns in the machinery of belief formation. If we conceive of the belief formation system as an information-processing system that takes certain inputs (e.g. perceptual inputs) and (via manipulations of these inputs) produces certain outputs (beliefs), then misbeliefs arise from dysfunction in the system. Such misbeliefs are the faulty output of a disordered, defective, abnormal cognitive system. The fact that we are not presently equipped with fail-safe belief-formation systems does not tell against an evolutionary perspective, any more than do the facts that we are not currently endowed with light-speed nervous systems (Stich 1990) or infallible visual systems. This is because evolution does not necessarily produce optimally designed systems (Dawkins 1982, Stich 1990), and in fact often conspicuously fails to do so. It would be panglossian to think otherwise (Gould & Lewontin 1979, Voltaire 1759/1962): Brilliant as the design of the eye is, it betrays its origin with a tell-tale flaw: the retina is inside out … No intelligent designer would put such a clumsy arrangement in a camcorder (D. C. Dennett 2005, p. 11). Evolutionary explorations in Design Space are constrained, among other things, by economic considerations (beyond a certain level, system improvements may exhibit declining marginal utility; Stich 1990), historical vicissitude (the appropriate mutations must occur if selection is to act on them) and the topography of the fitness landscape (selection cannot access optimal design solutions if it must traverse a fitness valley to do so; D. C. Dennett 1995).
Evolution, in short, is an imperfect design process, and the products of that process (like the products of other imperfect designers) are imperfect; sometimes the mechanisms miss their target (truth) because of their imperfections. Our intention in this paper is to offer a gloss on the prevailing evolutionary view of misbelief that corrects some important oversimplifications. We agree that misbeliefs can indeed result from imperfections in the belief formation system. We argue, however, that not all misbeliefs arise that way; specifically, there are certain situations in which misbelief can actually be adaptive. In those situations, therefore, we can expect that we will be evolutionarily predisposed to form misbeliefs. In short, misbelief evolves. Note that what we are claiming here is not merely that misbeliefs can occasionally arise in the normal course of the belief formation system's operations. After all, just as there are normal (non-defective) readers who misread words on occasion, so there are normal belief-formation systems that produce misbeliefs on occasion; moreover, in some circumstances, these systems may produce misbeliefs even while functioning normally. Millikan articulates this possibility: [T]hat John has a false belief need not indicate that his belief-manufacturing mechanisms are faulty. Indeed, it need not indicate that anything in him is abNormal (except the belief). Perhaps his belief-making mechanisms have been laboring under external conditions not Normal for performance of their proper functions … Similarly, when John perceives things wrongly this is not always the fault of his perceptual systems.
Sometimes Normal conditions for proper functioning of these systems are not met, as when the train on the track next to the Latvian express leaves the station but John's perception is that it is his train that is leaving instead … Because our belief-making systems are dependent for their proper operation upon numerous conditions for which the body's systems are not responsible, it is not surprising if many of the beliefs of perfectly healthy people are false (Millikan 1993, p. 74, italics in original). The fact that our belief-formation systems depend for their proper operation upon certain external conditions is a limitation of those systems, but not a defect. Because evolution is an imperfect design process, the systems we have evolved for representing reality are bound to be limited. However, other things being equal, we might expect that our belief-formation systems would be designed to minimise the occurrence of errors, that is, to minimise misbelief. If true beliefs are adaptive (Millikan 1993), then surely it is adaptive to maximise their number? In the next section we will show that this assumption is false: in certain cases it is adaptive to have more rather than fewer misbeliefs.

Error Management Theory: We note that it is easy to dream up anomalous, offbeat scenarios where true beliefs are in fact detrimental to survival: [Harry] believed that his flight left at 7:45 a.m. … Harry's belief was true, and he got to the airport just on time. Unfortunately, the flight crashed, and Harry died. Had Harry falsely believed that the flight left at 8:45, he would have missed the flight and survived. So true belief is sometimes less conducive to survival than false belief (Stich 1990, p. 123). As Stich (1990) notes, cases such as this are highly unusual, and do little to obviate the claim that true beliefs are generally adaptive (see also Millikan 1993). After all, natural selection does not act on anomalous particulars, but rather upon reliable generalizations. Our question, then, is whether there might be cases where misbelief is systematically adaptive. In many circumstances, perhaps most (but not all, as we shall claim), the ideal belief-formation system would be one that formed completely accurate beliefs 100% of the time (Haselton & Buss 2000, Stich 1990). Given that such a system is virtually impossible, however (Stich 1990), trade-offs may arise between overall doxastic accuracy and accuracy in certain situations. Dennett illustrates this point: [I]t might be better for beast B to have some false beliefs about whom B can beat up and whom B can't. Ranking B's likely antagonists from ferocious to pushover, we certainly want B to believe it can't beat up all the ferocious ones and can beat up all the obvious pushovers, but it is better (because it costs less in discrimination tasks and protects against random perturbations such as bad days and lucky blows) for B to extend "I can't beat up x" to cover even some beasts it can in fact beat up. Erring on the side of prudence is a well-recognized good strategy, and so Nature can be expected to have valued it on occasions when it came up (D. C. Dennett 1987, p. 51, fn. 3).
Stich echoes the logic of this scenario with an example of his own: Consider, for example, the question of whether a certain type of food is poisonous. For an omnivore living in a gastronomically heterogeneous environment, a false positive on such a question would be relatively cheap. If the organism comes to believe that something is poisonous when it is not, it will avoid that food unnecessarily. This may have a small negative impact on its chances of survival and successful reproduction. False negatives, on the other hand, are much more costly in such situations. If the organism comes to believe that a given kind of food is not poisonous when it is, it will not avoid the food and will run a substantial risk of illness or death (1990, pp. 61-62). What these examples suggest is that when there are reliable asymmetries in the costs of errors (Bratman 1992), i.e. when one type of error (false positive or false negative) is consistently more detrimental to fitness than the other, a system that is biased toward committing the less costly error may be more adaptive than an unbiased system.[1] The suggestion that biologically engineered systems of decision and belief formation exploit such asymmetries is the basis of Error Management Theory (EMT; Haselton forthcoming, Haselton & Buss 2000, 2003, Haselton & Nettle 2006). According to EMT, cognitive errors (including misbeliefs) are not necessarily malfunctions reflecting limitations of evolutionary design; rather, such errors may reflect judicious systematic biases that maximise fitness despite increasing overall error rates. Haselton and Buss (2000) use EMT to explain the established phenomenon whereby men overperceive the sexual interest and intent of women (e.g. Abbey 1982, Haselton 2003).
They argue that, for men, the perception of sexual intent in women is a domain characterised by recurrent cost asymmetries, such that the cost of inferring sexual intent where none exists (a false-positive error) is outweighed by the cost of falsely inferring a lack of sexual intent (a false negative). The former error may cost some time and effort spent in fruitless courtship, but the latter error will entail a missed sexual, and thus reproductive, opportunity: an altogether more serious outcome as far as fitness is concerned.
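The cost-asymmetry logic that EMT builds on can be made concrete with a small expected-cost calculation, in the spirit of Stich's poisonous-food example. All probabilities and costs below are invented for illustration and are not taken from the EMT literature:

```python
# Error Management Theory sketch: with asymmetric error costs, a biased
# judge can beat an unbiased one on expected cost even though it makes
# MORE errors overall.  All probabilities and costs are invented.

P_POISON = 0.05          # base rate of poisonous food
COST_MISS = 100.0        # eat poison (false negative): possibly fatal
COST_FALSE_ALARM = 1.0   # skip safe food (false positive): minor loss

def error_rate(p_flag_if_poison, p_flag_if_safe):
    """Overall probability of misclassifying a random food item."""
    return P_POISON * (1 - p_flag_if_poison) + (1 - P_POISON) * p_flag_if_safe

def expected_cost(p_flag_if_poison, p_flag_if_safe):
    """Average fitness cost per food item under the given detector."""
    return (P_POISON * (1 - p_flag_if_poison) * COST_MISS
            + (1 - P_POISON) * p_flag_if_safe * COST_FALSE_ALARM)

unbiased = (0.90, 0.10)   # flags poison 90% of the time, safe food 10%
paranoid = (0.99, 0.40)   # hair-trigger: many false alarms, few misses

print(error_rate(*unbiased), expected_cost(*unbiased))   # ~0.10  ~0.60
print(error_rate(*paranoid), expected_cost(*paranoid))   # ~0.38  ~0.43
```

With these (made-up) numbers, the paranoid detector makes nearly four times as many errors overall yet carries a lower expected fitness cost, which is exactly the systematic bias EMT predicts selection to favour.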

For women, the pattern of cost asymmetries is basically reversed. The cost of inferring a man's interest in familial investment where none exists (a false-positive error) would tend to outweigh the cost of falsely inferring a lack of such interest (a false negative). The former error may entail the woman consenting to sex and being subsequently abandoned, a serious outcome indeed in arduous ancestral environments. The latter error, on the other hand, would tend merely to delay reproduction for the woman, a less costly error, especially given that reproductive opportunities are generally easier for women to acquire than for men (Haselton forthcoming). EMT thus predicts that women will tend to underperceive men's intention to commit, a prediction that has received empirical support (Haselton forthcoming, Haselton & Buss 2000). Other EMT predictions that have received empirical support include the hypotheses that recurrent cost asymmetries have produced evolved biases toward overinferring aggressive intentions in others (Duntley & Buss 1998, Haselton & Buss 2000), particularly members of other racial and ethnic groups (Haselton & Nettle 2006, Quillian & Pager 2001); toward overinferring potential danger with regard to snakes (see Haselton & Buss 2003, Haselton & Nettle 2006); toward underestimating the arrival time of approaching sound sources (Haselton & Nettle 2006, Neuhoff 2001); and, reflecting Stich's (1990) example above, toward overestimating the likelihood that food is contaminated (see Rozin & Fallon 1987, Rozin, Markwith, & Ross 1990). One objection that might be raised at this point is that the above examples need not actually involve misbelief. Stich's omnivore need not believe that the food in question is poisonous; it need not believe anything one way or the other about the food. The issue here is what doxastic inferences we can draw from the animal's behaviour. Most commentators would agree that prudent action should not be confused with belief.
After all, we always look before crossing a road, even when we are almost positive that there is no oncoming traffic. Our actions in such a case should not be read as reflecting a belief that there is an oncoming vehicle, but rather as reflecting a belief that there might be an oncoming vehicle (and the absence of a vehicle will not render the latter belief false). If we had to bet our lives one way or another on the matter, we might well bet that there isn't an oncoming vehicle (Bratman 1992). Betting one's life one way or the other, however, is a paradigm case of error symmetry (if we're wrong, we die no matter which option we choose). In everyday cases of crossing the road, by contrast, the errors are radically asymmetrical: an error one way may indeed mean serious injury or death, but an error the other way will mean only that we have wasted the energy required to turn our heads a couple of times. The upshot of this criticism is that tendencies to overestimate the likelihood that food is contaminated, to overperceive the sexual interest of women, or to overinfer aggressive intentions in others may reflect judicious decision criteria for action rather than misbeliefs. In other terminology, such tendencies may reveal not (mis)beliefs, but merely acceptances: To accept a proposition is to treat it as a true proposition in one way or another […] to act, in certain respects, as if one believed it (Stalnaker 1984, pp. 79-80). Nature may well prefer to err on the side of prudence, but does she need to instil erroneous beliefs to accomplish this? Or can she make do with cautious acceptances? We move on now to a consideration of evolutionarily adaptive cases where it seems necessary to invoke actual, bona fide (mis)beliefs.

Positive Illusions: The perception of reality is called mentally healthy when what the individual sees corresponds to what is actually there. ~ Jahoda (1958, p.
6) In parallel with the prevailing evolutionary view of adaptive belief, a number of psychological traditions have regarded close contact with reality as a cornerstone of mental health (Gana, Alaphilippe, & Bailly 2004, Krebs & Denton 1997, Taylor & Brown 1988). A substantial body of research in recent decades, however, has challenged this view, suggesting instead that optimal mental health is associated with unrealistically positive self-appraisals and beliefs. Taylor and colleagues (e.g. Taylor 1989, Taylor &

INTERDISCIPLINES.ORG
Adaptation and Representation
Brown 1988) refer to such biased perceptions as positive illusions. Given that positive illusions are defined as beliefs that depart from reality (Taylor & Brown 1988), they qualify as misbeliefs. Such illusions include unrealistically positive self-evaluations, exaggerated perceptions of personal control or mastery, and unrealistic optimism about the future. For example, evidence indicates a widespread tendency for people to see themselves as better than others on a range of dimensions. This is the better-than-average effect (Alicke 1985): individuals, on average, judge themselves to be more intelligent, honest, persistent, original, friendly and reliable than the average person. Most college students tend to believe that they will have a longer-than-average lifespan, while most college instructors believe that they are better-than-average teachers (Cross 1977). Most people also tend to believe that their driving skills are better than average, even those who have been hospitalised for accidents (e.g. McKenna, Stanier, & Lewis 1991, Williams 2003). In fact, most people view themselves as better than average on almost any dimension that is both subjective and socially desirable (Myers 2002). Indeed, with exquisite irony, most people even see themselves as less prone to such self-serving distortions than others (Friedrich 1996, Pronin, Gilovich, & Ross 2004, Pronin, Lin, & Ross 2002). Positive illusions may well be pervasive, but are they adaptive, evolutionarily speaking? For example, do such misbeliefs sustain and enhance physical health and fitness? Research into positive illusions has indicated that unrealistically positive views of one's medical condition and of one's ability to influence it are associated with increased health and longevity (Taylor, Lerner, Sherman, Sage, & McDowell 2003).
For example, in studies with HIV-positive and AIDS patients, those with unrealistically positive views of their likely course of illness showed a slower illness course (Reed, Kemeny, Taylor, & Visscher 1999) and a longer survival time (Reed, Kemeny, Taylor, Wang, & Visscher 1994, for a review see Taylor, Kemeny, Reed, Bower, & Gruenewald 2000). Taylor et al. (2000) conjectured that positive illusions might work their medical magic by regulating physiological and neuroendocrine responses to stressful circumstances. Stress-induced activation of the autonomic nervous system and the hypothalamic-pituitary-adrenocortical (HPA) axis facilitates fight or flight responses and is thus adaptive in the short-term. Chronic or recurrent activation of these systems, however, may be detrimental to health (see McEwen 1998), so psychological mechanisms that constrain the activation of such systems may be beneficial. Consistent with the above hypothesis, Taylor et al. (2003) found that self-enhancing cognitions in healthy adults were associated with lower cardiovascular responses to stress, more rapid cardiovascular recovery, and lower baseline cortisol levels. Results linking positive illusions to health benefits are consistent with earlier findings that patients who deny the risks of imminent surgery suffer fewer medical complications and are discharged more quickly than other patients (Goleman, 1987, cited in Krebs & Denton 1997), and that women who cope with breast cancer by employing a denial strategy are more likely to remain recurrence-free than those utilising other coping strategies (Dean & Surtees 1989). In such cases the expectation of recovery appears to facilitate recovery itself, even if that expectation is unrealistic. 
This dynamic may be at work in cases of the ubiquitous placebo effect, whereby the administration of a medical intervention instigates recovery before the treatment could have had any direct effect, and even when the intervention itself is completely bogus (Humphrey 2004). What is striking about these phenomena of positive illusions, from the point of view of the theorist of beliefs as representations, is that they uncover the implicit holism in any system of belief-attribution. To whom do the relevant functional states represent the unrealistic assessment? If only to the autonomic nervous system and the HPA axis, then theorists would have no reason to call the states misbeliefs at all, since the more parsimonious interpretation would be an adaptive but localized tuning of the error management systems within the modules that control these functions. The fact that this apparently benign and adaptive effect has been achieved by the maintenance of a more global state of falsehood (as revealed in the subjects' responses to questionnaires, etc.) is itself, probably, an instance of evolution's sub-optimality as an engineer: in order to achieve this effect, evolution has to misinform the whole organism. With this model to guide us, we may discover similar free-floating rationales (Dennett 1995, 2006) for some of the bizarre religious beliefs (apparent misbeliefs) encountered by ethnographers and other social scientists.

References

Abbey, A. (1982) Sex differences in attributions for friendly behavior: Do males misperceive females' friendliness? Journal of Personality and Social Psychology 42: 830-838.
Alicke, M. D. (1985) Global self-evaluation as determined by the desirability and controllability of trait adjectives. Journal of Personality and Social Psychology 49: 1621-1630.
Bratman, M. E. (1992) Practical reasoning and acceptance in a context. Mind 101(401): 1-15.
Cross, P. (1977) Not can but will college teachers be improved? New Directions for Higher Education 17: 1-15.
Dawkins, R. (1982) The extended phenotype, Freeman.
Dean, C. & Surtees, P. G. (1989) Do psychological factors predict survival in breast cancer? Journal of Psychosomatic Research 33(5): 561-569.
Dennett, D. C. (1987) The intentional stance, MIT Press.
Dennett, D. C. (1995) Darwin's dangerous idea, Penguin.
Dennett, D. C. (2005, August 28) Show me the science. The New York Times, p. 11.
Dennett, D. C. (2006) Breaking the spell: Religion as a natural phenomenon, Viking.
Duntley, J. & Buss, D. M. (1998) Evolved anti-homicide modules. Human Behavior and Evolution Society Conference, Davis, CA.
Friedrich, J. (1996) On seeing oneself as less self-serving than others: The ultimate self-serving bias? Teaching of Psychology 23(2): 107-109.
Gana, K., Alaphilippe, D. & Bailly, N. (2004) Positive illusions and mental and physical health in later life. Aging & Mental Health 8(1): 58-64.
Gould, S. J. & Lewontin, R. C. (1979) The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London, Series B 205(1161): 581-598.
Haselton, M. G. (2003) The sexual overperception bias: Evidence of a systematic bias in men from a survey of naturally occurring events. Journal of Research in Personality 37(1): 34-47.
Haselton, M. G. (forthcoming) Error management theory. In: Encyclopedia of social psychology, eds. R. F. Baumeister & K. Vohs, Sage.
Haselton, M. G. & Buss, D. M. (2000) Error management theory: A new perspective on biases in cross-sex mind reading. Journal of Personality and Social Psychology 78(1): 81-91.
Haselton, M. G. & Buss, D. M. (2003) Biases in social judgment: Design flaws or design features? In: Responding to the social world: Implicit and explicit processes in social judgments and decisions, eds. J. P. Forgas, K. D. Williams & W. von Hippel, Cambridge University Press.
Haselton, M. G. & Nettle, D. (2006) The paranoid optimist: An integrative evolutionary model of cognitive biases. Personality and Social Psychology Review 10(1): 47-66.
Humphrey, N. (2004) The placebo effect. In: Oxford companion to the mind, ed. R. L. Gregory, Oxford University Press.
Krebs, D. L. & Denton, K. (1997) Social illusions and self-deception: The evolution of biases in person perception. In: Evolutionary social psychology, eds. J. A. Simpson & D. T. Kenrick, Lawrence Erlbaum Associates.
McEwen, B. S. (1998) Protective and damaging effects of stress mediators. New England Journal of Medicine 338: 171-179.
McKenna, F. P., Stanier, R. A. & Lewis, C. (1991) Factors underlying illusory self-assessment of driving skill in males and females. Accident Analysis & Prevention 23(1): 45-52.
Mele, A. (1993) Motivated belief. Behavior & Philosophy 21(2): 19-27.
Millikan, R. (1984a) Language, thought and other biological categories, MIT Press.
Millikan, R. (1984b) Naturalistic reflections on knowledge. Pacific Philosophical Quarterly 65(4): 315-334.
Millikan, R. (1993) White queen psychology and other essays for Alice, MIT Press.
Myers, D. (2002) Social psychology, McGraw-Hill.
Neuhoff, J. G. (2001) An adaptive bias in the perception of looming auditory motion. Ecological Psychology 13: 87-110.
Paglieri, F. (2006) Belief dynamics: From formal models to cognitive architectures, and back again. University of Siena.
Peck, M. S. (1978) The road less traveled, Simon & Schuster.
Pronin, E., Gilovich, T. & Ross, L. (2004) Objectivity in the eye of the beholder: Divergent perceptions of bias in self versus others. Psychological Review 111(3): 781-799.
Pronin, E., Lin, D. Y. & Ross, L. (2002) The bias blind spot: Perceptions of bias in self versus others. Personality and Social Psychology Bulletin 28(3): 369-381.

Quillian, L. & Pager, D. (2001) Black neighbors, higher crime? The role of racial stereotypes in evaluations of neighborhood crime. American Journal of Sociology 107: 717-767.
Ramsey, F. (1931) The foundations of mathematics and other logical essays, Routledge & Kegan Paul.
Reed, G. M., Kemeny, M. E., Taylor, S. E. & Visscher, B. R. (1999) Negative HIV-specific expectancies and AIDS-related bereavement as predictors of symptom onset in asymptomatic HIV-positive gay men. Health Psychology 18: 354-363.
Reed, G. M., Kemeny, M. E., Taylor, S. E., Wang, H.-Y. J. & Visscher, B. R. (1994) Realistic acceptance as a predictor of decreased survival time in gay men with AIDS. Health Psychology 13: 299-307.
Rozin, P. & Fallon, A. E. (1987) A perspective on disgust. Psychological Review 94: 23-41.
Rozin, P., Markwith, M. & Ross, B. (1990) The sympathetic magical law of similarity, nominal realism, and neglect of negatives in response to negative labels. Psychological Science 1: 383-384.
Stalnaker, R. (1984) Inquiry, MIT Press.
Stich, S. (1990) The fragmentation of reason, MIT Press.
Taylor, S. E. (1989) Positive illusions: Creative self-deception and the healthy mind, Basic Books.
Taylor, S. E. & Brown, J. D. (1988) Illusion and well-being: A social psychological perspective on mental health. Psychological Bulletin 103: 193-210.
Taylor, S. E., Kemeny, M. E., Reed, G. M., Bower, J. E. & Gruenewald, T. L. (2000) Psychological resources, positive illusions, and health. American Psychologist 55: 99-109.
Taylor, S. E., Lerner, J. S., Sherman, D. K., Sage, R. M. & McDowell, N. K. (2003) Are self-enhancing cognitions associated with healthy or unhealthy biological profiles? Journal of Personality and Social Psychology 85(4): 605-615.
Voltaire, F. M. A. (1759/1962) Candide, Washington Square Press.
Williams, A. F. (2003) Views of U.S. drivers about driving safety. Journal of Safety Research 34(5): 491-494.

[1] Millikan (1993, p. 91) makes a related suggestion: "[Belief-fixing] devices might even be, in a sense, designed to deliver some falsehoods. Perhaps, given the difficulty of designing highly accurate belief-fixing mechanisms, it is actually advantageous to fix too many beliefs, letting some of these be false, rather than fix too few beliefs. Coordinately, perhaps our belief-consuming mechanisms are carefully designed to tolerate a large proportion of false beliefs."


Functions, Modules and Dissociation: A Quibble


Bruce Glymour (Kansas State University)
(Date of publication: 11 December 2006)

Abstract: Two functional notions are introduced, and used to characterize the nature of evolutionary constraints. I argue that these constraints show that dissociation is a bad test of modularity: the kind of modularity dissociation tests for does not support the assumption that natural selection optimizes functional roles, and directs attention away from the variables most relevant to the evolution of behavior.

1. Introduction. Work by a host of philosophers over the last three decades has bequeathed us two distinct conceptions of biological function, causal role (CR) functions and proper etiological or evolutionary (ER) functions, each with a menu of bells and whistles. While some have suggested that a unification is possible, none is at present available. Those who use functional notions in practice move back and forth between the conceptions, often inferentially. It is with such inferences that I am nominally concerned. I say nominally because I think attention to them illuminates the importance of modularity, at least for those interested in the evolution of behavior, and this, as much as anything else, is my aim here. In any case, the points I wish to address concern functions of each kind with a particular set of bells and whistles, outlined below. If those bells and whistles exclude functional categories about which you care, or include too many, no matter: my concern is with inferential practice rather than proper definition. Functional claims are causal claims, of a sort. I know how to think about causal inference only when causal relations are held to obtain between variables; the particular set of bells and whistles deployed below allows translation of talk about the function of objects or properties into talk of the function of variables and their values. I claim no originality for the ideas; the causal background is in Spirtes, Glymour and Scheines (2000), and nearly all of the ideas about function and modularity can be found or are intimated somewhere else, e.g. Faber (1984) and Magwene (2001). The presentation is of course mine, as are one or two extrapolations. I take a causal system to be a causal structure over variables and a set of mathematically expressible dependencies between variables in that structure. Sets are represented by bolded letters, variables are italicized, and variable values are capital letters in plain font. 
Variables are taken to be measured on individual units in some population of units; a causal system over variables characterizes one or more units in the population. Causal structures are represented by directed graphs. Directed edges between nodes represent asymmetric relations of direct causation, relative to the set of variables in the graph. If there is a directed path from V to Y (a sequence of edges, all lying head to tail, out of V and into Y), then V is a cause, simpliciter, of Y. It is assumed that the joint probability density over S factors according to the Causal Markov and Causal Faithfulness conditions.
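The graph-theoretic reading of "cause, simpliciter" is just directed reachability, which can be sketched in a few lines. The adjacency structure and node names below are invented for illustration and are not taken from the paper's figures:

```python
from collections import deque

def is_cause(graph, v, y):
    """True if there is a directed path from v to y, i.e. v is a cause,
    simpliciter, of y (graph is an adjacency dict: node -> list of children)."""
    seen, frontier = set(), deque([v])
    while frontier:
        node = frontier.popleft()
        for child in graph.get(node, ()):
            if child == y:
                return True
            if child not in seen:
                seen.add(child)
                frontier.append(child)
    return False

# Hypothetical structure: V -> M -> P -> SRS
g = {"V": ["M"], "M": ["P"], "P": ["SRS"], "SRS": []}
assert is_cause(g, "V", "SRS")      # directed path V -> M -> P -> SRS
assert not is_cause(g, "SRS", "V")  # no directed path back
```

Note that directed reachability is asymmetric, matching the asymmetry of direct causation the edges are meant to represent.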

2. Causal Roles with Bells. Many advocates of CR functions have been content that a variable should have a function just in case it has an effect. I am less generous, if only because the inferences with which I am concerned involve CR functions in a specific kind of causal system. Graphically, a variable is functional only if it is on a cycle, i.e. a directed path from the focal variable back to itself (Fig. 1).


Figure 1.
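The "on a cycle" condition is likewise a reachability check: is there a directed path from the focal variable back to itself? A minimal sketch (the graph is a hypothetical feedback structure, not Fig. 1 itself):

```python
def on_cycle(graph, v):
    """True if there is a directed path from v back to v, i.e. v lies on a cycle."""
    stack, seen = list(graph.get(v, ())), set()
    while stack:
        node = stack.pop()
        if node == v:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False

# Hypothetical feedback structure: V -> M -> V, with a dead-end effect M -> P
g = {"V": ["M"], "M": ["V", "P"], "P": []}
assert on_cycle(g, "V")      # V -> M -> V, so V counts as functional here
assert not on_cycle(g, "P")  # P has no effects, hence no path back to P
```

On this reading, P has an effect-less position in the graph and so fails the stricter CR-function test even though it is caused by a functional variable.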

If a directed graph is diachronic, variables will be time-stamped. Suppose two time-stamped variables Vt and Vt′ have values that represent an identical range of properties of a unit, and are measured at distinct times t and t′. Then we may say that the values of Vt and Vt′ represent the same variable V measured at different times. Say that a t-cycle is a directed path in a diachronic graph between two such time-stamped variables (Fig. 2). T-cycles and cycles share certain features that will be important in what follows, so though they differ in others I will call both cycles.

Figure 2.

Systems with cycles form an interesting class because they can generate peculiar phenomena, among them: one or more focal variables can equilibrate, or may exhibit a cyclical pattern of values over time. Either phenomenon may be phase-dependent, e.g. the focal variable exhibits different cyclical patterns in different contexts. For a name, call either sort of phenomenon patterned behavior. For another, say that a system S of variables is an economy if 1) the associated causal graph is connected, 2) there is at least one cycle in the graph, and 3) there is some variable P exhibiting patterned behavior P in S. An example is given in Fig. 3.

Figure 3.

An n-function is specified by the causal role of a variable V on some cycle in an economy E producing patterned behavior P. Exactly what a role is depends on context and interest, but may include at least the following: the immediate causes and effects of V, the mathematical form of the dependencies between V and its immediate causes and effects, and the set of cycles in E on which V occurs.

Figure 4.

A single variable may be connected to itself by several cycles. If a variable lies on two or more cycles, the cycles may have no variables in common other than the focal variable, or they may share several variables. If the cycles are nested, as in Fig. 4, I will take them to be part of a single economy. If cycles are not nested, and each cycle generates a distinctive patterned behavior, as in Fig. 5, then I take it to be a matter of choice whether we diagnose one economy defined over variables in all cycles, or a distinct economy for each choice of cycle and patterned behavior. But, if the latter choice is made, cycles not included in a given economy will constitute constraints on the functional organization of that economy. E.g., if in Fig. 5 an economy E is defined over S={V, M1, M2, P}, then the economy E′ defined over S′={V, I1, I2, P′} constrains E.

Figure 5.
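To make the notion concrete, here is a toy numerical economy with a t-cycle whose focal variable equilibrates, one kind of patterned behavior. The linear dependencies and all coefficients are invented purely for illustration:

```python
# Toy t-cycle: P_{t+1} depends on P_t and on V_t, which in turn depends on P_t.
# The negative feedback through V drives P to a stable equilibrium value,
# i.e. the system exhibits "patterned behavior".

def step(p):
    v = 10.0 - 0.5 * p        # V_t depends on P_t (invented dependency)
    return 0.8 * p + 0.2 * v  # P_{t+1} depends on P_t and V_t

p = 0.0
for _ in range(200):
    p = step(p)

# Fixed point of p = 0.8p + 0.2(10 - 0.5p), i.e. 0.3p = 2, so p = 20/3
assert abs(p - 20 / 3) < 1e-6
```

Replacing the negative feedback with an overshooting one would instead produce oscillation, the other kind of patterned behavior mentioned above.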

3. ER Functions with Whistles.

Individual units in a population may differ in the causal system over some set S of variables, so that the population includes a mixture of units with different economies over S. The economy over S for a unit may then be regarded as the value of a non-quantitative variable E defined over units in the population, with values E1, ..., En, each representing a specific causal system over S. If in such a population E is a cause of survival and reproductive success (hereafter, SRS) through P, then an n-function of V with respect to E and P is also an evolutionary function (hereafter an e-function) of V provided the usual further conditions are satisfied. Units characterized by the variables in S must reproduce; the value of E for a unit must be heritable; there must have been variation in E in the lineage to which a unit belongs; this variation must imply variation in fitness among units; and it must be that the fitness of units with E=Ei was higher than that of units with different values of E. Given this definition, it will not follow from the fact that V has an e-function in E with respect to P that this e-function is evolved. E can be selected for against alternative economies even if the causal role of V is identical in all instantiated values of E. Put differently, that V has an e-function in E with respect to P does not imply that V's n-function in E is optimized, or even improved, with respect to the influence of P on SRS. V's n-function may fail to be optimized for trivial reasons, e.g. the optimal variant never occurred, or for structural reasons. Among the structural reasons is the possibility that V has an effect on SRS that is not mediated by P. A special case occurs when V is in distinct but not alternative economies, both influencing SRS through different patterned behaviors. The secondary n-function of V in such cases represents a constraint on the evolution of E. In what follows some terminology will be helpful. Consider the graph (Fig. 6) of two economies below. Let us call any variable which is in both economies a shared variable (here, V), any variable that is on a cycle that contains a shared variable a constrained cyclical variable (M1, M2, I1 and I2), any variable that is a cause of a constrained cyclical variable a constrained non-cyclical variable (C1 and C2), and any other variable an unconstrained variable (C3).
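The selective conditions in this definition can be caricatured with a toy replicator model. Everything here (fitness values, starting frequencies, generation count) is invented; the point is only that a heritable economy variant whose patterned behavior confers higher fitness spreads, the minimal setting in which an n-function can also count as an e-function:

```python
# Two heritable economy variants E1, E2 over the same variable set S, whose
# patterned behaviors give them different (hypothetical) fitnesses.

fitness = {"E1": 1.1, "E2": 1.0}   # fitness via patterned behavior P (invented)
freq = {"E1": 0.01, "E2": 0.99}    # the fitter economy starts rare

for _ in range(1000):              # heritable variation + fitness differences
    mean_w = sum(freq[e] * fitness[e] for e in freq)
    freq = {e: freq[e] * fitness[e] / mean_w for e in freq}

assert freq["E1"] > 0.99           # selection carries E1 toward fixation
```

Note that nothing in this dynamic requires the causal role of V inside E1 to be optimal; E1 need only beat the instantiated alternatives, which is the paper's point that e-functionality does not imply an optimized n-function.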

4. Inferences. Space prevents any sustained discussion of inference from n-function to e-function. But it is important to note that the strongest available inferences, inferences that can be formalized and quantitatively assessed with respect to reliability, are inferences from the fact that some evolutionary force will tend to remove a trait from a population to the claim that the trait is e-functional. But such inferences do not permit a characterization of the relevant e-function. Inferences to more fully characterized e-functions typically depend essentially on a characterization of current n-functions, with the strength of the inference and the specificity of the inferred e-function varying directly with the specificity of the known n-function. As a consequence we are commonly in a position to diagnose at most only the e-functionality of traits involved in generating patterned behavior. My concern here is with inferences from e-functionality to n-functions. It has seemed plausible to some that evolutionary considerations can illuminate the n-function of a trait. The idea seems to be that even if we do not know the exact e-function of a trait, we can sometimes plausibly infer that the trait is e-functional in virtue of its influence on some patterned behavior P=P*. Supposing the trait is e-functional, we can then ask what the causal role of that trait would be in an economy that optimized P with respect to SRS. An answer tells us something about the causal structure characterizing the economy generating P, and about the n-function of the trait in that economy. Various extensions are possible, and occur in the literature. Arguments of these kinds nowadays nearly always presuppose modularity, and for good reason. Modularized phenotypes are supposed to be subject to independent evolution under natural selection. It is only given such independence that the assumption of optimization is at all plausible.

Reasons to doubt the cogency of such arguments are legion. I wish to focus on just one: optimization requires the absence of constraint, but standard tests for modularity are not tests for the absence of constraint. Let P* be the optimal value of P with respect to SRS; let E* be an economy that produces P=P* with maximal efficiency; and let V be some variable in E*. E* will necessarily be selected for over alternatives only if every path from V to SRS goes through P. Any such path that is not mediated by P imposes a constraint on the evolution of E: the optimal value of E is that which maximizes fitness, and this in turn depends on all paths from an economy to SRS. The causal role of V in E* will have been selected to optimize P with respect to SRS only if V itself is unconstrained. In particular, if V is a constrained cyclical variable, e.g. is on cycles in distinct economies E and E′, then the causal role of V may be relatively stable under selection on P. Changes in E which involve new causes of V will modify the dependence relations between V and its causes in E′. So if E* is the optimal value of E relative to P, these changes have a fitness cost not borne by values of E that leave the causal role of V intact. Further, since the effects of a variable on a cycle are its own (future) causes, this cost is borne by values of E that change the effects of V, when those effects are on the cycle between V and itself. This is not to say that constrained cyclical variables are immune to evolution (indeed, there is evidence that shared cyclical variables are especially liable to evolved change, cf. Fraser, 2005). It is to say that such variables do not evolve in ways that optimize their role in any particular economy. And it is to say further that if economies containing such variables do optimize the patterned behavior they produce, much of the evolution by which they came to optimize this behavior will have involved changes in the causal role or value of unconstrained variables.
There are then two relevant notions of independent evolvability. Economies are independently evolvable in a weak sense if natural selection can lead to the evolution of changes in one economy without also leading to changes in others (the quasi-independence of Lewontin, 1978). This requires only the presence of at least one unconstrained, unshared variable in the target economy. Economies are independently evolvable in another, strong, sense if they contain no constrained variables. The arguments from e-functionality to n-function glossed above presuppose the strong sense of independence. So the question is whether standard tests for modularity can establish anything like such independence. As it happens, they cannot. I set aside tests employing genotype-phenotype maps (e.g. Mezey et al. 2000) and functional specialization. The first (and much the best) fails to consider the functional role of phenotypes in systems of phenotypic variables, and the second reduces to the more standard test of dissociation. I take it that two phenotypes are dissociated if there is some context in which they vary independently of one another. Double dissociation occurs if there is some context in which the first phenotype can be modified without a concomitant modification in the second, and some other context in which the second can be modified without concomitant variation in the first. Suppose the phenotypes in question are patterned behaviors P and P′, generated by economies E and E′. P and P′ are dissociable just in case there is some intervention on some variable that modifies the value of P without also modifying the value of P′. That is possible so long as E contains any one unshared, unconstrained variable. Double dissociation requires only that E′ also contain an unshared, unconstrained variable. Clearly, dissociation is really bad evidence for the claim that E, and the role of any arbitrary variable V in E, are unconstrained by E′.
Hence, dissociation is also really bad evidence for strong independent evolvability. Tests for modularity, then, do establish independent evolvability in the weak sense. Dissociation establishes the presence of an unconstrained variable; that variable and its causal role are subject to evolution by natural selection, and so dissociated modules can be modified by natural selection without changes in other economies. But standard tests, and dissociation in particular, do not establish independent evolvability in the strong sense, which requires the absence of constraint. So much, perhaps, differs from common knowledge only in the causal details by which constraints are characterized. I rehearse the result because I think the details bear further consideration.
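The weakness of dissociation as a test can be made vivid with a toy linear system (all dependencies and numbers invented): intervening on an unshared variable dissociates the two patterned behaviors even though the shared variable, and hence the constraint, is untouched:

```python
def system(c1, c2):
    """Two toy economies sharing v; every dependency here is invented."""
    v = 1.0              # shared variable, on cycles in both E and E'
    p = v + c1           # patterned behavior P of economy E (c1 unshared)
    p_prime = v + c2     # patterned behavior P' of economy E' (c2 unshared)
    return p, p_prime

p0, q0 = system(c1=0.0, c2=0.0)
p1, q1 = system(c1=5.0, c2=0.0)   # intervene on the unshared variable c1
assert p1 != p0 and q1 == q0      # P moves, P' does not: dissociation
# ...yet v is still shared, so E' still constrains the causal role of v in E.
```

One unshared manipulable variable per economy is all dissociation (and, symmetrically, double dissociation) ever detects; it is silent about whether anything is shared.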

5. Conclusion: Independence, Modularity and Function. I do not know whether some particular conception of modularity will turn out to adequately serve the interests of theoreticians and experimentalists, or whether, as with functions, no univocal notion will do. I suppose that any adequate notion will require at least the weak sense of evolutionary independence. I do claim, however, that conceptions of modularity that do not also require independence in the strong sense will not underwrite inferences to optimization. I also claim that dissociation is an inadequate test of modularity if modularity is taken to underwrite inferences to optimization. I wish to suggest something further. Modules can be defined so that they are independent in the strong sense. Suppose we say that variables have n-functions relative to modular economies, and take a set S of variables to be modular relative to a larger set U of variables if S is connected and there is some proper subset N of S such that conditioning on N renders every other member of S probabilistically independent of every member of U not in S. Then there will be no shared cyclical variables in modular economies. However, the economies operative in modules so defined may generate more than one selectively relevant patterned behavior. To see why, consider Fig. 6. There are here two economies, E and E′, but only one modular economy, E+E′. This is because the set S={V, C1, C2, C3, M1, M2, P} over which E is defined is not modular: if V is omitted from N then I2 and V are associated conditional on N; if V is included in N then I2 and M2 are associated conditional on N. To induce the relevant independence relations, we must collapse economies that share cyclical variables. Consequently, V has an n-function in E+E′ relative not only to P and P′ individually, but also to their conjunction, which is itself a patterned behavior.
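The conditional-independence clause of this definition is the familiar screening-off relation, which can be checked numerically. In the invented linear Gaussian chain U → N → X below (U outside the module, N the conditioning set, X the rest of the module; coefficients are arbitrary), U and X are marginally correlated, but their partial correlation given N vanishes:

```python
import math

# Structural model (invented coefficients, unit-variance noises):
#   N = 0.8*U + e1,   X = 0.9*N + e2
var_U = 1.0
var_N = 0.8**2 * var_U + 1.0
var_X = 0.9**2 * var_N + 1.0
cov_UN = 0.8 * var_U
cov_NX = 0.9 * var_N
cov_UX = 0.9 * cov_UN

r_UX = cov_UX / math.sqrt(var_U * var_X)
r_UN = cov_UN / math.sqrt(var_U * var_N)
r_NX = cov_NX / math.sqrt(var_N * var_X)

# Partial correlation of U and X given N; zero means conditional independence
# in the Gaussian case.
partial = (r_UX - r_UN * r_NX) / math.sqrt((1 - r_UN**2) * (1 - r_NX**2))

assert abs(r_UX) > 0.4      # U and X are marginally dependent...
assert abs(partial) < 1e-9  # ...but conditioning on N screens X off from U
```

A shared cyclical variable blocks exactly this kind of screening off, which is why economies sharing such variables must be collapsed into one modular economy.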
It is this n-function, and no other, that is identical with its e-function: the e-function of V in E+E′ must be defined with respect to the optimal tradeoff between optimal values of P and optimal values of P′. I think this is revealing. In order to validate the assumption that evolution optimizes a given behavior we must define modules so that they are strongly independent, else the possibility of evolutionary tradeoffs between otherwise optimal behaviors threatens to undermine the assumption that evolution optimizes any one of those behaviors. But modules so defined may potentially generate several evolutionarily relevant behaviors, internalizing, as it were, the tradeoff. And realizing that tradeoff is, in some sense, the function of cyclical variables shared by distinct non-modular economies operating within the larger modular economy. So my suggestion: dissociation is essentially a test for the presence of unconstrained variables in an economy. Insofar as the notion of independence used in conceptualizing modularity is tested by dissociation, a focus on modularity is a focus on the causal role of such variables. But if one is interested in the evolution of behavior these are perhaps the least interesting of variables. As the above definition of modularity reveals, it is shared variables rather than unconstrained variables which embody the tradeoffs between components of fitness. Characterizing those tradeoffs is, of course, one of the really hard problems in the evolution of behavior, e.g. the evolution of life-histories. So for the evolutionary behaviorist, at least, the action would appear to be in shared cyclical variables. If modularity is not useful for illuminating their evolutionary role, it may turn out not to be so terribly interesting after all.

References

Faber, R. (1984): Feedback Selection and Function: A Reductionistic Account of Goal-Orientation, in Methodology, Metaphysics and the History of Science, R. Cohen and M. Wartofsky (eds.), pp. 43-136, D. Reidel Publishing, Dordrecht, Holland.
Fraser, H. (2005): Modularity and evolutionary constraint on proteins, Nature Genetics, 37(4): 351-352.

Lewontin, R. (1978): Adaptation, Scientific American, 239(3): 156-159.
Magwene, P. (2001): New Tools for Studying Integration and Modularity, Evolution, 55(9): 1734-1745.
Mezey, J., J. Cheverud and G. Wagner (2000): Is the Genotype-Phenotype Map Modular? A Statistical Approach Using Mouse Quantitative Trait Loci Data, Genetics, 156: 305-311.
Spirtes, P., C. Glymour and R. Scheines (2000): Causation, Prediction and Search, 2nd edition, MIT Press, Cambridge, Mass.


The Dynamic Nature of Representation


Mark Bickhard (Lehigh University)
(Date of publication: 27 November 2006)

Abstract: Representation has emerged in evolution, and it likely emerges quotidianly in constructive processes of learning and development. Accounting for the possibility of such emergence is among the deepest contemporary challenges to naturalism. A model of representation is outlined, called interactivism, which accommodates a naturalistic evolutionary emergence. It is in the generally pragmatist tradition, though it differs in crucial ways from pragmatist notions of representation. It has resources for accounting for higher-level forms of representation and cognition, such as representations of objects, abstractions, rationality, language, and so on.

The Dynamic Nature of Representation (1)

Representation did not exist moments after the Big Bang; it does now. Representation has emerged. Accounting for that emergence is among the central problems of naturalism today. I outline a model of the emergent nature of representation called interactivism. This model is in the general tradition of pragmatism, and fits well with the evolutionary and biological ground for representation.

Representation. Interactivism models representation as emergent in a particular kind of biological function, so the first focus is to model the emergence of biological function (2). The normativity of representation derives from that of biological normative function (Bickhard, 1993), and the normativity of biological function derives from certain thermodynamic considerations.

Self-Maintenance and Function. There are two general kinds of stability of patterns of process:

1) Some organizations of process are in energy wells, in the sense that a change in the organization would require the introduction of energy above what is currently impinging on the process. Atoms, molecules, and much of the standard furniture of the world are temporally persisting because of such energy well stabilities.

2) The second form of stability is that of processes that are far from thermodynamic equilibrium. Such a process will move toward equilibrium, and thus cease to exist, unless some active counterinfluence is operative. Thus, such processes are open systems of ontological necessity: if cut off from their environments, they cannot remain far from equilibrium, and they cease. In some cases, those influences are completely external to the system itself. A chemical bath can be maintained in a far-from-equilibrium condition, for example, by the pumping of appropriate chemicals into the chamber. Any stability is dependent on the continuing operation of the pumps and the availability of the chemicals.

Self-Maintenant Systems.
A more interesting case for current purposes, however, is the class of far-from-equilibrium systems that make contributions to their own stability. A canonical example is a candle flame. A candle flame maintains above-combustion-threshold temperatures; it vaporizes wax into flammable gases; and in standard atmospheric and gravitational conditions it induces convection, which pulls in fresh oxygen and gets rid of waste products. A candle flame is, in several ways, self-maintenant.

Recursive Self-Maintenance. A self-maintenant system can maintain itself over some range of conditions: if a candle is put into a vacuum or doused with water, it ceases. Some systems, however, can, in addition, contribute to their own stability over a range of changes in conditions. They can change what they do to maintain stability in accordance with changes in environmental conditions. A bacterium, for example, might swim and continue swimming if it is going up a sugar gradient, but tumble if it finds itself swimming down a sugar gradient (D. T. Campbell, 1990). It maintains its condition of being self-maintenant in the face of changing environmental conditions: it is recursively self-maintenant (Bickhard, 1993).

Function. There is now in place a sufficient model to address both function and representation. Function first: serving a function is modeled as making a contribution to far-from-equilibrium stability. Serving a function, therefore, is relative to the system which is being contributed to. A heart, for example, may serve a function for a parasite, but be dysfunctional for the host. The normativity of function will be similarly contextualized. Note that serving a function contributes to the stability of a far-from-equilibrium process, which has distinct causal consequences in the world: this is not a model of epiphenomenal function.

The Function of Action Selection and Dynamic Presupposition. A recursively self-maintenant system may just switch from one interaction with its environment to another as differentiated conditions change, as is the case for the swimming and tumbling of the bacterium, or it may set up indications of multiple interactions that would be appropriate in current circumstances, and engage in some more complicated process of (inter)action selection.
That is, action selection can occur via simple triggering, or via more complex selection processes among indicated interaction potentialities. There is much to be addressed about such systems of action selection, but the crucial point for now is that any triggering of an interaction, or any indication of the current appropriateness of an interaction, presupposes that that interaction is in fact appropriate for the current conditions. Continuing to swim down a sugar gradient is, in general, not appropriate. Appropriateness here is a normative notion, and the normativity is a functional normativity. That is, it is derived from the norm of contributing to the far-from-equilibrium stability of the system. Interaction (types) will tend to be appropriate in some conditions, and not in others. An indication of the appropriateness of an interaction, therefore, dynamically presupposes that those conditions obtain. The dynamic presuppositions of an interaction or interaction indication are those conditions that would make that interaction appropriate, that render it likely to make a functional contribution. More generally, a process dynamically presupposes whatever those conditions are, internal to the system or external to the system, that support its being functional for the system.

Representation and Content

Representational content. The dynamic presuppositions of a blood circulatory system will in general be internal: hearts and kidneys, for example. The dynamic presuppositions of an interaction indication will be about the environment. If those dynamic presuppositions do not hold, then the interaction will fail. That is, if those dynamic presuppositions are false, the interaction will fail. Interactive dynamic presuppositions, then, can be true or false, and they can be true or false about the environment. Interactive dynamic presuppositions constitute representational content about the environment.
Interaction indications, in this model, are the primitive form of representation. They predicate of the environment that it possesses the dynamically presupposed conditions. They predicate that content of the environment. Such an interactive representation may be false: the dynamically presupposed conditions may not hold. Furthermore, it may be (fallibly) discovered to be false: if the system engages in the indicated interaction, and it does not proceed as indicated, then the dynamic presuppositions, the content, were false, and were falsified. In this model, not only the possibility of error, but also the possibility of system-detectable error, is trivially accounted for.
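The logic of an indication that dynamically presupposes an environmental condition, and that is falsified by the interaction's own outcome, can be sketched in a few lines. This is a minimal illustration inspired by the bacterium example, not the author's formalism; the function name and readings are invented.

```python
def run_agent(sugar_readings):
    """Keep swimming while the indicated interaction proceeds as indicated;
    switch to tumbling when the indication is falsified."""
    actions = []
    prev = sugar_readings[0]
    for reading in sugar_readings[1:]:
        # The indication "swimming is appropriate" dynamically presupposes
        # a rising sugar gradient; the presupposition is implicit in the
        # indication, not explicitly represented anywhere in the agent.
        if reading > prev:
            actions.append("swim")
        else:
            # The interaction did not proceed as indicated: the content was
            # false, and the failure is detectable by the system itself,
            # with no external comparison of representation and world.
            actions.append("tumble")
        prev = reading
    return actions

print(run_agent([1.0, 1.2, 1.5, 1.4, 1.6]))  # ['swim', 'swim', 'tumble', 'swim']
```

The point of the sketch is only that error detection is internal to the process: falsification consists in the interaction failing, not in comparing a stored content with the world.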

This is indeed a primitive form of representation. More needs to be addressed to indicate its potential to be the ground for all representation. It is also a model of representation that has several unfamiliar properties, properties not common in standard models. These too will be outlined.

More Familiar Representations: Objects. I will address first how the interactive model could account for the representation of physical objects. If an organism differentiates a relevant condition in its environment, it will invoke indications of appropriate further interactive potentialities: if a frog differentiates a fly, that differentiation will invoke indications of the possibility of tongue flicking and eating. Even when that differentiation process is inactive, however, the control infrastructure that would engage in it, and its relationships to interaction indications, are still present in the system. Such an aspect of the control structure constitutes a conditionalized indication of interaction potentialities: if XYZ differentiation is made, then QRS interactions will be indicated. Conditionalization, in turn, creates the possibility of iterating such indications: if XYZ differentiation occurs, then QRS is possible, and if QRS occurs, then ABC will be possible (4). So, interaction indications can both be multiple (they can branch) and iterate. As such, they can form webs of interconnected conditionalized indications of interaction potentiality, perhaps vast and complex webs. Some subwebs of such a larger web may come to have special properties. In particular, they may be internally reachable, in the sense that any indicated interaction anywhere in the subweb is reachable as a direct interaction potentiality, perhaps via various intermediary conditional interactions, and that internal reachability property may remain invariant under some relevant class of other kinds of interactions.
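Internal reachability is, structurally, a property of a directed graph of conditional indications ("if interaction A succeeds, interaction B becomes available"): every member of the subweb can be reached from every other, possibly via intermediaries. The following sketch makes that concrete; the interaction names are invented for illustration.

```python
def reachable(web, start):
    """All interactions reachable from `start` by following indications."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(web.get(node, []))
    return seen

def internally_reachable(web, subweb):
    """True if each member of `subweb` can reach every other member."""
    return all(set(subweb) <= reachable(web, node) for node in subweb)

# A toy-block-like subweb: scans and manipulations are mutually recoverable,
# but a destructive interaction like crushing leads out of the subweb.
web = {
    "scan_front": ["rotate"],
    "rotate": ["scan_back", "scan_front"],
    "scan_back": ["rotate"],
    "crush": [],
}
print(internally_reachable(web, ["scan_front", "rotate", "scan_back"]))  # True
```

On this reading, an "object" corresponds to a mutually reachable subweb that stays mutually reachable under a relevant class of other interactions.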
For example, a child's toy block will afford multiple potentialities of visual scans and manipulations. Any one of these potentialities is available from any other (e.g., you can always turn the block back so that an earlier visual scan is again possible), so the subweb of interactive potentialities for this block is internally reachable. And that internal reachability itself remains invariant under a large class of other interactions, such as putting the toy away in the toy box, the child leaving the room, and so on. It is not invariant under all possible interactions, however, such as crushing or burning the block. This outlines the general manner in which the interactive model can scale from simple interaction possibility representations to representations of physical objects. It is a generally Piagetian, or pragmatic (5), model of object representation, and I would suggest a generally Piagetian approach to other more complex kinds of representation, such as abstract representations of numbers (6).

What about Input Processing? Models of representation are standardly what the pragmatists called spectator models. They are models of some homunculus staring back down the input stream, processing inputs, rather than future-oriented models of interactive anticipation. But input processing clearly does occur in sensory systems, for example. If such input processing is not to be taken as somehow constituting or generating representation, what account is to be given of it? The interactive model distinguishes between two aspects of epistemic relationship to the world: contact and content. Contact with the environment is provided by the differentiations of that environment. Such differentiations are the basis for setting up indications of further interactive potentialities; they are how the system can locate itself in its web of conditional interactive indications.
Without contact, no interactive content, no indications of potentiality, would have any likelihood of being appropriate for any particular environment. Such indications, in turn, constitute representational content. It is such anticipatory indications that involve dynamic presuppositions, presuppositions that can be false. It is in these presuppositions that representation is emergent. Differentiation in general is generated by the internal outcomes of previous interactions. If an interaction control system is engaged in interaction with an environment, the internal course of that interaction will be partially determined by the control system, but importantly determined by the environment. Differing environments will yield differing internal courses of the interaction, and differing internal outcomes of the interaction. Any particular possible outcome of an interaction serves to differentiate those environments that would yield that outcome from those that would yield a different outcome: the outcomes differentiate types of environments. There is no other information available in

such a differentiating outcome per se about what kind of environment it differentiates, but nevertheless it may be useful for setting up indications of further interactive potentialities. If so, then any such indication predicates of that environment whatever properties are dynamically presupposed by those indications. It is the future-oriented indications that represent (something about) the differentiated environment, not the differentiations per se. Differentiations in general may involve full interactions, but a simple version would be a differentiation process that had no outputs, a passive differentiation. A passive differentiation is a differentiation nevertheless, and can serve as the basis for further indications of interactive potentiality. But passive differentiations are just input processing. Input processing, then, is an aspect of the interactive model just as it is for spectator models. The difference is that standard models take input processing as constituting or generating representation, while the interactive model takes it to be only a simple case of the general function of differentiation of contact. In effect, input processing models conflate contact and content; they take whatever the contact is in fact with as somehow the content of the purported representation.

Properties of Representation

It is a programmatic task to demonstrate the adequacy of the interactive model for all forms of (purported) representation: perception, memory, rational thought, language, and so on. These have been addressed elsewhere (7). For current purposes, I will take it as demonstrated that the interactive model is a candidate for a model of the nature of representation, and proceed to examine some further aspects of that nature, making a few comparisons with alternative models along the way.

Representational Error.
As pointed out earlier, the possibility of representational error is trivially accounted for in the interactive model: the dynamic presuppositions may be false. This is in contrast to correspondence models of representation, which simply do not inherently have the resources to account for error, and must, at best, superimpose some additional criterion for error on the basic correspondence framework. The limitation is, simply, that if a purported representation-constituting correspondence exists, then the representation exists and is correct, while if the crucial correspondence does not exist, then the representation does not exist. There are only two model possibilities (the correspondence exists or it does not), but there are three conditions to be modeled: the representation exists and is correct, the representation exists and is false, and the representation does not exist (8). One attempt to introduce such an error criterion is Fodor's asymmetric dependency criterion. Consider two conditions under which a representation is invoked, one purportedly correct and the other incorrect. If the representation is constituted simply in the invocation relationship (be it causal, nomological, informational, or whatever), then the purportedly incorrect deployment of the representation is just as legitimate a participant in the representational constitution as is the correct object. So, if the objects are Xs and Ys, there are no grounds for the claim that the representation is supposed to represent Xs and that its invocation for a Y is in error. Instead, since both Xs and Ys activate the representation, the content should be construed as "X or Y", and the possibility of error evaporates. This so-called disjunction problem is just one version of the general problem of accounting for representational error.
Fodor has suggested that the correct and incorrect cases can be distinguished in the following way: the incorrect invocation is dependent on the correct invocation in the sense that the incorrect deployment would never occur if the correct case didn't exist, but that dependency is not reciprocated; it is asymmetric in the sense that the correct case could occur even if the incorrect case never did. In the by now canonical example, if the COW representation is invoked by a horse on a dark night, that is in error because such horse-on-a-dark-night invocations are asymmetrically dependent on invocations by cows (Fodor, 1990, 1991). There are multiple problems with Fodor's model, but a straightforward counterexample to the asymmetric dependency criterion is the following: consider a neurotransmitter docking on a receptor in a cell surface and evoking corresponding activities in the cell. Here we have full biological and nomological correspondences. Now consider a poison molecule that mimics the neurotransmitter and

also docks on that receptor. Here is a clear case of asymmetric dependency, yet at best we have a case of functional error, not representational error. Fodor cannot account for the possibility of representational error (Bickhard, 1993; Levine & Bickhard, 1999) (9).

System-detectable Error. In the interactive model, if an indicated interaction is undertaken and the interaction does not proceed as indicated, then the indication is false, and is falsified for the system itself in a way that is potentially usable by that system. Representational error is system-detectable. Only if error is system-detectable can it be used to guide further behavior or to guide learning processes. Clearly system-detectable error occurs, and, therefore, any model which cannot account for it is impeached. In general, models of representation do not even address the criterion of system-detectable error. It is clear, however, that standard models cannot account for it (Bickhard, 1999, 2003, in preparation). No organism can take into account the evolutionary or learning history of its functional representations, or the asymmetric dependencies among potential invocations of its representations, in order to determine what its representations are supposed to represent. Nor can it then compare that normative content with what is currently being represented to find out if the representation is being truly applied or falsely applied: to accomplish the latter is the problem of representation all over again. So, if system-detectable error is not possible, then error-guided behavior and learning are not possible. We know that error-guided behavior and learning occur; therefore system-detectable error occurs. The model outlined in the text is the only model in the literature that addresses system-detectable error. Therefore, it is the only model that can account for this fundamental property of representation, and that is not refuted by the fundamental fact of system or organism detections of error.
Furthermore, the core radical skeptical argument is an argument against the possibility of system-detectable representational error: we would have to step outside of ourselves to be able to compare our representations with what they purport to represent in order to detect error in our own representations. This argument has stood for millennia. It is a valid argument, but unsound: it presupposes that the only form of representation is an encoding, past-oriented form, such as in information semantics. Representation as pragmatically future oriented, as interactively anticipatory, transcends the presuppositions of this argument.

Future Orientation. Correspondence models of representation are past oriented, with the input-processing spectator looking backwards down the input stream. The interactive model is future oriented. Representation is, most fundamentally, of future potentialities of interaction. Future orientation is a feature of pragmatist models generally, but is rarely found in contemporary models. It is the future orientation of the interactive model that makes accounting for error and for system-detectable error so immediate.

Modality. Interactive representation is of future potentialities of interaction; that is, representation is of possibilities. Interactive representation, then, is inherently modal. Standard models rarely address this issue, but the presumption is that representation is of actualities (whatever is actually on the other end of the input stream) and that modality is something to be added or dealt with later. Interestingly, young children's cognition is inherently modal, with actuality, possibility, and necessity being poorly differentiated, rather than being non-modal with modality developing later (Bickhard, 1988; Piaget, 1987).

Implicitness. Interactive content is the dynamic presuppositions made in indications of interactive potentiality.
Those presuppositions are not explicitly represented; instead, they are implicit in the indications themselves. It can be explicit that an interaction of a particular kind, arriving at a designated outcome, indicates that one or more further interactions would be possible, but what supports those indications, what is presupposed about the environment by those indications, is not explicit. This implicitness of content is fundamentally different from standard models. Encodings cannot be encodings without explicit content. Implicitness is a source of some of the power of the interactive

model; for example, I argue elsewhere that the frame problems arise largely from attempting to render implicit content in explicit form (Bickhard, 2001; Bickhard & Terveen, 1995) (11).

The interactive model easily accounts for the possibility of representational error, as well as the possibility of an even stronger criterion: system-detectable representational error. It also has the consequences that representation is future oriented, modal, and, at base, implicit. In all these respects, it differs radically from standard models.

Conclusion

Interactive representation is naturalistically emergent as the solution to the problem of action selection. It is not epiphenomenal, and it emerges naturally in the evolution of biological agents. It has resources with which to model more complex forms of representation. Interactive representation has truth value; it trivially accounts for the possibility of representational error; and it accounts for the possibility of system-detectable error, and is thus compatible with the facts of error-guided behavior and error-guided learning. Interactive representation is a candidate for modeling the fundamental nature of representation.

References

Bickhard, M. H. (1980). Cognition, Convention, and Communication. New York: Praeger Publishers.
Bickhard, M. H. (1988). The Necessity of Possibility and Necessity. Review of Piaget's Possibility and Necessity. Harvard Educational Review, 58(4), 502-507.
Bickhard, M. H. (1993). Representational Content in Humans and Machines. Journal of Experimental and Theoretical Artificial Intelligence, 5, 285-333.
Bickhard, M. H. (1998). Levels of Representationality. Journal of Experimental and Theoretical Artificial Intelligence, 10(2), 179-215.
Bickhard, M. H. (1999). Interaction and Representation. Theory & Psychology, 9(4), 435-458.
Bickhard, M. H. (2000). Autonomy, Function, and Representation. Communication and Cognition - Artificial Intelligence, 17(3-4), 111-131.
Bickhard, M. H. (2000b).
Motivation and Emotion: An Interactive Process Model. In R. D. Ellis, N. Newton (Eds.), The Caldron of Consciousness (161-178). J. Benjamins.
Bickhard, M. H. (2000c). Emergence. In P. B. Andersen, C. Emmeche, N. O. Finnemann, P. V. Christiansen (Eds.), Downward Causation (322-348). Aarhus, Denmark: University of Aarhus Press.
Bickhard, M. H. (2001). Why Children Don't Have to Solve the Frame Problems: Cognitive Representations are not Encodings. Developmental Review, 21, 224-262.
Bickhard, M. H. (2003). Process and Emergence: Normative Function and Representation. In J. Seibt (Ed.), Process Theories: Crossdisciplinary Studies in Dynamic Categories (121-155). Dordrecht: Kluwer Academic.
Bickhard, M. H. (2004). The Dynamic Emergence of Representation. In H. Clapin, P. Staines, P. Slezak (Eds.), Representation in Mind: New Approaches to Mental Representation (71-90). Elsevier.
Bickhard, M. H. (in preparation). The Whole Person: Toward a Naturalism of Persons. Contributions to an Ontological Psychology.
Bickhard, M. H., Campbell, R. L. (1989). Interactivism and Genetic Epistemology. Archives de Psychologie, 57(221), 99-121.
Bickhard, M. H., Campbell, R. L. (1992). Some Foundational Questions Concerning Language Studies: With a Focus on Categorial Grammars and Model Theoretic Possible Worlds Semantics. Journal of Pragmatics, 17(5/6), 401-433.
Bickhard, M. H., Richie, D. M. (1983). On the Nature of Representation: A Case Study of James J. Gibson's Theory of Perception. New York: Praeger.
Bickhard, M. H., Terveen, L. (1995). Foundational Issues in Artificial Intelligence and Cognitive Science: Impasse and Solution. Amsterdam: Elsevier Scientific.
Campbell, D. T. (1990). Levels of Organization, Downward Causation, and the Selection-Theory Approach to Evolutionary Epistemology. In G. Greenberg, E. Tobach (Eds.), Theories of the Evolution of Knowing (1-17). Hillsdale, NJ: Erlbaum.
Campbell, R. L., Bickhard, M. H. (1986). Knowing Levels and Developmental Stages. Basel: Karger.
Christensen, W. D., Bickhard, M. H. (2002). The Process Dynamics of Normative Function. Monist, 85(1), 3-28.

Cummins, R. (1996). Representations, Targets, and Attitudes. Cambridge, MA: MIT Press.
Dretske, F. I. (1988). Explaining Behavior. Cambridge, MA: MIT Press.
Fodor, J. A. (1990). A Theory of Content. Cambridge, MA: MIT Press.
Fodor, J. A. (1991). Replies. In B. Loewer, G. Rey (Eds.), Meaning in Mind: Fodor and his Critics (255-319). Oxford: Blackwell.
Levine, A., Bickhard, M. H. (1999). Concepts: Where Fodor Went Wrong. Philosophical Psychology, 12(1), 5-23.
Millikan, R. G. (1984). Language, Thought, and Other Biological Categories. Cambridge, MA: MIT Press.
Millikan, R. G. (1993). White Queen Psychology and Other Essays for Alice. Cambridge, MA: MIT Press.
Piaget, J. (1987). Possibility and Necessity. Vols. 1 and 2. Minneapolis: University of Minnesota Press.
Rosenthal, S. B. (1983). Meaning as Habit: Some Systematic Implications of Peirce's Pragmatism. In E. Freeman (Ed.), The Relevance of Charles Peirce (312-327). La Salle, IL: Monist.

Notes
1. An earlier version of this paper was given at Representation in Mind, University of Sydney, Sydney, Australia, June 27-29, 2000, as The Dynamic Emergence of Representation. It is published as Bickhard (2004).
2. For a defense of metaphysical emergence, see Bickhard (2000c, 2003, in preparation).
3. It should be noted that, in addressing serving a function prior to having a function, this explication turns upside down the explicatory organization in etiological models. There are additional fundamental differences. I argue, among other points, that etiological models of function are causally epiphenomenal. See Bickhard (1993, 2003, in preparation).
4. For a more detailed treatment, see Bickhard (1993, 2000, 2003, 2004, in preparation) and Bickhard & Terveen (1995).
5. The interactive model is a pragmatic model in the sense of being action based rather than a spectator model (see below), but it is closer to Peirce's model of meaning as anticipatory habit than to his model of representation per se (Rosenthal, 1983). Anticipations can be false, and can be (fallibly) detected to be false.
6. I characterize these as generally Piagetian because, although one of Piaget's many massive contributions was to construct such action-based representations, I don't think the details of his model are all correct (e.g., Bickhard & Campbell, 1989; Campbell & Bickhard, 1986).
7. E.g., Bickhard, 1980, 1998, 2000b, 2001, in preparation; Bickhard & Campbell, 1992; Bickhard & Richie, 1983; Bickhard & Terveen, 1995; Campbell & Bickhard, 1986.
8. See Millikan (1984) for this point.
9. Millikan (1984, 1993), Dretske (1988), and Cummins (1996) all offer differing ways of accounting for error, which, I argue, all fail (Bickhard, 1993, 1999, 2000, 2003, 2004, in preparation). For reasons of space, I will not develop these arguments here.


Adaptation and representation: an introduction


Anne Reboul and Adrianna Wozniak (Institute for Cognitive Sciences, CNRS)
(Date of publication: 13 November 2006)

Abstract: In this paper, we argue that neither phylogenetic externalism, nor semantic externalism, which in its naturalist version is linked with it, can offer a guarantee of semantic adequacy for innate (or phylogenetically acquired) representations, though it can offer a guarantee for ontogenetically acquired (current) representations. This, we argue, is because the type of causality involved in phylogeny is not of the right type in that it does not link specific individuals or occurrences to specific representations.

1. Introduction

The notion of representation plays an important if not a central role in philosophy, biology, ethology, linguistics and neuroscience. It is also the criterion for differentiating cognitive from behaviorist approaches. (Innate) representations and their relation with adaptive processes will be the topic of the present paper. We will begin with a tentative definition, according to which a representation is the way in which an object is given to the subject. It should be clear that this definition is absolutely neutral regarding the origin of the representation, i.e. whether it is innate or acquired. What will interest us here is the criterion for the truth of a representation (in the case in which it is propositional) or for its adequacy to reality (when it is not). In the case of acquired representations, those that are produced by experience, such a criterion seems, at least intuitively, relatively transparent: a representation is semantically adequate inasmuch as it agrees with its object. Additionally, the very fact that it is caused by its object acts as a guarantee of its semantic adequacy (the guarantee in question is not a guarantee of truth (for propositional representations), but the guarantee that the representation has a cause in the present experience of the subject and that there is a criterion of semantic adequacy). The problem is, however, rather more complex when the representation does not arise as the result of an experience, but is genetically determined, in other words, when it is innate. In such a case, prima facie, no experience guarantees the adequacy of the representation to its object. This is what we will call the problem of the missing criterion. The present paper aims to show that it is indeed a problem and that some attempts to solve it may not be as successful as is generally thought. The problem, however, will only arise if there actually are innate representations.
Can there be a proof of the existence of innate representations? Briefly, we want to follow Quine's (1960, 5) path here, when he says "Subtracting his [environmental] cues from his world view, we get man's net contribution as the difference. This difference marks the extent of man's conceptual sovereignty, the domain within which he can revise theory while saving data", and suggest the following formula:

innate (for a species) = representation - (environmental contribution + individual variability)

In other words, to know what is innate in a (possibly species-specific) representation (or in a representation schema), one must subtract from the representation the contribution of the environment and what is variable from one member of the species considered to the next. It has been claimed by both psychologists and philosophers that the human species shares a common conceptual schema, which up to a point determines the human view of the world. This has been claimed not only for relatively mundane things such as natural kinds (trees, animals, etc.), but also for domains such as religion (see Boyer 2001). Though some variation has indeed been found, notably in the domain of color and in that of space, as attested by linguistic variation in the expression of color categories and

INTERDISCIPLINES.ORG
Adaptation and Representation
spatial relations (see Roberson et al. 2005, Levinson 2003, Kay & Regier 2006), it is not clear that it amounts to much of a criticism of the notion of an innate human conceptual schema. Indeed, Gopnik (2001) has claimed that such variations are not only limited in scope, but are either notational variants or are equivalent in terms of cognitive efficiency. Let us suppose, for the time being, that there is such a thing as a human conceptual schema which, in one way or another, actually constrains the human ability for knowledge (the term "constrain" is not intended positively, as unleashing unlimited intellectual powers, or negatively, as implying basic limits on epistemic abilities, but as neutral). If this is so, how can such a representation schema be considered as fitting external reality? What, if anything, is the criterion which could determine such a fit?

In 1942, Konrad Lorenz tackled the problem in a way which was designed not so much to solve it as to dissolve it: he proposed that in fact exactly the same criterion and the same guarantee should apply to innate as to acquired representations. In other words, in both cases, what guaranteed the fit of the representation was that it was caused by an (external) object. The difference between acquired and innate representations, on this view, is not the presence or absence of a relevant experience, but rather the time scale of that experience: whereas acquired representations are caused by an experience during the life of an individual (during its ontogeny), innate representations are caused by (repeated) experiences during the lives of ancestors (during the phylogeny of the species). Thus, the guarantee and adequacy of innate representations is to be determined by the same criterion as the guarantee and adequacy of acquired representations, i.e. the causal relation between the representation and its object.
Up to a point, this makes Lorenz's view a kind of ancestor of the current philosophical position on representation which is called semantic externalism.

2. Semantic externalism

Semantic externalism has its roots in a paper by Putnam (1975) which relied on a by now famous thought experiment. Putnam supposed that there exists a planet, Twin Earth, microphysically identical with the Earth but for one detail: what passes for water on Twin Earth is not H2O, but XYZ. On this planet live counterparts of Earthians who, being microphysically identical with Earthians, are also behaviorally indistinguishable from them. When Earthian Ruth says, on Earth, "Water is liquid at room temperature", Twin Earthian TWRuth says the same. The question which interested Putnam was whether they actually mean the same thing or two different things. If they mean the same thing, then, given the (type) identity of their brains, meaning is in the head. If they don't, and for the same reason, meaning is not in the head. As is well known, Putnam answered his question in the negative: though they utter the same sentence and have type-identical brains, Ruth and TWRuth don't mean the same thing, as the word "water" in Ruth's utterance refers to H2O, while, in TWRuth's utterance, it refers to XYZ. Thus, in Putnam's immortal phrase, "meanings ain't in the head". This has more often than not been taken as implying that error is impossible: the meaning of whatever you say is determined by external reality (hence the name semantic externalism) and it is thus impossible that it should not agree with it (see the papers in Nuccetelli 2003). Semantic externalism has developed since Putnam, notably through Ruth Millikan's (1993, 2000, 2004, 2005) and Dretske's (1981) work. Basically, Dretske and Millikan have pursued Putnam's externalism and have used it to try and naturalize intentionality, that is, the fact that mental states are about, aimed at, objects.
Their way of doing so, very roughly, is through a view which is, in some ways at least, very near to Lorenz's. They see representations as either phylogenetic or ontogenetic, i.e. as caused by the (external) environment, as adaptations to it (in as much as they can influence behavior), either at the level of the species (during the course of its evolution) or at the level of the individual (for instance through associative learning). In other words, for them, representations are biological features of the organism, on a par with its organs. For instance, according to Millikan (1993), a representation has a proper function, determined by its origin (either phylogenetic or ontogenetic), just as an organ has: and, just as an organ (e.g. the heart) will still have a proper function (e.g. pumping blood) even if it malfunctions, a representation will still have its proper function (representing the (type of) object which caused it), even if misapplied. In other words, this version of semantic externalism,

which has been prominent in the current endeavor to naturalize the mind, is heavily dependent on central features of the theory of natural selection.

3. Phylogenetic externalism

Thus, according to semantic externalism in its naturalistic version, the guarantee and criterion of semantic adequacy should be the same for innate and for acquired representations. As far as innate representations are concerned, this rests on a strong separation between the organisms or the species in which the representations occur and their environments. This agrees with one major tenet of the theory of natural selection, i.e. phylogenetic externalism, which supposes that what evolves (e.g. the representations and the organisms of which they are features) is always the same thing, and what causally triggers the evolution (the external environment) is also always the same thing. In other words, the causality involved is not only asymmetrical (which is as it should be), it is also one-sided: environmental factors cause change (evolution) in the organisms, but never the reverse. More precisely, given genetic variability among individuals, the role of the environment is that of a sieve: it will let some organisms live and reproduce, while others will not, thereby changing the species. In brief, the organism provides the genetic variability, but the environment is the judge of the fitness of the specific variant provided. In other words, adaptive evolution is always a function of the environment. However, natural selection only operates on heritable characteristics: evolutionary causality is only relevant for inheritable (hereditary) properties. In what follows, we will concentrate on innate representations, which are the only representations of which it can be said that they are inheritable. In its naturalistic version, semantic externalism, in keeping with Lorenz's position, sees innate representations as adaptations in the sense of the theory of evolution.
Thus, the missing criterion which they propose is the causal evolution of such representations during the phylogeny of the species: this is supposed to act as a guarantee of the semantic adequacy of the (innate) representation to its object. This seems very close to considering that, for innate representations, semantic adequacy is equivalent to biological fitness, i.e. that it is gradual and that it depends on the environment. So the main question regarding this solution to the missing criterion problem is whether being an adaptation, i.e. being caused by past (phylogenetic) experience, can be considered a guarantee of semantic adequacy, on a par with the present experience of the object as a cause of the representation for acquired representations.

4. Two kinds of causality

Most current work in the Modern Synthesis (the Modern Synthesis was born of the fusion of modern genetics with the Darwinian logic behind the theory of natural selection) relies on population genetics, i.e. a statistical approach to evolutionary dynamics (gene frequencies). This relies on an abstract kind of causality, called either property causality (Sober 1987) or causal general assertions (Glennan 2002). More commonly, the notion of cause is relative to specific instances. This second kind of causality, called token causality (Sober 1987) or causal singular assertions (Glennan 2002), is distinct from the first in as much as, rather than relying on statistical tendencies, it concerns actual, individual, causal events. This can be seen as the difference between the two following assertions:

1. Smoking causes lung cancer.
2. My cousin died of lung cancer because he was a heavy smoker.

In 1, the causality is statistical: in other words, though some non-smokers do die of lung cancer, the probability of getting lung cancer is higher for smokers than for non-smokers (property causality), and 1 still is true.
In 2, the causality is specific: had my cousin been a non-smoker, he could still have died of lung cancer, but 2 would then be false. We would like to say that the guarantee and criterion of semantic adequacy, though it is causal in both cases, does not rely on the same kind of causality for innate and for acquired representations:

for innate representations, the guarantee and criterion of semantic adequacy rest on property causality;
for acquired representations, the guarantee and criterion of semantic adequacy rest on token causality.
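The contrast between assertions 1 and 2 can be made concrete with a toy simulation (purely illustrative: the probabilities and population size are invented for the sketch, not drawn from any study):

```python
import random

random.seed(0)

# Toy model of property vs. token causality. All numbers are invented.
P_CANCER_SMOKER = 0.15     # assumed cancer risk for smokers
P_CANCER_NONSMOKER = 0.03  # assumed baseline risk

population = []
for _ in range(100_000):
    smoker = random.random() < 0.3
    risk = P_CANCER_SMOKER if smoker else P_CANCER_NONSMOKER
    population.append((smoker, random.random() < risk))

smokers = [p for p in population if p[0]]
nonsmokers = [p for p in population if not p[0]]

def cancer_rate(group):
    return sum(1 for _, cancer in group if cancer) / len(group)

# Property causality (assertion 1): a population-level, statistical
# relation. It holds even though some non-smokers get cancer.
assert cancer_rate(smokers) > cancer_rate(nonsmokers)

# Token causality (assertion 2): a claim about one individual. Some
# non-smokers in this population do have cancer, and for each of them
# the singular claim "he has cancer because he smoked" is false.
assert any(cancer for _, cancer in nonsmokers)
```

The point of the sketch is the one argued in the text: the truth of the statistical (property) claim leaves the truth value of any particular singular (token) claim undetermined.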

This raises a question: given that the criterion and guarantee of semantic adequacy rest on different kinds of causality depending on whether they apply to innate or to acquired representations, can it be claimed that they are indeed one and the same for these two kinds of representations? And, if they are not, what is the consequence for innate representations and their adequacy to reality?

5. Two kinds of criteria?

The present question, we want to emphasize, is not whether there are innate representations, or whether they originated as adaptations. It is whether the fact that they originated as adaptations can function as a guarantee and criterion of their semantic adequacy. As seen above, the externalist view of innate representations suggests that there is a proximity, if not indeed an equivalence, between the guarantee and criterion of semantic adequacy for such representations and their fitness value (their contribution to the fitness of the organism, as measured in terms of the number of viable offspring). We will leave for later the question of whether the fitness value of innate representations can be seen as a criterion of semantic adequacy, concentrating for now on the (externalist) guarantee and criterion proposed by Lorenz and contemporary externalist semanticists. A good way of approaching the question is to look at how the guarantee and the criterion of semantic adequacy operate for acquired representations. In acquired representations, the actual experience of an object causes (in the sense of token causality) its representation. It is because a specific object is perceived or experienced at a given time by a given individual that the representation caused in that individual by that object can be considered as guaranteed and as semantically adequate (up to a point, it is because in hallucination the representation occurs without this causal relation to reality that hallucinations do not have any guarantee of semantic adequacy.
It is not clear that hallucinations have criteria of semantic adequacy, for precisely the same reason). Could the same thing be said about innate representations? The answer is obviously in the negative. In the case of an innate representation, by definition, no present (ontogenetic) experience is necessary for the existence of the representation. Accordingly, no present (ontogenetic) experience can act as either a guarantee or a criterion of semantic adequacy for the representation. It thus seems that the different kinds of causality involved in innate and acquired representations mean that there is not one kind of guarantee and criterion, but two kinds of guarantees and criteria. In other words, the validity of the guarantee and criterion of semantic adequacy for acquired representations cannot be extrapolated to the guarantee and criterion of semantic adequacy for innate representations. Hence, the guarantee and criterion for innate representations may be valid, but their validity cannot be established on the same grounds as for acquired representations, because property causality does not work in the same way as token causality. We will now turn to the validity of the guarantee and criterion for innate representations.

6. Can past experiences guarantee the semantic adequacy of innate representations?

As we have just seen, present experiences guarantee the semantic adequacy of acquired, present representations, because of the token causality between these experiences and these representations. The difference between the case of acquired and innate representations is that in the first, present experiences guarantee the semantic adequacy of present representations, while in the second, past experiences are supposed to guarantee the semantic adequacy of present representations. The problem is whether they can really do so.
This is tantamount to the following question: can property causality act as a guarantee between past experiences and present (innate) representations? As seen above, property causality is statistical, which is why smoking can cause lung cancer even though some non-smokers die of lung cancer.

However, though it is certainly true that smoking can cause lung cancer, this does not make it true that any given smoker who dies of lung cancer died because he smoked. In the same way, the fact that in the past a given type of object (token-)caused a given representation in a given organism, and even the fact that that type of object ended up (property-)causing that type of representation to become an adaptation, cannot guarantee that a present specific representation of that type is semantically adequate. In other words, the fact that in the past a type of representation added to the fitness value of the members of a species, and hence became an adaptation (an innate representation), cannot, in any way, be a guarantee that it is semantically adequate today. This raises one further question: does it still have a positive fitness value today if it is no longer semantically adequate?

7. Conclusion

Millikan's view is that the proper function of an organ or of a representation is wholly determined by what it was selected for in the past (i.e. what it was an adaptation to). For a representation, its proper function is what it was selected to represent in the past; in other words, there is an equivalence between what a representation means (its meaning or signification) and what the proper function of that representation is. If this is the case, then the meaning of a present (innate) representation is independent of what produces it now. It only depends on what used to cause it. If this is so, then innate representations, supposing that their original (property-)cause has disappeared (because, for instance, the environment has changed), still represent that original cause, provided that they are produced by another element in the present environment. In other words, they can be systematically false. This consequence was anticipated in Sperber's (1994) org story.
According to Sperber, orgs developed an (innate) representation of elephants, which used to come into their environment and trample them. The elephants disappeared, but the (innate) representation subsisted, now being triggered by another (and new) feature of the environment, i.e. trains passing nearby. Sperber, in keeping with Millikan's notion of proper function, distinguishes between the proper domain of the representation and its actual domain. In other words, the distinction allows us to say that the orgs' innate representation is (property-)caused by the original factor (which determines its proper domain), but is (token-)caused by the new factor (which determines its actual domain). Thus, the proper domain does not coincide (now) with the actual domain. As shown by this example, past experience cannot guarantee the semantic adequacy of present innate representations.

References

Boyer, P. (2001) Et l'homme créa les dieux: Comment expliquer la religion, Paris, Robert Laffont.
Dretske, F. (1981) Knowledge and the flow of information, Cambridge, MA, The MIT Press.
Glennan, S. (2002) Contextual unanimity and units of selection, Philosophy of Science, 69, 118-137.
Gopnik, A. (2001) Theories, language, and culture: Whorf without wincing, in Bowerman, M. & Levinson, S.C. (eds), Language acquisition and conceptual development, Cambridge/New York, Cambridge University Press, 45-69.
Kay, P. & Regier, T. (2006) Language, thought and color: recent developments, Trends in Cognitive Sciences 10/2, 51-54.
Levinson, S. (2003) Space in language and cognition: Explorations in linguistic diversity, Cambridge, Cambridge University Press.
Lorenz, K. (1981) L'homme dans le fleuve du vivant, Paris, Flammarion (original German edition 1978, München, Piper Verlag).
Millikan, R.G. (1993) White Queen psychology and other essays for Alice, Cambridge, MA, The MIT Press.

Millikan, R.G. (2000) On clear and confused ideas: An essay about substance concepts, Cambridge/New York, Cambridge University Press.
Millikan, R.G. (2004) Varieties of meaning, Cambridge, MA, The MIT Press.
Millikan, R.G. (2005) Language: A biological model, Oxford, Oxford University Press.
Nuccetelli, S. (ed.) (2003) Semantic externalism and self-knowledge, Cambridge, MA, The MIT Press.
Putnam, H. (1975) Mind, language and reality: Philosophical papers, volume 2, Cambridge/New York, Cambridge University Press.
Quine, W.V.O. (1960) Word & object, Cambridge, MA, The MIT Press.
Roberson, D., Davidoff, J., Davies, I. & Shapiro, L. (2005) Color categories: Evidence for the relativity hypothesis, Cognitive Psychology 50, 378-411.
Sober, E. (1987) What is adaptation?, in Dupré, J. (ed.), The latest on the best, Cambridge, MA, The MIT Press, 105-118.
Sperber, D. (1994) The modularity of thought and the epidemiology of representations, in Hirschfeld, L.A. & Gelman, S.A. (eds), Mapping the mind: Domain specificity in cognition and culture, Cambridge, Cambridge University Press, 39-67.
