
Social learning Comprehensive Review

I. What is social learning?


1. Some concepts:
- Social learning is a type of learning in which information or knowledge
is transferred between individuals in a population (biological,
computational, or robotic autonomous systems), usually from a more
experienced individual to a naive one.
- Social learning is different from individual learning, or asocial learning,
in which each individual learns the appropriate response to an
environment through experience and trial-and-error.
- Social learning can be unreliable because it relies on the actions of
others rather than on direct experience. This is especially true in
dynamic environments, where appropriate behaviors may change
frequently. Consequently, social learning is most beneficial in stable
environments, where conditions are unlikely to change rapidly.
- Although asocial learning may yield reliable information, that
information is often costly for the individual to obtain. This is where
social learning comes in.
- Increasingly, it has been recognized that a constructive approach to
imitation and social learning via the synthesis of artificial agents (in
computer software or robots) can
a) yield important insights into mechanisms that can inform biologists
and psychologists by fleshing out theory, and
b) help in the creation of artifacts that can be instructed and taught by
imitation, demonstration and social interaction rather than by explicit
programming.

2. Social learning taxa:


- Social learning has been researched in both humans and animals.
2.1 Human:
- Social learning theory in humans, first proposed by Albert Bandura,
describes how people learn in a social context. It is also called
observational learning, in which each person learns from others by
observing their characteristics or behaviors.
- Observational learning involves four processes:
+ Attention: In a social group, members with interesting qualities are
likely to receive more attention than others. In the attention step,
people decide which characteristics or behaviors of the observed models
should be captured and which can be ignored.
+ Retention: Retain the observed behaviors or characteristics in memory.
+ Reproduction: Reproduce the behavior or characteristic.
+ Motivation: Even if the first three processes are complete, the person
will not engage in the behavior without motivation. In the motivation
step, the learnt behavior, which has so far remained unexpressed, is
performed when incentives are provided.
- Reinforcement can occur between the reproduction and motivation
processes. It is very important in observational learning because it
differentiates learning from simply imitating others. In social learning
theory, vicarious reinforcement is defined as the adaptation in the
behavior of observers when they notice the response consequences of the
models.
+ Positive: strengthen the learnt behaviors if they have good effects.
+ Negative: reduce the learnt behaviors if they do not have good effects.
Significant finding: Combining various characteristics from different
models can result in INNOVATION. So we should learn not from a few
people, but from many.

2. Social learning in animals (in general, could include humans):


- These models describe the way animals (including humans) learn how to
forage, search for food or other resources, and perform other daily
actions. Several principal types of social learning occur in various
species, including humans, differentiated by the conditions in which they
occur.

2.1. Local Enhancement


- Arises spontaneously through grouping behavior. When individuals
approach each other and stay together in a group, they automatically
affect each other's learning opportunities. In particular, each individual
is inclined to approach locations in which other members of its group are
found, and thereafter to interact with resources in those regions.
- Local enhancement has been observed to transmit foraging information
among birds, rats, and pigs [1].

2.2. Stimulus Enhancement


- In contrast to local enhancement, stimulus enhancement (SE) is a direct
result of the specific sensing and decision making of an individual. SE
primes individuals to interact with particular kinds of objects in the
environment, because other individuals have been observed interacting
with those objects.
- Van der Post et al. [2] modeled SE as an increase in the probability that
a forager processes and consumes a resource type after observing another
forager interacting with that resource type.
- This increases the chance that the animal will learn to gain the reward it
has seen its conspecifics obtain, often by performing the same actions,
whereas an individual on its own would seldom do so. However, the
mechanism that generates this apparent copying is the conventional one of
individual trial-and-error learning. Observation of the conspecific's
pattern of behavior is not causal to changing the observer's pattern of
behavior. Here we agree with others in the field that copying that can be
explained in terms of stimulus enhancement coupled with individual
learning does not qualify as an instance of imitation.
- This facilitates learning by focusing the exploration of the observer
on interesting objects, that is, objects that have useful affordances for
other members of the social group.
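
The modeling idea attributed to [2] above, an increased probability of processing a resource type after observing another forager, can be sketched as follows. This is only an illustration: the function name, the `boost` parameter, and the additive update capped at 1.0 are assumptions, not the rule used in [2].

```python
def enhanced_probability(base_prob: float, observed: bool, boost: float = 0.2) -> float:
    """Return the probability that a forager processes a resource type.

    Hypothetical rule: if a conspecific was observed interacting with the
    resource type, add `boost` to the base probability and cap at 1.0.
    """
    if not 0.0 <= base_prob <= 1.0:
        raise ValueError("base_prob must lie in [0, 1]")
    if observed:
        return min(1.0, base_prob + boost)
    return base_prob
```

Note that the observation only changes what the forager is likely to try; what it learns from the resource afterwards is still individual trial-and-error learning.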

2.3. Observational conditioning:


- A phenomenon similar to SE. In observational conditioning (OC), the
behavior of the demonstrator exposes the learner to a relationship
between stimuli that it had not previously known, and causes the learner
to form an association between them [3].
- Example: In an experiment with monkeys, young monkeys that observed
their parents responding fearfully to model snakes also developed a fear
of snakes without direct contact. Another example is how blackbirds learn
to identify predators: they observe other birds mobbing objects they
have not seen before.

2.4. Imitation learning:


- The main focus of research on social learning for many years.
- The most common social learning method.
+ A person or an animal can acquire, through observation of another, a
tendency to go to the same place, effect the same transformation of an
object, perform the same body movements, make the same sounds,
feel similar emotions or think similar thoughts.
- No one is surprised by evidence of imitation in adult humans. Most of
us will imitate someone in some aspect of our lives. In the past, most
researchers believed that imitation is found only in humans. However,
recent research in cognitive science and psychology has shown some
evidence of simple imitation in adult humans, and of complex imitation
in animals that are distantly related to humans [3].

2.4.1 Mimicry:
- The lowest level of imitation learning. The observer copies the
behaviors of the models exactly, without appreciating their purpose (goal
or intention). Mimicry is still useful because it suggests actions that
can produce useful results; the observer later comes to discover the
effects of those actions.

2.4.2. Behavior imitation


- Refers to reproducing the actions of others in the hope of obtaining the
same results with the same goal.
- Examples:
+ Pigeons are able to learn behaviors that lead to the delivery of a reward
by watching a demonstrator pigeon [5].
+ Chimpanzees can also imitate. A demonstrator performed the same
actions in two conditions: Hands Free and Hands Occupied. The
chimpanzees imitated more often when the demonstrator was in the
Hands Free condition, indicating that chimpanzees are able to understand
the reasoning behind another being's actions and apply that
understanding when imitating others.

2.5. Emulation:
- A process in which the observer witnesses a particular result on an
object while others interact with it, but then employs its own action
repertoire to produce the same result on the same object. In this case,
learning is facilitated both by attention being directed to the object of
interest and by the observation of the goal.
- Different from SE in that SE changes the salience of certain stimuli in
the environment, whereas emulation changes the salience of certain goals.
- Different from imitation in that emulation does not require the observer
to use the same method as the models used to produce the same results.
- Critical points:
+ Tomasello (1990) [4] (citing Wood 1989) argued that, although this is
goal emulation, how the observer reaches that goal is a matter of
individual learning or prior knowledge, neither of which is directly
influenced by the technique it has observed. The observer could by chance
use the same technique as the demonstrator, thereby giving the
appearance of imitation. Seeing the actions of others is not what matters;
what matters is that their concrete result is identified, and so can be
emulated.

II. Social learning strategies:


(Based on Prof. Kevin N. Laland, St. Andrews, Scotland, 2004)
- There are several open questions in social learning and imitation
learning, most notably when to copy and whom to copy. Researchers have
proposed several strategies to answer these questions.
- The term "copy" used from now on refers to any form of social
learning, not just imitation.

1. When strategies: when to copy others


a) The simplest "when" strategy is perhaps to copy when established
behavior is unproductive. Here, established behavior could refer to
unlearned behavior or to learned solutions to related problems.
- For example, Lefebvre and Palameta (1988) [6] investigated the spread of
a food-finding behavior in populations of pigeons, in which birds were
required to peck open a carton containing seed. The pigeons would
scrounge (take food from others) if possible, and only when there were so
few birds producing food that scrounging was unproductive did some
scroungers switch to adopting the food-finding behavior. This suggests a
strategy of learning only when there is no easier option, supported by the
fact that scrounging diminishes as the producers' share increases. That
is, if there aren't enough producers to go around, the returns to
scroungers are poor, and some scroungers will learn to produce.
- Copy-when-established-behavior-is-unproductive strategy refers to the
initial acquisition of the producing behavior; whether or not

b) Copy when asocial learning is costly


- Theoretical analyses of the adaptive advantages of social learning have
led to a consensus that greater reliance on social learning should be
favored as the costs of asocial learning increase (Boyd & Richerson,
1995, 1996 [7, 8]). Such costs include the energetic costs of searching
for and processing valuable resources, and the risk of unreliable
(asocially acquired) information.
- Such theory implies that a copy-when-asocial-learning-is-costly
strategy might be adaptive.
- Boyd & Richerson suggest that when information is too costly to
acquire or utilize personally, individuals will take advantage of the
relatively cheap information provided by others.
- A common circumstance in which established solutions are likely to
prove unproductive is when individuals are confronted with particularly
difficult novel tasks. If solutions to related tasks have failed to
deliver a reward, individuals have little to lose by seeking guidance
from others.

c) Copy when uncertain:


- This is very intuitive. If individuals feel uncertain about a situation,
they should look to others who have experienced that situation before
and were rewarded in it.
- Boyd and Richerson postulated that animals would rely on their own
experience when reasonably certain which environment they were in, but
would rely on social learning when the nature of the environment was
unclear.

2. Who strategies: from whom should we learn?


- Several "who" strategies have been proposed in the social learning
literature.
a) Copy the majority
- Boyd & Richerson (1985) [9] found that in circumstances in which natural
selection favors reliance on social learning, conformity is also favored.
Conformity refers to positive, frequency-dependent social learning in
which the probability of acquiring a trait increases disproportionately
with the number of demonstrators performing it.
- For example, large shoals were found to locate food faster than small
shoals, consistent with similar findings in other fishes. This is
probably because fish in large shoals have more shoal mates from which
to acquire information, and large numbers of individuals at a food site
attract conspecifics more rapidly than small aggregations.
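
The "disproportionate" increase in the definition of conformity above can be made concrete with a small sketch. It uses a form commonly seen in models of conformist transmission, p' = p + D p (1 - p)(2p - 1); the function name and the `strength` parameter (D, assumed to lie in [0, 1]) are illustrative choices, and nothing here is fitted to the shoaling data.

```python
def conformist_adoption_probability(freq: float, strength: float = 0.5) -> float:
    """Probability of adopting a trait shown by a fraction `freq` of demonstrators.

    With strength = 0 this reduces to unbiased copying (probability == freq).
    With strength > 0, majority traits are adopted more often than their
    frequency alone would predict, and minority traits less often.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    return freq + strength * freq * (1.0 - freq) * (2.0 * freq - 1.0)
```

For example, a trait performed by 70% of demonstrators is adopted with probability above 0.7, while one performed by 30% is adopted with probability below 0.3.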

b) Copy if rare
- Rare males sometimes have a mating advantage in populations, so rare
behavior patterns may be disproportionately adopted through a
copy-if-rare strategy.
- One example is interspecific vocal mimicry in birds such as starlings,
parrots, and mynah birds. Another is that male European marsh warblers
copy the sounds of an average of 77 other species (Dowsett-Lemaire, 1979
[10]). It has been pointed out that the most striking cases of vocal
mimicry occur in species with very elaborate and rare songs.

c) Copy successful individuals:


- Intuitively, most people are inclined to look to successful individuals.
- This has the advantage of being relatively easy to implement, but the
disadvantage that it is not always clear which trait of a successful
individual is the major source of their success.
- For example:
+ Pop and film stars do not make their millions as a result of their
political views, yet they frequently exert an influence on the political
beliefs and values of their fans.
- Mathematical analyses suggest that this strategy may be favored by
natural selection even though it may sometimes allow neutral and
maladaptive traits to hitchhike along with those traits that bring about
success (Boyd and Richerson 1985).
- Henrich and Gil-White (2001) [11] suggested that the evolution of a
copy-successful-individuals strategy could explain the formation of
prestige hierarchies, since highly skilled individuals will be at a
premium. In the context of animal studies, prestige may equate to social
rank. Are animals more likely to copy high-ranking than low-ranking
conspecifics?
- To adopt this strategy, individuals are required to evaluate the
payoffs associated with behavioral alternatives. This is clearly shown in
theoretical models of human decision making (Gintis 2000 and Schlag
1998 [12, 13]), but it is not clear to what extent animals are able to
make such judgments.

d) Copy if better
- Another heuristic is copy-if-better, whereby individuals switch strategy
if the returns of the behavior adopted by the demonstrator exceed those of
their own behavior (Schlag, 1998).
- Schlag's game-theoretical analyses reveal that when information
concerning the success of others is unreliable and noisy, a
copy-if-better strategy outperforms a copy-the-most-successful-behavior
strategy.
+ However, in risky environments, always copying individuals that
seem to be reaping greater returns can lead the entire population to
choose the alternative with the lowest expected payoff.
+ A much better rule, which Schlag calls "proportional imitation", is
one by which observers copy an individual that performed better than
they did with a probability proportional to how much better that
individual performed.
+ Another finding is that this version of the copy-if-better strategy will
always lead the population to the expected-payoff-maximizing action.
- However, animals may lack the capacity for the appraisal required by
the proportional imitation rule (it is typically absent in nonhuman
primates), making a copy-if-better strategy less likely to be adaptive.
- Schlag proposed an alternative strategy for animals called
"proportional observation", which requires individuals to copy the
behavior of a demonstrator with a probability equal to the demonstrator's
payoff. Thus, once again, animals have to make a judgment as to the
profitability of another individual's behavior, but this rule seems less
complicated than the proportional imitation rule, since a comparison
between self and other is not required.
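
The two rules described above can be sketched as copy probabilities. This is an illustrative reading, not Schlag's formal statement: payoffs are assumed to lie in a known range [pay_min, pay_max], which is used to turn payoffs and payoff differences into probabilities; the function and parameter names are hypothetical.

```python
def proportional_imitation(own_payoff: float, other_payoff: float,
                           pay_min: float, pay_max: float) -> float:
    """Copy probability proportional to how much better the other individual did."""
    if pay_max <= pay_min:
        raise ValueError("pay_max must exceed pay_min")
    if other_payoff <= own_payoff:
        return 0.0  # never copy an individual that did no better
    return (other_payoff - own_payoff) / (pay_max - pay_min)


def proportional_observation(other_payoff: float,
                             pay_min: float, pay_max: float) -> float:
    """Copy probability given by the demonstrator's (normalized) payoff alone."""
    if pay_max <= pay_min:
        raise ValueError("pay_max must exceed pay_min")
    return (other_payoff - pay_min) / (pay_max - pay_min)
```

The second function takes no `own_payoff` argument, which reflects the point made above: proportional observation needs no comparison between self and other.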

e) Copy good social learners:


- According to Blackmore (1999) [14], the most effective social learners
would acquire the most advantageous cultural traits. Hence, her
copy-the-good-social-learners strategy is intrinsically tied to a
copy-the-most-successful-behavior strategy.
- Conversely, Kendal (2003) found that a copy-the-good-social-learners
strategy would not evolve, since it has no selective advantage over the
strategy used by generalist social learners that copy other individuals
at random. On the other hand, he did find that the
copy-the-most-successful-behavior strategy would evolve.

f) Copy friends:
- Proposed by Boyd and Richerson (1985) and Griffiths (2003) [15]. Guppies
have been reported to acquire foraging information more effectively from
familiar than from unfamiliar demonstrators (Swaney, Kendal, Capon,
Brown, and Laland 2001 [16]).
g) Copy older individuals:
- Older means more experienced.

III. Rogers' paradox: Why cheap social learning doesn't raise mean fitness

1) Rogers' paradox:
- Individual learning would probably take far more time than social
learning, which is thought to be a common scenario: social learning's
prevalence is often explained in terms of its ability to reduce costs such
as metabolic, opportunity or predation costs.
- Individual (asocial) learning produces new information about the
environment, but at a cost. Social learning avoids the cost by copying the
existing behaviors of others, but does not generate new information. This
leads to an information parasitism scenario, also known as Rogers'
paradox (Rogers, 1988), which states that social learning does not
increase the average fitness of the population.

- Rogers (1988) considered a population of individual learners tracking
a temporally varying environment. Because social learners acquire
information more cheaply than individual learners, they are selected for
when introduced. However, this eventually results in there being too few
individual (asocial) learners tracking the environment for up-to-date
information to spread. Consequently, social learners' fitness declines
until an evolutionarily stable state (ESS) is reached, with the population
becoming a mix of both types of learners.
- Rogers' key observation was that, by the definition of an ESS, social
learners' fitness at this stage must equal that of individual learners.
In other words, while lower costs gave social learners an initial fitness
advantage that allowed them to invade, social learning does not
necessarily increase the population's mean fitness in the long run.
These results contradict both the assumption that social learning
improves population fitness by reducing costs and, more broadly, the
assumption that adaptiveness is an inevitable consequence of evolution.
This finding was considered so striking that it came to be known as
Rogers' paradox.
- The key to understanding Rogers' paradox is that, unlike individual
learning, which results in a constant fitness payoff, social learning is
frequency dependent. Specifically, social learners' fitness depends
inversely on the number of other social learners in the population:
higher frequencies of social learners result in lower fitness. This is
because social learners act as free-riders or information scroungers
who produce no new knowledge of their own, but rather parasitize the
knowledge generated by individual learners (Kameda & Nakanishi,
2002).
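
The frequency dependence described above can be reproduced in a toy model. This is a sketch in the spirit of the argument, not Rogers' (1988) exact formulation. Assumptions: individual learners always find the correct behavior and pay `cost`; social learners pay nothing and copy a random member of the previous generation; the environment changes with probability `change_rate` (> 0) per generation; correct behavior yields `benefit`. All names and values are hypothetical.

```python
def copy_accuracy(p_social: float, change_rate: float) -> float:
    """Equilibrium chance that a copied behavior still fits the environment.

    Solves q = (1 - u) * ((1 - p) + p * q) for q, where p is the frequency
    of social learners and u is the environmental change rate.
    """
    keep = 1.0 - change_rate
    return keep * (1.0 - p_social) / (1.0 - keep * p_social)


def individual_fitness(benefit: float, cost: float) -> float:
    """Constant payoff: always correct, but pays the learning cost."""
    return 1.0 + benefit - cost


def social_fitness(p_social: float, benefit: float, change_rate: float) -> float:
    """Frequency-dependent payoff: declines as social learners become common."""
    return 1.0 + benefit * copy_accuracy(p_social, change_rate)


def ess_frequency(benefit: float, cost: float, change_rate: float,
                  tol: float = 1e-12) -> float:
    """Bisection for the social-learner frequency where both fitnesses are equal."""
    target = individual_fitness(benefit, cost)
    if social_fitness(0.0, benefit, change_rate) <= target:
        return 0.0  # social learners cannot invade
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if social_fitness(mid, benefit, change_rate) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mean_fitness(p_social: float, benefit: float, cost: float,
                 change_rate: float) -> float:
    """Population mean fitness at a given frequency of social learners."""
    return (p_social * social_fitness(p_social, benefit, change_rate)
            + (1.0 - p_social) * individual_fitness(benefit, cost))
```

With `benefit=1.0`, `cost=0.5` and `change_rate=0.1`, social learners invade (fitness 1.9 versus 1.5 when rare) and settle at a frequency of about 0.89, where mean fitness is 1.5, the same as in a population of pure individual learners.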

- Another extension proposed by Rogers (1988) was to consider the role
of favorable learning biases, such as a preference for copying high-fitness
individuals or making the type of learning used contingent on the quality
of environmental information. These are examples of what Laland
(2004) has termed social learning strategies ("who" and "when"
strategies, respectively).

2) Solutions:

- Boyd and Richerson (1995) found that adopting a "who" strategy of
preferentially copying individual learners fails to resolve the
paradox, because it does not avoid the problem of social learning being
frequency dependent. Similarly, Kameda and Nakanishi (2002) showed in
both modeling and experimental work that Rogers' paradox severely
restricts the utility of conformity bias (a "who" strategy of
preferentially copying the majority).

- Enquist et al. (2007) proposed two "when" strategies that resolve
Rogers' paradox: the critical social learner and the conditional
social learner. The critical social learner attempts to learn socially,
but switches to individual learning when unsuccessful. Conversely, the
conditional social learner attempts to learn individually, but switches
to social learning when this fails. They showed that critical social
learning generally raises mean fitness and is selected for by evolution,
except when social learning is highly unfaithful, the environment is
highly variable or social learning is much more costly than individual
learning. They further showed that conditional social learning is only
selected for if individual learning is quite cheap, whereas pure
individual learning is only selected for if the environment is extremely
unstable or transmission fidelity approaches zero. In short, the critical
social learning strategy resolves Rogers' paradox under the broadest
range of conditions.
- Rendell et al. (2010) further extended these results, showing that,
because spatially varying environments make social learning less
effective, they broadly favor conditional over critical social learning. In
fact, if the cost of individual learning is sufficiently low, then this effect
may be powerful enough for conditional social learning to be selected
for over critical social learning. In such cases, the conditional strategy
may also be selected for over pure individual learning, but only if
individual learning is unreliable, making social information a useful
backup source of data.
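
The difference between the critical and the conditional social learner is only the order in which the two learning modes are tried. A minimal sketch, assuming hypothetical callables for social learning, individual learning and a success check (none of these names come from Enquist et al.):

```python
from typing import Callable


def critical_social_learner(learn_socially: Callable[[], str],
                            learn_individually: Callable[[], str],
                            is_successful: Callable[[str], bool]) -> str:
    """Try social learning first; fall back to individual learning if it fails."""
    behavior = learn_socially()
    if is_successful(behavior):
        return behavior
    return learn_individually()


def conditional_social_learner(learn_socially: Callable[[], str],
                               learn_individually: Callable[[], str],
                               is_successful: Callable[[str], bool]) -> str:
    """Try individual learning first; fall back to social learning if it fails."""
    behavior = learn_individually()
    if is_successful(behavior):
        return behavior
    return learn_socially()
```

In both cases the fallback result is returned without a second check, which keeps the sketch to a single switch as in the verbal description.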

3) Contradiction:
- Recently, a significant finding was reported in a 2016 BMC Evolutionary
Biology paper, "Skill learning and evolution of social learning
mechanisms", by Laland et al. The authors investigated Rogers' paradox
in the foraging behavior of animals.
- Experimental results showed that local enhancement (LE) does not
benefit foraging success, but could evolve as a side effect of grouping.
In contrast, stimulus enhancement (SE) and observational learning (OL)
can be beneficial across a wide range of environmental conditions because
they generate opportunities for new learning outcomes, enhancing skill
development.
- They also found that the evolution of OL and SE actually leads to
increased payoffs in the absence of any special social learning
strategies, in contradiction of Rogers' paradox. Moreover, they found that
exploration rates do not decline in the case of LE and OL. For SE,
exploration rates decline because this enhances skill development.
- The reason information parasitism does not arise in their model is
that they explicitly implemented learning as a gradual process.
Gradual learning was explained to have two advantages:
+ First, observable foraging behavior is not necessarily fully developed,
and therefore social learning does not automatically avoid the costs of
exploration. This is because, without any special copying strategies,
foragers copy not only choices involving fully developed behaviors, but
also exploratory choices involving developing and unfamiliar behaviors.
+ Second, information production is nearly unavoidable during social
learning because there is no clear trade-off between social learning and
information production.

IV. Future research directions to combine social learning with genetic programming:

1) Investigate the social learning mechanism behind EAs and apply alternative social learning methods:
- In most EAs, crossover could be considered a social learning method,
whereby individuals exchange information with their mates. It is also an
exploitative operator. Mutation could be considered an individual
learning step (but a very simple one), and is also known as an explorative
operator.
=> This is a potential avenue for investigating in depth several
alternative general social learning methods instead of crossover.

a) My first attempt: (CEC 2017 paper)


- Use a simple imitation operator to directly replace the crossover
operator.
- Copy both private and public information.
- Public information comes from the global-best information.
- Private information comes from private-best experience.
- Experimental results showed that SGE outperformed GE on most
regression benchmarks.
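
One possible shape for an imitation operator of the kind listed above is sketched here. It is not the operator from the CEC 2017 paper, whose details are not given in these notes. Assumptions: genomes are equal-length lists of integer codons, and each position is copied from the global best with probability `p_public`, from the individual's private best with probability `p_private`, and otherwise kept; all names and defaults are hypothetical.

```python
import random
from typing import List, Optional


def imitate(genome: List[int], private_best: List[int], global_best: List[int],
            p_public: float = 0.3, p_private: float = 0.3,
            rng: Optional[random.Random] = None) -> List[int]:
    """Build a child by copying codons from public and private information."""
    if not (len(genome) == len(private_best) == len(global_best)):
        raise ValueError("all genomes must have the same length")
    if p_public < 0 or p_private < 0 or p_public + p_private > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    rng = rng or random.Random()
    child = []
    for own, private, public in zip(genome, private_best, global_best):
        r = rng.random()
        if r < p_public:
            child.append(public)    # public information: global best
        elif r < p_public + p_private:
            child.append(private)   # private information: own best so far
        else:
            child.append(own)       # keep the current codon
    return child
```

Unlike crossover, this operator needs no selected mate: the only "demonstrators" are the global best and the individual's own history.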

b) The second attempt:


- I will use different sources of public information by utilizing
neighborhood topologies with local-best information:
+ Ring
+ Von Neumann (hard to implement in variable-length GE, easier for a
fixed-length swarm representation)
+ Voronoi (could be useful)
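
Of the topologies listed, the ring is the simplest to sketch: each individual sees only its index neighbors, with wrap-around, and takes the best of them as its local best. The helper names, the `radius` parameter and the assumption that lower fitness is better (as with a regression error) are illustrative choices.

```python
from typing import List, Sequence


def ring_neighbors(index: int, size: int, radius: int = 1) -> List[int]:
    """Indices within `radius` of `index` on a ring of `size` individuals."""
    return [(index + offset) % size for offset in range(-radius, radius + 1)]


def local_best(index: int, fitness: Sequence[float], radius: int = 1) -> int:
    """Index of the best (lowest-fitness) individual in the ring neighborhood."""
    neighborhood = ring_neighbors(index, len(fitness), radius)
    return min(neighborhood, key=lambda i: fitness[i])
```

For a population of five with fitness values `[3.0, 1.0, 4.0, 0.5, 2.0]`, individual 0 sees individuals 4, 0 and 1 and takes individual 1 as its local best, even though individual 3 is the global best.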

c) Social learning methods:


c1) Local enhancement:
- Individuals approach each other and stay together, so they automatically
affect each other's learning opportunities.
c2) Stimulus enhancement:
- Individuals are inclined to interact with particular kinds of objects in
the environment because other individuals have been observed interacting
with those objects.
- Use stimulus enhancement for mate selection in other social
operators: crossover, imitation.

2) Investigating Rogers' paradox: the effect of the social-asocial learning relationship on the fitness of the population:
- In current EAs, crossover is a social learning method and mutation is
one type of individual learning. However, canonical mutation is very
simple, just one step, whereas asocial learning should be something like a
local neighborhood search, that is, a multi-step mutation.
- Many studies have examined the balance between exploitation and
exploration in EAs, but most EA practitioners use a higher crossover rate
than mutation rate. Several studies have also shown that using only
mutation, or other types of local search, can result in better
performance than using both crossover and mutation. However, I think
they did not explain this in terms of the general relationship between
social learning and individual learning, as in the findings on Rogers'
paradox.
- Therefore, the next steps are as follows.

2.1. Studying the effect of Rogers' paradox in canonical GP or GE.


- In GE or GA it is easier to perform an individual neighborhood search
thanks to the linear representation, so multi-step mutation could be used.
+ For example, I will run 30 steps of hill climbing to find a local optimum
for each individual learning step.
- The purpose of this potential research is to see whether or not Rogers'
paradox exists in GP (or EAs in general). If so, I will extend the research
by applying some of the solutions to Rogers' paradox presented earlier to
see whether the results improve.
- For example:
+ "Who" strategies could not resolve Rogers' paradox in social learning,
but we could apply some "who" strategies (choosing mates for social
learning) to see whether the results conform to those in social learning.
- Applying "when" strategies to resolve Rogers' paradox:
+ Critical social learning: learn socially first, then switch to individual
learning if social learning does not produce good outcomes.
+ Conditional social learning: learn asocially first, then switch to social
learning if that fails.
So we would not apply social learning first to all individuals (like
crossover) and then individual learning (like mutation after crossover),
but apply each to parts of the population.
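
The multi-step individual learning mentioned above (30 steps of hill climbing on a linear genome) could look roughly like this. It is a sketch under assumptions: genomes are lists of integer codons in [0, codon_max], fitness is minimized, and each step mutates one random codon and keeps the change only if it strictly improves fitness. The function and parameter names are hypothetical.

```python
import random
from typing import Callable, List, Optional


def hill_climb(genome: List[int], fitness: Callable[[List[int]], float],
               steps: int = 30, codon_max: int = 255,
               rng: Optional[random.Random] = None) -> List[int]:
    """Multi-step mutation: repeated single-codon changes, keeping improvements."""
    rng = rng or random.Random()
    best = list(genome)              # work on a copy; the input is not modified
    best_fitness = fitness(best)
    for _ in range(steps):
        candidate = list(best)
        position = rng.randrange(len(candidate))
        candidate[position] = rng.randint(0, codon_max)
        candidate_fitness = fitness(candidate)
        if candidate_fitness < best_fitness:   # minimization
            best, best_fitness = candidate, candidate_fitness
    return best
```

Each call costs up to `steps + 1` fitness evaluations, which is the explicit cost of individual learning that a comparison with social learning in the sense of Rogers' paradox would need to account for.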

3) Applying imitation in genetic programming:


- As in the CEC 2017 paper.

Citation
1. Galef, B. G., & Giraldeau, L.-A. (2001). Social influences on
foraging in vertebrates: Causal mechanisms and adaptive functions.
Animal Behaviour, 61(1), 3-15. doi:10.1006/anbe.2000.1557.
PMID 11170692.
2. Van der Post, D. J., Ursem, B., & Hogeweg, P. (2009). Resource
distributions affect social learning on multiple timescales.
Behav Ecol Sociobiol, 63, 1643-1658.
3. Hoppitt, W., & Laland, K. N. (2013). Social learning: An
introduction to mechanisms, methods, and models. Princeton
University Press.
4. Tomasello, M. (1990). Cultural transmission in the tool use and
communicatory signaling of chimpanzees? In S. T. Parker & K. R.
Gibson (Eds.), Language and intelligence in monkeys and apes.
Cambridge University Press.
5. Saggerson, A. L., George, D. N., & Honey, R. C. (2005). Imitative
learning of stimulus-response and response-outcome associations in
pigeons. Journal of Experimental Psychology: Animal Behavior
Processes, 31(3), 289-300. doi:10.1037/0097-7403.31.3.289
6. Lefebvre, L., & Palameta, B. (1988). Mechanisms, ecology and
population diffusion of socially-learned food-finding behavior in
feral pigeons. In T. R. Zentall & B. G. Galef, Jr. (Eds.), Social
learning: Psychological and biological perspectives (pp. 141-164).
Hillsdale, NJ: Erlbaum.
7. Boyd, R., & Richerson, P. J. (1995). Why does culture increase
human adaptability? Ethology & Sociobiology, 16, 125-143.
8. Boyd, R., & Richerson, P. J. (1996). Why culture is common, but
cultural evolution is rare. Proceedings of the British Academy, 88,
73-93.
9. Boyd, R., & Richerson, P. J. (1985). Culture and the evolutionary
process. Chicago: University of Chicago Press.
10. Dowsett-Lemaire, F. (1979). The imitative range of the song of the
marsh warbler Acrocephalus palustris, with special reference to
imitations of African birds. Ibis, 121, 453-468.
11. Henrich, J., & Gil-White, F. J. (2001). The evolution of prestige:
Freely conferred deference as a mechanism for enhancing the
benefits of cultural transmission. Evolution & Human Behavior,
22, 165-196.
12. Gintis, H. (2000). Game theory evolving. Princeton, NJ: Princeton
University Press.
13. Schlag, K. H. (1998). Why imitate, and if so, how? A bounded
rational approach to multi-armed bandits. Journal of Economic
Theory, 78, 130-156.
14. Blackmore, S. (1999). The meme machine. Oxford: Oxford
University Press.
15. Griffiths, S. W. (2003). Learned recognition of conspecifics by
fishes. Fish & Fisheries, 4, 256-268.
16. Swaney, W., Kendal, J. R., Capon, H., Brown, C., & Laland, K. N.
(2001). Familiarity facilitates social learning of foraging behaviour
in the guppy. Animal Behaviour, 62, 591-598.
