An annealing analogy
Analogy is perhaps one of our strongest tools for understanding physical or biological systems — take principles derived from observing one system and apply them to the investigation of another, and you can often gain insight you would never otherwise have imagined [1]. When considering the brain itself, probably the most complex system we can study, analogy has been extremely useful.
One such analogy borrows from the field of metallurgy. A basic requirement of the metallurgic process is to regulate the cooling rate of molten metals, in order to control the formation of crystal lattices and thereby achieve desired mechanical properties in the resulting material (i.e., how it deforms or breaks under various conditions and forces). This phase transition process is called annealing. If solidification of metallic particles is regarded as a stochastic process, annealing can be conceptualized as an optimization problem in which the objective is to determine the cooling rate which maximizes the desired properties. This phenomenon has thus been co-opted for use in computational optimization problems (where it is called simulated annealing), for which the "temperature" parameter is lowered at a rate which prevents "particles" (i.e., free parameters) from settling early into local extrema, allowing the parameter space to be efficiently explored but eventually (at low temperatures) converging on a global extremum.
Videos showing a single parameter search using (very simplified) simulated annealing, with low and high starting temperatures. The y-axis is the objective function, which we are trying to minimize. The x-axis is our free parameter. The temperature is slowly lowered, and the search points eventually settle on the minima. Note that the higher starting temperature allows the search points to better explore the parameter space and find the global minimum, while search points with a lower starting temperature are more likely to get stuck in local minima. In practice, many more free parameters are involved, and the energy of individual search points can vary stochastically, with T representing the mean energy as opposed to a fixed deterministic parameter.
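For readers who prefer code to videos, here is a minimal Python sketch of a single-parameter search of this kind. It is not the code behind the videos: the objective function, the geometric cooling schedule, the step size, and all names are my own illustrative assumptions, and the temperature here is a plain deterministic parameter rather than a mean energy.

```python
import math
import random


def objective(x):
    """Hypothetical 1D objective: several local minima, global minimum near x = -0.5."""
    return 0.05 * x * x + math.sin(3.0 * x)


def simulated_annealing(x0, start_temperature, cooling=0.995, n_steps=2000,
                        step=0.5, seed=0):
    """Minimize `objective` starting from x0; returns (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    temperature = start_temperature
    for _ in range(n_steps):
        candidate = x + rng.gauss(0.0, step)  # random move of the search point
        f_candidate = objective(candidate)
        delta = f_candidate - fx
        # Metropolis rule: always accept improvements; accept a worse point
        # with probability exp(-delta / T), which shrinks as T is lowered.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, fx = candidate, f_candidate
            if fx < best_f:
                best_x, best_f = x, fx
        temperature *= cooling  # assumed geometric cooling schedule
    return best_x, best_f
```

Calling `simulated_annealing(4.0, 0.05)` and `simulated_annealing(4.0, 5.0)` compares a low and a high starting temperature from the same starting point, which sits in a local (not global) basin. The outcome depends on the random seed, so it is worth trying several.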
As an analogy, annealing extends to many classes of phenomena [2]. It can, for instance, be used to understand how species evolve. The evolution of a species can be likened to an ongoing annealing process, where the objective is to maximize population. The "temperature" of an evolving species is in this case its behavioural variability, and the optimal state of the system is determined by a balance between its ability to adapt to changing environmental conditions (i.e., a changing objective function), and its ability to conserve the set of genetic traits that is advantageous to its ongoing survival and growth.
The behavioural variability of a species is related to, but not synonymous with, its genetic variability. The issue of time scales is particularly important here. Because genetic variation depends on random mutations of genomic elements, it requires many generations for a particular trait to emerge and assert its competitive advantage. Over very many generations and very many species, this process produces a gradual accumulation of systematic changes, which eventually leads to increasingly complex systems. Why is complexity advantageous? If there is a common thread to the emergence of larger, more complex, and longer living organisms, it is that these systems are capable of producing behavioural variability without genetic variation. In other words, these systems have evolved mechanisms of controlling the temperature of their annealing process in order to facilitate much more rapid (within single or relatively few generations) adaptations to environmental changes.
Same deal, but now the objective function itself is changing over time. The temperature is held constant while the objective function changes, and is then lowered. The search points start out at their optimal position, and are also now more stochastic: speeds and temperatures vary. This system is analogous to an evolving biological species, which has to maintain a level of behavioural and genetic variability (i.e., temperature), in order to adapt to changing environmental challenges. Note that the system which starts out at a low temperature is never able to explore beyond its local minimum, whereas the higher temperature system explores and finds other minima, including the new global minimum.
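The changing-objective case can be sketched in the same spirit. Again, this is a toy under stated assumptions rather than the code behind the videos: the tilted double-well objective, the linear drift of its tilt, the uniform step size, and the function names are all hypothetical choices made for illustration.

```python
import math
import random


def tilted_double_well(x, tilt):
    """Hypothetical objective: minima near x = -1 and x = +1; `tilt` decides which is lower."""
    return (x * x - 1.0) ** 2 + tilt * x


def metropolis_step(x, tilt, temperature, rng, step=0.2):
    """One Metropolis update of a single search point at the given temperature."""
    candidate = x + rng.uniform(-step, step)
    delta = tilted_double_well(candidate, tilt) - tilted_double_well(x, tilt)
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return candidate
    return x


def track_changing_objective(temperature, n_points=50, n_drift=4000,
                             n_cool=3000, cooling=0.998, seed=0):
    """Hold the temperature constant while the objective drifts, then cool."""
    rng = random.Random(seed)
    # All points start in the left well, which is the global minimum at first.
    points = [-1.0] * n_points
    for i in range(n_drift):
        # Tilt drifts from +0.3 to -0.3, so the global minimum moves
        # from the left well to the right well.
        tilt = 0.3 - 0.6 * i / max(n_drift - 1, 1)
        points = [metropolis_step(x, tilt, temperature, rng) for x in points]
    for _ in range(n_cool):
        temperature *= cooling  # cooling phase, objective now fixed
        points = [metropolis_step(x, -0.3, temperature, rng) for x in points]
    return points
```

Counting how many points end up in the right-hand well, for example `sum(x > 0 for x in track_changing_objective(0.02))` versus the same expression with a temperature of `0.5`, gives a rough numerical version of the comparison in the videos: the cold population stays behind in what has become a merely local minimum.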
As implied by the annealing analogy, the ideal organism must find a balance between such behavioural plasticity and stability. While it is advantageous for an organism to be able to change its behavioural responses when environmental factors change, it is a liability to have so much plasticity that the system never settles on appropriate behaviours. The trick is to be able to model the world in terms of probabilities, and to discover and entrain behaviours that have the highest probability to be optimal, given previous experience of one's environment and its fluctuations. In other words, it is important for a complex system to be able to predict future contingencies and have appropriate behavioural responses for them — a sort of insurance scheme.
These forces form the likely mechanism for our behaviour both as a species, and as interacting members of that species. As organisms, we are able to form episodic and semantic memories, arrange these into concepts, communicate these concepts as symbolic gestures or linguistic sequences, and both predict our environments and physically manage them so as to minimize unpredictability. As a species, we are able to form traditions and customs which allow us to function as cooperative units (teams, communities, nations, etc.), and we are able to channel our behavioural variability in order to more thoroughly explore our environments and make increasingly complex analyses of them (i.e., via specialization). Thus, annealing operates at multiple levels, and can be used to help understand how we function at these multiple levels.
But annealing also imposes constraints on our systems. Because of the sheer complexity of our environments, the notion of achieving an "optimal" solution is only ever an (unattainable) ideal. In reality, our systems find suboptimal solutions (local extrema), and explore these by varying temperature (behavioural variability) within a conservative range of values. The risk with raising temperature too much, of course, is that the stable solution the system has already found, and which to some extent satisfies the imperative of evolutionary persistence, can be jeopardized by allowing too much variation. At the level of social systems, this balance is evident as one between "conservatives" and "liberals", or between "status quo" and "radicalism". Struggles between these forces are essential for the system to evolve in an adaptive way, given a constantly changing environment, but an imbalance in either direction risks a catastrophic outcome (either by failing to adapt to new challenges, or by effecting systematic, revolutionary changes which destroy existing mechanisms without replacing them with an adequate alternative). At the level of the genome, high levels of random variations in genetic sequences can likewise result in catastrophic consequences. The key (at both levels) is rather many small variations across large populations, only a small fraction of which will eventually confer an advantage.
Exaptation
Notably, this equilibrium implies that it is often more advantageous for a system to build on its existing design principles, as opposed to trying out radical fundamental changes which are more likely to destroy it than improve it. A consequence of this is the phenomenon of exaptation, which refers to the tendency of an organism to recycle existing phenotypes for new purposes, as opposed to reinventing them from scratch. There is a wealth of evidence of exaptation in the history of Earth's evolution; for instance (as described here), feathers appear to have evolved for an entirely different evolutionary purpose than flight, for which they were later adopted.
Neuromodulatory systems, such as norepinephrine, dopamine, or serotonin (and their membrane-bound receptors), are also likely to have evolved through exaptation. These neurotransmitter and receptor proteins (or very similar versions) are found in some of the most primitive organisms, such as C. elegans, where they (perhaps unsurprisingly) serve quite different functions than they do in higher-order species. Such an arrangement makes sense: it is much more efficient to build upon existing genetic elements than to invent and perfect the action of entirely novel ones. This becomes especially evident given the number of auxiliary proteins (e.g., receptors consisting of multiple subunits, transporters, kinases, cleavage and folding enzymes, RNA sequences), each requiring unique genetic sequences, that are necessary for a neurotransmitter system to function. As an example, the beta-1 adrenoceptor alone has 477 amino acids with 15 unique domains.
The exaptation of high level neural processes
Exaptation is likely at work at higher levels of complexity as well. In particular, cognitive processes such as episodic, semantic, and working memory are likely built on more primitive functions such as navigation. Mechanisms by which simple organisms are able to identify, locate, and react to salient environmental stimuli (e.g., food or predators), have likely been co-opted through evolution to serve as mechanisms by which our brains select between competing internal representations or concepts. In other words, the ways in which we (humans) navigate through our mental space is likely built upon the same mechanisms by which we (and many lower-order organisms) navigate through our physical space.
This requires elaboration. The most lucid explanation I have read (and indeed the main inspiration for this post) is provided in the book Rhythms of the Brain by György Buzsáki. In this opus, Buzsáki describes the well-known place cells of the hippocampus and grid cells of the entorhinal cortex, which are the likely substrate of navigation in rats (and, by extension, humans). Place cells, discovered by O'Keefe and colleagues, have spatially specific "place fields": each cell fires maximally when a rat is at a specific location in a 2D field. Grid cells, discovered later by the Moser lab, fire preferentially at multiple locations which form a regular hexagonal grid in the 2D field. Because of these striking properties, both grid cells and place cells have been studied extensively over the past few decades. From these investigations, we know that grid cells arise almost immediately when a rat is placed in a novel environment, while place cells require experience and active exploration to develop their place fields. Also, while established place cells in a 2D field display omnidirectionality (meaning their firing rate is invariant to the direction from which an animal approaches a location), place cells in a 1D field (i.e., a straight track) fire more strongly when approached from one direction than the other.
Studies by Buzsáki, O'Keefe, McNaughton, and others have helped to refine our understanding of how these neuronal elements form a mechanism of navigational processing. One key discovery is based on the observation of oscillatory behaviour in the hippocampal complex; in particular, the prominent intrinsic theta oscillation. Theta has a number of possible sources in these structures, and it appears that an interplay between oscillations with slightly different frequencies provides an essential timing mechanism through which an organism can estimate distances and build spatial maps of its environment. This relationship can be understood as follows: if a cell's firing rate follows an oscillatory pattern at a slightly faster frequency than the global hippocampal rhythm, it will at each subsequent cycle fire at a different phase angle of the global rhythm. Thus, the phase at which a place cell fires can encode the position of its preferred location relative to the animal; and by extension, it can also encode the relative positions of many locations along a track. Indeed, if the animal traverses the environment at a known velocity, it can use this phase encoding to estimate distances along the track.
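To make the phase arithmetic concrete, here is a toy calculation in Python. The frequencies (8 Hz for the global theta rhythm, 9 Hz for the cell) and the running speed are illustrative assumptions rather than values taken from the studies mentioned above, and the function names are hypothetical.

```python
def firing_phases(f_theta_hz, f_cell_hz, n_cycles):
    """Theta phase (degrees, 0 to 360) at each successive peak of the cell's oscillation."""
    phases = []
    for n in range(n_cycles):
        t_peak = n / f_cell_hz              # time of the n-th peak of the cell's oscillation
        theta_cycles = f_theta_hz * t_peak  # (fractional) theta cycles elapsed by then
        phases.append((theta_cycles % 1.0) * 360.0)
    return phases


def distance_from_phase_shift(phase_shift_deg, f_theta_hz, f_cell_hz, speed_m_s):
    """Distance covered while the firing phase shifts by `phase_shift_deg`,
    assuming the animal runs at a constant, known speed."""
    beat_hz = f_cell_hz - f_theta_hz  # a full 360 degree shift takes 1 / beat_hz seconds
    return speed_m_s * (phase_shift_deg / 360.0) / beat_hz
```

With these numbers, `firing_phases(8.0, 9.0, 4)` steps through roughly 0, 320, 280, and 240 degrees: each peak lands 40 degrees earlier in the theta cycle than the last. A full 360 degree shift takes one second, so at an assumed 0.3 m/s it corresponds to 0.3 m of track, which is the sense in which phase can stand in for distance.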
This theoretical model has been substantiated by a number of observations. Perhaps most strikingly, Dragoi and Buzsáki (Neuron, 2006) observed rats traversing a 1D track while recording from place cells with various field lengths. They found that the position in the place field was predicted by the phase of hippocampal theta at which the cell fired, with maximal firing occurring at the trough of the oscillation (phase \(\varphi=180^{\circ}\)). Most likely, positional information during traversal along a track is encoded by ensembles of cells, rather than single neurons. Such ensembles include cells with place preferences for landmarks both in previous and future positions, with the distance between landmarks being encoded as synaptic strength between place neurons.
How are these observations from 1D experiments relevant to the original discovery of omnidirectional 2D place cells? This is a critical question. 2D place cells take time to form and, importantly, they also appear to require both active navigation and crossing paths. Navigation of spiral or non-crossing mazes does not lead to the formation of a hippocampal map. A crucial aspect of these maps is likely that they are a compilation of an ongoing 1D path integration process. As an organism traverses an environment, it trains its place cells to recognize a sequence of landmarks, the mechanism for which is a strengthening of synaptic weights between an ensemble coding for these landmarks. As 1D paths cross in 2D space, they begin to lose their directionality constraints: if a path encounters a location from 10 different directions, the corresponding place cell may cease to code for 1D position and begin to encode an increasingly precise 2D coordinate. It is notable that many higher-level organisms have a strong drive to explore novel environments, which wanes only after they have become acquainted with these environments (again, the analogy with annealing is clear).
Coming full circle
If you'll pardon the pun. This post is about analogy, and the crux of it is this: the power of analogy as an aid to our understanding of systems is no coincidence. The fact that our brain uses analogy to expand its comprehension of complex phenomena is a key observation to understanding how it... understands. The grand analogy I want to conclude with is that the hippocampal theta-driven mechanism of navigation is a likely candidate for higher level cognitive processes as well. The most obvious extension is episodic memory, which describes our ability to replay past episodes of our experience at a later point. We are pretty good at this: imagine, for example, you are asked to recall the day you got your first apartment, or had your first date, or graduated from high school or college, or got married. Most of us, once prompted (and assuming we have had such an experience), can find a starting point and then narrate how things played out that day, in precisely the same way that we can recall how to drive from our house to our workplace. By analogy, it is a very compelling hypothesis that this episodic retrieval uses an identical mechanism of sequentially activated neuronal ensembles, utilizing slight theta frequency offsets much as we use a finger to pace our reading of a book, or a metronome to pace our playing of a musical score.
It is tempting to go further: why stop at episodic memory? There is no reason (that I am aware of) that our ability to follow thought processes, to synthesize new ideas from previously learned associations, does not use an analogous mechanism. Indeed, almost every imaginable form of (comprehensible) interhuman communication aspires to a format which implicitly acknowledges such a mechanism. Every communication requires some form of introduction, which frames the context of what follows and primes the hippocampus to access the relevant concepts (associations, neuronal ensembles) which will subsequently be challenged. This blog post is no exception; the intent (whether or not it has been realized) was to provide a coherent sequence of concepts and associations in order to bring about a deeper understanding of how we think. The same is true of novels, scientific publications, movies, magazine articles, news reports. Each communication is a chain of linguistic symbols intended to activate "threads" of neuronal activations in the same way that navigation of a 1D track does. And every time a conversation "comes full circle", we can think of it as a path crossing itself, and forming the cognitive equivalent of an omnidirectional place cell. We are building mental maps [3].
I could likely go on forever, but this thread needs to end at some point. Which is probably a good way to come full circle. Annealing and other physical analogies emphasize the need for efficiency in any system. This implies that we can only spend so much energy on building cognitive (or spatial) maps, which will be a function of how large an adaptive advantage such maps serve for us as individuals or as a species (relative to the advantages of engaging in other activities, such as taking out the garbage, or remembering to pay your landlord). It is notable that I've included a few footnotes here. These are, to my mind, a compromise between following interesting mental tangents, and maintaining a degree of "focus" on a specific train of thought. This is another trade-off, and yet another analogy to the annealing process: forming clear, concise models of thought requires a balance between exploring one's mental tangents, and following a specific train of thought through to its "conclusion". Too many tangents and you have a mind that has trouble focusing (which describes mania, or psychosis). Too few tangents, and there is not enough creativity to expand one's mental models (and the adaptive advantages these entail).
As a parting note, it must be emphasized that these are all mainly theoretical musings. While the existence of place cells, grid cells, head direction cells, etc., has been well established through invasive electrophysiological animal models, such designs are clearly unsuitable for human research. Investigating the truth of such ideas through existing noninvasive techniques, including MRI, EEG, PET, and MEG, is thus a big challenge. An excellent example of pioneering efforts is this paper by Doeller and colleagues. My personal research will focus on using MEG (which combines decent spatial and excellent temporal resolution), in combination with the others, to test the explicit hypotheses implied by the speculations above. This is no simple task; literally, because no simple tasks exist for cleanly investigating high-level mental processes. But life without a challenge is like simulated annealing with a sufficiently high temperature... okay even I groaned there :)
- Indeed, to be very "meta", this use of analogy can itself provide much insight into how our brains implement mental models of the world, and internally compare the predictions of these models to similar predictions offered by other models of seemingly unrelated phenomena; but this tangent is best addressed as a future thread :)
- Although I don't discuss it here, annealing is also an apt metaphor for the function of norepinephrine and the locus coeruleus, where virtually all noradrenergic neurons reside. In this model, the tonic mode of LC firing is analogous to the "temperature" of the optimization process, and switches the organism between exploratory and focused behavioural states. See Aston-Jones and Cohen, 2005.
- Deleuze and Guattari have elaborated an extensive philosophical framework based on the idea that thought is "rhizomatic"; in other words, if a plant analogy is to be made, the more appropriate one is a rhizome (as opposed to a tree), which forms recurring networks with itself. Check out A Thousand Plateaus for elaboration...