The basal ganglia for action selection

Peter Redgrave, Kevin Gurney, and John Reynolds have a series of papers out where they detail a basal ganglia model, looking at its physiology and potential functional role in the brain. They address a number of different points in their papers, and I’m going to write up a couple of posts in hopes of making the model / material more accessible and furthering my own understanding. I’ll also be adding in my own thoughts and questions as I go along, but I’ll try to keep explicit when ideas are coming from papers and when they’re coming from me. In this post I’m going to look at the basal ganglia’s proposed role as an action selection center.

Basal ganglia as an action selection center
In complex systems like the brain, there are numerous processes and sub-systems operating in parallel. Things like feeding, predator avoidance, mating, etc are all going to be suggesting a specific course of action for the body to follow (hereafter these different sub-systems will be referred to as ‘command systems’, keeping with terminology from the paper series). The problem arises in that there is only one body, and letting all the command systems have at controlling the body all at once is a poor idea for generating effective / efficient behavior. What is needed is a method of relegating control of the motor system to a single command system, and preventing signals from other command systems from interfering. This can be done by having all command systems put forward an ‘urgency’ (or saliency) level, and then using a winner-take-all (WTA) function to choose one to be in control.

In [Redgrave 1999], a set of possible solutions from engineering are presented in three WTA system architecture types: subsumption, distributed, and central selection.

Subsumption: In the subsumption architecture, the command systems have a priority ordering. In the event of a conflict, systems higher up on the priority list can override those lower than them to interrupt and suppress or replace outgoing commands. Although this allows quick response to environmental contingencies (such as the appearance of a predator, with the ‘evade predators’ command system given top priority), the prioritization is built in to the system, and as more command systems are added it becomes difficult to determine a proper prioritization. Additionally, due to the ordering of systems being built-in, the subsumption architecture displays far less flexibility than biological nervous systems.

Distributed: The distributed architecture is a popular choice for winner-take-all implementations, where each option is connected to all the others with an inhibitory connection. As the saliency of a given option increases, it inhibits the other options, which in turn reduces the inhibition they project back, until only one option is uninhibited. Here, selection is considered an ’emergent’ property of the network. This architecture also supports adaptation, as the weighting of the connections between options can be tuned, giving rise to complex dominance dynamics. However, there is a costly implementation. First, every option must be connected to every other option (resulting in n(n – 1) connections), and the connection weights properly balanced to give the desired prioritization. Second, to integrate a new option into the system another 2n connections must be added, and they must be properly balanced with the already existing connection weights. Third, the more options that are added to this system, the longer it takes to choose between them, especially if several options present saliency values very close to one another (this last point was added by me, and is not stated in the papers).

Central selection: In the central selection architecture, all of the command systems send their saliency values to a central switching device, which chooses one of them as the winner. In this case, the complexity of system connectivity is significantly reduced, to only 2n connections total (one from and to each command system), and to add a new system only requires 2 connections be added. Additionally, the case of tuning the connection weights from each system becomes significantly easier, as the dynamics that determine the winner are now explicitly based on the weighting of the only connection from each command system in to the central switching device.

Unsurprisingly, the central selection architecture is proposed to best model the structure of the brain for selecting between command systems (although the authors suggest each command system may implement a distributed selection architecture internally), and the basal ganglia is proposed as the central switching device. Supporting this, a computational model of the basal ganglia was presented in [Gurney 2001], and implemented in spiking neurons in [Stewart 2010], based on biological structure that very efficiently perform winner-take-all functionality. Interestingly, its architecture is such that it effectively chooses a winner quickly regardless of the number of competitors and its performance does not suffer from competitors presenting very similar saliency values [Stewart 2010].

Central selection constraints: The use of a central selection architecture also imposes several constraints: 1) the saliency of each competitor must be measured in some ‘common currency’, and 2) the output of the central switching device (the basal ganglia) must be set up such that it can activate the winning command system, and disable the losing ones.

For the common currency between command systems, the authors propose the use of dual population encoding [Koechlin 1996], which I’ll go into in another post more in detail, but basically says two things can be extracted from the firing pattern of a population of neurons: the first is the information being represented, and the second is the saliency of this information, determined as the norm of firing rates of the neurons.

To address the second constraint, we’ll first need to look briefly at the structure of the basal ganglia.

Basic structure of the basal ganglia
This is a very low-res diagram of the neurobiological structure of the basal ganglia, taken from [Gurney 2001]:
The principle input components of the basal ganglia are the striatum and the subthalamic nuclean (STN). These structures receive projections from pretty much the entire cerebral cortex, including the motor, sensory, association, and limbic areas. The main output components of the basal ganglia are the internal segment of the globus pallidus (GPi), and the substantia nigra pars reticulata and lateralis (SNr). The output of the basal ganglia projects then through the thalamus and back to the cortex. Notably, projections routed through the thalamus go to both the same sites that originated the basal ganglia input, as well as others, forming both closed and open loop systems [Joel 1994].

Parallel functional loops: There are two particular points of interest of the basal ganglia structure relevant to this discussion. The first is that there is an intrinsic separation of information from different brain regions as it travels through the basal ganglia, such that the basal ganglia can be viewed as having a number of different processing tracts that operate in parallel: limbic, associative, sensory, and motor. This is the set of closed loops mentioned above. Here is an illustration, taken from [Redgrave 2011]:
All of these loops have a highly similar structure, suggesting that each performs the same function on different information [Voorn 2004].

Tonically inhibitory output: The second point of interest addresses our second constraint mentioned above, of requiring some mechanism for enabling / disabling the output from a chosen command system to take control of the body: The output from the basal ganglia to the thalamus is tonically inhibitory. There have been several possible functional roles proposed for this tonic inhibition, both in the closed and open loop projections. I’ll discuss the closed loop case below. In the open loop projections, there seems to be a clear potential for a ‘gating’ mechanism, where the output from the winning system is disinhibited in the thalamus and allowed to pass forward. Extrapolating from this, I’ve made a very, very, very simplified diagram illustrating how open loop gating using tonic inhibition could work:
Here, the association area has a bunch of different command systems, labelled 1 through K, which all have their own ideas about what the motor control system should be doing. They each send out a branching projection, with the saliency values used by the basal ganglia, and the information carried into the thalamus. They all project to a part of the thalamus which routes the information to the motor system, but due to tonic inhibition from the basal ganglia, no information is passed through. Once the basal ganglia chooses a winner from the K command systems, however, that winner’s channel in the thalamus is disinhibited, and it can send it’s directions out to the motor system for execution. In this way, the basal ganglia has the ability to enable / disable output from a command system.

After discussion with a couple of the guys in my lab, a couple of benefits of using tonic inhibition over selective excitation as the output of the basal ganglia have come up.
The first is that the use of inhibition is a much simpler implementation of a gateway. When using inhibition, the connections from the basal ganglia fire if no information should pass through, and stop firing when it should. In the case of activation, however, there is necessarily some sort of multiplication operation being performed such that the output from the gateway is GATEWAY_VALUE * INPUT_VALUE. In addition to being more complicated that inhibition of undesired options, it’s inclined towards performance errors.
This is the second point, in that with tonic inhibition the basal ganglia stomps everything out. So nothing is accidentally passed through a gateway if a INPUT_VALUE becomes highly active. With selective activation, it’s foreseeable that high levels of INPUT_VALUE could mimic the activation levels of GATEWAY_VALUE * INPUT_VALUE. In these ways tonic inhibition makes a gateway functionality more efficient and effective.

Alternatively, the open loop gating could also function as my supervisory, Chris Eliasmith, comments below: The routing signal from the basal ganglia is projected through the thalamus out to modulate the cortico-cortico connections from the associative area to the motor cortex. Modifying the example diagram above to operate this way, we get:

In this case I drew out the different connections for clarity. The saliency values are projected to the basal ganglia, and a winner is chosen. The modulatory values projected through the thalamus then connect to the corticocortical connections from the associative area to the motor area, and set such that the winner is allowed to project into the motor area and the others are prevented. The benefit of performing gating this way is that the required bandwidth for information passing through the thalamus is significantly reduced.

Closed loop projections: The information in this subsection is not discussed in the paper series. The natural question following the discussion of the potential role of the open loops and tonic inhibition in the thalamus as a gating mechanism is what could the role of the closed loops be? The basal ganglia has been shown to play a strong role in motor learning and sequence learning. In [Stewart 2008], a spiking neuron model of the basal ganglia was developed that demonstrates how the recurrent connections with the cortex can be used to control the evolving dynamics of a population of neurons. In the paper a simple set of rules for counting are developed. In experiments on monkeys involving sequence learning, monkeys perform a similar type of learning figuring out how to appropriately move their arms to get the reward. If the basal ganglia is damaged, the monkeys are no longer able to learn new sequences, but can still perform previously learned sequences [Turner 2005].

[Ashby 2007] propose that information such as motor sequences can be learned in the basal ganglia, where very fine-grained mechanisms for identifying the timing and causal relationship between action and effect exit, and once learned, it can then be transferred to the cortex for more automatic execution. This is thought to be what has happened when monkeys are able to execute previously learned sequences, but not able to learn new sequences.

The use of tonically inhibitory output here is still unclear, but one possibility is that inside each command system there is a distributed network, containing each of the possible ‘next step’s for that command system. Inside the basal ganglia, one of these next steps is chosen, and it’s selection amounts to disinhibiting recurrent connections back to itself, allowing its saliency to increase to a point that all the other options are fully inhibited and the dynamics evolve according to the chosen next step.

Hierarchical selection of action
Now with this whole system in place, it is proposed that this structure serves to implement a hierarchy of action selection [Redgrave 1999]. In this hierarchy, the decision on how to next move would start out at a very abstract level, as a competition between some basic command systems arguing about how hungry, tired, horny, etc you are. Once it’s decided that you are more hungry than the others, the next level of the hierarchy is engaged to decide what your best option is: go to the store to get food, eat your canned beans, or order a pizza. This then continues on until you get to a level of deciding what muscles to move, all based on your goal of eating a can of beans. This of course is a gross simplification of any possible analogous process in the brain, but it hopefully gets the point across.

One of the major benefits of a hierarchical action selection setup is that decision making is simplified on the lower levels, because a large number of options are not in line with the decisions made at a higher level. For example, to the end of getting your can of beans, you probably don’t have to decide to not punch yourself in the face, because it doesn’t further you along your path to getting beans.

Things of course become even more complicated when you consider that is possible to be working towards to goals at the same time, in that it is possible for us to successfully walk and chew gum at the same time. But looking at that falls outside of the scope of this post.

In this post I’ve put forth the case presented in the paper series from Redgrave et al for the basal ganglia as an action selection center. Without a doubt there is much more experimental work that needs to be examined, but here I’ve focused on providing a brief overview of how the basal ganglia could be implementing action selection. In future posts on the subject, I’ll be looking at other issues addressed by the Redgrave paper series, in particular the role of the short-latency phasic dopamine signal in the basal ganglia. My goal is to work through these papers and then present an incorporation of this work into a larger model of the motor control system.

————————————————————————————-

[Ashby 2007] – A neurobiological theory of automaticity in perceptual categorization
[Gurney 2001] – A computational model of action selection in the basal ganglia. I. A new functional anatomy
[Joel 1994] – The organization of the basal ganglia-thalamocortical circuits: open interconnected rather than closed segregated
[Koechlin 1996] – Dual Population Coding in the Neocortex: A Model of Interaction between Representation and Attention in the Visual Cortex
[Redgrave 1999] – The Basal Ganglia: A Vertebrate Solution To The Selection Problem?
[Redgrave 2011] – Functional properties of the basal ganglia’s re-entrant loop architecture: selection and reinforcement
[Stewart 2008] – Building production systems with realistic spiking neurons
[Stewart 2010] – Dynamic Behaviour of a Spiking Model of Action Selection in the Basal Ganglia
[Turner 2005] – Sequential Motor Behavior and the Basal Ganglia: Evidence from a serial reaction time task in monkeys
[Voorn 2004] – Putting a spin on the dorsal–ventral divide of the striatum

ResearchBlogging.org
Redgrave P, Prescott TJ, & Gurney K (1999). The basal ganglia: a vertebrate solution to the selection problem? Neuroscience, 89 (4), 1009-23 PMID: 10362291

Tagged ,

3 thoughts on “The basal ganglia for action selection

  1. Interesting thought about each control system keeping track of its own set of next actions… the question, then, is would each system be able to create and project new actions through the BG, or would the BG still have to, essentially, know about all possible actions?

    • travisdewolf says:

      The way I think about it (in the case of motor system) is that there are a set of primitive in the motor cortex, and the basal ganglia is able to generate novel movements through weighted summation of the available primitives in the cortex. In effect, the noisy weighting of these primitives brings about an exploration of possible actions without the basal ganglia explicitly ‘knowing’ what is going on!

  2. Chris Eliasmith says:

    On open loop gating:

    It seems to me that in a structure like the brain, there is no difference between having the connection controlled by thalamus send information through thalamus to be ‘forwarded’ to a brain area, and just having that information directly connected to the receiving area which is gated by thalamus in the area. Generically speaking, there is a difference (wiring in cortex is needed that wouldn’t be otherwise). But wiring in cortex is pretty common… and you can avoid problems of loss/change to high-bandwidth information occurring on the roundabout trip through thalamus.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: