Planetary Assessment with Epsilon Machines

The culmination of several years’ work by myself and a group of fantastic collaborators was just published in the journal Nature Astronomy (full equivalent preprint version). Several years ago, a colleague at JPL, Dr Jonathan Jiang, presented a seminar about his work on detecting planetary properties using single point reflected light data (original: Jiang et al. 2018, and more recent work: Fan et al. 2019, Gu et al. 2021). By single point reflected light I mean the light reflected by a planet from its star that appears as a single point when viewed from a very great distance. The great distance removes all spatial resolution from what would otherwise be an image if you were looking at the planet from close by. So if you were watching a distant exoplanet for an extended period with a sufficiently powerful and sophisticated instrument, you would see a single point of light that varies in magnitude over time as the planet rotates, moves around its host star, and undergoes its own internal dynamics (such as clouds and their motion). With such an instrument you would likely be able to see these light variations at several wavelengths (sometimes called channels).

Dr Jiang had been using images of Earth from the Deep Space Climate Observatory (DSCOVR) to conduct a special kind of proxy experiment, where the Earth images were disk integrated (removing spatial resolution and converting images to a single pixel) as if the Earth was being viewed from a great distance. The DSCOVR camera being used takes images in 10 different wavelength bands. From the time series of these 10 disk integrated image sets, one can actually deduce many planetary characteristics including orbital period and rotation period. In fact, the great work of Dr Siteng Fan showed that one can even reconstruct spatial features of the planet from non-spatial, multi-wavelength data (see Fig. 4 of Fan et al. 2019).
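To make the idea of disk integration concrete, here is a minimal sketch in Python. It is not the DSCOVR processing pipeline: the function names, the nested-list image layout and the simple unweighted average over a circular disk are all assumptions made for illustration.

```python
def disk_integrate(image, cx, cy, radius):
    """Average every pixel within `radius` of (cx, cy), giving one number per image.

    `image` is a 2D list of brightness values (hypothetical layout: image[y][x]).
    """
    total, count = 0.0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += value
                count += 1
    if count == 0:
        raise ValueError("disk contains no pixels")
    return total / count


def light_curves(frames, cx, cy, radius):
    """Turn a sequence of multi-band images into one time series per band.

    `frames[t][band]` is a 2D image (assumed layout); the result is
    `curves[band][t]`, i.e. a single-pixel brightness for each time step.
    """
    n_bands = len(frames[0])
    return [
        [disk_integrate(frame[band], cx, cy, radius) for frame in frames]
        for band in range(n_bands)
    ]
```

Each returned list plays the role of one wavelength channel of the single point light curve: all spatial information is gone, and only the variation over time remains.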

Following on from this, Dr Jiang posed the following question: could we also use multi-wavelength time series of reflected light to detect life? Around that time I had been discussing various topics in complexity science with friends at the Earth-Life Science Institute, and reading some of the incredible work of Prof. Jim Crutchfield and colleagues at UC Davis (this article is a particular favourite: Crutchfield 2012). In this context, a potential answer to Dr Jiang’s question occurred to me: perhaps the presence of life correlates with the complexity of planetary reflectance time series. It is this idea that drove the last ~3 years of research on this project. We had many challenges along the way, and the research would not have been possible without the superb work of Lixiang Gu. Her technique of surface type de- and re-composition allowed her to generate many ‘synthetic Earth’ time series with various combinations of surface and cloud types (Gu et al. 2021). It was through analysing these different synthetic Earths that we were able to establish that the statistical complexity metric from computational mechanics, when applied to single point reflectance time series, can actually distinguish qualitative groups of planet types based on their surface features (see Fig. 1 of Bartlett et al. 2022). Further down the line, we were also able to compare Earth time series with Jupiter data from the Cassini mission, thanks to the great work of Jiazheng Li. He was able to extract equivalent time series data for Jupiter in three wavelength channels that are sufficiently close to channels in the original DSCOVR data. After completing the complexity analysis on these new data, we found that Earth’s complexity is approximately 50% higher than Jupiter’s.

So this work highlighted a potential route to quantify planetary complexity using only time series of light reflectance measurements. Advantages of this include the fact that it works for essentially any planetary body, regardless of type or chemistry. It is also completely agnostic to the types of physical, chemical or biological processes that might be occurring on a planet, and hence has the potential to detect signatures that might be missed by other techniques that are more focused on life as we know it. At the same time, the method comes with unique challenges, starting with the acquisition of high quality time series for distant exoplanets. I am optimistic that the likes of JWST and the next generation of telescopes will rise to this challenge. Additional challenges include false positives: complex planets that might be completely sterile. We can only make progress on controlling for this possibility by applying our technique to more and more planets (which we are in the process of doing). The other side of this is false negatives: simple planets that might contain life. My own opinion is that this is not a huge concern for the following reasons: from what I understand of life, in particular the non-linear characteristics of it, it seems unlikely that it would remain quiescent, hiding in the background of its host planet for extended periods. While I’m sure that early/proto life would be difficult if not impossible to detect from a great distance, after sufficient time (let’s throw out the number of ~billions of years), I would expect most types of life to expand and develop a variety of niches and entangled relationships with their planetary environs (see ideas such as those in: Chopra & Lineweaver 2016 and Lenton et al. 2018).

My hope is that this work introduces many people to ideas from complexity science that they might not otherwise have encountered. Epsilon machines and computational mechanics will likely play a key role in the understanding of causal reconstruction from data (as opposed to the pattern recognition one achieves when using a typical machine learning algorithm). Our work is certainly not an absolute answer to the great biosignatures question, but I hope it will play a role among a suite of other methods, which together, eventually, allow us to pinpoint that second data point of lyfe.

Complexity-entropy diagram with the various synthetic Earths used in our analysis.
Comparison of complexity-entropy values between Earth and Jupiter

Lyfe here, elsewhere

A close friend, Dr Michael Wong at the University of Washington, and I just published a paper on a new definition of life that we have been refining in recent years. It is based on what we call ‘four pillars’. These are four characteristics that we predict must be present in a system in order for it to qualify as living. They are:

  1. Dissipation: the existence of thermodynamic gradients, driving forces, and the dissipation thereof
  2. Autocatalysis: the ability to grow exponentially when resources are abundant
  3. Homeostasis: the ability to regulate physical variables to within viable ranges
  4. Learning: the ability to encode and process information

We hence introduce a new term, Lyfe, which is the set of all living (actually lyving) systems in the universe, in contrast to life, which is all the living things on Earth (life is a subset of lyfe). We hope that this definition and distinction will help the astrobiology community as it searches for extraterrestrial life and seeks new understandings of Earthly life.

Figure 5

We also introduce a conceptual lifeform called a mechanotroph, which uses fluid kinetic energy to power its internal metabolism. Such an organism has never been detected on Earth, but its bioenergetics should be possible based on the idea of reversing the operation of a flagellar motor and using it as a generator.

The paper is open access and can be found here.

Ribowhat?

The ribosome lies at the heart of biological organisation. Indeed it is the nexus between the memory molecules of DNA, and the incredible machinery of proteins. It is known as the translation apparatus, because it reads information from DNA (via an intermediary, messenger RNA), and uses that information to form the precise sequences of amino acids that make proteins. Without this bewildering RNA-protein complex there would be no accurate synthesis of proteins, and life as we know it would not be possible (or would have to use some alternative system).

There has been great progress in recent decades on reconstructing the historical development of this marvel. In particular, the group led by Loren Williams has used various molecular and computational techniques to establish the likely sequence in which new sections of RNA were added as the ribosome became bigger, more sophisticated, and more accurate over evolutionary time. Their work offers the possibility to imagine very early ribosomes, and possibly the primitive version that was used by the so-called Last Universal Common Ancestor of life. The big mystery here is how much of the original structure of these ancient versions is left in modern ribosomes for us to use as a record. The ribosome presents difficulties to the RNA-world hypothesis for the origin of life because there has been a close coevolution between RNA and proteins in the early history of the ribosome and there seems to be little evidence that RNA was playing a catalytic role in the early stages of life.

The modern ribosome is an RNA-protein complex: it’s composed of two large and complicated pieces of RNA called the large and small subunits, and its structure depends critically on ribosomal peptides. So it’s hard to imagine a ribosome, even a primitive one, working without peptides, especially since the whole point is to make peptides of a given sequence. We also have to consider transfer RNAs (tRNAs). They are also strands of RNA which bring amino acids to the ribosome, bind to the relevant mRNA (which passes through the ribosome as it is being translated), and then are discarded once their amino acid has joined the growing peptide chain. How do the tRNAs get hold of their cognate amino acids? Well they have to be ‘charged’ by molecules called aminoacyl tRNA synthetases (let’s call them fetchers). The fetchers have to first bind to their amino acid, and then pass that amino acid to their relevant tRNA. Now the fetchers are themselves proteins (coded by DNA and translated by the ribosome). Oh, and let’s not forget how the cell makes the ribosome itself. This is a complicated process (what a surprise) involving many proteins and genetic molecules.

So at this stage we have a sizeable mess of dependencies and compositions:

  • The ribosome is made mainly of RNA, but cannot be separated from its ribosomal peptides.
  • The mRNA and tRNA are of course made from RNA, but the charging of the tRNAs is done by peptides (the fetchers).
  • The job of the ribosome is to translate information in the DNA to sequence structure in the proteins.

From the perspective of the origin of life this compounds the existing chicken and egg paradox that genetic molecules need proteins for replication but proteins need to be assembled with information in the sequences of genetic molecules. Which came first? Of course this question is on the one hand confusing, and on the other misleading and probably irrelevant. Asking which thing came first is not helpful because prebiotic chemical systems would have been messy by definition, the molecule classes were probably never present in isolation, and even if they were they would not have had the same ‘function’ as they do in modern life. It’s a bit like asking whether positively or negatively charged particles emerged first after the big bang (maybe).

I prefer to imagine a scenario where both classes of molecules were intermingling in a stupendous chaos that included many other types of molecules, including those that play no role in life today. Whatever environment life emerged from, somehow some organising forces connected the dissipation of free energy to the condensation of coherent units capable of growing, regulating and learning. One fairly recent work used sequencing to compare the different RNA players in the ribosome story. They claim that the ribosomal RNA and the tRNAs share a common evolutionary history, i.e. that the ribosome was originally coding for and translating its own proteins, and the specialisation to tRNAs and mRNAs came after the ribosome was simply directing the synthesis of itself. So here we have a coevolutionary story of primitive peptide ligation (joining together of amino acids, which is really hard to do outside of the cell, without all the clever cellular machinery) performed by the early peptidyl transferase center (the oldest and most central part of the large subunit of the ribosome), which somehow stumbled upon using extra bits of RNA to assemble amino acids according to a sequence of RNA bases. This is a fascinating take on the ribosome’s story and the closest thing I’ve seen so far to a potential storyline leading from the first lump of RNA that looked a bit like a ribosome to a functioning ribosome. It does not explain where that first lump of RNA came from, which is generally a much harder problem. Prebiotic chemists work hard to understand how RNA can form on its own, but in general there is a long way to go. And even once that is solved, there will probably still be an open question of why those RNA bits might form into a lump that might work a bit like a very simple ribosome.

Underneath all these molecular details there is a cleaner, fundamental problem of why a non-equilibrium system (one containing free energy gradients, such as our planet) would start to couple the dissipation of free energy with the construction of coherent entities capable of autocatalytic growth, self-regulation and information processing. We have examples of patterns that result from free energy gradients (convection cells are of course one of my favourites), but nothing even remotely close to the complexity of life. We should be wary of the hope that modern life has kept a sufficient record of its history in its current construction. Life’s evolution has never followed any kind of logical design process. Sure it has optimised and tuned many things, but much of its nanoscale construction is both brilliant and confusing from a design perspective. I have great hope that results from information thermodynamics may eventually reveal why and how dissipation can lead to emergent complexity and the phenomena of life. I suspect that even a primitive ability to process information from an environment can lead to an exponential positive feedback on that ability, and that this is reflected in evolutionary history: the major transitions are changes in the way information is processed, communicated and utilised.

For the not-so-humble ribosome, I look forward to future experiments where we synthetically bring primitive ancestors of the modern ribosome back to life, and watch as they provide clues to their distant and veiled beginnings.

Unscrambling Therapy

No longer is anyone unaffected by the Covid-19 crisis. Nor is there anyone who is not frustrated, upset, stressed or anxious. Much frustration comes from not understanding the situation, especially when in theory it should be understandable using appropriate analytical techniques. Even though reading some of the flurry of papers in the literature can augment worry, I also find it somewhat calming at the same time, because it creates a (perhaps illusory) feeling of understanding (which in turn can give hope that there might eventually be a degree of control). From my own perspective as someone with a keen interest in the natural world and its large-scale organisation, the turmoil raises several themes related to complex systems, networks and nonlinearity, alongside the basic humanitarian aspects.

Firstly, is there anything positive that can come out of such a pandemic? Well, hopefully most people will end up with long term immunity to the virus, preventing future outbreaks, healthcare systems and preparedness should be enhanced in the long term, and hopefully everyone’s awareness will be raised. Thankfully, this virus was not as pathological as some historical examples. What would have happened if the fatality rate was >5%?

If nothing else, this event has highlighted how incredibly interdependent the modules of our global society are. If I injure my index finger, I feel pain in my brain and my immune system coordinates to repair the damage. Eventually the issue is rectified and homeostasis returns. Does a damaged finger disrupt my digestion? Does it impact the oxygen saturation of my blood, or compromise my insulin response? Probably not, at least not by any significant amount. If we compare to the current situation, the pandemic is bringing the healthcare systems of the world to a critical situation. But several weeks before that, the financial system already suffered its own dramatic panic. Problems in that system propagate rapidly into almost all other subsystems of the global system, and in particular our information systems are highly vulnerable to panic, which tends to spawn fake news just as fast as the worst virus spawns progeny. And toxic fake news propagates out of the digital world, and disrupts the real world with negative behaviours motivated by falsities.

My point is that many biological systems are modular in a highly resilient way. The modules are connected in terms of supporting one another, but isolated in a way that problems in one module do not immediately propagate into and compromise other modules. For our global society, the finance industry module should be interfaced with the real economy and information system in such a way that it can provide benefits, but not infect and disrupt the other modules as soon as something negative occurs. Likewise the healthcare system should be supported and maintained by the other modules of the system, but problems from outside should not be able to propagate in and compromise it. For example, the lack of supply of PPE is an example of a constraint in an external system propagating into the healthcare system and compromising it. When the body is under attack it prioritises self-protection. Likewise global society should have a healthcare system that is shielded from fluctuations, problems or constraints in other subsystems (to the extent that this is possible, certainly more so than the current setup).

The primary measure against the spread of the virus has been social isolation, which cuts off the virus’ primary spread mode. When staying at home what do most of us do? Well we turn of course to the digital world, social media, news media and shows of all sorts. The basis of the isolation policies is that at home we are all mutually shielded from this biological infectious agent, which works, as we know. However the digital realm has its own infectious agents. The primary difference between biological and digital infectious agents is that biological agents produce physiological symptoms and problems, whereas digital infectious agents primarily have psychological effects (alongside the otherwise relevant bank fraud and hacking, etc.). We know that digital infectious agents (misinformation, disinformation, information overload, information addiction, etc.) cause a broad range of issues from excessive consumption of attention to outright social deprivation, depression and suicide. However these effects are much harder to measure and quantify than those of purely biological (physiological) infectious agents. For example, information addiction might slowly eat away at someone’s life over a long period, eventually causing them to abandon most life activities in favour of an addictive online activity, and with them many facets of what would be considered a balanced life. But in such a case it would be difficult for governments or health providers to track such a decline, and I’m sure many such cases go somewhat unnoticed. So how is this relevant to Covid-19? Well the point is that any system involving learning agents seems to produce parasitic entities which enslave cooperative systems for their own replication, while disrupting the function of the cooperative systems. So while staying at home shields us from biological agents, it potentially exposes us to a larger number of digital infectious agents, which have their own suite of damaging effects.
We are given scientifically-backed advice on combatting biological infectious agents, but protective frameworks against digital infectious agents are still in their infancy, and in general those agents are currently raging almost unchecked online, causing all manner of disruptions, the extent of which can only be glimpsed at present. While governments have put a certain amount of effort into helping society protect itself from digital infectious agents, there is a vast gulf between where we are and where we need to be. If we could measure the damage caused by these pathogens, what would be its magnitude? An online future will have to include institutions that are dedicated to the study and mitigation of these more subtle infectious agents.

The other theme that occurs to me is whether the virome plays any kind of controlling role on the scale of the planetary biosphere. Our global and connected society created a strong niche for this type of virus and in hindsight it’s no surprise that such an ‘organism’ discovered this niche and exploited it. However, by exploiting this niche, the weaknesses in our global system have been highlighted and the movements of every person on Earth have been influenced. And this simply by a small piece of RNA and a handful of proteins. So it is undeniable that this and many other viruses exert a controlling influence on whole ecosystems and the entire biosphere. But again, the question is, is there any possible positive outcome from such events? One would hope that it raises awareness globally of the need to be mindful and respectful of the natural world instead of simply consuming and extracting, while watching it fall apart at the seams. But in general I think that the resilience of the global system should be made demonstrably stronger in the wake of this crisis. In that sense when the 5% fatality rate pathogen arrives, it will not wipe us out, as it would have done had it arrived before SARS-CoV-2. I hope that leaders eventually take seriously the fact that critical modules of our global system need strong firewalls such that when one is compromised, the others do not have to suffer proportionate damage. Likewise when submodules have to be shut down, there should be mechanisms by which the pausing of their operations does not force them to suffer catastrophic losses.

I was also confused for a while about the seemingly bizarre lags and delays between virus onsets in different countries. Given the origin’s position as a global hub, one would expect the distribution and spread of the virus to be fairly homogeneous and simultaneous across the rest of the world. Why is it that Italy had such a crisis while other countries were seeing relatively few cases, only to have their own crises ~1 week later? Surely the rest of the world should have seen their exponential growth at roughly the same time. But herein lies the reason they did not: nonlinear processes such as exponential growth are highly sensitive to initial conditions. So perhaps by some random fluctuation, such as Italy receiving a slightly higher number of visitors from infected locations just before the virus took hold there, Italy was dealt a highly unfortunate early blow. Perhaps other places that saw their cases spike later, just happened to have a slightly lower number of early infected visitors. It’s completely plausible that differences of only a handful of people at the early stages could translate to differences of weeks for peak exponential growth later on. Nonlinear systems are known to exhibit such sensitive dependence on initial conditions.
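A back-of-the-envelope calculation makes the size of this effect easier to judge. Under pure exponential growth N(t) = N0 · 2^(t/T), two places with the same doubling time T but different numbers of seed cases follow identical curves shifted in time by T · log2(seed ratio). The sketch below assumes this idealised model; the function name, the doubling time and the seed counts are hypothetical values chosen only for illustration, not estimates for any real country.

```python
import math


def lag_days(seed_a, seed_b, doubling_time_days):
    """Days by which place B trails place A under identical exponential growth.

    Assumes N(t) = seed * 2 ** (t / doubling_time_days) in both places, so B
    reaches any given case count log2(seed_a / seed_b) doublings after A.
    """
    return doubling_time_days * math.log2(seed_a / seed_b)


# Hypothetical numbers: 40 early imported cases versus 5, doubling every 3 days.
print(lag_days(40, 5, 3.0))
```

With these made-up numbers the lag comes out at roughly nine days, so a difference of a few dozen early cases is enough to shift the steep part of the curve by more than a week.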

The exponential aspect also partly explains the reluctance of some countries to take drastic action: as humans we are much better at perceiving and extrapolating linear processes than exponential ones. Indeed, exponential growth is virtually indistinguishable from linear growth in the early, ‘flat’ phase before the steep ascent. So while Italy was experiencing its terrifying exponential phase, the view and numbers from other countries gave the illusion of being linear. And from afar, again as humans, perhaps we perceived that we would not suffer the same fate.
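The early resemblance to a straight line can be quantified with a small sketch. It compares exponential growth exp(r·t) with its straight-line extrapolation 1 + r·t and reports how much of the exponential value the line misses. The growth rate used below is a hypothetical illustration, not a fitted epidemic parameter.

```python
import math


def relative_gap(rate_per_day, day):
    """Fraction of the exponential value that a linear extrapolation misses.

    Compares exp(rate * day) with the tangent line 1 + rate * day, both
    normalised to a starting value of 1.
    """
    exponential = math.exp(rate_per_day * day)
    linear = 1.0 + rate_per_day * day
    return (exponential - linear) / exponential


# Hypothetical growth rate of 0.1 per day.
for day in (1, 7, 30):
    print(day, round(relative_gap(0.1, day), 3))
```

For small values of r·t the gap is well under one percent, which is why the early numbers look linear; by the time r·t reaches a few units the straight line misses most of the true value.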

It seems clear that big-data, internet-of-things approaches are the best chance at early warning systems for these types of epidemics. The challenge will be balancing the distribution of such personal information with the protection of individual privacy. But before that, the resilience of our interconnected modules needs to be enhanced by first accepting, and then appropriately tuning, their interdependencies.

Vital Little Machines

This is a mini review of a great book that I recently read called Life’s Ratchet, How Molecular Machines Extract Order from Chaos. In my research I’m interested in how the complex machinery of life stumbled out of its humble geochemical beginnings. I also have a keen interest in thermodynamics, and something microscopic that extracts order from chaos sounds more than a little like a Maxwell’s demon. Said demon is an interesting little physical paradox that started life as an almost comical thought experiment. However it ended up taking some of the finest minds of the twentieth century to prove that this imaginary little gremlin couldn’t function as envisaged.

The central theme of the book is that some of the most important large molecules used by organisms are actually micro-scale machines that source their energy from background thermal noise (as opposed to being purely fuelled by high energy molecules). This is hard to believe at first glance because it sounds like a Maxwell’s demon and a violator of the second law of thermodynamics. As the book explains, the second law is safe from these little wonders, since they still require a supply of chemical free energy to bias their motion. If left on their own in an equilibrium system, they would just go about a random walk. But with a supply of ATP, they are able to ‘trap’ thermal fluctuations of a particular direction and produce sustained useful mechanical work.

The first third of the book introduces the history of research into the physics of life including the futile search for a vital force (once believed to be electricity, since it could ‘animate’ dead frogs, among other things). There is then a nicely presented summary of thermodynamics to prepare the reader for some of the concepts introduced when describing molecular machines.

The core of the book focuses on a family of walking protein molecules called kinesin, which the author has studied intensively within his lab using state of the art measurement techniques including atomic force microscopy. This incredible molecule indeed gets its power from Brownian impacts (always present in any system at finite temperature). A supply of ATP (or rather, a concentration of ATP that is out of equilibrium with its hydrolysis products) produces conformational biases in the protein that increase the chance that it steps forward rather than backwards (along its microtubule track). This raises a fascinating point: has life correctly discovered that at these length scales (the nano scale), chemical-to-work conversion is more effectively carried out by sucking energy from the “molecular storm” at the expense of a chemical potential gradient, than by simply plugging that chemical potential gradient straight into a molecular machine (while effectively ignoring or resisting the molecular storm)? Of course such tiny objects can never resist Brownian motion in the same way that a macroscale object (such as a car engine) does. So maybe life did make the correct functional choice since if you can’t beat thermal fluctuations, why not find a way to harness them?
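The difference between an unbiased and a biased walker can be illustrated with a toy model. This is only a sketch of the statistical idea, not a model of real kinesin kinetics: the walker takes unit steps, and `p_forward` is a hypothetical parameter standing in for the bias that out-of-equilibrium ATP provides. With `p_forward = 0.5` (equilibrium) the expected displacement is zero; any bias gives a net drift of n(2p − 1) steps.

```python
import random


def expected_displacement(n_steps, p_forward):
    """Mean net displacement of a walker taking unit steps forward or backward."""
    return n_steps * (2.0 * p_forward - 1.0)


def net_displacement(n_steps, p_forward, rng):
    """Simulate one walker: each thermal kick moves it +1 with probability
    p_forward and -1 otherwise. Returns the final position."""
    position = 0
    for _ in range(n_steps):
        position += 1 if rng.random() < p_forward else -1
    return position


rng = random.Random(0)  # fixed seed so the run is repeatable
print("equilibrium walker:", net_displacement(10_000, 0.5, rng))
print("biased walker:     ", net_displacement(10_000, 0.8, rng))
```

In both cases the steps themselves come from the same random kicks; the only thing the assumed bias changes is which of those kicks are allowed to count, which is the sense in which the chemical free energy rectifies the noise rather than pushing the walker directly.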

Overall this was a great book that I really enjoyed reading. The only minor disappointment was that I was expecting the discussion of molecular machines to be related back to the origin of life, or at least to the ancestors of modern molecular machines. The final parts of the book did not really touch on this, which was a shame. Nonetheless I thoroughly recommend this book for anyone who wants an accessible introduction to the dynamics of this incredible class of biomolecules.

Into Kamakura

 

I recently went back to a town called Kamakura, on the coastline south of Tokyo. This is an extremely popular place for Japanese and foreign tourists alike and for very good reason. There is a beach and copious amounts of wind, there are forested hills with decent hiking trails, and finally the main draw is the vast array of temples and shrines. Japan is incredible for the sheer density of its beautiful temples, and the high standards at which they are maintained. This is partly because they are not just historical relics or tourist attractions, but active and vibrant places of religious activity. They are as active now as they have been for hundreds of years, if not more so.
Japanese philosophy would take years, perhaps decades, to begin to understand. However you can feel as if you are absorbing some of the wisdom by diffusion if you simply tour around the temples of Kamakura. I was able to go a bit further afield this time around and tried to capture the atmosphere of the place, where nature and human constructions intermingle in a way that is unique to Japan.

Signs of hope

This program struck me for its subtle presentation of something amazing: the gradual recovery of some iconic European species in the wilds of France. It’s easy to assume that here in central Europe we’ve lost our large predators and a big chunk of our biodiversity in general. To some extent this is true, but what is also true is that persistent efforts by conservationists are showing incredibly impressive results such as the re-introduction of bears and vultures to some regions of France. While we hear every day of extinctions and habitat destruction, it’s easy to overlook the positive work going on right on our doorstep to ensure that there are still wild places for the benefit of us and all the species that inhabit them.
http://www.bbc.co.uk/i/b041z55p/

Reaction Diffusion Phase Portrait

rd_phase

This image shows the various structures that form in what is known as a Gray-Scott reaction diffusion system, as the two key parameters are varied (for more info see the seminal paper or this excellent website). I’m running simulations of these systems as part of my PhD work, and this image struck me as particularly stunning. The pattern-forming part of the phase diagram is like the shoreline between realms of stability.
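For readers who want to experiment, here is a minimal sketch of a Gray-Scott update step in plain Python. It is not the code used for the image above. It assumes the standard form of the model, du/dt = Du∇²u − uv² + F(1 − u) and dv/dt = Dv∇²v + uv² − (F + k)v, integrated with explicit Euler steps on a periodic grid with unit spacing. The diffusion constants, time step, grid size and the particular feed/kill values are illustrative assumptions.

```python
def laplacian(grid, i, j):
    """Five-point Laplacian with periodic boundaries and unit grid spacing."""
    n, m = len(grid), len(grid[0])
    return (grid[(i - 1) % n][j] + grid[(i + 1) % n][j]
            + grid[i][(j - 1) % m] + grid[i][(j + 1) % m]
            - 4.0 * grid[i][j])


def gray_scott_step(u, v, feed, kill, du=0.16, dv=0.08, dt=1.0):
    """Advance both concentration fields by one explicit Euler step."""
    n, m = len(u), len(u[0])
    new_u = [[0.0] * m for _ in range(n)]
    new_v = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            uvv = u[i][j] * v[i][j] ** 2  # autocatalytic reaction term
            new_u[i][j] = u[i][j] + dt * (
                du * laplacian(u, i, j) - uvv + feed * (1.0 - u[i][j]))
            new_v[i][j] = v[i][j] + dt * (
                dv * laplacian(v, i, j) + uvv - (feed + kill) * v[i][j])
    return new_u, new_v


def seeded_grid(size, patch=4):
    """Uniform state (u=1, v=0) with a small perturbed square in the centre."""
    u = [[1.0] * size for _ in range(size)]
    v = [[0.0] * size for _ in range(size)]
    start = size // 2 - patch // 2
    for i in range(start, start + patch):
        for j in range(start, start + patch):
            u[i][j], v[i][j] = 0.5, 0.25
    return u, v


u, v = seeded_grid(32)
for _ in range(200):
    u, v = gray_scott_step(u, v, feed=0.04, kill=0.06)
```

Sweeping `feed` and `kill` over a grid of values and plotting the final `v` field for each pair is one way to build up a phase portrait like the one shown; pure Python is slow for this, so an array library would be the practical choice for larger grids.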