AI unravels chemical signs of the earliest life on Earth

Evidence for the earliest life on Earth has largely relied on finding signs of structures that may have been created during the Archaean Eon by micro-organisms. Actual fossils don’t turn up until the Proterozoic. The most distinctive and diverse of these are members of the Ediacaran fauna, from the period that began around 635 Ma. The oldest widely accepted multi-celled eukaryote fossil was found in 2.1 billion-year-old sediments from Gabon (see: The earliest multicelled life; July 2010). There have been a few claims for biogenic material, such as microscopic tubular structures in 3.5 billion-year-old (3.5 Ga) pillow lavas and 3.2 Ga cherts from South Africa (see: Early biomarkers in South African pillow lavas; April 2004 and Believable Archaean fossils; March 2010), which some researchers dispute. Then there are Archaean stromatolites, which may be evidence for bacterial mats. The oldest of them have been claimed to occur in the famous 3.77 Ga Isua metasediments of West Greenland. But such early fossils are chance finds, so geochemists have entered the arena with attempts to find irrefutable chemical signatures for life in ancient rocks.

One approach is isotope geochemistry. Carbon isotope data have been widely used, because life processes, such as photosynthesis, result in a deficiency of 13C relative to 12C. This was tried on graphite crystals trapped in sedimentary phosphate minerals from Isua. The results were at first acclaimed as a sign of life at around 3.8 Ga, but then refuted. In 2015 a similar approach was applied to graphite trapped in a 4.1 Ga detrital zircon, seemingly pushing back evidence for life into the Hadean. But zircon is a mineral produced by crystallisation of magma, so the fractionation of carbon isotopes in trapped graphite seems unlikely to shed light on the earliest life. The main drawback to using carbon isotopes is that metamorphism, Fischer-Tropsch mechanisms in hydrothermal environments, and volcanic processes may also be responsible for enrichment of lighter carbon isotopes relative to 13C. The relative abundance of the different isotopes of iron in Archaean sediment may give clues to the transient availability of oxygen generated by bacterial photosynthesis that would oxidise soluble Fe2+ to insoluble Fe3+. Promising results were obtained in 2013 from 3.8 Ga banded ironstones at Isua. But doubt was again raised, so the only generally accepted evidence is that of the microfossils found in hydrothermal cherts in Palaeoarchaean pillow lavas from South Africa and Western Australia and the earliest stromatolites, all around 3.4 to 3.5 Ga old. However, recent research may have opened up a more convincing route to tracking down ancient life forms: actual organic molecules that make up or are produced by organisms.
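For readers unfamiliar with how a ‘deficiency of 13C relative to 12C’ is reported, the conventional delta notation is sketched below. It is standard isotope-geochemistry usage rather than anything specific to the studies mentioned above; the symbols R_sample and R_standard are simply labels for the measured 13C/12C ratio of the sample and of the reference standard.

```latex
\delta^{13}\mathrm{C} \;=\; \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{standard}}} - 1 \right) \times 1000 \ \text{‰},
\qquad R = \frac{{}^{13}\mathrm{C}}{{}^{12}\mathrm{C}}
```

Carbon that is depleted in 13C relative to the standard gives a negative value, which is why strongly negative carbon-isotope values have been read as a possible sign of biological processing, subject to the non-biological mechanisms listed above.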

Michael Wong and co-workers at the Carnegie Institution for Science in Washington, DC, USA together with other colleagues from the US, Austria, Canada, China, Belgium, Norway, Australia, the UK and France used artificial intelligence to wade through the results of geochemical analysis of over 400 ancient and modern carbon-bearing samples (Wong, M.I. and 28 others 2025. Organic geochemical evidence for life in Archean rocks identified by pyrolysis–GC–MS and supervised machine learning. Proceedings of the National Academy of Sciences, v. 122, article e2514534122; DOI: 10.1073/pnas.2514534122). Their objective was to track the presence of organically derived molecules as far back as possible. Their approach bears a passing resemblance to that used to build genomes of ancient fossils from broken bits of DNA that reside in them. Like DNA, bio-molecules degrade over time, but leave fragments in rocks that can be detected using pyrolysis gas chromatography and mass spectrometry (PGC-MS). In itself PGC-MS is not especially new, but using artificial intelligence (machine learning) on a massive data set certainly is: perhaps the first major trial of AI in geology.

Percentages of samples designated as biogenic by Wong et al’s AI analysis. Credit: Wong et al, Fig 4

Their samples were not just ancient rocks reaching back into the Archaean, as far as 3.5 Ga, but included modern biological material, meteorites presumed to have been devoid of life since their origin in pre-solar system times, and synthetic samples. Wong et al divided 272 samples with known biological affinities into 9 groups to train the AI algorithm. The analytical method breaks down organic and inorganic carbonaceous materials into fragments of molecules: the opposite of DNA sequencing. When subjected to PGC-MS each type of living organism, from bacteria to animals, produces a distinct pattern of molecular fragments. The AI analysis is based on a sophisticated statistical algorithm being trained to recognise ‘debris’ from organic and inorganic carbonaceous compounds according to each sample’s geochemical ‘fingerprint’. Part of the ‘training’ was based on sediments that contain irrefutable fossil samples from as far back in time as the Mesoproterozoic (1000 Ma). Another part was based on definitely inorganic materials, such as carbonaceous meteorites. The AI proved able to distinguish biological from inorganic material with a probability up to 0.9 (90%). These results suggested that older, more biologically uncertain material could be assessed.

The AI was able to distinguish general biogenic affinities from inorganic ones in samples with decreasing success going back in time: as high as 0.93 in the Phanerozoic to 0.47 in the Archaean. The oldest samples that reached the probability threshold for this distinction (0.6) were 3.3 Ga cherts from the Barberton Greenstone Belt in South Africa. Another distinction, between photosynthetic and non-photosynthetic affinities among the samples that ‘passed’ as probably biotic, reached the 0.6 probability threshold at 2.5 Ga for a sample from South Africa. Non-photosynthetic, but still probably biotic, samples extend as far back as 3.5 Ga in South African and Western Australian Greenstone Belts.
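The threshold logic described above can be illustrated with a minimal sketch. It is not the authors' code: the sample list and its probabilities are invented for illustration, the function name percent_biogenic is hypothetical, and treating a probability exactly equal to 0.6 as a ‘pass’ is an assumption. It simply shows how per-sample probabilities turn into the percentage of samples designated as biogenic in each eon.

```python
from collections import defaultdict

THRESHOLD = 0.6  # probability threshold quoted in the text

# Hypothetical (eon, predicted probability of biogenic origin) pairs;
# these values are made up and are NOT data from Wong et al.
samples = [
    ("Phanerozoic", 0.95),
    ("Phanerozoic", 0.88),
    ("Proterozoic", 0.72),
    ("Proterozoic", 0.41),
    ("Archaean", 0.63),
    ("Archaean", 0.35),
    ("Archaean", 0.52),
]


def percent_biogenic(samples, threshold=THRESHOLD):
    """Return the percentage of samples per eon at or above the threshold."""
    counts = defaultdict(lambda: [0, 0])  # eon -> [designated, total]
    for eon, probability in samples:
        counts[eon][1] += 1
        if probability >= threshold:  # assumption: equality counts as a pass
            counts[eon][0] += 1
    return {eon: 100.0 * hits / total for eon, (hits, total) in counts.items()}


result = percent_biogenic(samples)
for eon, percent in result.items():
    print(f"{eon}: {percent:.1f}% designated biogenic")
```

With a fixed threshold, older and more degraded samples whose probabilities hover near 0.5 mostly fall below the line, which is one way to picture the declining success rate going back in time.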

Wong et al’s preliminary exploration with their novel approach doesn’t take us beyond the current 3.4 to 3.5 Ga age for the earliest tangible suggestions of life. However, they note ‘…our sample inventory is notably lacking in ancient abiogenic samples’. This is a good indication of the promise for further progress that the approach offers. Previous research has sought intact biogenic molecules, with not a great deal of luck, over several decades. Their final conclusion is ‘…information-rich attributes of ancient organic matter, even though highly degraded and with few if any surviving biomolecules, have much to reveal about the nature and evolution of life.’ They have opened a very important avenue in palaeobiological research, as their methodology seems capable of fine tuning to all manner of pro- and eukaryote biochemical distinctions. It could even be used with extraterrestrial material, should we ever get any …

See also: Walsh, E. 2025. Researchers report earliest molecular evidence of photosynthetic life. Chemical & Engineering News, 18 November 2025.

A ‘worm’ revolution and ecological transition before the Cambrian explosion

Bioturbated ‘pipe rock’ of the basal Cambrian sandstones of NW Scotland. Credit: British Geological Survey photograph P531881

By about 530 Ma ago most of the basic body plans of today’s living organisms can be detected as fossils, i.e. preserved hard parts. Yet studies of trace fossils (ichnofossils), marks left in sediments by active soft-bodied creatures, suggest that many modern phyla arose before the start of the Cambrian (~539 Ma), as early as 545 Ma. So the term ‘Cambrian explosion’ seems to be a bit of a misnomer on two counts: it lasted around 15 Ma and began before the Cambrian. Preceding it was the Ediacaran Period that began around 100 Ma earlier in the Neoproterozoic Era. Traces of its eponymous fauna of large soft-bodied organisms are found on all continents, but apparently none of them made it into the Phanerozoic fossil record. Another characteristic of the Ediacaran is that its sedimentary rocks, and those of earlier times, show no signs of burrowing: they are not bioturbated. That may be why the Ediacaran pancake-, bun-, bag- and pen-like lifeforms are so remarkably well preserved. But a lack of burrowing did not extend to the beginning of Cambrian times. The most likely reason why burrowing was absent during the early Ediacaran Period is that sea-floor sediments then were devoid of oxygen so eukaryote animals could not live in them. But the presence of these large organisms showed that seawater must have been oxygenated. Now clear signs of burrowing have emerged from study of Ediacaran rocks exposed in the Yangtze Gorge of Hubei, southern China (Zhe Chen & Yarong Liu 2025. Advent of three-dimensional sediment exploration reveals Ediacaran-Cambrian ecosystem transition. Science Advances, v. 11, article eadx9449; DOI: 10.1126/sciadv.adx9449).

Tadpole-like trace fossils from the Ediacaran Dengying Formation in the Yangtze Gorge: 5 cm scale bars. The ‘heads’ show tiny depressions suggesting that their maker probed into the sediments as well as foraging horizontally. Credit: Zhe Chen & Yarong Liu; Figs 3B and 3D

Zhe Chen and Yarong Liu of the Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences, China examined carbonates of the upper Ediacaran Dengying Formation. This overlies the Doushantuo Formation (550 to 635 Ma), known for tiny fossils of possibly the oldest deuterostome Saccorhytus coronarius; a potential candidate for the ancestor of modern bilaterian phyla. In the Yangtze Gorge locality sediments at this level show only traces of browsing of bacterial mats on the sediment surface; i.e. 2-D feeders. The basal Dengying sediments host clear signs that organisms could then penetrate into the sediments. These 3-D feeders would have had access to buried organic remains, hitherto unexploited by living organisms. Such animal-sediment interactions would have disturbed and diminished the living microbial mats that held the sediment surface in place, and thus began to dismantle the substrate for the typical Ediacaran fauna. Similar 3-D feeders occur throughout the 11 Ma represented by the Dengying Formation to the start of the Cambrian. This beginning of bioturbation heralded a period during which the Ediacaran fauna steadily waned. It also released nutrients into deep water, and opened up new ecological niches for more advanced animals on the seabed. Dissolved oxygen could only slowly enter the sediments since atmospheric and oceanic O2 levels were low. But by the earliest Cambrian the atmospheric level had risen to about 5 to 10% by volume, enough to support many other kinds of burrowing animals that could penetrate more deeply, as witnessed by the bioturbated sandstones that occur at the base of the Cambrian in Britain.