Multicentric Language Markers of NeuroDegeneration

In the next decade, Europe will experience a significant demographic shift, with over 35% of its population surpassing the age of 65. This foreshadows a sharp rise in aging-related neurodegenerative diseases (NDs), the most prevalent of which include Alzheimer's disease (AD) and Parkinson's disease (PD). While these conditions share some underlying neuropathological features, they exhibit distinct cognitive and brain connectivity profiles. AD primarily affects temporo-parietal networks, leading to memory impairments, whereas PD disrupts fronto-striatal networks, impacting motor skills and executive functions. Each disease also involves distinct language difficulties: AD affects lexico-semantic skills, whereas PD impacts semantic control, action meaning processing, and morpho-syntactic skills.

Currently, diagnosing and monitoring NDs relies on lengthy, stressful cognitive tests and costly brain scans, which can be particularly challenging in regions with economic disparities. To overcome these limitations, MULTI-LAND aims to implement a ground-breaking approach focused on natural language markers (NLMs): linguistic features extracted from patients’ speech and analysed automatically with machine learning to identify ND conditions. While NLMs have shown promise in detecting mental health conditions, their application to NDs remains limited. Additionally, no study has yet (i) examined the link between NLMs and disease-specific brain network disruptions, (ii) evaluated their reliability compared to cognitive tests, or (iii) tested their diagnostic potential across multiple research centres and countries.

The primary research objective of MULTI-LAND is to conduct a comprehensive, cross-methodological (behavioural, MRI/fMRI, EEG) and multi-centric (i.e. the BCBL in Europe and the CNC-UdeSA in Argentina) validation of NLMs. The overarching hypothesis is that NLMs will robustly discriminate patients from healthy controls, while unveiling disease-specific neurocognitive patterns, offering a cost-effective, scalable, and remotely applicable approach for ND detection and monitoring.

The major accomplishment of this project is the successful identification of NLMs that distinguish AD and PD patients from healthy controls at the level of individual subjects, in probabilistic terms. Notably, these markers generalize across both Latin-American and European samples, highlighting their robustness and cross-country validity.
All participants completed verbal fluency tasks, which involved generating words within specific categories. Unlike traditional methods that simply count total valid responses, we employed a novel approach, extracting multiple psycholinguistic properties (e.g. frequency, granularity, phonological neighbourhood, word length, familiarity, and imageability) from each spoken word. These features were then fed into machine learning models to evaluate whether they could discriminate patients from healthy controls.
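As a rough illustration of the feature-extraction step, the sketch below averages word-level properties over one participant's fluency responses. It is not the project's pipeline: the `LEXICON` values, the property names, the coding of granularity as a specificity level, and the choice of per-participant means are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical mini-lexicon (all values invented for illustration).
# "granularity" is coded here as an assumed specificity level:
# 1 = general term, 3 = highly specific term.
LEXICON = {
    "animal": {"log_frequency": 5.0, "granularity": 1, "imageability": 6.0},
    "dog":    {"log_frequency": 4.8, "granularity": 2, "imageability": 6.5},
    "cat":    {"log_frequency": 4.7, "granularity": 2, "imageability": 6.4},
    "poodle": {"log_frequency": 2.5, "granularity": 3, "imageability": 6.1},
    "lynx":   {"log_frequency": 2.1, "granularity": 3, "imageability": 5.8},
}


def fluency_features(responses, lexicon=LEXICON):
    """Summarise one participant's responses as mean word properties."""
    # Keep only responses that have an entry in the lexicon.
    known = [w.lower() for w in responses if w.lower() in lexicon]
    if not known:
        raise ValueError("no response found in the lexicon")
    # Property names are taken from the first lexicon entry.
    props = sorted(next(iter(lexicon.values())))
    features = {f"mean_{p}": mean(lexicon[w][p] for w in known) for p in props}
    # Word length needs no norms; it is computed from the word itself.
    features["mean_word_length"] = mean(len(w) for w in known)
    features["n_valid"] = len(known)
    return features


if __name__ == "__main__":
    print(fluency_features(["dog", "cat", "animal"]))
    print(fluency_features(["poodle", "lynx", "xyz"]))
```

Each participant then becomes one row of such features, which is the kind of input a classifier could be trained on; the traditional total-count score survives here only as `n_valid`.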

Our analysis revealed distinct linguistic profiles for AD and PD. In AD, NLMs achieved a strong classification performance (AUC = 0.9) with key features including word frequency, granularity, and phonological neighbourhood. AD patients tended to produce high-frequency, conceptually imprecise words (e.g. “flower” instead of “rose”) with similar phoneme sequences. These language markers were also linked to atrophy in temporal regions, reduced fMRI connectivity in the default-mode network, and EEG hypoconnectivity in temporo-parietal regions within the beta band (15–30 Hz).
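To make the AUC figures easier to read, the sketch below computes an AUC as the share of patient-control pairs in which the patient receives the higher classifier score. The labels and scores are toy values invented for the example, not study data, and `roc_auc` is a hypothetical helper written for this illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random patient outscores a random control.

    labels: 1 for patient, 0 for control; scores: classifier outputs.
    Tied pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: five patients followed by five controls.
    labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.35, 0.5, 0.4, 0.3, 0.2, 0.1]
    print(roc_auc(labels, scores))  # 23 of 25 pairs ranked correctly -> 0.92
```

Read this way, an AUC of 0.9 means that in roughly nine out of ten patient-control pairs the model assigns the higher score to the patient, while 0.5 would correspond to chance.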

For PD, a good classification performance was also observed (AUC = 0.84) with this group showing a different linguistic pattern. PD patients favoured concrete words (e.g. “piano” over “symphony”), and produced semantically closer, less varied concepts. This pattern correlated with impaired inhibition, as measured by Hayling test scores, suggesting difficulties in suppressing dominant concepts to shift to new categories (e.g. switching from “domestic animals” to “wild animals”). Additionally, the preference for concrete words was negatively correlated with cognitive status (MoCA scores), suggesting that greater cognitive impairment leads to reliance on more accessible sub-domains of semantic memory. Similarly, NLMs were correlated with aberrant connectivity in fMRI sensorimotor and salience networks, both of which are commonly disrupted in PD patients.
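One way to quantify "semantically closer, less varied concepts" is the mean distance between consecutive responses in a word-vector space. The source does not state how semantic closeness was measured, so the cosine distance, the `TOY_VECTORS` values, and the function names below are assumptions for illustration only.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration.
# A real analysis would use vectors from a trained embedding model.
TOY_VECTORS = {
    "dog":   (0.9, 0.1, 0.0),
    "cat":   (0.8, 0.2, 0.0),
    "cow":   (0.7, 0.3, 0.1),
    "lion":  (0.1, 0.9, 0.1),
    "shark": (0.0, 0.2, 0.9),
}


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def mean_consecutive_distance(responses, vectors=TOY_VECTORS):
    """Average semantic distance between each response and the next one."""
    known = [w for w in responses if w in vectors]
    if len(known) < 2:
        raise ValueError("need at least two responses with known vectors")
    dists = [
        cosine_distance(vectors[a], vectors[b])
        for a, b in zip(known, known[1:])
    ]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    print(mean_consecutive_distance(["dog", "cat", "cow"]))     # stays in one sub-category
    print(mean_consecutive_distance(["dog", "lion", "shark"]))  # shifts between sub-categories
```

Under these assumptions, a lower value corresponds to a participant who stays within one sub-category, such as domestic animals, rather than switching to a new one.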
Finally, the language markers identified in the Latin American Spanish-speaking sample generalized well to the European Spanish-speaking sample for both AD (AUC = 0.9) and PD (AUC = 0.81) patients. This underscores the potential of NLMs as reliable tools for the detection and monitoring of neurodegenerative diseases across diverse linguistic and cultural contexts.

MULTI-LAND advances the current state-of-the-art by identifying disease-specific language markers uniquely tailored to Alzheimer’s and Parkinson’s diseases. Our findings reveal distinct word properties that characterize these disorders, capturing cognitive and brain-specific signatures. Crucially, our approach is objective, scalable, and cost-effective, making it an ideal candidate for integration into user-friendly clinical tools. In this sense, the outputs of this project have contributed to the development of the “Toolkit to Examine Lifelike Language” (TELL), a mobile app designed to capture naturalistic speech markers of neurodegeneration. TELL leverages machine learning algorithms to record, pre-process, and digitize speech into linguistic and acoustic features, offering clinicians and researchers a novel and accessible tool for the detection and monitoring of neurocognitive decline.

The socio-economic and societal impact of this work is thus considerable. Integrating language markers into routine clinical assessments offers a scalable solution for screening larger populations and enhancing the detection of neurodegenerative conditions, enabling timely interventions and personalized care. Furthermore, the affordability of this approach can help reduce the financial burden on healthcare systems, particularly in underserved regions, making diagnostic tools more widely accessible.

Periodic Reporting for period 2 - MULTI-LAND (Multicentric Language Markers of NeuroDegeneration)
