Periodic Reporting for period 2 - MULTI-LAND (Multicentric Language Markers of NeuroDegeneration)
Reporting period: 2023-09-01 to 2024-08-31
Currently, diagnosing and monitoring neurodegenerative diseases (NDs) relies on lengthy, stressful cognitive tests and costly brain scans, which can be particularly challenging in regions with economic disparities. To overcome these limitations, MULTI-LAND aims to implement a ground-breaking approach focused on natural language markers (NLMs): linguistic features extracted from patients’ speech and analysed automatically with machine learning to identify ND conditions. While NLMs have shown promise in detecting mental health conditions, their application to NDs has been limited. Additionally, no study has yet (i) examined the link between NLMs and disease-specific brain network disruptions, (ii) evaluated their reliability compared to cognitive tests, or (iii) tested their diagnostic potential across multiple research centres and countries.
The primary research objective of MULTI-LAND is to conduct a comprehensive, cross-methodological (behavioural, MRI/fMRI, EEG) and multi-centric (i.e. the BCBL in Europe and the CNC-UdeSA in Argentina) validation of NLMs. The overarching hypothesis is that NLMs will robustly discriminate patients from healthy controls, while unveiling disease-specific neurocognitive patterns, offering a cost-effective, scalable, and remotely applicable approach for ND detection and monitoring.
All participants completed verbal fluency tasks, which involved generating words within specific categories. Unlike traditional methods that simply count total valid responses, we employed a novel approach, extracting multiple psycholinguistic properties (e.g. frequency, granularity, phonological neighbourhood, word length, familiarity, and imageability) from each spoken word. These features were then fed into machine learning models to evaluate whether they could discriminate patients from healthy controls.
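As a rough illustration of the feature-extraction step described above, the sketch below averages word-level properties over one participant's fluency responses. It is a minimal sketch under stated assumptions, not the project's pipeline: the `LEXICON` table, its property names, and all numeric values are hypothetical placeholders, and real norms would come from published psycholinguistic databases.

```python
from statistics import mean

# Hypothetical norms table: word -> psycholinguistic properties.
# All values are made up for illustration only.
LEXICON = {
    "dog":  {"log_frequency": 4.9, "length": 3, "imageability": 6.4},
    "cat":  {"log_frequency": 4.8, "length": 3, "imageability": 6.5},
    "lynx": {"log_frequency": 2.1, "length": 4, "imageability": 5.2},
}

def participant_features(words, lexicon=LEXICON):
    """Average each property over the responses found in the lexicon.

    Words missing from the lexicon are skipped; an empty dict is
    returned when no response can be scored.
    """
    rows = [lexicon[w] for w in words if w in lexicon]
    if not rows:
        return {}
    return {prop: mean(row[prop] for row in rows) for prop in rows[0]}

# One participant's responses to an "animals" fluency task.
feats = participant_features(["dog", "cat", "zzz"])
```

Each participant is thereby reduced to one fixed-length feature vector, which is the form a classifier needs regardless of how many words the participant produced.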
Our analysis revealed distinct linguistic profiles for Alzheimer’s disease (AD) and Parkinson’s disease (PD). In AD, NLMs achieved strong classification performance (AUC = 0.9), with key features including word frequency, granularity, and phonological neighbourhood. AD patients tended to produce high-frequency, conceptually imprecise words (e.g. “flower” instead of “rose”) with similar phoneme sequences. These language markers were also linked to atrophy in temporal regions, reduced fMRI connectivity in the default-mode network, and EEG hypoconnectivity in temporo-parietal regions within the beta band (15–30 Hz).
For PD, classification performance was also good (AUC = 0.84), and this group showed a different linguistic pattern. PD patients favoured concrete words (e.g. “piano” over “symphony”) and produced semantically closer, less varied concepts. This pattern correlated with impaired inhibition, as measured by Hayling test scores, suggesting difficulties in suppressing dominant concepts to shift to new categories (e.g. switching from “domestic animals” to “wild animals”). Additionally, the preference for concrete words was negatively correlated with cognitive status (MoCA scores), suggesting that greater cognitive impairment leads to reliance on more accessible sub-domains of semantic memory. NLMs were also correlated with aberrant fMRI connectivity in the sensorimotor and salience networks, both of which are commonly disrupted in PD patients.
Finally, the language markers identified in the Latin American Spanish-speaking sample generalised well to the European Spanish-speaking sample for both AD (AUC = 0.9) and PD (AUC = 0.81) patients. This underscores the potential of NLMs as reliable tools for the detection and monitoring of neurodegenerative diseases across diverse linguistic and cultural contexts.
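To make the cross-sample evaluation concrete, the sketch below fits a model on one cohort and reports the AUC on a second, held-out cohort. It is only an illustration under stated assumptions: the nearest-centroid scorer stands in for whatever machine learning models the project used, the two features and every number are invented toy data, and the function names are hypothetical. The AUC is computed as the probability that a randomly chosen patient receives a higher score than a randomly chosen control.

```python
from statistics import mean

def fit_centroids(X, y):
    """Mean feature vector per class (0 = control, 1 = patient)."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def score(x, centroids):
    """Higher when x lies closer to the patient centroid than to the control one."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sq_dist(x, centroids[0]) - sq_dist(x, centroids[1])

def auc(patient_scores, control_scores):
    """Fraction of patient/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))

# Toy, made-up data: [mean log frequency, mean granularity] per participant.
train_X = [[4.8, 1.0], [4.6, 1.2], [3.9, 2.1], [4.0, 2.3]]  # cohort A
train_y = [1, 1, 0, 0]
test_X = [[4.7, 1.1], [4.1, 2.0], [4.5, 1.4], [3.8, 2.2]]   # cohort B
test_y = [1, 0, 1, 0]

centroids = fit_centroids(train_X, train_y)   # fit on cohort A only
scores = [score(x, centroids) for x in test_X]
held_out_auc = auc(
    [s for s, lab in zip(scores, test_y) if lab == 1],
    [s for s, lab in zip(scores, test_y) if lab == 0],
)
```

The key design point is that nothing from cohort B is used during fitting, so the held-out AUC reflects how well markers learned in one sample transfer to the other. An AUC of 0.5 corresponds to chance and 1.0 to perfect separation.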