DSpace Repository

Streamlining NMR Chemical Shift Predictions for Intrinsically Disordered Proteins: Design of Ensembles with Dimensionality Reduction and Clustering

dc.rights.license CC BY eng
dc.contributor.author Bakker, Michael J. cze
dc.contributor.author Gaffour, Amina cze
dc.contributor.author Juhás, Martin cze
dc.contributor.author Zapletal, Vojtech cze
dc.contributor.author Stosek, Jakub cze
dc.contributor.author Bratholm, Lars A. cze
dc.contributor.author Precechtelova, Jana Pavlikova cze
dc.date.accessioned 2025-12-05T14:37:58Z
dc.date.available 2025-12-05T14:37:58Z
dc.date.issued 2024 eng
dc.identifier.issn 1549-9596 eng
dc.identifier.uri http://hdl.handle.net/20.500.12603/2166
dc.description.abstract By merging advanced dimensionality reduction (DR) and clustering algorithm (CA) techniques, our study advances the sampling procedure for predicting NMR chemical shifts (CS) in intrinsically disordered proteins (IDPs), making a significant leap forward in the field of protein analysis/modeling. We enhance NMR CS sampling by generating clustered ensembles that accurately reflect the different properties and phenomena encapsulated by the IDP trajectories. This investigation critically assessed different rapid CS predictors, both neural network (e.g., Sparta+ and ShiftX2) and database-driven (ProCS-15), and highlighted the need for more advanced quantum calculations and the subsequent need for more tractable-sized conformational ensembles. Although neural network CS predictors outperformed ProCS-15 for all atoms, all tools showed poor agreement with H-N CSs, and the neural network CS predictors were unable to capture the influence of phosphorylated residues, highly relevant for IDPs. This study also addressed the limitations of using direct clustering with collective variables, such as the widespread implementation of the GROMOS algorithm. Clustered ensembles (CEs) produced by this algorithm showed poor performance with chemical shifts compared to sequential ensembles (SEs) of similar size. Instead, we implement a multiscale DR and CA approach and explore the challenges and limitations of applying these algorithms to obtain more robust and tractable CEs. The novel feature of this investigation is the use of solvent-accessible surface area (SASA) as one of the fingerprints for DR alongside previously investigated alpha carbon distance/angles or phi/psi dihedral angles. The ensembles produced with SASA tSNE DR produced CEs better aligned with the experimental CS of between 0.17 and 0.36 r² (0.18-0.26 ppm) depending on the system and replicate. Furthermore, this technique produced CEs with better agreement than traditional SEs in 85.7% of all ensemble sizes. This study investigates the quality of ensembles produced based on different input features, comparing latent spaces produced by linear vs nonlinear DR techniques and a novel integrated silhouette score scanning protocol for tSNE DR. eng
dc.format p. 6542-6556 eng
dc.language.iso eng eng
dc.publisher AMER CHEMICAL SOC eng
dc.relation.ispartof JOURNAL OF CHEMICAL INFORMATION AND MODELING, volume 64, issue: 16 eng
dc.subject molecular-dynamics simulations eng
dc.subject gaussian-type basis eng
dc.subject orbital methods eng
dc.subject force-field eng
dc.subject tyrosine-hydroxylase eng
dc.subject functional theory eng
dc.subject basis-sets eng
dc.subject binding eng
dc.subject phosphorylation eng
dc.subject accuracy eng
dc.title Streamlining NMR Chemical Shift Predictions for Intrinsically Disordered Proteins: Design of Ensembles with Dimensionality Reduction and Clustering eng
dc.type article eng
dc.identifier.obd 43881267 eng
dc.identifier.wos 001284734600001 eng
dc.identifier.doi 10.1021/acs.jcim.4c00809 eng
dc.publicationstatus postprint eng
dc.peerreviewed yes eng
dc.source.url https://pubs.acs.org/doi/10.1021/acs.jcim.4c00809 cze
dc.relation.publisherversion https://pubs.acs.org/doi/10.1021/acs.jcim.4c00809 eng
dc.rights.access Open Access eng

