Workshop JUCC 2024

Bioinformatics

Date: December 17, 14:00 to 17:00
Venue: Facultad de Ingeniería (5th floor), Julio Herrera y Reissig 565, Montevideo.
Description: This workshop brings together distinguished international and national researchers, who will present their research in the area of Bioinformatics.

Agenda


Time: 14:00 - 14:30

Idoia Ochoa, Department of Electrical and Electronic Engineering, Tecnun, Universidad de Navarra, Spain. Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Universidad de Navarra, Spain.

Title: An interpretable and adaptive autoencoder for efficient tissue deconvolution

Abstract: Deconvolution models are a powerful tool for extracting cell type-specific information from bulk gene expression profiles. Current methods leverage advanced machine learning models and high-resolution sequencing, such as single-cell RNA sequencing (scRNA-seq), showing promising results across diverse tissues and conditions. However, they still present important limitations. First, many depend on selecting a robust reference, which can strongly affect the deconvolution. Second, pseudobulk data used for training and real bulk RNA-seq samples often exhibit strong distribution shifts, which are currently unaccounted for. Finally, most deconvolution approaches behave as black boxes, which can compromise the reliability of the results.

In this talk, we will present Sweetwater, an adaptive and interpretable autoencoder that efficiently deconvolves bulk samples leveraging multiple classes of reference data. We propose an improved way of generating training data from a mixture of FACS-sorted FASTQ files, reducing platform-specific biases and outperforming current single-cell-based references. Furthermore, we introduce a gold standard dataset to facilitate fair and accurate evaluation of deconvolution approaches. Finally, we demonstrate that Sweetwater adapts effectively to deconvolved samples during training, uncovering biologically meaningful patterns and enhancing the reliability of the results. Sweetwater is publicly available, and we anticipate it will expedite the accurate examination of high-throughput clinical data across diverse applications.


Time: 14:30 - 15:00
Mikel Hernaez, CIMA, Cancer Center Universidad de Navarra, Spain. Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Universidad de Navarra, Spain.

Title: Towards a more inductive world for drug repurposing approaches

Abstract: Drug-target interaction (DTI) prediction is a challenging, albeit essential, task in drug repurposing. Graph-based learning models have drawn special attention, as they can significantly reduce the cost and time of drug repurposing. However, structural differences in the learning architecture of current models hinder their fair benchmarking. In this work, we first perform an in-depth evaluation of current DTI datasets and prediction models through a robust benchmarking process, and show that DTI methods based on transductive models lack generalization and lead to inflated performance, making them unsuitable for drug repurposing. We then propose a biologically driven strategy for negative edge subsampling and uncover previously unknown interactions via in vitro and SFP validation that traditional approaches would otherwise miss, opening the door to accelerated, personalized drug repurposing.


Time: 15:00 - 15:30
José Sotelo and Pablo Smircich, Bioinformatics Laboratory, Department of Genomics, Instituto de Investigaciones Biológicas Clemente Estable, Uruguay. Functional Genomics Section, Facultad de Ciencias, Universidad de la República, Uruguay.

Title: Single cell RNA-seq reveals trans-sialidase like superfamily gene expression heterogeneity in Trypanosoma cruzi populations.

Abstract: Trypanosoma cruzi, the causative agent of Chagas disease, presents a major public health challenge in Central and South America, affecting approximately 10 million people and placing millions more at risk. The T. cruzi life cycle includes transitions between epimastigote, metacyclic trypomastigote, amastigote, and blood trypomastigote stages, each marked by distinct morphological and molecular adaptations to different hosts and environments. Unlike other trypanosomatids, T. cruzi does not employ antigenic variation but instead relies on a diverse array of cell-surface-associated proteins encoded by large multigene families, essential for infectivity and immune evasion. In this study, we analyze cell-specific transcriptomes using single-cell RNA sequencing of amastigote and trypomastigote cells to characterize stage-specific surface protein expression during mammalian infection. Through clustering and identification of cell-specific markers, we assigned cells to distinct parasite forms. Analysis of individual cells revealed that surface protein-coding genes, especially members of the TcS superfamily, are expressed with greater heterogeneity than single-copy genes. Additionally, no recurrent combinations of TcS genes were observed within the cell population. Our findings thus highlight transcriptomic heterogeneity within trypomastigote populations and reveal unique cell surface expression profiles. By focusing on the diversity of surface protein gene expression, this research aims to deepen our understanding of T. cruzi’s cellular biology and infection strategies.


Coffee Break

(15:30 - 16:00)


Time: 16:00 - 16:30
Jan Voges, Centro de Investigación Médica Aplicada, Universidad de Navarra, Spain. Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Germany.

Title: High-Throughput Technologies in DNA Data Storage, Precision Medicine for Depression, and Metabolic Profiling.

Abstract: This talk explores three advanced applications of high-throughput technologies: DNA data storage, precision medicine for depression, and metabolic profiling. First, I will present the PEARL-DNA project, which is developing a modular end-to-end platform for data storage using block-by-block synthesized DNA. In the area of precision medicine for depression, I will discuss the P4D project, which leverages large-scale patient data and advanced sequencing to improve diagnostics and treatment strategies. Lastly, I will introduce the concept of metabolic profiling, focusing on the prediction of metabolic activity at the single-cell level using scRNA-seq perturbation data. This will serve as a primer for the talk by Guillermo Dufort y Álvarez, who will present the topic in full detail.


Time: 16:30 - 17:00
Guillermo Dufort y Álvarez, Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Uruguay.

Title: Exploring cellular metabolism via single-cell RNA data and neural networks.

Abstract: Metabolism is fundamental to cellular function and is closely linked to the outcomes of treatments for diseases such as cancer. In this talk, we will present a tool for predicting metabolic variation at the single-cell level in response to genetic perturbations. We will cover methodologies such as graph neural networks (GNNs) to model gene expression variation from perturbation data and linear programming methods to analyze metabolic activity in tissues. Our tool integrates concepts from both approaches to predict changes in metabolic activity when specific genes are perturbed, providing new insights into the relationship between gene function and metabolism.


Speakers


Dr. Idoia Ochoa graduated with B.Sc. and M.Sc. degrees in Telecommunication Engineering (Electrical Engineering) from the University of Navarra (Tecnun), Spain, in 2009. She then went to Stanford, where she obtained an MS and a PhD in the Electrical Engineering Department, in 2012 and 2016, respectively. During her time at Stanford, Dr. Ochoa completed internships as a software engineer at Google, CA and Genapsys, CA. She also served as a technical consultant for the HBO show “Silicon Valley”. After obtaining her PhD, Dr. Ochoa joined the faculty of the Electrical and Computer Engineering department at the University of Illinois at Urbana-Champaign (UIUC) as an assistant professor in January 2017. At UIUC she led her own research group, taught several undergraduate and graduate courses, and served on several committees. After three years at UIUC, Dr. Ochoa joined the faculty at Tecnun, University of Navarra (Spain), as a collaborator professor in January 2020.

Her research interests include computational biology, machine learning, data compression, bioinformatics, information theory and coding, and signal processing. Her research focuses on the development of computational methods tailored to omics data, to aid the storage, handling, and analysis of these data. She has developed several compression algorithms for genomic, methylation and mass spectrometry data that are currently state-of-the-art. Regarding the analysis of omics data, Dr. Ochoa has developed a myriad of computational tools that can advance personalized medicine, most focused on single-cell data.

Dr. Ochoa’s graduate studies were funded by a Stanford Graduate Fellowship and a La Caixa Graduate Fellowship, and she received the MIT Innovators under 35 award (2019) in recognition of her innovative research and its potential.


Dr. Mikel Hernaez has had highly interdisciplinary research training in recent years. He trained in Information Theory during his PhD (2009-2012) and in Computational Biology during his postdoc at Stanford University (2013-2016, funded by a Stanford Data Science Initiative fellowship). He then served as Director of Computational Genomics at the Carl R. Woese Institute for Genomic Biology (IGB) at the University of Illinois (UIUC), USA, where he gained ample experience working on biology-centered interdisciplinary projects. In 2020 he moved back to Spain to lead the Computational Biology Program at the Center for Applied Medical Research (CIMA) of the University of Navarra. Among other distinctions, Dr. Hernaez was awarded the prestigious Marie S. Curie Fellowship from the EU (2020) and the Ramon y Cajal Fellowship, Spain (2023, equivalent to a National Career Award) to develop machine learning models to study the regulatory dynamics of cancer progression and drug resistance, both at bulk and single-cell level. As an example, and further funded by the US DoD and the Spanish Ministry of Science and Innovation, his lab, in collaboration with Mayo Clinic, has recently developed several methods to elucidate mechanistic alterations in metastatic cells associated with drug response in metastatic prostate cancer. Regarding drug resistance, Dr. Hernaez’s lab has developed novel drug repurposing models via Graph Neural Networks, and uncovered potential new drugs and drug targets for pancreatic cancer. Dr. Hernaez is also actively engaged in an ongoing project funded by the US NIH on foundational models for transcriptomic data deconvolution, with the goal of bringing single-cell resolution to clinical practice. Finally, Dr. Hernaez has proposed several compression methods for different types of genomic information, some of which have been adopted by the International Organization for Standardization (ISO), where he co-led the new ISO standard for genomic information representation, for which he received the ISO Excellence Award in 2023.


Dr. Pablo Smircich has worked in the field of parasitology since the beginning of his research career, focusing particularly on the study of trypanosomatids and other parasites of public health importance in the region. His research centers mainly on the molecular and genomic aspects of parasites, and he has contributed to the understanding of gene expression regulation and RNA biology in these organisms. He currently conducts his research as head of the Bioinformatics Laboratory of the Department of Genomics at IIBCE and in the Functional Genomics Section of the Facultad de Ciencias, Universidad de la República, Uruguay (UDELAR, Montevideo).


Dr. Jan Voges studied electrical engineering at Leibniz University Hannover, specializing in communications engineering. He received his Dipl.-Ing. degree from Leibniz University Hannover in 2015. After graduation, he joined the Institute for Information Processing at Leibniz University Hannover as a Research Assistant, where he pioneered the compression and standardization of genomic data. In 2018, he was a Visiting Scientist at the Carl R. Woese Institute for Genomic Biology at the University of Illinois Urbana-Champaign, where he worked on a project to automate and accelerate the analysis of DNA sequencing data at the Mayo Clinic. In 2022, he received his Dr.-Ing. degree (passed with distinction) from Leibniz University Hannover for his work on Compression of DNA Sequencing Data. From 2022 to 2024, he was Research Group Leader of the Genomic Data Science group (3 PhD students) at the Institute for Information Processing and L3S at Leibniz University Hannover, where his research focused on DNA sequencing data compression, DNA data storage, high-throughput processing of nanopore sequencing data, and 3D genome reconstruction. Since 2024, he has been Senior Research Scientist at CIMA University of Navarra and Instituto de Investigación Sanitaria de Navarra, with an additional focus on multi-modal modeling of cellular metabolic processes through perturbation studies. Since 2024, he has also held a position as Visiting Scientist at the Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School.

In general, his research focuses on computational biology, machine learning and bioinformatics. More specifically, it is organized into the domains “Data Coding”, “Bioinformatics”, and “Computational Biology”. In the data coding domain, his research focuses on DNA sequencing data compression and DNA data storage. In the bioinformatics domain, his research focuses on high-throughput processing of nanopore sequencing data. In the computational biology domain, he focuses on 3D genome reconstruction as well as multi-modal modeling of cellular metabolic processes through perturbation studies. He has a special interest in international standardization, proposal writing, project management, IP generation and software engineering.


Dr. Guillermo Dufort y Álvarez was born in Montevideo, Uruguay. He received his Computer Engineering degree and his PhD in Computer Science from the Universidad de la República, Montevideo, Uruguay, in 2016 and 2022, respectively. Since 2015, he has been affiliated with the Instituto de Computación of the Universidad de la República. He currently works as a postdoctoral researcher at TECNUN, Universidad de Navarra, Spain. His research interests include bioinformatics, data compression, and statistical modeling.