Detection and quantification of RNA profiles and RNA modifications from Nanopore direct RNA sequencing data in human disease
Keywords: direct RNA sequencing, single-molecule sequencing, RNA modification, human cancer, SARS-CoV-2
Background: Nanopore sequencing has made it possible, for the first time, to sequence native, full-length RNA molecules directly, without reverse transcription or amplification. A single technique thus combines RNA quantification with sequence-specific detection of RNA modifications [Workman et al., Nature Methods (2019)]. This makes Nanopore sequencing a very promising tool for profiling transcriptomes at an unprecedented level of detail. To exploit the full potential of Nanopore direct RNA sequencing, the scientific community needs robust analytical strategies and computational tools to extract information from the sequencing data and address biologically relevant questions.
Aim: This project aims to design, implement and apply algorithms and computational methods for the de novo identification of RNA modifications from Nanopore direct RNA sequencing data.
Activities: The primary objective is to develop algorithms and data-analysis pipelines for interpreting RNA sequencing profiles and identifying RNA modifications from direct RNA sequencing data. Such methods will provide a valuable tool for studying the largely unexplored landscape of RNAs and their modifications in a broad range of contexts, with high potential for discovering novel, biologically relevant mechanisms. In parallel with the development phase, the fellow will also take part in applied projects in which the methods developed will be used to address open questions in genomics. Based on the research lines currently active at the Center, possible secondary projects are: i) characterization of the expression and RNA modification profiles of complex transcriptional units, such as microRNAs or long non-coding RNAs, in breast cancer; ii) detection and quantification of endogenous retroelements, such as LINEs and ERVs, in health and disease; iii) characterization of viral RNAs (e.g. transcript identification, RNA modification profiling) produced by the SARS-CoV-2 virus during the infection life cycle.
Methodology: The fellow will learn and apply the principles of open science and open-source software development. During the project, they will develop software packages and analysis pipelines in languages such as Python, R or Bash, and will implement algorithms based on machine learning and/or statistical modelling. The successful applicant will also gain familiarity with modern best practices in computational biology, such as thorough unit testing, continuous testing and integration, and extensive use of containerization for dependency management and full reproducibility.