Master's & PhD Thesis Opportunity
Uncovering Pioneer and Non-Pioneer TF Interactions via Deep Learning on Targeted Perturbation Data
Applications should be made to office-sasse@zmbh.uni-heidelberg.de
Please feel free to contact the PIs at their above email addresses with any enquiries.
We are offering an interdisciplinary thesis project that aims to decode how pioneer and non-pioneer transcription factors (TFs) shape chromatin accessibility and gene expression. The project leverages the Grand Lab’s established degron cell lines to perform rapid, controlled TF degradation, capturing primary regulatory effects at unprecedented temporal resolution. The resulting ATAC-seq, ChIP-seq, and RNA-seq datasets will be integrated in the Sasse Lab using advanced sequence-to-function deep learning approaches to predict differential chromatin accessibility, TF binding, and transcriptional output directly from DNA sequence.
Required Skills / Profile
- Strong interest in gene regulation, chromatin biology, and transcription factor function
- Experience with Python and machine learning; familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow, JAX) is highly beneficial
- Basic understanding of genomics or NGS data (ATAC-seq, ChIP-seq, or RNA-seq); prior experience with bioinformatics is a plus
- Ability to work independently, collaborate across computational and experimental groups, and communicate technical concepts clearly
Summary
Transcription factors (TFs) and chromatin regulators are central to establishing and maintaining cellular identity by controlling gene expression through complex regulatory networks (1). Pioneer TFs—such as BANP, GABPA, and ZFP143—are distinguished by their ability to remodel chromatin and open previously inaccessible regions, thereby enabling the binding of additional TFs (2). In contrast, factors like YY1 and MYC lack pioneering activity. Our previous degron-based perturbation experiments show that pioneering activity is locus-specific and cannot be fully compensated by other pioneer TFs. Moreover, TF degradation reveals a striking diversity of regulatory outcomes: while most target loci lose accessibility and transcription, others exhibit opposite or even anticorrelated effects. These observations underscore that the interplay between sequence context, motif composition, and chromatin state in shaping transcriptional outcomes remains poorly understood.
To clarify the underlying regulatory sequence syntax, this project will integrate existing ATAC-seq, ChIP-seq, and RNA-seq datasets from controlled degron cell lines to build a sequence-to-function (S2F) deep learning model (3,4). This model will predict differential chromatin accessibility, TF binding, and gene expression between control and perturbed conditions using sequence features alone. Leveraging rapid, inducible TF depletion will enable the model to learn primary regulatory effects that are difficult to capture in steady-state systems. Together, these controlled genome-wide perturbations and computational approaches will allow systematic exploration of the combinatorial rules that govern pioneer and non-pioneer TF activity across diverse sequence contexts.
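As a rough illustration of the kind of input and training target such a model might use, the sketch below one-hot encodes a DNA sequence and computes a log2 fold change in read counts between control and TF-depleted conditions. The function names, the pseudocount, and the example counts are assumptions made for illustration only; the project's actual encoding and targets are not specified here.

```python
import math

BASES = "ACGT"  # channel order used for the one-hot encoding (an assumption)


def one_hot(seq):
    """Encode a DNA string as rows of four floats (A, C, G, T).

    Any other symbol, such as N, becomes an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def log2_fold_change(control, perturbed, pseudocount=1.0):
    """Log2 ratio of perturbed over control counts.

    Negative values mean the signal drops after TF depletion.
    The pseudocount avoids division by zero for empty regions.
    """
    return math.log2((perturbed + pseudocount) / (control + pseudocount))


# Hypothetical example: one short region and made-up ATAC-seq read counts.
features = one_hot("ACGTN")
target = log2_fold_change(control=120, perturbed=30)
print(len(features), "positions encoded; target log2FC =", round(target, 3))
```

In a setup like this, the model would learn a mapping from the encoded sequence to the fold change, so that regions losing accessibility after depletion receive negative targets.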
The trained model will be used to test mechanistic hypotheses about regulatory sequence grammar and to design synthetic promoter elements for experimental validation. Using explainable AI methods, we will identify motifs and sequence features that enhance or inhibit TF-dependent chromatin remodeling (5). In silico perturbations will allow us to evaluate how changes in motif arrangement, orientation, or co-occurrence influence chromatin accessibility and transcription. By clustering these results, we aim to derive generalizable principles governing pioneer activity and TF cooperativity.
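The in silico perturbation idea can be sketched with a simple single-base mutagenesis scan. The example below is a minimal illustration under stated assumptions: `toy_model` is a hypothetical stand-in for a trained sequence-to-function model (it only counts matches to an arbitrary placeholder motif), and the sequence and motif strings are invented for the example rather than taken from the project.

```python
BASES = "ACGT"


def toy_model(seq, motif="ACGTAC"):
    """Hypothetical stand-in for a trained model.

    Returns the number of (possibly overlapping) exact matches to a
    placeholder motif, as a float.
    """
    width = len(motif)
    return float(sum(seq[i:i + width] == motif
                     for i in range(len(seq) - width + 1)))


def in_silico_mutagenesis(seq, model):
    """Score every single-base substitution relative to the reference.

    Returns one dict per position, mapping each alternative base to
    model(mutant) - model(reference).
    """
    reference_score = model(seq)
    effects = []
    for i, ref_base in enumerate(seq):
        row = {}
        for alt in BASES:
            if alt == ref_base:
                continue  # skip the unchanged base
            mutant = seq[:i] + alt + seq[i + 1:]
            row[alt] = model(mutant) - reference_score
        effects.append(row)
    return effects


# Hypothetical sequence with the placeholder motif at positions 2 to 7.
sequence = "TTACGTACTT"
effects = in_silico_mutagenesis(sequence, toy_model)

# Positions where any substitution changes the prediction.
sensitive = [i for i, row in enumerate(effects) if any(v != 0.0 for v in row.values())]
print("sensitive positions:", sensitive)
```

With a real model in place of `toy_model`, the same loop could be extended to move, flip, or pair motifs and compare the predicted accessibility before and after each change.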
Ultimately, this project will generate a predictive framework for TF interactions in pluripotent stem cells (PSCs) and provide fundamental insights into how pioneer factors shape gene regulatory landscapes. More broadly, this approach—combining deep learning, motif-level analysis, and inducible genomic perturbations—offers a powerful strategy for decoding cis-regulatory logic. The resulting principles could guide stem cell engineering efforts by enabling rational design of regulatory elements to enhance PSC identity or modulate differentiation potential, with implications for regenerative medicine and disease modeling.
References
1. Isbel, L., Grand, R., Schübeler, D. Nature Reviews Genetics (2022).
2. Barral, A., Zaret, K. Trends in Genetics (2024).
3. Sasse, A., Chikina, M., Mostafavi, S. Nature Methods (2024).
4. Chandra, N.A., Hu, Y., Buenrostro, J.D., Mostafavi, S., Sasse, A. bioRxiv.
5. Sasse, A., Chikina, M., Mostafavi, S. iScience (2024).

![Portrait of Ralph Grand](https://static.wixstatic.com/media/132162_5c065b8d508d4cbda3be700830297a07~mv2.png/v1/crop/x_0,y_22,w_1906,h_2305/fill/w_86,h_104,al_c,q_95,usm_0.66_1.00_0.01,enc_avif,quality_auto/RalphGrand_gr_Portrait_GroupSite_P1560201%5B4%5D%5B1%5D%5B1%5D.png)
