Nikhil Prakash

22nd floor, 177 Huntington Ave

Boston, MA 02115

I’m a third-year Ph.D. student at Northeastern University, advised by Prof. David Bau. I completed my Bachelor of Engineering at RV College of Engineering, Bangalore, India, in fall 2020, with a focus on electrical and computer science.

I interned at the Practical AI Alignment and Interpretability Research Group with Dr. Atticus Geiger this summer. Previously, I participated in SERI-MATS (first phase) with Neel Nanda. I have also worked as a visiting scholar at the Max Planck Institute for Security and Privacy, a research intern at both the Korea Advanced Institute of Science & Technology and the Indian Institute of Technology Ropar, and a student trainee at Samsung Research Bangalore.

Broadly, my interest lies in understanding the internal mechanisms of deep neural networks to enhance human-AI collaboration and prevent misalignment. Currently, I’m investigating cognitive abilities such as reasoning and theory of mind in large language models.

I have received invaluable support from many people throughout my career, and as a result, I’m always happy to assist others and share insights from my experiences. Please feel free to reach out.

news

Nov, 2024 Received a complimentary NeurIPS 2024 registration for my service as a reviewer.
Aug, 2024 Reviewing for ICLR 2025.
Jul, 2024 Our paper NNsight and NDIF: Democratizing Access to Foundation Model Internals is on arXiv!
Jul, 2024 Interning at the Practical AI Alignment and Interpretability Research Group with Dr. Atticus Geiger.
Jun, 2024 Reviewing for NeurIPS 2024 (main conference and workshop proposals).
May, 2024 Invited talk at the Practical AI Alignment and Interpretability Research Group.
May, 2024 Invited talk at Computational Linguistics and Complex Social Networks, Indian Institute of Technology Gandhinagar.
May, 2024 Attending ICLR 2024 in Vienna in person :beach_umbrella:!
Apr, 2024 Invited talk at New England NLP 2024.
Apr, 2024 Co-organizing Mechanistic Interpretability Social at ICLR 2024 with Gabriele Sarti.
Apr, 2024 Awarded Google's Gemma Academic Program :trophy:.
Jan, 2024 Our paper “Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking” got accepted at ICLR 2024!
Oct, 2023 Served as a reviewer for ATTRIB 2023 workshop @ NeurIPS.
Jul, 2023 Our short paper got accepted at the Challenges in Deployable Generative AI workshop at ICML 2023!
Jul, 2023 Participated in Stanford Existential Risks Initiative ML Alignment Theory Scholars (SERI-MATS) 2023.
Jun, 2023 Participated in Alignment Research Engineer Accelerator (ARENA) 2023.
Feb, 2023 Our paper got accepted at IUI 23!
Sep, 2022 :innocent: Started my Ph.D. at Northeastern with Prof. David Bau!

selected publications

  1. ICLR
    Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
    Prakash, Nikhil, Shaham, Tamar Rott, Haklay, Tal, Belinkov, Yonatan, and Bau, David
    In International Conference on Learning Representations (ICLR) 2024
  2. ICML
    Discovering Variable Binding Circuitry with Desiderata
    Davies, Xander, Nadeau, Max, Prakash, Nikhil, Shaham, Tamar Rott, and Bau, David
    In Challenges in Deployable Generative AI Workshop, International Conference on Machine Learning (ICML) 2023