Research

2025

*: primary contributors · ^†: intern supervised by me

Products: Gemini (2.0, 2.5, 3.0), Lyria RealTime
Invited Talks: Stanford Hearing Seminar
Area Chair for ICML · Reviewer for NAACL (ARR)

Long-Form Speech Generation with Spoken Language Models (SpeechSSM)
Se Jin Park*^†, Julian Salazar*, Aren Jansen, Keisuke Kinoshita, Yong Man Ro, RJ Skerry-Ryan
ICML 2025 [oral] audio samples · dataset (LibriSpeech-Long) · talk

Zero-Shot Mono-to-Binaural Speech Synthesis (ZeroBAS)
Alon Levkovitch, Julian Salazar, Soroosh Mariooryad, RJ Skerry-Ryan, Nadav Bar, W. Bastiaan Kleijn, Eliya Nachmani
Interspeech 2025 audio samples · dataset (TUT Mono-to-Binaural) in progress · Google Research blog

SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy
RJ Skerry-Ryan, Julian Salazar, Soroosh Mariooryad, David Kao, Daisy Stanton, Eric Battenberg, Matt Shannon, Ron J. Weiss, Robin Scheibler, Jonas Rothfuss, Tom Bagby
Google, technical report (2025) code · PyPI package

Prompting with Phonemes: Enhancing LLMs’ Multilinguality for Non-Latin Script Languages
Hoang H. Nguyen, Khyati Mahajan, Vikas Yadav, Julian Salazar, Philip S. Yu, Masoud Hashemi, Rishabh Maheshwary
NAACL 2025 long paper [oral]

Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech (Very Attentive Tacotron)
Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, Soroosh Mariooryad, Matt Shannon, Julian Salazar, David Kao
NAACL 2025 long paper [oral] audio samples · code

Humanity’s Last Exam
Long Phan*, Alice Gatti*, Ziwen Han*, Nathaniel Li*, 123 other authors, Julian Salazar, 900+ other authors, Summer Yue, Alexandr Wang, Dan Hendrycks
Center for AI Safety & Scale AI, technical report (2025) [300+ citations] dataset · news (NYT, Reuters)

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Gemini Team: 3k+ authors in random order, including Julian Salazar
Google, technical report (2025) [1500+ citations] Google blog · Open in AI Studio

2024

*: equal contributors · ^†: intern supervised by me

Products: Gemini (2.0 audio-out preview), Project Astra (preview)
Invited Talks: iCORE 2024 (plenary), Raion (podcast), Multimodal AI at CRV (panel)
Area Chair for NeurIPS · Reviewer for ICML, NAACL (ARR)

Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM (Spectron)
Eliya Nachmani*, Alon Levkovitch*, Roy Hirsch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, Ehud Rivlin, RJ Skerry-Ryan, Michelle Tadmor Ramanovich
ICLR 2024 [100+ citations] audio samples · dataset (spoken LLaMA-Questions) · Google Research blog · poster · slides · talk

Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training
Jianfeng He^†, Julian Salazar, Kaisheng Yao, Haoqi Li, Jinglun Cai
EACL 2024 long paper code · datasets (MiniPS2SLURP, VoxPopuli2SLUE) · talk

2023

Products: Proof-of-concepts in speech and dialog (unreleased)
Reviewer for ACL, IEEE/ACM Transactions on Audio, Speech, and Language Processing

2022

^†: intern supervised by me

Products: Amazon Transcribe (improvements), Alexa (via rescoring methods and Echo Show live captions)
Reviewer for NeurIPS [top 10%]

Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation
Zejiang Hou^†, Julian Salazar, George Polovets
Transactions of the Association for Computational Linguistics (TACL), Vol. 10 (2022) code · EMNLP 2022 talk

2021

^†: intern supervised by me

Products: Amazon Transcribe (inc. custom LM improvements)
Reviewer for EMNLP

Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
Ethan A. Chi^†, Julian Salazar, Katrin Kirchhoff
NAACL 2021 short paper code in progress · talk

2020

^†: intern supervised by me

Products: Amazon Transcribe (inc. self-service custom LMs)
Reviewer for ACL, EMNLP, NeurIPS [top 10%], IEEE Signal Processing Letters, IWSLT

Masked Language Model Scoring
Julian Salazar, Davis Liang, Toan Q. Nguyen^†, Katrin Kirchhoff
ACL 2020 long paper [700+ citations] code · talk · also at DeepLo 2019

BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff
Speaker Odyssey 2020 [best paper runner-up] code

Unsupervised Bitext Mining and Translation via Self-Trained Contextual Embeddings
Phillip Keung, Julian Salazar, Yichao Lu, Noah A. Smith
Transactions of the Association for Computational Linguistics (TACL), Vol. 8 (2020) NAACL 2021 talk

Don’t Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings
Phillip Keung, Yichao Lu, Julian Salazar, Vikas Bhardwaj
EMNLP 2020 short paper [oral] talk

Deep Contextualized Acoustic Representations for Semi-Supervised Speech Recognition (DeCoAR)
Shaoshi Ling, Yuzong Liu, Julian Salazar, Katrin Kirchhoff
ICASSP 2020 [oral; 100+ citations] code · talk

Attentional Speech Recognition Models Misbehave on Out-of-Domain Utterances
Phillip Keung, Wei Niu, Yichao Lu, Julian Salazar, Vikas Bhardwaj
arXiv (Feb. 2020) data

2019

*: equal contributors · ^†: intern supervised by me

Products: Amazon Transcribe (inc. new languages + medical ASR)
Reviewer for IWSLT

Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition
Julian Salazar, Katrin Kirchhoff, Zhiheng Huang
ICASSP 2019 [100+ citations] poster · slides for AWS AI in Practice

Transformers without Tears: Improving the Normalization of Self-Attention
Toan Q. Nguyen*^†, Julian Salazar*
IWSLT 2019 [oral; 300+ citations] code · slides · WNGT 2020 talk

2018 and earlier

*: equal contributors · ^: the Hardy-Littlewood rule

Products: Amazon Transcribe (launch)

Invariant Representation Learning for Robust Deep Networks
Julian Salazar*, Davis Liang*, Zhiheng Huang, Zachary C. Lipton
Workshop on Integration of Deep Learning Theories at NeurIPS 2018 poster

Crepant Resolutions of Weierstrass Models with Torsion
Julian Salazar
Harvard University undergraduate thesis, Mar. 2017 (high honors) supervisors: Mboyo Esole, Noam Elkies · examiner: Joe Harris

Conway’s Subprime Fibonacci Sequences
Richard K. Guy^, Tanya Khovanova^, Julian Salazar^
Mathematics Magazine, Vol. 87, No. 5 (Dec. 2014) code · presented at MathFest 2012

Explorations

The Representability Hierarchy and Hilbert’s 13th Problem
Explicates the relationship between classical superpositions of algebraic functions and the viewpoint from algebraic geometry. Supervised by Benson Farb at the 2016 UChicago REU.

Persistent Homology and the Topology of Motor Cortical Activity
With Emma West. Introduces topological data analysis with an application to motor neuronal data from the Hatsopoulos Lab. Written at the 2016 UChicago REU.

[DRAFT] Descent by 2-Isogeny on Elliptic Curves: from Fermat to Weil
My Harvard junior paper, supervised by Noam Elkies.

[DRAFT] A Peripatetic Course in Algebraic Topology
A lengthy survey of basic algebraic topology based on lectures by Peter May at the 2016 UChicago REU.