Matan Ben-Tov

I am a Computer Science PhD student at Tel Aviv University and an AI security researcher in the PLUS research group, advised by Dr. Mahmood Sharif. I’m currently interning at Google. In the past, I worked as a software engineer at Check Point Research.

I am interested in the security of Natural Language Processing (NLP) systems and models. In particular, I explore the execution, practicality, implications and potential mitigations of attacks against NLP models, aiming to better understand what makes these models vulnerable to certain attacks.

Selected Publications

TROPT: An Open Framework for Unifying and Advancing Discrete Text Optimization

Matan Ben-Tov, Mahmood Sharif

Under submission

An open-source framework for running and developing discrete text-trigger optimizers against any NLP model and any loss. TROPT ships 40+ recipes spanning LLM jailbreaks, model auditing, and interpretability, lowering the barrier to adopting discrete text optimization methods.

Universal Jailbreak Suffixes Are Strong Attention Hijackers

Matan Ben-Tov, Mor Geva, Mahmood Sharif

In TACL 2026 and ACL 2026

Analyzing the mechanism underlying suffix-based LLM jailbreaks, we find that they rely on aggressively hijacking the model’s context 🥷, and that this hijacking intensifies with the suffix’s universality. Exploiting this finding, we both enhance existing attacks and mitigate them.

GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search

Matan Ben-Tov, Mahmood Sharif

In ACM CCS 2025

By introducing a strong new SEO attack ⛽💡, we extensively evaluate the susceptibility of widely used embedding-based retrievers to SEO attacks via corpus poisoning, and link this susceptibility to key properties of the embedding space.

CaFA: Cost-aware, Feasible Attacks With Database Constraints Against Neural Tabular Classifiers

Matan Ben-Tov, Daniel Deutch, Nave Frost, Mahmood Sharif

In IEEE S&P 2024

We propose an efficient attack against neural tabular classifiers for automatic robustness evaluation, addressing attacker objectives such as feasibility (via incorporation of database constraints) and cost-efficiency.

See all publications

Blog

FEB 15 2025

NanoPGD: Minimal Implementation of PGD 🐼➡️🐒

PGD is the perfect [adversarial] example of neural networks’ vulnerabilities. I implemented a compact version of this method that can be …

OCT 30 2024

Squeezing More Out of Vec2Text with Embedding Space Alignment 🧃

Transferring the embedding inversion of a previously trained Vec2Text to new text encoders by training a mere affine mapping.