Seminar: ceClub: The Technion Computer Engineering Club
Efficient LLM Systems: From Algorithm Design to Deployment
Large Language Models (LLMs) have transformed what machines can do and how systems are designed to serve them. These models are demanding in both computation and memory, exposing the limits of traditional optimization methods that once sufficed for conventional systems. A central challenge in building LLM systems is improving system metrics, such as latency, while preserving response quality.
This talk presents approaches for reducing latency in LLM systems to support interactive applications, from scheduling algorithm design to deployment. It introduces scheduling frameworks that use lightweight predictions of request behavior to make informed decisions about prioritization and memory management across two core settings: standalone LLM inference and API-augmented LLMs that interact with external tools. Across both settings, prediction-guided scheduling delivers substantial latency reductions while remaining practical for deployment.
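To make the idea of prediction-guided prioritization concrete, here is a minimal sketch in Python of one plausible instantiation: a scheduler that orders requests by a lightweight prediction of their output length, approximating shortest-predicted-job-first. All names (`PredictionGuidedScheduler`, `predicted_tokens`) are illustrative assumptions, not the systems presented in the talk.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    # Predicted output length serves as the priority key; shorter
    # predictions are served first (smaller value = higher priority).
    predicted_tokens: int
    request_id: int = field(compare=False)

class PredictionGuidedScheduler:
    """Hypothetical sketch: prioritize requests by predicted length."""

    def __init__(self):
        self._queue = []  # min-heap keyed on predicted_tokens

    def submit(self, request_id, predicted_tokens):
        heapq.heappush(self._queue, Request(predicted_tokens, request_id))

    def next_request(self):
        # Serving the request predicted to finish soonest tends to
        # reduce mean latency versus FIFO when lengths are skewed.
        return heapq.heappop(self._queue).request_id if self._queue else None

sched = PredictionGuidedScheduler()
sched.submit(1, predicted_tokens=512)
sched.submit(2, predicted_tokens=32)
sched.submit(3, predicted_tokens=128)
print([sched.next_request() for _ in range(3)])  # → [2, 3, 1]
```

In a real serving system the priority key would come from a learned predictor and would interact with batching and KV-cache memory management; this sketch only illustrates the scheduling principle.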
Bio:
Rana Shahout is a Postdoctoral Fellow at Harvard University, working with Michael Mitzenmacher and Minlan Yu. She received her Ph.D. in Computer Science from the Technion and previously worked as a Senior Software Engineer at Mellanox (now NVIDIA). Her research combines machine learning, systems, and algorithmic theory to design efficient and scalable AI systems. Rana is a recipient of the Eric and Wendy Schmidt Postdoctoral Award, the Zuckerman Postdoctoral Fellowship, the Weizmann Institute Women's Postdoctoral Career Development Award, the VATAT Postdoctoral Fellowship, and first place in the ACC Feder Family Award for Best Student Work in Communications.

