Communal AI – Open, Collaborative & Accessible LLMs

Date: January 1, 2025 Time: 11:30 - 12:30

Location: 1061, Meyer Building

Zoom: Zoom link

Lecturer: Leshem Choshen

Research Areas:

Machine Learning and Intelligent Systems

Developing better language models would benefit a myriad of communities, but doing so is prohibitively costly. The talk will describe collaborative approaches to pretraining, such as model merging, which combines several specialized models into one. It will then introduce efficient evaluation methods that reduce overhead, and touch on other accessible and collaborative approaches that best harness the expertise and diversity found in academia.

Leshem Choshen is a postdoctoral researcher at MIT-IBM who aims to study model development openly and collaboratively, make pretraining research feasible, and evaluate models efficiently. To that end, they co-created model merging, TIES merging, and the BabyLM challenge. They were selected for the Rothschild and Fulbright postdoctoral fellowships, as well as the IAAI and Blavatnik best Ph.D. awards. With broad interests in NLP and ML, they have also worked on reinforcement learning, on understanding how neural networks learn, and on Project Debater, the first machine to hold a live formal debate (2019), which was featured on the cover of Nature.

Leshem is also a dancer and an acrobat.

Seminar: Machine Learning Seminar

Upcoming seminars

Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation

Robust and Actionable ML via Causality

Model-Based Deep Learning in Signal Processing