Suhas Pai
Co-Founder & CTO, Hudson Labs
(formerly Bedrock AI)
Denys Linkov
Head of ML, Wisedocs
Sameer Mohan
Senior Staff Architect, Solutions & AI, OMERS
Talk Title: AI in the Pension Pipeline: OMERS' GenAI Revolution in a Regulated Contact Centre
Diederik van Liere
CTO, Wealthsimple
Panel: The New Career Playbook: Agents, LLMs, and What Comes Next?
Ehsan Amjadian
Head of Artificial Intelligence Acceleration & Innovation, RBC
Panel: The New Career Playbook: Agents, LLMs, and What Comes Next?
Graham Toppin
Co-Founder, Peerlabs.ai
Panel: The New Career Playbook: Agents, LLMs, and What Comes Next?
Sarah Sun
Senior Director, Risk Modelling, RBC
Panel: The New Career Playbook: Agents, LLMs, and What Comes Next?
David Hughes
Principal Solution Architect - Engineering & AI, Enterprise Knowledge
Workshop: Advancing GraphRAG - Multimodal Integration
Presenter:
Christos Melidis, Staff Machine Learning Scientist, Ada Support
About the Speaker:
Christos is a Machine Learning Scientist at Ada. His work revolves around Natural Language Processing and Conversational AI, now with a strong focus on LLMs. He has researched, designed, and developed most parts of the Conversational AI stack, including retrieval and knowledge management systems, reasoning architectures, hallucination detection, explainability, and testing methodologies for conversational agents.
Track: Advanced RAG
Technical Level: 300 – Advanced
Abstract:
Everyone’s talking about how to make generative AI more accurate. Fewer hallucinations. Better answers. Faster response times. But here’s the thing: you can’t fix bad answers with better models alone. You need better inputs. And that starts with better retrieval.
At Ada, we don’t just build AI agents—we build the platform that empowers companies to deploy their own. That means we give businesses the tools to create, train, and manage agents that not only generate responses, but generate them with precision.
Retrieval is a critical part of that equation, because how you find, rank, and pass context to a model determines how effective the agent will be in production. Standard RAG (Retrieval-Augmented Generation) methods aren’t always enough. That’s where GARAGe comes in.
GARAGe (Generative-Augmented Retrieval-Augmented Generation) flips the RAG script. Instead of using retrieval to improve generation, we use generation to improve retrieval. Sounds recursive? It is. And it works.
What You’ll Learn:
A new architecture optimization for RAG, focused on the Customer Support domain.
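For readers unfamiliar with the general pattern, the sketch below shows one published way of using generation to improve retrieval: embedding an LLM-drafted answer alongside the question, in the spirit of hypothetical-document embedding. It is an illustration of the general idea only, not a description of Ada's GARAGe architecture, and embed_fn and generate_fn are hypothetical stand-ins for your own embedding model and LLM client.

```python
# Illustration only: generation-assisted retrieval via a hypothetical draft answer.
# NOT Ada's GARAGe architecture; embed_fn and generate_fn are hypothetical stand-ins.
from typing import Callable, List, Tuple
import numpy as np

def generation_augmented_retrieve(
    question: str,
    corpus: List[str],
    corpus_vecs: np.ndarray,                     # precomputed embeddings, shape (n_docs, dim)
    embed_fn: Callable[[str], np.ndarray],
    generate_fn: Callable[[str], str],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    # 1. Ask the LLM to draft a plausible answer (it may be wrong; that is fine).
    draft = generate_fn(f"Draft a short answer to this support question: {question}")
    # 2. Embed the question and the draft, then average them into one query vector.
    q_vec = (embed_fn(question) + embed_fn(draft)) / 2.0
    # 3. Rank documents by cosine similarity to the combined query vector.
    sims = corpus_vecs @ q_vec / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9
    )
    order = np.argsort(-sims)[:top_k]
    return [(corpus[i], float(sims[i])) for i in order]
```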
Presenter:
Abhimanyu Anand, Senior Data Scientist, Elastic
About the Speaker:
Abhimanyu is a Sr. Data Scientist at Elastic, where he works on the development of search solutions powered by GenAI. He holds an M.Sc. in Big Data Analytics from Trent University, with a specialization in natural language processing. He has developed and implemented robust AI solutions throughout his career across diverse domains, including internet-scale platforms, metals and mining, oil and gas, and e-commerce.
Track: Advanced RAG
Technical Level: 200 – Intermediate
Abstract:
Deploying RAG systems at scale brings unique challenges. This session tackles the critical transition from prototype to production, focusing on how to reduce latency and cost without compromising performance.
The session is divided into two parts:
First, the presentation will cover practical strategies for optimizing different stages of the RAG pipeline: Data Preparation, Retrieval & Ranking, and Generation & Observability. We’ll walk through high-impact techniques including:
1. Embedding quantization to reduce memory footprint and compute cost
2. Context highlighting to improve relevance and reduce latency
3. Reciprocal Rank Fusion (RRF) for combining rankings in low-latency use cases (see the sketch at the end of this section)
4. Context compression to reduce latency
Second, the hands-on coding lab will guide participants through implementing these strategies in a real-world workflow. Using Python, Google Colab, Elasticsearch, and Hugging Face models, attendees will apply techniques including filtered search, embedding quantization, and context highlighting.
Prerequisites: A working knowledge of Python is required. Prior exposure to RAG concepts is helpful but not mandatory.
What You’ll Learn:
A set of practical strategies for optimizing their RAG pipelines, including high-impact techniques like embedding quantization, hybrid search, and context compression.
The framework to analyze their RAG architecture to identify the critical optimization points for both latency and cost.
An understanding of how to use observability to measure the impact of their optimizations and make informed, data-driven decisions.
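As a companion to the Reciprocal Rank Fusion technique listed above, here is a minimal sketch of RRF in plain Python. The k = 60 constant is the value commonly used in the RRF literature, and the two input rankings are hypothetical lexical and vector search results rather than real Elasticsearch output.

```python
# Minimal Reciprocal Rank Fusion (RRF) sketch: merge several ranked lists of
# document IDs into one ranking without tuning score scales.
from collections import defaultdict
from typing import Dict, List

def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    scores: Dict[str, float] = defaultdict(float)
    for ranked_list in rankings:
        for rank, doc_id in enumerate(ranked_list, start=1):
            scores[doc_id] += 1.0 / (k + rank)    # each list contributes 1/(k + rank)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

bm25_hits = ["doc_3", "doc_1", "doc_7"]           # hypothetical lexical results
vector_hits = ["doc_1", "doc_9", "doc_3"]         # hypothetical semantic results
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

Because RRF works on ranks rather than raw scores, it needs no score normalization, which is part of what makes it attractive for low-latency hybrid search.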
Presenter:
Bang Liu, Associate Professor, University of Montreal & Mila
About the Speaker:
Bang Liu is an Associate Professor in the Department of Computer Science and Operations Research (DIRO) at the University of Montreal (UdeM). He is a member of the RALI laboratory (Applied Research in Computer Linguistics) of DIRO, a member of Institut Courtois of UdeM, an associate member of Mila – Quebec Artificial Intelligence Institute, and a Canada CIFAR AI Chair. His research interests primarily lie in the areas of natural language processing, multimodal & embodied learning, theory and techniques for AGI (e.g., understanding and improving large language models and intelligent agents), and AI for science (e.g., material science, health).
Track: Agents Zero To Hero
Technical Level: 100 – Beginner
Abstract:
The advent of large language models (LLMs) has revolutionized artificial intelligence, laying the foundation for sophisticated intelligent agents capable of reasoning, perceiving, and acting across diverse domains. These agents are increasingly central to advancing AI research and applications, yet their design, evaluation, and enhancement pose intricate challenges. In this talk, we will offer a fresh perspective by framing intelligent agents through a modular and cognitive science-inspired lens, bridging AI design with insights from different disciplines to propose a unified framework for understanding their core functionalities and future potential. We will explore the modular design of intelligent agents and present a framework for cognition, perception, action, memory, reward systems, and so on. Then we will discuss each module in detail. Our talk aims to provide a holistic and interdisciplinary perspective for intelligent agent research.
What You’ll Learn:
The concept and architecture of foundation agents.
Presenter:
Millin Gabani, CEO, Keyflow
About the Speaker:
Millin Gabani is the co-founder and CEO of Keyflow, where he helps businesses implement AI agent systems with a human-in-the-loop approach to automate operations and drive efficiency. With a unique blend of engineering, UX, and agent orchestration expertise, Millin has been building with agents since the earliest days of tool calling and interface-driven AI.
Previously, he worked on machine learning systems at Pinterest, contributing to the Home Feed recommendations team, and was part of the Search Infrastructure group at Google. He also helped develop AI-driven medical coding solutions at Fathom.
Millin has led workshops and lectures across Toronto focused on practical applications of LLM agents, Model Context Protocol, and the design of effective human-agent interfaces. His current work focuses on making agent systems reliable, modular, and aligned with real business workflows.
Track: Agents Zero To Hero
Technical Level: 400 – Expert
Abstract:
Most agent systems today either try to be too autonomous and end up brittle, or they are overly scripted and fail to adapt. In this talk, we’ll walk through how to design a deterministic network of modular agents that can reliably automate complex workflows in production.
Instead of relying on a single general-purpose agent, we structure the system as a directed acyclic graph (DAG), where each node represents a narrow, task-specific agent. This architecture ensures clarity in execution paths, supports debugging and retries, and enables clean handoffs between agents. We’ll explore how this design balances structure with flexibility, allowing for controlled decision-making while maintaining overall system reliability.
As a case study, we’ll demonstrate an agent system built for private equity and venture capital workflows. The system continuously ingests pitch decks, performs deep market research, and generates interactive reports. This approach has significantly reduced manual effort while maintaining quality and interpretability.
This talk is for builders and AI product teams looking to move beyond simple prompt-based agents and build scalable, production-grade systems that automate real work.
What You’ll Learn:
1. Agent systems are more reliable in production when structured as deterministic workflows rather than monolithic autonomous agents.
2. A directed acyclic graph (DAG) of modular agents enables clean handoffs, scoped decision-making, and scalable automation.
3. Balancing determinism with controlled autonomy is key to building agent systems that are both flexible and production-ready.
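A minimal sketch of the DAG-of-agents pattern described above, using Python's standard-library topological sorter. The node names, prompts, and call_llm() helper are hypothetical illustrations of the pattern, not the presenter's production system.

```python
# Minimal sketch of a deterministic DAG of narrow, task-specific agents.
from graphlib import TopologicalSorter   # Python 3.9+ standard library

def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM client you use."""
    return f"<llm output for: {prompt[:40]}...>"

def ingest_deck(state):      state["deck_text"] = "parsed pitch deck text"; return state
def market_research(state):  state["research"] = call_llm(f"Research the market for: {state['deck_text']}"); return state
def financial_check(state):  state["financials"] = call_llm(f"Extract key financials: {state['deck_text']}"); return state
def write_report(state):     state["report"] = call_llm(f"Write a report from {state['research']} and {state['financials']}"); return state

# Edges: each node lists its prerequisites, so the execution path is explicit,
# debuggable, and retryable one node at a time.
dag = {
    "ingest_deck": set(),
    "market_research": {"ingest_deck"},
    "financial_check": {"ingest_deck"},
    "write_report": {"market_research", "financial_check"},
}
agents = {"ingest_deck": ingest_deck, "market_research": market_research,
          "financial_check": financial_check, "write_report": write_report}

state = {}
for node in TopologicalSorter(dag).static_order():   # deterministic execution order
    state = agents[node](state)
print(state["report"])
```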
Presenter:
Amna Jamal, National Data and AI Expert, IBM Canada
About the Speaker:
Amna Jamal is a seasoned Data and AI expert at IBM, with over eight years of experience in data management, data science, and AI. As Watsonx Technical Leader for Canada, she drives innovation at the intersection of data and AI, helping organizations unlock their full potential.
Specializing in Information Management, DataOps, and ModelOps, Amna delivers customized solutions that optimize business processes and enhance revenue streams. Amna is a trusted advisor to senior leadership, guiding strategic decision-making and solving technical challenges. She leads a team of technical specialists, ensuring they meet client requirements and business objectives. Her ability to align diverse stakeholders around a shared vision has resulted in transformative solutions.
With a Ph.D. in Electrical and Computer Engineering from the National University of Singapore, Amna combines research expertise with hands-on experience to deliver impactful AI-driven solutions.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
Agentic AI represents the next frontier in enterprise intelligence: AI systems capable of autonomously perceiving their environment, reasoning through complex decisions, and acting with minimal human intervention to deliver real business outcomes. Unlike traditional AI models that require human orchestration, agentic systems are designed to operate with a degree of autonomy and adaptability, making them especially valuable in fast-paced, data-rich business environments.
In this session, we'll explore what Agentic AI truly means in a business context, separating hype from reality. We will discuss the architecture of agentic systems, the importance of goal-driven behavior, and how these agents are being deployed across industries today. From intelligent HR agents that resolve issues end-to-end, to supply chain optimizers that make real-time adjustments, to agents that automate complex knowledge work, Agentic AI is reshaping the way businesses operate.
We'll also address practical considerations: What risks should organizations be aware of when implementing agentic solutions? How can enterprises balance autonomy with governance and compliance?
Through a live demo and a real industry use case, we'll review the design and development of Agentic AI systems that are not just intelligent, but truly purpose-driven for business. Attendees will leave with tangible examples, actionable insights, and a roadmap to begin or scale their agentic AI journey.
What You’ll Learn:
Understanding the concept and benefits of Agentic AI
Learning about real-world applications and use cases
Receiving a roadmap to implement or scale Agentic AI in their own organizations
Presenter:
John Gilhuly, Head of Developer Relations, Arize
About the Speaker:
John is Head of Developer Relations at Arize AI, where he works on open-source LLM observability and evaluation tooling. He holds an MBA from Stanford—where he focused on the ethical, social, and business implications of open vs. closed AI development—and a B.S. in Computer Science from Duke. He is passionate about ensuring the benefits stemming from AI and ML are felt equally across socio-economic divides. Prior to joining Arize, John led GTM activities at Slingshot AI and served as a venture fellow at Omega Venture Partners. In his pre-AI life, John built out and ran technical go-to-market teams at Branch Metrics.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
As AI systems shift from static language generation to dynamic, agentic behavior (reasoning over time, taking actions, and interacting with tools), our evaluation methods need to evolve. Traditional metrics like BLEU, ROUGE, or even hallucination detection fall short when applied to multi-step, goal-driven agents.
This talk introduces a modern toolkit for evaluating AI agents using LLMs themselves. We’ll explore techniques such as code-based validation, LLM-as-judge approaches, human-in-the-loop assessments, and benchmarking against ground truth. You’ll learn how to design high-quality evaluations aligned with real-world task goals, structure interpretable outputs, and scale evaluation processes across teams and systems. The session also covers telemetry best practices and emerging standards like OpenInference that aim to bring consistency and rigor to agent evaluation.
What You’ll Learn:
Evaluating AI agents requires rethinking traditional LLM metrics—effective evaluation must align with real-world tasks, multi-step behavior, and agent-specific goals.
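A minimal sketch of the LLM-as-judge idea described in the abstract, paired with a simple code-based check. Here call_llm() is a hypothetical stand-in for whichever LLM client you use, and the rubric and labels are illustrative rather than a prescribed Arize or OpenInference schema.

```python
# Minimal LLM-as-judge sketch for grading an agent run against a task goal.
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; a real client would send `prompt` to an LLM.
    return '{"label": "correct", "explanation": "The steps achieve the stated goal."}'

JUDGE_TEMPLATE = """You are grading an AI agent.
Task goal: {goal}
Agent transcript (tool calls and final answer): {transcript}
Respond with JSON: {{"label": "correct" | "incorrect", "explanation": "..."}}"""

def judge_agent_run(goal: str, transcript: str) -> dict:
    raw = call_llm(JUDGE_TEMPLATE.format(goal=goal, transcript=transcript))
    verdict = json.loads(raw)                      # structured, interpretable output
    assert verdict["label"] in {"correct", "incorrect"}
    return verdict

# Code-based validation can run alongside the judge, e.g. checking that the
# agent actually called a required tool before answering.
def called_required_tool(transcript: str, tool_name: str) -> bool:
    return tool_name in transcript

run = "call: search_flights(...) -> ... final answer: booked YYZ->SFO"
print(judge_agent_run("Book the cheapest YYZ->SFO flight", run),
      called_required_tool(run, "search_flights"))
```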
Presenter:
Rose Genele, CEO, The Opening Door
About the Speaker:
As CEO of The Opening Door, Rose specializes in responsible artificial intelligence integration for investors and enterprise companies. Her work emphasizes the importance of safe and ethical AI—where technologies are designed with fairness, transparency, accountability, and human-centred design in mind. With years of experience in the tech industry, Rose has developed a reputation as an AI transformationalist with a penchant for data, ethics, and futures-forward thinking.
Rose sits on the boards of the Canadian Centre for Ethics and Corporate Policy and Volcano Theatre. She was a 2024 nominee for the RBC Canadian Women Entrepreneur Awards and a recipient of Canada's Top 100 Black Women to Watch 2024 Award. Rose is also an alumna of Toronto Metropolitan University, with a Bachelor of Commerce in Law.
Track: AI Ethics And Governance Within The Organization
Technical Level: 200 – Intermediate
Abstract:
As AI systems move from research labs into real-world applications, the question of responsibility becomes increasingly urgent: who owns AI ethics inside the organization? Is it the data scientists building the models? The legal team writing the policies? Or leadership setting the vision?
This talk dives into the organizational dimension of Responsible AI, unpacking what it takes to move ethical principles off the page and into the workflows of technical teams. Drawing from real-world examples across industries, we’ll explore how leading organizations are structuring cross-functional governance, distributing ethical responsibilities, and embedding accountability into the AI development lifecycle.
Attendees will leave with a clear understanding of:
-How to define and assign roles in Responsible AI initiatives
-Organizational models for AI governance (and when to use them)
-Practical strategies to empower ML professionals to make ethically-informed decisions
-Common pitfalls when ethics becomes a “check-the-box” activity—and how to avoid them
Whether you’re part of an ML research team, a startup shipping AI products, or a mature enterprise scaling AI operations, this session will help you reimagine AI ethics as a team sport, and not a compliance burden.
What You’ll Learn:
1. Governance models for AI within startups vs. enterprises
2. What effective RAI teams look like: roles, rituals, and decision gates
3. How to create a culture of accountability across technical and non-technical staff
Presenter:
Mehrin Kiani, Machine Learning Scientist, Protect AI
About the Speaker:
Mehrin Kiani, PhD, has over a decade of experience as a Machine Learning (ML) researcher specializing in the development of ML algorithms. Dr. Kiani's influential research has been featured in renowned journals, including IEEE Transactions on Artificial Intelligence and Nature. In her current role as an ML Scientist at Protect AI, Dr. Kiani focuses on leveraging open-source software to enhance the security of ML systems. She has presented her work at multiple IEEE AI conferences as well as in industry at RSA 2024.
Track: AI Ethics And Governance Within The Organization
Technical Level: 300 – Advanced
Abstract:
Together with Hugging Face (HF), Protect AI is paving the way for ML users to verify a machine learning (ML) model's safety before use. As of April 1, 2025, Protect AI has scanned around 1.41 million HF model repositories. From the scan results, Protect AI has identified a total of 352,000 unsafe or suspicious issues across 51,700 ML models. The scans show that the most prevalent form of attack on ML models is the Model Serialization Attack (MSA). To encourage all ML developers and users to adopt zero trust for ML models, this talk will cover MSAs as well as the findings of our HF scans. The scan results are also available in Protect AI's publicly available database, Insights DB.
What You’ll Learn:
Main message: always scan an ML model before you use or load it. Just like any other file from the internet, ML models should be treated with zero trust.
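A minimal, self-contained demonstration of why this matters, assuming nothing beyond the Python standard library: pickle-based model files can execute arbitrary code at load time, which is exactly the Model Serialization Attack described above. The payload here is a harmless echo command, and this is a generic illustration, not code from the Protect AI or Hugging Face scanning pipeline.

```python
# Minimal demonstration of a Model Serialization Attack (MSA): pickle runs
# arbitrary code at load time via __reduce__. The payload is a harmless `echo`,
# but it could be anything.
import os
import pickle

class MaliciousWeights:
    def __reduce__(self):
        # Whatever this returns is executed when the file is unpickled.
        return (os.system, ("echo 'model file executed code on load!'",))

payload = pickle.dumps(MaliciousWeights())

# An unsuspecting consumer "loads the model" -> the command runs immediately.
pickle.loads(payload)

# Mitigations: scan artifacts before loading (e.g., with an open-source model
# scanner), prefer non-executable formats such as safetensors, and treat every
# downloaded model file with zero trust, like any other file from the internet.
```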
Presenters:
Fion Lee-Madan, Co-Founder, Fairly AI | Natalia Nygren, Practice Lead, AI, Omdia | Jeremy Souffir, Director of Engineering and GenAI, Loblaw Digital | Valence Howden, Advisory Fellow, Info-Tech Research Group | Himanshu Joshi, Applied AI, IBM | Moderator: Mayan Murray, Senior Managing Consultant, Integrated AI Governance, Vector Institute
About the Speakers:
Fion Lee-Madan is a technology leader and innovator with over 25 years of experience in telecommunications, e-commerce, and artificial intelligence. As Co-Founder of Fairly AI, she drives AI governance, risk management, and compliance initiatives, earning recognition as a Globe and Mail Best Executive in 2023. With a global career spanning North America and Asia, she started as a software developer and held different technical roles at Nortel Networks, Sapient, and Intuit, specializing in emerging technologies and digital transformation. A strong advocate for diversity in tech, Fion mentors young innovators and contributes to AI trust and safety efforts worldwide through organizations like Partnership on AI, AI Verify Foundation, and Vector Institute. She holds an Honours BSc from the University of Toronto and an MBA from Boston University.
Natalia is a research director and practice lead for AI at Omdia where she leads a global team of industry analysts helping vendors and enterprises to scale and monetize AI, balancing profit, people and responsible technology use. Natalia has led her own development and strategy teams and helped 200+ clients globally to innovate through data and AI. She holds a PhD in AI (NLP) from the University of Edinburgh and an Executive MBA from Western University.
A passionate leader and technical expert with over 15 years of experience building and scaling high-performing teams to deliver innovative and reliable solutions. My career spans both the dynamic startup world and large-scale enterprise environments, providing me with a unique perspective on software development and leadership.
Half of my career was spent in startups, including one that grew into a multi-billion dollar, publicly traded (NASDAQ) company. This experience instilled in me a deep understanding of rapid innovation, agile development, and the challenges of scaling a business from the ground up. The other half of my career has been dedicated to large companies like Amazon, where I honed my skills in building and managing high-scale systems capable of handling extreme loads. I gained invaluable experience in designing and implementing robust software solutions for complex business problems.
For the past year, I’ve been leading the GenAI initiatives at Loblaw Digital, driving innovation and exploring new possibilities across the entire Loblaw organization.
At the heart of my leadership style is a focus on fostering a culture of innovation, where every team member is empowered to think outside the box and push boundaries. I believe that the key to building successful products is to foster a culture of experimentation, where it’s safe to take risks, learn from failures, and pivot quickly. By creating a collaborative and inclusive environment, I ensure that everyone on my team is aligned around a common goal, and has a shared sense of ownership and responsibility for the success of the project.
As a leader, I am known for my ability to inspire and motivate my team, driving them to exceed expectations and consistently deliver high-quality work. I believe that the key to building a high-performing team is to hire exceptional talent and provide them with the support and guidance they need to thrive. By creating an environment where everyone can grow and develop, I have built teams that are capable of taking on the most challenging projects and delivering exceptional results.
Overall, my passion for engineering, leadership, and innovation is what drives me every day. Whether I am managing a team or working on a project myself, I am always striving to build better solutions, learn new things, and make a positive impact on the world.
Valence Howden is an International Speaker, Advisory Fellow and Distinguished Analyst at Infotech Research Group with over 30 years of experience in optimizing organizations through improving their governance, strategies, operating models and risk management practices.
Valence is a 2023 and 2024 HDI Top 25 thought leader in service management and is the Standards Council of Canada chair – and ISO voting member and delegate for Canada – for IT Governance, Service Management and Business Process Management.
Track: AI Ethics And Governance Within The Organization
Technical Level: 200 – Intermediate
Abstract:
AI governance is often seen as a bureaucratic hurdle—tedious compliance paperwork that slows down innovation. But with AI regulations already here, what if governance could be engineered just like the systems we build? This talk is designed by developers, for developers who want to deploy AI without drowning in red tape. We’ll break down how to transform compliance checklists into a practical, developer-friendly workflow that ensures AI systems are not just compliant on paper but actually functional and reliable in production. You’ll learn how other organizations have automated key governance tasks, integrating risk assessments into their CI/CD pipeline, and making AI governance a first-class citizen in your development process—without losing momentum.
What You’ll Learn:
Instead of just talking about responsible AI and ethics frameworks, learn from companies that have already operationalized AI governance.
Presenters:
Atinder Saini, Director, Advanced Analytics & AI, CIBC | Ankit Misra, Director AI Governance, CIBC
About the Speakers:
TBA
Track: AI Ethics And Governance Within The Organization
Technical Level: 200 – Intermediate
Abstract:
The rapid advancement of Agentic AI systems – capable of autonomous goal setting, planning, and execution in complex environments – presents unprecedented opportunities alongside significant governance challenges. Traditional approaches focusing mostly on pre-deployment testing and post-deployment monitoring often fall short in managing the emergent behaviors and potential risks of these highly autonomous systems. This talk emphasizes the need for a paradigm shift towards “Governance by Design,” integrating safety, alignment, and control mechanisms directly into the architecture, development, and operations of Agentic AI.
We will explore the unique characteristics of Agentic AI that necessitate this proactive approach, contrasting it with governance strategies for simpler ML models. This session will provide attendees with insights and techniques for embedding governance principles into their own Agentic AI projects, fostering the development of more trustworthy, reliable, and beneficial autonomous systems.
What You’ll Learn:
A clear understanding of the unique governance challenges posed by Agentic AI systems.
An appreciation for the limitations of traditional governance methods when applied to AI Agents.
Knowledge of the core principles behind the “Governance by Design” paradigm for AI.
Familiarity with a range of specific techniques and architectural patterns for embedding safety, alignment, and control into Agentic AI.
Insights into the practical challenges, trade-offs, and open research questions in governing Agentic AI by design.
Motivation to proactively incorporate governance considerations early in the development lifecycle of Agentic AI systems.
Presenters:
Ozge Yeloglu, VP of Advanced Analytics and AI, CIBC | Yannick Lallement, Chief AI Officer, Scotiabank | Eric Morrow, Managing Director, Data Science & AI, AI+ Research and Commercialization, BMO | Saba Zuberi, Associate Vice President Applied AI, Layer6 AI | Patricia Arocena, RBC Distinguished Engineer, Head, RBC Innovation Labs, Royal Bank of Canada (RBC)
About the Speakers:
Yannick Lallement serves as the Chief AI Officer at Scotiabank, spearheading the integration and advancement of AI/ML technologies across the institution. He is passionate about making AI practical, safe and scalable. He earned his PhD in artificial intelligence from the esteemed French National Institute of Computer Science. Before his tenure at Scotiabank, Yannick contributed his expertise to a variety of AI/ML initiatives for diverse public and private entities.
Eric Morrow is a Managing Director within BMO’s AI+ Research and Commercialization group, leading a team of data scientists, AI developers and ML engineers that engage with groups and services across the organization to develop anything from data-driven insights to production ML/AI-based solutions on topics ranging from climate hazard assessment to cybersecurity-threat detection and beyond. Prior to BMO, he worked in the aerospace industry on spacecraft development and robotic assembly operations on the International Space Station. He has also worked in the exploration geophysics field on innovative superconducting gravity measurement systems. He holds a PhD in geophysics from Harvard University and Masters degrees in physics and aerospace engineering from the University of Toronto.
Saba Zuberi is the AVP of Applied Artificial Intelligence and Machine Learning (AI/ML) in the AI/ML Center of Excellence and Layer 6 at TD. She leads a team of machine learning scientists to build AI solutions across the enterprise and drives applied AI research to support cutting-edge AI applications at TD. She has 10 years industry experience in AI/ML, from startup to enterprise. She holds a PhD in theoretical particle physics from the University of Toronto and completed her post-doctoral research in particle physics at the Lawrence Berkeley National Laboratory in California.
Patricia Arocena is a Senior Director and Head, RBC Innovation Labs, at Royal Bank of Canada (RBC) where she is responsible for identifying emerging technologies and accelerating their adoption across the bank. With over 20 years of experience in technology and innovation, Patricia has held leadership innovation roles at Tier-1 research institutions in Canada, PwC, and other financial institutions. She was recently appointed an RBC Distinguished Engineer in recognition of her exceptional technology achievements and technical community leadership. Today, Patricia helps organizations create next-gen solutions powered by Artificial Intelligence and other emerging technologies, delivering innovation and measurable value across the industry.
Track: Executive Track
Technical Level: 400 – Expert
Abstract:
This talk will be a panel moderated by Ozge with other AI leaders from Canadian banks. The panelists will speak to:
1) new, exciting, and innovative AI work and initiatives going on at banks;
2) talent and career opportunities, existing or upcoming, and how to get engaged;
3) how banks solve complex problems and find unique opportunities in Canada's regulatory environment.
What You’ll Learn:
Canadian banks are innovating and driving AI adoption in an evolving regulatory environment.
Presenter:
Helena Yu, Principal Consultant, Data & AI, Calian ITCS
About the Speaker:
Helena Yu is a dedicated data and AI leader with over 5 years of experience driving innovation through machine learning technologies. She specializes in bridging the gap between cutting-edge research and practical applications that create meaningful impact across diverse industries.
As an experienced team leader, Helena has successfully guided cross-functional teams in developing and implementing AI/ML solutions for multiple sectors including technology, finance, insurance, retail, telecommunications, and non-profit organizations.
Helena’s approach is fundamentally collaborative. She builds strategic partnerships with academic institutions, global technology companies, government agencies, and nonprofit organizations to ensure that advanced research translates into solutions that address real-world challenges.
What sets Helena apart is her commitment to human-centered AI development. She champions data-driven strategies that remain deeply aligned with the needs and experiences of the people they ultimately serve, ensuring technology advances genuine human progress.
With her technical expertise and collaborative leadership style, Helena continues to push boundaries in how organizations can leverage data, AI, and machine learning to create positive change in the world.
Track: Executive Track
Technical Level: 300 – Advanced
Abstract:
In today’s AI-driven landscape, organizations are eager to implement both traditional and generative AI solutions, yet many underestimate the critical foundation required: data readiness. This talk explores the comprehensive data readiness journey that enables successful AI adoption at scale.
While 90% of business leaders believe their data ecosystems are ready for AI, the reality is starkly different – with 84% of IT practitioners spending at least an hour daily fixing data problems. We’ll examine the key pillars of AI data readiness: infrastructure, integration, quality, governance, accessibility, and literacy, demonstrating how each contributes to AI success through practical assessment frameworks and implementation strategies.
This presentation will outline concrete steps along the data readiness journey, from data collection to data curation and governance. We'll contrast efficiency-focused AI applications with higher-value, growth-oriented use cases that require more sophisticated data foundations. Attendees will learn practical techniques for data readiness assessment, gap remediation, and quantitative measurement approaches.
Whether you’re beginning your AI journey or seeking to scale existing initiatives, this talk will provide actionable strategies to turn your organization’s data from a bottleneck into your greatest AI asset, ensuring both traditional and generative AI applications deliver meaningful business value.
What You’ll Learn:
1. Recognize the significant gap between perceived and actual data readiness
2. Understand the six pillars of AI data readiness and how to assess each
3. Learn practical techniques to systematically improve data quality, accessibility, and governance
4. Develop a roadmap that aligns technical data preparation with business objectives and use cases
5. Distinguish between requirements for efficiency-focused AI versus growth-oriented AI applications
Presenter:
Anurag Arora, Director of Business Strategy & Operations for Data & AI, TELUS
About the Speaker:
Anurag is the Director of Business Strategy & Operations for Data & AI at TELUS, where he leads enterprise-wide artificial intelligence strategy and transformation initiatives.
In his role at TELUS, he oversees AI literacy transformation, governance frameworks, and value creation through strategic data and AI implementations. He specializes in translating complex AI initiatives into actionable growth strategies and contributes to TELUS’ AI advancement through strategic partnerships and thought leadership initiatives.
With over 16 years of experience in the telecom, media and technology sectors across international markets, Anurag has helped organizations drive business growth, innovation, and digital evolution through strategic initiatives.
Anurag holds an MBA from the University of Toronto and an undergraduate degree from York University.
Track: Executive Track
Technical Level: 200 – Intermediate
Abstract:
In this talk, we'll explore our tested approach to identifying, evaluating, and solving complex business challenges in today's rapidly changing AI-driven landscape. We'll discuss our approach to use case prioritization and definition, and an AI engagement model that emphasizes rapid iteration and measurable outcomes. We'll also touch on a few use cases where this cross-functional approach enabled fast learning and course correction. The session concludes with a discussion of how we measure ROI and AI effectiveness. This presentation offers practical insights for leaders seeking to transform their organization's problem-solving capabilities while ensuring tangible returns on AI investments.
What You’ll Learn:
Key steps of successful AI application include identifying and prioritizing use cases; AI engagement models that emphasize rapid iteration; and measuring the ROI of AI.
Presenter:
Ikhtear Bhuyan, Data Security, IBM
About the Speaker:
Technical Executive and Security Architect with more than 16 years of experience in the IT field working in software development, solution architecture, and presales technical support. I work directly with clients, business partners, and consulting firms across Canada and the Caribbean to assess, plan, design, and build cybersecurity solutions focused on SIEM, SOAR, EDR, Data Security (Governance, Risk, Privacy and Compliance), Identity and Access Management, and SOC technologies for different industries and standards (ISO, CIS, PCI, HIPAA, NERC-CIP, NIST, FFIEC, SOC 2, etc.). With a deep understanding of both business and technical security challenges, I provide strategic, thought-leadership advice to organizations. Proven track record of driving technical wins and serving as a trusted advisor to Fortune 500 companies.
Track: Executive Track
Technical Level: 300 – Advanced
Abstract:
As organizations rapidly adopt AI models to enhance business processes and accelerate time-to-market for products and services, cybersecurity and business leaders face new challenges. The rise of generative AI offers resilience and innovation, but it also introduces emerging risks that must be managed effectively.
In this session, we will explore strategies for securing the AI pipeline, with a focus on protecting data from the preparation stage through model development and final deployment into production applications and services. Attendees will gain insights into how comprehensive data security can be achieved by leveraging industry best practices and advanced technical approaches.
Key focus areas:
– Protecting sensitive data in AI models across databases (Db2, Big Data, NoSQL, Vector DBs).
– Preventing unintentional exposure by monitoring data copies across applications and cloud environments.
– Detecting data leakage to safeguard PII, PCI, and intellectual property.
– Ensuring compliance with regulations like LGPD by monitoring data transfers.
– Reducing compliance challenges to meet evolving AI and data regulations.
What You’ll Learn:
– Understand the adoption of GenAI in the market and how to defend against new types of attacks and vulnerabilities introduced by GenAI models in a secure and trustworthy manner
– Understanding of how to secure the data, secure the AI models and secure the usage of the models across the AI pipeline
– Industry best practices and technical approaches to secure the AI pipeline by leveraging the right data protection principles
Presenter:
Tanushree Nori, Principal Data Scientist, Vimeo
About the Speaker:
Tanushree Nori is a Principal Data Scientist at Vimeo, where for the past 4½ years she’s built LLM-powered features—Video Insights, Chapters, Summaries, Highlights, and multilingual Dubbing—that help viewers unlock more value from every upload. Her earlier work on cloud-storage optimization now saves the platform about $1 million annually and was showcased at Demuxed 2024 in San Francisco. When she isn’t dissecting LLM evals for fun, Tanushree is dancing—bringing her Indian classical foundation into hip-hop and house for a trippy, vibrant fusion.
Track: AI for Productivity Enhancements
Technical Level: 300 – Advanced
Abstract:
Imagine every video greeting viewers in their own language—no studio booth, no red-eye caption sprints. Vimeo’s new pipeline turns a single upload into time-locked captions and natural-sounding dubs, almost as fast as the video plays.
1. Gemini Flash 2.0 handles translating transcripts fast enough that you can watch progress in real-time.
2. Careful chain-of-thought prompting coaxes phoneme details of translations, so we can contract roomy German syllables or subtly expand packed Mandarin ones before the subtitles and dubs wander off-beat.
3. A chunking strategy that pins every subtitle segment to its timestamp, keeping drift under 10 ms (see the sketch at the end of this section).
4. Spot a rare error in the subs? Segment-level re-edit lets you fix a single line; only that slice gets re-translated (and re-dubbed if so wished).
5. A creative and thorough eval framework to run translation experiments.
We’ll share the system design involved, prompting tricks, the timing math, and a few war stories when subs and dubs went rogue—plus the metrics and eval methodology that convinced us the system was ready for production.
What You’ll Learn:
A reasonably detailed how-to on turning a single video into accurate subtitles and dubs in many languages—using LLM prompt tricks, phoneme-aware timing, and a chunk-based pipeline that stays fast, editable, and production-ready.
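A minimal sketch of the chunk-and-pin idea from point 3 above, assuming a hypothetical translate_fn and segment format; it illustrates the general approach of translating text per chunk while leaving timestamps untouched, not Vimeo's actual pipeline.

```python
# Sketch of chunked translation with pinned timestamps: subtitle segments keep
# their original start/end times, only the text is sent for translation, and
# the timestamps are reattached afterwards so timing cannot drift.
from typing import Callable, Dict, List

Segment = Dict[str, object]   # e.g. {"start": 12.4, "end": 15.1, "text": "..."}

def translate_chunked(
    segments: List[Segment],
    translate_fn: Callable[[List[str]], List[str]],   # e.g. an LLM translation call
    chunk_size: int = 20,
) -> List[Segment]:
    out: List[Segment] = []
    for i in range(0, len(segments), chunk_size):
        chunk = segments[i:i + chunk_size]
        translated = translate_fn([s["text"] for s in chunk])   # text only
        # Reattach each translation to its original, untouched timestamps.
        out.extend(
            {"start": s["start"], "end": s["end"], "text": t}
            for s, t in zip(chunk, translated)
        )
    return out

# Segment-level re-edit (point 4): only the edited line is re-translated.
def reedit_segment(segments: List[Segment], idx: int,
                   translate_fn: Callable[[List[str]], List[str]]) -> None:
    segments[idx]["text"] = translate_fn([segments[idx]["text"]])[0]
```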
Presenter:
Arash Salehzadeh, Data Scientist, Dropbox
About the Speaker:
I am a Data Scientist at Dropbox. My work focuses mainly on building NLP applications to analyze text data at scale. Prior to Dropbox, I worked as a data scientist at Bayer Pharmaceuticals and TD Bank.
Track: AI for Productivity Enhancements
Technical Level: 200 – Intermediate
Abstract:
Generative Large Language Models (LLMs) have transformed the landscape of Natural Language Processing (NLP). However, smaller transformer-based models such as BERT and its variants remain crucial in both supervised and unsupervised tasks. This talk explores how LLMs and traditional models can work together to create more effective NLP workflows. We will discuss how LLMs can significantly reduce the cost and effort of data labeling, enabling faster and more efficient training of classification models. Additionally, we examine unsupervised problems and discuss how LLMs can be integrated with traditional text clustering methods to extract meaningful insights from unstructured data at scale.
What You’ll Learn:
– How to efficiently create labeled datasets via Active Learning and enhance the process with LLMs
– How LLMs can simplify and scale text clustering frameworks by summarizing and extracting key insights from clusters. In addition, we explore techniques to improve the accuracy of text clustering using LLMs.
While GenAI applications are increasing rapidly, they are not replacing traditional NLP frameworks. Instead, the two can complement each other to build more powerful, scalable, and efficient NLP systems.
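A minimal sketch of the clustering-plus-LLM workflow described above, assuming a hypothetical call_llm() helper: cluster documents with a traditional method (TF-IDF plus k-means here), then ask an LLM to name each cluster from a few sample documents. The same pattern works with dense embeddings, or with LLM-generated labels feeding a supervised classifier.

```python
# Cluster documents with a traditional method, then let an LLM label the clusters.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for your LLM client.
    return "<short cluster label>"

docs = ["reset my password", "billing overcharge", "forgot login", "refund request"]
X = TfidfVectorizer().fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

for cluster_id in sorted(set(labels)):
    samples = [d for d, l in zip(docs, labels) if l == cluster_id][:5]
    name = call_llm("Give a 3-word topic label for these support tickets:\n- "
                    + "\n- ".join(samples))
    print(cluster_id, name, samples)
```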
Presenter:
Phani Dathar, Ph.D, Director, Graph Data Science, Neo4j
About the Speaker:
Phani is a Director of Graph Data Science at Neo4j. He is a computational scientist and holds a PhD in Nanotechnology and Computational Materials Science. After a decade of experience in computational sciences research in both industry and academia, he transitioned to a career in data science and machine learning, and over the past ten years he has worked as a consultant, architect, and advisor in the AI/ML space. Currently, he is with Neo4j, helping prospects and customers design, architect, and develop applications using graph technology, graph analytics, and GraphRAG.
Track: AI for Productivity Enhancements
Technical Level: 200 – Intermediate
Abstract:
Graph-based Retrieval-Augmented Generation (GraphRAG) combines the rich knowledge representation of knowledge graphs with the advanced natural language generation capabilities of LLMs, making it an essential component in building contextual, accurate, and reliable GenAI applications. By integrating multiple retrieval mechanisms and autonomous agents, we can build more advanced enterprise-grade search and decision-making AI applications. In this presentation, we will demonstrate the application of GraphRAG using knowledge graphs built on Neo4j, LLM APIs, and the agentic approach. Attendees will gain insights into the practical applications of GraphRAG and learn how to build GenAI applications that are smarter, adaptive, and more efficient.
What You’ll Learn:
Discover how GraphRAG leverages knowledge graphs, LLMs, and autonomous agents to build smarter, more reliable GenAI applications. Learn practical techniques for enterprise-grade search and decision-making using Neo4j and the agentic approach.
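A minimal GraphRAG-style sketch using the official Neo4j Python driver: pull a small subgraph of facts about an entity with Cypher, then hand those facts to an LLM as grounding context. The connection details, the (:Company) schema, and call_llm() are hypothetical and would be adapted to your own graph and LLM client.

```python
# Retrieve graph facts about an entity and use them as grounding context.
from neo4j import GraphDatabase

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for your LLM client.
    return "<grounded answer>"

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def graph_facts(entity_name: str, limit: int = 25) -> list[str]:
    query = (
        "MATCH (c:Company {name: $name})-[r]->(n) "
        "RETURN c.name AS s, type(r) AS p, coalesce(n.name, n.title) AS o LIMIT $limit"
    )
    with driver.session() as session:
        records = session.run(query, name=entity_name, limit=limit)
        return [f"{rec['s']} {rec['p']} {rec['o']}" for rec in records]

def graph_rag_answer(question: str, entity_name: str) -> str:
    facts = "\n".join(graph_facts(entity_name))
    return call_llm(f"Answer using only these graph facts:\n{facts}\n\nQuestion: {question}")

print(graph_rag_answer("Who supplies this company?", "Acme Corp"))
```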
Presenter:
Gautham Anil, Senior Data Scientist, Vimeo
This entry's track, technical level, abstract, and What You'll Learn are identical to the Vimeo multilingual captions and dubbing session listed above.
Presenter:
Paul Beaton, Senior Manager, Data & Analytics, Rogers Communications Inc.
About the Speaker:
Paul Beaton is an accomplished data and analytics leader with a passion for transforming data into actionable insights that drive business impact. As Senior Manager of Data and Analytics at Rogers Communications, Paul leads a high-performing team of analysts and engineers, spearheading innovative initiatives in pricing strategy, operational efficiency, and enterprise data transformation.
Track: AI for Data Preparation and Processing
Technical Level: 300 – Advanced
Abstract:
This research explores the efficacy of employing Large Language Models (LLMs) for imputing missing data, a prevalent challenge in recommender systems. I evaluate the impact of LLM-based imputation on Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) across a regression model, as outlined by Zhicheng et al. 2024. The performance of LLM imputation is rigorously compared against traditional methods, including case-wise deletion, zero imputation, mean imputation, KNN, and multivariate methods. Expanding upon the aforementioned study, I further investigate the lift in predictive accuracy achieved by LLM imputation under varying degrees of missing data: 10%, 15%, and 20%. My findings demonstrate the potential of LLMs to enhance data imputation, offering improved accuracy and robustness in recommender systems, particularly as the sparsity of data increases.
What You’ll Learn:
LLMs might be more effective at imputing data than traditional methods, particularly when a dataset has a lot of missing values
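A minimal sketch of the evaluation set-up described in the abstract, run on synthetic data: hide a fraction of known ratings, impute them with baseline methods, and score the imputations with MAE and RMSE against the held-back truth. An LLM-based imputer would slot in as just another imputer; it is omitted here because it depends on your LLM client.

```python
# Compare baseline imputers (mean, KNN) on masked synthetic ratings with MAE/RMSE.
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(200, 20)).astype(float)   # users x items

def mask(data: np.ndarray, frac: float) -> np.ndarray:
    masked = data.copy()
    holes = rng.random(data.shape) < frac
    masked[holes] = np.nan
    return masked

def score(true: np.ndarray, observed: np.ndarray, filled: np.ndarray) -> tuple[float, float]:
    holes = np.isnan(observed)
    err = filled[holes] - true[holes]
    return float(np.mean(np.abs(err))), float(np.sqrt(np.mean(err ** 2)))   # MAE, RMSE

for frac in (0.10, 0.15, 0.20):                    # degrees of missingness from the study
    observed = mask(ratings, frac)
    for name, imputer in [("mean", SimpleImputer(strategy="mean")),
                          ("knn", KNNImputer(n_neighbors=5))]:
        mae, rmse = score(ratings, observed, imputer.fit_transform(observed))
        print(f"{frac:.0%} missing | {name}: MAE={mae:.3f} RMSE={rmse:.3f}")
```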
Presenters:
Annie En-Shiun Lee, Assistant Professor, Ontario Tech University & University of Toronto (Lee Language Lab) | Mason Shipton, Programming Analyst, Ontario Teachers' Pension Plan | Alice Luong, Programmer, Ontario Government | Labib Rahman, Senior Product Manager – Interoperability, Zocdoc | York Hay Ng, Research Project Lead, Department of Computer Science, University of Toronto | Khasir Hean, Founder in Residence, Antler
About the Speakers:
Annie En-Shiun Lee is an Assistant Professor at OntarioTech University and status-only at the University of Toronto. She leads the Lee Language Lab (L³), where her research focuses on multilinguality and language diversity. Her work has appeared in venues such as Nature Digital Medicine, ACL, and ACM Computing Surveys. She has experience bridging academia and industry, previously holding roles at VerticalScope and Stradigi AI, and serves as demo co-chair for NAACL.
Mason Shipton led the development of the URIEL+ knowledge base and spearheaded the creation of a new typological resource for writing systems. This work has improved NLP performance across languages and contributed to two ACL conference papers, including one that won the Audience Award in 2024. Mason's research interests lie at the intersection of multilingual NLP, linguistic diversity, and inclusive language technology.
Alice Luong is a Computer Science and Concurrent Education student at York University and a member of the Lee Language Lab. She has expressed interest in education, language learning, and the use of tools like Google Translate to support multilingual education.
Labib Rahman is a researcher with interests in multilingual NLP and inclusivity. He authored the “Disability Language Guide,” approved by the Stanford Disability Initiative, and has participated in design thinking and innovation workshops.
York Hay Ng is a Project Developer at UofT Blueprint and a contributor to the URIEL+ project, which enhances linguistic inclusion and usability in typological and multilingual knowledge bases.
Khasir Hean is a machine learning engineer with a background in computer science and linguistics, and a founder in residence at Antler.
Track: Future Trends
Technical Level: 300 – Advanced
Abstract:
URIEL+ is an enhanced language knowledge base offering typological, phylogenetic, and geographical features for over 4,000 languages. This talk introduces URIEL+’s capabilities, including customizable distance calculations and better performance in downstream tasks. We also present ExploRIEL, an intuitive interface for easy access to URIEL+, making linguistic analysis more accessible. We explore how feature-importance techniques like PCA optimize URIEL+’s data to enhance task performance, providing a powerful tool for building more inclusive language technologies.
What You’ll Learn:
Attendees will learn how URIEL+ and its tools can be used to efficiently integrate rich linguistic features into NLP pipelines, enabling more inclusive, scalable, and high-performing multilingual systems – without needing deep linguistic expertise or manual tuning.
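As a generic illustration of the idea behind customizable, PCA-optimized language distances (not the URIEL+ or ExploRIEL API), the sketch below reduces a toy matrix of made-up typological features and compares languages by cosine distance in the reduced space.

```python
# Toy illustration: PCA-reduce typological feature vectors, then compare languages.
# The feature matrix is invented for illustration; it is not URIEL+ data.
import numpy as np
from sklearn.decomposition import PCA

languages = ["eng", "deu", "cmn", "swa"]
features = np.array([   # rows: languages, columns: hypothetical typological features
    [1, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
], dtype=float)

reduced = PCA(n_components=2).fit_transform(features)   # keep the most informative axes

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

for i, lang_a in enumerate(languages):
    for j in range(i + 1, len(languages)):
        print(lang_a, languages[j], round(cosine_distance(reduced[i], reduced[j]), 3))
```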
Presenter:
Bonnie Li, Research Engineer, Google DeepMind
About the Speaker:
Bonnie is a researcher at Google DeepMind working on frontier AI models and building generally intelligent agents. She worked on the Gemini Thinking models and Genie 2, and is interested in how RL can unlock new capabilities in large-scale models. Previously she worked at Nvidia and co-founded a deep tech startup backed by Khosla Ventures.
Track: Future Trends
Technical Level: 100 – Beginner
Abstract:
The dream of general AI agents—capable of learning, adapting, and acting across any task or environment—is within reach. This talk traces a path toward this vision through world models and reinforcement learning. Genie 2 introduces a new paradigm of foundation world models – generating 3D interactive worlds from any text or image. LMAct extends large language models into interactive agents by grounding them in environments. Reinforcement Learning unlocks new capabilities in LLMs by enabling LLMs to autonomously optimize rewards. Together, these developments set the stage for a new generation of agents—capable of reasoning, acting, and self-improving across diverse domains.
What You’ll Learn:
– How world models work
– How RL improves LLMs in specific tasks
Presenter:
Melica Mirsafian, Senior Applied Scientist, Thomson Reuters
About the Speaker:
Melica Mirsafian is a Senior Applied Scientist at Thomson Reuters Labs, where she leads research and development of cutting-edge Machine Learning and Generative AI solutions for highly regulated industries. Her expertise spans deploying compliant GenAI systems for U.S. Public Records and pioneering ML innovations for Thomson Reuters’ flagship legal research platform, Westlaw. With a focus on balancing technical innovation and regulatory requirements, Melica specializes in creating trusted AI systems that transform how legal and financial professionals access, analyze, and leverage critical information.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
In the high-stakes world of financial due diligence and public records analysis, there’s no room for error. Yet the promise of GenAI to transform massive 500+ page reports into actionable 3-5 page summaries was too compelling to ignore. This presentation chronicles Thomson Reuters Labs’ journey deploying the CLEAR Risk Analysis Summary tool—a GenAI solution that navigates the treacherous waters of regulatory compliance, data licensing restrictions, and the inherent challenges of language models in environments where accuracy isn’t just preferred, but legally mandated.
I’ll share our battle-tested approaches for taming hallucinations when business reputations and financial decisions hang in the balance, implementing precise citation protocols for every claim, and developing specialized prompting techniques that maintain neutrality while still highlighting genuine risks. You’ll learn how we balanced the integration of licensed data with real-time web information while ensuring FedRAMP compliance, and how our user-centric development approach led to a successfully deployed product that’s already transforming workflows for financial institutions and due diligence professionals.
Whether you’re considering GenAI deployment in financial services, healthcare, legal, or any regulated industry where precision matters, this case study offers practical strategies for navigating the technical, ethical, and regulatory minefields that stand between concept and successful deployment.
What You’ll Learn:
1. Hallucination Management in High-Stakes Environments: Practical techniques for preventing multiple types of AI hallucinations in regulated industries – from factual inaccuracies to entity confusion between similar businesses – where errors carry significant legal, financial, and reputational consequences.
2. Navigating Data Licensing and Compliance in the GenAI Era: Strategies for deploying effective GenAI solutions while respecting restrictive data licensing frameworks and regulatory requirements, including approaches when vendor restrictions limit AI use of their data.
3. Balancing Neutrality with Effective Risk Communication: Techniques for controlling LLM tone and preventing unwanted advice-giving while still effectively communicating business risks, ensuring outputs maintain the precise language standards required in due diligence contexts.
4. Multi-Source Integration with User-Centric Development: How we blend licensed data, internal records, and real-time web sources while addressing challenges of verification and web manipulations, all guided by early user engagement that accelerates adoption in conservative sectors and builds trust through responsive development and demonstrable results.
Presenter:
Amanda Milberg, Principal Solutions Engineer, DoubleWord (formerly TitanML)
About the Speaker:
Amanda is a Principal Solutions Engineer at DoubleWord with a strong interest in Generative AI business solutions. Amanda has a proven track record of assisting large institutions in business transformation efforts in the advanced analytics space and an innate ability to explain deep technical concepts to a broad audience. Amanda has a bachelor’s degree in Computer Science and Mathematics from Colgate University.
Track: GenAI Deployments In Regulated Industries
Technical Level: 200 – Intermediate
Abstract:
Generative AI and Agentic AI hold the potential to revolutionize everyday business operations. However, for highly regulated enterprises, security and privacy are non-negotiable and shared LLM API services do not provide a transparent solution. In this session, we will explore the open-source landscape and identify various applications where owning your own stack can lead to enhanced data privacy and security, greater customization, and cost savings in the long run.
Our talk will take you through the entire process, from idea to implementation, guiding you through selecting the right model, deploying it on suitable infrastructure, and ultimately building a robust AI agent. By the end of this session, attendees will gain practical insights to enhance their ability to develop high-value Generative AI applications. You will leave with a deeper understanding of how to empower your organization with self-hosted solutions that prioritize control, customization, and compliance.
What You’ll Learn:
Participants will gain a clear understanding of the options available for using large language models in highly regulated industries, particularly through self-hosting solutions. Additionally, attendees will have the opportunity to witness a live demo, transforming the session from a purely conceptual experience into a truly hands-on talk.
Presenter:
Matthieu Lemay, Co-Founder & AI Strategist, Lemay.ai
About the Speaker:
Matt Lemay is a leading expert in AI governance, compliance, and machine learning deployment for highly regulated industries. As the Co-Founder of Lemay.ai, he specializes in designing AI systems that align with ISO 42001, the EU AI Act, and other global regulatory frameworks. With deep expertise in finance, healthcare, aerospace, and defence, Matt helps organizations navigate the challenges of safe, transparent, and ethical AI implementation.
Areas of Expertise:
AI Regulation & Governance – Ensuring compliance with ISO 42001, the EU AI Act, and global AI policies.
AI in Regulated Industries – Practical experience deploying AI in high-stakes sectors like MedTech, finance, and defence.
Machine Learning Risk Management – Strategies for bias mitigation, explainability, and security in AI systems.
AI Strategy & Policy – Helping organizations adapt to emerging AI regulations and compliance challenges.
Matt is a certified ISO 42001 lead auditor advocating for responsible AI development. He speaks at global industry events, contributing to discussions on AI ethics, policy, and the future of AI governance. His work has influenced AI adoption policies in North America and Europe, making him a sought-after speaker for leaders, policymakers, and AI practitioners.
Upcoming Engagements:
Swiss Biotech Day 2025 – Speaking on AI compliance in MedTech.
Aeromart Montréal – Discussing AI risk management in aerospace and defence.
Toronto Machine Learning Summit – Presenting on ISO 42001 & the EU AI Act.
Halifax Energy Summit – Exploring AI’s role in energy efficiency and sustainability.
Dubai MedTech Conference – Addressing AI in healthcare and regulatory compliance.
Track: GenAI Deployments In Regulated Industries
Technical Level: 100 – Beginner
Abstract:
Navigating AI Compliance – ISO 42001, the EU AI Act, and the Future of Regulated AI
As artificial intelligence becomes increasingly integrated into critical sectors such as healthcare, finance, aerospace, and defence, the demand for standardized AI governance has never been higher. ISO 42001, the first international standard for AI management systems, alongside the EU AI Act, represents a shift towards regulatory oversight that prioritizes safety, transparency, and accountability in AI systems.
In this session, Matt Lemay, Co-Founder of Lemay.ai, will explore the intersection of machine learning and regulatory compliance, offering insights into how businesses can align their AI innovations with evolving global standards. The talk will cover:
– Key provisions of ISO 42001 and the EU AI Act and their implications for AI practitioners.
– Challenges in implementing compliant AI systems, including bias mitigation, security, and ethical considerations.
– Lessons from highly regulated industries, such as aerospace, MedTech, and finance, on deploying AI safely.
– Best practices for AI risk management and governance, ensuring models remain explainable, auditable, and compliant.
– Future trends in AI policy and compliance, helping businesses prepare for the next wave of AI regulation.
What You’ll Learn:
– Understanding ISO 42001 & the EU AI Act – How these frameworks shape AI governance and compliance.
– Risk & Compliance in AI Development – Addressing bias, transparency, and accountability in machine learning models.
– AI in Regulated Industries – Lessons from deploying AI in healthcare, finance, defence, and aerospace under strict regulations.
– Building Trustworthy AI – Strategies for ensuring safety, security, and ethical AI deployment.
– The Future of AI Regulation – How global policies evolve and what companies must prepare for.
Presenters:
Hoora Fakhrmoosavy, Senior Manager Research Data Science & Gen AI, BMO | Preetinder Singh, Manager, Generative AI, BMO
About the Speakers:
Dr. Hoora Fakhrmoosavy is a distinguished AI researcher and Senior Manager of Research and Data Science – Gen AI at BMO Financial Group. With a deep expertise in Natural Language Processing (NLP), Generative AI, and Data Science, she has led many projects focused on AI-driven innovation in financial services. Her work includes NLP projects, retrieval-augmented generation (RAG) systems, AI-powered chatbots, automated decision-making, and customer analytics.
Beyond her corporate role, Dr. Fakhrmoosavy is a professor, mentoring the next generation of AI professionals. She is passionate about bridging the gap between cutting-edge AI research and real-world applications, ensuring AI technologies drive meaningful business and societal impact.
Dr. Fakhrmoosavy has been recognized with multiple Spotlight Awards for her outstanding leadership and contributions to AI-driven transformation. She is also actively working in AI governance for NLP projects, ensuring the ethical and responsible deployment of AI systems in enterprise environments.
Holding a Ph.D. in AI and Data Science, she has expertise in large-scale machine learning, foundation models, NLP, and AI-driven automation. As an advocate for AI innovation, she continues to push the boundaries of NLP, Generative AI, and Data Science research to solve complex challenges and shape the future of AI-powered solutions.
Preetinder Singh is a results-driven Data Science Specialist at BMO U.S., with a strong background in machine learning, artificial intelligence, and enterprise analytics. He specializes in leveraging cutting-edge AI technologies to develop scalable solutions that enhance business decision-making and operational efficiency.
At BMO, Preetinder has played a key role in designing and implementing advanced AI-driven applications, including:
– Retrieval-Augmented Generation (RAG) Systems: Developed RAG models that enhance traditional AI-generated responses by integrating real-time, context-specific data retrieval, improving accuracy and reliability.
– Multimodal AI Chatbots: Built intelligent chatbots capable of processing both text and image-based inputs, enabling a more comprehensive and interactive customer service experience.
– Fine-Tuning Large Language Models (LLMs): Customized and fine-tuned LLMs to optimize their performance for specific financial and business applications, ensuring efficiency and relevance in enterprise AI solutions.
With a strong foundation in data engineering, model development, and cloud computing, Preetinder is committed to pushing the boundaries of AI innovation. His expertise lies in bridging the gap between research and practical implementation, ensuring that data-driven strategies translate into tangible business impact.
Outside of work, he actively explores advancements in generative AI, deep learning, and cloud-based AI solutions, continuously expanding his knowledge to stay at the forefront of technological evolution.
Track: AI for Productivity Enhancements / GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
Introduction: In the evolving landscape of banking, operational efficiency and accurate information retrieval are critical for our employees. To address this, we led the development of a proof-of-concept (POC) Agentic AI chatbot, leveraging generative AI and retrieval-augmented generation (RAG) to enhance productivity of our employees. This chatbot is highly versatile, adapting its functionality based on user queries to provide comprehensive support.
Innovation and Approach: Unlike traditional chatbots, this AI-powered system dynamically adjusts to user needs, acting as a multi-functional assistant. Built using Amazon Bedrock, it employs cutting-edge foundation models to ensure accurate and context-aware responses, and it incorporates validation mechanisms to maintain high response quality.
Key Features
– Multifunctional Capabilities: Depending on user queries, the chatbot can function as:
– Code Generator: Assisting with script generation and automation support.
– KPI Chatbot: Extracting and analyzing key performance indicators for decision-making.
– Translator: Offering multilingual translation for internal communication and document processing.
– Topic Modeling & Summarization: Identifying key themes in documents and generating concise summaries.
– Insights Generator: Extracting trends and insights from structured and unstructured data.
Impact and Results: As a proof of concept, the chatbot is demonstrating significant potential to enhance operational efficiency by reducing response times, improving accuracy, and enabling banking professionals to handle complex tasks with greater confidence, increasing their productivity.
We are eager to share the journey of building and validating this innovative chatbot POC at this AI summit. Our talk will offer valuable insights into:
– Architecture & Functionality of an Agentic multifunctional AI chatbot
– Overcoming challenges in AI adoption within regulated environments
– Lessons learned from the POC and pathways for scaling AI-driven solutions
By showcasing this transformative initiative, we aim to inspire and educate professionals on leveraging AI to drive operational efficiency in banking.
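To make the query-dependent behaviour described above concrete, here is a minimal, hypothetical sketch of routing a user query to a task-specific system prompt before calling Amazon Bedrock's Converse API. The routing rules, prompts, and model ID are illustrative assumptions, not the presenters' production design.

```python
# Hedged sketch: route a user query to a task-specific prompt, then call Bedrock.
# Routing rules, prompts, and model ID are illustrative placeholders only.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

TASK_PROMPTS = {
    "code": "You are a code generator. Return only runnable code with brief comments.",
    "kpi": "You are a KPI analyst. Extract metrics and explain their business meaning.",
    "translate": "You are a translator. Translate the text, preserving formatting.",
    "summarize": "You are a summarizer. Produce a concise, faithful summary.",
}

def route(query: str) -> str:
    """Pick a task persona from simple keyword rules (a real system might use a classifier)."""
    q = query.lower()
    if "translate" in q:
        return "translate"
    if any(k in q for k in ("kpi", "metric", "performance indicator")):
        return "kpi"
    if any(k in q for k in ("code", "script", "function")):
        return "code"
    return "summarize"

def ask(query: str) -> str:
    task = route(query)
    response = bedrock.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model ID
        system=[{"text": TASK_PROMPTS[task]}],
        messages=[{"role": "user", "content": [{"text": query}]}],
        inferenceConfig={"temperature": 0.2, "maxTokens": 512},
    )
    return response["output"]["message"]["content"][0]["text"]

print(ask("Write a Python script that loads a CSV and reports null counts."))
```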
What You’ll Learn:
We would like to show how an Agentic AI-powered chatbot can significantly boost employee productivity, and why it matters to design AI systems that adapt to diverse business needs, from code generation to KPI analysis. We will also share strategies for successfully implementing AI within a regulated industry like banking, along with the lessons we learned during development.
Presenter:
Amin Atashi, Senior Machine Learning Engineer, The Globe and Mail
About the Speaker:
Amin Atashi is a dedicated AI researcher and engineer specializing in generative AI. He focuses on hosting and scaling generative AI solutions that drive innovation in digital media, with a particular emphasis on transforming the news industry. Drawing on his background in optimization models and sensor fusion, Amin develops practical, scalable approaches that address real-world challenges. He has shared his insights at events like the Enterprise AI Summit Canada and the Generative AI Summit Toronto, always striving to make advanced AI solutions more accessible and impactful.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
Deploying generative AI bots in the real world is an exciting yet complex journey. In this talk, I’ll walk through the practical challenges of building, scaling, and maintaining a production-grade GenAI bot—covering everything from prompt engineering and hallucination control to infrastructure, cost management, and monitoring. Drawing from firsthand experience deploying a GenAI bot for a national media platform, I’ll share lessons learned, pitfalls to avoid, and strategies that helped turn a prototype into a reliable, user-facing product. Whether you’re just exploring GenAI or working toward deployment, this talk will offer actionable insights and hard-earned takeaways.
What You’ll Learn:
– Practical strategies for transitioning generative AI bots from prototypes to scalable, production-ready systems.
– An understanding of the key infrastructure challenges—such as hosting, latency, and reliability—and how to overcome them.
– Real-world case studies and lessons learned that illustrate how to manage and optimize AI deployments.
– Insights into the iterative process of refining AI models for robust, user-focused applications.
– How these strategies can be applied across industries to unlock the full potential of generative AI.
Presenters:
Mehdi Rezagholizadeh, Principal Member of Technical Staff, AMD | Vikram Appia, Director Software Development, AMD
About the Speakers:
Mehdi is a Principal Member of Technical Staff at AMD. Before joining AMD, he was a Principal Research Scientist at Huawei Noah’s Ark Lab Canada, where he had worked since 2017 and led the Canada NLP team for over six years. He focuses on efficient deep learning for NLP, computer vision, and speech, developing streamlined solutions for training, architecture, and inference.
Mehdi holds about 20 patents and has authored over 50 publications in leading conferences and journals, including TACL, NeurIPS, AAAI, ACL, NAACL, EMNLP, EACL, Interspeech, and ICASSP. Additionally, he has actively contributed to the academic and industrial communities by organizing prominent workshops, such as the NeurIPS Efficient Natural Language and Speech Processing (ENLSP) workshops (2021-2024) and by serving on technical committees for ACL, EMNLP, NAACL, and EACL, including as Area Chair and Senior Area Chair for NAACL 2024.
Vikram currently leads the Efficient GenAI team within the AI group at AMD. The team’s charter is to enable efficient inference and training at scale and to release open-source models and recipes that help the community maximize performance on AMD GPUs. His team focuses on training and inference for various GenAI applications across LLMs, image/video generation, and multimodal models.
Prior to joining AMD, Vikram spent about seven years at Rivian Automotive, where his team was responsible for the development and execution of both the on-board and off-board (auto-labeling) perception stack for all Rivian vehicle programs. Before that, at Texas Instruments in Dallas, he was a technical lead in the Perception and Analytics R&D labs.
He received his MS and PhD in ECE from the Georgia Institute of Technology, Atlanta. He has authored over 40 US patents and over 20 articles in refereed conferences and IEEE journals.
Track: Hardware Platforms
Technical Level: 400 – Expert
Abstract:
Large foundation models—spanning large language models (LLMs), vision models, and multi-modal models—have revolutionized both academic research and industrial applications in AI. The computational power of GPUs has played a significant role in the success of these models, impacting their development, training, and inference. As the scope of these foundation models continues to expand, the choice of model architecture, training methods, training data, and hardware computational resources becomes increasingly vital. This presentation explores various training efforts, such as pre-training, fine-tuning, and post-training, using AMD Instinct GPUs. We will delve into our public training dockers, highlighting key features designed to enhance user experience and improve accessibility. The journey begins with pre-training methodologies, where we will review a snapshot of model performance metrics and demonstrate the benefits of leveraging Multi-GPU setups. Next, we will cover fine-tuning solutions, including Full Weight Fine-tuning and Parameter Efficient Tuning (PEFT) using Megatron-LM and HF-PEFT, showcasing how the larger HBM memory of the MI300X can lead to improved training performance and accuracy. Finally, we will address post-training strategies, including the innovative process of distilling multi-head attention (MHA) into more efficient solutions such as Mamba and Multi-Head Latent Attention (MLA) layers, aimed at optimizing model efficiency and deployment readiness. Our talk will provide practical insights and frameworks for implementing these advanced techniques alongside AMD Instinct GPUs.
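As a small, hedged illustration of the PEFT portion of this workflow, the sketch below attaches LoRA adapters to a causal LM with Hugging Face peft; the model name, target modules, and hyperparameters are placeholders, and the same code is hardware-agnostic (it runs on ROCm builds of PyTorch for AMD Instinct GPUs as well as on CUDA).

```python
# Minimal PEFT (LoRA) sketch with Hugging Face Transformers + peft.
# Hardware-agnostic: the same code runs on ROCm (AMD Instinct) or CUDA builds
# of PyTorch. Model name and hyperparameters are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.1-8B"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

lora_cfg = LoraConfig(
    r=16,                                  # low-rank adapter dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()         # only the adapter weights are trainable
# From here, train with the Trainer/TRL setup of your choice on your fine-tuning data.
```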
What You’ll Learn:
Attendees will learn about the entire deep learning workflow from pre-training to post-training while leveraging the capabilities of AMD Instinct GPUs, particularly the MI300X. They will gain insights into effective training strategies, including Full Weight Fine-tuning and Parameter Efficient Tuning (PEFT), and understand the benefits of Multi-GPU setups for enhancing model performance. Additionally, the presentation will cover innovative post-training optimization techniques, such as distilling multi-head attention into more efficient structures, equipping attendees with practical frameworks to apply in their own AI projects.
Presenter:
Soumye Singhal, Research Scientist, NVIDIA
About the Speaker:
Soumye Singhal is a Research Scientist at NVIDIA, focusing on LLM post-training and alignment for Nemotron models. Recently he has contributed to the development of Llama-Nemotron reasoning models and Nemotron-Hybrid models. His research focuses on enhancing LLM performance through inference-time compute scaling and preference optimization techniques. Prior to joining NVIDIA, he completed his Master’s at Mila under Aaron Courville and his undergraduate studies at IIT Kanpur.
Track: Inference Scaling
Technical Level: 300 – Advanced
Abstract:
This talk introduces Llama-Nemotron, an open-source family of reasoning models delivering state-of-the-art reasoning capabilities with industry-leading inference efficiency. Available in three sizes—Nano (8B), Super (49B), and Ultra (253B)—these models surpass existing open reasoning models such as DeepSeek-R1, offering substantial improvements in inference throughput and memory efficiency.
The presentation will focus primarily on the specialized training methodology underlying these models. This includes a two-stage post-training pipeline: supervised fine-tuning (SFT) using carefully curated synthetic datasets to effectively distill advanced reasoning behaviors, and large-scale reinforcement learning (RL) with curriculum-driven self-learning to enable models to exceed teacher performance.
Additionally, the talk will briefly highlight innovations such as neural architecture search (NAS) for enhanced model efficiency, targeted inference-time optimizations, and a dynamic toggle for switching reasoning on or off, emphasizing their practical importance in real-world enterprise deployments.
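As a hedged sketch of what an inference-time reasoning toggle can look like from the caller's side, the snippet below switches a system-prompt flag on an OpenAI-compatible endpoint. The exact toggle strings and model ID are assumptions for illustration; consult the model card for the released checkpoints.

```python
# Hedged sketch of an inference-time "reasoning toggle": the same model is
# steered into long chain-of-thought or terse answers via the system prompt.
# The toggle strings and model ID below are assumptions; check the model card.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")  # e.g. a local vLLM server

def ask(question: str, reasoning: bool) -> str:
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    out = client.chat.completions.create(
        model="nvidia/Llama-3.1-Nemotron-Nano-8B-v1",  # placeholder model ID
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
        temperature=0.6 if reasoning else 0.0,
    )
    return out.choices[0].message.content

print(ask("A train travels 120 km in 1.5 hours. What is its average speed?", reasoning=True))
```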
What You’ll Learn:
Attendees will learn about effective methods for training powerful reasoning models using reasoning-focused supervised fine-tuning (SFT) and large-scale reinforcement learning (RL), enabling inference-time scaling and efficiency.
Presenter:
Alex Grbic, Vice President of Software Engineering, Untether AI
About the Speaker:
Alex Grbic is an experienced software and semiconductor executive who brings valuable software development experience for complex software products to the Untether AI team. Prior to Untether AI, Alex enjoyed a storied career at Altera (and later Intel), delivering software development flows for heterogeneous computing and AI acceleration, and served as CTO of Deloitte’s Artificial Intelligence practice in Canada.
Track: Inference Scaling
Technical Level: 300 – Advanced
Abstract:
Optimizing AI inference requires more than just high-performance hardware—it demands a tightly integrated software stack that maximizes efficiency at every level. This talk will dive into the critical role of model optimization techniques, software-driven performance tuning, and compiler advancements in accelerating AI workloads. We will explore how quantization and workload partitioning impact performance-per-watt-per-dollar, and how modern inference compilers streamline deployment on purpose-built AI accelerators.
What You’ll Learn:
Attendees will gain a deeper understanding of the trade-offs in AI inference optimization and walk away with practical insights into maximizing efficiency on specialized silicon.
Presenter:
Haytham Abuelfutuh, Union.ai
About the Speaker:
Haytham Abuelfutuh is a co-founder and CTO of Union.ai. He co-authored the popular Flyte.org ML workflow orchestration system. Haytham has been designing distributed systems and cloud applications throughout his 15-year tenure at Microsoft, Google, and Lyft.
Track: Inference Scaling
Technical Level: 300 – Advanced
Abstract:
In this talk, we will discuss the challenges of running ultra-low latency Large Language Model (LLM) inference at scale, and approaches to overcoming these challenges. We will cover the unique challenges of LLM inference, such as large model sizes and key-value caching (KV cache). We will also discuss the obstacles teams may face when scaling LLM inference to handle large volumes of requests, including: the need for specialized hardware like GPUs and TPUs; efficiently scaling up; and the need for new routing architectures. Finally, we will share some of the solutions that we’ve adopted at Union to optimize the performance of inference workloads and present it in a cloud- and platform-agnostic way, so that you can implement it in your own infrastructure.
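One way to see why these challenges arise is a back-of-envelope estimate of KV-cache memory as concurrency grows; the sketch below uses illustrative, roughly 70B-class model dimensions (an assumption, not figures from the talk).

```python
# Back-of-envelope helper: why concurrent LLM requests strain GPU memory.
# Numbers below (layers, heads, dims) are illustrative for a ~70B-class model;
# substitute your model's real config.
def kv_cache_bytes(batch, seq_len, n_layers=80, n_kv_heads=8, head_dim=128, bytes_per=2):
    # 2x for keys and values, stored per layer, per token, per KV head.
    return 2 * batch * seq_len * n_layers * n_kv_heads * head_dim * bytes_per

for batch in (1, 8, 64):
    gb = kv_cache_bytes(batch, seq_len=8192) / 1e9
    print(f"batch={batch:3d}  KV cache ~= {gb:6.1f} GB at 8k context")
# The cache alone can dwarf the model weights as concurrency grows, which is why
# paged KV caches, smart routing, and scale-out strategies matter in production.
```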
What You’ll Learn:
-Scaling LLMs, which almost always require low latency, is different than scaling other AI processes like offline workflows, and requires different considerations
-Without strategic technical approaches, scaling LLMs can quickly spiral into uncontrolled infrastructure costs and people hours
-To avoid these pitfalls, teams should consider:
–Supporting scale-to-zero inference endpoints, to avoid incurring expensive GPU costs when no one is interacting with your AI app.
–Solving for the first pitfall (cost) leads to a trade-off of cost for longer cold-start latency times, which will require further optimizations, such as container streaming, model sharding, and loading models directly to GPU memory.
–Using open source state-of-the-art inference frameworks like SGLang and vLLM will get you free performance boosts, versus investing time in rolling your own inference engine or using multi-purpose training and inference frameworks.
Presenter:
Shashank Shekhar, Independent Researcher
About the Speaker:
Shashank Shekhar is an independent researcher and consultant who has worked with startups and companies to help them build and scale data pipelines, machine learning models, and evaluation systems. Some of the companies he has consulted for include Vector Institute, Cohere, Erode AI, NextAI, and Shell. Prior to this, he was the founder of Dice Health, where he built real-time speech and language AI solutions for healthcare providers, steering the company from inception to profitability. Before that, he was a researcher on scaling laws, reasoning, and interpretability at Meta AI, Vector Institute, and the Indian Institute of Science. His research has been cited over 1,800 times and has won various awards, including the Best Paper award at NeurIPS 2022.
Track: Inference Scaling
Technical Level: 400 – Expert
Abstract:
DeepSeek has revolutionized the AI landscape with their groundbreaking DeepSeek V-3 and R-1 models. Behind the impressive performance of these models are several ingenious optimizations in both the algorithmic and computational aspects of the attention mechanism. We will set the stage for FlashMLA with an analysis of attention mechanisms in large language models. We’ll examine the algorithmic bottlenecks inherent in traditional attention implementations and introduce DeepSeek’s Multi-Head Latent Attention (MLA) as an algorithmic solution to these scaling challenges.
Building on this algorithmic foundation, we’ll pivot to compute-specific performance constraints that limit attention implementations and consequently, inference speed. We will discuss FlashAttention, a GPU aware algorithm that addresses these limitations through innovative memory access patterns. The presentation culminates in an in-depth look at how DeepSeek ingeniously combines these complementary concepts in their FlashMLA implementation, resulting in dramatically accelerated LLM inference without sacrificing model quality.
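To ground the discussion, here is a didactic NumPy sketch contrasting naive attention for a single query with a blocked "online softmax" pass over keys and values, the core trick FlashAttention builds on; the block size and shapes are toy values, and real kernels also tile queries and fuse memory I/O.

```python
# Didactic sketch (NumPy): naive attention vs. a blocked "online softmax"
# pass over keys/values -- the core idea FlashAttention builds on.
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 1024                      # head dim, sequence length
q = rng.standard_normal(d)           # a single query vector
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
scale = 1.0 / np.sqrt(d)

# Naive: materialize all n scores, softmax, then weight V.
scores = K @ q * scale
weights = np.exp(scores - scores.max())
naive = (weights / weights.sum()) @ V

# Online: stream over K/V blocks, keeping only a running max, normalizer, and accumulator.
m, l, acc = -np.inf, 0.0, np.zeros(d)
for start in range(0, n, 128):                      # block size 128
    s = K[start:start + 128] @ q * scale
    m_new = max(m, s.max())
    correction = np.exp(m - m_new)                  # rescale previous partial sums
    p = np.exp(s - m_new)
    l = l * correction + p.sum()
    acc = acc * correction + p @ V[start:start + 128]
    m = m_new
online = acc / l

assert np.allclose(naive, online)                   # identical math, no n x n materialization
```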
What You’ll Learn:
After this talk, attendees will be able to answer the following questions:
1. How does the complexity of attention mechanisms create a fundamental scaling bottleneck as context length increases, and what are the practical implications for training and deployment?
2. What are the tradeoffs between memory footprint and computational efficiency when implementing KV caching, and how do these tradeoffs influence system design decisions?
3. In what ways do various attention variants like MLA fundamentally transform the attention computation paradigm?
4. Why is the distinction between compute-bound versus memory-bound algorithms crucial for optimizing performance on modern GPU architectures, and how does this reframe our approach to attention implementations?
5. How can hardware-aware algorithm design (e.g. for attention) dramatically outperform naive implementations even without changing the mathematical operation being performed?
6. What memory access pattern inefficiencies does online softmax computation elegantly solve that traditional implementations struggle with?
7. How does Flash Attention’s approach to memory management and I/O optimization speed up attention computation while maintaining mathematical equivalence?
8. How does FlashMLA combine the algorithmic benefits of Multi-head Latent Attention with the hardware-optimized implementation techniques of Flash Attention?
Presenter:
Xin Wang, Director of Machine Learning, d-Matrix Corporation
About the Speaker:
Dr. Xin Wang is the Director of Machine Learning at d-Matrix, an AI accelerator company working to push generative AI inference to record-breaking efficiency. Prior to his current role, he built a career with AI semiconductor industry leaders Qualcomm, Intel, and Cerebras. He is active in the fields of efficient machine learning, algorithm-hardware co-design, and brain-inspired computation.
Track: Inference Scaling
Technical Level: 300 – Advanced
Abstract:
In this talk, we will try to view the decade-old theories and practices of network quantization from a renewed perspective in the age of large language models (LLMs). As scaling laws can reliably predict the return of model quality on the investment of training-time computation, high uncertainty lingers after post-training quantization at inference-time for deployment. What extra factors govern the scaling of LLMs surviving quantization? Can we devise empirical scaling laws that shed light on the effectiveness of LLM quantization? Further, can we gain new theoretical insights into the ease or difficulty in the practice of network compression? We shall attempt some answers through the lens of recent work from my group.
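For readers new to the topic, the toy sketch below shows what post-training quantization of a weight matrix looks like in its simplest form (per-channel symmetric int8); it is illustrative only, and the scaling-law questions raised in the talk go well beyond it.

```python
# Toy post-training quantization sketch: per-channel symmetric int8 of a weight
# matrix. Real LLM PTQ methods (GPTQ, AWQ, SmoothQuant, ...) add calibration
# data and error compensation on top of this basic idea.
import numpy as np

def quantize_int8(W: np.ndarray):
    # One scale per output channel (row), symmetric around zero.
    scale = np.abs(W).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4096, 4096)).astype(np.float32) * 0.02
q, scale = quantize_int8(W)
err = np.abs(W - dequantize(q, scale)).mean()
print(f"mean absolute quantization error: {err:.2e}")
```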
What You’ll Learn:
– Quantization is essential in achieving high inference-time efficiency in practice.
– Training-time scaling laws do not fully predict inference-time effectiveness of quantized LLMs.
– We provide new insights into the scaling of quantized LLMs, also achieving a deeper understanding of network quantization.
Presenter:
N. Taylor Mullen, Senior Staff Engineer Cloud AI, Google
About the Speaker:
N. Taylor Mullen is a Senior Staff Software Engineer at Google building tools for the future AI developer. Before joining Google, Taylor was a key driver in GitHub Copilot’s strategic vision and served as the tech lead for GitHub Copilot in Visual Studio. He brings a deep background in developer tooling, OSS and all things generative AI. A self-proclaimed serial side-projectist, Taylor is passionate about developing products, teams and ideas to build AI that’s more R2-D2 and less Terminator.
Track: Negative Results
Technical Level: 100 – Beginner
Abstract:
The directive is clear: integrate AI. Organizations globally are pushing teams to explore AI’s potential, leading to an unprecedented wave of experimentation. But this enthusiasm often crashes against unseen organizational barriers, creating significant growing pains, often leading to hindered progress and heightened employee burnout. This talk delves into the critical, yet often overlooked, organizational challenges that arise when moving from top-down AI mandates to sustainable, impactful implementation. By illustrating several common organizational pitfalls, the session seeks to raise awareness, empowering attendees with the foresight to recognize and smooth their own path toward impactful AI integration.
What You’ll Learn:
By illustrating several common organizational pitfalls, the session seeks to raise awareness, empowering attendees with the foresight to recognize and smooth their own path toward impactful AI integration.
Presenter:
Leland McInnes, Researcher, Tutte Institute
About the Speaker:
Leland McInnes is a researcher at the Tutte Institute for Mathematics and Computing in Ottawa, Canada. He works in unsupervised learning, and topological techniques for machine learning. Among his contributions are the UMAP algorithm for dimension reduction, and the accelerated HDBSCAN algorithm for clustering. He maintains many open source data science tools, including UMAP, HDBSCAN, PyNNDescent, DataMapPlot and Toponymy.
Track: Traditional ML
Technical Level: 400 – Expert
Abstract:
Unsupervised learning is a diverse field that includes clustering, dimension reduction, anomaly detection, and density estimation. Many of the algorithms in the field are decades old and designed for low-dimensional tabular data. Now, with neural embeddings unlocking unstructured data, we face a world of high-dimensional data where old assumptions and intuitions do not hold. We’ll look at why classical unsupervised learning problems are still incredibly relevant today, why high-dimensional data breaks many of the standard algorithms, and how we can start to move forward and build new algorithms designed from the ground up for the high-dimensional data of the modern world.
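As a concrete starting point, the classic pipeline for clustering high-dimensional neural embeddings combines dimension reduction with UMAP and clustering with HDBSCAN, as sketched below; the parameter values and input file are illustrative assumptions, not recommendations from the talk.

```python
# Hedged sketch: reduce high-dimensional neural embeddings with UMAP, then
# cluster with HDBSCAN. Parameter values are illustrative starting points.
import numpy as np
import umap
import hdbscan

embeddings = np.load("document_embeddings.npy")   # e.g. (n_docs, 768) sentence embeddings

reducer = umap.UMAP(n_neighbors=15, n_components=5, metric="cosine", random_state=42)
reduced = reducer.fit_transform(embeddings)

clusterer = hdbscan.HDBSCAN(min_cluster_size=25, min_samples=5)
labels = clusterer.fit_predict(reduced)           # -1 marks points left as noise

print(f"{labels.max() + 1} clusters found, {(labels == -1).sum()} points labelled noise")
```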
What You’ll Learn:
We need to rethink classical unsupervised learning (clustering, anomaly detection, etc.) in light of high-dimensional data representations from neural embedding methods.
Presenter:
Nima Safaei, Senior Data Scientist, Scotiabank
About the Speaker:
Nima has a Ph.D. in systems and industrial engineering with a background in applied mathematics. He held a postdoctoral position at the C-MORE Lab (Center for Maintenance Optimization & Reliability Engineering), University of Toronto, Canada, working on machine learning and operations research (ML/OR) projects in collaboration with various industry and service sectors. He was with the Department of Maintenance Support and Planning at Bombardier Aerospace, focusing on ML/OR methods for reliability/survival analysis, maintenance, and airline operations optimization. Nima is currently a senior data scientist with the Data Science & Analytics (DSA) lab at Scotiabank, Toronto, Canada. He has more than 40 peer-reviewed articles and book chapters published in top-tier journals, as well as one published patent. He has also been invited to present his findings at top ML conferences such as GRAPH+AI, NVIDIA GTC, TMLS, NeurIPS, and ICML.
Track: Traditional ML
Technical Level: 300 – Advanced
Abstract:
In the current literature on multivariate time series forecasting, causal inference is predominantly utilized for feature selection. In this talk, a novel causal model is introduced that can be directly employed for prediction without requiring supplementary forecasting models. The proposed model detects the causal relationships using Xi-Correlation (XiCorr) method, a new nonparametric statistic based on the rank of the data, in conjunction with Dynamic Conditional Correlation (DCC) method to study time varying causality among the variables. The target variable is forecasted via polynomial regression using the immediate causal predictors detected by XiCorr and verified by DCC method. The causal scores explain the effect of each predictor. The model was tested on a Canadian macroeconomic indexes dataset, showing promising backtest results.
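For reference, a minimal sketch of the rank-based Xi correlation (Chatterjee's coefficient, the statistic referred to as XiCorr) is shown below; this is the no-ties version only, and the talk's full model additionally uses DCC and polynomial regression, which are not reproduced here.

```python
# Minimal sketch of the rank-based Xi correlation (Chatterjee, 2020), the
# dependence measure referred to as XiCorr. No-ties version for illustration.
import numpy as np

def xi_correlation(x: np.ndarray, y: np.ndarray) -> float:
    n = len(x)
    order = np.argsort(x)                            # sort pairs by x
    ranks = np.argsort(np.argsort(y[order])) + 1     # ranks of y, taken in x-order
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)

rng = np.random.default_rng(0)
t = rng.uniform(-3, 3, 2000)
print(round(xi_correlation(t, np.sin(t) + 0.05 * rng.standard_normal(2000)), 3))  # high: y is (nearly) a function of x
print(round(xi_correlation(t, rng.standard_normal(2000)), 3))                     # near zero: independent
```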
What You’ll Learn:
While causal inference is an essential tool for explainability, it cannot be exclusively relied upon for prediction tasks. Instead, in the current context, it serves primarily for feature selection. The objective of this talk is to show how two capabilities, causal explainability and forecasting, can be combined as a single product.
Presenter:
Javad Farshchi, Data Scientist and Software Quality, COLTENE SciCan
About the Speaker:
Javad Farshchi is a Data Scientist specializing in AI-driven predictive maintenance, machine learning in regulated industries, and real-world AI applications in the healthcare and medical equipment industry. With expertise in scalable AI solutions and cloud-based ML deployments, he has played a pivotal role in driving enterprise-wide AI initiatives to enhance productivity and operational efficiency.
A published author of multiple peer-reviewed scientific articles, Javad has contributed to advancements in biosensors, microfluidics, and AI-driven diagnostics. His research has been presented at international conferences and published in leading scientific journals, bridging academia and industry to bring AI innovations into real-world applications.
At COLTENE SciCan, he has successfully led AI initiatives for predictive maintenance, working toward significantly reducing downtime in medical devices and improving reliability in industrial settings. Additionally, he has advocated for and led the integration of generative AI to streamline internal workflows, optimize information retrieval, and enhance data-driven decision-making across the organization.
Track: Traditional ML
Technical Level: 300 – Advanced
Abstract:
In medical devices and other regulated industries, unplanned downtime isn’t just costly—it can compromise patient safety, disrupt clinical workflows, and delay critical procedures. AI-driven predictive maintenance is transforming how organizations detect faults, prevent failures, and ensure reliability in high-stakes environments.
This talk explores how machine learning models are being used to predict failures in medical devices, highlighting the real-world impact of AI on device uptime, compliance, and operational efficiency. Topics covered include:
– Time-series modeling & anomaly detection for early fault detection in medical equipment
– ML techniques for predictive maintenance, including supervised and unsupervised learning
– Challenges and solutions for deploying AI in regulated environments
– Latest FDA guidelines and draft guidance for AI-enabled medical devices
– Lessons learned from real-world AI applications in healthcare technology
Attendees will walk away with practical insights into how AI is reshaping reliability, reducing downtime, and improving safety in regulated industries like healthcare—all while navigating the evolving regulatory landscape.
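As a generic, hedged illustration of the kind of unsupervised anomaly detection mentioned above (not the presenter's system), the sketch below scores daily device telemetry with an Isolation Forest; the column names, file name, and aggregation choices are assumptions for the example.

```python
# Illustrative sketch: unsupervised anomaly scoring of device telemetry with an
# Isolation Forest. Column names and the windowing are assumptions for the example.
import pandas as pd
from sklearn.ensemble import IsolationForest

telemetry = pd.read_csv("sterilizer_telemetry.csv", parse_dates=["timestamp"])

# Summarize each device-day into simple features; real systems use richer windows.
telemetry["day"] = telemetry["timestamp"].dt.floor("D")
features = telemetry.groupby(["device_id", "day"]).agg(
    temp_mean=("chamber_temp", "mean"),
    temp_std=("chamber_temp", "std"),
    pressure_max=("pressure", "max"),
).dropna()

model = IsolationForest(n_estimators=200, contamination=0.01, random_state=0)
model.fit(features)
features["anomaly"] = model.predict(features) == -1   # True = candidate early fault
print(features[features["anomaly"]].sort_index().tail())
```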
What You’ll Learn:
– AI-driven predictive maintenance enhances reliability: Machine learning models help detect faults early, reducing downtime and improving equipment performance.
– Regulatory challenges must be addressed: AI in medical devices requires compliance with evolving FDA regulations and careful risk management.
– Understanding FDA guidance for AI-enabled medical devices: Insights into the latest FDA draft guidelines and their impact on AI adoption in regulated industries.
– Best practices for AI in predictive maintenance: Effective model selection, validation, and monitoring strategies for real-world applications.
– The future of AI in reliability & safety: How AI is shaping next-gen predictive analytics, maintenance strategies, and compliance frameworks.
Presenter:
Erfan Pirmorad, Senior AI/ML Scientist, Vanguard
About the Speaker:
Erfan Pirmorad is a Senior AI/ML Scientist at Vanguard, where he leads the development of AI/ML solutions to detect and prevent financial fraud and enhance security across the enterprise. With a strong background in graph-based machine learning, large-scale data systems, and fraud detection, Erfan designs and deploys advanced systems that blend statistical modeling with graph intelligence. His work bridges theory and practice to deliver scalable, explainable, and high-impact solutions in financial risk and security domains.
Track: Traditional ML
Technical Level: 200 – Intermediate
Abstract:
Financial fraud is increasingly complex, evolving beyond isolated events into networks of coordinated activity. Detecting such threats demands models that not only assess individual risk, but also reason over the relationships between entities. At Vanguard, we’ve reimagined fraud detection as a structured reasoning problem, using graph-based machine learning to capture the complex relationships that underlie suspicious behavior.
This talk explores how graph-ML is deployed at scale to uncover hidden connections, detect emerging fraud patterns, and surface relational signals often missed by traditional approaches. We’ll walk through the architecture of a production-ready system that transforms raw signals into a dynamic, evolving graph, enabling proactive discovery and stronger risk mitigation.
Along the way, we’ll highlight how structured reasoning enhances model transparency, supports investigative workflows, and aligns machine learning outputs with business and regulatory needs. Whether you’re building fraud models or exploring graph learning for other domains, this session will offer practical insights into deploying graph-based intelligence in high-stakes environments.
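As an illustrative, simplified sketch of the graph framing (not Vanguard's production system), the snippet below links accounts and devices in a graph and flags components where many accounts share few devices; the entity names and the rule threshold are assumptions.

```python
# Illustrative sketch: represent accounts, devices, and payments as a graph and
# surface suspicious connected components. Names and thresholds are assumptions.
import networkx as nx

edges = [
    # (entity_a, entity_b, relationship)
    ("acct_1", "device_9", "logged_in_from"),
    ("acct_2", "device_9", "logged_in_from"),   # shared device links two accounts
    ("acct_2", "acct_7", "transferred_to"),
    ("acct_3", "device_4", "logged_in_from"),
]

G = nx.Graph()
for a, b, rel in edges:
    G.add_edge(a, b, relationship=rel)

# Flag components where many accounts share few devices, a common collusion signal.
for component in nx.connected_components(G):
    accounts = {n for n in component if n.startswith("acct")}
    devices = {n for n in component if n.startswith("device")}
    if devices and len(accounts) / len(devices) >= 2:
        print("review:", sorted(component))
```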
What You’ll Learn:
– A practical framework for modeling fraud as a graph problem, capturing the complex relationships between accounts, devices, and behaviors in a way that surfaces hidden patterns.
– Key considerations for deploying graph machine learning at scale, including architectural choices that enable real-time updates and operational integration within high-stakes environments.
– Insights into how graph intelligence strengthens fraud detection, allowing teams to move from reactive investigations to proactive discovery of emerging fraud networks.
– How structured reasoning over graph data supports transparency and explainability, helping align machine learning insights with investigative workflows and regulatory expectations.
Presenter:
Luis Serrano, Founder & CEO, Serrano.Academy
About the Speaker:
Luis Serrano is the author of Grokking Machine Learning, and the creator of the popular educational YouTube channel Serrano.Academy, with over 175K followers. Luis has worked in artificial intelligence and language models at Cohere, Google, and Apple, and as a quantum AI research scientist at Zapata Computing. He has popular machine learning courses on platforms such as Udacity and Coursera. Luis has a PhD in mathematics from the University of Michigan, a masters and bachelors from the University of Waterloo, and did a postdoctoral fellowship at the University of Quebec at Montreal.
Track: Traditional ML
Technical Level: 100 – Beginner
Abstract:
In this talk, we’ll learn how embeddings and the attention mechanism work in transformers, using a physical analogy of how objects gravitate towards each other in space. This is intended for all audiences, from beginner to expert.
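In the same spirit as the talk's analogy, the tiny NumPy sketch below shows each token embedding being pulled toward the others with softmax-weighted strength, which is one step of simplified, single-head attention without learned projections; the embeddings are toy values.

```python
# Tiny NumPy sketch: each token embedding is pulled toward the others, with pull
# strength given by softmax similarity -- one step of simplified attention.
import numpy as np

tokens = ["bank", "river", "money"]
E = np.array([[0.9, 0.1],    # toy 2-D embeddings
              [0.1, 0.9],
              [0.8, 0.2]])

scores = E @ E.T                                                        # pairwise similarity
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)    # softmax over rows
attended = weights @ E                                                  # each token moves toward its neighbours

for tok, before, after in zip(tokens, E, attended):
    print(f"{tok:6s} {before} -> {np.round(after, 2)}")
```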
What You’ll Learn:
Embeddings and the attention mechanism are some of the core concepts in large language models. In this talk, we lift the curtain on them, and show that they are actually very simple and intuitive.
Presenter:
Josh Goldstein, Solutions Architect, Weaviate
About the Speaker:
Josh Goldstein is a seasoned search engineer who specializes in building intelligent retrieval systems that bridge the gap between machine learning and meaning. With over a decade of experience across enterprise search, MLOps, and production-grade infrastructure, Josh has architected large-scale solutions that help people find what matters. When he’s not coding, you can find him playing racket sports, running a marathon, or regretting mechanical bulls.
Track: Vertical Enterprise AI Agents In Production
Technical Level: 200 – Intermediate
Abstract:
Enterprises are rapidly embracing AI agents to automate workflows, augment decision-making, and transform customer experiences. As these agentic systems grow in complexity, so does the overhead for users and developers. Bespoke data agents can help developers build and move use cases into production faster.
This talk demonstrates how purpose-built data agents solve critical challenges in data discovery, integration, and governance that limit enterprise AI effectiveness. Through real-world case studies, developers and users will see how bespoke agents such as data agents deliver value to customers faster.
What You’ll Learn:
The value of bespoke agents (data agents) that expedite time to production by reducing the need to build out parts of complex agentic workflows, letting teams focus on the value of the overall agentic use case.
Presenter:
Cong Wei, PhD Student, University of Waterloo
About the Speaker:
Cong Wei is a second-year PhD student in Computer Science at the University of Waterloo, supervised by Prof. Wenhu Chen, and a recent research intern at Meta. His research focuses on generative AI and multimodal LLMs, with a particular interest in diffusion models for simulation and digital humans. He has published at top-tier conferences such as ECCV, CVPR, and ICLR.
Track: Multimodal LLMs
Technical Level: 300 – Advanced
Abstract:
Recent advancements in video generation have achieved impressive motion realism, yet they often overlook character-driven storytelling—a crucial component for automated film and animation generation. We introduce Talking Characters, a more realistic task that involves generating full-body character animations directly from speech and text. Unlike traditional talking head generation, Talking Characters aims to produce the full portrait of one or more characters, extending beyond the facial region.
In this work, we propose MoCha, the first model of its kind for generating talking characters. To ensure precise synchronization between video and speech, we introduce a speech-video window attention mechanism that effectively aligns audio and visual tokens. To address the lack of large-scale speech-labeled video datasets, we propose a joint training strategy that leverages both speech-labeled and text-labeled videos, significantly improving generalization across diverse character actions.
We also design structured prompt templates with character tags, enabling—for the first time—multi-character conversations with turn-based dialogue. This allows AI-generated characters to engage in context-aware interactions with cinematic coherence. Extensive qualitative and quantitative evaluations, including human preference studies and benchmark comparisons, show that MoCha sets a new standard in AI-generated cinematic storytelling, achieving superior realism, expressiveness, controllability, and generalization.
What You’ll Learn:
Automated filmmaking and digital humans represent the future of storytelling, and MoCha takes a meaningful step toward making that future a reality.
Presenter:
Weiming Ren, PhD Student, University of Waterloo
About the Speaker:
Weiming Ren is a second year Ph.D. student at the Cheriton School of Computer Science, University of Waterloo, supervised by Prof. Wenhu Chen. His research interests include designing efficient model architectures and data curation pipelines to enhance large multimodal models (LMMs) for image and video understanding, as well as developing novel algorithms for controllable video generation, image and video editing, and image restoration.
Track: Multimodal LLMs
Technical Level: 300 – Advanced
Abstract:
State-of-the-art transformer-based large multimodal models (LMMs) struggle to handle hour-long video inputs due to the quadratic complexity of the causal self-attention operations, leading to high computational costs during training and inference. Existing token compression-based methods reduce the number of video tokens but often incur information loss and remain inefficient for extremely long sequences. In this work, we explore an orthogonal direction to build a hybrid Mamba-Transformer model (VAMBA) that employs Mamba-2 blocks to encode video tokens with linear complexity. Without any token reduction, VAMBA can encode more than 1024 frames (640×360) on a single GPU, while transformer-based models can only encode 256 frames. On long video input, VAMBA achieves at least 50% reduction in GPU memory usage during training and inference, and nearly doubles the speed per training step compared to transformer-based LMMs. Our experimental results demonstrate that VAMBA improves accuracy by 4.6% on the challenging hour-long video understanding benchmark LVBench over prior efficient video LMMs, and maintains strong performance on a broad spectrum of long and short video understanding tasks.
What You’ll Learn:
We develop a novel hybrid Mamba-Transformer model and show that hybrid models can achieve strong results for long video understanding tasks.
Presenter:
Nima Eshraghi, Machine Learning Engineer, Walmart Global Tech
About the Speaker:
Nima Eshraghi is a Senior Machine Learning Engineer at Walmart Global Tech, specializing in Multi-Modal Learning, Generative AI, and Computer Vision for augmented reality, image generation, and personalization in the retail industry. Prior to that, he obtained a PhD from the University of Toronto in Electrical and Computer Engineering and has published in top conferences like ICML.
Track: Multimodal LLMs
Technical Level: 200 – Intermediate
Abstract:
In today’s retail landscape, high-quality product imagery is essential for driving customer engagement. However, traditional photoshoots are expensive and time-consuming, requiring products to be shipped to studios and meticulously staged. This process limits flexibility and delays content production, especially when adapting images for different seasons, styles, or marketing campaigns. This talk explores how Generative AI, powered by diffusion models, is transforming retail personalization by enabling the creation of customized, high-fidelity product images on demand—without the need for physical photoshoots. We will dive into advanced diffusion-based techniques, showcasing how they allow retailers to generate new images of specific products with fine-grained control. These methods empower brands to seamlessly adapt product visuals for different settings, atmospheres, and promotions.
By leveraging these AI-driven techniques, retailers can significantly reduce production costs, accelerate content creation, and enhance personalization, leading to a more dynamic and engaging shopping experience.
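A minimal, hedged sketch of the underlying workflow is shown below: regenerating the backdrop around a fixed product with a Hugging Face diffusers inpainting pipeline. The checkpoint, file names, and prompt are placeholders, and production systems add product-preserving controls such as segmentation masks and ControlNet or LoRA conditioning.

```python
# Hedged sketch: swap the backdrop around a fixed product with a diffusion
# inpainting pipeline (Hugging Face diffusers). Checkpoint, files, and prompt
# are placeholders, not a production retail pipeline.
import torch
from diffusers import AutoPipelineForInpainting
from PIL import Image

pipe = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

product = Image.open("sofa_studio_shot.png").convert("RGB")
mask = Image.open("background_mask.png").convert("L")   # white = regenerate, black = keep product

result = pipe(
    prompt="cozy autumn living room, warm evening light, product photography",
    image=product,
    mask_image=mask,
    num_inference_steps=30,
).images[0]
result.save("sofa_autumn_campaign.png")
```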
What You’ll Learn:
Attendees will learn how Generative AI with diffusion models replaces costly photoshoots by generating personalized, high-quality images, offering flexibility across different seasons and settings. They will discover how Generative AI reduces production costs and accelerates content creation. This approach provides scalability and enables dynamic personalization for retail imagery.
Presenter:
Hanieh Arjmand, PhD, Senior Machine Learning Researcher, Lydia.ai
About the Speaker:
Hanieh Arjmand is a Senior Machine Learning Researcher at Lydia.ai, where she designs and implements advanced machine learning models to tackle complex challenges in healthcare and insurance. She holds a PhD in Biomedical Engineering from the University of Toronto and brings deep expertise in applying AI to biomedical and health data. Throughout her academic and professional career, Hanieh has led diverse, data-driven research initiatives that drive innovation, support better clinical decision-making, and improve health outcomes.
Track: Multimodal LLMs
Technical Level: 300 – Advanced
Abstract:
This talk introduces a novel multimodal framework for disease prediction that integrates structured Electronic Health Records (EHR) and wearable time series data into a unified embedding space optimised for interpretation by Large Language Models (LLMs). While multimodal LLMs have shown promise in vision, audio, and text, applying them to healthcare presents unique challenges, including temporal dynamics, heterogeneous formats, and the need for clinical interpretability.
To address this, the system uses modality-specific encoders to transform each input stream into compact latent representations. These are integrated into a shared embedding space, allowing the LLM to reason jointly across modalities. By training the entire system end to end, including the LLM itself, the model learns rich, context-aware representations that link current behavioural signals to broader clinical trajectories. The architecture also supports auxiliary context, such as demographics or prompt instructions, embedded directly into the LLM’s input space, enabling dynamic adaptation to specific tasks or patient profiles.
Evaluation on UK Biobank data (n ≈ 70K) shows that the system outperforms single-modality baselines and that wearable data meaningfully influence predictions when integrated with EHR (correlation r = 0.771). While demonstrated on two modalities, the framework is inherently modular and can be extended to include additional data sources, such as nutrition or imaging, by introducing corresponding encoders.
This work illustrates how LLMs can evolve into adaptive, multimodal engines for real-time, patient-centric care, capable of synthesising diverse health data to support earlier interventions, continuous monitoring, and personalised clinical decision-making.
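The conceptual PyTorch sketch below illustrates the general pattern of modality-specific encoders projecting EHR codes and wearable time series into a shared embedding space that is prepended to an LLM's input; the dimensions, module choices, and the interface to the LLM are assumptions, not the presenters' architecture.

```python
# Conceptual sketch: modality-specific encoders map EHR codes and wearable time
# series into a shared embedding space to be prepended to an LLM's input tokens.
# Dimensions and module choices are illustrative placeholders.
import torch
import torch.nn as nn

class EHREncoder(nn.Module):
    def __init__(self, n_codes=10_000, d_model=768):
        super().__init__()
        self.embed = nn.EmbeddingBag(n_codes, d_model)      # bag of diagnosis/procedure codes
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, code_ids):                            # (batch, codes_per_patient)
        return self.proj(self.embed(code_ids)).unsqueeze(1) # (batch, 1, d_model)

class WearableEncoder(nn.Module):
    def __init__(self, n_channels=6, d_model=768):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, d_model, kernel_size=9, stride=4)
        self.pool = nn.AdaptiveAvgPool1d(4)                 # 4 summary tokens per window

    def forward(self, series):                              # (batch, channels, time)
        return self.pool(self.conv(series)).transpose(1, 2) # (batch, 4, d_model)

ehr_tokens = EHREncoder()(torch.randint(0, 10_000, (2, 32)))
wear_tokens = WearableEncoder()(torch.randn(2, 6, 4096))
multimodal_prefix = torch.cat([ehr_tokens, wear_tokens], dim=1)  # (batch, 5, d_model)
# This prefix would be concatenated with text-prompt embeddings and fed to the
# LLM, with the whole stack trained end to end on the prediction objective.
print(multimodal_prefix.shape)
```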
What You’ll Learn:
LLMs can move beyond text by reasoning over structured and temporal health data—enabling real-time, personalised clinical decision-making through modular, end-to-end multimodal architectures.
Presenters:
Ahmad Pesaranghader, Applied Research Scientist, CIBC | Jamal Kawach, Applied Research Scientist, CIBC | Yaqi Han, Quantum ML Research Scientist, CIBC
About the Speakers:
Ahmad Pesaranghader is a research scientist at CIBC. He obtained a Ph.D. in Computer Science from Dalhousie University, focusing on Machine Learning and Big Data. With years of experience in academia and industry, he has worked with various AI models and data modalities, including text, images, and biomedical data. In his free time, Ahmad enjoys exploring the city, photographing landmarks, visiting coffee shops, listening to music and science podcasts, playing guitar, reading, cooking, and experimenting with AI art.
Jamal Kawach is an Applied AI Research Scientist at CIBC. He is a mathematician with extensive research experience in mathematics and its applications to large networks, graph theory, and related structures. He obtained his Ph.D. in mathematics from the University of Toronto in 2021 before working as a postdoctoral researcher at the Computer Science Institute at Charles University, Prague. In addition to his work in academic research, he also has substantial teaching experience at the University of Toronto, where he teaches multivariable calculus and collaborates with teams of instructors to make mathematics accessible, relevant, and engaging.
Yaqi Han is a Quantum ML Research Scientist at CIBC. She holds a PhD in Astrophysics from the University of Florida and a BSc in Physics from Peking University. Prior to joining CIBC, Yaqi worked as a computational scientist at York University on large-scale systems like galactic halos and analyzing data for dark matter detection. Previously Yaqi was a Quantum Stream Fellow at Creative Destruction Lab working on Quantum ML algorithms. In her work, Yaqi developed solutions with language models, vision models, and other ML optimization techniques. Yaqi is passionate about untangling complex systems—whether in quantum algorithms, neural networks, or the fabric of the universe—and turning theoretical insights into scalable solutions. Outside of work, Yaqi enjoys reading, baking, and playing Dungeons & Dragons (would 100% nerd out).
Track: GenAI AI Ethics And Governance Within The Organization
Technical Level: 200 – Intermediate
Abstract:
As large language models (LLMs) grow increasingly larger, it is no longer possible to train them effectively with the memory capacity of single or a few GPUs. Practitioners must leverage multiple parallelization strategies simultaneously to achieve efficient training at scale. These strategies rely on strategic data and parameter sharding across devices and efficient collective communication operations to synchronize gradients and activations. This hands-on workshop will cover fundamental parallelism dimensions—data, tensor, and pipeline parallelism—and how to compose them effectively for training billion-parameter models. We will also cover some recent LLM specific parallelization techniques such as context parallelism. Through live coding and practical exercises, participants will implement each strategy from first principles, understand their trade-offs, and learn to optimize communication patterns and memory usage for maximum training throughput across distributed hardware.
What You’ll Learn:
– Understanding data and model sharding, and collectives involved in implementing sharded training
– Core principles of data, tensor, pipeline, and context parallelism for distributed LLM training
– How to compose multiple parallelism strategies and optimize communication patterns within and across devices
– Practical advice for memory management, and performance profiling in distributed settings
Presenters:
Greg Loughnane, Co-Founder & CEO, AI Makerspace | Chris Alexiuk, Co-Founder & CTO, AI Makerspace
About the Speakers:
Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Since 2021, he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.
Chris “The Wiz” Alexiuk is the Co-Founder & CTO at AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. During the day, he is also a Developer Advocate at NVIDIA. Previously, he was a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.
Track: GenAI Deployments In Regulated Industries
Technical Level: 400 – Expert
Abstract:
While 2025 might be the year of agents for AI Engineers, it’s the year of practical RAG for enterprise and AI Engineering leaders.
In other words, RAG is table stakes; it’s a best-practice. If your organization isn’t even experimenting with RAG today, you’re behind.
The good news is that best-practice tools and techniques exist. That means that you and your team can pick up open-source Commercial-Off-The-Shelf (COTS) tools to build your first RAG application today.
🙋 But wait, what is the “Best-Practice RAG Application Stack?”
In this event, we share the minimum viable production-ready LLM app stack for building and evaluating your next RAG Application. Then, we’ll share how to baseline it and start improving it. Finally, we’ll comment on what you should think about to ensure that it will work well within your existing enterprise production setup and for your customers or stakeholders.
We’ve been testing out frameworks and tools with our students in [The AI Engineering Bootcamp](https://aimakerspace.io/the-ai-engineering-bootcamp/), our consulting customers, and on our YouTube channel for years now.
For 2025, we believe **there is a correct stack**.
Join us to discover what it is, and why!
We’ll explore:
– 🎺 Our pick for the best **orchestration framework**: LangGraph
– 🎺 Our pick for the best **monitoring & Visibility**: LangSmith
– ↗️ Our pick for the best **vector database**: QDrant
– 📊 Our pick for the best way to enhance retrieval out of the box: Cohere’s Rerank
– 📐 Our pick for the best **evaluation framework**: RAGAS
– 🚀 Our pick for the best **model serving** endpoint solution: Together AI
– 🤖 Our pick for the best **LLM** and **embedding model**: Join us live to find out!
In this event, we’ll also break down the phases of moving from prototype to production in enterprise, including:
– **Phase I: On-Prem Demo** (POC/MVP) with Executive/VP/Director buy-In
– **Phase II: Refined On-Prem Demo** with Engineering buy-In
– **Phase III: Data Preparation & Quality Validation** with buy-in from architects, data practitioners, and security
– Phase IV: **Beta Testing** with customer/stakeholder buy-in
– Phase V: **Scaling** a User-Friendly Product with product/design buy-in
And answer the question “what happens when we need to move from Phase I into the organization and into a Cloud Service Provider (CSP)?”
– Our pick for best **Cloud Service Provider Integration**: Join us live to find out!
Finally, we’ll discuss the benefits of using this approach with LangGraph applications, as well as mention some other leading partnerships worth noting in the industry that prioritize speed into production (e.g., CrewAI on AWS).
Of course, we’ll build, ship, and share a production-grade RAG application step-by-step!
Join us live to dig into the details and get your questions answered, from concepts to code!
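For orientation, a minimal skeleton of the retrieve-then-generate loop in LangGraph (the orchestration pick above) might look like the sketch below; the retriever and generator are stubbed, and in the full stack they would call Qdrant (plus a reranker) and your serving endpoint.

```python
# Minimal skeleton of a retrieve-then-generate graph in LangGraph. The retriever
# and generator are stubs; a full stack would call Qdrant (+ rerank) and an LLM.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class RAGState(TypedDict):
    question: str
    context: list[str]
    answer: str

def retrieve(state: RAGState) -> dict:
    # Placeholder: swap in a Qdrant similarity search plus reranking here.
    docs = ["Doc A about the question topic.", "Doc B with supporting details."]
    return {"context": docs}

def generate(state: RAGState) -> dict:
    # Placeholder: swap in an LLM call grounded on state["context"].
    prompt = f"Answer using only:\n{chr(10).join(state['context'])}\n\nQ: {state['question']}"
    return {"answer": f"(LLM answer to: {state['question']})  [prompt length: {len(prompt)}]"}

graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)
app = graph.compile()

print(app.invoke({"question": "What does the policy say about data retention?"}))
```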
What You’ll Learn:
– How to construct a RAG application in 2025 according to best-practices
– Why there is a right answer to the “best” components for RAG apps in general
– How to think about building the best RAG app components into your existing operations
– How to ship your LangGraph RAG prototype to production according to 2025 best practices
🤓 Who should attend the event:
– Aspiring AI Engineers who want to build, ship, and share production-grade LLM applications
– AI Engineering leaders who want to build the best possible RAG applications
Presenter:
Shashank Shekhar, Independent Researcher
About the Speaker:
Shashank Shekhar is an independent researcher and consultant who has worked with startups and companies to help them build and scale data pipelines, machine learning models, and evaluation systems. Some of the companies he has consulted for include Vector Institute, Cohere, Erode AI, NextAI, and Shell. Prior to this, he was the founder of Dice Health, where he built real-time speech and language AI solutions for healthcare providers, steering the company from inception to profitability. Before that, he was a researcher on scaling laws, reasoning, and interpretability at Meta AI, Vector Institute, and the Indian Institute of Science. His research has been cited over 1,800 times and has won various awards, including the Best Paper award at NeurIPS 2022.
Track: Inference Scaling
Technical Level: 300 – Advanced
Abstract:
As large language models (LLMs) grow ever larger, it is no longer possible to train them effectively within the memory capacity of a single GPU, or even a few. Practitioners must leverage multiple parallelization strategies simultaneously to achieve efficient training at scale. These strategies rely on strategic data and parameter sharding across devices and efficient collective communication operations to synchronize gradients and activations. This hands-on workshop will cover fundamental parallelism dimensions—data, tensor, and pipeline parallelism—and how to compose them effectively for training billion-parameter models. We will also cover some recent LLM-specific parallelization techniques such as context parallelism. Through live coding and practical exercises, participants will implement each strategy from first principles, understand their trade-offs, and learn to optimize communication patterns and memory usage for maximum training throughput across distributed hardware.
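As a warm-up for the first of these dimensions, here is a minimal data-parallel training sketch using PyTorch’s DistributedDataParallel wrapper; the model and data are toy placeholders, and the workshop itself builds these mechanisms up from first principles rather than relying on the wrapper.

```python
# Minimal data-parallel sketch with PyTorch DDP (toy model and data).
# Launch with: torchrun --nproc_per_node=2 this_script.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    dist.init_process_group(backend="gloo")  # use "nccl" when training on GPUs
    rank = dist.get_rank()

    model = torch.nn.Linear(32, 1)
    model = DDP(model)  # gradients are all-reduced across ranks on backward()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(10):
        # Each rank would see a different shard of the dataset (random toy batches here).
        x = torch.randn(16, 32)
        y = torch.randn(16, 1)
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()  # DDP inserts the gradient all-reduce here
        opt.step()
        if rank == 0:
            print(f"step {step}: loss={loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```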
What You’ll Learn:
– Understanding data and model sharding, and collectives involved in implementing sharded training
– Core principles of data, tensor, pipeline, and context parallelism for distributed LLM training
– How to compose multiple parallelism strategies and optimize communication patterns within and across devices
– Practical advice for memory management, and performance profiling in distributed settings
Presenters:
Amna Jamal, National Data and AI Expert, IBM Canada | Julia Olmstead, AI ML Ops Technical Specialist – Canadian Signature Coverage: Ontario Government and Wes, IBM Canada | Kyle Sava, Data & AI Technical Specialist – Select, IBM Tech | Alissa Furet, AI/MLOps Technical Specialist, IBM Canada
About the Speakers:
Amna Jamal is a seasoned Data and AI expert at IBM, with over eight years of experience in data management, data science, and AI. As Watsonx Technical Leader for Canada, she drives innovation at the intersection of data and AI, helping organizations unlock their full potential.
Specializing in Information Management, DataOps, and ModelOps, Amna delivers customized solutions that optimize business processes and enhance revenue streams. Amna is a trusted advisor to senior leadership, guiding strategic decision-making and solving technical challenges. She leads a team of technical specialists, ensuring they meet client requirements and business objectives. Her ability to align diverse stakeholders around a shared vision has resulted in transformative solutions.
Julia Olmstead supports the design and architecture of AI governance, MLOps, and automation solutions through deep expertise and strategic insight for public sector and enterprise clients at IBM Canada. With a focus on human-computer interaction and trustworthy AI, she helps design platforms that balance user needs with the demands of evolving AI technologies. She also ensures these platforms align with regulatory standards—enabling organizations to adopt AI and machine learning responsibly and effectively. Julia has also spearheaded national hackathons to bring cutting-edge AI capabilities to universities across Canada. She is passionate about bridging innovation with accountability and champions inclusive, ethical AI through design work, academic outreach, and mentorship.
Kyle Sava is an AI/MLOps Technical Specialist at IBM Canada with three years of experience with the company, helping organizations implement and scale AI solutions across various industries. With a dual degree in Computer Science and Statistics from the University of British Columbia and over 10 years of coding experience, Kyle brings exceptional technical depth to complex AI implementation challenges. He specializes in designing and architecting AI frameworks, MLOps automation solutions, and helping clients navigate the full AI lifecycle from concept to production. His expertise enables organizations to deploy scalable, ethical, and business-focused AI systems that drive measurable outcomes. Kyle also teaches AI and mathematics part-time, further demonstrating his commitment to advancing the field of artificial intelligence.
Alissa Furet helps organizations scale and govern AI initiatives through IBM’s Watsonx platform. As an AI & ML Ops Technical Specialist, she works closely with clients to design and deploy trusted AI solutions that support model development, validation, and monitoring in regulated environments. Through strategic engagements and hands-on expertise, Alissa enables teams to accelerate their AI adoption while ensuring model transparency, accountability, and measurable business value.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
In this immersive workshop, participants will learn to build and deploy an Agentic AI application prototype tailored to an enterprise use case. The bootcamp will focus on three primary tracks:
Intelligent Assistant, where participants will train an assistant for customer service support using pre-loaded and real-time data
HR Automation, where attendees will develop an HR Assistant to respond to employee inquiries, and
Business Automation, where agents will be utilized to automate a core business process through a structured approach of discovery, deconstruction, and development.
This hands-on bootcamp is designed to equip attendees with the skills and knowledge needed to harness the power of Agentic AI and drive innovation within their enterprises.
What You’ll Learn:
Learn: develop your skills and best practices for AI Agents
Use case & data definition: bring to life one use case from our list of top industry use cases
Implementation: apply your skills hands-on to build an Agentic AI prototype alongside our AI experts
Deployment: demonstration of how it would look in real life and the trust considerations needed to go to production
Presenter:
W. Ian Douglas, Developer Advocate, Block Open Source Developer Platforms
About the Speaker:
Ian has been working in engineering and API architecture for most of his career in tech. He’s been focusing on Developer Advocacy roles and technical education for the past 9 years, and loves to teach new skills at meetups, in workshops, and in talks. He’s currently working at Block on the Open Source Developer Programs team, learning all about how AI can interact with APIs and MCP.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
Tired of your AI sleeping on the job? Time to wake it up with some real-world data. In this hands-on workshop, you’ll transform a basic REST API into a powerful tool your AI agent can actually USE. No more static responses – we’re building bridges between AI and APIs using Model Context Protocol (MCP).
What you’ll build:
– Your first MCP wrapper (spoiler: it’s simpler than you think)
– A bridge between web APIs and AI agents that actually works
– A live demo that proves your AI agent isn’t just making things up
The 45-minute coding journey breaks down into:
– Quick dive into MCP’s API superpowers
– Roll up your sleeves and build an MCP wrapper
– Watch your AI agent flex its new API muscles
Bring your laptop with Python installed – we’re coding this together. While we’ll have sample code on hand if you need it, the real fun is in building it yourself. Basic Python skills and REST API knowledge will help, but if you can write a “for” loop and know what an API endpoint is, you’re ready to roll.
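If you want a preview of the general shape of what we’ll build, here is one way to sketch it using the FastMCP helper from the official Python MCP SDK. The weather endpoint shown is just an example REST API; the workshop’s build may differ.

```python
# Illustrative MCP server wrapping a REST API (endpoint is an example, swap in your own).
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-wrapper")


@mcp.tool()
def get_current_weather(latitude: float, longitude: float) -> dict:
    """Fetch current weather for a location from a public REST API."""
    resp = httpx.get(
        "https://api.open-meteo.com/v1/forecast",  # example endpoint
        params={"latitude": latitude, "longitude": longitude, "current_weather": True},
        timeout=10.0,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    mcp.run()  # defaults to stdio transport, so any MCP-capable agent can connect
```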
What You’ll Learn:
We’ll be learning how MCP relates to APIs, specifically around RESTful APIs, and how to build an MCP server for a RESTful API, so any AI Agent can access dynamic data.
Presenters:
Pranav Arya, Manager, AI, TELUS | Liz Lozinsky, Engineering Manager, Fuel iX – Telus Digital
About the Speakers:
Pranav is a dynamic AI leader who brings together expertise in both engineering and product development. His strategic approach focuses on translating business opportunities into actionable AI solutions, having successfully led numerous critical AI initiatives across TELUS. As a driving force behind the GenAI transformation at TELUS, he continues to shape the organization’s artificial intelligence landscape with innovative solutions.
Within the AI Accelerator at TELUS, Pranav leads the AI Growth Team which pushes the boundary of Machine Learning and AI within the organization by introducing new technologies within the TELUS AI ecosystem.
Track: Multimodal LLMs
Technical Level: 200 – Intermediate
Abstract:
In this workshop session, we’ll explore FueliX, a cutting-edge Generative AI SaaS platform developed by Telus, Telus Digital Experiences, and Willow Tree. Participants will discover how this model-agnostic platform is revolutionizing enterprise productivity and creativity workflows. The presentation will demonstrate FueliX’s versatile approach to AI implementation, showcasing its practical applications in real-world scenarios. Whether you’re a technology leader, developer, or business strategist, this session will provide valuable insights into leveraging FueliX’s capabilities for your organization’s AI initiatives.
What You’ll Learn:
Participants will understand how to leverage FueliX’s model-agnostic architecture to implement flexible AI solutions in their organizations, enabling them to build scalable productivity and creativity workflows that aren’t constrained by specific AI model limitations
Presenters:
Prashanth Rao, AI Engineer, Kuzu | Chang She, CEO & Co-Founder, LanceDB
About the Speakers:
Prashanth is an AI & DevRel engineer at Kùzu Inc., a graph database startup based in Waterloo, Ontario. He has a background in NLP, machine learning & data engineering and has held multiple roles in these capacities across various industries. Along with his advocacy work with Kuzu, Prashanth is passionate about evangelizing open source software, exploring the latest developments in AI and data infra, and regularly communicates about them on his blog, thedataquarry.com.
Chang She is the CEO and cofounder of LanceDB, the multimodal data lake for AI. A serial entrepreneur, Chang has been building DS/ML tooling for nearly two decades and is one of the original contributors to the pandas library. Prior to founding LanceDB, Chang was VP of Engineering at TubiTV, where he focused on personalized recommendations and ML experimentation.
Track: Advanced RAG
Technical Level: 200 – Intermediate
Abstract:
In this workshop, we will explore an underexplored dimension of GraphRAG, the integration of images, and gently introduce the audience to the idea of Multimodal GraphRAG, the next frontier of RAG that brings image data to the forefront of graph-based reasoning and retrieval. Attendees will gain insights into how Multimodal GraphRAG integrates the semantic richness of images and text with the contextual reasoning power of graphs, providing a comprehensive, explainable, and actionable approach to solving complex data challenges.
What You’ll Learn:
We will demonstrate that blending semantic context and data (text, visual, etc.) enables reasoning across multiple levels of abstraction and provides explainability in GraphRAG architectures.
Presenters:
Karthik Guruswamy, Advanced AI Strategist, Teradata | Gary Class, Director, Industry Strategy for Financial Services, Teradata
About the Speakers:
Karthik Guruswamy is an AI Strategist at Teradata. His primary role is to innovate with customers in the field on various AI initiatives, especially running AI language models in-database. Karthik is based in the San Francisco Bay Area and has worked with AI startups such as H2O.AI. He was also a co-founder of a few startups in Silicon Valley. In his spare time, Karthik bikes and hikes on weekends, and can sometimes be found working on his golf game at Bay Area golf courses.
Gary is an accomplished industry strategist with extensive experience in financial services, where he has made significant contributions to advanced analytics and AI. Gary spent over three decades at Wells Fargo Bank as the Director of Advanced Analytics at the forefront of innovation during the transformational era of “anytime, anywhere” banking. His visionary leadership has shaped the landscape of financial services through innovation, data-driven insights, and strategic thinking.
Track: AI for Productivity Enhancements
Technical Level: 300 – Advanced
Abstract:
Teradata is known for its massively parallel processing database that powers large-scale enterprise applications. Its Vantage platform is available both on-prem and on the cloud, and supports SQL and AI/ML Analytics with a shared-nothing, linearly scalable paradigm along with various compute design patterns. Users can use an SQL client or Python/R libraries to access the database.
Most recently, Teradata has enabled running pre-trained Hugging Face embedding encoder and Seq2Seq language models inside the platform using only existing CPUs, an interesting option that allows users to embed that capability into existing ETL workflows and BI dashboards – all with workload management. The workshop provides a primer on how the language models can be easily installed in the database, run with GPU-like inference performance by exploiting the parallel CPU harness, and integrated with various CX applications and novel use cases and workflows.
What You’ll Learn:
You don’t need GPU access to run inference with small language models. You can enable it with a parallel CPU harness like Teradata’s, alongside BI workloads.
Presenter:
Niels Bantilan, Chief Machine Learning Engineer, Union.ai
About the Speaker:
Niels is the Chief Machine Learning Engineer at Union.ai; a core maintainer of Flyte, an open source workflow orchestration tool; and creator of Pandera, a data validation and testing tool for dataframes. His mission is to help data science and machine learning practitioners be more productive.
He has a Masters in Public Health Informatics, and prior to that a background in developmental biology and immunology. His research interests include reinforcement learning, NLP, ML in creative applications, and fairness, accountability, and transparency in automated systems.
Track: Advanced RAG
Technical Level: 300 – Advanced
Abstract:
So you’ve built and deployed a RAG proof-of-concept at your organization with an off-the-shelf framework in a few days, and it looks like it’s working…now what? Just as in previous generations of AI and ML, the journey of a RAG application doesn’t end once you’ve deployed the model: it requires constant iteration to maintain and improve its performance. Unlike traditional ML deployments, however, AI applications like RAG involve more complex operational and infrastructure requirements, such as heterogeneous compute, vector stores, bootstrapping evaluation datasets, and LLM model hosting.
In this workshop, you’ll learn what it takes to maintain and iterate on RAG applications, from prototype to production, by building a chat assistant using the Union.ai platform.
Attendees will take away the core concepts and techniques required to systematically improve their RAG applications while adopting best software engineering practices.
What You’ll Learn:
-How to maintain and iterate on RAG applications, from prototype to production (via the example of building a chat assistant)
-Software engineering best practices when working with RAG applications
-How to simplify MLOps in the RAG context (i.e., LLMOps) to avoid wasting time managing infrastructure and hardware, while also controlling compute costs and maintaining visibility
Presenter:
Abhijeet Mazumdar, Head of AI, AsteroidX
About the Speaker:
Abhijeet Mazumdar is a seasoned AI leader and inventor with four patents in applied AI. As Head of MLOps & Cloud at Binoloop, he oversaw the delivery of FedRAMP‑compliant AI software for the U.S. Department of Defense, architected end‑to‑end ML pipelines on Kubernetes, and drove a $3 million ARR increase through production‑grade agentic and generative AI products.
Previously, as CTO & Head of AI at Clearspot AI, Abhijeet built a no‑code AI platform for drone software deployment and led a 30‑member interdisciplinary team to multimillion‑dollar ARR. At Intel, he designed and deployed NexGPT, an internal LLM platform with over 60k users that halved development time, and at GE Vernova he developed AI‑driven control algorithms that boosted wind turbine performance by 15% and generated over $30 million in customer value.
In “Retrieval Reimagined – Master LLMs and Embeddings using Local OSS,” Abhijeet draws on his hands‑on experience—from the open‑source KRS Kubernetes MCP server to large‑scale RAG and embedding pipelines—to show how ML enthusiasts and developers can leverage local OSS tools for secure, scalable, high‑performance retrieval‑augmented systems
Track: Opensource Model Finetuning
Technical Level: 300 – Advanced
Abstract:
So you’ve built and deployed a RAG proof-of-concept at your organization with an off-the-shelf framework in a few days, and it looks like it’s working…now what? Just as in previous generations of AI and ML, the journey of a RAG application doesn’t end once you’ve deployed the model: it requires constant iteration to maintain and improve its performance. Unlike traditional ML deployments, however, AI applications like RAG involve more complex operational and infrastructure requirements, such as heterogeneous compute, vector stores, bootstrapping evaluation datasets, and LLM model hosting.
In this workshop, you’ll learn what it takes to maintain and iterate on RAG applications, from prototype to production, by building a chat assistant using the Union.ai platform.
Attendees will take away the core concepts and techniques required to systematically improve their RAG applications while adopting best software engineering practices.
What You’ll Learn:
-How to maintain and iterate on RAG applications, from prototype to production (via the example of building a chat assistant)
-Software engineering best practices when working with RAG applications
-How to simplify MLOps in the RAG context (i.e., LLMOps) to avoid wasting time managing infrastructure and hardware, while also controlling compute costs and maintaining visibility
Presenter:
Malikeh Ehghaghi, Machine Learning Research Scientist, Vector Institute
About the Speaker:
Malikeh is a machine learning researcher at the Vector Institute and an incoming PhD student at the University of Toronto, where she will be supervised by Professor Colin Raffel. She also co-hosts the Women in AI Research (WiAIR) podcast and actively advocates for women in technology. She holds an MScAC degree in Computer Science from the University of Toronto and brings over five years of industry research experience from companies including Winterlight Labs, Cambridge Cognition, and Arcee AI. Her research interests span Modular Machine Learning, Model Merging, Efficient LLMs, and topics in Interpretability and Fairness. Her work has been published in leading conferences such as EMNLP, COLING, ACL, MICCAI, and AAAI.
Track: Opensource Model Finetuning
Technical Level: 200 – Intermediate
Abstract:
The traditional belief that scaling up model parameters and data volume is the sole path to enhanced performance in large language models (LLMs) is being challenged by innovative strategies that prioritize efficiency and cost-effectiveness. DeepSeek’s success story serves as a testament to the potential of thoughtful data engineering and meticulous model design in achieving superior AI performance without incurring prohibitive costs. This presentation delves into an overview of state-of-the-art data-centric and model-centric strategies for training language models, aiming to achieve optimal performance at minimal costs. We first talk about the rise of small language models (SLMs) as cost-efficient alternatives for dense large language models. On the data-centric side, we explore techniques such as data mixing, filtering, or deduplication to enhance dataset quality. On the model-centric front, we cover advanced approaches including pruning, distillation, parameter-efficient finetuning, quantization, and model merging to streamline model architectures without compromising performance. Together, these approaches demonstrate that strategic data preparation and model training can produce superior language models without the massive financial investments traditionally considered necessary for scaling AI systems.
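To ground one of the model-centric techniques mentioned above, here is a minimal knowledge-distillation loss sketch in PyTorch, where a small student is trained to match a larger teacher’s softened output distribution. The temperature and weighting are illustrative defaults, not recommendations, and the random tensors stand in for real model outputs.

```python
# Minimal knowledge-distillation loss sketch (PyTorch).
# teacher/student are arbitrary models producing logits over the same label space.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against the teacher with the usual hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes match the hard loss
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


# Toy example with random tensors standing in for real model outputs:
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```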
What You’ll Learn:
Scaling up parameters and data isn’t the sole path to building high-performing large language models. Reaching strong performance across domains also depends on strategic data curation and architectural design. This talk explores alternative pathways to achieving these results.
Presenter:
Frederic Marier, Senior Director, Quantitative Risk Modeling and RESL Advanced Analytics, CIBC
About the Speaker:
Frederic Marier is a Senior Director at CIBC specializing in quantitative risk modeling. During his tenure at CIBC, he has pioneered the use of Machine Learning models in Credit Risk, Fraud, and AML.
He is a graduate of the John Molson School of Business (B.Comm) and Queen’s Business School (Master of Management Analytics).
Track: Traditional ML
Technical Level: 300 – Advanced
Abstract:
Our current model operations process uses static models that are exposed to model, process, and data drift. Dynamic models are one of the best mitigants for these risks; however, they pose additional challenges from a monitoring, governance, and regulation point of view. Through collaboration, we developed a model operations cycle that balances automation and controls, allowing us to reap the benefits of dynamic models while maintaining our model risk posture.
What You’ll Learn:
Through an applied example, you will learn how to implement a dynamic model operational process. The presentation will focus on the cross-functional collaboration needed within a regulated environment and give an overview of the systems and techniques leveraged. We will introduce unique testing concepts that can be used to mitigate model risks while maintaining model risk posture.
Presenter:
Loubna Ben Allal, Research Engineer and SmolLM Lead, Hugging Face
About the Speaker:
Loubna Ben Allal is a Research Engineer in the Science team at Hugging Face, where she leads the training of small language models (SmolLMs) and their data curation. Previously, she worked on large language models for code and was a core member of the BigCode team behind The Stack datasets and StarCoder models for code generation.
Track: Opensource Model Finetuning
Technical Level: 300 – Advanced
Abstract:
On-device language models are revolutionizing AI by making advanced models accessible in resource-constrained environments. In this talk, we will explore the rise of small models and how they are reshaping the AI landscape, moving beyond the era of scaling to ever-larger models. We will also cover SmolLM, a series of compact yet powerful LLMs, focusing on data curation, and ways to leverage these models for on-device applications.
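As a taste of how lightweight these models are to work with, here is a minimal sketch that runs a small instruction-tuned model locally with the transformers pipeline. The model ID shown is illustrative; swap in whichever SmolLM checkpoint fits your device budget.

```python
# Illustrative: running a small instruction-tuned model locally with transformers.
# The model ID below is an example; substitute the SmolLM checkpoint you actually use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="HuggingFaceTB/SmolLM2-1.7B-Instruct",
    device_map="auto",  # falls back to CPU if no GPU is available
)

prompt = "Explain in two sentences why small language models matter for on-device AI."
out = generator(prompt, max_new_tokens=96)
print(out[0]["generated_text"])
```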
What You’ll Learn:
Small, well-trained language models—built through smart design and thoughtful data curation—can deliver impressive performance, making them ideal for on-device use.
Presenter:
Pierre-Luc Vaudry, Senior AI Scientist, National Bank of Canada
About the Speaker:
Pierre-Luc Vaudry is a Senior AI Scientist at National Bank of Canada. With a strong background in artificial intelligence and machine learning, Pierre-Luc has been instrumental in driving innovative projects within the organization, working on chatbots, search engines, and complaint management automation.
Pierre-Luc has worked in AI R&D in the industry since 2017, after completing a PhD in Computer Science. His expertise spans various domains of AI, including deep learning, natural language processing, and natural language generation. His passion for AI is evident through his published data-to-text papers, industry conference talks on AI-driven email security, and the GenAI trainings he created for employees of the Bank.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
The complaint management automation project aims to harness the power of cost-effective Generative AI (GenAI) to enhance compliance and improve operational efficiency in the management of customer complaints. By leveraging advanced AI technologies, the project seeks to automate the detection of potentially systemic issues within complaints, ensuring comprehensive coverage and timely resolution, and the summarization of complaints for regulatory purposes.
The current manual process involves analyzing merely a sample of complaints, which is both time-consuming and prone to errors. With the implementation of GenAI, the project now achieves 100% coverage of complaints without incurring additional costs. Using a combination of GenAI, traditional ML, and automation, the AI models automatically detect non-compliant complaints and identify potential candidates for systemic issues. This information is then presented in a convenient format for human validation and investigation. This approach not only streamlines the complaint management process but also mitigates the risk of regulatory violations, which can result in serious sanctions, including significant fines.
To achieve this, the project includes the cost-effective use of GenAI to generate fictitious complaint examples based on systemic issue criteria, allowing for a more accurate comparison with actual complaint texts. The AI will assign a similarity score to each complaint, facilitating the identification of systemic issues and enabling targeted interventions by human experts.
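A minimal sketch of that similarity-scoring step, assuming a sentence-embedding model is used for the comparison (the project itself may rely on a different model or scoring scheme, and the example texts below are purely illustrative):

```python
# Illustrative similarity scoring between synthetic systemic-issue exemplars and
# real complaints, using sentence embeddings (model choice is an assumption).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

synthetic_examples = [
    "Several customers report the same fee being charged twice on their statements.",
    "Clients say the mobile app repeatedly rejects valid identity documents.",
]
complaints = [
    "I was billed the monthly fee two times in March and again in April.",
    "The branch was closed earlier than the posted hours.",
]

scores = util.cos_sim(model.encode(complaints), model.encode(synthetic_examples))
for complaint, row in zip(complaints, scores):
    # A high maximum score flags the complaint as a candidate systemic issue
    # to be routed to a human expert for validation.
    print(f"{row.max().item():.2f}  {complaint}")
```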
Additionally, the project incorporates the automation of complaint summarization using GenAI to produce regulatory reports. This involves generating concise and uniform summaries of complaints and their resolution, ensuring that only relevant information is included. In addition, a human is always involved to validate and enrich the generated summaries when necessary. The automation of complaint summarization reduces operational risks and enhances the efficiency of the complaint management process.
By integrating GenAI into the complaint management system, the project aims to enhance compliance, reduce operational costs, and improve overall efficiency, ultimately leading to better customer satisfaction and regulatory adherence.
What You’ll Learn:
GenAI solutions are available to assist with regulatory compliance in a straightforward and economical manner. Namely, GenAI can be used once to generate synthetic data that fuels traditional machine learning, or services can be paid for on a per-use basis with a custom prompt, without the need for a dedicated architecture.
Presenter:
Manas Madine, Senior Machine Learning Engineer/Applied Scientist, AMD
About the Speaker:
Manas Madine is a Senior Machine Learning Engineer / Applied Scientist at AMD, where he developed debug triage tools and hierarchical attention mechanisms for tokenization, leveraging advanced deep learning methods to enhance root cause analysis in x86 processor simulations. He also built scalable natural language analytics systems using Azure OpenAI Gateway and MongoDB, where he experimented with GPT-4, Llama-based models, and reinforcement learning from human feedback. Before this, he was a Computer Science graduate student at the University of Massachusetts Amherst with a strong background in software engineering and machine learning. He received his Bachelor’s degree in Computer Science and Engineering from the Indian Institute of Technology Kharagpur.
Before joining UMass, Manas was a Software Engineer at PayPal, where he led compliance-critical projects, migrated legacy C++ systems to RESTful frameworks, and improved email reconciliation through Kafka-based enhancements. As a Cloud Software Developer Intern at Bidgely, he created a cloud-agnostic schema-management solution with AWS S3 and API layers. Manas’s research interests include out-of-distribution robustness in large language models, semantic rewriting for bridging distribution gaps, and time-series analysis for healthcare applications. In addition to authoring papers on topics such as OOD sentiment analysis and heart sound segmentation, he holds a patent related to AI-based railway crossing monitoring. He is proficient in C++, Python, Java, and various AI frameworks, and has been recognized for his innovative work through multiple awards, including a Generative AI Hackathon honor at PayPal and an appreciation letter from Bidgely’s Senior Vice President.
Track: AI for Productivity Enhancements
Technical Level: 300 – Advanced
Abstract:
LLM advancements have primarily benefited natural language tasks and certain programming language domains. However, the hardware development sector has not fully reaped these benefits due to a lack of high-quality pretraining resources for assembly code. This gap raises the question: **How can we leverage breakthroughs in language technology to streamline hardware development, especially for GPUs that, in turn, accelerate large language models (LLMs)?**
Building on existing progress in language modeling and code generation, the next logical step is to expand these capabilities into hardware-centric data such as assembly code, micro-architecture logs, and real-time debugging workflows. A key obstacle is that fully retraining or pretraining large models from scratch is both costly and time-consuming. Therefore, a more practical approach is to adapt (or “trick”) existing LLMs into generating valid, high-quality assembly code. In doing so, we must overcome challenges like suboptimal tokenization and confusion caused by hexadecimal notation, both of which can mislead model attention and context tracking.
To address these issues, we propose a novel **hierarchical attention mechanism** specifically designed for hardware-oriented data. This approach restructures the model’s attention layers to more effectively handle unique tokens and numeric formats, mitigating the limitations of standard tokenization. Drawing on experience in debugging triage, simulation log analysis, and advanced error-handling techniques in x86 processor pipelines, our method refines token representations and elevates the model’s capacity for granular context capture. As a result, LLMs become better equipped to generate precise assembly code, thereby accelerating the hardware development lifecycle and fostering a virtuous cycle where improved hardware directly fuels further advances in large-scale AI systems.
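To see why hexadecimal notation trips up standard tokenizers, here is a small sketch using an off-the-shelf GPT-2 tokenizer. The exact splits will vary by tokenizer, but the fragmentation it illustrates is the kind of problem the proposed hierarchical attention mechanism is designed to mitigate.

```python
# Illustrative: how a generic BPE tokenizer fragments assembly-style text.
# GPT-2 is used only as an example; exact splits depend on the tokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

for line in ["mov rax, 0x7ffef3a1c2d8", "jne 0x0040156b"]:
    pieces = tok.tokenize(line)
    print(f"{line!r} -> {len(pieces)} tokens: {pieces}")
# A single hex address typically shatters into many sub-tokens, which dilutes
# attention and makes numeric structure hard for the model to track.
```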
What You’ll Learn:
1. Expanding LLM Utility Beyond Software: Attendees should realize that large language models can extend beyond general coding tasks to hardware-centric challenges—such as assembly code generation, micro-architecture debugging, and simulation log analysis—offering transformative benefits to the hardware industry.
2. Adapting Rather Than Retraining: The high costs and complexity of retraining models from scratch underscore the need for practical adaptation techniques. By cleverly repurposing existing LLMs, teams can achieve meaningful progress without prohibitive overhead.
3. Importance of Novel Tokenization & Attention Mechanisms: Hexadecimal notation and low-level instructions pose significant tokenization challenges. An enhanced, hierarchical attention mechanism can address these issues, emphasizing how careful token management and context structuring lead to higher-quality assembly code output.
4. Holistic Impact on Hardware Development: Automating and improving assembly code generation, bug triage, and performance optimization via specialized LLMs sets off a positive feedback loop: faster hardware development contributes to even more powerful AI accelerators, fueling future innovations in machine learning.
5. Cross-Disciplinary Collaboration: Finally, attendees should understand the value of bridging expertise in AI, systems engineering, and hardware design. By uniting these domains, the industry can unlock a new wave of productivity and innovation in both hardware and large-scale AI.
Presenter:
Marcelo Lotif, Senior Software Developer, Vector Institute
About the Speaker:
Marcelo is a seasoned Software Engineer with 17 years of experience in the market, 8 of those as a Machine Learning Engineer at both startups at various stages and big tech companies like Apple.
Track: MLOps for Smaller Teams
Technical Level: 200 – Intermediate
Abstract:
This talk will go over the fundamentals of inferencing pipelines, their types and how they differ from other kinds of pipelines, and present an Infrastructure-as-Code (IaC) reference implementation developed for the very successful AI Deployment Bootcamp that was run with Vector Institute’s industry partners in late 2024. The reference implementation uses Terraform and Python code for GCP and AWS backends and is designed to be used as a kickstarter for inferencing pipelines with MLOps and software development best practices in mind.
What You’ll Learn:
How to design effective inferencing pipelines and use the provided reference implementation as a kickstarter for your project.
Presenters:
Dhari Gandhi, Associate AI Project Manager, Vector Institute for Artificial Intelligence | Lucas Hartman, Applied AI Design & Delivery Intern, Vector Institute for Artificial Intelligence | Shabnam Hassani, GenAI and NLP Technical Specialist, Vector Institute for Artificial Intelligence
About the Speakers:
Dhari Gandhi is an Associate AI project manager at Vector Institute with a strong background in managing AI projects, data science, data governance, and technical strategy. She currently supports applied AI initiatives at the Vector Institute, where she collaborates with academic, industry, and government partners to advance the design and delivery of trustworthy and impactful AI solutions. Dhari is a graduate of the Schulich School of Business Master of Management in Artificial Intelligence (MMAI) program and is a certified Project Management Professional (PMP). She combines technical fluency with cross-sector project management experience. She was recently recognized as a Rising Star in AI and finalist for the “Young Role Model of the Year” at the Women in AI Awards North America 2025.
Lucas Hartman is an Applied AI Design & Delivery Intern at the Vector Institute, where he supports sponsor-driven AI education initiatives and projects focused on scalable deployment of responsible AI applications. He is also a Research Assistant at Western University, where his work explores reinforcement learning strategies for electric vehicle charging optimization. Lucas holds a Bachelor’s degree in Software Engineering from Western University and is currently pursuing his Master’s in Engineering Science with a focus on Artificial Intelligence. He is a recipient of the Vector Scholarship in Artificial Intelligence and a Best Paper Award winner at the IEEE International Conference on Machine Learning and Applications.
Shabnam Hassani is a PhD researcher in Computer Science at the University of Ottawa, specializing in the application of Natural Language Processing and Machine Learning to regulatory compliance and requirements engineering. She currently serves as a Technical Specialist at the Vector Institute, where she supports Canadian startups in AI integration. Shabnam has extensive experience working on fine-tuning large language models (LLMs), developing NLP systems for legal-tech, and leading AI governance initiatives. Previously, she was a Machine Learning Associate at the Vector Institute and New Software, where she contributed to cutting-edge AI projects in compliance and summarization. Her work bridges research and real-world application, with a strong focus on ethical AI and responsible innovation.
Track: AI Ethics And Governance Within The Organization
Technical Level: 200 – Intermediate
Abstract:
This talk presents a multi-level, actionable guide for governing Generative AI (GenAI) systems across diverse organizational contexts. Drawing on the literature, comparative governance frameworks (NIST, MIT, Turing Institute), and practical insights from a roundtable of industry professionals, the talk highlights real-world risk mitigation strategies. Attendees will learn how to operationalize Responsible GenAI governance through role-based implementation, sector-specific risk tools, and scalable practices like sandbox testing and continuous monitoring. The talk will also demo the Responsible GenAI Governance Guide (ResAI), a chatbot-powered website designed to help organizations scale governance literacy and decision-making.
What You’ll Learn:
How to translate abstract GenAI risk frameworks into practical enterprise governance strategies. Why scalable governance must integrate strategy, compliance, and technical implementation. How organizations can map, mitigate, and monitor GenAI risks across the model lifecycle. Exposure to the GenAI Risk Mapping tool and the Responsible GenAI Governance Guide (ResAI).
Presenter:
Kirit Thadaka, Senior PMM, Gretel (Acquired by NVIDIA)
About the Speaker:
Kirit Thadaka is a Staff Product Manager at Gretel (just acquired by NVIDIA), leading Gretel Data Designer—the world’s first compound AI system for generating high-quality synthetic data. Previously, he was a Senior Product Manager at AWS, overseeing Amazon SageMaker’s ML experimentation capabilities. With over a decade of AI/ML experience across startups and large enterprises, Kirit has deep expertise in technical leadership, core research and development, and solution architecture. He specializes in building innovative AI/ML platforms that help organizations harness the full potential of artificial intelligence.
Track: GenAI Deployments In Regulated Industries
Technical Level: 400 – Expert
Abstract:
AI-powered financial tools—from tax automation to fraud detection—require access to sensitive data that’s often locked away due to privacy regulations. But what if you could train, test, and scale these systems without ever handling real user data?
In this session, Kirit Thadaka will walk through how to generate realistic, privacy-safe 1040 tax form data using synthetic data pipelines. You’ll learn how to simulate personal and financial details (e.g., income, filing status, deductions) with statistical fidelity, structure it for downstream applications, and comply with strict data protection standards. The talk will also explore how synthetic data unlocks automation use cases in finance, reduces time spent on data provisioning, and prevents inadvertent exposure of sensitive information.
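A minimal, library-agnostic sketch of what such a record generator might look like is shown below. The field names, categories, and distributions are assumptions for demonstration only, not a real tax schema and not the method used in Gretel’s pipelines.

```python
# Illustrative synthetic 1040-style record generator. Field names, categories, and
# distributions are assumptions for demonstration, not a real tax schema.
import random
from faker import Faker

fake = Faker()
FILING_STATUSES = ["single", "married_filing_jointly", "head_of_household"]


def synthetic_1040_record() -> dict:
    wages = round(random.lognormvariate(10.8, 0.6), 2)  # skewed income distribution
    deductions = round(min(wages * random.uniform(0.05, 0.3), 25000), 2)
    return {
        "taxpayer_name": fake.name(),  # synthetic identity, never derived from real users
        "ssn": fake.ssn(),             # synthetic identifier
        "filing_status": random.choice(FILING_STATUSES),
        "wages": wages,
        "deductions": deductions,
        "taxable_income": round(max(wages - deductions, 0.0), 2),
    }


for record in (synthetic_1040_record() for _ in range(5)):
    print(record)
```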
What You’ll Learn:
How to design synthetic data pipelines for structured financial forms
Techniques for balancing realism and privacy in personal data
Applications in tax AI, software QA, and fraud model training
Presenter:
Ian Colbert, AI Research Staff, AMD
About the Speaker:
Ian Colbert is a member of the AI Research Staff at Advanced Micro Devices (AMD), based in San Jose, California. He received his Ph.D. in Machine Learning & Data Science from the Electrical and Computer Engineering Department at UC San Diego. During his 7 years at AMD, he has published several papers in top-tier AI conferences such as ICCV and ICML and regularly gives guest lectures on the quantization of deep neural networks and LLMs at various universities, including UC San Diego and the University of Toronto.
Track: Hardware Platforms
Technical Level: 100 – Beginner
Abstract:
The quality of deep neural networks has scaled with the size of their training datasets and model architectures. To reduce the rising cost of querying these increasingly large networks, researchers and practitioners have explored a handful of techniques in both hardware and software. One of the most impactful developments has been low-precision quantization, in which a neural network is constrained to require narrower data formats during storage and/or computation. In this talk, I will present an overview of quantization as it pertains to neural networks. I will review the fundamental concepts that state-of-the-art techniques have built on top of. Then, I will introduce Brevitas, AMD’s PyTorch quantization library. Finally, we will discuss exciting active research areas at AMD.
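As a primer on the fundamentals the talk reviews, here is a minimal sketch of uniform affine quantization of a weight tensor in plain PyTorch. Brevitas wraps this kind of logic into drop-in quantized layers; the arithmetic below is generic illustration, not Brevitas code.

```python
# Minimal uniform affine quantization sketch (generic math, not Brevitas' API).
import torch


def quantize_dequantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = qmin - torch.round(w.min() / scale)
    q = torch.clamp(torch.round(w / scale + zero_point), qmin, qmax)  # integer grid
    return (q - zero_point) * scale  # dequantized approximation of w


w = torch.randn(4, 4)
w_hat = quantize_dequantize(w, num_bits=4)
print("max abs error:", (w - w_hat).abs().max().item())
```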
What You’ll Learn:
Inference costs of deep learning (e.g., LLMs) continue to increase. In this talk, I will provide an overview of the fundamentals of neural network quantization and an overview of Brevitas, our open-source library for quantization (+328k downloads, +1.3k stars).
Presenter:
Qaish Kanchwala, Senior Manager, Machine Learning Engineering, The Weather Company
About the Speaker:
Qaish Kanchwala is a Senior Manager, Machine Learning Engineering at The Weather Company, where he leads both the ML Engineering and LLM Platform teams. He oversees the development, deployment, and optimization of machine learning and large language model (LLM) platforms, enabling AI-driven solutions for customer advertising, personalization, and health condition predictions.
With deep expertise in MLOps, scalable AI infrastructure, and enterprise AI adoption, Qaish ensures the seamless integration of cutting-edge AI solutions within complex business environments. His leadership has been instrumental in driving innovation and operational excellence across AI initiatives.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
As enterprises increasingly integrate AI into their operations, Large Language Models (LLMs) are emerging as powerful tools for automation, customer engagement, and decision-making. However, building and deploying an LLM platform at an enterprise scale comes with unique challenges, including model selection, infrastructure design, data security, governance, and cost optimization.
In this session, we will explore the end-to-end journey of creating an enterprise-grade LLM platform—from initial architecture decisions to deployment strategies and ongoing maintenance. We will discuss best practices for leveraging open-source vs. proprietary solutions, ensuring compliance with data privacy regulations, and optimizing performance for real-world applications. Additionally, we will highlight how The Weather Company has used LLMs to solve real use cases internally.
What You’ll Learn:
Using open-source tools to build an LLM platform and foster experimentation to provide value
Presenter:
Nitesh Soni, Global Head of Data Science (Commercial), Sanofi
About the Speaker:
Nitesh Soni is a distinguished leader in Data Science and AI, boasting over 18 years of diverse experience across the pharmaceutical, finance, consulting, and research industries. Since joining Sanofi, Nitesh has been instrumental in driving commercial excellence to enhance patient outcomes by strategizing, developing and operationalizing high-priority, scalable AI products. He leverages data, expert AI, and Generative AI (GenAI) to create solutions that are not only best-in-class but also compliant and responsibly built.
Throughout his career, Nitesh has excelled in driving digital transformation within large, complex global organizations. He is renowned for his ability to build and lead diverse, high-performing global data science teams.
A highlight of Nitesh’s research career includes being part of the groundbreaking team that discovered the Higgs boson particle in 2012, a milestone that underscores his commitment to advancing scientific knowledge and innovation.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
At Sanofi, we are fully committed to transforming the practice of medicine to improve patient outcomes by harnessing the power of advanced digital technologies and artificial intelligence (AI) at scale. Within our commercial function, we are leveraging the full potential of AI and Generative AI (GenAI) to enhance customer engagement through hyper-personalized experiences, optimize sales and marketing efforts, and improve market access by proactively anticipating market trends.
This presentation will provide an overview of our approach to building and operationalizing AI solutions at scale, while ensuring the ethical and safe use of AI. We are dedicated to fostering innovation to drive commercial excellence and deliver value to patients, healthcare providers, and stakeholders worldwide.
What You’ll Learn:
– Understanding our strategic approach to integrating AI within the organization.
– Insights into the journey from pilot projects to full-scale deployment in a highly regulated, large-scale organization.
– An overview of the key machine learning (ML) and deep learning algorithms, along with the tech stack we utilize.
– Strategies to increase AI adoption rates within the business and methods to measure its direct impact.
– Addressing ethical concerns and ensuring controlled, responsible AI development.
– How we adjust and integrate new AI advancements, such as Generative AI, into our existing frameworks.
Presenter:
Amit Jaspal, Engineering Manager AI, Meta Platforms
About the Speaker:
I’m a senior technology leader with 14+ years of experience, currently leading ecommerce recommendation teams at Meta Platforms, Inc. My work spans high-impact ML solutions, including privacy-compliant pipelines, GPU-optimized infrastructure, and sequence models that have directly driven revenue growth.
Previously at LinkedIn and D.E. Shaw, I’ve consistently delivered at-scale innovations across ads, feed ranking, and financial systems. I’ve led multi-org initiatives, scaled teams by 50%, and driven results through both strategic leadership and hands-on expertise.
I’m also an active contributor to the research community—invited speaker at WSDM and Kumo workshops, and reviewer for top conferences like KDD, SIGIR, and UMAP.
Track: Traditional ML
Technical Level: 400 – Expert
Abstract:
Popularity bias, a well-known challenge in recommender systems, occurs when these systems disproportionately favor popular items, potentially limiting the visibility of niche content while impacting both users and item providers. In this talk, we will begin by defining popularity bias and exploring its multifaceted origins, stemming from inherent data imbalances, algorithmic tendencies, and self-reinforcing feedback loops within dynamic recommendation processes. We will then delve into various mechanisms proposed to control this bias, ranging from pre-processing and in-processing model modifications to post-processing techniques, including dynamic debiasing strategies and novel approaches like False Positive Correction. Finally, we will critically examine the common assumption that popularity bias is always detrimental, discussing scenarios where recommending popular items might be beneficial and considering the complex trade-offs between bias mitigation and overall user experience in real-world applications.
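As one concrete illustration of a post-processing mitigation, here is a minimal re-ranking sketch that penalizes relevance scores by log popularity. The penalty form and weight are illustrative choices for exposition, not a production recipe.

```python
# Illustrative post-processing debias: penalize relevance scores by log-popularity.
import math


def rerank_with_popularity_penalty(candidates, lam=0.3):
    """candidates: list of (item_id, relevance_score, popularity_count)."""
    adjusted = [
        (item, score - lam * math.log1p(pop), score, pop)
        for item, score, pop in candidates
    ]
    return sorted(adjusted, key=lambda x: x[1], reverse=True)


candidates = [
    ("blockbuster", 0.92, 1_000_000),
    ("niche_doc", 0.88, 2_500),
    ("cult_classic", 0.85, 40_000),
]
for item, adj, raw, pop in rerank_with_popularity_penalty(candidates):
    print(f"{item:14s} raw={raw:.2f} adjusted={adj:.2f} popularity={pop}")
```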
What You’ll Learn:
What Popularity Bias Is and Why It Happens
Understand how popularity bias arises from data imbalance, algorithmic tendencies, and feedback loops in recommender systems.
Techniques to Mitigate the Bias
Learn about key strategies across pre-processing, in-processing, post-processing, dynamic debiasing, and advanced methods like False Positive Correction.
Bias Isn’t Always Bad
Discover when recommending popular items can be beneficial and how context shapes whether bias is truly harmful.
Navigating Trade-Offs
Gain insight into the trade-offs between reducing bias and optimizing user experience, engagement, and content diversity.
Presenter:
Hamza Farooq, CEO & Founder, Traversaal.ai | Adjunct Stanford
About the Speaker:
Hamza Farooq is an AI Startup founder, educator, researcher, and practitioner with years of experience in cutting-edge AI development. He has worked with global organizations, governments, and top universities, including Stanford and UCLA, to design and deploy state-of-the-art AI solutions. Hamza is the author of Building LLM Applications from Scratch and the founder of Traversaal.ai, a company specializing in Enterprise Knowledge Management and AI guardrails.
Known for his engaging teaching style and deep technical expertise, Hamza has trained thousands of students and professionals to master AI concepts and build production-ready applications.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
This workshop is designed to provide participants with a comprehensive understanding of designing and building AI agents from the ground up. Moving beyond reliance on pre-built frameworks like CrewAI or Autogen, this session emphasizes learning the core mechanics of agent development to enable fully customizable solutions.
Led by Hamza Farooq, a seasoned AI expert and educator, the workshop is both technically rigorous and highly practical. Participants will gain hands-on experience in building intelligent agents capable of autonomous decision-making, task orchestration, and real-world problem-solving.
By the end of the workshop, attendees will walk away with the knowledge and tools needed to develop robust, scalable, and production-grade AI agents tailored to their specific use cases.
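To preview the kind of from-scratch construction the workshop covers, here is a minimal, framework-free agent loop: the model proposes either a tool call or a final answer, and the loop executes tools until an answer arrives. The `call_llm` helper and the JSON action protocol are placeholder assumptions for illustration, not the workshop’s exact design.

```python
# Minimal framework-free agent loop (illustrative). call_llm() is a placeholder for
# whatever chat-completion client you use; the JSON action protocol is an assumption.
import json


def call_llm(messages: list[dict]) -> str:
    """Placeholder: send messages to your LLM and return its raw text reply."""
    raise NotImplementedError


TOOLS = {
    "search": lambda query: f"(stub) top results for {query!r}",
    # Toy calculator for demo purposes only; avoid eval() in real systems.
    "calculator": lambda expression: str(eval(expression, {"__builtins__": {}})),
}

SYSTEM = (
    "You can either answer directly or call a tool. Reply with JSON only: "
    '{"action": "search"|"calculator"|"final", "input": "..."}'
)


def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [{"role": "system", "content": SYSTEM}, {"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = json.loads(call_llm(messages))
        if decision["action"] == "final":
            return decision["input"]
        observation = TOOLS[decision["action"]](decision["input"])
        messages.append({"role": "assistant", "content": json.dumps(decision)})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "Stopped: step limit reached."
```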
What You’ll Learn:
1. Learn Core Fundamentals: Understand the architecture and foundational concepts of AI agents, including reasoning frameworks, decision trees, and multi-agent orchestration.
2. Build Agents from Scratch: Gain hands-on experience in coding AI agents from the ground up, bypassing pre-built frameworks for maximum customization.
3. Implement Advanced Techniques: Explore cutting-edge approaches like semantic chunking, task decomposition, and performance optimization for agents.
Presenter:
Niels Bantilan, Chief Machine Learning Engineer, Union.ai
About the Speaker:
Niels is the Chief Machine Learning Engineer at Union.ai; a core maintainer of Flyte, an open source workflow orchestration tool; and creator of Pandera, a data validation and testing tool for dataframes. His mission is to help data science and machine learning practitioners be more productive.
He has a Masters in Public Health Informatics, and prior to that a background in developmental biology and immunology. His research interests include reinforcement learning, NLP, ML in creative applications, and fairness, accountability, and transparency in automated systems.
Track: Agents Zero To Hero
Technical Level: 200 – Intermediate
Abstract:
There’s been a lot of excitement about AI agents recently, and with emerging tools like Deep Research, Claude Code, and Cursor, many organizations are investing in the development of agentic applications for their particular problem domains, from customer service to scientific research. But how does one build an AI agent, much less deploy it to production in a reliable way?
This talk will discuss when to consider building a full stack agent yourself and dive into an overview of the core infrastructure and software components needed to host your own LLMs, MCP server, vector store, frontend UI, and evaluation harness. Through Union, I’ll briefly demonstrate how to orchestrate the full life cycle of an agentic app, starting from loading and caching open weights models to reasoning about evaluations.
What You’ll Learn:
-Agentic applications exist on a spectrum and can have some or none of the following properties: reasoning, tool use, and memory
-You don’t have to build the full stack of your agent from the ground up, but you need to make sure it has, at a minimum, an effective UI and an eval harness
-Then, you can pick and choose which parts you want to own and which parts you want to delegate to a provider
Presenter:
Kaustubh Prabhakar, Member of Technical Staff, OpenAI
About the Speaker:
Member of Technical Staff at OpenAI working on Memory and Personalization
Track: Future Trends
Technical Level: 200 – Intermediate
Abstract:
TBD
What You’ll Learn:
Role of Memory and Personalization in AI systems
Presenters:
Chang She, CEO & Co-Founder, LanceDB | Ryan Vilim, Member of the Technical Staff, character.ai
About the Speakers:
Chang She is the CEO and cofounder of LanceDB, the multimodal data lake for AI. A serial entrepreneur, Chang has been building DS/ML tooling for nearly two decades and is one of the original contributors to the pandas library. Prior to founding LanceDB, Chang was VP of Engineering at TubiTV, where he focused on personalized recommendations and ML experimentation.
Ryan is a Member of the Technical Staff at character.ai, where he bridges the divide between the data world and the machine learning world. At Character, he helps build the datasets and pipelines that power its core multimodal and chat products. Prior to Character, he led machine learning at Bowery Farming and was a software engineer focusing on computer vision at Sidewalk Labs. Toronto is near and dear to Ryan’s heart, as he earned a Ph.D. in Physics from the University of Toronto in 2015.
Track: Data Preparation and Processing
Technical Level: 300 – Advanced
Abstract:
Modern AI applications frequently require diverse retrieval methods and complex analytical queries across multimodal data such as text, video, audio, and images. Traditionally, teams must rely on multiple specialized data stores, introducing data duplication, synchronization complexities, and increased infrastructure costs.
In this joint talk, LanceDB and Character AI showcase a unified multimodal data lake solution built upon the innovative open-source Lance columnar format. LanceDB uniquely supports diverse retrieval and analytical workloads within a single cohesive system, offering scalability and superior cost-performance benefits. Character AI leverages LanceDB’s architecture to handle complex filtering, metadata queries, and retrieval tasks across text, audio transcripts, and video captions, significantly streamlining multimodal data exploration and analytics.
We will discuss how this unified data approach reduces complexity, optimizes infrastructure efficiency, and empowers sophisticated, multimodal AI applications.
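A minimal sketch of the unified pattern described above, using the open-source LanceDB Python client. The schema and records are toy examples, not Character AI’s production setup.

```python
# Illustrative LanceDB usage: vector search and metadata filtering over one table.
# Toy schema and data; not Character AI's production pipeline.
import lancedb

db = lancedb.connect("./lance_demo")
table = db.create_table(
    "clips",
    data=[
        {"vector": [0.10, 0.30, 0.50], "caption": "dog playing in the park", "modality": "video"},
        {"vector": [0.20, 0.10, 0.90], "caption": "podcast about space travel", "modality": "audio"},
        {"vector": [0.15, 0.25, 0.55], "caption": "puppy chasing a ball", "modality": "video"},
    ],
    mode="overwrite",
)

# The same table serves nearest-neighbour retrieval and SQL-style metadata filters.
hits = (
    table.search([0.12, 0.28, 0.52])
    .where("modality = 'video'")
    .limit(2)
    .to_list()
)
for h in hits:
    print(h["caption"])
```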
What You’ll Learn:
AI applications face retrieval challenges, driving the rise of vector databases. However, AI workflows demand more—feature store retrieval and analytical queries are just as essential. This often leads to AI data being stored in separate silos and queried using separate systems, increasing cost and complexity.
This is why we worked on eliminating this complexity by unifying vector search, feature retrieval, and SQL-based analytics within a single system, built on the open-source Lance columnar format—the new standard for AI data. This architecture redefines the performance-scale-cost curve, delivering hyper-scalable search at 10X better cost-efficiency. By combining advanced search indices, 100X faster random access, and optimized caching, LanceDB breaks the impossible triangle of performance, scale, and cost.
Presenter:
Dr. Walid Amamou, CEO, UbiAI Inc
About the Speaker:
From nanotechnology research to developing AI applications, my career journey has been defined by a drive to solve complex problems.
After earning a Ph.D. in Materials Science—where I studied graphene spintronics and successfully published several articles in top-tier journals—I joined Intel to develop next-generation semiconductor devices, where we successfully ramped up a unique technology from scratch.
Today, I am the founder of UBIAI, a leading platform for training language AI models that serves hundreds of customers, including Fortune 500 companies across multiple industries such as finance, healthcare, and supply chain.
My goal is to create applications that accelerate scientific discovery by leveraging AI.
Track: GenAI Deployments In Regulated Industries
Technical Level: 300 – Advanced
Abstract:
While general-purpose Large Language Models (LLMs) have demonstrated impressive capabilities, their limitations become apparent in high-accuracy, domain-specific applications. In healthcare, for example, a study evaluating AI-generated clinical responses found that OpenAI’s GPT produced hallucinations, struggling to interpret complex medical terminology reliably. Similarly, early legal applications revealed that ChatGPT could fabricate citations, as in a widely publicized case where an attorney unknowingly submitted non-existent AI-generated references. These reliability gaps can lead to serious risks—ranging from misdiagnosis and misinformation to privacy violations and legal missteps. Such outcomes highlight the need for a more targeted approach when deploying LLMs in specialized domains. In this talk, we will explore how fine-tuning LLMs is becoming essential for enterprises seeking to adapt general models to specific tasks or industries. Beyond improving accuracy and reliability, fine-tuned models can reduce computational costs and offer stronger privacy safeguards, making them a strategic advantage for mission-critical applications.
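As a pointer to what the adaptation step can look like in practice, here is a minimal parameter-efficient fine-tuning (LoRA) setup using Hugging Face PEFT. The base model ID, target modules, and hyperparameters are illustrative placeholders to be chosen per domain, compute budget, and model family.

```python
# Illustrative LoRA setup with Hugging Face PEFT (base model and hyperparameters
# are placeholders; pick them for your domain and model family).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-3.2-1B"  # example ID; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; varies by architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base weights

# From here, train as usual (e.g., with transformers' Trainer) on your
# domain-specific instruction or preference data.
```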
What You’ll Learn:
The core message of this talk is that general-purpose LLMs, while powerful, are not reliable enough for high-stakes or domain-specific use cases without adaptation. Attendees will learn why fine-tuning is essential to improve accuracy, reduce risks like hallucinations or misinformation, and meet the specific requirements of industries such as healthcare, legal, or finance. They’ll also gain an understanding of how fine-tuned models can lead to better performance, lower costs, and enhanced data privacy—making them a strategic asset for enterprise adoption.
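The talk does not prescribe a specific toolchain; as one hedged sketch of what domain adaptation can look like in practice, here is a minimal LoRA fine-tuning loop using Hugging Face transformers, peft, and datasets. The base model, dataset file, and hyperparameters are placeholders, not recommendations from the speaker:

```python
# Minimal LoRA fine-tuning sketch (assumptions: Hugging Face transformers,
# peft, and datasets; the base model and dataset path are placeholders).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_model = "meta-llama/Llama-3.1-8B"          # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Attach low-rank adapters so only a small fraction of weights are trained.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# Domain-specific corpus (e.g., de-identified clinical notes) as JSONL.
dataset = load_dataset("json", data_files="clinical_notes.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                           max_length=512), batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```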
Presenter:
Aashu Singh, Senior Staff Software Engineer, Meta
About the Speaker:
Aashu Singh is a Senior Staff Software Engineer at Meta, where he leads cutting-edge initiatives in Multimodal Large Language Models for content understanding across Facebook and Instagram recommendation systems. With over 9 years at Meta and significant expertise in machine learning, Aashu specializes in developing AI solutions that bridge the gap between content comprehension and personalized recommendations.
In his current role within the MRS Content Understanding team, Aashu is pioneering approaches that leverage multimodal LLMs to enhance relevance throughout Meta’s recommendation stack. His work transforms how machine learning systems interpret and process content across different modalities to deliver more intuitive and personalized user experiences.
Aashu has co-authored several publications in the field of multimodal AI, including “Transfer between Modalities with MetaQueries” (2025) and “CompCap: Improving Multimodal Large Language Models with Composite Captions” (2024).
Previously, Aashu made substantial contributions to Meta’s advertising ecosystem, developing sophisticated models for Ads Retrieval/Ranking and Dynamic Ads, with a particular focus on modeling user interests to capture both long-term and short-term engagement patterns.
Track: Multimodal LLMs
Technical Level: 200 – Intermediate
Abstract:
This talk explores how the integration of Large Language Model reasoning capabilities is fundamentally transforming industry-scale recommendation systems, shifting from traditional pattern-matching approaches to cognitive frameworks that deliver unprecedented personalization and transparency.
Recent research demonstrates that LLM reasoning, particularly through Chain-of-Thought (CoT) prompting, significantly enhances recommendation quality by addressing the inherent subjectivity and personalization challenges of recommendation tasks. Rather than simply matching users to items based on statistical correlations, LLMs now employ sophisticated reasoning processes to assess user preferences and generate appropriately ranked recommendations. They also provide transparent explanations that elucidate their suggestions in natural language, dramatically improving system clarity.
The presentation will delve into recent industry implementations and research developments from early 2025 and look at how they have leveraged LLM-powered reasoning to enhance recommendation quality through techniques such as query understanding, metadata enrichment, and innovative evaluation frameworks. We’ll examine how multi-agent LLM orchestration enables collaborative reasoning processes where models share insights to generate more accurate conclusions, and explore key paradigms emerging in current research.
The talk will conclude with insights on how LLM reasoning is opening new frontiers for personalization without requiring curated gold references or human raters, pointing toward a future where recommendation systems don’t just predict what users want but understand why they want it—ultimately creating more meaningful and effective user experiences.
What You’ll Learn:
Application of Multi-modal LLM reasoning in industrial scale recommendation systems
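To make the chain-of-thought idea concrete, a CoT ranking prompt can be assembled along the lines below. This is an illustrative sketch, not Meta's production prompt; the user history, candidate items, and the call_llm helper are hypothetical:

```python
# Illustrative chain-of-thought ranking prompt (not from the talk; the
# user profile, candidate items, and call_llm() helper are hypothetical).
def build_cot_ranking_prompt(user_history, candidates):
    history = "\n".join(f"- {item}" for item in user_history)
    items = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        "You are a recommendation assistant.\n"
        f"The user recently engaged with:\n{history}\n\n"
        f"Candidate items:\n{items}\n\n"
        "First, reason step by step about the user's likely interests and how "
        "each candidate relates to them. Then output a ranked list of candidate "
        "numbers with a one-sentence explanation for each."
    )

prompt = build_cot_ranking_prompt(
    user_history=["hiking vlog", "trail running shoes review"],
    candidates=["camping gear haul", "city nightlife tour", "ultralight backpack guide"],
)
# response = call_llm(prompt)   # call_llm is a placeholder for your LLM client
print(prompt)
```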
Presenter:
Eliezer Bernart, Staff Data Scientist, TELUS
About the Speaker:
Eliezer has over 10 years of experience in Software Development and Computer Science, spanning both industry and academia. His expertise is primarily in Data Science, where he focuses on scaling solutions and supporting technical development. At TELUS, on the AI Accelerator team, Eliezer contributes by developing solutions at the code level and designing architectures for various AI applications.
Track: MLOps for Smaller Teams
Technical Level: 300 – Advanced
Abstract:
Containers streamline AI development by providing consistent environments and efficient dependency management across development and production systems. They also let you self-host solutions for tracking your experiment metrics and your LLM performance. In this hands-on workshop you will learn how to leverage the power of containers to speed up your AI development experience!
What You’ll Learn:
How to use containers in your development environment to get free access to the most recent Machine Learning and GenAI tools, such as MLflow, Langfuse, and other platforms.
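As one hedged example of the pattern (not necessarily the workshop's exact setup), you can run the MLflow tracking server from its official container image and log experiments to it from your code; the image tag, port, experiment name, and metric values below are assumptions:

```python
# Sketch of the pattern (not the workshop's exact setup): run a tracking
# server in a container, then log experiments to it from your code.
# Roughly:
#   docker run -p 5000:5000 ghcr.io/mlflow/mlflow \
#       mlflow server --host 0.0.0.0 --port 5000
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")   # the containerized server
mlflow.set_experiment("rag-prompt-tuning")          # illustrative experiment name

with mlflow.start_run():
    mlflow.log_param("chunk_size", 512)             # placeholder parameter
    mlflow.log_metric("answer_relevance", 0.87)     # placeholder metric
```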
Presenter:
Phani Dathar, Ph.D, Director, Graph Data Science, Neo4j
About the Speaker:
Phani is a Director of Graph Data Science at Neo4j. He is a computational scientist and holds a PhD in Nanotechnology and Computational Materials Science. After a decade of computational science research in both industry and academia, he transitioned to a career in data science and machine learning; over the past ten years, he has worked as a consultant, architect, and advisor in the AI/ML space. Currently, he is with Neo4j, helping prospects and customers design, architect, and develop applications using graph technology, graph analytics, and GraphRAG.
Track: Advanced RAG
Technical Level: 400 – Expert
Abstract:
Join us for the Workshop on Graph Technology and GraphRAG, designed for developers, data scientists, architects, and application owners.
The focus of our workshop is to dive into the world of graph technology, exploring how knowledge graphs can effectively model both unstructured and structured data. Neo4j graph data science algorithms provide additional insights to the knowledge graph, enhancing its capacity to derive new relationships and uncover hidden patterns. We will also discuss GraphRAG (Graph powered retrieval augmented generation) where knowledge graphs provide the context for LLMs to reduce hallucinations and provide accurate and relevant responses.
The associated notebooks, code examples and blogs provide hands-on experience building a practical application using the Neo4j database and related tooling to implement GraphRAG. This session is ideal for developers, data scientists, and architects eager to leverage the power of graphs in their projects.
What You’ll Learn:
Graph Technology: Overview of graph databases & Cypher
KnowledgeGraphs + GenAI: Learn how to power trustworthy GenAI with Neo4j
Best Practices: Integrate Neo4j into your workflow
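As a taste of what the workshop's notebooks cover, here is a minimal GraphRAG retrieval sketch using the official neo4j Python driver. The (:Chunk)-[:MENTIONS]->(:Entity) schema, the vector index name, the credentials, and the embedding dimension are hypothetical, not the workshop's actual dataset:

```python
# Minimal GraphRAG retrieval sketch (assumptions: official neo4j Python
# driver, a hypothetical (:Chunk)-[:MENTIONS]->(:Entity) schema, and a
# vector index named "chunk_embeddings").
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

CYPHER = """
CALL db.index.vector.queryNodes('chunk_embeddings', 5, $embedding)
YIELD node AS chunk, score
MATCH (chunk)-[:MENTIONS]->(e:Entity)
RETURN chunk.text AS text, collect(e.name) AS entities, score
ORDER BY score DESC
"""

def retrieve_context(embedding):
    with driver.session() as session:
        records = session.run(CYPHER, embedding=embedding)
        return [dict(r) for r in records]

# The retrieved chunks plus their graph neighbourhood become the LLM context.
context = retrieve_context([0.12] * 384)   # embedding from your model
```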
Presenter:
Eddie Mattia, AI Engineer, Outerbounds
About the Speaker:
Eddie Mattia is a data scientist with a background in applied math, and experience working in a variety of customer-facing and R&D roles. He currently works at Outerbounds to help customers and open-source practitioners build AI systems and products.
Track: GenAI Deployments In Regulated Industries
Technical Level: 200 – Intermediate
Abstract:
You never fine-tune an LLM just once. You need a setup that supports continuous iteration: curating data, refining prompts, optionally fine-tuning or post-training models, and evaluating and deploying changes, all in a reliable loop. In this talk, we’ll break down how to build an LLM factory: a modular, developer-friendly setup you can run in your own environment to power custom AI agents and differentiated AI-powered products and services which set you apart from the competition.
What You’ll Learn:
You need a closed-loop system to continuously build and iterate on differentiated AI products that go beyond off-the-shelf APIs.
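Outerbounds builds on open-source Metaflow, so one minimal (and hypothetical) way to sketch a single turn of that loop is as a Metaflow flow; the step bodies below are placeholders, not the talk's reference implementation:

```python
# Minimal sketch of one turn of an "LLM factory" loop as a Metaflow flow
# (step bodies are placeholders; the real loop would call your own
# curation, fine-tuning, and evaluation code).
from metaflow import FlowSpec, step

class LLMFactoryFlow(FlowSpec):

    @step
    def start(self):
        self.dataset = "curated_examples.jsonl"   # curate / refresh training data
        self.next(self.finetune)

    @step
    def finetune(self):
        # placeholder: run your fine-tuning or post-training job on self.dataset
        self.model_path = "checkpoints/latest"
        self.next(self.evaluate)

    @step
    def evaluate(self):
        # placeholder: run your eval suite and gate deployment on the score
        self.eval_score = 0.9
        self.next(self.end)

    @step
    def end(self):
        if self.eval_score > 0.85:                 # illustrative threshold
            print(f"Deploying {self.model_path}")

if __name__ == "__main__":
    LLMFactoryFlow()
```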
Presenter:
Simba Khadder, Founder & CEO, Featureform
About the Speaker:
Simba Khadder is the Founder & CEO of Featureform. Simba has built the ML infrastructure and models that powered personalization for over 100M monthly active users. He distilled those learnings into Featureform's feature platform and agentic enrichment platform, which unlocks your enterprise data for agents through MCP. He's also an avid surfer, a mixed martial artist, a published astrophysicist for his work on finding Planet 9, and he ran the SF marathon in basketball shoes.
Track: Vertical Enterprise AI Agents In Production
Technical Level: 200 – Intermediate
Abstract:
Agents need access to real enterprise data, but today, most are basically summarizers of docs and help centers. In this talk, we’ll explore how the Model Context Protocol (MCP) enables secure, governed access to internal systems like Postgres, Redis, and Iceberg. We’ll walk through real-world patterns for agentic enrichment, challenges in productionizing data access, and why every enterprise will need an MCP-native interface. Learn how Featureform built the first production-grade MCP server and what it takes to make agents enterprise-ready.
What You’ll Learn:
Enterprise agents are only as powerful as the data they can safely access. MCP is emerging as the standard for enabling this access, but making it work in real-world environments requires more than a spec. Attendees will learn how governed, semantic, real-time data access is essential for production-ready agents, and why the infrastructure layer beneath MCP matters just as much as the interface.
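For orientation only (this is not Featureform's server), a minimal MCP tool that exposes governed Postgres data to an agent might look like the following, using the official mcp Python SDK; the table, columns, and credentials are hypothetical:

```python
# Minimal MCP server sketch (not Featureform's implementation; assumes the
# official `mcp` Python SDK and a hypothetical Postgres features table).
import psycopg2
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("enterprise-data")

@mcp.tool()
def get_customer_features(customer_id: str) -> dict:
    """Return governed feature values for a customer from Postgres."""
    conn = psycopg2.connect("dbname=features user=agent_readonly")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT lifetime_value, churn_risk FROM customer_features "
            "WHERE customer_id = %s", (customer_id,))
        ltv, churn = cur.fetchone()
    return {"lifetime_value": ltv, "churn_risk": churn}

if __name__ == "__main__":
    mcp.run()   # agents connect over MCP and call the tool above
```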
Business Leaders: C-Level Executives, Project Managers, and Product Owners will get to explore best practices, methodologies, and principles for achieving ROI.
Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.
Job Seekers: Will have the opportunity to network virtually and meet 30+ top AI companies.
What is an Ignite Talk?
Ignite is an innovative and fast-paced style used to deliver a concise presentation.
During an Ignite Talk, presenters discuss their research using 20 image-centric slides which automatically advance every 15 seconds.
The result is a fun and engaging five-minute presentation.
You can see all our speakers and full agenda here