
8th Annual Toronto Machine Learning Summit (TMLS) 2024

Celebrating Canadian Applied AI Innovation

July 10th - Virtual Workshops & Talks
July 11th - In-person Workshops
July 12th & 15th - In-person talks & Networking
July 13th - Optional Hackathon (add on)
July 14th - Hike at High Park

July 12th-15th In-Person
at RBC WaterPark Place

Drop us a line: info@torontomachinelearning.com
Sponsorships: faraz@torontomachinelearning.com

Why Attend

A Unique Experience

For 8 years, TMLS has hosted a unique blend of cutting-edge research, hands-on workshops, and industry case studies vetted by the Committee to support your team's growth.

We emphasize community, learning, and accessibility.

Join Today

Explore the Uncharted Frontiers of Generative AI

Big Ideas Showcase

See groundbreaking innovations and meet the innovators pushing technological boundaries in Gen-AI.

Explore & Network

Explore real-world case studies that cut through the hype and gain valuable insights into the latest advancements and deployment trends in production environments in this rapidly evolving field. Network with fellow practitioners and business leaders.

How Does it Work?

Virtual Talks & Workshops
Join 13 talks and hands-on workshops virtually

July 10th
9:30 AM – 5:05 PM EST

Up Skill via Workshops
In-person bonus hands-on workshops

101 College St,
Toronto, ON M5G 1L7

July 11th
9:30 AM – 4:15 PM EST

Network via Community App
Introduce yourself and meet others attending the conference
Plot Your Schedule and Attend the Summit!
See 60+ talks and case studies across various tracks and industries
88 Queens Quay W,
Toronto, ON M5J 0B6

July 12th & 15th
8:45 AM – 4:50 PM

Paid Add-On Hackathon

“Finetuning Open Source LLMs for Fun and Profit” hackathon

You can register separately – no summit ticket purchase necessary

$50 for attendees without a summit ticket

192 Spadina Ave.,
Toronto, ON M5T 2C2
July 11th
9 AM – 8 PM EST
Hike at High Park
Join us at High Park for an afternoon to network and wind down with fellow attendees

High Park North Gates

1873 Bloor St W, Toronto, ON M6R 2Z3

July 14th
11 AM – 2 PM EST

Event Speakers

Suhas Pai

CTO, Hudson Labs
Talk Title: Making RAG (Retrieval Augmented Generation) Work

Angela Xu

Director, Risk Control and Fraud Analytics, CIBC
Talk Title: Revolutionizing Fraud Prevention: Harnessing AI and ML to Safeguard Banking from Fraud

Yannick Lallement

Chief AI Officer, Scotiabank
Talk Title: Gen AI in Banking: Lessons Learned

Kiarash Shamsi

ML Researcher, Wealthsimple
Talk Title: LLMs for Revolutionizing Credit Risk Assessment

Bhuvana Adur Kannan

Lead - Agent Performance & ML Platform, Voiceflow
Talk Title: Building and Evaluating Prompts on Production Grade Datasets

Sasha Luccioni

AI and Climate Leader, Hugging Face
Talk Title: Connecting the Dots Between AI Ethics and Sustainability

Patrick Tammer

Senior Investment Director, Scale AI
Talk Title: Successfully Integrating AI in Your Strategy and Business Operations – Lessons Learnt from Investing

Patricia Thaine

Co-Founder & CEO, Private AI
Talk Title: Building Customer Trust in the Generative AI Era

Amir Hossein Karimi

Assistant Professor, University of Waterloo
Talk Title: Advances in Algorithmic Recourse: Ensuring Causal Consistency, Fairness, & Robustness

Mahdi Torabi Rad

President, MLBoost
Workshop: Uncertainty Quantification with Conformal Prediction: A Path to Reliable ML Models

Gayathri Srinivasan

Senior AI/ML Product Manager, Wattpad
Talk Title: Optimizing Recommendations on Wattpad Home

Estefania Barreto

ML Engineer, Recursion Pharmaceuticals
Talk Title: Industrializing ML Workflows in Drug Discovery

Winston Li

Founder, Arima
Talk Title: Dynamic Huff's Gravity Model with Covariates for Site Visitation Prediction

Royal Sequeira

Machine Learning Engineer, Georgian
Workshop: Optimizing Large Language Model Selection for Efficient GenAI Development

Jullian Yapeter

Machine Learning Scientist, Signal 1
Talk Title: AI for Hospitals at Scale

Everaldo Aguiar

Senior Engineering Manager, PagerDuty
Panel: RAGs in Production: Delivering Impact Safely and Efficiently

Rohit Saha

Machine Learning Scientist, Georgian
Workshop: Leveraging Large Language Models to Build Enterprise AI

Prashanth Rao

AI Engineer, Kùzu, Inc.
Workshop: Kùzu - A Fast, Scalable Graph Database for Analytical Workloads

Jaime Tatis

VP-Chief Insights Architect, TELUS
Talk Title: How Is GenAI Reshaping the Business?

Greg Loughnane

Co-Founder, AI Makerspace
Workshop: Open-Source Agentic RAG with LangChain & Mistral-7B

See Full Agenda | Reserve your spot today!

Sponsors & Partners

Platinum Sponsor
Sponsors
Exhibitors

Interested in Partnering? Email Faraz at faraz@torontomachinelearning.com

Who Attends

Data Practitioners
Researchers/Academics
Business Leaders

2023 Event Demographics

Highly Qualified Practitioners*
Currently Working in Industry*
Attendees Looking for Solutions
Currently Hiring
Attendees Actively Job-Searching

2023 Technical Background

Expert
17.5%
Advanced
47.3%
Intermediate
21.1%
Beginner
5.6%

2023 Attendees & Thought Leadership

Speakers
Company Sponsors

Business Leaders: C-Level Executives, Project Managers, and Product Owners will get to explore best practices, methodologies, principles, and practices for achieving ROI.

Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.

Job Seekers: Will have the opportunity to network virtually and meet 30+ top AI companies.

Why TMLS

TMLS is a community response addressing the need to unite academic research, industry opportunities and business strategy in an environment that is safe, welcoming and constructive for those working in the fields of ML/AI.

See our team and learn more about the Toronto Machine Learning Society here.

Tickets

This event has ended.

Talk Title: Making RAG (Retrieval Augmented Generation) Work

Presenter:
Suhas Pai, CTO, Hudson Labs

About the Speaker:
Suhas Pai is an NLP researcher and co-founder/CTO of Hudson Labs, a Toronto-based startup. At Hudson Labs, he works on text ranking, representation learning, and productionizing LLMs. He is currently writing the book Designing Large Language Model Applications with O’Reilly Media. Suhas has been active in the ML community as Chair of the TMLS (Toronto Machine Learning Summit) conference since 2021 and as NLP lead at Aggregate Intellect (AISC). He was also co-lead of the Privacy working group at BigScience, as part of the BLOOM open-source LLM project.

Talk Track: Applied Case Studies

Talk Technical Level:  6/7

Talk Abstract:
The RAG (Retrieval Augmented Generation) paradigm drives a large proportion of LLM-based applications. However, getting RAG to work beyond prototypes is a challenging ordeal. In this talk, we will go through some of the common pitfalls encountered when implementing RAG along with techniques to alleviate them. We will showcase how robustness can be built into the design of the RAG pipeline and how to balance them against factors like latency and cost.

What You’ll Learn:
What can go wrong with RAG?

Techniques to alleviate RAG shortcomings – specifically, tightly coupled models, layout and context-aware fine-tuned embeddings, retrieval text refinement, query expansion, and interleaved retrieval.
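To make the retrieval side of such a pipeline concrete, here is a minimal sketch of the retrieve-then-prompt step. The toy corpus, the bag-of-words `embed`, and the `retrieve` helper are invented for illustration and are not code from the talk; a production RAG system would use a learned embedding model and a vector store in their place.

```python
import numpy as np
from collections import Counter

# Toy corpus; stand-in for a real document store
docs = [
    "RAG pipelines retrieve documents before generation.",
    "Query expansion rewrites the user question to improve recall.",
    "Latency and cost must be balanced against retrieval quality.",
]

vocab = sorted({w for d in docs for w in d.lower().split()})

def embed(text):
    # Bag-of-words vector; stand-in for a real embedding model
    counts = Counter(text.lower().split())
    return np.array([counts[w] for w in vocab], dtype=float)

def retrieve(query, k=2):
    # Rank documents by cosine similarity to the query embedding
    q = embed(query)
    sims = []
    for d in docs:
        v = embed(d)
        denom = np.linalg.norm(q) * np.linalg.norm(v)
        sims.append((q @ v) / denom if denom else 0.0)
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]

# The retrieved passages are then interpolated into the LLM prompt
context = "\n".join(retrieve("how does query expansion help retrieval?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
```

The robustness techniques the talk covers (query expansion, retrieval text refinement, interleaved retrieval) all operate around this basic loop, trading extra retrieval or rewriting steps against latency and cost.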

Talk Title: Revolutionizing Fraud Prevention: Harnessing AI and ML to Safeguard Banking from Fraud

Presenters:
Angela Xu, Director, Risk Control and Fraud Analytics, CIBC | Kemi Borisade, Senior Fraud Data Analyst, CIBC

About the Speakers:
Angela Xu brings over 15 years of strategic data analytics experience in premier financial institutions to the Toronto Machine Learning Summit Conference. As a seasoned technical expert and strategic thinker, Angela has demonstrated success in developing and implementing innovative strategies. With a Master’s degree in Statistics from the Georgia Institute of Technology in Atlanta, USA, and another Master’s degree in Computer Science from China, Angela possesses a diverse skill set that she leverages to drive initiatives to tangible results.

Currently leading the Risk Control & Fraud Analytics team at CIBC, Angela focuses on regulatory breach reporting and fraud strategies for secured and unsecured lending products such as mortgages, loans, and lines of credit. Her leadership is characterized by a commitment to generating innovative ideas, influencing stakeholders, and delivering real value to both her organization and its clients.

Passionate about leveraging cutting-edge technologies to solve complex problems, Angela is dedicated to applying the latest advancements in machine learning and data analytics to add value to her company and enhance the experiences of its clients.

Talk Track: Business Strategy or Ethics

Talk Technical Level:  3/7

Talk Abstract:
In 2023, the Canadian Anti-Fraud Centre reported staggering losses of over CAD $550 million due to fraudulent activities, underscoring the urgent need for advanced security measures. At CIBC, we confront the dynamic challenges of this evolving landscape head-on by embracing cutting-edge tools, technologies, and methodologies.
Our journey is marked by formidable obstacles, including the limitations of rule-based fraud strategies, the delicate balance between sales and risk mitigation, inadequate tools for documentation validation, and the pressing demand for rapid fraud assessment. To address these challenges, our team embarked on a transformative path, leveraging next-generation self-learning Machine Learning models supplemented with custom thresholds. This approach enhances fraud detection capabilities, minimizes false positives, optimizes sales strategies, and fortifies client protection.
Furthermore, through strategic partnerships, we’ve embraced solutions such as Optical Character Recognition (OCR) to streamline documentation validation processes. Exploring the integration of graph databases, Natural Language Processing (NLP), and foundational models, we aim to unlock new frontiers in fraud prevention.
The culmination of our efforts heralds a new era in security, where the synergy of advanced AI and ML technologies promises unparalleled efficiency and efficacy in combating fraud. Join us as we unveil the future of fraud prevention in Canadian banking.

Additional notes

Dear Organizers and Evaluators,

I hope this letter finds you well. Over the years, I have had the privilege of attending the Toronto Machine Learning Summit Conference, and each time, I have found immense value in the exchange of ideas and the learning opportunities it provides. It has been a platform where I have personally benefited and grown in my understanding of machine learning and its applications.

This year, I am excited to contribute to the conference by sharing insights into the latest trends and technologies in fraud detection within financial institutions. My presentation aims to raise awareness among the audience about the critical importance of fraud prevention measures for both institutions and individuals alike. By exploring the advancements in machine learning and artificial intelligence, I hope to inspire discussions on innovative strategies to safeguard company assets and personal finances.

Fraud prevention is a pressing concern in today’s interconnected world, and I believe that through collaboration and knowledge-sharing at events like the Toronto Machine Learning Summit Conference, we can collectively work towards more effective solutions. I am eager to engage with fellow attendees, exchange perspectives, and explore new avenues for leveraging technology in the fight against fraud.

What You’ll Learn:
After attending this presentation, you will gain a comprehensive understanding of the prevailing fraud challenges within the financial industry. You will also acquire foundational knowledge of next-generation near real-time self-learning Machine Learning models, along with insights into their fundamental concepts. Additionally, you’ll explore advanced cutting-edge technologies utilized in fraud detection, equipping you with valuable insights into the evolving landscape of financial security.

Talk Title: Gen AI in Banking: Lessons Learned

Presenter:
Yannick Lallement, Chief AI Officer, Scotiabank

About the Speaker:
Yannick Lallement is the VP & Chief AI Officer at Scotiabank, where he works on developing the use of AI/ML technologies throughout the Bank. Yannick holds a PhD in artificial intelligence from the French National Institute of Computer Science. Prior to joining Scotiabank, Yannick worked on a series of AI/ML projects for different public and private organizations.

Talk Track: Applied Case Studies

Talk Technical Level:  2/7

Talk Abstract:
I will present Scotiabank’s Gen AI journey so far, from collecting ideas across the bank in an inventory all the way to our first use cases in production, and share what we learned along the way on how Gen AI applies to the industry (examples will be about banking, but lessons will be applicable across).

What You’ll Learn:
How Gen AI can effectively be useful, how to find the right use cases, how to deploy it at scale.

Talk Title: LLMs for Revolutionizing Credit Risk Assessment

Presenters:
Josh Peters, Data Science Manager, Wealthsimple | Kiarash Shamsi, ML Researcher, Wealthsimple

About the Speakers:
Josh is a Data Science Manager at Wealthsimple. For the last 2 years, he has led the development of the company’s first credit risk models and created the data pipelines to support new credit products.

Prior to Wealthsimple, Josh spent 7+ years working on Data Science problems in the insurance, banking and fraud spaces through his time at Accenture and Airbnb.

Josh’s educational background is in Finance, Statistics and Computer Science.

Kiarash Shamsi is a Ph.D. student at the University of Manitoba, currently working as a financial ML researcher at Wealthsimple. He has published as a first author at conferences such as NeurIPS, ICLR, and ICBC. His research interests are large language models, temporal graph learning, graph neural networks, topological data analysis, and blockchain data analysis and systems.

Talk Track: Applied Case Studies

Talk Technical Level: 4/7

Talk Abstract:
The session on leveraging Large Language Models (LLMs) in revolutionizing credit risk assessment will commence with an introduction to the potential impact of LLMs on the finance industry. This will be followed by an exploration of the key benefits of LLM integration, including the enhancement of risk assessment accuracy, the utilization of alternative data sources, and the automation of credit processes. The discussion will delve into real-life case studies and examples to illustrate the practical applications of LLMs in credit risk assessment. Additionally, the session will address potential challenges and ethical considerations surrounding the use of LLMs in this context. The talk will conclude with insights on the future of credit risk assessment with LLMs, leaving room for engaging discussions and Q&A.

What You’ll Learn:
Improved Risk Assessment
LLMs can analyze vast amounts of unstructured data, such as financial records, transaction histories, and market trends, to provide more comprehensive and accurate risk assessments. By processing and generating human-like text, LLMs can uncover insights and patterns that traditional credit risk models may miss.

Enhanced Contextual Understanding
LLMs can provide a deeper contextual understanding of borrower profiles and financial data. They can analyze text-based information, like loan applications and customer interactions, to gain a more holistic view of a borrower’s creditworthiness.

Handling Nonlinear Relationships
LLMs can capture complex nonlinear relationships within credit data, enabling them to make more accurate credit risk predictions compared to traditional linear models.

Improved Fraud Detection
LLMs can analyze transaction patterns and identify anomalies that may indicate fraudulent activities, enhancing an institution’s ability to detect and prevent fraud.

Automating Credit Risk Processes
LLMs can automate the credit risk analysis process, generating credit approvals, pricing recommendations, and repayment terms. This can lead to faster decision-making, reduced manual effort, and minimized human error.

Leveraging Alternative Data
LLMs can integrate alternative data sources, such as social media profiles and online behavior, to assess credit risk for borrowers with limited or no credit history. This allows for more comprehensive and inclusive credit risk evaluations.

Enhancing Portfolio Management
By analyzing market trends and customer behavior, LLMs can assist in optimizing credit portfolios, improving risk management, and enhancing overall lending strategies.

Overall, the integration of LLMs in credit risk assessment has the potential to revolutionize the industry by providing more accurate, efficient, and inclusive credit risk evaluations, ultimately leading to better lending decisions and improved financial outcomes.

Talk Title: Building and Evaluating Prompts on Production Grade Datasets

Presenters:
Bhuvana Adur Kannan, Lead – Agent Performance & ML Platform, Voiceflow | Yoyo Yang, Machine Learning Engineer, Voiceflow

About the Speakers:
Bhuvana heads the Conversational Agent performance and ML platform at Voiceflow, aiming to improve conversational agent performance for customers. She has prior experience working on enterprise data systems for major Canadian banks and financial institutions.

Yoyo is a Machine Learning Engineer at Voiceflow, a conversational AI company. She works on various facets of machine learning systems, from model training and prompt tuning to backend architecture and real-time inference. Yoyo has been working in ML and Data Science for the past five years. She is committed to transforming ideas into robust, scalable solutions and continually pushing the boundaries of what’s possible.

Talk Track: Applied Case Studies

Talk Technical Level: 6/7

Talk Abstract:
Constructing prompts per task can be challenging given the many unknowns of running them in production. In this talk, we’ll cover how we created several production style datasets for two types of LLM tasks to productize prompt based features. We’ll walk through the methodology, techniques and lessons learned from developing and shipping prompt based features. The techniques in this talk will be widely applicable, but focused on conversational AI.

What You’ll Learn:
How to approach dataset creation, iterate on prompts and measure success of releases in production.
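The evaluation loop such a workflow implies can be sketched as below. Everything here is hypothetical scaffolding, not Voiceflow's code: the labeled intent dataset, the prompt templates, and the `call_llm` stub (an offline stand-in for a real model client) are invented so the sketch runs self-contained.

```python
# Hypothetical labeled dataset for an intent-classification prompt task
dataset = [
    {"input": "I want to cancel my order", "label": "cancel"},
    {"input": "Where is my package?", "label": "track"},
]

# Two prompt variants under comparison (illustrative templates)
prompts = {
    "v1": "Classify the intent as cancel or track: {input}",
    "v2": "You are a support bot. Reply with one word, cancel or track: {input}",
}

def call_llm(prompt):
    # Offline stub: keys off the user text after the final colon.
    # Replace with a real LLM client in production.
    user = prompt.rsplit(": ", 1)[-1].lower()
    return "cancel" if "cancel" in user else "track"

def evaluate(template):
    # Accuracy of one prompt template over the labeled dataset
    correct = sum(
        call_llm(template.format(**ex)) == ex["label"] for ex in dataset
    )
    return correct / len(dataset)

scores = {name: evaluate(t) for name, t in prompts.items()}
```

The same shape scales to production: grow the dataset from real traffic, version prompts alongside it, and gate releases on the per-variant scores.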

Talk Title: Connecting the Dots Between AI Ethics and Sustainability

Presenter:
Sasha Luccioni, AI and Climate Leader, Hugging Face

About the Speaker:
Dr. Sasha Luccioni is the AI & Climate Lead at Hugging Face, a global startup in responsible open-source AI, where she works on creating tools and techniques for quantifying AI’s societal and environmental costs. She is also a founding member of Climate Change AI (CCAI) and a board member of Women in Machine Learning (WiML).

Talk Track: Business Strategy or Ethics

Talk Technical Level: 2/7

Talk Abstract:
AI ethics and sustainability considerations have typically been treated separately: work that estimates the carbon footprint of AI models does not typically address their contribution to shifting the balance of power and amplifying inequalities, while work that evaluates the societal impacts of AI models focuses on aspects such as bias and fairness and consistently overlooks their water and energy use. In this session, we will discuss how the two subjects are related and intertwined, especially in the context of generative AI technologies, which come with many challenges in terms of ethics and the environment.

What You’ll Learn:
– Key ethical challenges in AI (bias, fairness, representativity, copyright)
– Environmental impacts of AI (energy, water, natural resources)
– Current state of the art in research on both
– How to make informed trade-offs between potential benefits of (generative) AI technologies while remaining cognizant of their ethical and environmental impacts

Talk Title: Successfully Integrating AI in Your Strategy and Business Operations – Lessons Learnt from Investing

Presenter:
Patrick Tammer, Senior Investment Director, Scale AI

About the Speaker:
Patrick Tammer is a Senior Investment Director and Policy Advisor to the Canadian Government at Scale AI, Canada’s global AI innovation cluster. He currently manages a $125M portfolio of AI industry innovation projects. He is also an AI researcher at the Harvard Kennedy School. Prior to his current role, he spent 4 years as a strategy consultant with BCG.

Talk Track: Business Strategy or Ethics

Talk Technical Level: 2/7

Talk Abstract:
Drawing from a portfolio of over 100 AI and big data projects, I aim to share actionable guidance on how businesses can harness AI to drive innovation, efficiency, and competitive advantage. Attendees will learn how to:
1. Navigate the AI Landscape: I will present findings from Scale AI’s flagship report “The State of AI in Canada” (https://www.scaleai.ca/aiatscale-2023/) to provide a comprehensive overview of how Canada compares globally in AI advancements.
2. Identify and Collaborate with Ecosystem Partners: I will provide strategies for identifying the right partners across academia, startups, and AI solution providers to foster innovation and growth.
3. Structure Successful AI Initiatives: Sharing lessons learned from Scale AI’s extensive project portfolio, I will outline how to effectively structure internal AI initiatives for maximum impact.
4. Develop AI Talent: Insights on crafting a forward-thinking AI talent strategy will be discussed, enabling organizations to build essential in-house capabilities.
5. Access Non-Dilutive Funding: Information on leveraging government non-dilutive funding to de-risk investments in AI technologies will be highlighted, offering a pathway to innovative project financing.

Additional notes

While our project portfolio is cross-industry, I am happy to tailor my presentation to specific industries of interest.

What You’ll Learn:
Why? What’s the knowledge gap:
The session addresses the critical gap of integrating cutting-edge AI and big data technologies into mainstream business operations. It aims to equip leaders with the knowledge and tools necessary to navigate the complexities of AI adoption and to leverage these technologies for strategic advantage.

Learning Format and Audience Engagement Details:
The session is designed to be a concise, high-impact presentation lasting 15-30 minutes. It will include a combination of case study insights, strategic frameworks, and interactive Q&A, crafted to engage a diverse audience of C-suite executives, IT professionals, and strategic decision-makers.

Target Audience:
Tailored for senior decision-makers, this presentation will benefit those looking to effectively deploy AI and big data technologies to reshape their business landscapes. It promises valuable insights for anyone involved in technology strategy and implementation.

Talk Title: Building Customer Trust in the Generative AI Era

Presenter:
Patricia Thaine, Co-Founder & CEO, Private AI

About the Speaker:
Patricia Thaine is the Co-Founder & CEO of Private AI, a Microsoft-backed startup that raised its Series A, led by the BDC, in November 2022. Private AI was named a 2023 Technology Pioneer by the World Economic Forum and a Gartner Cool Vendor. She is also a Computer Science PhD Candidate at the University of Toronto (on leave) and a Vector Institute alumna. Her R&D work is focused on privacy-preserving natural language processing, with a focus on applied cryptography and re-identification risk. She also does research on computational methods for lost language decipherment. Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice “Trixie” Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She is the co-inventor of one U.S. patent and has ten years of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada.

Talk Track: Business Strategy or Ethics

Talk Technical Level:  3/7

Talk Abstract:
As one of the hottest technological advancements of the moment, ChatGPT has gained both attention and criticism for its privacy implications. This session explores the ethical challenges posed by generative AI and outlines strategies to establish trust in this evolving landscape. Covering topics such as privacy, bias, and GDPR compliance, attendees will gain practical insights to navigate the ethical complexities of the generative AI era and contribute to building a trustworthy AI future.

What You’ll Learn:
– Understand the ethical challenges of generative AI, including privacy concerns, bias, and GDPR compliance.
– Learn strategies for identifying and mitigating bias in AI-generated content, promoting fairness and inclusivity.
– Gain insights into safeguarding user privacy through data minimization, anonymization, and more.

Talk Title: Advances in Algorithmic Recourse: Ensuring Causal Consistency, Fairness, & Robustness

Presenter:
Amir Hossein Karimi, Assistant Professor, University of Waterloo

About the Speaker:
Dr. Amir-Hossein Karimi is an Assistant Professor in the Electrical & Computer Engineering department at the University of Waterloo where he leads the Collaborative Human-AI Reasoning Machines (CHARM) Lab. The lab’s mission is to advance the state of the art in artificial intelligence and chart the path for trustworthy human-AI symbiosis. In particular, the group is interested in the development of systems that can recover from or amend poor experiences caused by AI decisions, assay the safety, factuality, and ethics of AI systems to foster trust in AI, and effectively combine human and machine abilities in various domains such as healthcare and education. As such, the lab’s research explores the intriguing intersection of causal inference, explainable AI, and program synthesis, among others.

Amir-Hossein’s research contributions have been showcased at esteemed AI and ML-related platforms like NeurIPS, ICML, AAAI, AISTATS, ACM-FAccT, and ACM-AIES, via spotlight and oral presentations, as well as through a book chapter and a highly regarded survey paper in the ACM Computing Surveys. Before joining the University of Waterloo, Amir-Hossein gained extensive industry experience at Meta, Google Brain, and DeepMind and offered AI consulting services worth over $250,000 to numerous startups and incubators. His academic and non-academic endeavours have been honoured with awards like the Spirit of Engineering Science Award (UofToronto, 2015), the Alumni Gold Medal Award (UWaterloo, 2018), the NSERC Canada Graduate Scholarship (2018), the Google PhD Fellowship (2021), and the ETH Zurich Medal (2024).

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
Explore the intersection of causal inference and explainable AI applied for fair and robust algorithmic recourse in AI applications across healthcare, insurance, and banking. This session highlights the role of causal consistency in correcting biases and ensuring transparent model decisions.

What You’ll Learn:
– Foundations of Causal Inference: Understand the basics and importance of causal reasoning in AI.
– Integrating Causality in AI Systems: Practical approaches for embedding causal methods to improve fairness and accountability.
– Case Studies: Insights from healthcare, insurance, and banking on implementing causal tools for better decision-making.
– Future Trends: Emerging technologies and methodologies in algorithmic recourse that are setting the stage for more reliable AI systems.

Workshop: Uncertainty Quantification with Conformal Prediction: A Path to Reliable ML Models

Presenter:
Mahdi Torabi Rad, President, MLBoost

About the Speaker:
Mahdi Torabi Rad, Ph.D. is a computational scientist, engineer, self-trained software developer, mentor, and YouTube content creator with over 10 years of experience in developing mathematical, statistical, and machine-learning models, as well as computer codes to predict complex phenomena. He has published in top-tier journals of Physics, Engineering, and ML and has extensive experience as an ML Lead in various DeepTech startups. Mahdi is also the YouTuber behind the channel MLBoost, known for its popular videos on ML topics, including Conformal Prediction, which have garnered tens of thousands of views in less than a year.

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
In today’s high-stakes applications ranging from medical diagnostics to industrial AI, understanding and quantifying uncertainty in machine learning models is paramount to prevent critical failures. Conformal prediction, also known as conformal inference, offers a practical and robust approach to create statistically sound uncertainty intervals for model predictions. What sets conformal prediction apart is its distribution-free validity, providing explicit guarantees without relying on specific data distributions or model assumptions.

This hands-on workshop reviews the core concepts of conformal prediction, demonstrating its applicability across diverse domains such as computer vision, natural language processing, and deep reinforcement learning. Participants will gain a deep understanding of how to leverage conformal prediction with pre-trained models like neural networks to generate reliable uncertainty sets with customizable confidence levels.

Throughout the workshop, we’ll explore practical theories, real-world examples, and Python code samples, including Jupyter notebooks for easy implementation on real data. From handling structured outputs and distribution shifts to addressing outliers and models that abstain, this workshop equips attendees with the tools to navigate complex machine learning challenges while ensuring model reliability and trustworthiness.

What You’ll Learn:
– What sets conformal prediction apart from other methods of uncertainty quantification?
– The principles and theory behind conformal prediction for uncertainty quantification in machine learning
– Techniques for creating statistically rigorous uncertainty sets/intervals using conformal prediction
– How to apply conformal prediction to pre-trained machine learning models, such as neural networks, for reliable uncertainty quantification
– Hands-on experience with implementing conformal prediction in Python using libraries like scikit-learn and NumPy
– Examples showcasing the application of conformal prediction in diverse domains such as financial forecasting, natural language processing (NLP), and computer vision

Prerequisite Knowledge (if required)
Basic understanding of machine learning concepts, including model training and evaluation.
Familiarity with Python programming and libraries such as NumPy, Pandas, and scikit-learn
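As a preview of the split-conformal recipe the workshop describes, here is a minimal regression sketch using only NumPy. The synthetic data, the least-squares point predictor, and the 90% target coverage are invented for illustration; the workshop itself goes much further (classification, structured outputs, distribution shift, abstaining models).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2x + Gaussian noise (illustrative only)
x = rng.uniform(0, 10, 500)
y = 2 * x + rng.normal(0, 1, 500)

# Split into a training set and a held-out calibration set
x_train, y_train = x[:300], y[:300]
x_cal, y_cal = x[300:], y[300:]

# Any pre-trained point predictor works; here, a least-squares line
slope, intercept = np.polyfit(x_train, y_train, 1)

def predict(x):
    return slope * x + intercept

# Conformal scores: absolute residuals on the calibration set
scores = np.abs(y_cal - predict(x_cal))

# Score quantile with the finite-sample correction for 90% coverage
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Distribution-free prediction interval for a new point
x_new = 5.0
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The key property, which the workshop develops in full, is that the interval `[lo, hi]` covers the true outcome at least 90% of the time on average, regardless of how good (or bad) the underlying point predictor is.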

Talk Title: Optimizing Recommendations on Wattpad Home

Presenters:
Gayathri Srinivasan, Senior AI/ML Product Manager, Wattpad | Abhimanyu Anand, Data Scientist, Wattpad

About the Speakers:
Gayathri Srinivasan is an accomplished AI product manager at Wattpad, specializing in personalized rankings and recommendations. With over 7 years of diverse product management experience across various industries, including startups, scale-ups, and enterprises, she brings a wealth of knowledge and expertise to her role.

Abhimanyu is a Data Scientist at Wattpad, an online social storytelling platform, where he leads the development of recommender systems for content recommendations. He holds an M.Sc. in Big Data Analytics from Trent University, with a specialization in natural language processing. He has developed and implemented robust AI solutions throughout his career across diverse domains, including internet-scale platforms, metals and mining, oil and gas, and e-commerce.

Talk Track: Applied Case Studies

Talk Technical Level: 4/7

Talk Abstract:
At Wattpad, the world’s leading online storytelling platform, recommendation systems are pivotal to our mission of connecting readers with the stories they love. The Home Page is the primary gateway to Wattpad’s diverse content and experiences. As the platform has evolved, we’ve introduced new content types and classes of stories to meet various business objectives, such as user engagement, merchandising, and marketing. This evolution necessitated recalibrating our homepage recommender system to effectively balance multiple business goals. In this talk, we will discuss how we integrated these objectives into the home recommender stack using probabilistic algorithms derived from the domain of reinforcement learning. Additionally, we will share the challenges we encountered during this transition, such as data sparsity and the cold start problem, along with insights into our development of novel graph neural network architectures tailored for recommendation systems and the new datasets we developed to overcome these hurdles.

What You’ll Learn:
The audience will learn about:
1. How Recommendation systems are used at scale for content recommendation.
2. Challenges associated with recommendation systems like data sparsity, balancing multiple objectives, etc.
3. How we solved these problems at Wattpad using novel graph-based models, multi-objective ranker, etc.
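
The abstract mentions “probabilistic algorithms derived from the domain of reinforcement learning” for balancing objectives. One common member of that family is a Beta-Bernoulli bandit with Thompson sampling; the sketch below is purely illustrative — the home-page module names and click rates are invented, and Wattpad’s actual system is not described at this level of detail:

```python
import random

random.seed(42)

# Invented modules and true (hidden) click rates.
modules = {"editor_picks": 0.10, "paid_stories": 0.04, "new_releases": 0.07}
stats = {m: [1, 1] for m in modules}  # Beta(clicks + 1, skips + 1)

for _ in range(5000):
    # Sample a plausible click rate per module, show the best draw.
    draws = {m: random.betavariate(a, b) for m, (a, b) in stats.items()}
    chosen = max(draws, key=draws.get)
    clicked = random.random() < modules[chosen]
    stats[chosen][0 if clicked else 1] += 1

best = max(stats, key=lambda m: stats[m][0] / sum(stats[m]))
print(best)
```

Because the algorithm samples from its posterior instead of always exploiting, it keeps exploring lower-traffic modules, which is one way probabilistic ranking balances competing objectives.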

Talk Title: Industrializing ML Workflows in Drug Discovery

Presenter:
Estefania Barreto, ML Engineer, Recursion Pharmaceuticals

About the Speaker:
Estefania Barreto-Ojeda is an ML Engineer at Recursion, where she builds and automates machine learning pipelines for drug discovery. A physicist by training, she holds a PhD in Biophysical Chemistry from the University of Calgary, where she participated in Google Summer of Code as an open-source software developer. She has given talks at several major data conferences, including PyData. Estefania is a full-time automation fan, an occasional open-source contributor, and a seasonal bicycle lover.

Talk Track: Research or Advanced Technical

Talk Technical Level:  5/7

Talk Abstract:
Recursion is committed to industrializing drug discovery by addressing the complexities of Machine Learning (ML) workflows head-on. A critical step in the drug discovery process is predicting compound properties such as Absorption, Distribution, Metabolism, and Excretion (ADME), potency, and toxicity, which allows a drug candidate to be evaluated for safety and efficacy, crucial for regulatory approval. To leverage its large volume of diverse and regularly updated chemical assay datasets, Recursion has engineered standardized, automated solutions to train and deploy predictive models on a weekly basis, accelerating the early stages of drug discovery. In this talk, we will offer a comprehensive overview of our industrialized workflows for developing and deploying ML compound property predictors, and share insights into Recursion’s strategy for data management, model training, and deployment using both cloud and supercomputing resources.

What You’ll Learn:
During this presentation, attendees will gain an understanding of our structured approach to creating and implementing machine learning models that predict compound properties in an industrial setting. We will explore Recursion’s approach to managing data, training models, and deploying them using a combination of cloud services and supercomputing resources.

Workshop: Optimizing Large Language Model Selection for Efficient GenAI Development

Presenters:
Royal Sequeira, Machine Learning Engineer, Georgian | Aslesha Pokhrel, Machine Learning Engineer, Georgian | Christopher Tee, Software Engineer, Georgian

About the Speakers:
Royal is a Machine Learning Engineer on Georgian’s R&D team. He helps Georgian’s portfolio companies develop product features and accelerate GTM strategies. His expertise is in Natural Language Processing, with broader experience in Multimodal Machine Learning, Computer Vision, Information Retrieval, and Reinforcement Learning. He has publications in top-tier conferences such as ACL, EMNLP, WSDM, and SIGIR. In the past, he has worked at Ada Support, LG Toronto AI Lab, and Microsoft Research India. In 2018, he founded Sushiksha, a mentorship organization that has mentored hundreds of medical and engineering students across rural India in both technical and soft skills. In his free time, he reads books, likes to learn new languages, and enjoys a hot chai with his friends.

Aslesha is a Machine Learning Engineer at Georgian, helping portfolio companies leverage ML solutions in various business use cases. She graduated from the University of Toronto with a Master’s in Applied Computing and a Bachelor’s in Computer Science and Physics. Her background includes significant research in deep learning and representation learning in various data modalities including language, time series and tabular data, which she now applies to driving innovation and efficiency.

Christopher is a tech enthusiast with a passion for code optimizations, efficient machine learning solutions and MLOps. He has extensive experience in building high-performance machine learning pipelines and orchestrating the lifecycle of machine learning models.
During his spare time, Christopher enjoys cycling and skiing.

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
When developing a Generative AI use case, developers face a variety of choices, particularly given the proliferation of foundational and open-source models. Choosing the most suitable large language model (LLM) for a given use case may involve fine-tuning, crafting tailored prompts, cost considerations, and evaluations, which can become cumbersome without a modular design approach. In this workshop, we will explore tools such as DSPy and FrugalGPT that help pick the best LLM for a use case. This will be a hands-on session focusing on practical applications.

What You’ll Learn:
The main goal of the workshop is to give attendees hands-on experience with tools such as DSPy and FrugalGPT to build a modular pipeline for choosing the best LLM for specific needs based on performance, cost, and scalability.

Prerequisite Knowledge (if required)
Install the following libraries before the workshop: ollama, dspy-ai, frugalGPT
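
To preview the FrugalGPT-style cascade idea (try cheap models first, escalate only when confidence is low), here is a hedged pure-Python sketch. The model names, costs, and confidence scorer are invented stand-ins, not the actual DSPy or FrugalGPT APIs:

```python
def cascade(prompt, models, threshold=0.8):
    """models: list of (name, cost, generate_fn, confidence_fn) tuples,
    ordered cheapest first. Returns the first confident answer."""
    total_cost = 0.0
    for name, cost, generate, confidence in models:
        answer = generate(prompt)
        total_cost += cost
        if confidence(prompt, answer) >= threshold:
            return name, answer, total_cost
    # Fall through: keep the last (most capable) model's answer.
    return name, answer, total_cost

# Stub models: a cheap one that is unsure, a pricier one that is sure.
cheap = ("small-llm", 0.001, lambda p: "maybe 4", lambda p, a: 0.5)
big = ("large-llm", 0.030, lambda p: "4", lambda p, a: 0.95)

name, answer, spent = cascade("What is 2 + 2?", [cheap, big])
print(name, answer, round(spent, 4))
```

The cost/quality trade-off lives entirely in the ordering of the list and the confidence threshold, which is the knob the workshop’s tooling helps you tune.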

Talk Title: AI for Hospitals at Scale

Presenter:
Jullian Yapeter, Machine Learning Scientist, Signal 1

About the Speaker:
Jullian is a Machine Learning Scientist at Signal 1. His focus is at the intersection of model development and ML infrastructure / MLOps. He is an engineer with a BASc. in Mechatronics Engineering from the University of Waterloo, and a M.S. in Computer Science from the University of Southern California. He was a research assistant at the CLVR Lab at USC, working on large-scale Offline RL under Prof. Joseph Lim. Jullian has industry experience working on AI / Computer Vision systems at Disney Imagineering, IBM, and a few different start-ups. Overall, he’s passionate about improving people’s lives through technology.

Talk Track: Research or Advanced Technical

Talk Technical Level:  5/7

Talk Abstract:
An exploration of the technical processes employed at Signal 1 that enable the training and deployment of machine learning models across various hospital settings, including zero-shot learning applications in patient deterioration prediction that generalize even to unseen hospitals.

This talk will also cover the specifics of our microservice architecture which underpins our system’s capability to consistently deliver timely and effective inference results, enabling scalable, data-driven decisions in patient care.

Attendees will gain insights into the practical challenges and solutions encountered in developing AI applications that can seamlessly integrate into and impact real-world clinical settings.

Whether you’re interested in the nuances of model development, deployment, or the practical implications of AI in healthcare, this session will offer valuable technical knowledge and perspectives.

We invite you to join this technical discourse at the intersection of AI and healthcare, contributing to a dialogue that’s shaping the future of AI applications in medical settings.

What You’ll Learn:
– An overview of the practical challenges in deploying ML in hospitals, such as generalization and scalability
– How we at Signal 1 tackle some of these challenges
– Discussions about some of the problems we’re still working on

Talk Title: RAGs in Production: Delivering Impact Safely and Efficiently

Presenters:
Everaldo Aguiar, Senior Engineering Manager, PagerDuty | Wendy Foster, Data Products Leader, Shopify | Margaret Wu, Senior Data Scientist, Advanced Analytics and AI, CIBC

About the Speakers:
Everaldo started his Data Science journey as a Data Science for Social Good Fellow at the Center for Data Science and Public Policy at UChicago. Today he is a Senior Engineering Manager at PagerDuty where he leads both the Data Science and Data Engineering teams, and a faculty member at the Khoury College at Northeastern University. Prior to that he was a Data Science Lead at Shopify’s Growth organization. Everaldo is originally from Brazil and Seattle has been home to him for 6 years.

With over 10 years of experience leading data organizations at scale, Wendy Foster divides her time between data start-up advising and applied data science education, supporting the next wave of data leaders and innovation in this rapidly evolving space.

Talk Track: Panel Discussion

Talk Technical Level:  4/7

Talk Abstract:
“Urgent” and “unplanned” are among the least favorite words in any productive team’s dictionaries. Unexpected issues disrupt roadmaps, delay important work, lead to burnout, and hurt customer trust.

Here at PagerDuty we’ve been leveraging AI to help our customers experience fewer incidents and resolve the ones they do have faster. This often involves giving them streamlined access to the information they need about our product and their individual setups, and an efficient way to get answers to complex questions on the fly.

As technologies evolved and we rolled out our generative AI infrastructure, RAGs became an excellent candidate for those use cases. They allow for an easy-to-automate process of building “knowledge bases” and using those to power chat-like applications, but productionalizing them safely is often more challenging than building the RAG systems themselves.

In this panel we’ll discuss some of these challenges, how we’ve been tackling them, as well as existing areas of open research we’re excited to pursue in the coming months.

What You’ll Learn:
Attendees will learn how to tackle some common (and uncommon) challenges that come with bundling RAG models into their own products. We’ll cover a few corner cases that were completely unexpected as well as automation processes that we designed to ensure that complex parts of our systems could be maintained with minimal engineering effort.

Workshop: Leveraging Large Language Models to Build Enterprise AI

Presenters:
Rohit Saha, Machine Learning Scientist, Georgian | Kyryl Truskovskyi, Machine Learning Scientist, Georgian | Benjamin Ye Machine Learning Scientist, Georgian | Angeline Yasodhara Machine Learning Engineer, Georgian

About the Speakers:
Rohit is a Machine Learning Scientist on Georgian’s R&D team, where he works with portfolio companies to accelerate their AI roadmap. This includes scoping research problems to building ML models to moving them into production. He has over 5 years of experience developing ML models across Vision, Language and Speech modalities. His latest project entails figuring out how businesses can leverage Large Language Models (LLMs) to address their needs. He holds a Master’s degree in Applied Computing from the University of Toronto, and has spent 2 years at MIT and Brown where he worked at the intersection of Computer Vision and domain adaptation.

Kyryl is a seasoned ML professional, currently based in Canada. With a rich 9-year background in ML, he has evolved from hands-on coding to architecturing key ML business solutions.

Ben is a Machine Learning Engineer at Georgian, where he helps companies to implement the latest techniques from ML literature. He obtained his Bachelor’s from Ivey and Master’s from Penn. Prior to Georgian, he worked in quantitative investment research.

Angeline is a Machine Learning Scientist at Georgian, collaborating with companies to accelerate their AI product development. Before joining Georgian, she was a research assistant at the Vector Institute, working at the intersection of machine learning and healthcare, focusing on explainability and causality. From explainability, time series, outlier detection to LLMs, she applies the latest techniques to enhance product differentiation.

Talk Track: Workshop

Talk Technical Level:  3/7

Talk Abstract:
Generative AI is poised to disrupt multiple industries as enterprises rush to incorporate AI into their product offerings. The primary driver of this technology has been the ever-increasing sophistication of Large Language Models (LLMs) and their capabilities. In the first innings of Generative AI, a handful of third-party vendors have led the development of foundational LLMs and their adoption by enterprises. However, open-source LLMs have made massive strides lately, to the point where they compete with or even outperform their closed-source counterparts. This competition presents a unique opportunity for enterprises that are still navigating the trenches of Generative AI and how best to utilize LLMs to build enduring products. This workshop (i) showcases how open-source LLMs fare compared to closed-source LLMs, (ii) provides an evaluation framework that enterprises can leverage to compare and contrast different LLMs, and (iii) introduces a toolkit that enables easy fine-tuning of LLMs followed by unit testing (https://github.com/georgian-io/LLM-Finetuning-Toolkit).

What You’ll Learn:
By the end of this workshop, attendees will know how to create instruction-based datasets, fine-tune open-source LLMs via ablation studies and hyperparameter optimization, and unit-test fine-tuned LLMs.

Prerequisite Knowledge (if required)
Python + Familiarity with concepts such as prompt designing and LLMs
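
As a small warm-up for the “instruction-based datasets” step, the sketch below converts raw Q&A records into Alpaca-style instruction records. The schema and records are invented for illustration and are not necessarily the format the linked toolkit requires:

```python
import json

# Invented raw records; in practice these come from your own data.
records = [
    {"question": "What does RAG stand for?",
     "answer": "Retrieval-Augmented Generation."},
    {"question": "Name one open-source LLM.",
     "answer": "Mistral-7B."},
]

def to_instruction_format(rec):
    # Alpaca-style fields: instruction / input / output.
    return {
        "instruction": "Answer the question concisely.",
        "input": rec["question"],
        "output": rec["answer"],
    }

# Fine-tuning tools commonly consume one JSON object per line (JSONL).
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(to_instruction_format(rec)) + "\n")

print(sum(1 for _ in open("train.jsonl")), "examples written")
```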

Workshop: Kùzu - A Fast, Scalable Graph Database for Analytical Workloads

Presenter:
Prashanth Rao, AI Engineer, Kùzu, Inc.

About the Speaker:
Prashanth is an AI engineer at Kùzu based in Toronto. In recent years, he’s worked with numerous databases and data modeling paradigms, with a focus on data engineering, analytics and machine learning to power a variety of applications. He enjoys engaging with the data community and blogging @ thedataquarry.com in his spare time.

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
In this session, we will introduce Kùzu, a highly scalable, extremely fast, easy-to-use, open-source embedded graph database designed for analytical query workloads. Users who are familiar with DuckDB in the SQL world will find Kùzu to be a refreshingly familiar graph analogue. We will also highlight a number of state-of-the-art methods from graph database research.

The workshop will include a practical component that showcases how simple and easy-to-use Kùzu is for data scientists and engineers. We will demonstrate popular use cases by transforming a relational dataset (in the form of tables) into a knowledge graph, run Cypher queries on the graph, analyze the dataset using graph algorithms, and train a simple graph neural network using PyTorch Geometric to compute node embeddings and store them in the graph database for downstream use cases. We will end by summarizing how these methods can help build advanced RAG systems that can be coupled with an LLM downstream.


What You’ll Learn:
1. What are knowledge graphs
2. The characteristics of competent graph database systems
3. How to work with graphs on real-world data
4. How to query a graph in Cypher
5. How to run graph algorithms for graph data science
6. How to do graph machine learning

The core message that attendees will take away is this: There are times when modeling tabular/relational data as a graph is necessary and useful, e.g., to obtain a more object-oriented model over your records or find indirect connections/paths between the entities in the data. In such cases, using an open source, embedded graph database like Kùzu is a simple and low-barrier-to-entry option to analyze the connected data at a much greater depth via graph data structures.

Prerequisite Knowledge (if required)
Basic Python programming skills (all the background for what knowledge graphs are, and how to work with a graph database will be provided to users who are new to the world of graphs).
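
The core message above — relational rows become a graph in which indirect connections are easy to find — can be previewed without any graph database. The sketch below uses plain Python and an invented edge table; the workshop itself does this at scale with Kùzu and Cypher:

```python
from collections import deque

# An "edge table" as relational rows: (follower, followed).
follows = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]

# Turn the rows into an adjacency-list graph.
graph = {}
for src, dst in follows:
    graph.setdefault(src, []).append(dst)

def shortest_path(graph, start, goal):
    """BFS for the shortest directed path; returns None if unreachable.
    In SQL this multi-hop question needs recursive joins; in Cypher it
    is a one-line variable-length path pattern."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path(graph, "alice", "dave"))  # ['alice', 'bob', 'carol', 'dave']
```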

Talk Title: How Is GenAI Reshaping the Business?

Presenter:
Jaime Tatis, VP-Chief Insights Architect, TELUS

About the Speaker:
Jaime Tatis is a visionary and technology thought leader with strong business acumen and a proven track record of collaborating with both technical and non-technical teams to drive critical initiatives. Jaime is passionate about building and developing diversely skilled high-performance teams, growing future leaders and driving business efficiency through continuous improvement and innovation.

As the Chief Insights and Analytics Officer at TELUS, a world-leading technology company, Jaime works with partners across the TELUS family of companies leading the advancement of data, AI and analytics strategy and the company’s cultural shift to create cutting-edge customer technology solutions. By thoughtfully providing data insights and analytics, along with next-generation cloud-based architecture to enable world-class Artificial Intelligence and Machine Learning capabilities, Jaime is improving business outcomes and providing best-in-class customer experiences for TELUS.

Talk Track: Business Strategy or Ethics

Talk Technical Level: 2/7

Talk Abstract:
Generative AI offers transformative advantages across all sectors, with unlimited possibilities. The audience will learn about AI applications and how they enhance efficiency, foster innovation, and elevate problem-solving, with real examples.

What You’ll Learn:
TBA

Workshop: Open-Source Agentic RAG with LangChain & Mistral-7B

Presenters:
Greg Loughnane, Co-Founder, AI Makerspace | Chris Alexiuk, Co-Founder & CTO, AI Makerspace

About the Speakers:
Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he is an instructor for their [AI Engineering Bootcamp](https://maven.com/aimakerspace/ai-eng-bootcamp). Since 2021 he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.

Chris Alexiuk is the Co-Founder & CTO at AI Makerspace, where he is an instructor for their [AI Engineering Bootcamp](https://maven.com/aimakerspace/ai-eng-bootcamp). Previously, he was a founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
2024 is the year of agents! People and companies are now aiming beyond RAG to more complex applications connected to tools capable of reasoning, taking action, and observing.

A fully autonomous agent runs in a loop, or cycle, using a Reasoning-Action approach. LangChain is the tool with the largest community building LLM applications, and with the release of LangChain v0.1 came LangGraph, the engine that powers these autonomous agentic cycles.

In this session, we’ll break down the concepts and code you need to understand and build the industry-standard agentic RAG application; namely, one with access to backup internet search.

For the open-source version of the application, we’ll use Mistral-7B as the LLM and DuckDuckGo for backup search!

Per LangChain’s recommended best practices, we’ll still use OpenAI Function Calling to build an OpenAI Functions Agent. This will ensure the best performance while avoiding additional fine-tuning required for an open-source model to serve as the reasoning engine.

Who should attend the event?

– LLM practitioners who want to understand how to build agentic RAG applications
– Aspiring AI Engineers who want to start building at the open-source edge with LangChain
– AI Engineering leaders interested in connecting LLMs to RAG apps and other tools simultaneously

What You’ll Learn:
– Understand agents, agent-like, and agentic behavior as a pattern of reasoning and action for creating applications of increasing complexity
– The fine lines between RAG, Agents, and Agentic RAG
– Learn to build virtual assistants with reasoning capabilities using LangChain and Open-Source LLMs

Prerequisite Knowledge (if required)
– Working knowledge of how to run Machine Learning Python code in Jupyter Notebooks
– Practical knowledge of how to use an interactive development environment with version control so that you can engage with our public GitHub repo
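
To preview the Reasoning-Action loop this session builds with LangChain and LangGraph, here is a framework-free sketch. The stub “LLM” and “search tool” below are placeholders for Mistral-7B and DuckDuckGo, not real APIs:

```python
def fake_llm(question, observations):
    # Stand-in for the reasoning engine: with no observations yet,
    # decide to call the search tool; afterwards, produce an answer.
    if not observations:
        return {"action": "search", "input": question}
    return {"action": "finish", "output": f"Answer based on: {observations[-1]}"}

def fake_search(query):
    # Stand-in for the backup internet-search tool.
    return f"search results for '{query}'"

def agent_loop(question, max_steps=5):
    """Reason -> act -> observe, repeated until the LLM decides to finish."""
    observations = []
    for _ in range(max_steps):
        step = fake_llm(question, observations)
        if step["action"] == "finish":
            return step["output"]
        observations.append(fake_search(step["input"]))
    return "gave up"

print(agent_loop("Who is the CEO of AI Makerspace?"))
```

In the real application, the decision of which tool to call is delegated to OpenAI Function Calling, exactly as the abstract describes; the loop structure stays the same.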

Sign Up for TMLS 2024 News Updates

Who Attends


2023 Technical Background

Expert
17.5%
Advanced
47.3%
Intermediate
21.1%
Beginner
5.6%


Business Leaders: C-Level Executives, Project Managers, and Product Owners will get to explore best practices, methodologies, principles, and practices for achieving ROI.

Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.

Job Seekers: Will have the opportunity to network virtually and meet 30+ Top AI Companies.

What is an Ignite Talk?

Ignite is an innovative and fast-paced style used to deliver a concise presentation.

During an Ignite Talk, presenters discuss their research using 20 image-centric slides which automatically advance every 15 seconds.

The result is a fun and engaging five-minute presentation.

You can see all our speakers and full agenda here

Get our official conference app
For feature details, visit Whova