June 12th (Virtual)
June 13th - 14th (In-Person)
The Carlu 444 Yonge St #7
Toronto, ON M5B 2H4, Canada
Drop us a line:
info@torontomachinelearning.com
Join us as we celebrate key learnings, community networking, and the inspiring takeaways from 2022.
5 Tracks
1️⃣ Cutting Edge Research
2️⃣ Technical Case Studies
3️⃣ Up-skill Workshops
4️⃣ Business Strategy
5️⃣ Ignite Discussions
Come join leaders in a dynamic and fun series of rapid-fire presentations, where each presents a key idea while slides advance every 15 seconds.
Create your own gathering using our event app, or join on the breaks to meet speakers and peers. Here, projects are discussed and lasting connections are made!
Join our community and hear from peers in a unique learning environment. Listen and learn, or actively participate.
Pick your workshops. Get skilled up. Intro to Graph ML, Feature Stores, Application Delivery on Kubernetes, Iterating on NLP Models, and more!
Learn from some of the brightest minds from the Vector Institute, Apple, Google Brain, University of Toronto, and more.
Angeline Yasodhara, Applied Research Scientist, Georgian & Benjamin Ye, Applied Research Scientist, Georgian
Mahmudul Hasan, Lead Data Scientist, TELUS Business Marketing
Dr. Nasim Abdollahi, Machine Learning Researcher & Dr. Farnoosh Khodakarami, Computer Scientist & ML Researcher, Cyclica
Stefanie Molin, Software Engineer / Data Scientist, Bloomberg
Eric Hammel, MLOps Engineer, Rocket Science Development
For 7 years, TMLS has hosted a unique blend of cutting-edge research, hands-on workshops, and vetted industry case studies, reviewed by committee, for your team’s expansion and growth.
We put an emphasis on community, learning, and accessibility.
Suhas Pai
Chief Technology Officer, TMLS 2022 Chair
Ask about Single Day Passes (Limited)
Business Leaders: C-level executives, project managers, and product owners will get to explore best practices, methodologies, and principles for achieving ROI.
Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.
Job Seekers: Will have the opportunity to network virtually and meet over 60 top AI start-ups and companies during the EXPO & Career Fair.
TMLS is a community response addressing the need to unite academic research, industry opportunities and business strategy in an environment that is safe, welcoming and constructive for those working in the fields of ML/AI.
See our team and learn more about the Toronto Machine Learning Society here.
Email for Brochure: faraz@torontomachinelearning.com
Ignite: What is an Ignite Talk?
Ignite is an innovative and fast-paced style used to deliver a concise presentation.
During an Ignite Talk, presenters discuss their research using 20 image-centric slides which automatically advance every 15 seconds.
The result is a fun and engaging five-minute presentation.
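As a quick check of the timing described above, the sketch below works out the talk length and the moment each slide appears. The function name `ignite_schedule` and its parameters are hypothetical, chosen only for this illustration; the slide count and interval are the ones stated in the text.

```python
def ignite_schedule(slides=20, seconds_per_slide=15):
    """Return (total_seconds, start_times) for an auto-advancing slide deck."""
    # Slide i (0-based) appears i * seconds_per_slide seconds into the talk.
    start_times = [i * seconds_per_slide for i in range(slides)]
    total_seconds = slides * seconds_per_slide
    return total_seconds, start_times


total_seconds, start_times = ignite_schedule()
print(f"Talk length: {total_seconds // 60} min {total_seconds % 60} s")
print(f"Last slide appears at {start_times[-1]} s")
```

With the defaults, 20 slides at 15 seconds each come to 300 seconds, which is the five-minute format.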
You can see all our speakers and full agenda here
Meet and speak with incredible leaders and peers!
We have allocated space for open discussion and interaction via our Ignite Presentation & Discussion Tracks.
It’s often where the magic happens!
We’ll have a series of short themed presentations followed by group discussions in an open and lightly structured format.
Perfect for learning, and sharing your own projects amongst peers!
Presenters: Benjamin Ye, Applied Research Scientist, Georgian & Angeline Yasodhara, Applied Research Scientist, Georgian
About the Speaker:
Applied Research Scientists
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 5/7)
What you’ll learn:
Time series anomaly detection methods and applications
Abstract of Talk:
Traditional methods in time series anomaly detection yield good results for relatively simple tasks, but they often fall short when it comes to harder problems of dealing with long-range dependencies, multivariate time series, and subtle contextual anomalies. We introduce a toolkit incorporating classical and novel machine learning techniques (N-BEATS, Transformers, etc.) as well as recent thresholding methods to overcome these challenges.
We will discuss their benchmark results against different anomaly types for both univariate and multivariate cases. We will walk through how you can use this simple toolkit and easily incorporate these techniques into your application.
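For readers unfamiliar with the "traditional methods" mentioned in the abstract, here is a minimal sketch of one classical baseline: a rolling z-score detector with a fixed threshold. It is not the presenters' toolkit. The function name, window size, threshold, and sample readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to compare against; skip.
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical sensor readings with one obvious spike at index 6.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.8, 10.2]
flagged = rolling_zscore_anomalies(readings, window=5, threshold=3.0)
print(flagged)
```

A detector like this looks only at a short univariate window. That is one reason such baselines can struggle with the long-range dependencies, multivariate series, and contextual anomalies the abstract describes.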
Lead Data Scientist, TELUS Business Marketing
With deep expertise in Machine Learning and AI, Mahmudul has over 10 years of industry experience building enterprise-level data products that drive digital transformation, improve customer experience, open new revenue opportunities, and deliver cost savings for companies across the globe. He currently serves as a Lead Data Scientist in TELUS Business Marketing. Mahmudul also designed and developed NLP course content for the University of Toronto School of Continuing Studies, where he serves as an instructor for the course.
Mahmudul holds a Master’s degree in Management Science from University of Waterloo and a Bachelor’s in Computer Science & Engineering.
Workshop: Introduction to NLP & a Step by Step Implementation of a Real World Use Case from TELUS
Abstract: The workshop will be delivered in two parts:
– Part 1: A brief introduction to NLP concepts and ideas, including:
– Basic definitions and use cases
– Why NLP is a different ball game within AI/ML (major challenges of processing natural language, etc.)
– How those challenges are overcome with ML-based approaches
– The major workflow of building an NLP application
– Part 2: A detailed implementation of a case study, with coding details, that I implemented at TELUS. In this part, the audience will see how a business problem is solved by leveraging unstructured text data with NLP algorithms, along with the tips and tricks that make an unsupervised-learning-based project financially successful for the company.
What You’ll Learn: The audience will see how a business problem is solved by leveraging unstructured text data with NLP algorithms, along with the tips and tricks that make an unsupervised-learning-based project financially beneficial for the business.
Technical Level: 6
Location: Toronto
Presenter:
Mahmudul Hasan, Lead Data Scientist, TELUS Business Marketing
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 6/7)
Are there any industries (in particular) that are relevant for this talk?
Computer Software, Marketing & Advertising, Telecommunications
Who is this presentation for?
Senior Business Executives, Product Managers, Data Scientists/ML Engineers, and High-level Researchers
What you’ll learn:
The audience will see a real-world case study of how an unsupervised NLP algorithm can successfully create value for a business, along with tips and tricks that make this kind of project successful for a data scientist.
What is the main core message (learning) you want attendees to take away from this talk?
The audience will see how a business problem is solved by leveraging unstructured text data with NLP algorithms, along with the tips and tricks that make an unsupervised-learning-based project financially beneficial for the business.
Pre-requisite Knowledge:
Some basic understanding of data science
What is unique about this speech, from other speeches given on the topic?
The audience will get an idea of how unstructured data can be converted into financially impactful benefits for a business. I will also share some tips on how to make this kind of unsupervised-learning-based project a success at a big corporation like TELUS.
Can you suggest 2-3 topics for post-discussion?
1. What are the challenges of implementing a data science project in business?
2. How can you make your AI/ML project impactful for the business?
Postdoctoral Fellow, University of Toronto / Machine Learning Researcher, Cyclica
Nasim is a Postdoctoral Fellow at University of Toronto and a Machine Learning Researcher Intern at Cyclica, leading a collaborative project between Cyclica, University of Toronto and Vector Institute. She is the vice-chair of Engineering in Medicine and Biology Society of IEEE Toronto section. Nasim obtained her Ph.D. in electrical and computer engineering from University of Manitoba and has M.Sc. and B.Sc. in biomedical engineering. With her passion for developing and applying novel machine learning techniques for improving the quality of health care, she has conducted numerous research projects on enhancing biomedical imaging for breast cancer detection and monitoring. Her current research is focused on graph-based machine learning models that can predict proteins’ biological functions from their 3D atomic structures, with a promise to enhance designing novel medicines. Nasim is an advocate for women in STEM, serves as vice-chair of IEEE Canada Women in Engineering, and was recognized as a “Visionary Emerging Leader”.
Co-Presenter: Dr. Farnoosh Khodakarami
Workshop: Graph Neural Network Modeling in Drug Discovery Using PyTorch
Abstract: Graph Neural Networks (GNNs) are among the most popular neural network architectures, and because a graph is a natural representation for proteins and molecules, GNNs have sparked significant progress in graph-based ML modeling for drug discovery and protein science. Graph-based ML models can help identify the topology of a protein structure from its sequence, predict a protein’s biological functions from its structure, and identify protein-protein and protein-drug interactions. In this workshop, we will give an introduction to GNNs and their application in drug discovery, followed by a code session on PyTorch Geometric, a PyTorch library for building GNN models for structured data. We will then have a code-based session to walk you through two useful tools built with PyTorch Geometric: TorchDrug and NodeCoder.
What You’ll Learn: The audience will learn about:
– Graph Neural Networks (GNNs) in drug discovery
– How to build a GNN with PyTorch Geometric
– TorchDrug – an ML platform for drug discovery
– TorchProtein – an ML library for protein science
– NodeCoder – a graph-based ML framework for predicting proteins’ biological functions
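To make the graph idea concrete, the sketch below shows one round of neighborhood aggregation, the basic step that GNN layers build on, using only plain Python. It does not use PyTorch Geometric, TorchDrug, or NodeCoder, and it has no learned weights, which real GNN layers add along with nonlinearities. The function name `mean_aggregate`, the toy three-atom chain, and its one-hot features are assumptions made for illustration.

```python
def mean_aggregate(features, edges):
    """One round of mean aggregation over each node's neighbors (plus itself).

    features: list of per-node feature vectors (lists of floats)
    edges: list of undirected (u, v) index pairs
    """
    n = len(features)
    # Start every neighborhood with a self-loop so a node keeps its own signal.
    neighbors = {i: [i] for i in range(n)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    dim = len(features[0])
    updated = []
    for i in range(n):
        nbrs = neighbors[i]
        updated.append(
            [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        )
    return updated


# Toy "molecule": a chain of three atoms 0-1-2 with one-hot atom-type features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
edges = [(0, 1), (1, 2)]
updated = mean_aggregate(features, edges)
print(updated)
```

After one round, each atom's vector mixes in information from its bonded neighbors. Stacking several such rounds lets information travel further across the graph.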
Technical Level: 7
Location: Toronto
Presenters:
Dr. Nasim Abdollahi, Postdoctoral Fellow at University of Toronto, Machine Learning Researcher at Cyclica & Dr. Farnoosh Khodakarami, Computer Scientist & ML Researcher, Cyclica
About the Speaker:
Farnoosh Khodakarami is an experienced computer scientist with a demonstrated history of working in the research industry. Skilled in application development with experience in machine learning applications. Strong research professional with a Doctor of Philosophy (Ph.D.) focused in Computer Science. Creative, self-motivated, and committed to working with a team-player attitude, great problem-solving skills, and the ability to quickly grasp new concepts.
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 7/7)
Are there any industries (in particular) that are relevant for this talk?
Hospital & Health Care
Presenter:
Shagun Sodhani, Research Engineer, Meta AI
About the Speaker:
Research Engineer at Meta AI, previously at Mila and Adobe Research
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 5/7)
What you’ll learn:
By the end of the session, attendees will be able to take a simple PyTorch model and scale it to work with dozens of machines. For straightforward use cases, this will require writing just a few lines of code.
Abstract of Talk:
PyTorch is one of the most popular ML frameworks, with recent releases focusing on enhanced support for distributed training. This talk discusses the different distributed training mechanisms provided by PyTorch. It should be helpful for both practitioners and researchers who want to train larger models and train them faster.
Software Engineer / Data Scientist, Bloomberg
Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a Bachelor of Science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science. She is currently pursuing a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
Workshop: Beyond the Basics: Data Visualization in Python
Abstract: The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python.
While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine tune the resulting visualizations (e.g., add annotations, animate, etc.). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood.
What You’ll Learn: Data visualization is essential for anyone working with data, but sometimes it can be difficult to create impactful visualizations in Python. In this workshop, we will move beyond the plotting basics and explore how to make compelling static, animated, and interactive visualizations.
Technical Level: 4
Location: New York City
Presenter:
Stefanie Molin, Software Engineer / Data Scientist, Bloomberg
Which talk track does this best fit into?
Workshop (1.5-4 hours)
Technical level of your talk?
(Technical level: 4/7)
Who is this presentation for?
Data Scientists/ML Engineers, Researchers
What you’ll learn:
A workshop gives attendees opportunities to ask questions to make sure they understand the concepts. Attendees will also work through curated examples that use real-world data rather than the dummy or randomly generated data found nearly everywhere. Each visualization is created step by step, showing how it changes with each command, which gives attendees a much stronger grasp of concepts that they can apply elsewhere.
What is the main core message (learning) you want attendees to take away from this talk?
Data visualization is essential for anyone working with data, but sometimes it can be difficult to create impactful visualizations in Python. In this workshop, we will move beyond the plotting basics and explore how to make compelling static, animated, and interactive visualizations.
Pre-requisite Knowledge:
You should have basic knowledge of Python and be comfortable working in Jupyter Notebooks. Check out this notebook for a crash course in Python or work through the official Python tutorial for a more formal introduction. The environment we will use for this workshop comes with JupyterLab, which is pretty intuitive, but be sure to familiarize yourself with using notebooks in JupyterLab and with the additional functionality in JupyterLab. In addition, a basic understanding of pandas will be beneficial, but is not required; reviewing the first section of my pandas workshop will be sufficient.
What is unique about this speech, from other speeches given on the topic?
My teaching style is very different: since the code examples I provide are carefully chosen, it’s easy to see why you would take the approach I show, and I make sure that attendees understand exactly what each line of code is doing to make that happen. I find that this gives attendees knowledge that they can apply to other problems, rather than just knowing that the code all together has some effect; they get a deeper understanding and can use the concepts like building blocks for their own use cases. Attendees often praise the content in the slides as a detailed reference for later as well.
Can you suggest 2-3 topics for post-discussion?
Anything relating to the content covered, building data tools, or writing a book/creating workshops
Presenter:
Eric Hammel, MLOps Engineer, Rocket Science Development
About the Speaker:
A resourceful professional able to bridge Data Science and Infrastructure (Cloud and HPC) skills to deliver valuable solutions, with experience in prototyping, deploying, and monitoring distributed workloads that help an organization translate real-life business problems into scalable, value-generating data science solutions.
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 5/7)
What you’ll learn:
Participants will get a crash course in Kubernetes and Cloud Native concepts. They will learn how to deploy an application on a managed Kubernetes cluster.
Abstract of Talk:
Have you ever wondered what Kubernetes and Cloud Native applications are?
Here is the perfect opportunity to get exposed to these complex yet powerful tools & concepts.
You will discover Container Orchestration, Cloud Native applications, Kubernetes, and application deployment.
Presenter:
Piero Molino, CEO & Co-Founder, Predibase
Which talk track does this best fit into?
Technical
Technical level of your talk?
(Technical level: 5/7)
Abstract of Talk:
Declarative Machine Learning Systems are a new trend that marries the flexibility of DIY machine learning infrastructure and the simplicity of AutoML solutions. In this talk we will discuss Ludwig, the open source declarative deep learning framework, and Predibase, an enterprise-grade solution based on it.
Professor, Cheriton School of Computer Science, University of Waterloo
Director, Head of Apple Knowledge Platform, Apple
Ihab Ilyas is a professor in the Cheriton School of Computer Science and the NSERC-Thomson Reuters Research Chair on data quality at the University of Waterloo. He is currently on leave at Apple to lead the Apple Knowledge Platform team. His main research focuses on the areas of Data Science and data management, with special interest in data quality and integration, managing uncertain data, machine learning for data curation, and information extraction. Ihab is a co-founder of Tamr, a startup focusing on large-scale data integration, and the co-founder of inductiv (acquired by Apple), a Waterloo-based startup on using AI for structured data cleaning. He is an ACM Fellow and IEEE Fellow, a recipient of the Ontario Early Researcher Award, a Cheriton Faculty Fellowship, an NSERC Discovery Accelerator Award, and a Google Faculty Award. Ihab was an elected member of the VLDB Endowment board of trustees (2016-2021), elected SIGMOD vice chair (2016-2021), an associate editor of the ACM Transactions on Database Systems (2014-2020), and an associate editor of Foundations of Database Systems. He holds a PhD in computer science from Purdue University, West Lafayette.
Talk: Saga: Continuous Construction and Serving of Large Scale Knowledge Graphs
Abstract: In this talk I present Saga, an end-to-end platform for incremental and continuous construction of large scale knowledge graphs we built at Apple. Saga demonstrates the complexity of building such a platform in industrial settings with strong consistency, latency, and coverage requirements. In the talk, I will discuss challenges around the following: building source adapters for ingesting heterogeneous data sources; building entity linking and fusion pipelines for constructing coherent knowledge graphs that adhere to a common controlled vocabulary; updating the knowledge graphs with real-time streams; and finally, exposing the constructed knowledge via a variety of services. Graph services include: low-latency query answering; graph analytics; ML-based entity disambiguation and semantic annotation; and other graph-embedding services to power multiple downstream applications. Saga is used in production at large scale to power a variety of user-facing knowledge features.
What You’ll Learn: Complexity of building large scale knowledge graphs
Track: Technical
Technical Level: 5
Location: Seattle
Presenter:
Ihab Ilyas, Professor in the Cheriton School of Computer Science and the NSERC-Thomson Reuters Research Chair on Data quality at the University of Waterloo
Which talk track does this best fit into?
Technical / Research
Technical level of your talk?
(Technical level: 5/7)
Are there any industries (in particular) that are relevant for this talk?
Information Technology & Service
What is the main core message (learning) you want attendees to take away from this talk?
Complexity of building large scale knowledge graphs.
CEO, Claypot AI
Chip Huyen is a co-founder of Claypot AI, a platform for real-time machine learning. Previously, she was with Snorkel AI and NVIDIA. She teaches CS 329S: Machine Learning Systems Design at Stanford. She’s the author of Designing Machine Learning Systems, an Amazon bestseller in AI. She has also written four bestselling Vietnamese books.
Talk: Real-time Machine Learning: Architecture and Challenges
Abstract: Fresh data beats stale data for machine learning applications. This talk discusses the value of fresh data as well as different types of architecture and challenges of online prediction.
What You’ll Learn: Fresh data beats stale data for machine learning applications
Track: Technical
Technical Level: 5
Location: San Francisco
Presenter:
Chip Huyen, CEO, Claypot AI
Which talk track does this best fit into?
Technical / Research
Technical level of your talk?
(Technical level: 5/7)
Are there any industries (in particular) that are relevant for this talk?
Banking & Financial Services, Computer Software, Information Technology & Service, Insurance, Marketing & Advertising
What is the main core message (learning) you want attendees to take away from this talk?
Fresh data beats stale data for machine learning applications
Professor, University of Toronto
Anne Martel is a Professor in Medical Biophysics at the University of Toronto, the Tory Family Chair in Oncology at Sunnybrook Research Institute, and a Faculty Affiliate at the Vector Institute, Toronto. Her research program is focused on medical image and digital pathology analysis, particularly on the development of self-supervised and weakly supervised methods for segmentation, diagnosis, and prediction/prognosis. In 2006 she co-founded Pathcore, a software company developing complete workflow solutions for digital pathology.
Dr. Martel is an active member of the medical image analysis community and is a fellow of the MICCAI Society, which represents engineers and computer scientists working in this field. She has served as a board member of MICCAI and is currently on the editorial board of Medical Image Analysis, one of the leading journals in the field.
Talk: Artificial Intelligence And Digital Pathology: Making The Most of Limited Annotated Data
Abstract: Obtaining large datasets with detailed annotations for medical imaging AI projects is a time consuming and expensive process as it usually requires the input of expert radiologists and pathologists. Collecting data to train outcome prediction models is even more challenging as the number of patients with both imaging and follow up data may be small, and only weak labels are available.
This talk will describe several semi-supervised and self-supervised approaches which can make more efficient use of small and/or weakly labelled datasets. The focus will be on digital pathology, but the methods described are applicable to any medical imaging modality.
What You’ll Learn: Self-supervision and smart sampling strategies are essential in digital pathology
Track: Advanced Technical/Research
Technical Level: 6
Location: Toronto
Presenter:
Anne Martel, Professor, University of Toronto
About the Speaker:
Anne Martel is a Professor in Medical Biophysics at the University of Toronto, the Tory Family Chair in Oncology at Sunnybrook Research Institute, and a Faculty Affiliate at the Vector Institute, Toronto. Her research program is focused on medical image and digital pathology analysis, particularly on the development of self-supervised and weakly supervised methods for segmentation, diagnosis, and prediction/prognosis. In 2006 she co-founded Pathcore, a software company developing complete workflow solutions for digital pathology.
Dr Martel is an active member of the medical image analysis community and is a fellow of the MICCAI Society, which represents engineers and computer scientists working in this field. She has served as a board member of MICCAI and is currently on the editorial board of Medical Image Analysis, one of the leading journals in the field.
Which talk track does this best fit into?
Advanced Technical / Research
Technical level of your talk?
(Technical Level: 6/7)
What you’ll learn:
Self-supervision and smart sampling strategies are essential in digital pathology
Abstract of Talk:
Obtaining large datasets with detailed annotations for medical imaging AI projects is a time consuming and expensive process as it usually requires the input of expert radiologists and pathologists. Collecting data to train outcome prediction models is even more challenging as the number of patients with both imaging and follow up data may be small, and only weak labels are available.
This talk will describe several semi-supervised and self-supervised approaches which can make more efficient use of small and/or weakly labelled datasets. The focus will be on digital pathology, but the methods described are applicable to any medical imaging modality.
Senior Research Scientist, Sony AI
Varun Kompella is currently a senior research scientist at Sony AI. He earned his Master of Science degree in informatics with a specialization in graphics, vision and robotics from Institut National Polytechnique de Grenoble (INRIA Grenoble), and a Ph.D. degree from Università della Svizzera Italiana (IDSIA Lugano), Switzerland, working with Prof. Juergen Schmidhuber. In his thesis work he developed algorithms that use the slowness principle for driving exploration in reinforcement learning agents. After completing his Ph.D., he worked as a postdoctoral researcher at the Institute for Neural Computation (INI), Germany. His research contributions have led to several patents and to publications in peer-reviewed journals and conference proceedings.
Talk: Outracing Champion Gran Turismo Drivers With Deep Reinforcement Learning
Abstract: Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world’s best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics.
In addition, we construct a reward function that enables the agent to be competitive while adhering to racing’s important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world’s best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
What You’ll Learn: We demonstrate the possibilities and challenges of using deep RL techniques to control complex dynamical systems in domains such as Gran Turismo where agents must respect imprecisely defined human norms.
Track: Technical / Research
Technical Level: 7
Location: Ottawa
Presenter:
Varun Raj Kompella, Senior Research Scientist, Sony AI
About the Speaker:
Varun Kompella is currently a senior research scientist at Sony AI. He earned his Master of Science degree in informatics with a specialization in graphics, vision and robotics from Institut National Polytechnique de Grenoble (INRIA Grenoble), and a Ph.D. degree from Università della Svizzera Italiana (IDSIA Lugano), Switzerland, working with Prof. Juergen Schmidhuber. In his thesis work he developed algorithms that use the slowness principle for driving exploration in reinforcement learning agents. After completing his Ph.D., he worked as a postdoctoral researcher at the Institute for Neural Computation (INI), Germany. His research contributions have led to several patents and to publications in peer-reviewed journals and conference proceedings.
Which talk track does this best fit into?
Technical / Research
Technical level of your talk?
(Technical Level: 7/7)
What you’ll learn:
We demonstrate the possibilities and challenges of using deep RL techniques to control complex dynamical systems in domains such as Gran Turismo where agents must respect imprecisely defined human norms.
Abstract of Talk:
Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world’s best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics.
In addition, we construct a reward function that enables the agent to be competitive while adhering to racing’s important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world’s best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
Software Engineer, Google Brain
Bo Chang is a software engineer at Google Brain, based in Toronto, Canada. Prior to that, he was a machine learning researcher at Borealis AI. He finished his Ph.D. in statistics at the University of British Columbia.
Talk: Latent User Intent Modeling in Recommender Systems
Abstract: Current sequential recommender systems mainly rely on users’ item-level interaction history to capture topical interests and lack a high-level understanding of user intent. It is challenging to explicitly define and enumerate all possible user intents. We propose to use latent variable models to capture user intents as latent variables through encoding and decoding user behavior signals, with an application to a large industrial recommender system.
What You’ll Learn: How to better model user intent in recommender systems using a latent variable model.
Track: Advanced Technical / Research
Technical Level: 7
Location: Toronto
Presenter:
Bo Chang, Software Engineer, Google Brain
About the Speaker:
Bo Chang is a software engineer at Google Brain, based in Toronto, Canada. Prior to that, he was a machine learning researcher at Borealis AI. He finished his Ph.D. in statistics at the University of British Columbia.
Which talk track does this best fit into?
Advanced Technical / Research
Technical level of your talk?
(Technical Level: 7/7)
What you’ll learn:
How to better model user intent in recommender systems using a latent variable model.
Abstract of Talk:
Current sequential recommender systems mainly rely on users’ item-level interaction history to capture topical interests and lack a high-level understanding of user intent. It is challenging to explicitly define and enumerate all possible user intents. We propose to use latent variable models to capture user intents as latent variables through encoding and decoding user behavior signals, with an application to a large industrial recommender system.
Data Scientists, Unity Health Toronto
Chloé Pou-Prom is a data scientist with the Data Science and Advanced Analytics (DSAA) team at Unity Health Toronto. The DSAA team uses high quality healthcare data in innovative ways to catalyze communities of data users and decision makers in making transformative changes that improve patient outcomes and healthcare system efficiency.
Co-Presenter: Vaakesan Sundrelingam
Workshop: NLP for Healthcare: Challenges With Processing and De-Identifying Clinical Notes
Abstract: Clinical notes (e.g., admission notes, nurse notes, radiology reports) are rich with information. In this session, we discuss the challenges of working with text data from two different perspectives. First, we provide an overview of the different issues that one can encounter when working with healthcare data, with an emphasis on data processing and cleaning. Then, we focus on the challenges that arise when it comes to sharing data across hospitals, more specifically de-identifying clinical text data. Finally, we provide a demo of pydeid, a Python-based de-identification software that identifies and replaces personal health information (PHI).
What You’ll Learn:
1) Why NLP for healthcare is challenging;
2) Why sharing clinical notes across hospitals is difficult; and
3) Some tips and tools to help out with (1) and (2)
Technical Level: 3
Location: Toronto
Presenters:
Chloé Pou-Prom, Data Scientists, Unity Health Toronto & Vaakesan Sundrelingam, Data Scientists, Unity Health Toronto
About the Speakers:
Chloé Pou-Prom is a data scientist with the Data Science and Advanced Analytics (DSAA) team at Unity Health Toronto. The DSAA team uses high quality healthcare data in innovative ways to catalyze communities of data users and decision makers in making transformative changes that improve patient outcomes and healthcare system efficiency.
Vaakesan Sundrelingam is a data scientist with the GEMINI team at Unity Health Toronto. GEMINI is Canada’s largest hospital data & analytics study, helping physicians, health care teams, and hospitals use data to gain insights into patient care and improve patient outcomes. GEMINI uses machine learning in creative ways to prepare large amounts of data for researchers, as well as in clinical applications such as to detect particularly difficult to measure conditions for quality of care improvement initiatives.
Technical level of your talk?
(Technical Level: 3/7)
What you’ll learn:
1) Why NLP for healthcare is challenging;
2) Why sharing clinical notes across hospitals is difficult; and
3) Some tips and tools to help out with (1) and (2)
Abstract of Talk:
Clinical notes (e.g., admission notes, nurse notes, radiology reports) are rich with information. In this session, we discuss the challenges of working with text data from two different perspectives. First, we provide an overview of the different issues that one can encounter when working with healthcare data, with an emphasis on data processing and cleaning. Then, we focus on the challenges that arise when it comes to sharing data across hospitals, more specifically de-identifying clinical text data. Finally, we provide a demo of pydeid, a Python-based de-identification software that identifies and replaces personal health information (PHI).
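To make the "identify and replace" step concrete, here is a minimal sketch of pattern-based de-identification. It is not pydeid and does not use its API; the pattern set, the `deidentify` helper, and the sample note are all hypothetical, and real clinical text would need far broader coverage (names, addresses, free-form dates, and so on).

```python
import re

# Hypothetical patterns for illustration only; not taken from pydeid.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,}\b"),
}


def deidentify(note: str) -> str:
    """Replace every match of each PHI pattern with a bracketed category tag."""
    for label, pattern in PHI_PATTERNS.items():
        note = pattern.sub(f"[{label}]", note)
    return note


# Made-up note; no real patient data.
note = "Patient seen on 2022-03-14, MRN: 00123456. Call 416-555-0199 to follow up."
clean_note = deidentify(note)
print(clean_note)
```

Rule-based matching like this tends to miss identifiers that do not follow a fixed format, which is one reason de-identifying clinical notes is harder than it first appears.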
Co-Founder & CEO, Private AI
Patricia Thaine is the Co-Founder & CEO of Private AI, a Microsoft-backed startup. She is also a Computer Science PhD Candidate at the University of Toronto (on leave) and a Vector Institute alumna. Her R&D work is focused on privacy-preserving natural language processing, with a focus on applied cryptography and re-identification risk. She also does research on computational methods for lost language decipherment. Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice “Trixie” Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She has ten years of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada.
Workshop: Demystifying De-Identification
Abstract: Workshop with discussion and demo. The session will begin with an overview of privacy enhancing technologies and then dive into de-identification terminology (de-identification, anonymization, redaction, pseudonymization), how these have been misunderstood, and what to think about when choosing between one of these and other privacy enhancing technologies.
Attendees should bring a sample dataset (preferably made up of unstructured text) and have a use case in mind. Each attendee will receive an API key to process a data sample, and we will discuss the results. Data can be in languages other than English. Please confirm with the organizer that the language is supported first.
What You’ll Learn: Attendees will learn about which privacy enhancing technologies are best for their use case and understand when de-identification is right for them and how not to misuse terminology such as “anonymization”
Technical Level: 4
Location: Toronto
Presenter:
Patricia Thaine, Co-Founder & CEO, Private AI
About the Speaker:
Patricia Thaine is the Co-Founder & CEO of Private AI, a Microsoft-backed startup. She is also a Computer Science PhD Candidate at the University of Toronto (on leave) and a Vector Institute alumna. Her R&D work is focused on privacy-preserving natural language processing, with a focus on applied cryptography and re-identification risk. She also does research on computational methods for lost language decipherment. Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice “Trixie” Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She has ten years of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada.
Technical level of your talk?
(Technical Level: 4/7)
What you’ll learn:
Attendees will learn about which privacy enhancing technologies are best for their use case and understand when de-identification is right for them and how not to misuse terminology such as “anonymization”
Abstract of Talk:
Workshop with discussion and demo. The session will begin with an overview of privacy enhancing technologies and then dive into de-identification terminology (de-identification, anonymization, redaction, pseudonymization), how these have been misunderstood, and what to think about when choosing between one of these and other privacy enhancing technologies.
Attendees should bring a sample dataset (preferably made up of unstructured text) and have a use case in mind. Each attendee will receive an API key to process a data sample, and we will discuss the results. Data can be in languages other than English. Please confirm with the organizer that the language is supported first.
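As a rough illustration of why the terminology matters, the sketch below contrasts redaction with pseudonymization on email addresses. It is not the Private AI API; the `EMAIL` pattern, the helper names, the `secret` value, and the sample text are assumptions made up for this example.

```python
import hashlib
import re

# Simplified, hypothetical email pattern; real identifiers are far more varied.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text):
    """Redaction: remove the identifier and leave only a category marker."""
    return EMAIL.sub("[EMAIL]", text)


def pseudonymize(text, secret="demo-secret"):
    """Pseudonymization: swap each identifier for a consistent surrogate."""
    def surrogate(match):
        digest = hashlib.sha256((secret + match.group(0)).encode()).hexdigest()
        return f"user_{digest[:8]}@example.invalid"
    return EMAIL.sub(surrogate, text)


# Made-up text for illustration.
text = "ana@clinic.org emailed bo@lab.ca, then ana@clinic.org replied"
redacted = redact(text)
pseudonymized = pseudonymize(text)
print(redacted)
print(pseudonymized)
```

After redaction the two senders can no longer be told apart, whereas pseudonymization keeps repeated mentions of the same person linkable. That linkability is useful for analysis, but it also means that neither output should be described as anonymized without a separate assessment of re-identification risk.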
Senior Data Scientist, BlackRock
Bhaskarjit is a data scientist who has used machine learning to solve business problems in many domains, including Retail, FMCG, Banking, and Media & Entertainment. He currently works as a data scientist at BlackRock, where he builds predictive models for financial markets. His research interests include Network Science, AI Interpretability, Uncertainty, and NLP.
Workshop: Learning Embedded Representation of the Stock Correlation Matrix using Graph Machine Learning
Abstract: Understanding non-linear relationships among financial instruments has various applications in investment processes ranging from risk management, portfolio construction and trading strategies. Here, we focus on interconnectedness among stocks based on their correlation matrix which we represent as a network with the nodes representing individual stocks and the weighted links between pairs of nodes representing the corresponding pair-wise correlation coefficients. The traditional network science techniques, which are extensively utilized in financial literature, require handcrafted features such as centrality measures to understand such correlation networks.
However, manually enumerating all such handcrafted features may quickly turn out to be a daunting task. Instead, we propose a new approach for studying nuances and relationships within the correlation network in an algorithmic way using a graph machine learning algorithm called Node2Vec.
In particular, the algorithm compresses the network into a lower dimensional continuous space, called an embedding, where pairs of nodes that are identified as similar by the algorithm are placed closer to each other. By using log returns of S&P 500 stock data, we show that our proposed algorithm can learn such an embedding from its correlation network. We define various domain specific quantitative (and objective) and qualitative metrics that are inspired by metrics used in the field of Natural Language Processing (NLP) to evaluate the embeddings in order to identify the optimal one. Further, we discuss various applications of the embeddings in investment management.
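The first half of this pipeline can be sketched in plain Python: log returns, a pairwise correlation matrix, a weighted network, and random walks over it. This is a simplified illustration and not the presenters' implementation. The prices and tickers are made up, using the absolute correlation as the edge weight is an assumption, and the walks are first-order weighted walks (what Node2Vec reduces to when its return and in-out parameters are both 1). Node2Vec would then feed such walks to a skip-gram model to learn the embedding, a step omitted here.

```python
import math
import random


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up prices for four hypothetical tickers (not real market data).
prices = {
    "AAA": [100, 102, 101, 105, 107, 106],
    "BBB": [50, 51, 50.5, 52.6, 53.5, 53.0],
    "CCC": [80, 79, 80.5, 78, 77, 78.5],
    "DDD": [20, 20.2, 20.1, 20.6, 20.5, 20.9],
}
returns = {ticker: log_returns(series) for ticker, series in prices.items()}

# Correlation network: nodes are stocks, edge weight is |correlation| (assumption).
tickers = list(prices)
graph = {ticker: {} for ticker in tickers}
for i, a in enumerate(tickers):
    for b in tickers[i + 1:]:
        weight = abs(pearson(returns[a], returns[b]))
        graph[a][b] = weight
        graph[b][a] = weight


def random_walk(graph, start, length, rng):
    """Weighted first-order walk: the next node is drawn in proportion to edge weight."""
    walk = [start]
    for _ in range(length - 1):
        neighbours = list(graph[walk[-1]])
        weights = [graph[walk[-1]][n] for n in neighbours]
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


rng = random.Random(0)
walks = [random_walk(graph, t, 5, rng) for t in tickers for _ in range(3)]
print(walks[0])
```

Stocks that are strongly correlated co-occur in walks more often, which is what lets the downstream skip-gram step place them close together in the embedding space.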
What You’ll Learn: How to create a stock embedding representation from the stock correlation matrix, and how to evaluate the learnt embeddings quantitatively.
Pre-requisite Knowledge: Network Science, Machine Learning, Word Embeddings
Technical Level: 5
Location: Delhi
Presenter:
Bhaskarjit Sarmah, Senior Data Scientist, BlackRock
About the Speaker:
Bhaskarjit is a data scientist who has used machine learning to solve business problems in many domains, including Retail, FMCG, Banking, and Media & Entertainment. He currently works as a data scientist at BlackRock, where he builds predictive models for financial markets. His research interests include Network Science, AI Interpretability, Uncertainty, and NLP.
Which talk track does this best fit into?
Research: Advanced Technical.
Technical level of your talk?
(Technical level: 4/7)
Are there any industries (in particular) that are relevant for this talk?
Banking & Financial Services, Information Technology & Service, Insurance, Marketing & Advertising
Who is this presentation for?
Senior Business Executives, Product Managers, Data Scientists/ ML Engineers, High-level Researchers
What you’ll learn:
How to create a stock embedding representation from the stock correlation matrix, and how to evaluate the learnt embeddings quantitatively.
What is the main core message (learning) you want attendees to take away from this talk?
How to represent financial securities in the form of embeddings using graph machine learning
Pre-requisite Knowledge:
Network Science, Machine Learning, Word Embeddings
What is unique about this speech, from other speeches given on the topic?
This speech is centered on feature extraction from networks. It will first introduce the traditional handcrafted feature extraction techniques for networks, then explain how graph machine learning can be used for automatic feature extraction in the form of embeddings, and finally show how to evaluate those embeddings quantitatively.
Abstract of Talk:
Understanding non-linear relationships among financial instruments has various applications in investment processes ranging from risk management, portfolio construction and trading strategies. Here, we focus on interconnectedness among stocks based on their correlation matrix which we represent as a network with the nodes representing individual stocks and the weighted links between pairs of nodes representing the corresponding pair-wise correlation coefficients. The traditional network science techniques, which are extensively utilized in financial literature, require handcrafted features such as centrality measures to understand such correlation networks.
However, manually enumerating all such handcrafted features may quickly turn out to be a daunting task. Instead, we propose a new approach for studying nuances and relationships within the correlation network in an algorithmic way using a graph machine learning algorithm called Node2Vec.
In particular, the algorithm compresses the network into a lower dimensional continuous space, called an embedding, where pairs of nodes that are identified as similar by the algorithm are placed closer to each other. By using log returns of S&P 500 stock data, we show that our proposed algorithm can learn such an embedding from its correlation network. We define various domain specific quantitative (and objective) and qualitative metrics that are inspired by metrics used in the field of Natural Language Processing (NLP) to evaluate the embeddings in order to identify the optimal one.
Can you suggest 2-3 topics for post-discussion?
Node2Vec, Stock Embeddings, Network Science
Presenters:
Danny Chiao, Tech Lead, Feast & Eddie Esquivel, Sr. Solutions Architect, Tecton & Abhin Chhabra, ML Platform Tech Lead, Shopify
About the Speakers:
Danny Chiao is an engineering lead at Tecton/Feast Inc working on building a next-generation feature store. Previously, Danny was a technical lead at Google working on end-to-end machine learning problems within Google Workspace, helping build privacy-aware ML platforms / data pipelines and working with research and product teams to deliver large-scale ML-powered enterprise functionality. Danny holds a Bachelor’s degree in Computer Science from MIT.
Eddie Esquivel is a Solutions Architect at Tecton, where he helps customers implement feature stores as part of their stack for Operational ML. Prior to Tecton, Eddie was a Solutions Architect at AWS.
Abhin leads the feature store team for Shopify’s ML Platform.
Which talk track does this best fit into?
Workshop
Technical level of your talk?
(Technical level: 4/7)
Who is this presentation for?
Product Managers, Data Scientists/ ML Engineers, ML Engineers
What you’ll learn:
You will learn how to:
– Build new features
– Automate the transformation of batch data
– Automate the transformation of streaming and real-time data
– Create training datasets
– Serve data online using DynamoDB or Redis
– Build a fraud detection system using Tecton and Feast
Pre-requisite Knowledge:
Attendees should have functional knowledge of Python, SQL and Spark, as well as familiarity with the challenges of data engineering for ML.
What is unique about this speech, from other speeches given on the topic?
Danny and Eddie are core members of the Feast and Tecton Engineering and Solutions Architect teams. They have deep expertise in working with dozens of end-users to build real-time recommendation systems using feature stores. They also have a lot of experience working on ML infrastructure at Google, AWS, and Tecton.
Abstract of Talk:
In this workshop, we’ll show how to build a real-time fraud detection system using some of the latest tooling for managing ML data pipelines. We’ll walk through the process of building, deploying, and serving real-time data pipelines, highlighting the differences between a traditional feature store (using Feast, the open source feature store) and a feature platform (using Tecton).
We’ll present common architectural patterns and walk you through building a model in three stages:
– Batch, daily computed predictions
– Online predictions using batch features
– Online predictions using real-time features
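The difference between the second and third stages can be sketched with a toy in-memory store. This does not use the Feast or Tecton APIs; `ToyFeatureStore`, the feature names, and the `fraud_score` rule (a stand-in for a trained model) are all hypothetical, and the "recent" window is a fixed-size buffer where a real pipeline would use a time window.

```python
from collections import defaultdict, deque


class ToyFeatureStore:
    """In-memory stand-in for an online store such as DynamoDB or Redis."""

    def __init__(self, window=3):
        self.batch = {}  # user_id -> features computed by a daily batch job
        self.recent = defaultdict(lambda: deque(maxlen=window))  # latest amounts

    def load_batch(self, rows):
        """Publish batch-computed features to the online store."""
        self.batch.update(rows)

    def record_event(self, user_id, amount):
        """Update the real-time feature state as each transaction arrives."""
        self.recent[user_id].append(amount)

    def get_online_features(self, user_id):
        """Join batch and real-time features for one user at request time."""
        features = dict(self.batch.get(user_id, {"avg_amount_30d": 0.0}))
        features["recent_txn_count"] = len(self.recent[user_id])
        return features


def fraud_score(features, amount):
    """Hypothetical scoring rule standing in for a trained model."""
    ratio = amount / max(features["avg_amount_30d"], 1.0)
    burst = features["recent_txn_count"] / 3
    return min(1.0, 0.1 * ratio * (1 + burst))


store = ToyFeatureStore()
store.load_batch({"u1": {"avg_amount_30d": 40.0}})

# Online prediction using batch features only.
score_batch_features = fraud_score(store.get_online_features("u1"), 200.0)

# Online prediction once real-time features reflect a burst of transactions.
for amount in (180.0, 190.0, 210.0):
    store.record_event("u1", amount)
score_realtime_features = fraud_score(store.get_online_features("u1"), 200.0)

print(score_batch_features, score_realtime_features)
```

The same transaction scores higher once the burst of recent activity is visible, which is the kind of signal that daily batch features cannot capture.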
Can you suggest 2-3 topics for post-discussion?
– Best practices for ML recommendation systems
– Building streaming and real-time data pipelines for ML
– Feature Stores: have you implemented one? Let’s share learnings
Unity Health Toronto – VP: Data Science and Advanced Analytics; Director: Temerty Centre for Artificial Intelligence Research and Education in Medicine of the University of Toronto; Professor – University of Toronto
Dr. Mamdani is Vice President of Data Science and Advanced Analytics at Unity Health Toronto and Director of the University of Toronto Temerty Faculty of Medicine Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM). Dr. Mamdani’s team bridges advanced analytics including machine learning with clinical and management decision making to improve patient outcomes and hospital efficiency. Dr. Mamdani is also Professor in the Department of Medicine of the Temerty Faculty of Medicine, the Leslie Dan Faculty of Pharmacy, and the Institute of Health Policy, Management and Evaluation of the Dalla Lana Faculty of Public Health. He is also adjunct Senior Scientist at the Institute for Clinical Evaluative Sciences (ICES) and a Faculty Affiliate of the Vector Institute. In 2010, Dr. Mamdani was named among Canada’s Top 40 under 40. He has published over 500 studies in peer-reviewed medical journals. Dr. Mamdani obtained a Doctor of Pharmacy degree (PharmD) from the University of Michigan (Ann Arbor) and completed a fellowship in pharmacoeconomics and outcomes research at the Detroit Medical Center. During his fellowship, Dr. Mamdani obtained a Master of Arts degree in Economics from Wayne State University in Detroit, Michigan with a concentration in econometric theory. He then completed a Master of Public Health degree from Harvard University in 1998 with a concentration in quantitative methods.
Talk: Saving Lives with ML: Applications and Learnings
Abstract: Machine learning (ML) has transformed numerous industries but its application in healthcare has been limited. ML applications are expected to permeate healthcare in the near future with a recent explosion in academic and commercial activity. The application of ML in healthcare, however, is complicated by a variety of factors including the significant variability in needs, healthcare settings and patients served in these settings, workflows, and available resources. This talk will present a case study of Unity Health Toronto and its journey in developing and deploying numerous ML solutions into clinical practice, including bridging public and private sector partnerships to spread innovations internationally. The talk will also present a novel Canadian academic centre dedicated to artificial intelligence (AI) in medicine – the Temerty Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM) at the University of Toronto.
What You’ll Learn: The successful application of ML in healthcare is multifaceted and highly dependent on end-user engagement.
Innovative public-private partnerships are needed to spread ML applications globally.
Multidisciplinary, collaborative efforts will fuel innovations in the development and application of ML in healthcare.
Track: Case Study
Technical Level: 3
Location: Toronto
Presenter:
Muhammad Mamdani, Unity Health Toronto – VP: Data Science and Advanced Analytics; Director: Temerty Centre for Artificial Intelligence Research and Education in Medicine of the University of Toronto; Professor – University of Toronto
About the Speaker:
Dr. Mamdani is Vice President of Data Science and Advanced Analytics at Unity Health Toronto and Director of the University of Toronto Temerty Faculty of Medicine Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM). Dr. Mamdani’s team bridges advanced analytics including machine learning with clinical and management decision making to improve patient outcomes and hospital efficiency.
Dr. Mamdani is also Professor in the Department of Medicine of the Temerty Faculty of Medicine, the Leslie Dan Faculty of Pharmacy, and the Institute of Health Policy, Management and Evaluation of the Dalla Lana Faculty of Public Health. He is also adjunct Senior Scientist at the Institute for Clinical Evaluative Sciences (ICES) and a Faculty Affiliate of the Vector Institute. In 2010, Dr. Mamdani was named among Canada’s Top 40 under 40. He has published over 500 studies in peer-reviewed medical journals.
Dr. Mamdani obtained a Doctor of Pharmacy degree (PharmD) from the University of Michigan (Ann Arbor) and completed a fellowship in pharmacoeconomics and outcomes research at the Detroit Medical Center. During his fellowship, Dr. Mamdani obtained a Master of Arts degree in Economics from Wayne State University in Detroit, Michigan with a concentration in econometric theory. He then completed a Master of Public Health degree from Harvard University in 1998 with a concentration in quantitative methods.
Which talk track does this best fit into?
Case Study
Technical level of your talk?
(Technical level: 3/7)
Are there any industries (in particular) that are relevant for this talk?
Hospital & Health Care
What you’ll learn:
The successful application of ML in healthcare is multifaceted and highly dependent on end-user engagement.
Innovative public-private partnerships are needed to spread ML applications globally.
Multidisciplinary, collaborative efforts will fuel innovations in the development and application of ML in healthcare.
Abstract of Talk:
Machine learning (ML) has transformed numerous industries but its application in healthcare has been limited. ML applications are expected to permeate healthcare in the near future with a recent explosion in academic and commercial activity. The application of ML in healthcare, however, is complicated by a variety of factors including the significant variability in needs, healthcare settings and patients served in these settings, workflows, and available resources. This talk will present a case study of Unity Health Toronto and its journey in developing and deploying numerous ML solutions into clinical practice, including bridging public and private sector partnerships to spread innovations internationally. The talk will also present a novel Canadian academic centre dedicated to artificial intelligence (AI) in medicine – the Temerty Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM) at the University of Toronto.
Director of Advanced Analytics, Coca-Cola
Nikita has over 10 years of experience in the Retail and Consumer Packaged Goods industries, working for companies like Loblaw and Sears. He is also an alumnus of the Master of Management Analytics program at Queen’s University, and holds a Bachelor of Finance & Economics degree from the University of Toronto.
Co-Presenter: Winston Li
Talk: The Application of Mobile Location Data for Vending Machine Site Selection and Revenue Optimization.
Abstract: In this presentation, we present an innovative approach to using mobility data to optimize the placement of vending machines in Canada. Coca-Cola has more than 10k vending machines in various locations, and their ROI depends heavily on the amount of foot traffic near them as well as who those people are. For this use case, we’ll concentrate on using highly detailed mobility data to understand the difference between our best and worst machines at scale, and on optimizing their locations based on that data to increase ROI. In addition to the practical and business application, we’ll also share the algorithms used and the tech stack with the audience.
What You’ll Learn: Mobility data is an alternative data source for consumer-related analytics, and its recency and granularity can drive measurable business outcomes.
Track: Case Study
Technical Level: 4
Location: Toronto
Presenters:
Nikita Medvedev, Director of Advanced Analytics, Coca-Cola & Winston Li, Founder, Arima
About the Speaker:
Winston is the founder of Arima, a Canadian-based startup that provides consumer data to its users. Its flagship product, the Synthetic Society, is a privacy-by-design, individual-level database that mirrors the real society. Built using trusted sources like census, market research, mobility and purchase patterns, it contains 10k+ attributes across North America and enables advanced modelling at the most granular level.
Prior to founding Arima, Winston was the Director of Data Science at PwC and Omnicom. Winston is also a part-time faculty member at Northeastern University Toronto and sits on the advisory board of the Master of Analytics program.
Nikita is the Director of Advanced Analytics at Coca-Cola Canada Bottling Limited. Together with his team he is transforming terabytes of business operations data into actionable insights to drive growth and innovate in the Consumer Packaged Goods industry. He loves finding novel solutions to old problems and is obsessed with driving real lasting change through better use of data.
Nikita has over 10 years of experience in the Retail and Consumer Packaged Goods industries, working for companies like Loblaw and Sears. He is also an alumnus of the Master of Management Analytics program from Queen’s University, and holds a Bachelor of Finance & Economics degree from the University of Toronto.
Which talk track does this best fit into?
Case Study
Technical level of your talk?
(Technical level: 4/7)
Are there any industries (in particular) that are relevant for this talk?
Food & Beverages, Information Technology & Service, Marketing & Advertising
What are the main core message (learning) you want attendees to take away from this talk?
Mobility data is an alternative data source for consumer-related analytics, and its recency and granularity can drive measurable business outcomes.
Lead Data Scientist, FreshBooks
Valerii joined FreshBooks a year ago to lead and grow a team of Data Scientists and Machine Learning Engineers. He has experience in multiple industries ranging from Electronics to Clean Tech and has contributed to the development of innovative solutions for a variety of brands such as LG Electronics, Panasonic, Samsung, Toyota, Scotiabank, and Cineplex. He has a University Degree in Telecom Engineering and a PhD in Automated Control Systems. He is the author of 20 patented inventions in Signal Processing, Electronics and Computing.
Talk: Building a Fully Automated ML Platform Using Kubeflow and a Declarative Approach to Development of End-to-End ML Pipelines
Abstract: Recent innovations in the ML ecosystem have seen the emergence of operationally-focused technology like declarative systems and data-centric AI. These techniques appear to be a radical change for AI practitioners, who can now more simply frame use cases and manage workflows. In this talk, we’ll take a look at the history of AI to see the progress that has been made and how we’ve arrived at where we are now. How are high-tech companies handling AI initiatives internally, and why aren’t we all copying them? Has MLOps been the promised solution to simplifying deployment and monitoring of production AI? How do we create a simpler paradigm for operationalizing AI? All these questions and more will be addressed.
What You’ll Learn: The journey to higher levels of MLOps maturity is unique for every company and has no set recipe, due to the experimental nature of MLOps. Many insights and ideas in this area are the results of investments by big names (Google, Microsoft, Amazon) and knowledge sharing between smaller companies like us working on similar problems. We are grateful for this opportunity to contribute to the ecosystem so that others can learn from us.
Track: Case Study
Technical Level: 6
Location: Toronto
Presenters:
Valerii Podymov, Lead Data Scientist, FreshBooks & Roshan Isaac, Machine Learning Engineer, FreshBooks & Vlad Ryzhkov, Senior Data Engineer, FreshBooks & Joey Zhou, Senior Data Engineer, FreshBooks
About the Speaker:
Valerii joined FreshBooks a year ago to lead and grow a team of Data Scientists and Machine Learning Engineers. He has experience in multiple industries ranging from Electronics to Clean Tech and has contributed to the development of innovative solutions for a variety of brands such as LG Electronics, Panasonic, Samsung, Toyota, Scotiabank, and Cineplex. He has a University Degree in Telecom Engineering and a PhD in Automated Control Systems. He is the author of 20 patented inventions in Signal Processing, Electronics and Computing.
Roshan works as a Machine Learning Engineer at FreshBooks, where he is building an ML platform on Vertex AI and bringing MLOps best practices to the organization. He previously held the same role at Cineplex. He has a Bachelor’s degree in Computer Science and Engineering and holds graduate certificates in AI & Project Management. Overall, he has 8+ years of experience in Machine Learning, Data Analytics and CRM software, working in different startups and companies in Canada and India. He has published papers at IEEE conferences and was a speaker at the Libre Software Meeting (LSM), France.
Vlad joined FreshBooks a year ago with an extensive Data Engineering background. He works on building the ML platform, bringing best practices in large-scale data processing to the company. He has a PhD in System Analysis, Management and Information Processing. Overall, his 15+ years of software development experience spans areas such as financial systems, e-commerce, e-sport and airlines in Canada and overseas.
Joey joined FreshBooks three months ago and works on the continuous monitoring framework for the ML team. Before that, he gained experience across the tech industry, ranging from social dating to e-commerce, in roles such as Data Scientist and Machine Learning Engineer. He built a recommender system for one of the largest e-commerce platforms in China. With hands-on experience in building and productionizing ML models, he is ready to pursue his passion for MLOps at FreshBooks.
Which talk track does this best fit into?
Technical / Research
Technical level of your talk?
(Technical level: 6/7)
Are there any industries (in particular) that are relevant for this talk?
Banking & Financial Services, Computer Software
Who is this presentation for?
Data Scientists/ ML Engineers, ML Engineers
What you’ll learn:
How we tackled existing challenges with Kubeflow pipelines by moving from an imperative approach to a declarative one.
What are the main core message (learning) you want attendees to take away from this talk?
The journey to higher levels of MLOps maturity is unique for every company and has no set recipe, due to the experimental nature of MLOps. Many insights and ideas in this area are the results of investments by big names (Google, Microsoft, Amazon) and knowledge sharing between smaller companies like us working on similar problems. We are grateful for this opportunity to contribute to the ecosystem so that others can learn from us.
On a scale of 1-10 how mature is this applied AI application you plan to discuss?
9/10
Pre-requisite Knowledge:
Machine Learning Lifecycle
What kind of DevOps tools you plan to discuss? Open source?
GitHub Actions, Kubeflow
What are some of the languages you plan to discuss?
Python, SQL
What are some of the infrastructures you plan to discuss?
BigQuery, Airflow, Vertex AI, containers
What is unique about this speech, from other speeches given on the topic?
Managing MLOps is a highly immature topic with few or no commonly accepted best practices, so any company’s experience of progressing through MLOps maturity levels is unique.
Abstract of Talk:
This talk is about our journey at FreshBooks from mostly manual processes for productionizing our ML models to the highest levels of MLOps maturity. First, we briefly go over the challenges we faced when working on the ML platform as a hybrid team of Data Scientists, ML Engineers and Data Ops Engineers. Then we provide a more detailed overview of our end-to-end Kubeflow pipelines and a declarative MLOps framework designed to speed up, simplify and improve the reliability of ML pipelines at each stage from development to production. We close with lessons learned and what’s next.
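As a rough illustration of the declarative idea in this abstract (not the FreshBooks framework itself), the sketch below declares pipeline steps and their dependencies as data and lets a small runner derive the execution order. The step names, `PIPELINE_SPEC`, `resolve_order` and `run` are hypothetical, the no-op handlers stand in for real pipeline components, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical declarative spec: each step states what it needs,
# not when it should run.
PIPELINE_SPEC = {
    "extract": {"needs": []},
    "validate": {"needs": ["extract"]},
    "train": {"needs": ["validate"]},
    "evaluate": {"needs": ["train"]},
    "deploy": {"needs": ["evaluate"]},
}


def resolve_order(spec):
    """Derive an execution order from the declared dependencies."""
    graph = {name: set(step["needs"]) for name, step in spec.items()}
    return list(TopologicalSorter(graph).static_order())


def run(spec, handlers):
    """Call each step's handler in dependency order and record the sequence."""
    executed = []
    for name in resolve_order(spec):
        handlers[name]()
        executed.append(name)
    return executed


# Placeholder handlers; a real platform would launch pipeline components here.
handlers = {name: (lambda: None) for name in PIPELINE_SPEC}
order = run(PIPELINE_SPEC, handlers)
```

In an imperative version, the order of calls would be hard-coded; here, adding a step only means adding an entry to the spec.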
Can you suggest 2-3 topics for post-discussion?
ML Ops, ML Model Governance
Senior Engineering Manager – Safety, MLOps and Infrastructure, Amazon/Twitch
I worked as a Software Engineering Manager at Twitch on MLOps and tooling in the Safety team. I spoke at Meta’s At Scale about Scaling ML Workflows for Real-Time Moderation Challenges at Twitch, and at TwitchCon about Integrating Data into Twitch at Scale. I have worked in engineering leadership roles for 5 years, and our team has built several company-wide MLOps tools such as orchestration and a feature store.
Co-Presenter: Chen Liu
Talk: From Silo to Collaboration – Building Tooling to Support Distributed ML Teams at Twitch
Abstract: In this talk, we will cover Twitch’s current ML team structure and its challenges. Then we dive deep into some solutions we have built to support ML development at Twitch, including what they are and how they help. We close with a discussion of Twitch’s distributed ML team style and how we collaborate, using Conductor as an example.
ML has been playing an increasingly important role in Twitch’s products (e.g. Recommendation, Safety). To allow products to iterate fast, we keep ML practitioners in the product teams and empower the teams to work independently. There are, however, common challenges in ML development regardless of product area, so we are striving to develop tooling and infrastructure for general ML development in order to reduce duplicate work across ML teams. We will dive into those efforts in this presentation. For example, Twitch’s machine learning feature store is designed around a single control plane that serves as a feature registry while still facilitating distributed feature ownership (e.g. storage, pipelines). Conductor, an in-house ML orchestration system, promotes best practices in pipeline management with templated process control flow and distributed infrastructure management. Meanwhile, we are promoting a collaborative ML culture among Twitch engineering teams. It is similar to community-owned open source projects, where teams share the same interests and encourage cross-team contribution and development.
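To make the feature store idea more concrete, here is a minimal sketch, assuming a central registry that holds only metadata while each team keeps its own storage and pipelines. It is not Twitch’s implementation; `FeatureEntry`, `FeatureRegistry`, the team names and the storage URIs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureEntry:
    name: str
    owner_team: str   # team that owns the storage and pipelines
    storage_uri: str  # hypothetical location managed by the owning team


class FeatureRegistry:
    """Central catalogue: stores feature metadata, not the feature data."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Feature names are unique across teams so lookups are unambiguous.
        if entry.name in self._entries:
            raise ValueError(f"feature {entry.name!r} is already registered")
        self._entries[entry.name] = entry

    def lookup(self, name):
        return self._entries[name]

    def owned_by(self, team):
        return sorted(e.name for e in self._entries.values() if e.owner_team == team)


registry = FeatureRegistry()
registry.register(FeatureEntry("chat_message_rate", "safety", "s3://safety-team/features/chat"))
registry.register(FeatureEntry("watch_time_7d", "recommendations", "s3://recs-team/features/watch"))
```

Any team can discover a feature through `registry.lookup`, while the data itself stays wherever the owning team chose to store it.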
What You’ll Learn: Twitch’s strategy for scaling our ML infra and MLOps tooling has never been discussed online. We aim to help the audience figure out the best strategy for using ML tooling to enhance collaboration between ML teams and to boost scientists’ self-service and efficiency. This is a good lesson for companies seeking to start MLOps from scratch.
Track: Case Study
Technical Level: 4
Presenters:
Shiming Ren, Sr. Engineering Manager – Safety, MLOps and Infrastructure & Chen Liu, Twitch Sr. Engineering Manager on Personalization and ML Infra, Amazon/Twitch
About the Speaker:
Shiming worked as a Software Engineering Manager at Twitch on MLOps and tooling in the Safety team. Shiming spoke at Meta’s At Scale about Scaling ML Workflows for Real-Time Moderation Challenges at Twitch, and at TwitchCon about Integrating Data into Twitch at Scale. Shiming has worked in engineering leadership roles for 5 years, leading a team that built several company-wide MLOps tools such as orchestration and a feature store.
Chen is currently supporting teams working on personalization and ML infrastructures at Twitch. He is passionate about building scalable ML products and democratizing ML in the organization.
Which talk track does this best fit into?
Technical / Research
Technical level of your talk?
(Technical level: 4/7)
Are there any industries (in particular) that are relevant for this talk?
Computer Software, Information Technology & Service
Who is this presentation for?
Senior Business Executives, Product Managers, Data Scientists/ ML Engineers and High-level Researchers, Data Scientists/ ML Engineers, ML Engineers
What you’ll learn:
Twitch’s strategy for scaling our ML infra and MLOps tooling has never been discussed online. We aim to help the audience figure out the best strategy for using ML tooling to enhance collaboration between ML teams and to boost scientists’ self-service and efficiency. This is a good lesson for companies seeking to start MLOps from scratch.
On a scale of 1-10 how mature is this applied AI application you plan to discuss?
7/10
Pre-requisite Knowledge:
Feature store, Orchestration, Large Scale Data Handling
What kind of DevOps tools you plan to discuss? Open source?
N/A. Our tools are all in-house.
What are some of the languages you plan to discuss?
Python, Golang
What are some of the infrastructures you plan to discuss?
Feature Store, ML Orchestration, Realtime Inference, Distributed ML team collaborations
What is unique about this speech, from other speeches given on the topic?
We use examples of how Twitch built its in-house feature store, real-time inference and orchestration systems to show, from a technology perspective, how MLOps collaboration can work within a company. This is a hybrid tech and management talk that will benefit both engineering and leadership groups.
Can you suggest 2-3 topics for post-discussion?
Managing ML team collaboration in a distributed manner; ML tooling development from 0 to 10; Implementation details for the feature store and ML orchestration system.
Staff Data Scientist, Anheuser-Busch
Eric is a Staff Data Scientist with more than 7 years of experience working at Altair Engineering and Anheuser-Busch. He has a PhD in probability from the University of Toronto, and a master’s degree in Applied Math and an undergraduate degree in Engineering from Queen’s University. He’s also a world champion Blokus player.
Talk: Optimal Beer Pricing: An Optimization Layer for Price Elasticities
Abstract: At Anheuser-Busch, we’re obsessed with price elasticities. When the price of beer changes, how will that affect the volume of beer that we sell? These questions (yes, this is more than one question) have implications all over the business, from price setting to procurement to financial planning. We’ve worked hard to make sure our answers to these questions are as data driven as possible. But once we have a model to produce (and predict) these elasticities, how do we make business decisions based on that? And how do we make sure those business decisions are also as data driven as possible?
In this talk, we’ll discuss an optimal pricing layer for beer elasticities. We’ll cover how to use mathematical optimization to make specific price change suggestions at a variety of granularities to help achieve specific business objectives. We’ll consider what objective we actually want to optimize (Profit? Revenue? Market Share?) and see how to use constraints to help smooth the trade-off between these objectives. Finally, we’ll investigate how to ensure our price suggestions stay within the regions where the underlying elasticity models make sense.
Ever wanted to see a real-world example of levelling up your analytics from predictive- to prescriptive-, and do so in the context of price setting (or beer drinking)? Now’s your chance!
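For readers new to the idea, a minimal sketch of an optimization layer on top of an elasticity estimate might look like the following. It is not the method presented in the talk: it assumes a constant-elasticity demand curve, maximizes profit only, and uses a simple grid search. The function names, the `max_change` bound and all numbers are hypothetical.

```python
def demand(price, base_price, base_volume, elasticity):
    """Constant-elasticity demand: volume scales as (price / base_price) ** elasticity."""
    return base_volume * (price / base_price) ** elasticity


def profit(price, unit_cost, base_price, base_volume, elasticity):
    """Margin per unit times the volume predicted at that price."""
    return (price - unit_cost) * demand(price, base_price, base_volume, elasticity)


def suggest_price(unit_cost, base_price, base_volume, elasticity,
                  max_change=0.10, steps=200):
    """Grid-search the profit-maximizing price within +/- max_change of base_price.

    The bound keeps the suggestion inside the range where the elasticity
    estimate is assumed to remain valid.
    """
    low = base_price * (1 - max_change)
    high = base_price * (1 + max_change)
    candidates = [low + (high - low) * i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda p: profit(p, unit_cost, base_price, base_volume, elasticity),
    )


# Made-up inputs: elasticity of -2 means a 1% price rise cuts volume by about 2%.
best = suggest_price(unit_cost=6.0, base_price=10.0, base_volume=1000.0, elasticity=-2.0)
```

With these inputs the unconstrained profit optimum lies above the allowed range, so the suggestion lands on the upper bound; widening `max_change` lets the search reach the interior optimum instead.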
What You’ll Learn: How to add an optimization layer to ML models.
Track: Case Study
Technical Level: 2
Location: Toronto
Presenter:
Eric Hart, Staff Data Scientist at Anheuser-Busch
About the Speaker:
Eric is a Staff Data Scientist with more than 7 years of experience working at Altair Engineering and Anheuser-Busch. He has a PhD in probability from the University of Toronto, and a master’s degree in Applied Math and an undergraduate degree in Engineering from Queen’s University. He’s also a world champion Blokus player.
Which talk track does this best fit into?
Case Study
Technical level of your talk?
(Technical level: 2/7)
Are there any industries (in particular) that are relevant for this talk?
Banking & Financial Services, Food & Beverages, Marketing & Advertising
Who is this presentation for?
Senior Business Executives, Product Managers, Data Scientists/ ML Engineers and High-level Researchers
What you’ll learn:
Putting a mathematical optimization layer on top of predictive models is still a mostly unused tool in the ML space. It’s very difficult to learn about that from existing resources.
What are the main core message (learning) you want attendees to take away from this talk?
How to add an optimization layer to ML models.
Pre-requisite Knowledge:
Not a lot. We’ll briefly discuss what price-elasticities and mathematical optimization are, but having heard those terms before (with a basic understanding) would help.
What is unique about this speech, from other speeches given on the topic?
I would argue the whole topic is fairly unique (optimization layers for predictive models are not widely used or discussed). In addition, the specifics of trying to work around the realities of the beer industry (especially varying laws about beer pricing across different geographies) add an extra layer of complexity to this already deep problem.
Can you suggest 2-3 topics for post-discussion?
Optimization Layers. Price Elasticities.
Business Leaders: C-Level Executives, Project Managers, and Product Owners will get to explore best practices, methodologies, principles, and practices for achieving ROI.
Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.
Job Seekers: Will have the opportunity to network virtually and meet over 60 Top AI Start-ups and companies during the EXPO & Career Fair.