Reproducibility & Data Version Control
for LangChain & LLM/OpenAI Models

FREE Virtual Workshop
Nov. 29th, 1 PM EST

Machine Learning in Retail Summit

September 27th to 28th, 2022 – 10:00 AM to 8:30 PM

A Uniquely Interactive Experience

Join us for the Retail Machine Learning Community’s Annual Gathering

15 speakers and 3 hands-on workshops will explore applications of Machine Learning from both business and technical perspectives.

Attendees will have opportunities to meet both academic researchers and industry practitioners active in the retail sector, and to gain new perspectives from each other’s work.

The Micro-Summit includes:

15 Speakers
3 hands-on workshops
Access to 6 hours of live-streamed content (incl. recordings)
Talks for beginner, intermediate, and advanced levels
Case Studies, Executive Track – Business Alignment & Advanced Technical Research
Q+A with Speakers
Channels to share your work with the community

Join this new initiative to help push the AI community forward.

We’re Hosting

Chair

Suhas Pai

Chief Technology Officer, Bedrock AI

Speakers

Jekaterina Novikova

Director of ML, Winterlight Labs

Talk: Interpretability and Robustness of Transformer Models in Healthcare

Shaina Raza

CIHR Health Systems Impact Fellow, University of Toronto

Talk: Detecting Medical and Non-Medical Named Entities from COVID-19 Free Texts

Annie En-Shiun Lee

Assistant Professor (Teaching Stream), Computer Science, University of Toronto

Talk: Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?

Workshop Facilitators

Brendan M McKenna

ML Field Engineer, ContinualAI

Workshop: Operationalizing State of the Art Language Models

Annie En-Shiun Lee

Assistant Professor (Teaching Stream), Computer Science, University of Toronto

Workshop: Pre-Trained Multilingual Sequence-to-Sequence Models for NMT: Tips, Tricks and Challenges

Amanda Milberg

Data Scientist, Dataiku

Workshop: Natural Language Processing in Plain English

Speakers and Workshop Facilitators to be announced

This event has ended

This event is no longer available.

Shaina Raza

CIHR Health Systems Impact Fellow, University of Toronto

Dr. Shaina Raza is a CIHR Health System Impact Fellow. Her post-doctoral fellowship is co-funded by the CIHR Institute of Population and Public Health (CIHR-IPPH)/Equitable AI and Public Health Ontario. Her host institution is Public Health Ontario and her academic institution is the Dalla Lana School of Public Health at the University of Toronto.
She holds a PhD in Computer Science, specializing in AI, natural language processing and deep neural networks. She is also a seasoned data scientist, a scientific editor at Elsevier and a peer reviewer for many journals. Her research interests are in computational linguistics, social media, and developing novel language models with a focus on biomedicine. She has a number of publications in high-quality journals and A* conferences in computer science and, more recently, in biomedicine. Her website is shainaraza and her publication details are in Scholar.

Talk: Detecting Medical and Non-Medical Named Entities from COVID-19 Free Texts

Abstract: Applying state-of-the-art biomedical named entity recognition methods faces several challenges: first, these methods are trained on a small number of clinical entity types (e.g., diseases, symptoms, proteins, genes); second, they require a large amount of data for pre-training and prediction, making them difficult to use in real-time scenarios; third, they do not consider non-clinical entities such as social determinants of health (age, gender, employment, race), which are also related to patients’ health. We propose a Machine Learning (ML) pipeline that improves on previous efforts in three ways: first, it recognizes many clinical entity types (diseases, symptoms, drugs, diagnoses, etc.); second, it is easily configurable and reusable, and can scale up for training and inference; third, it considers non-clinical factors related to patients’ health. At a high level, the pipeline consists of the following stages: pre-processing, tokenization, mapping embedding lookup, and named entity recognition. We also present a new dataset that we prepared by curating COVID-19 case reports. The proposed approach outperforms baseline methods on four benchmark datasets, with macro- and micro-average F1 scores around 90, and on our own dataset, with macro- and micro-average F1 scores of 95.25 and 93.18 respectively.

What You’ll Learn: How to extract named entities from free text and bridge the gap between NLP and epidemiology.
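For readers unfamiliar with the two averages quoted in the abstract, the following is a minimal sketch of how macro- and micro-average F1 differ. It is not the authors' evaluation code; the function names, entity type labels and counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one entity type
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(per_type: Dict[str, Counts]) -> Tuple[float, float]:
    """Return (macro F1, micro F1) over a set of entity types."""
    # Macro: average the per-type F1 scores, so every type weighs the same.
    macro = sum(f1(*c) for c in per_type.values()) / len(per_type)
    # Micro: pool the counts first, so frequent types dominate the score.
    tp = sum(c[0] for c in per_type.values())
    fp = sum(c[1] for c in per_type.values())
    fn = sum(c[2] for c in per_type.values())
    return macro, f1(tp, fp, fn)


# Hypothetical counts, NOT taken from the talk or the paper.
counts = {
    "DISEASE": (90, 10, 10),
    "SYMPTOM": (40, 10, 10),
    "EMPLOYMENT": (5, 5, 5),
}
macro, micro = macro_micro_f1(counts)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

With these made-up counts the rare EMPLOYMENT type pulls the macro average down more than the micro average, which is one reason both numbers are usually reported together.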

Annie En-Shiun Lee

Assistant Professor (Teaching Stream), Computer Science, University of Toronto

Annie En-Shiun Lee is an Assistant Professor (Teaching Stream) in the Computer Science Department at the University of Toronto. She received her PhD from the University of Waterloo in 2014 under the supervision of Professors Andrew K. C. Wong and Daniel Stashuk from the Centre for Pattern Analysis and Machine Intelligence. She has also been a visiting researcher at the Fields Institute (invited by Nancy Reid) and CUHK (invited by K. S. Leung and M. H. Wong) as well as a research scientist at VerticalScope and Stradigi AI.

Talk: Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?

Abstract: What can pre-trained multilingual sequence-to-sequence models like mBART contribute to translating low-resource languages? We conduct a thorough empirical experiment in 10 languages to ascertain this, considering five factors: (1) the amount of fine-tuning data, (2) the noise in the fine-tuning data, (3) the amount of pre-training data in the model, (4) the impact of domain mismatch, and (5) language typology. In addition to yielding several heuristics, the experiments form a framework for evaluating the data sensitivities of machine translation systems. While mBART is robust to domain differences, its translations for unseen and typologically distant languages remain below 3.0 BLEU. In answer to our title’s question, mBART is not a low-resource panacea; we therefore encourage shifting the emphasis from new models to new data.

What You’ll Learn: What can pre-trained multilingual sequence-to-sequence models like mBART contribute to translating low-resource languages? We try to answer this through empirical experiments on 10 different languages.

Workshop: Pre-Trained Multilingual Sequence-to-Sequence Models for NMT: Tips, Tricks and Challenges

Abstract: Neural Machine Translation (NMT) has seen a tremendous spurt of growth in less than ten years, and has already entered a mature phase. Pre-trained multilingual sequence-to-sequence (PMSS) models, such as mBART and mT5, are pre-trained on large general data, then fine-tuned to deliver impressive results for natural language inference, question answering, text simplification and neural machine translation. This tutorial presents:
1) An introduction to sequence-to-sequence pre-trained models,
2) How to adapt pre-trained models for NMT,
3) Tips and tricks for NMT training and evaluation,
4) Challenges and problems faced when using these models.
This tutorial will be useful for those interested in NMT, from a research as well as an industry point of view.

What You’ll Learn: This tutorial will give an overview of Pre-trained Sequence-to-Sequence Multilingual Models, tips, tricks and frameworks that can be used to adapt these models for NMT, the challenges faced while using these models and how to overcome them.

Toronto Machine Learning Summit

We’re Hosting

Breakout Sessions (All Levels)

Discussion Groups

Workshops

Virtual Platform

Who Attends

[Charts: 2023 Event Demographics; 2023 Technical Background; 2023 Attendees & Thought Leadership]