Sriram Selvam
Senior Software Engineer, Microsoft

ABOUT THE SPEAKER:

Sriram Selvam is a Senior Software Engineer at Microsoft AI with over 14 years of industry experience, specializing in generative search and the deployment of Large Language Model (LLM) applications across distributed systems. He was a core founding team member behind Bing’s Generative Search framework and continues to build AI solutions that enhance user experiences at scale.

Alongside his engineering role, Sriram is an independent researcher deeply invested in the ethical challenges of AI, particularly long-term privacy, sensitive data memorization, and responsible model behavior. His recent work includes co-developing PANORAMA, a large-scale synthetic dataset of 384,000 samples from realistic human profiles, built to model the distribution and context of Personally Identifiable Information (PII) in online content. This work enables robust model auditing and provides researchers with the open-source tooling needed to evaluate privacy-preserving mitigation strategies. Sriram holds an M.S. in Computer Science from the University of Utah.

TALK TITLE:

Emulating Real-World PII with a Large-Scale Synthetic Dataset to Audit LLM Memorization

TRACK:

Fundamental Research (No Direct Business ROI)

SUB TOPIC:

Safety / Interpretability

ABSTRACT:

To address the critical gap in privacy risk assessment, we introduce PANORAMA (Profile-based Assemblage for Naturalistic Online Representation and Attribute Memorization Analysis). PANORAMA is a large-scale, fully synthetic text corpus containing 384,789 samples derived from 9,674 internally consistent synthetic human profiles. Generated using constrained selection and reasoning LLMs, the dataset spans six distinct online modalities, including social media posts, forum discussions, reviews, and marketplace listings. This session will explore how PANORAMA accurately emulates the naturalistic distribution and variety of sensitive data, enabling researchers to systematically study PII memorization, conduct rigorous model auditing, and benchmark privacy-preserving techniques without exposing real user data.

WHAT YOU’LL LEARN:

TBA

Who Attends


2023 Technical Background

Expert/Researcher: 14%
Advanced: 37%
Intermediate: 28%
Beginner: 7%


Business Leaders: C-level executives, project managers, and product owners will explore best practices, methodologies, and principles for achieving ROI.

Engineers, Researchers, Data Practitioners: Will get a better understanding of the challenges, solutions, and ideas being offered via breakouts & workshops on Natural Language Processing, Neural Nets, Reinforcement Learning, Generative Adversarial Networks (GANs), Evolution Strategies, AutoML, and more.

Job Seekers: Will have the opportunity to network virtually and meet 30+ top AI companies.

What is an Ignite Talk?

Ignite is an innovative and fast-paced style used to deliver a concise presentation.

During an Ignite Talk, presenters discuss their research using 20 image-centric slides that advance automatically every 15 seconds.

The result is a fun and engaging five-minute presentation.
