艾特诗教育

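Target Audience

Grade: Undergraduate students and above

Major: Students majoring in artificial intelligence, data science, statistics, and related fields.

Students need a foundation in calculus and in probability and mathematical statistics, and should be able to program in Python.

Recommended elective: Program Design and Code Implementation

Instructor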

Osman

Carnegie Mellon University (CMU)

Tenured Full Professor

Osman Yağan is a Full Research Professor of Electrical and Computer Engineering (ECE) at Carnegie Mellon University (CMU). Prior to joining the faculty of the ECE department in August 2013, he was a Postdoctoral Research Fellow in CyLab at CMU. He also held a visiting Postdoctoral Scholar position at Arizona State University in Fall 2011. Dr. Yağan received his Ph.D. degree in Electrical and Computer Engineering from the University of Maryland at College Park, MD in 2011, and his B.S. degree in Electrical and Electronics Engineering from the Middle East Technical University, Ankara (Turkey) in 2007. Dr. Yağan's research focuses on modeling, analysis, and performance optimization of computing systems, and uses tools from applied probability, network science, data science, and machine learning. In the context of data science and ML, he is working on statistical inference and decision making using sequential samples (e.g., multi-armed bandits), and resilient distributed machine learning. On the network science side, he has broad interests including robustness of cyber-physical systems with emphasis on critical infrastructure systems; secure and reliable design of large-scale ad-hoc networks with an increasing focus on emerging applications of the Internet of Things; and contagion processes in complex networks with a focus on modeling, analysis, and control of the spread of viruses, (mis)information, and opinions. Dr. Yağan is a senior member of IEEE, and a recipient of a CIT Dean's Early Career Fellowship, an IBM Academic Award, and best paper awards at ICC 2021 and IPSN 2022.

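Project Background

A gambler faces N slot machines without knowing in advance how much each one really pays out. How should he respond to the outcomes of his plays so as to maximize his winnings? This scenario is where the famous "multi-armed bandit" problem gets its name. In practice, wherever machine learning is applied, the same kind of choice arises constantly: a brand has several advertisements for a product but does not know how receptive each user is to each of them; an investor faces several projects without being able to determine each project's actual rate of return; an online retailer must set prices dynamically in real time without full knowledge of demand.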
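One common way to make "maximizing the payoff" precise is through the notion of regret. The notation below is an assumption introduced for illustration and does not appear in the original text: machine i pays a random reward with unknown mean \mu_i, A_t is the machine played in round t, and T is the total number of rounds.

```latex
\mu^{*} = \max_{1 \le i \le N} \mu_i, \qquad
R_T = T\,\mu^{*} - \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{A_t}\right]
```

Under this formulation, maximizing the expected total reward is the same as minimizing the regret R_T, which measures how much is lost compared with always playing the best machine.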

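Project Introduction

The multi-armed bandit problem is a classic problem in probability theory and an important building block of deep reinforcement learning. To tackle sequential decision problems under uncertainty of this kind, researchers proposed the Multi-Armed Bandits (MAB) algorithmic framework. In recent years the framework has attracted wide attention in application areas ranging from recommendation systems and information retrieval to healthcare and financial investment, thanks to its strong performance and its ability to learn from limited feedback. This project takes the framework as its core content: students will gain an in-depth understanding of the basic models and their applications, and will learn the widely used Upper Confidence Bound (UCB) and Thompson Sampling algorithms. The instructor will also present his own latest research results in the field.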

This is an introductory course on multi-armed bandits, which provides a sequential decision-making framework under uncertainty and has broad applications in recommendation systems, dynamic pricing, clinical trials, financial investments, etc. We will cover the classical multi-armed bandit model and its applications, several widely used algorithms proposed for its solution including the Explore-Then-Commit (ETC), Upper Confidence Bound (UCB) and Thompson Sampling (TS) Algorithms, performance analysis of these algorithms, and conclude the lectures with the recent work of the instructor on correlated and structured bandits.
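To give a flavor of the algorithms named above, here is a minimal Python sketch of the textbook UCB1 rule on simulated Bernoulli arms. It is an illustration under stated assumptions, not the course's own material: the function name `ucb1`, the arm means, and the exploration bonus sqrt(2 ln t / n) are choices made for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on simulated Bernoulli arms (hypothetical example).

    arm_means: true success probabilities, unknown to the algorithm.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls of each arm
    sums = [0.0] * n_arms   # cumulative reward of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus that shrinks
            # as an arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Simulated Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


if __name__ == "__main__":
    counts, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", counts, "total reward:", total)
```

In a run like the one above, most pulls would typically end up on the arm with the highest mean, while the other arms are still sampled occasionally because their bonus grows when they are neglected.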

Project Outline

Introduction to Multi-armed Bandits

Stochastic Multi-armed Bandits

The Upper Confidence Bound (UCB) Algorithm

Bayesian Bandits and Thompson Sampling (TS)

Algorithm implementation and performance analysis

Applications of Bandits in Recommendation Systems

Final Project Preparation Session I: the professor discusses and assesses the feasibility of each group's individual research topic, helping students clarify their subsequent research direction

Final Project Preparation Session II: before this week's class, students complete a program prototype and pseudocode; the professor gives each group tailored guidance based on its progress to ensure a high-quality final project

Final Presentation

Project Deliverables Tutoring
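As a preview of the Bayesian Bandits and Thompson Sampling topic in the outline, the following Python sketch shows Thompson Sampling for Bernoulli arms with a uniform Beta(1, 1) prior on each arm. The function name `thompson_sampling`, the prior, and the arm means are assumptions made for this illustration rather than anything specified by the course.

```python
import random


def thompson_sampling(arm_means, horizon, seed=0):
    """Thompson Sampling on simulated Bernoulli arms (hypothetical example).

    Each arm keeps a Beta(successes + 1, failures + 1) posterior,
    i.e. a uniform Beta(1, 1) prior updated with observed rewards.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    total = 0
    for _ in range(horizon):
        # Draw one plausible mean per arm from its posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulated Bernoulli reward, then posterior update.
        if rng.random() < arm_means[arm]:
            successes[arm] += 1
            total += 1
        else:
            failures[arm] += 1
    pulls = [successes[i] + failures[i] for i in range(n_arms)]
    return pulls, total


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", pulls, "total reward:", total)
```

Exploration here comes from the randomness of the posterior draws: arms with few observations have wide posteriors and are occasionally sampled high, while well-explored weak arms are rarely chosen.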

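Program Benefits

7 weeks of online group research study plus 5 weeks of paper tutoring with no time limit, 125 class hours in total

Project report

Outstanding students receive a Reference Letter from the lead instructor

Guidance on full-paper submission to and publication in international conferences indexed by EI/CPCI/Scopus/ProQuest/Crossref/EBSCO or equivalent (can be used for applications)

Certificate of completion

Transcript

Applications of stochastic processes and cutting-edge reinforcement learning AI algorithms in recommendation systems such as TikTok's intelligent content recommendation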
