Anyone Can Get Started: OpenAI Releases a Beginner-Friendly Reinforcement Learning Tutorial | Simple, Readable Code
By Lizi (栗子), from Aofeisi | Produced by QbitAI (量子位) | WeChat official account: QbitAI
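OpenAI says that people with no machine learning background at all can get started with reinforcement learning quickly.
The team has just released an introductory reinforcement learning (RL) tutorial called Spinning Up. It is earnest, friendly, and thorough.
From a set of key concepts, to implementations of a series of key algorithms, to warm-up exercises, every step puts clarity and simplicity first and is written from a beginner's point of view.
△ Beginner's aura
The team says there is not yet a reasonably general textbook for reinforcement learning, so only a small group of people manage to get into the field. That needs to change, because reinforcement learning is genuinely useful.
You might find a use for it too. So let's go through this starter kit and see just how considerate it is: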
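Spinning Up consists of five main parts.
The introduction to RL is split into three small steps:
First, learn the basic concepts: what RL can be used for, along with its concepts and terminology.
Second, learn what kinds of algorithms there are.
Third, learn about policy optimization.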
https://spinningup.openai.com/en/latest/spinningup/rl_intro.html
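The third step, policy optimization, can be made concrete with a toy example. The following is a minimal sketch, not code from Spinning Up: it assumes a hypothetical two-armed bandit (the payout probabilities in `win_prob` are invented for illustration) and a softmax policy with one logit per action. It applies the basic policy-gradient update, nudging each logit by the reward times the gradient of the log-probability of the sampled action. The names `softmax` and `train_bandit` are illustrative only.

```python
import math
import random


def softmax(logits):
    """Convert raw action preferences into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(steps=2000, lr=0.1, seed=0):
    """Run a basic policy-gradient update on a hypothetical two-armed bandit."""
    rng = random.Random(seed)
    win_prob = [0.2, 0.8]  # assumed payout probabilities, for illustration only
    logits = [0.0, 0.0]    # policy parameters: one preference per action
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = 0 if rng.random() < probs[0] else 1
        # The environment pays 1 with the chosen arm's probability, else 0.
        reward = 1.0 if rng.random() < win_prob[action] else 0.0
        # d/d logit_k of log pi(action) equals 1[k == action] - probs[k].
        for k in range(2):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * reward * grad_log_pi
    return softmax(logits)


if __name__ == "__main__":
    print(train_bandit())
```

Calling `train_bandit()` returns the final action probabilities; over training, the policy should shift most of its probability toward the better-paying arm. Practical algorithms build on this idea with neural-network policies and variance-reduction techniques.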
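(This part is optional.)
How do you grow into the role of an RL researcher?
First, know which areas of mathematics and deep learning you need a basic grasp of.
Second, learn by doing: write the simplest possible implementations (the code is covered further down) and focus on understanding.
Third, once you have a little experience, try developing your own research project. That comes after the beginner stage.
The steps after that are further off and are not covered in detail here.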
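This paper list is very detailed. It is divided into 12 categories, each with 2 to 8 papers.
The team says the list is more than comprehensive, enough to pave the way for anyone who wants to do RL research.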
https://spinningup.openai.com/en/latest/spinningup/keypapers.html
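GitHub hosts a project called spinningup, which contains the key algorithms used in reinforcement learning:
VPG, TRPO, PPO, DDPG, TD3, SAC, and so on.
The team says the code is tailored to beginners: short and easy to learn. Spinning Up favors clarity over modularity. The code is commented, so you can see clearly what each step does, and background material is there to support understanding.
The goal is to show, with the simplest possible implementation, how a piece of theory turns into code, leaving out layers of abstraction and obfuscation.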
https://github.com/openai/spinningup
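There are two problem sets.
The first covers implementation basics; the second covers algorithm failure modes.
After those come bonus challenges that ask you to write implementations from scratch, which is comparatively hard going.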
https://spinningup.openai.com/en/latest/spinningup/exercises.html
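The team says the best way to get a feel for how reinforcement learning works is to run it.
In Spinning Up that is easy: a short piece of code is all it takes to launch a training run.
When training finishes, you will see instructions explaining how to inspect the data and how to watch videos of the trained agent.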
In addition, the implementations in Spinning Up are compatible with a range of Gym environments: Classic Control, Box2D, MuJoCo, and others.
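Compatibility with Gym comes down to a small interface: `reset()` returns an initial observation, and in the classic Gym API `step(action)` returns an observation, a reward, a done flag, and an info dict (newer releases return a slightly different tuple). The sketch below illustrates that loop with a hypothetical toy environment, `CoinFlipEnv`, written only for this example. It does not import Gym and is not part of Spinning Up.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment mimicking the classic Gym reset/step interface."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        # Reward 1 when the action matches the coin the agent just observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.episode_length
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the undiscounted sum of rewards."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


if __name__ == "__main__":
    # A policy that copies the observation always matches the coin.
    print(run_episode(CoinFlipEnv(), lambda obs: obs))
```

Because an algorithm only ever touches `reset` and `step`, the same training code can be pointed at any environment that follows this interface.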
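It really does not look that hard.
OpenAI hopes that researchers in other fields will also be able to use reinforcement learning easily to support their work.
So, give it a try.
Tutorial: https://spinningup.openai.com/en/latest/index.html
GitHub: https://github.com/openai/spinningup
(End)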
Introduction: The 2022-06-06 issue of the ShowMeAI Daily News brings you the following AI news.
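- Tools & libraries: [C++ AI toolkit] [PyTorch neural network framework] [Stochastic time series data analysis] [PyTorch component library] [Genomic database querying]
- Projects & code: [Python mini practice projects] [Employee face recognition attendance system]
- Blog posts: [ACL 2022 Tutorial Slides] [How to design scalable systems] [GitHub Markdown tips]
- Data & resources: [3D human body dataset] [Musculoskeletal models] [Chinese error correction dataset] [SOTA text correction model] [Federated learning literature]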
The ShowMeAI Daily series has been fully upgraded. It covers AI tools & frameworks | projects & code | blog posts & sharing | data & resources | research & papers. Subscribe to the topic #ShowMeAI资讯日报 in the official account to receive each day's issue.
tags: [C++, object detection, face detection, deployment]
'Lite.AI.ToolKit: A lite C++ toolkit of awesome AI models, such as RobustVideoMatting, YOLOX, YOLOP etc.' by DefTruth
GitHub:
tags: [deep learning, interpretability, PyTorch]
'Zennit - a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.' by Christopher
GitHub:
tags: [time series]
'PyDaddy - Package to analyse stochastic time series data' by TEE Lab
GitHub:
tags: [continual inference, neural networks, PyTorch]
'continual-inference - PyTorch building blocks for Continual Inference Networks' by LukasHedegaard
GitHub:
tags: [genomics, genetic data, biomedicine]
'gget - a free and open-source command-line tool and Python package that enables efficient querying of genomic databases' by Pachter Lab
GitHub:
tags: [Python, projects]
'Python-Mini-Projects - A collection of simple python mini projects to enhance your python skills' by PYTHON WORLD
GitHub:
tags: [OpenCV, dlib, face recognition, automatic attendance]
GitHub:
tags: [language models, few-shot learning, natural language processing]
'ACL 2022 Tutorial: Zero- and Few-Shot NLP with Pretrained Language Models' by AI2
GitHub:
tags: [system design]
'The System Design Primer - Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.' by Donne Martin
GitHub:
tags: [news, tools]
GitHub:
tags: [dataset, human body]
'THUman3.0 Dataset' by fwbx529
GitHub:
tags: [data, tasks, muscle, skeleton]
'MyoSuite - a collection of environments/tasks to be solved by musculoskeletal models simulated with the MuJoCo physics engine and wrapped in the OpenAI gym API.' by Meta Research
GitHub:
tags: [error correction, text correction, dataset]
GitHub:
tags: [graph, tabular, structured data, federated learning]
'Federated-Learning-on-Graph-and-Tabular-Data - Federated learning on graph and tabular data related papers, frameworks, and datasets.' by YoungFish
GitHub:
Paper title: Pretraining is All You Need for Image-to-Image Translation
Paper date: 25 May 2022
Field: Adversarial
Tasks: Image-to-Image Translation, Texture Synthesis, Translation
Paper link:
Code:
Authors: Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen
Summary: We propose to use pretraining to boost general image-to-image translation.
Abstract: We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing the proposed pretraining-based image-to-image translation (PITI) is capable of synthesizing images of unprecedented realism and faithfulness.
Paper title: Variational Diffusion Models
Paper date: NeurIPS 2021
Field: Methodology
Tasks: Density Estimation
Paper link:
Code:
Authors: Diederik Kingma, Tim Salimans, Ben Poole, Jonathan Ho
Summary: In addition, we show that the continuous-time VLB is invariant to the noise schedule, except for the signal-to-noise ratio at its endpoints.
Abstract: Diffusion-based generative models have demonstrated a capacity for perceptually impressive synthesis, but can they also be great likelihood-based models? We answer this in the affirmative, and introduce a family of diffusion-based generative models that obtain state-of-the-art likelihoods on standard image density estimation benchmarks. Unlike other diffusion-based models, our method allows for efficient optimization of the noise schedule jointly with the rest of the model. We show that the variational lower bound (VLB) simplifies to a remarkably short expression in terms of the signal-to-noise ratio of the diffused data, thereby improving our theoretical understanding of this model class. Using this insight, we prove an equivalence between several models proposed in the literature. In addition, we show that the continuous-time VLB is invariant to the noise schedule, except for the signal-to-noise ratio at its endpoints. This enables us to learn a noise schedule that minimizes the variance of the resulting VLB estimator, leading to faster optimization. Combining these advances with architectural improvements, we obtain state-of-the-art likelihoods on image density estimation benchmarks, outperforming autoregressive models that have dominated these benchmarks for many years, with often significantly faster optimization. In addition, we show how to use the model as part of a bits-back compression scheme, and demonstrate lossless compression rates close to the theoretical optimum.
Paper title: Top1 Solution of QQ Browser 2021 Ai Algorithm Competition Track 1 : Multimodal Video Similarity
Paper date: 30 Oct 2021
Field: Natural Language Processing
Tasks: Language Modelling, TAG, Video Similarity
Paper link:
Code:
Authors: Zhuoran Ma, Majing Lou, Xuan
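Artificial Intelligence | ShowMeAI Daily News #2022.06.07
Introduction: The 2022-06-07 issue of the ShowMeAI Daily News brings you the following AI news.
- Tools & libraries: [Deep learning super-resolution app] [Natural language commands to scripts] [AI desktop app toolbox] [Python to C++ conversion] [End-to-end data engineering library] [Large AI model training system]
- Projects & code: [Hands-on semantic segmentation]
- Blog posts: [Transformers for natural language processing] [ACL 2022 Tutorial Slides] [A minimal getting-started guide for Web3 scientists] [Writing machine learning papers]
- Data & resources: [Graph neural network tricks] [Non-autoregressive applications] [Pre-trained model parameters] [Geospatial data science]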
tags: [AI, tools, images, video, super-resolution]
'QualityScaler - Image/video deep learning upscaler app for Windows - BRSGAN & RealSR_JPEG' by Annunziata Gianluca
GitHub:
tags: [AI, tools, command line, natural language, command conversion]
'Codex CLI - Natural Language Command Line Interface - CLI tool that uses Codex to turn natural language commands into their Bash/ZShell/PowerShell equivalents' by Microsoft
GitHub:
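tags: [AI, tools]
No installation needed and ready to use out of the box. It already supports 15+ AI models, covering speech synthesis, video frame interpolation, video super-resolution, object detection, image stylization, image OCR, and more.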
'paper2gui - Convert AI papers to GUI,Make it easy and convenient for everyone to use artificial intelligence technology' by Baiyuetribe
GitHub:
tags: [Python, C++]
'codex_py2cpp - Converts python code into c++ by using OpenAI CODEX.' by Alexander
GitHub:
tags: [data engineering, data development]
'lineapy - Data engineering, simplified. LineaPy creates a frictionless path for taking your data science artifact from development to production.'
GitHub:
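tags: [large models, training, deployment]
It provides a set of parallel training components, with the goal of making distributed AI model training as simple as training an ordinary single-GPU model.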
'Colossal-AI - Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training' by HPC-AI Tech
GitHub:
tags: [semantic segmentation]
'Semantic segmentation - In this tutorial, you will perform inference across 10 well-known pre-trained semantic segmentors and fine-tune on a custom dataset. Design and train your own segmentor.' by Ibrahim Sobh
GitHub:
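tags: [transformers, GPT-3, DeBERTa, Hugging Face, OpenAI, AllenNLP]
Companion code for the book Transformers for Natural Language Processing (2nd Edition). It covers a range of current mainstream models, including transformers, GPT-3, DeBERTa, and vision models, as well as NLP platforms such as Hugging Face, the OpenAI API, Trax, and AllenNLP.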
'Transformers-for-NLP-2nd-Edition - Under the hood working of transformers, fine-tuning GPT-3 models, DeBERTa, vision models, and the start of Metaverse, using a variety of NLP platforms: Hugging Face, OpenAI API, Trax, and AllenNLP' by Denis Rothman
GitHub:
tags: [limited data, limited data learning]
'ACL 2022 Limited Data Learning Tutorial' by diyiy
GitHub:
tags:[web3]
GitHub:
tags: [machine learning, papers, writing]
'How to ML Paper - A brief Guide' by Jakob Foerster
Link:
tags: [graph neural networks, GNN]
'gtrick: Bag of Tricks for Graph Neural Networks' by Yunxin Sang
GitHub:
tags: [non-autoregressive]
'Overview-of-Non-autoregressive-Applications' by LitterBrother-Xiao
GitHub:
tags: [pre-training, parameters]
'DeltaPapers - Must-read Papers of Parameter Efficient Methods on Pre-trained Models (Delta Tuning).' by THUNLP
GitHub:
tags: [GIS, geographic information, data science]
'Course materials for: Geospatial Data Science' by Michael Szell
GitHub:
Paper title: CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Paper date: 29 May 2022
Field: Natural language processing, computer vision
Tasks: Text-to-Video Generation, Video Generation
Paper link:
Code:
Authors: Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
Summary: Large-scale pretrained transformers have created milestones in text (GPT-3) and text-to-image (DALL-E and CogView) generation.
Abstract: Large-scale pretrained transformers have created milestones in text (GPT-3) and text-to-image (DALL-E and CogView) generation. Its application to video generation is still facing many challenges: The potential huge computation cost makes the training from scratch unaffordable; The scarcity and weak relevance of text-video datasets hinder the model understanding complex movement semantics. In this work, we present 9B-parameter transformer CogVideo, trained by inheriting a pretrained text-to-image model, CogView2. We also propose multi-frame-rate hierarchical training strategy to better align text and video clips. As (probably) the first open-source large-scale pretrained text-to-video model, CogVideo outperforms all publicly available models at a large margin in machine and human evaluations.
Paper title: Hopular: Modern Hopfield Networks for Tabular Data
Paper date: 1 Jun 2022
Field: Structured data
Tasks: Tabular data modeling, structured data modeling
Paper link:
Code:
Authors: Bernhard Schäfl, Lukas Gruber, Angela Bitto-Nemling, Sepp Hochreiter
Summary: In experiments on small-sized tabular datasets with less than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods.
Abstract: While Deep Learning excels in structured data as encountered in vision and natural language processing, it failed to meet its expectations on tabular data. For tabular data, Support Vector Machines (SVMs), Random Forests, and Gradient Boosting are the best performing techniques with Gradient Boosting in the lead. Recently, we saw a surge of Deep Learning methods that were tailored to tabular data but still underperform compared to Gradient Boosting on small-sized datasets. We suggest "Hopular", a novel Deep Learning architecture for medium- and small-sized datasets, where each layer is equipped with continuous modern Hopfield networks. The modern Hopfield networks use stored data to identify feature-feature, feature-target, and sample-sample dependencies. Hopular's novelty is that every layer can directly access the original input as well as the whole training set via stored data in the Hopfield networks. Therefore, Hopular can step-wise update its current model and the resulting prediction at every layer like standard iterative learning algorithms. In experiments on small-sized tabular datasets with less than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods. In experiments on medium-sized tabular data with about 10,000 samples, Hopular outperforms XGBoost, CatBoost, LightGBM and a state-of-the-art Deep Learning method designed for tabular data. Thus, Hopular is a strong alternative to these methods on tabular data.