Meta-Learning
Table of Contents
1. Background
2. Meta-Learning
3. Applications
3.1 Event Extraction (Zero-shot Transfer Learning for Event Extraction)
1. Background
Artificial Intelligence --> Machine Learning --> Deep Learning --> Deep Reinforcement Learning --> Deep Meta Learning
In the machine-learning era, even moderately complex classification problems performed poorly. Deep learning essentially solved the one-to-one mapping problem, e.g. image classification with one input and one output, which produced milestone results like AlexNet. But what if an output also affects the next input? That is the sequential decision-making problem, which deep learning alone cannot solve, and which is where reinforcement learning came in: Deep Learning + Reinforcement Learning = Deep Reinforcement Learning. With deep reinforcement learning, sequential decision making made real progress, producing milestones like AlphaGo. However:
- Deep reinforcement learning depends on enormous amounts of training and a precise reward signal. For many real-world problems, such as robot learning, there is no good reward and no way to train without limit. What then?
- Would AlphaGo still work if the board were made a bit larger? Clearly not with current methods; AlphaGo would immediately become helpless, whereas we humans can adapt to a new board in minutes.
Face recognition is another example: a person can often remember and recognize a face after a single look, while current deep learning needs thousands of images to do the same.
Humans possess a fast-learning ability that today's artificial intelligence lacks. The key is that humans know how to learn: we make full use of prior knowledge and experience to guide learning on new tasks. How to give AI this fast-learning ability has therefore become a frontier research problem, namely Meta-Learning.
Problems: deep learning depends on large amounts of high-quality annotated training data and heavy computation; it has poor portability; models are task-specific and applied independently to particular tasks, while new concepts and things keep emerging.
2. Meta-Learning
Solution: fast learning; using prior knowledge and experience to guide learning on new tasks; learning to learn; inference and reasoning.
Concept: meta-learning, also known as learning to learn (Schmidhuber, 1987; Bengio et al., 1991; Thrun and Pratt, 1998), is an alternative paradigm that draws on past experience in order to learn and adapt to new tasks quickly: the model is trained on a number of related tasks such that it can solve unseen tasks using only a small number of training examples.
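The "small number of training examples" idea can be illustrated with a minimal nearest-prototype classifier, a sketch in the spirit of metric-based meta-learning (illustrative only, not a specific published model): each new class is represented by the mean of its few support embeddings, and a query is assigned to the closest prototype.

```python
import numpy as np

def build_prototypes(support_x, support_y):
    """One prototype per class: the mean of that class's few support embeddings."""
    classes = sorted(set(support_y))
    protos = np.stack([support_x[np.asarray(support_y) == c].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query to the class whose prototype is nearest in Euclidean distance."""
    dists = np.linalg.norm(protos[None, :, :] - query_x[:, None, :], axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]

# Toy 2-way 2-shot episode: two well-separated clusters (made-up numbers).
support_x = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
support_y = [0, 0, 1, 1]
classes, protos = build_prototypes(support_x, support_y)
print(classify(np.array([[0.1, 0.0], [4.8, 5.2]]), classes, protos))  # [0, 1]
```

In a full meta-learning setup, the embedding function producing `support_x` would itself be trained across many such episodes so that prototypes generalize to unseen classes.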
3. Applications
3.1 Event Extraction (Zero-shot Transfer Learning for Event Extraction)
- Problem: Most previous event extraction studies have relied heavily on features derived from annotated event mentions, and thus cannot be applied to new event types without additional annotation effort.
- Solution: We design a transferable neural architecture that jointly maps event mentions and types into a shared semantic space using structural and compositional neural networks, where the type of each event mention can be determined by the closest of all candidate types.
- Scheme: By leveraging (1) available manual annotation for a small set of existing event types and (2) existing event ontologies, our framework applies to new event types without requiring additional annotation.
(1) Goal of event extraction: extracting event triggers and event arguments from unstructured data.
---> Challenges: poor portability of traditional supervised methods and the limited coverage of available event annotations.
---> Problem: handling new event types means starting from scratch, without being able to re-use annotations for old event types.
       Reason: these approaches model event extraction as a classification problem, determining types only by measuring the similarity between rich features encoded for test event mentions and annotated event mentions.
---> We observe that both event mentions and types can be represented as structures:
       event mention structure <--- constructed from the trigger and its candidate arguments
       event type structure <--- consists of the event type and its predefined roles
---> Figure 2.
Figure 2: Examples of Event Mention and Type Structures from ERE.
       AMR --> Abstract Meaning Representation, used to identify candidate arguments and construct event mention structures.
       ERE --> Entity Relation Event; event types can also be represented with structures from ERE.
                     Besides the lexical semantics that relates a trigger to its type, their structures also tend to be similar.
                     This observation is consistent with the theory that the semantics of an event structure can be generalized and mapped to event mention structures in a systematic and predictable way.
---> Event extraction task --> mapping each mention to its semantically closest event type in the ontology.
---> One possible implementation: Zero-Shot Learning (ZSL), which has been successfully exploited in visual object classification.
        Main idea of ZSL for vision tasks: represent both images and type labels in a multi-dimensional vector space separately, then learn a regression model that maps from the image semantic space to the type-label semantic space based on annotated images for seen labels. This regression model can then be used to predict the unseen labels of any given image.
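That regression idea can be sketched in a few lines. This is a hedged illustration with made-up toy vectors, not the actual vision models: learn a linear map W from image embeddings to label embeddings on seen classes, then label a new image by the nearest label embedding, which may belong to an unseen class.

```python
import numpy as np

# Toy data: 3-d "image" features and 2-d label semantic vectors (hypothetical numbers).
X_seen = np.array([[1., 0., 0.], [0.9, 0.1, 0.], [0., 1., 0.], [0.1, 0.9, 0.]])
Y_seen = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])  # seen-class label vectors

# Least-squares regression from image space to label space: X @ W ~= Y.
W, *_ = np.linalg.lstsq(X_seen, Y_seen, rcond=None)

# The label space also contains a vector for a label never seen in training.
label_vecs = {"cat": np.array([1., 0.]), "dog": np.array([0., 1.]),
              "zebra": np.array([0.7, 0.7])}  # "zebra" is the unseen label

def predict(x):
    z = x @ W  # project the image into label space
    return min(label_vecs, key=lambda k: np.linalg.norm(label_vecs[k] - z))

print(predict(np.array([0.8, 0.9, 0.])))  # lands nearest the unseen "zebra" vector
```

The same nearest-neighbor-in-a-shared-space principle is what the event extraction framework below adapts from images to event structures.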
---> Our goal is to effectively transfer knowledge of events from seen types to unseen types, so that we can extract event mentions of any type defined in the ontology.
        We design a transferable neural architecture that jointly learns and maps the structural representations of both event mentions and types into a shared semantic space by minimizing the distance between each event mention and its corresponding type.
        For event mentions of unseen types, their structures are projected into the same semantic space using the same framework and assigned the types with top-ranked similarity values.
(2) Approach
Event Extraction: triggers; arguments
Figure 3: Architecture Overview
  1) Given a sentence S, we start by identifying candidate triggers and arguments based on AMR parsing.
     e.g. dispatching is the trigger of a Transport_Person event with four arguments (0, China; 1, troops; 2, Himalayas; 3, time).
     We build a structure St using AMR, as shown in Figure 3, e.g. rooted at dispatch-01.
  2) Each structure is composed of a set of tuples, e.g. <dispatch-01, :ARG0, China>.
     We use a matrix to represent each AMR relation, composing its semantics with the two concepts of each tuple, and feed all tuple representations into a CNN to generate the event mention structure representation Vst for the candidate trigger.
     Shared CNN: St --> Structure Composition Layer --> Convolution Layer --> Pooling & Concatenation --> Vst
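The per-tuple composition step can be sketched as follows, with toy dimensions and random placeholder parameters (purely illustrative, not the paper's exact parameterization): each AMR relation gets its own matrix that composes the embeddings of the two concepts in a tuple into a single tuple vector.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy embedding dimension

# One trainable matrix per AMR relation (random placeholders here).
relation_mats = {":ARG0": rng.normal(size=(d, 2 * d)),
                 ":ARG1": rng.normal(size=(d, 2 * d))}
concept_vecs = {"dispatch-01": rng.normal(size=d),
                "China": rng.normal(size=d),
                "troops": rng.normal(size=d)}

def compose(tup):
    """<head, relation, dependent> -> tanh(M_rel @ [head; dependent])."""
    head, rel, dep = tup
    pair = np.concatenate([concept_vecs[head], concept_vecs[dep]])
    return np.tanh(relation_mats[rel] @ pair)

mention_structure = [("dispatch-01", ":ARG0", "China"),
                     ("dispatch-01", ":ARG1", "troops")]
tuple_matrix = np.stack([compose(t) for t in mention_structure])
print(tuple_matrix.shape)  # (2, 4): one d-dim vector per tuple
```

The stacked tuple vectors are what the shared CNN then consumes to produce Vst.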
  3) Given a target event ontology, for each type y, e.g. Transport_Person, we construct a type structure Sy by incorporating its predefined roles, and use a tensor to denote the implicit relation between any type and its arguments.
     We compose the semantics of the type and each argument role with the tensor for each tuple, e.g. <Transport_Person, Destination>,
     and generate the event type structure representation Vsy using the same CNN.
  4) By minimizing the semantic distance between dispatch-01 and Transport_Person, i.e. between Vst and Vsy,
     we jointly map the representations of event mentions and event types into a shared semantic space, where each mention is closest to its annotated type.
  5) After training, the compositional functions and CNNs can be further used to project any new event mention (e.g. donate-01) into the semantic space and find its closest event type.
(3) Joint Event Mention and Type Label Embedding
    CNNs are good at capturing sentence-level information in various NLP tasks,
    --> so we use one to generate structure-level representations.
           Consider each event mention structure St = (u1, u2, ..., uh) and each event type structure Sy = (u1', u2', ..., up'), which contain h and p tuples respectively.
? ? --> we apply a weight-sharing CNN to each input structure to jointly learn event mention and type structural representations, which will be later used to learn the ranking function for zero-shot event extraction.
    --> Input layer: a sequence of tuples, where each tuple is represented as a d x 2 feature map, so each mention structure and each type structure are represented as feature maps of dimensionality d x 2h and d x 2p respectively.
? ? --> Convolution Layer
? ? --> Max-Pooling
? ? --> Learning
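The convolution and max-pooling stages can be sketched in plain NumPy (toy sizes and random weights; in the actual framework these filters are learned jointly for mentions and types): each filter responds to every window of consecutive tuple vectors, and max-pooling keeps the strongest response per filter, yielding a fixed-size structure vector regardless of how many tuples the structure has.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tuples, n_filters, width = 4, 5, 3, 2

tuples = rng.normal(size=(n_tuples, d))           # tuple vectors of one structure
filters = rng.normal(size=(n_filters, width, d))  # shared convolution filters

# Convolution: each filter scores every window of `width` consecutive tuples.
conv = np.array([[np.tanh((tuples[i:i + width] * f).sum())
                  for i in range(n_tuples - width + 1)]
                 for f in filters])               # shape (n_filters, n_windows)

# Max-pooling over window positions -> fixed-size structure representation V_s.
v_s = conv.max(axis=1)
print(v_s.shape)  # (3,): one value per filter, independent of n_tuples
```

Because the pooled vector's size depends only on the number of filters, mention structures and type structures of different lengths end up in the same space, which is what makes the distance-based ranking possible.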
(4) Joint Event Argument and Role Embedding
(5) Zero-Shot Classification