More from: Natural Language Processing

Paper Study: A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia

Paper Note 1 Introduction Current weakness: previous work on dialog interfaces has focused on handling only a single target task. Goal: analyze and maintain dialog topics from a more systematic perspective. Prior approaches: 1) treating topic tracking as a separate sub-problem of dialog management and solving it with text categorization over the recognized utterances in a given turn; 2) domain models (knowledge-based methods) and their weaknesses. […]

Paper Study: Citation Resolution: A Method for Evaluating Context-based Citation Recommendation Systems

Paper Note 1 Introduction Helpful: the system can take into account the context in which the citation occurs. CBCR: context-based citation recommendation system. Objective: suggesting other documents whose content is relevant to a particular context in the draft. Modest aims in this paper: present initial results using existing IR-based approaches […]

Paper Study: A Machine Learning Approach to Coreference Resolution of Noun Phrases

Paper Note 1. Introduction Coreference resolution: the process of determining whether two expressions in natural language refer to the same entity in the world. MUC-6, MUC-7 A coreference relation denotes an identity of reference and holds between two textual elements known as markables, which can be definite noun phrases, demonstrative noun phrases, proper names, […]

Paper Study: Learning Structured Perceptrons for Coreference Resolution with Latent Antecedents and Non-local Features

Paper Note 1. Introduction 2. Background Previous mention-pair model: each coreference classification decision is limited to information about the two mentions that make up a pair. Entity-mention models are used to overcome this shortcoming. Tree representation: the tree-based model represents each coreference cluster as a rooted tree. Latent tree: provides more meaningful antecedents for training. 3. Representation and […]


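Reference: lecture notes for "Natural Language Understanding" (《自然语言理解》讲义), by Zong Chengqing (宗成庆); A Method for Disambiguating Word Senses in a Large Corpus, by Gale et al., 1992. 1 Problem statement: Polysemy (ambiguity) is pervasive in every natural language. Distinguishing the meaning of a word across different contexts is the problem of lexical ambiguity resolution, also called word sense disambiguation (WSD). WSD is one of the fundamental problems in natural language processing. 2 Basic idea: When a word expresses different meanings, its surrounding context usually differs as well; that is, different senses correspond to different contexts. So if the contexts of a polysemous word can be told apart, its sense follows. For example, in "他很会与人打交道。" ("He is very good at dealing with people."), the context of "打" is "与" (-2), "人" (-1), "打" (0), "交道" (+1), "。" (+2). Basic context information: words, parts of speech, positions. 3 Disambiguation with a Bayesian classifier: Suppose a polysemous word w occurs in context C, and the senses of w are denoted s_i (i ≥ 2). The sense of w can then be determined by computing argmax_{s_i} P(s_i | C). By Bayes' rule: P(s_i | C) = P(s_i) × P(C | s_i) / P(C). The denominator P(C) is a normalization term that is the same for every sense, so it does not affect the argmax. Applying the independence assumption P(C | s_i) = ∏_{v_k ∈ C} P(v_k | s_i) (the naive Bayes assumption) gives V_MAP = argmax_{s_i} { P(s_i) × ∏_{v_k ∈ C} P(v_k | s_i) } […]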
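
The decision rule argmax P(s_i) × ∏ P(v_k | s_i) can be sketched in a few lines of Python. This is a minimal illustration, not the method from the lecture notes or from Gale et al.: the names `context_window` and `NaiveBayesWSD`, the add-one (Laplace) smoothing, the log-space scoring, the word segmentation of the example sentence, and the toy English training data for "bank" are all assumptions made for the example. Unlike the window shown in the excerpt, the helper leaves out the target word at offset 0.

```python
import math
from collections import Counter, defaultdict


def context_window(tokens, index, size=2):
    """Return (offset, token) pairs within `size` positions of tokens[index].

    The target word itself (offset 0) is left out.
    """
    lo = max(0, index - size)
    hi = min(len(tokens), index + size + 1)
    return [(j - index, tokens[j]) for j in range(lo, hi) if j != index]


class NaiveBayesWSD:
    """Toy naive Bayes sense classifier with add-alpha smoothing (an assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()            # counts used for P(s_i)
        self.word_counts = defaultdict(Counter)  # counts used for P(v_k | s_i)
        self.vocab = set()

    def fit(self, examples):
        """examples: iterable of (context_words, sense) pairs."""
        for context, sense in examples:
            self.sense_counts[sense] += 1
            for v in context:
                self.word_counts[sense][v] += 1
                self.vocab.add(v)
        return self

    def log_score(self, context, sense):
        """log P(s_i) + sum over v_k of log P(v_k | s_i); P(C) is dropped."""
        total = sum(self.sense_counts.values())
        score = math.log(self.sense_counts[sense] / total)
        denom = sum(self.word_counts[sense].values()) + self.alpha * len(self.vocab)
        for v in context:
            score += math.log((self.word_counts[sense][v] + self.alpha) / denom)
        return score

    def predict(self, context):
        """Return the sense with the highest score for this context."""
        return max(self.sense_counts, key=lambda s: self.log_score(context, s))


# Hypothetical segmentation of the example sentence; "打" is at index 5.
tokens = ["他", "很", "会", "与", "人", "打", "交道", "。"]
window = context_window(tokens, 5)

# Hypothetical training data for the ambiguous English word "bank".
train = [
    (["river", "water", "shore"], "bank/river"),
    (["money", "loan", "deposit"], "bank/finance"),
    (["water", "fishing"], "bank/river"),
    (["loan", "interest", "money"], "bank/finance"),
]
model = NaiveBayesWSD().fit(train)
print(window)
print(model.predict(["money", "interest"]))
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow when the context is long, and it leaves the argmax unchanged because the logarithm is monotonic.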