How to Combine Text-Mining Methods to Validate Induced Verb-Object Relations

Nicolas Bechet (1), Jacques Chauche (2), Violaine Prince (2) and Mathieu Roche (2, 3)

  1. GREYC – UMR 6072, CNRS – Univ. de Caen Basse-Normandie
    14032 Caen Cedex – France
  2. LIRMM – UMR 5506, CNRS – Univ. Montpellier 2
    34000 Montpellier – France
  3. TETIS – Cirad
    34093 Montpellier Cedex 5 – France

Abstract

This paper describes methods that use Natural Language Processing approaches to extract and validate induced syntactic relations (here restricted to the Verb-Object relation). A syntactic parser and a semantic closeness measure are used to extract such relations. Their validation then relies on two techniques, a Web Validation system and a Semantic-Vector-based approach, as well as on different combinations of both, in order to rank the induced Verb-Object relations. The Semantic Vector approach is a Roget-based method that represents a syntactic relation as a vector. Web Validation uses a search engine to determine the relevance of a syntactic relation according to its popularity. An experimental protocol is set up to automatically judge the relevance of the ranked induced relations. Finally, we apply our approach to a French corpus of news and use ROC curves to evaluate the results.
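
As a rough illustration of the ranking and evaluation idea summarized above, the following Python sketch combines two per-relation scores and measures the quality of the resulting ranking with the area under the ROC curve. It is only a sketch under stated assumptions: the function names (`min_max`, `combine_scores`, `roc_auc`), the min-max normalization, the weighted linear combination and the sample values are all hypothetical choices made for this example, not the combinations studied in the paper.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_scores(web: Sequence[float], vec: Sequence[float],
                   weight: float = 0.5) -> List[float]:
    """Weighted sum of normalized scores (an assumed combination scheme).

    web: hypothetical popularity scores from a Web validation step.
    vec: hypothetical similarity scores from a semantic vector step.
    """
    web_n = min_max(web)
    vec_n = min_max(vec)
    return [weight * w + (1.0 - weight) * v for w, v in zip(web_n, vec_n)]


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve, computed from pairwise comparisons.

    Equals the probability that a relevant relation is scored above an
    irrelevant one; ties count for one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one irrelevant item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up scores for four candidate Verb-Object relations.
    web_scores = [120.0, 3.0, 45.0, 0.0]
    vec_scores = [0.8, 0.2, 0.1, 0.4]
    relevant = [True, False, True, False]
    combined = combine_scores(web_scores, vec_scores)
    print(roc_auc(combined, relevant))
```

An area of 1.0 would mean that every relevant relation is ranked above every irrelevant one, while 0.5 corresponds to a random ordering.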

Key words

Text-Mining, Web-Mining, Syntactic Analysis

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS130528021B

Publication information

Volume 11, Issue 1 (January 2014)
Year of Publication: 2014
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Bechet, N., Chauche, J., Prince, V., Roche, M.: How to Combine Text-Mining Methods to Validate Induced Verb-Object Relations. Computer Science and Information Systems, Vol. 11, No. 1, 133-156. (2014), https://doi.org/10.2298/CSIS130528021B