Survey of Word Co-occurrence Measures for Collocation Detection



Document title: Survey of Word Co-occurrence Measures for Collocation Detection
Journal: Computación y sistemas
Database: PERIÓDICA
System number: 000411053
ISSN: 1405-5546
Authors: 1
Institutions: 1 Instituto Politécnico Nacional, Escuela Superior de Cómputo, Mexico City, Mexico
Year:
Period: Jul-Sep
Volume: 20
Issue: 3
Pages: 327-344
Country: Mexico
Language: English
Document type: Article
Approach: Experimental, applied
Abstract (English): This paper presents a detailed survey of word co-occurrence measures used in natural language processing. Word co-occurrence information is vital for accurate computational text treatment: it is important to distinguish words that combine freely with other words from words whose preferences for forming phrases are restricted. The latter words, together with their typical co-occurring companions, are called collocations. To detect collocations, many word co-occurrence measures, also called association measures, are used to determine a high degree of cohesion between words in collocations as opposed to a low degree of cohesion in free word combinations. We describe such association measures, grouping them into classes according to the approaches and mathematical models used to formalize word co-occurrence.
Disciplines: Computer science,
Literature and linguistics
Keywords (translated from Spanish): Data processing,
Discourse analysis,
Applied linguistics,
Computational linguistics,
Word co-occurrence,
Association measure,
Language models
Keywords: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Discourse analysis,
Computational linguistics,
Speech analysis,
Word co-occurrence,
Association measure,
Language models
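
Illustrative note: the abstract describes association measures as scores that are high for cohesive word pairs (collocations) and low for free word combinations. As a rough sketch of that idea, the Python snippet below computes pointwise mutual information (PMI), one widely used association measure, over adjacent word pairs. The record does not state which measures the paper covers or how it defines them, so the choice of PMI, the function name pmi_scores, the min_count parameter, and the toy sentence are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), with probabilities
    estimated as relative frequencies in the token list.
    Hypothetical helper, not taken from the surveyed paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs seen too rarely to estimate reliably
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy input (invented for this sketch): rank pairs from most to least cohesive.
toy_tokens = "she drinks strong tea and he drinks strong tea too".split()
ranked = sorted(pmi_scores(toy_tokens).items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 3))
```

On real corpora, PMI tends to overrate pairs built from very rare words, which is why a frequency threshold such as the assumed min_count is commonly applied before ranking candidates.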