Evaluating the effect of increasingly large data collections on the Latent Semantic Analysis algorithm for synonym detection and extraction, and its independence from language, by means of a distributed implementation
Abstract
Access to large amounts of data, especially in text processing applications, leads to
more effective algorithms, so it is essential to take advantage of that data. Latent
Semantic Analysis (LSA) is an unsupervised machine learning algorithm that benefits
from larger collections and can be used for synonym detection and extraction. LSA
exploits the implicit semantic structure in the association between documents and the
terms they contain to statistically analyze the relationships between the terms of a
collection of text documents; because its approach is strictly mathematical, it is
inherently independent of language. This thesis for the Master's in Computing degree
analyzes the LSA algorithm in a distributed environment in order to evaluate its effect
on synonym detection and extraction over larger collections of data.
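
The idea described in the abstract can be illustrated with a minimal single-machine sketch. It is not the distributed implementation evaluated in the thesis; it only shows how a truncated singular value decomposition of a term-document matrix can place terms that never co-occur close together. The function names (`lsa_term_vectors`, `cosine`), the toy documents, the raw-count weighting, and the choice of `k=2` are all assumptions made for illustration.

```python
import numpy as np


def lsa_term_vectors(docs, k=2):
    """Return (vocab, vectors): one rank-k LSA vector per term.

    Assumption: whitespace tokenization and raw term counts, with no
    stop-word removal or tf-idf weighting.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document matrix: rows are terms, columns are documents.
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0

    # Truncated SVD: keep only the k largest singular values.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    return vocab, U[:, :k] * s[:k]


def cosine(vocab, vectors, w1, w2):
    """Cosine similarity between the LSA vectors of two terms."""
    a = vectors[vocab.index(w1)]
    b = vectors[vocab.index(w2)]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Hypothetical toy collection: "car" and "automobile" never share a document.
docs = [
    "the car drives on the road",
    "the automobile drives on the road",
    "the cat eats fish",
    "the kitten eats fish",
]
vocab, vectors = lsa_term_vectors(docs, k=2)
print(cosine(vocab, vectors, "car", "automobile"))
print(cosine(vocab, vectors, "car", "kitten"))
```

In this toy collection the raw count rows for "car" and "automobile" are orthogonal, yet their reduced vectors point in nearly the same direction because both terms occur in the same contexts. Nothing in the sketch depends on the language of the documents, only on which terms occur in which documents.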
Description
Graduation Project (Master's in Computing), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería en Computación, 2014.