Proposal of an algorithm for the democratic selection of individuals.
This document presents an algorithm inspired by stratified sampling. Stratified sampling works only on disjoint discrete strata (usually created by distributing individuals according to the discrete values of a single property of the population). The proposed algorithm is instead designed to work with overlapping discrete strata, created by dividing individuals according to the discrete values of several properties of the population. Possible uses include drawing samples that are as diverse (or, more euphemistically, “democratic”) as possible from a population. This can be useful, for instance, when selecting individuals for a survey in which the diversity of the sample is key. It can also be useful in anomaly detection, since it will yield a sample with every kind of individual, even the most unusual. The proposed algorithm is thoroughly compared against simple random sampling (SRS) in terms of performance and of how effectively it increases diversity. The comparison takes into account the size of the population, the size of the sample, and the number of properties considered. Results indicate that the algorithm, while slower than SRS, meets the objective of yielding diverse samples. A hypothesis that arises from this research is that the algorithm is a generalization of stratified sampling: executing it while taking into account only one property of the population is likely to return the same result as stratified sampling. A formal proof of this hypothesis is an interesting avenue for future research and should not prove too hard.
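The idea of overlapping strata can be illustrated with a short sketch. This is not the algorithm proposed in the thesis, whose details the abstract does not give. It is a hypothetical greedy heuristic, written only to show how strata built from several properties intersect and how a selection can favor under-represented strata. All function names (`build_strata`, `greedy_diverse_sample`, `strata_covered`) and the dictionary representation of individuals are assumptions made for illustration.

```python
import random
from collections import Counter


def build_strata(population, properties):
    """Map each (property, value) pair to the set of indices of the
    individuals that have that value. Strata that come from different
    properties can share individuals, so they overlap."""
    strata = {}
    for idx, individual in enumerate(population):
        for prop in properties:
            strata.setdefault((prop, individual[prop]), set()).add(idx)
    return strata


def greedy_diverse_sample(population, properties, k, rng=None):
    """Hypothetical heuristic (an assumption, not the thesis algorithm):
    repeatedly pick the individual whose strata are the least
    represented in the sample chosen so far."""
    rng = rng or random.Random(0)
    counts = Counter()                  # how often each stratum was picked
    remaining = list(range(len(population)))
    rng.shuffle(remaining)              # ties are broken in random order
    chosen = []
    while remaining and len(chosen) < k:
        best = min(
            remaining,
            key=lambda i: sum(counts[(p, population[i][p])] for p in properties),
        )
        remaining.remove(best)
        chosen.append(best)
        for p in properties:
            counts[(p, population[best][p])] += 1
    return chosen


def strata_covered(population, properties, sample):
    """Number of distinct (property, value) strata present in the sample."""
    return len({(p, population[i][p]) for i in sample for p in properties})
```

With a population of dictionaries such as `{"region": "north", "age": "adult"}`, calling `greedy_diverse_sample(population, ["region", "age"], k)` returns the indices of the selected individuals. Passing the result to `strata_covered` gives a simple diversity count that can be compared with the count for a sample drawn by `random.sample`.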
Graduation Project (Master's in Computer Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería en Computación, 2014.