Proposal of an algorithm for the democratic selection of individuals.
Abstract
This document presents an algorithm inspired by stratified sampling. Stratified sampling only
works on disjoint discrete strata, that is, strata with a null intersection (usually created by
distributing individuals according to the discrete values of a single property of the
population). The proposed algorithm, by contrast, is designed to work with discrete strata with a
non-null intersection, created by dividing individuals according to the discrete values of
several properties of the population.
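The difference between the two kinds of strata can be illustrated with a small sketch. The toy population, the property names (region, age_group) and the helper build_strata are hypothetical and are not taken from the work itself; the sketch only shows why strata built from several properties overlap while strata built from a single property do not.

```python
from collections import defaultdict

# Hypothetical toy population: each individual has several discrete properties.
population = [
    {"id": 0, "region": "north", "age_group": "young"},
    {"id": 1, "region": "north", "age_group": "old"},
    {"id": 2, "region": "south", "age_group": "young"},
    {"id": 3, "region": "south", "age_group": "old"},
]


def build_strata(individuals, properties):
    """Map each (property, value) pair to the set of ids that have that value."""
    strata = defaultdict(set)
    for person in individuals:
        for prop in properties:
            strata[(prop, person[prop])].add(person["id"])
    return dict(strata)


# One property: every individual falls into exactly one stratum (disjoint strata).
single = build_strata(population, ["region"])

# Several properties: every individual falls into one stratum per property,
# so strata that come from different properties can share individuals.
multi = build_strata(population, ["region", "age_group"])

# Individuals shared by the "north" stratum and the "young" stratum.
shared = multi[("region", "north")] & multi[("age_group", "young")]
```

In this toy case the two region strata share no individual, while the "north" and "young" strata share individual 0, which is the situation classic stratified sampling does not handle.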
Possible uses for this algorithm include creating samples that are as diverse (or, more
euphemistically, “democratic”) as possible from a population. This can be useful, for instance,
when selecting individuals for a survey in which the diversity of the sample is key. It can also be
useful in anomaly detection, since it will yield a sample with every kind of individual, even the
most unusual.
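One way to picture what a diverse sample could look like is a greedy, set-cover-style heuristic that repeatedly picks the individual covering the most strata not yet represented. This is only an illustrative assumption, not the algorithm proposed in the work; the function greedy_diverse_sample, the toy population and the property names are hypothetical.

```python
import random

# Hypothetical toy population with two discrete properties.
population = [
    {"id": 0, "region": "north", "age_group": "young"},
    {"id": 1, "region": "north", "age_group": "old"},
    {"id": 2, "region": "south", "age_group": "young"},
    {"id": 3, "region": "south", "age_group": "old"},
]


def greedy_diverse_sample(individuals, properties, sample_size, seed=0):
    """Greedy heuristic (an assumption, not the thesis algorithm): at each step,
    pick the individual that adds the most (property, value) strata not yet
    represented in the sample."""
    rng = random.Random(seed)
    remaining = list(individuals)
    rng.shuffle(remaining)  # shuffle so that ties are broken at random
    covered = set()
    sample = []

    def gain(person):
        # Number of this person's strata that the sample does not cover yet.
        return sum((p, person[p]) not in covered for p in properties)

    while remaining and len(sample) < sample_size:
        best = max(remaining, key=gain)
        remaining.remove(best)
        sample.append(best)
        covered.update((p, best[p]) for p in properties)
    return sample


sample = greedy_diverse_sample(population, ["region", "age_group"], sample_size=2)
covered_strata = {(p, person[p]) for person in sample for p in ("region", "age_group")}
```

With this toy population, two individuals are enough to represent all four strata, whereas a purely random pair of individuals may repeat a region or an age group.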
The proposed algorithm is thoroughly compared against simple random sampling (SRS) in terms of
performance and of how effectively it increases diversity. The comparison takes into account
parameters such as the size of the population, the size of the sample, and the number of
properties. Results indicate that the algorithm, while slower than SRS, meets the objective of
yielding diverse samples.
A hypothesis that arises from this research is that the algorithm is a generalization of stratified
sampling: executing it while taking into account only one property of the population is likely to
return the same result as stratified sampling. A formal proof of this hypothesis is an interesting
avenue of research and should not prove too hard.
Description
Graduation Project (Master's in Computing), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería en Computación, 2014.