Development of a software application for the execution of the quantitative association rules mining algorithm (QUARG)
Abstract
Nowadays, the use of information and communication technologies results in large amounts of data
being stored every day. These data can later serve as a source of knowledge for enterprises and
organizations through the use of data mining techniques.
One of the most frequently applied data mining techniques is association rule mining, owing to its
ability to identify interesting relationships within huge amounts of data. For certain types of data,
however, mining these rules is often computationally expensive.
According to F. Karel [3], an association rule is an implication X → Y over a transactional
database, in which each transaction contains a subset of items or attributes and the item sets X and
Y are disjoint. The left side of the implication is called the antecedent, and the right side is called
the consequent. For example, the statement “if a customer buys a toothbrush, then she probably
also buys toothpaste” can be written as the association rule {toothbrush} → {toothpaste}.
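The toothbrush example can be illustrated with a minimal Java sketch. It is not part of QUARG or QUARGsoft: the class name, method names, and the four sample transactions are hypothetical, and support and confidence are the measures commonly used to score association rules in general, not measures taken from this document.

```java
import java.util.List;
import java.util.Set;

// Illustrative only: scores the rule {toothbrush} -> {toothpaste} on made-up data.
public class AssociationRuleSketch {

    // Fraction of all transactions that contain every item in `items`.
    public static double support(List<Set<String>> transactions, Set<String> items) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        long hits = transactions.stream()
                .filter(t -> t.containsAll(items))
                .count();
        return (double) hits / transactions.size();
    }

    // Among the transactions that contain the antecedent X,
    // the fraction that also contain the consequent Y.
    public static double confidence(List<Set<String>> transactions,
                                    Set<String> x, Set<String> y) {
        long withX = transactions.stream()
                .filter(t -> t.containsAll(x))
                .count();
        if (withX == 0) {
            return 0.0;
        }
        long withBoth = transactions.stream()
                .filter(t -> t.containsAll(x) && t.containsAll(y))
                .count();
        return (double) withBoth / withX;
    }

    public static void main(String[] args) {
        // Hypothetical transactional database: one set of items per purchase.
        List<Set<String>> transactions = List.of(
                Set.of("toothbrush", "toothpaste"),
                Set.of("toothbrush", "toothpaste", "soap"),
                Set.of("toothbrush"),
                Set.of("soap"));

        Set<String> antecedent = Set.of("toothbrush"); // X
        Set<String> consequent = Set.of("toothpaste"); // Y, disjoint from X

        System.out.println("support    = "
                + support(transactions, Set.of("toothbrush", "toothpaste")));
        System.out.println("confidence = "
                + confidence(transactions, antecedent, consequent));
    }
}
```

In this made-up data set, two of the four transactions contain both items, and two of the three transactions containing a toothbrush also contain toothpaste.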
In 2009, Filip Karel proposed and verified in his doctoral thesis an efficient way to obtain this
knowledge using ordinal association rules. He developed an alternative approach to quantitative
association rule mining (QUARG).
This document presents the QUARGsoft application, whose main objective is to provide an
efficient implementation of the quantitative association rules mining algorithm (QUARG) as a web
application. The document begins with a description of the problem, detailing how the QUARG
genetic algorithm works and the role of the Intelligent Data Analysis Research Laboratory
within this project.
Then, the document introduces the general aspects of the project and shows how the problem was
solved using open source tools during development. The most important elements are the chosen
design, the developed pseudo-code, and the source code of the application. Together, these give the
reader a comprehensive view of the application's functionalities.
Finally, runtime comparison results between the experimental scripts in Matlab and the
developed application in Java are shown. These results indicate that the project's main objective
was accomplished and lead to the conclusion that the QUARGsoft application improves on Filip
Karel's experimental scripts, reducing the runtime of the algorithm by more than 50%.
Description
Graduation project (Bachelor's degree in Computer Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería en Computación, 2012