Thursday, July 19, 2012

I have been writing... I swear.

So I have been writing, just not somewhere where you can see it.

I am changing this fact: my_thesis

I'll try to update nightly, before I go to bed,
along with a brief description of the day's work.


As always, your feedback is warmly received.

~tony

Abstract

    We are in the midst of the information age. Those who can collect, decipher and make
informed decisions from the vast amounts of data at hand will gain a competitive advantage
over their peers. Until recently, it has been the role of a human to transform data into
models and decision making processes. This process, known as the scientific method, has
been the foundation of progress for many hundreds of years. Data is analyzed, and
hypotheses are formulated and tested for validity. Those hypotheses that succinctly explain
the data become theories.
    However, as the amount and rate of data acquisition increase with technology, so do the
difficulties humans have with comprehending, modeling, and describing that data. To deal
with this, techniques in statistics, data mining, machine learning, and artificial intelligence
have been crafted to facilitate our understanding of big data.
    A particular model well known to scientists and engineers is the mathematical equation.
Equations encapsulate the relationships between observed variables and responses. These
analytical models are more than predictive entities: they espouse relationships, and in turn
theories, which can be studied in their own right.

The question thus arises:
Can a feasible algorithm be developed for the general problem of recovering
analytical equations from observational data?

We show here that the task of recovering equations, called Symbolic Regression, is in fact
an achievable goal.

    We introduce Symbolic Regression and formulate it as a problem. We then give a
background of Symbolic Regression implementations. We focus on the most common method:
Genetic Programming, a non-deterministic algorithm. We next describe our main
contribution, a deterministic algorithm for Symbolic Regression, which we call Prioritized
Enumeration. We use a suite of benchmarks and real-world data sets to provide what we
aim to be an in-depth and fair comparison of the most prominent implementation and
Prioritized Enumeration. We conclude by introducing a framework in which both human
and machine work in synergy to perform the task of Symbolic Regression.
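
For readers new to the idea, here is a toy sketch of what "recovering an equation from data" can look like. It is not Prioritized Enumeration and not Genetic Programming; it simply scores a small, hand-written list of candidate equations against the data and keeps the one with the lowest error. The candidate list, the function names, and the sample data are all made up for illustration.

```python
import math

# Hypothetical candidate equations (label, function of x).
# A real system would generate these rather than list them by hand.
CANDIDATES = [
    ("x", lambda x: x),
    ("x*x", lambda x: x * x),
    ("x*x + x", lambda x: x * x + x),
    ("sin(x)", lambda x: math.sin(x)),
    ("exp(x)", lambda x: math.exp(x)),
]


def mean_squared_error(f, xs, ys):
    """Average squared difference between f(x) and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def best_equation(xs, ys):
    """Return the label of the candidate with the lowest error on the data."""
    scored = [(mean_squared_error(f, xs, ys), label) for label, f in CANDIDATES]
    return min(scored)[1]


# Made-up observations generated from y = x*x + x.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [x * x + x for x in xs]
print(best_equation(xs, ys))
```

The hard part, and the part the thesis is about, is searching the enormous space of possible equations rather than picking from a fixed list of five.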



