Monday, April 7, 2014

Order within Chaos

My first exposure to statistical analysis was from the marvelous book Chaos Theory Tamed by Garnett P. Williams. This book explores chaotic systems: systems that are governed by clearly defined rules yet behave in seemingly random ways and are highly sensitive to initial conditions. In it he covers many of the fundamental techniques I apply when analyzing time series, such as stationarity checks and analysis for seasonality, periodicity, and trend.
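To make "highly sensitive to initial conditions" concrete, here is a minimal sketch using the logistic map, a standard textbook example of chaos. The choice of map, the parameter r = 4, and the starting values are my own assumptions for illustration and are not taken from the book.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values

# Two starting points that differ by only one part in a billion (assumed values).
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)

# Gap between the two runs at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]
print("gap after 5 steps:", gaps[5])
print("largest gap over the run:", max(gaps))
```

The rule is completely deterministic, yet the two runs stay close for only a few dozen steps before they look unrelated. That is the sense in which a chaotic system can be rule-driven and still appear random.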

The idea behind chaos theory is looking for trends in information, finding clues that indicate there is a pattern behind the data rather than just random noise. And this technique applies to many other branches of statistical analysis. One of the main goals in modelling is determining if there is some predictability in variables or if they have no effect on one another. Finding these correlations is vital to developing proper models.
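One simple clue of this kind is the lag-1 autocorrelation, which measures how strongly each value in a series is related to the one before it. The sketch below is only an illustration under my own assumptions: the sine wave stands in for "data with a pattern", the Gaussian samples stand in for "random noise", and the function name is hypothetical.

```python
import math
import random

def lag1_autocorrelation(series):
    """Correlation between a series and a copy of itself shifted by one step."""
    mean = sum(series) / len(series)
    numerator = sum(
        (series[t] - mean) * (series[t + 1] - mean)
        for t in range(len(series) - 1)
    )
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator

rng = random.Random(42)  # fixed seed so the run is repeatable
patterned = [math.sin(2 * math.pi * t / 20) for t in range(500)]
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]

patterned_score = lag1_autocorrelation(patterned)
noise_score = lag1_autocorrelation(noise)
print("patterned series:", patterned_score)
print("pure noise:", noise_score)
```

A value near 1 suggests that each point is largely predictable from the previous one, while a value near 0 is what unstructured noise tends to give. A single statistic like this is only a first check, not proof of a pattern.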

Chaotic behavior is often found in places like fluid dynamics and is hypothesized to exist in systems as complex as the stock market. It is an incredibly interesting phenomenon that demonstrates many of the key features of statistical modelling.

Universally Selected Hyper Logically Developed Quantum Information Theory (U SHLD QuIT)

A common problem that I have mentioned many times before is the problem of big data. There is an enormous amount of information in the world. We have learned how to harness inputs from a myriad of different fields, and the result is more data than can be feasibly handled using classical techniques.

This naturally gives rise to the question: what are some non-classical techniques? Many of these have been discussed before, such as a different way of isolating trends or a different method of modelling. These ideas are based on the fact that we have only so much processing power and increasingly large amounts of data. But there is another option. What if we limit our processing technique, but in exchange give it nearly unlimited power? In other words, running our programs won't tell us the same things, but they will run orders of magnitude faster! What is this amazing technology, you ask? Well, welcome to quantum computing.

Quantum computing has suffered from a great deal of poorly researched journalism over the years, but after focusing my summer research project at Stanford on quantum information theory I feel prepared enough to banish the illusions.

The basis of quantum computing is that the concept of a bit, a "light bulb" that is either on or off, can be slightly changed. In classical computing this idea of on or off, all or nothing, is how we store data. Through long strings of on and off light bulbs (or 1's and 0's as they are often known) we can express all manner of ideas. Quantum computing uses physical properties of the universe to make things a little bit more interesting. Instead of a bit being on or off, it has some probability of being on and some probability of being off. Basically that means that we don't know if it is a 1 or a 0. If we look closely enough we can find out, but without looking closely all we know are these probabilities. (And while this explanation still skates around some major concerns, it is accurate enough for this blog post.)

But Ryan, what does this have to do with analyzing data? I'm glad you asked! It turns out that since this bit can have a whole continuous spectrum of probabilities of being on or off, it can store a lot more data. This means that we can take our large amounts of data, translate them into these "quantum bits," and use them for our purposes. But this comes with a great drawback. Information in a "qubit" is not as accessible as in a regular bit. When we "read" a qubit, it resolves into either a 1 or a 0, and any other information is lost. However, there are certain mathematical techniques that we can use to solve some problems faster than we could using classical bits. And thus comes the hope that someday we can use these techniques to analyze large amounts of data in a quick fashion.
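To illustrate the "read it and lose it" behavior described above, here is a toy sketch that simulates a single qubit on an ordinary computer. It is a simplification under my own assumptions: the amplitudes are plain real numbers (real qubits use complex ones), and the function names make_qubit and measure are hypothetical, not part of any quantum library.

```python
import math
import random

def make_qubit(p_one):
    """Return amplitudes (alpha, beta) with P(0) = alpha^2 and P(1) = beta^2."""
    return (math.sqrt(1.0 - p_one), math.sqrt(p_one))

def measure(qubit, rng):
    """Simulate reading the qubit: the only thing that comes out is a 0 or a 1."""
    alpha, beta = qubit
    return 1 if rng.random() < beta ** 2 else 0

rng = random.Random(0)   # fixed seed so the run is repeatable
qubit = make_qubit(0.3)  # assumed example: 30% chance of reading a 1

# A single reading reveals one bit, never the probability itself.
# Only by preparing and reading many identical qubits do we recover it.
readings = [measure(qubit, rng) for _ in range(10_000)]
frequency_of_ones = sum(readings) / len(readings)
print("fraction of readings that were 1:", frequency_of_ones)
```

Each call to measure returns only a single 0 or 1, so the continuous value stored in the qubit is never visible directly. The fraction of 1's approaches the underlying probability only after many repeated preparations and readings, which is one reason the extra "room" in a qubit is hard to exploit.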

SUMaC and Statistics

Mathematics and modelling go hand in hand. Much of mathematics is simply the development of models that fit some subset of the universe or categorize some phenomena. So when it comes to statistical modelling, a good deal of math is often involved. Yet this is a problem because, as useful as this modelling is, the people able to do it are enormously scarce.

Mathematics is one of the least popular fields in all of academia. Many non-academics fear, loathe, or reject math for a variety of reasons. Even many academics have professed that math is simply "not for them." And that could have happened to me too. While I was not struggling with math, and I even had a certain fondness for it, I was not particularly compelled to know more about it. I lacked the curiosity that is necessary to strive for insight into mathematical problems.

Thankfully I avoided this issue by enrolling in Mrs. Bailey's Category Theory class. There I learned an appreciation of math that I had lacked before and it led me on my path toward the Stanford University Mathematics Camp. And that was where I learned the true meaning of being a mathematician. It is more than creating formulas and equations. These things are often done, but it comes down to more than that. Mathematics is about solving problems in a logical manner. And these techniques are the cornerstone of succeeding in the modern business world.

Friday, April 4, 2014

A Brief Summary of Ryan Smith

One of my frequent human interactions in the last few weeks has been in the Stanford Facebook group. It is composed of the admitted students, and after regular decisions came out a week ago the group has been inundated with new people. A common trend is for people to introduce themselves and talk about the things they like. I eventually decided to take a stab at it, and here is my introduction.

I want you all to know that you are influencing me with peer pressure and that is wrong and you should feel terrible. Well, with that out of the way...
Hello everyone, I'm Ryan Smith and if you know a way to put me into a stasis until September please let me know! I am the youngest of 5, an act-a-holic, frequent video game connoisseur, and math enthusiast. That's in fact how I came to apply to Stanford. A few of my friends had attended the Stanford Mathematics Camp (SUMaC represent!) and this led me to apply in 2013 and I proceeded to have one of the best summers of my life. I met lots of amazing people, many of whom are in this group, and fell in love with the campus, the lifestyle, and the community. From that point on I couldn't see myself going anywhere else and I was and am very relieved that I found out that I was going in December.
On other topics, I've made a second home at my local community theater and have been a part of over 40 performances in the last 4 years and if there is anything that I am going to miss it will be my lovely Fountain Hills Theater.
The other major influences in my life have been Warcraft III, my first major online game, WoW and LoL as a place where I found many of my closest friends, and my obnoxious older sister who has shaped my mind to her own purposes.
I'm going to major in Mathematics and Computer Science and love learning about all of the amazing technologies we use. Anyway, that's me, hi. How are you?

Abstract

Today I am showing my abstract for my SRP presentation. This is the first step toward my actual presentation, which I will present in May. Without further ado, here is my abstract.

As the world of data analytics becomes increasingly vital to the business world, many corporations are utilizing it to streamline their marketing, sales, and development departments. This research project explores the data manipulation techniques and tools used by software giants like Google, Facebook, Amazon, and Netflix to market their products and improve their services. These companies utilize petabytes of information that ranges from data on their clients to marketing trends of certain products, and this information requires proper handling to prove useful. There are many different approaches to analyzing this data, such as time series analysis or regression modeling, and as time progresses even more advanced techniques are being developed. The research on this topic was conducted by analyzing the tools used by these companies, such as sentiment analysis and segmentation modeling, and the tools used to manage data in general, such as SQL and R. The purpose of this project is to provide a perspective on how important information management is to the modern world and to show that the new techniques in data analysis are critically important to success as a major business.


Update on Life

Today I'm giving a general update on things I've been doing for the past few weeks. I've been learning a lot about the programming language/analytics tool R, which is enormously useful for creating models and processing data. It shares common features with many languages like C++ or Java and only requires learning a little new syntax. It's made a number of my projects easier.

This last weekend I learned all about the mathematics of sound design helping my theater set up for their annual fundraiser Broadway in the Hills. The gist of it is that setting up a temporary acoustic environment in a day is enormously challenging and requires a LOT of wiring.

In terms of colleges, last week was D-Day for a lot of schools and I am happy to announce that I was rejected by all of the other high-end schools I applied to, including Harvard, Caltech, MIT, and Harvey Mudd. While slightly saddening, I can understand their decisions, as my applications may have suffered after I was accepted into Stanford in December.

On top of my internship I am currently a part of 3 performances at the Fountain Hills Theater. I am running sound for the comedy The Man Who Came to Dinner, student stage managing The Little Princess: Sara Crewe, and performing at Papa Vito in our annual Murder Mystery event, Bellamorte! I do these with a mix of pleasure and pain, as I know that there will not be many more chances for me to spend time at what has been my home away from home for these past 4 years, but I hope to go out with a bang!