Prompt for Wednesday, April 12
- How can we evaluate confidence in a model? When we apply a function to a dataset, we get a result. But how do we know how stable that result is, or whether it could easily have come out bigger or smaller? We want to avoid the Reinhart and Rogoff trap, where adding or removing a few data points could completely invalidate our argument.
- We did this experiment in class when we sampled multiple data sets from the same model. We found that the noise level and the number of data points both had a big effect on how much our estimates varied.
- But how can we estimate variability without knowing the data-generating distribution? In most cases we're just recording numbers from the world, and we have what we have. We can't sample new GDP values for Angola in 2009, for example. What we can do is called the "bootstrap": we create fake data sets that look a lot like our real dataset by sampling the same number of data points, with replacement, from the original data. Some values will disappear, some will appear two or more times. We can then evaluate the variability of our function of interest across these resampled data sets (see the first sketch after this list).
- The important point is that we can always sample with replacement from our data. From any dataset we can run the bootstrap and get a confidence interval.
- What makes a statistic sensitive to which points land in a bootstrap sample? The big reason is outliers. If most of the data points are pretty similar, then whether any particular point gets dropped or duplicated doesn't really matter. It's when there are a few really weird points that we start to see lots of variability; the second sketch below makes this concrete.
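A minimal sketch of the percentile bootstrap in Python with NumPy. The data here are made up (a stand-in for whatever we actually measured), and `bootstrap_ci` is just an illustrative helper name, not a library function:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "observed" data: 40 measurements standing in for our real dataset.
data = rng.normal(loc=10.0, scale=3.0, size=40)

def bootstrap_ci(data, stat=np.mean, n_boot=10_000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for stat(data)."""
    if rng is None:
        rng = np.random.default_rng()
    n = len(data)
    # Each replicate: resample n points *with replacement*, then recompute the statistic.
    reps = np.array([stat(rng.choice(data, size=n, replace=True))
                     for _ in range(n_boot)])
    lo, hi = np.percentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return stat(data), (lo, hi)

point, (lo, hi) = bootstrap_ci(data, rng=rng)
print(f"mean = {point:.2f}, 95% bootstrap CI = ({lo:.2f}, {hi:.2f})")
```

Nothing here depends on knowing where the data came from: the only "sampling" is resampling our own observations.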
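To see why outliers drive bootstrap variability, here is a second sketch (again with invented numbers) comparing a well-behaved dataset to the same dataset with one wild point appended. The bootstrap replicates of the mean spread out far more once the outlier can be dropped or duplicated at random:

```python
# Compare bootstrap variability of the mean with and without a single outlier.
clean = rng.normal(10.0, 1.0, size=30)
with_outlier = np.append(clean, 200.0)   # one really weird point

for name, d in [("clean", clean), ("with outlier", with_outlier)]:
    reps = np.array([rng.choice(d, size=len(d), replace=True).mean()
                     for _ in range(10_000)])
    print(f"{name:13s}: bootstrap SD of the mean = {reps.std():.3f}")
```

The clean data give a small, stable spread; the outlier version swings depending on how many copies of the weird point each resample happens to contain.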