Fitting, Uncertainty, Significance and Error
By Mark Ciotola
First published on May 17, 2019. Last updated on February 16, 2020.
For the moment, let us assume that we have a validated, precisely known data set. However, we don’t know the relationships within the data, its trends or its driving tendencies. So we decide to develop a model to gain a deeper understanding. Generating a model is easy. Personal income = (5 * personal height) + 6. There. Done! Yet to what extent is it a valid model? There are tools for that. In fact, modeling is often a process of adjusting the model function and parameters until the model fits the data well.
One technique is minimizing the sum of the squares. For each data point (e.g. for each point in time), calculate the value the model would produce. Take the square of the difference between the calculated and actual values. Add up all of those squares. Adjust your unknown parameters to produce the smallest value for the sum of the squares. This will be your best fit model.
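The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are hypothetical, the model is assumed to be a straight line y = a*x + b, and the function names sum_of_squares and fit_line are made up for this example. For a straight line, the parameters that minimize the sum of the squares can be computed directly rather than by trial and error.

```python
def sum_of_squares(params, xs, ys):
    """Sum of squared differences between the model a*x + b and the data."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Return the (a, b) that minimizes the sum of squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # intercept
    return a, b


# Hypothetical data (not from the article), scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

best = fit_line(xs, ys)
guess = (5.0, 6.0)  # an arbitrary model, like the "5 * x + 6" example

print("best fit parameters:", best)
print("sum of squares, best fit:", sum_of_squares(best, xs, ys))
print("sum of squares, arbitrary guess:", sum_of_squares(guess, xs, ys))
```

The arbitrary guess produces a far larger sum of squares than the fitted parameters, which is one way to quantify how poorly an unvalidated model matches the data.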
There is a limit to the significance of quantities. Here, we are referring only to mathematical significance. A quantity is only significant up to half of the smallest unit being measured. For example, if you measure a distance with a meter stick that is ruled only in 1 centimeter units, then you can express the distance to the nearest half of the smallest ruling, which in this case is 1/2 of a centimeter. So the measurement would be significant to 0.5 cm, which gives three significant digits for a reading such as 91.5 cm.
Measurements involve a degree of uncertainty. Once again, we are referring here only to mathematical uncertainty. Using the previous example, the distance is likely not exactly at any specific centimeter ruling. It is somewhere between centimeter marks, and it can sometimes be a bit of a judgment call to determine which is the closest mark. So the uncertainty here would be plus or minus 0.5 cm. If the distance was measured as 91.5 cm, then the measurement would be expressed as 91.5 cm +/- 0.5 cm. This sort of error cannot typically be eliminated.
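As a small illustration of the meter stick example, the following Python sketch rounds a reading to the nearest half division and attaches the plus or minus half-division uncertainty. The function name reading_with_uncertainty and the raw value of 91.4 cm are assumptions made for this example, not part of any standard library or of the original measurement.

```python
def reading_with_uncertainty(raw_cm, division_cm=1.0):
    """Round a raw reading to the nearest half division.

    Returns (value, uncertainty), where the uncertainty is half of the
    smallest ruled division, following the convention in the text.
    """
    half = division_cm / 2
    value = round(raw_cm / half) * half
    return value, half


# Hypothetical raw reading on a stick ruled in 1 cm units.
value, uncertainty = reading_with_uncertainty(91.4)
low, high = value - uncertainty, value + uncertainty

print(f"{value:.1f} cm +/- {uncertainty:.1f} cm")
print(f"the distance is taken to lie between {low:.1f} cm and {high:.1f} cm")
```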
Measurements can be subject to systematic error. This type of error occurs due to a consistent flaw in the measurement system.
For example, suppose the end of the meter stick was once cut off at the 1 cm mark, so that a reading taken from the cut end always overstates the distance by 1 cm. Such sources of error can sometimes be identified through examination of the measuring apparatus, and eliminated once identified.
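One way to look for a constant offset of this kind is to measure objects whose lengths are already known and compare the readings against them. The Python sketch below assumes hypothetical calibration data and made-up helper names (estimate_offset, correct). The sign and size of the offset are illustrative, and the approach only works if the error really is the same for every reading.

```python
from statistics import mean

# Hypothetical calibration data: known reference lengths and the
# readings the flawed meter stick gives for them (all in cm).
reference_cm = [10.0, 25.0, 50.0, 80.0]
readings_cm = [11.0, 26.0, 51.0, 81.0]


def estimate_offset(readings, references):
    """Average difference between readings and known reference lengths."""
    return mean(r - ref for r, ref in zip(readings, references))


def correct(reading, offset):
    """Remove a constant systematic offset from a reading."""
    return reading - offset


offset = estimate_offset(readings_cm, reference_cm)
corrected = correct(92.5, offset)

print(f"estimated systematic offset: {offset:+.1f} cm")
print(f"a raw reading of 92.5 cm corrects to {corrected:.1f} cm")
```

Note that this correction removes only the systematic offset. The plus or minus 0.5 cm reading uncertainty still applies to the corrected value.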
Statistical Approaches to Improving Quality of Data
Large regimes are composed of vast numbers of individuals. Even a small city might contain tens of thousands of people. Most large urban areas contain millions of people. In modern times, most powerful countries contain anywhere from 50 million to over one billion people.
Even if a regime is governed by a single individual such as a monarch or dictator, the regime is nevertheless composed of all of the individuals governed, each with their own needs, perspectives, influence and power (even if individually small).