Last week in Data Collection Basics (Part 1) I discussed data collection, introducing the topics of **identifying** required data and then **locating** or **creating** that data. Once you have some data, you typically need to do some **analysis** on it before you can effectively use that data.

**Select Distribution.** Typically, input data to a simulation model is specified as a distribution. If you have estimated data, you must select the most appropriate distribution (for example, a minimum time, typical time, and maximum time may be represented as a Triangular distribution). If you have actual data, then you will need to run a statistical analysis on it. Many software products (some generic and some simulation-specific) are available to help you select (fit) a distribution and its parameters, and even clean the data to eliminate bad observations.
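
As a rough illustration of both cases, here is a minimal Python sketch using only the standard library. The numbers (2, 5, and 11 minutes; a true mean of 4.0) are invented for the example, the "actual data" is synthetic, and the choice of an exponential candidate with a Kolmogorov-Smirnov style gap is an assumption for illustration only. A dedicated fitting tool would compare many candidate distributions.

```python
import math
import random
import statistics

def sample_triangular(minimum, typical, maximum, n, rng):
    """Draw n values from a Triangular(min, mode, max) distribution."""
    # Note the standard-library argument order: (low, high, mode).
    return [rng.triangular(minimum, maximum, typical) for _ in range(n)]

def fit_exponential(observations):
    """Maximum-likelihood fit of an exponential; its only parameter is the mean."""
    return statistics.mean(observations)

def ks_statistic(observations, cdf):
    """Largest gap between the empirical CDF and a candidate CDF."""
    xs = sorted(observations)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

rng = random.Random(42)

# Estimated data: min 2, typical 5, max 11 (hypothetical values).
estimated = sample_triangular(2.0, 5.0, 11.0, 10_000, rng)
print("triangular sample mean:", round(statistics.mean(estimated), 2))

# Actual data: a synthetic stand-in for observed service times.
observed = [rng.expovariate(1 / 4.0) for _ in range(2_000)]
fitted_mean = fit_exponential(observed)
gap = ks_statistic(observed, lambda x: 1 - math.exp(-x / fitted_mean))
print("fitted mean:", round(fitted_mean, 2), "largest CDF gap:", round(gap, 3))
```

A small gap suggests the candidate distribution is a plausible description of the data; a large one suggests trying a different distribution or checking the data for bad observations.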

**Analyze Sensitivity.** Once you have some data, you can build it into your model and start making trial runs. Particularly if you have relied on an estimate, you might want to run your model with values above and below the estimated value to determine the system's sensitivity to that parameter. If you find that the system is sensitive to an estimated value (i.e., the results change significantly with a change to the input parameter), then you can decide whether it is worth a greater investment to obtain a more reliable value. This is one potential solution to the problems of bias and inaccuracy discussed in the initial article. But more than that, it is also a good way to iteratively determine how much time to spend on your input data.
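
The idea can be sketched with a toy model. The Python below is an assumption-laden illustration, not a recommended model: it simulates a hypothetical single-server queue with exponential times, and runs it with the estimated mean service time (4.0, invented for the example) and with values 20% below and above it. Reusing the same seed for each run means the differences come from the parameter change rather than from random noise.

```python
import random

def mean_wait(mean_service, mean_interarrival=5.0, customers=20_000, seed=1):
    """Average time spent waiting in queue for a single-server system."""
    rng = random.Random(seed)
    wait = 0.0   # queue wait of the current customer
    total = 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(1 / mean_service)
        gap = rng.expovariate(1 / mean_interarrival)
        # The next customer waits for whatever work is left when they arrive.
        wait = max(0.0, wait + service - gap)
    return total / customers

estimate = 4.0  # hypothetical estimated mean service time
results = {factor: mean_wait(estimate * factor) for factor in (0.8, 1.0, 1.2)}

for factor, wait in results.items():
    print(f"service time x{factor}: mean wait {wait:.1f}")
```

If the mean wait changes only slightly across the three runs, the estimate is probably good enough. If it changes sharply, as it tends to when a server is close to fully utilized, that parameter is a good candidate for more careful data collection.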

**Adjust Detail.** Sometimes the quality of the available data can help you determine the appropriate level of detail for a model. If the data you intend to use is not very good, then there is little point in building a highly detailed model. This is not to imply that such a model is of no value; after all, *every model is just a representation or estimate of reality*, and no model will be perfect. But it is important to represent to your stakeholders the relative accuracy of the model and its underlying data.

This was a quick overview of some steps to data collection. Whole textbook chapters have been written about each of these, so **be sure to look for greater detail when you are ready**.

Dave Sturrock

VP Products – Simio LLC