Many managers don’t understand what exactly risk analysis is. We put together some of the most common questions with responses for you.
What does the risk percentage mean?
The risk percentage approximates
the on-time probability for an order with appropriate consideration of the
number of replications or “experiments.”
It tells the user how confident they can be in meeting the due date
given how many trials they have conducted.
How does Simio calculate the on-time probability?
Simio adjusts from a base rate of 50% with each risk
replication. If an order is on time in
an individual replication, Simio updates the probability, increasing it closer
to 100%. If the order is late, Simio
decreases the probability closer to 0%.
Each replication is an experiment that provides new information about
the likelihood of success or failure.
More experiments mean more confidence in the answer.
Why is the base rate 50%?
Before any plan is generated or any activity is simulated,
there is no information about the order other than the possible outcomes. Because there are only two outcomes that
matter (on time or not), the base rate is set to 50%.
I have an overdue order in my system. Why is it not always 0%?
Because the calculation is an adjustment of a base rate of
50%, Simio needs a lot of evidence before it will guarantee that an order will
be late (or on time for that matter). If
the user runs 1000 replications, and the result is late in all of them, Simio
will reflect a 0% on time probability.
What formula does Simio use to calculate the probability?
For the statistics experts, Simio uses a binomial proportion
confidence interval formula known as the Wilson score interval. We report the midpoint of the confidence
interval as the risk measure.
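For reference, the midpoint of the Wilson score interval can be written as below. This is the standard textbook form rather than anything taken from Simio's own documentation, and the symbols are introduced here for illustration: n is the number of replications, x is the number of replications in which the order was on time, and z is the normal quantile for the chosen confidence level (about 1.96 for 95%).

```latex
\[
\tilde{p} \;=\; \frac{x + z^{2}/2}{\,n + z^{2}\,}
\]
```

With no replications (n = 0, x = 0) the expression reduces to 1/2, which is consistent with the 50% base rate described above.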
Why not just report the outcome of the replications as
the probability (e.g., if 9 of 10 are on time, report 90% on time probability)?
This was the original implementation. However, it gives a false sense of confidence
and can be misleading. A single
replication would always yield either 100% on time or 0% on time. We wanted the answer to also give decision
makers a sense of how confident they could be in the answer. Using the Wilson Score, a single replication will
yield a result of 60% at best and 40% at worst (using a 95% confidence level). This helps the decision maker recognize that
they have a very small sample of data and encourages them to run more replications.
Can you give me an example of how this works?
Risk analysis can be demonstrated using any scheduling
example. It is best viewed in the Entity
Gantt. In the screenshots below, we’ve
included 2 orders from the Candy Manufacturing Scheduling example. One of the orders is overdue (will be late
always), and the other has plenty of time (will be on time always).
The base rate is 50%.
After 1 replication, Simio updates the probabilities. Order 1 now has a 60% on time
probability. Order 2 has a 40% on time probability:
After 2 replications, 67% and 33%:
After 5 replications, 78% and 22%:
After 100 replications, 98% and 2%:
Finally, after 1000 replications, 100% and 0%:
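The percentages quoted in this example can be reproduced with a few lines of Python. This is a minimal sketch that assumes the risk measure is the plain Wilson score midpoint with z = 1.96 (95% confidence) rounded to the nearest percent; the function name `wilson_midpoint` is ours, not a Simio API.

```python
def wilson_midpoint(on_time: int, replications: int, z: float = 1.96) -> float:
    """Midpoint of the Wilson score interval for a binomial proportion.

    Assumption: z = 1.96 corresponds to the 95% confidence level.
    """
    if replications < 0 or not 0 <= on_time <= replications:
        raise ValueError("need 0 <= on_time <= replications")
    z2 = z * z
    return (on_time + z2 / 2.0) / (replications + z2)


# Order 1 is on time in every replication, Order 2 is late in every one.
for n in (0, 1, 2, 5, 10, 100, 1000):
    always_on_time = wilson_midpoint(n, n)
    always_late = wilson_midpoint(0, n)
    print(f"{n:>4} replications: {always_on_time:.0%} / {always_late:.0%}")
```

Under these assumptions the loop prints 50% / 50% for zero replications and 60% / 40%, 67% / 33%, 78% / 22%, 98% / 2%, and 100% / 0% for 1, 2, 5, 100, and 1000 replications, matching the figures above. Ten replications give 86% / 14%.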
How many replications should I run?
By default, we suggest 10 replications (and a 95% confidence
level). With these settings, a risk
measure of 86% is a good sign, while 14% is a bad one. Beyond the default settings, there are
several additional factors which are dependent on the situation and use case. One of these factors is slack time (the time
between estimated completion and due date).
On the Gantt, slack time is the distance between the grey marker and the
green marker. If the slack time is
large, a single replication may suffice.
If the slack time is small, additional replications will help identify
if the order is in trouble or not.
Now that I know my risk, what can I do about it?
Depending on your position in the organization (and
therefore your decision rights), you can change either the design or operation
of the system. Example design changes include things like adding another
assembly line or buying another forklift. These changes are long term and
may require approvals for capital expenditure (which the model facilitates by
quantifying the impact of the expenditure). Example operational changes
include things like adding overtime, expediting a material, or changing order
priorities, quantities, due dates etc. Bridging the gap between design
and operation are the dispatching rules, which relate to overall business
objectives. They are also flexible parameters which control how Simio
chooses the next job from a queue (e.g., earliest due date, least setup,
critical ratio, etc.). All of these parameters influence risk and can be
changed, provided that the user has the authority to change them.
Will Simio choose the best design and operation for me?
Decision rights and business processes have far reaching
consequences. A floor manager can probably authorize overtime if the
schedule looks risky. He probably cannot buy a piece of equipment.
To change a priority or a due date, he probably needs to consult with the
commercial team and/or account managers. To expedite a material, he
probably needs to communicate with the procurement team. To make a
capital expenditure (i.e., change system design), he probably needs
executive/financial approval. Our solution respects those
boundaries. We treat priorities, due dates, etc. as inputs rather than
outputs. Any of these parameters can be changed by the appropriate
decision maker. They should not be changed by the tool without
consent. Simio assists the decision maker (at any level in the
organization) by exposing the true consequences.
With so many choices, how can I quickly explore the
consequences across multiple scenarios?
The experiment runner is used to explore consequences (which
we call Responses) across multiple scenarios where a user can influence the
parameters mentioned above (which we call Controls). If the solution
space is very large (i.e., there are many controls with a wide range of
acceptable values), we recommend using OptQuest to automate the search of the
solution space based on single or multiple objectives (e.g., low cost and high
service level). OptQuest uses a Tabu search which learns how the control
values influence the objectives as it explores the solution space.
How often should I run these types of experiments?
Experiments are most relevant to design choices.
Operational decisions have many hard constraints which cannot be easily
influenced. For example, though Simio will allow you to adjust material
receipt dates of critical materials and show you the impact on the schedule,
many of them are inflexible and outside the control of the planner or even the
business. If you ask OptQuest how much inventory you would like to have,
it will tell you, but this information adds no value because it is not
actionable in the short term. The planners need to work with what they
have and make the best of it. In practical application, we recommend
running large experiments to explore design decisions on a monthly or quarterly basis.
In today’s world, companies compete not only on price and
quality, but on their ability to reliably deliver product on time. A good production schedule, therefore, influences
a company’s throughput, sales and customer satisfaction. Although companies have invested millions in
information technology for Enterprise Resource Planning (ERP) and Manufacturing
Execution Systems (MES), the investment has fallen short on detailed production
scheduling, causing most companies to fall back on manual methods involving Excel
and planning boards. Meanwhile, industry
trends towards reduced inventory, shorter lead times, increased product
customization, SKU proliferation, and flexible manufacturing are making the
task more complicated. Creating a
feasible plan requires simultaneous consideration of materials, labor,
equipment, and demand. This bar is
simply too high for any manual planning method.
The challenge of creating a reliable plan requires a digital
transformation which can support automated and reliable scheduling.
Central to the idea of effective factory scheduling is the
concept of an actionable schedule.
An actionable schedule is one that fully accounts for the detailed
constraints and operating rules in the system and can therefore be executed in
the factory by the production staff. An
issue with many scheduling solutions is that they ignore one or more detailed
constraints, and therefore cannot be executed as specified on the factory
floor. A non-actionable schedule
requires the operators to step in and override the planned schedule to
accommodate the actual constraints of the system. At this point the schedule is no longer
being followed, and local decisions are being made that impact the system KPIs
in ways that are not visible to the operators.
A second central idea of effective scheduling is properly
accounting for variability and unplanned events in the factory and the corresponding
detrimental impact on throughput and on-time delivery. Most scheduling approaches completely ignore
this critical element of the system, and therefore produce optimistic schedules
that cannot be met in practice. What
starts off looking like a feasible schedule degrades over time as machines
break, workers call off sick, materials arrive late, rework is required,
etc. The optimistic promises that were
made cannot be kept.
A third consideration is the effect of an infeasible
schedule on the supply chain plan. Factory
scheduling is only the final step in the production planning process, which
begins with supply chain planning based on actual and/or forecast demand. The supply chain planning process generates
production orders and typically establishes material requirements for each planning
period across the entire production network.
The production orders that are generated for each factory in the network
during this process are based on a rough-cut model of the production capacity. The supply chain planning process has very
limited visibility of the true constraints of the factory, and the resulting
production requirements often overestimate the capacity of the factory. Subsequently, the factory schedulers must
develop a detailed plan to meet these production requirements given the actual
constraints of the equipment, workforce, etc.
The factory adjustments to make the plan actionable will not be
transparent to the supply chain planners.
This creates a disconnect in a core business planning function where
enormous spending occurs.
In this paper we will discuss the solution to these
challenges, the Process Digital Twin, and the path to get there. The Simio Digital Twin solution is built on
the patented Simio Risk-based Planning and Scheduling (RPS) software. We
will begin by describing and comparing the three common approaches to factory
scheduling. We will then discuss in
detail the advantages of a process Digital Twin for factory scheduling built on
Factory Scheduling Approaches
Let’s begin by discussing the three most common approaches
to solving the scheduling problem in use today:
1) manual methods using planning boards or spreadsheets, 2) resource
models, and 3) process Digital Twin.
The most common method in use today for factory scheduling
is the manual method, typically augmented with spreadsheets or planning
boards. The use of manual scheduling is
typically not a company’s first choice but is the result of a failure to
succeed with automated systems.
Manually generating a schedule for a complex factory is a
very challenging task, requiring a detailed understanding of all the equipment,
workforce, and operational constraints. Five
of the most frustrating drawbacks include:
It is difficult for a scheduler to consider all
the critical constraints. While
schedulers can typically focus on primary constraints, they are often unaware –
or must ignore – secondary constraints, and these omissions lead to a
Manual scheduling typically takes hours to
complete, and the moment any change occurs the schedule becomes
The quality of the schedule is entirely
dependent on the knowledge and skill of the scheduler. If the scheduler retires or is out for vacation
or illness, the backup scheduler may be less skilled and the KPIs may degrade.
It is virtually impossible for the scheduler to
account for the degrading effect of variation on the schedule and therefore
provide confident completion times for orders.
As critical jobs become late, manual schedulers
resort to bumping other jobs to accommodate these “hot” jobs, disrupting the flow
and creating more “hot” jobs. The system
becomes jerky and the system dissolves into firefighting.
Companies that utilize an automated method for factory
scheduling typically use an approach based on a resource model of the
factory. A resource model is comprised
of a list of critical resources with time slots allocated to tasks that must be
processed by the resource based on estimated task times. The
resource list includes machines, fixtures, workers, etc., that are required for
production. The following is a Gantt
chart depicting a simple resource model with four resources (A, B, C, D) and two
jobs (blue, red). The blue job has task
sequence A, D, and B, and the red job has task sequence A and B.
The resources in a resource model are defined by a state
that can be busy, idle, or off-shift.
When a resource is busy with one task or off-shift, other tasks must
wait to be allocated to the resource (e.g. red waits for blue on resource A). The scheduling tools that are based on a
resource model all share this same representation of the factory capacity and
differ only in how tasks are assigned to the resources.
The problem that all these tools share is an overly
simplistic constraint model. Although this
model may work in some simple applications, there are many constraints in
factories that can’t be represented by a simple busy, idle, off-shift state for
a resource. Consider the following examples:
A system has two cranes (A and B) on a runway
that are used to move aircraft components to workstations. Although crane A is currently idle, it is
blocked by crane B and therefore cannot be assigned the task.
A workstation on production line 1 is currently
idle and ready to begin a new task.
However, this workstation has only limited availability when a complex
operation is underway on adjacent line 2.
An assembly operator is required for completing
assembly. There are assembly operators
currently idle, but the same operator that was assigned to the previous task
must also be used on this task, and that operator is currently busy.
A setup operator is required for this task. The operator is idle but is in the adjacent
building and must travel to this location before setup can start.
The tasks involve the flow of fluid through
pipes, valves, and storage/mixing tanks, and the flow is limited by complex
A job requires treatment in an oven, the oven is
idle but not currently at the required temperature.
These are just a few examples of typical constraints for which
a simple busy, idle, off-shift resource model is inadequate. Every factory has its own set of such
constraints that limit the capacity of the facility.
The scheduling tools that utilize a simple resource model
allocate tasks to the resources using one of three basic approaches:
heuristics, optimization, and simulation.
One common heuristic is job-sequencing that begins with the
highest priority job, and assigns all tasks for that job, and repeats this
process for each job until all jobs are scheduled (in the previous example blue
is sequenced, then red). This simple
approach to job sequencing can be done in either a forward direction starting
with the release date, or a backward direction starting with the due date. Note that backward sequencing (while useful
in master planning) is typically problematic in detailed scheduling because the
resulting schedule is fragile and any disruption in the flow of work will
create a tardy job. This simple one-job-at-a-time
sequencing heuristic cannot accommodate complex operating rules such as
minimizing changeovers or running production campaigns based on attributes such
as size or color. However, there have
been many different heuristics developed over time to accommodate special
application requirements. Examples of
scheduling tools that utilize heuristics include Preactor from Siemens and
PP/DS from SAP.
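The forward job-sequencing heuristic described above can be sketched as follows. This is a toy illustration, not the algorithm of any named product: the `Job` class, the task durations, and the gap-filling rule in `earliest_slot` are all assumptions made for the example. It uses the blue and red jobs from the earlier resource model, with blue routed through A, D, and B and red through A and B.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int                    # lower number = sequenced first
    release: float                   # earliest time the first task may start
    tasks: list[tuple[str, float]]   # (resource, duration) in routing order


def earliest_slot(busy, ready, duration):
    """Earliest start >= ready where [start, start + duration) avoids all busy intervals."""
    start = ready
    for b_start, b_end in sorted(busy):
        if start + duration <= b_start:
            break                    # the task fits in the gap before this interval
        start = max(start, b_end)    # otherwise it cannot start before the interval ends
    return start


def sequence_jobs(jobs):
    """Schedule one whole job at a time, highest priority first, moving forward in time."""
    busy = {}       # resource -> list of (start, end) already allocated
    schedule = []   # rows of (job, resource, start, end)
    for job in sorted(jobs, key=lambda j: j.priority):
        ready = job.release
        for resource, duration in job.tasks:
            slots = busy.setdefault(resource, [])
            start = earliest_slot(slots, ready, duration)
            end = start + duration
            slots.append((start, end))
            schedule.append((job.name, resource, start, end))
            ready = end              # the next task cannot start before this one ends
    return schedule


# Hypothetical durations; only the routings come from the example above.
jobs = [
    Job("blue", priority=1, release=0, tasks=[("A", 2), ("D", 3), ("B", 2)]),
    Job("red", priority=2, release=0, tasks=[("A", 1), ("B", 2)]),
]
for row in sequence_jobs(jobs):
    print(row)
```

With these durations, red waits for blue on resource A and then uses the idle gap on B before blue arrives there. Note that the only state a resource has in this sketch is a list of busy intervals, which is exactly the simplification the text criticizes.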
The second approach to assigning tasks to resources in the
resource model is optimization, in which the task assignment problem is
formulated as a set of sequencing constraints that must be satisfied while meeting
an objective such as minimizing tardiness or cost. The mathematical formulation is then “solved”
using a Constraint Programming (CP) solver.
The CP solver uses heuristic rules for searching for possible task
assignments that meet the sequencing constraints and improve the objective. Note that there is no known algorithm that can
optimize the mathematical formulation of the task assignment for the resource
model in a reasonable time (this problem is technically classified as NP-hard),
and hence the available CP solvers rely on heuristics to find a “practical” but
not optimal solution. In practice, the
optimization approach has limited application because often long run times (hours)
are required to get to a good solution.
Although PP/DS incorporates the CP solver from ILOG to assign tasks to
resources, most installations of PP/DS rely on the available heuristics for
The third approach to assigning tasks in the simple resource
model is a simulation approach. In this
case we simulate the flow of jobs through the resource model of the factory and
assign tasks to available resources using dispatching rules such as smallest
changeover or earliest completion. This
approach has several advantages over the optimization approach. First, it executes much faster, producing a
schedule in minutes instead of hours. Another
key advantage is that it can support custom decision logic for allocating tasks
to resources. An example of a tool that
utilizes this approach is Preactor 400 from Siemens.
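A minimal sketch of the simulation approach is shown below, assuming a simple event-driven loop in which each resource, whenever it becomes free, picks the next waiting job according to a dispatching rule. The data layout, the `simulate` function, and the earliest-due-date rule used in the example are illustrative assumptions, not the logic of any particular product.

```python
import heapq


def simulate(jobs, rule):
    """Simulate jobs flowing through resources.

    jobs: dict name -> {"due": float, "tasks": [(resource, duration), ...]}
    rule: function(name, jobs) -> sort key; the waiting job with the smallest key goes first.
    """
    queues = {}                           # resource -> names of waiting jobs
    busy = set()                          # resources currently processing a task
    step = {name: 0 for name in jobs}     # index of each job's next task
    events = []                           # heap of (finish_time, job, resource)
    schedule = []                         # rows of (job, resource, start, end)

    def enqueue(name):
        resource = jobs[name]["tasks"][step[name]][0]
        queues.setdefault(resource, []).append(name)

    def dispatch(now):
        for resource in sorted(queues):
            queue = queues[resource]
            if resource in busy or not queue:
                continue
            name = min(queue, key=lambda n: rule(n, jobs))   # apply the dispatching rule
            queue.remove(name)
            duration = jobs[name]["tasks"][step[name]][1]
            busy.add(resource)
            schedule.append((name, resource, now, now + duration))
            heapq.heappush(events, (now + duration, name, resource))

    for name in jobs:
        enqueue(name)
    dispatch(0.0)
    while events:
        now = events[0][0]
        # Process every task that finishes at this instant before dispatching again.
        while events and events[0][0] == now:
            _, name, resource = heapq.heappop(events)
            busy.discard(resource)
            step[name] += 1
            if step[name] < len(jobs[name]["tasks"]):
                enqueue(name)
        dispatch(now)
    return schedule


# Hypothetical durations and due dates for the blue and red jobs.
jobs = {
    "blue": {"due": 10, "tasks": [("A", 2), ("D", 3), ("B", 2)]},
    "red": {"due": 4, "tasks": [("A", 1), ("B", 2)]},
}
schedule = simulate(jobs, rule=lambda name, data: data[name]["due"])  # earliest due date
for row in schedule:
    print(row)
```

Swapping the `rule` argument for another key function (for example, one that returns the changeover time) changes the dispatching behavior without touching the simulation loop, which is the kind of custom decision logic the paragraph above refers to.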
Regardless of which approach is used to assign tasks to
resources, the resulting schedule assumes away all random events and variation
in the system. Hence the resulting
schedules are optimistic and lead to overpromising of delivery times to
customers. These tools provide no
mechanism for assessing the related risk with the schedule.
The third and latest approach to factory scheduling is a
process Digital Twin of the factory. A
Digital Twin is a digital replica of the processes, equipment, people, and devices
that make up the factory and can be used for both system design and operation. The resources in the system not only have a
busy, idle, and off-shift state, but they are objects that have behaviors and
can move around the system and interact with the other objects in the model to
replicate the behavior and detailed constraints of the real factory. The
Digital Twin brings a new level of fidelity to scheduling that is not available
in the existing resource-based modeling tools.
Simio Digital Twin
The Simio Digital Twin is an object-based, data driven, 3D
animated model of the factory that is connected to real time data from the ERP,
MES, and related data sources. We will
now summarize the key advantages of the Simio Digital Twin as a factory
Dual Use: System Design and Operation
Although the focus here is on enhancing throughput and
on-time delivery by better scheduling using the existing factory design, unlike
traditional scheduling tools, the Simio Digital Twin can also be used to
optimize the factory design. The same
Simio model that is used for factory scheduling can be used to test out changes
to the facility such as adding new equipment, changing staffing levels,
consolidating production steps, adding buffer inventory, etc.
A basic requirement of any scheduling solution is that it
provide actionable schedules that can be implemented in the real factory. If a non-actionable production schedule is
sent to the factory floor, the production staff have no choice but to ignore the
schedule and make their own decisions based on local information.
For a schedule to be actionable, it must capture all the
detailed constraints of the system. Since
the foundation of the Simio Digital Twin is an object-based modeling tool, the
factory model can capture all these constraints in as much detail as necessary. This includes complex constraints such as
material handling devices, complex equipment, workers with different skill sets,
and complex sequencing requirements,
In many systems there are operating rules that have been developed
over time to control the production processes.
These operating rules are just as important to capture as the key system
constraints; any schedule that ignores these operating rules is non-actionable. The Simio modeling framework has flexible rule-based
decision logic for implementing these operating rules. The result is an actionable schedule that respects
both the physical constraints of the system as well as the standard operating
In most organizations, the useful life of a schedule is
short because unplanned events and variation occur that make the current
schedule invalid. When this occurs, a new
schedule must be generated and distributed as quickly as possible to
keep the production running smoothly. A
manual or optimization-based approach to schedule regeneration that takes hours
to complete is not practical; in this case the shop floor operators will take
over and implement their own local scheduling decisions that may not be aligned
with the system-wide KPIs. When random
events occur, the Simio Digital Twin can quickly respond and generate and
distribute a new actionable schedule. Schedule
regeneration can either be manually triggered by the scheduler, or
automatically triggered by events in the system.
3D Animated Model and Schedule
In other scheduling systems the only graphical view of the
model and schedule is the resource Gantt chart.
In contrast, the Simio Digital Twin provides a powerful communication
and visualization of both the model structure and resulting schedule. Ideally, anyone in the organization – from
the shop floor to the top floor – should be able to view and understand the model
well enough to validate its structure. A
good solution improves not only the ability to generate an actionable schedule,
but to visualize it and explain it across all levels of the organization.
The Simio Gantt chart has a direct link to the 3D animated facility;
right-click on a resource along the time scale in the Gantt view and you
instantly jump to an animated view of that portion of the facility – showing the machines,
workers, and work in process at that point in time in the schedule. From that point you can simulate forward in
time and watch the schedule unfold as it will in the real system. The benefits of the Simio Digital Twin begin
with its accurate and fast generation of an actionable schedule. But the benefits culminate in the Digital
Twin’s ability to communicate its structure, its model logic, and its resulting
schedules to anyone that needs to know.
One of the key shortcomings of scheduling tools is their
inability to deal with unplanned events and variation. In contrast, the Simio Digital Twin can
accurately model these unplanned events and variations to not only provide a
detailed schedule, but also analyze the risk associated with the schedule.
When generating a schedule, the random events/variations are
automatically disabled to generate a deterministic schedule. Like other deterministic schedules it is
optimistic in terms of on time completions.
However, once this schedule is generated, the same model is executed
multiple times with the events/variation enabled, to generate a random sampling
of multiple schedules based on the uncertainty in the system. The set of randomly generated schedules is
then used to derive risk measures – such as the likelihood that each order will
ship on time. These risk measures are
directly displayed on the Gantt chart and in related reports. This lets the scheduler know in advance
which orders are risky and take action to make sure important orders have a
high likelihood of shipping on time.
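The two-pass procedure described above (one deterministic run, then several replications with variation enabled) can be sketched in a few lines. This is a toy illustration and not Simio's engine: the serial routing, the triangular variation on task times, and the function names are all assumptions, and the risk measure is taken to be the Wilson score midpoint at 95% confidence.

```python
import random


def completion_time(nominal_durations, rng=None):
    """Completion time of a serial routing; rng=None disables variation."""
    total = 0.0
    for duration in nominal_durations:
        if rng is None:
            total += duration
        else:
            # Hypothetical variation: 90% to 150% of nominal, most likely 100%.
            total += duration * rng.triangular(0.9, 1.5, 1.0)
    return total


def wilson_midpoint(on_time, replications, z=1.96):
    """Midpoint of the Wilson score interval (z = 1.96 assumed for 95% confidence)."""
    z2 = z * z
    return (on_time + z2 / 2) / (replications + z2)


def risk_measure(nominal_durations, due, replications=10, seed=0):
    """Count on-time replications with variation enabled and convert to a risk measure."""
    rng = random.Random(seed)
    on_time = sum(
        completion_time(nominal_durations, rng) <= due
        for _ in range(replications)
    )
    return on_time, wilson_midpoint(on_time, replications)


durations = [2, 3, 2]                                   # hypothetical task times for one order
print("deterministic completion:", completion_time(durations))
print("risk with due date 7.5:", risk_measure(durations, due=7.5))
```

The deterministic pass reports a completion time of 7.0, which looks comfortable against a due date of 7.5. The replications are what reveal how often the order actually makes that date once task times vary.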
It’s not uncommon that the supply chain planning process which
is based on a rough-cut capacity model of the factory sends more work to a
production facility than can be easily produced given the true capacity and
operational constraints of the facility.
When this occurs, the resulting detailed schedule will have one or more
late jobs and/or jobs with high risk of being late. The question then arises as to what actions
can be taken by the scheduler to ensure that the important jobs are all delivered on time.
Although other scheduling approaches generate a schedule,
the Simio Digital Twin goes one step further by also providing a constraint
analysis detailing all the non-value added (NVA) time that is spent by each job
in the system. This includes time
waiting for a machine, an operator, material, a material handling device, or
any other constraint that is impeding the production of the item. Hence if the schedule shows that an item is
going to be late, the constraint analysis shows what actions might be taken to
reduce the NVA time and ship the product on time. For example, if the item spends a significant
time waiting for a setup operation, scheduling overtime for that operator may
Although scheduling within the four walls of a discrete production
facility is an important application area, there are many scheduling
applications beyond discrete manufacturing.
Many manufacturing applications involve fluid flows with storage/mixing
tanks, batch processing, as well as discrete part production. In contrast to other scheduling tools that are
limited in scope to discrete manufacturing, the Simio Digital Twin has been
applied across many different application areas including mixed-mode manufacturing,
and areas outside of manufacturing such as logistics and healthcare. These applications are made possible by the
flexible modeling framework of Simio RPS.
A process Digital Twin is a detailed simulation model that
is directly connected to real time system data. Traditional simulation modeling
tools have limited ability to connect to real time data from ERP, MES, and
other data sources. In contrast, Simio
RPS is designed from the ground up with data integration as a primary
Simio RPS supports a Digital Twin implementation by
providing a flexible relational in-memory data set that can directly map to both
model components and to external data sources.
This approach allows for direct integration with a wide range of data
sources while enabling fast execution of the Simio RPS model.
Data Generated Models
In global applications there are typically multiple
production facilities located around the world that produce the same
products. Although each facility has its
own unique layout there is typically significant overlap in terms of resources
(equipment, workers, etc.) and processes.
In this case Simio RPS provides special features to allow the Digital
Twin for each facility to be automatically generated from data tables that map
to modeling components that describe the resources and processes. This greatly simplifies the development of
multiple Digital Twins across the enterprise and also supports the reconfiguring
of each Digital Twin via data table edits to accommodate ongoing changes in
resources and/or processes.
Simio is a forward scheduling simulation engine. We do not support backwards scheduling. We have found the backwards scheduling
approach fails to represent reality, thus generating an infeasible plan that is
unhelpful to planners. Many of our customers
have learned this lesson the hard way.
The underlying principle of forward scheduling is
feasibility first. A schedule is built
looking forwards considering all the constraints and conditions of the system
(e.g., resource availability, inventory levels, work in progress, etc.). The schedule is optimized in run time while
only considering the set of feasible choices available at that time. Decisions are made according to user
specified dispatching rules (the same as backwards scheduling). The output is a detailed schedule that
reflects what is possible and tells the planner how to achieve it. As in real life, a planner can only choose
when to start an operation. Completion date
is an outcome, not a user specified input.
The most salient technical difference between the two
approaches is material availability (both raw and intermediate manufactured
materials). A forward-looking schedule
makes no assumptions. If materials are
available, a finished good can be produced.
Otherwise, it cannot. If the
materials must be ordered or manufactured, the system will order them or
manufacture them before the finished good can start. A backwards schedule plans the last operation
first, assuming that materials will be available (we have yet to find an environment
where future inventory can be accurately forecast). If the materials must be produced or
purchased, it will try to schedule or order them prior, hoping that the start
date isn’t yesterday. If the clock is
wound backwards from due date all the way to present time, the resulting
schedule shows the planner what their current stockpile and on-order inventory
would have to be to execute the idealized plan.
It does not tell the planner what they could do with their actual
stockpile and on-order inventory.
Next consider a situation where demand exceeds plant
capacity (this is reality for most of our customers). The plant cannot produce everything that the
planner wants. The planner must choose
amongst the alternatives and face the tradeoffs. Forward scheduling deals with this situation
by continuing to schedule into the future, past the due date, showing the
planner which orders will be late. By
adjusting the dispatching rules, priorities, and the release dates, the planner
can improve the schedule until they reach a satisfactory alternative. Every alternative is a valid choice and feasible
for execution. Backwards scheduling
deals with this situation by continuing to schedule into the past, showing the
planner which orders should have been produced yesterday. The planner must tweak and adjust dispatching
rules and due dates until finding a feasible alternative. In our experience, the planner can make the
best decision by comparing multiple feasible plans, rather than searching for a
Any complete scheduling solution must also be capable of
rescheduling. Rescheduling can be
triggered by any number of random events that occur daily. In rescheduling, the output must respect work
in progress. Forward scheduling loads
WIP first, making the resource unavailable until the WIP is complete. Backwards scheduling loads WIP last, if at
all. Imagine building a weekly schedule
backwards in time, hoping that the “ending” point exactly equals current plant
WIP. The result is often infeasible.
In terms of feasibility, the advantages of forward
scheduling are clear. But we also get
questions about optimization, particularly around JIT delivery. A quick Google search on forward scheduling
reveals literature and blog posts that describe forward scheduling as “as early as
possible” (meaning a forward schedule starts an operation as soon as a resource
is available, regardless of when the order is due). This is false. Forward scheduling manages the inventory of
finished goods the same way the plant does.
A planner specifies a release date as a function of due date (or in some
cases specifies individual release dates for each order). In forward scheduling, no order is started
prior to release date. The power of this
approach is experimentation. Changing
lead time is as easy as typing in a different integer and rescheduling. As above, the result is a different feasible
alternative which makes the tradeoff transparent. Shorter lead times minimize inventory of
finished goods but increase late deliveries and vice versa. We have found many customers focus on short
lead times based on financial goals rather than operational goals. Inventory ties up cash. Typically, the decision to focus on cash is
made without quantifying the tradeoff.
We provide decision makers with clear-cut differences between
operational strategies so that they can choose based on complete information.
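The lead time tradeoff described above can be illustrated with a small forward-scheduling sketch. It assumes a single resource, orders processed in due date order, and a release date equal to the due date minus a common lead time; the function name, the order data, and the two summary measures are hypothetical.

```python
def forward_schedule(orders, lead_time):
    """Forward-schedule orders on one resource.

    orders: list of (due, processing_time)
    Returns (total time finished goods wait in inventory, total time orders are late).
    """
    resource_free = 0.0
    inventory = 0.0
    late = 0.0
    for due, processing in sorted(orders):
        release = due - lead_time
        start = max(release, resource_free)   # never start before the release date
        finish = start + processing           # completion is an outcome, not an input
        resource_free = finish
        inventory += max(0.0, due - finish)   # finished early: goods sit in stock
        late += max(0.0, finish - due)        # finished after the due date
    return inventory, late


orders = [(10, 4), (12, 4), (14, 4)]          # hypothetical (due, processing_time) pairs
for lead_time in (4, 6, 8):
    inventory, late = forward_schedule(orders, lead_time)
    print(f"lead time {lead_time}: inventory time {inventory}, late time {late}")
```

For this data, a lead time of 4 yields no finished goods inventory but 6 time units of lateness, a lead time of 8 yields no lateness but 6 units of inventory, and a lead time of 6 sits in between with 2 of each. Every row is a feasible schedule, so the comparison is between real alternatives.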
Forward scheduling is reality. It properly represents material flows and
constraints, plant capacity, and work in progress. It manages the plant the same way a planner
does. Accordingly, it generates sets of
feasible alternatives that quantify tradeoffs for planners and executive
decision makers alike. It answers the
question “What should the plant do next?” as opposed to “What should the plant
have done before?” We’ve found the
feasibility first approach is the most helpful to a planner and therefore the
most valuable to a business.