Simulating a Pre-Archival System

by Aineth Torres-Ruiz (EGADE BS) and Ariel Shtul (The Archivists)

As presented at the 2016 Winter Simulation Conference


We developed a simulation model in SIMIO representing the elements of the pre-archival process at the largest archival services company in Israel. In the pre-archival process, data entry operators manually register retrieval information for boxes and files arriving on a roller conveyor before the boxes are assigned a space at the storage facility. The operators sit around the conveyor and pick barcoded boxes from it. Using the simulation model, we explored the behavior of the original system and identified opportunities for efficiency improvement. Initial changes to the system have improved its capacity by up to 15% over several months. The following sections describe the system and the features of the modelling components.


"Ha'Archivarim" is the largest provider of archival services in Israel. It is a subsidiary of the archive group within Villar Int'l, a Tel-Aviv 75 Index company. The archive group includes two more companies: Archive House ("Beit Ha'Archiv") and Archivit, located in Romania. The company manages more than 40,000 m² (400,000 ft²) of storage space and over 3.5 million boxes. A normal box measures 40x30x30 cm, but the company also handles boxes of other dimensions.

Each year, the company receives hundreds of thousands of new boxes from clients. In the pre-archival department, the content of each box is entered into a database so that clients can retrieve the files they need in the future. While some clients enter the content through a website, most of the information has to be entered manually by company employees, the data entry operators. These operators sit around a circular conveyor and pick a barcoded box. They scan the barcode, and a screen with the appropriate fields, defined by the client's request, opens on their PC. If there are more than 6 files, the operators mark the files with a number and then type the content. For example, for a medical record they may type the name, ID, date and department. Finally, they push the box back onto the conveyor and pick a new box, and so on. Boxes stay on the conveyor until an operator picks them. Operators may choose not to pick a box out of self-interest. At the end of the process, an automated system pulls typed boxes off the conveyor. This system was not fully functional when the model was first developed, so about 10% of boxes failed to leave the conveyor when they should have.

The department includes over 10 full-time data entry operators, a direct manager, two conveyor loaders/unloaders, acceptance/client-coordinator personnel and a department manager. The seasonality of box arrivals from clients dictates the seasonality of employment. During peak periods, the company fills all conveyor positions and runs additional shifts.

Simulation Objectives

The objective was to develop a simulation model that allows users to test different scenarios by modifying the following system attributes: number of data entry operators, data entry operators' skill levels, box and file processing times, and the rate of boxes failing to leave the conveyor when they should.
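To make the scenario attributes concrete, the following is a minimal Python sketch of how such a scenario could be described outside SIMIO. The class name `Scenario`, its field names and all default values are assumptions made for illustration; only the roughly 10% pull-out failure rate comes from the system description.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for the attributes a user may vary."""

    # One efficiency (skill) coefficient per operator; values are assumed.
    operator_efficiencies: Tuple[float, ...] = (1.0,) * 10
    # Assumed nominal processing times, in minutes.
    box_time_min: float = 1.0
    file_time_min: float = 0.5
    # Share of typed boxes that fail to leave the conveyor
    # (about 10% in the original system).
    pull_out_failure_rate: float = 0.10

    @property
    def n_operators(self) -> int:
        return len(self.operator_efficiencies)


baseline = Scenario()
# A what-if scenario: same system, but with the pull-out mechanism repaired.
repaired = replace(baseline, pull_out_failure_rate=0.0)
```

Keeping the scenario immutable and deriving variants with `replace` makes it easy to compare a baseline against one changed attribute at a time.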

Basic Modeling Logic

BOX represents the default entity moving within the system through a network of operator objects (Figure 1) and is assigned a number of files upon entry. The data entry operator object is composed of three conveyors and a workstation. It models the arrival, selection, pick-up and barcode scanning of a box, and its return to the conveyor. Multiple instances of the data entry operator object are placed within the main model and connected to each other. In addition, the entry to and exit from the main conveying area are represented by two other conveyors (Figure 2).

Figure 1: Operator object
Figure 2: Main model

Below, we describe additional pieces of logic and the modelling approach used for each:

  • Logic1.  Boxes are scanned by operators given that they have capacity available and that the box has not been scanned before.
    Approach: As a box approaches the first node of the operator model, the attribute ModelEntity.Picture is checked. A value of 0 indicates that the box needs to be scanned, and it is sent to the input node of the data entry operator workstation; otherwise, it is sent to the output node. A box is marked as scanned (ModelEntity.Picture == 1) when it is set up at the data entry operator station.
  • Logic2. There exists a probability that, even if the conditions indicated in Logic1 are met, the boxes are not scanned (due to a data entry operator’s self-interest behavior).
    Approach: In the operator model, after ModelEntity.Picture is checked, a uniform random number between 0 and 1 is drawn for each box. The box is sent to the data entry operator's input node only if this number falls below a given threshold probability.
  • Logic3. Boxes fail to leave the conveyor when they should (due to system failure).
    Approach: Upon leaving the last operator instance, a decision is made, based on a pull-out probability, on whether a typed box leaves the conveyor or loops again.
  • Logic4. Some operators are more capable than others and take less time to type the same box.
    Approach: In the operator model, the total typing time is divided by an efficiency coefficient.
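The four pieces of logic above can be sketched in plain Python as follows. This is an illustrative sketch, not the SIMIO implementation: the `Box` class, the function names and the parameters `pick_probability`, `pull_out_probability` and `efficiency` are hypothetical, and the boolean `scanned` stands in for ModelEntity.Picture.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    n_files: int
    scanned: bool = False  # stands in for ModelEntity.Picture (0 = not scanned)


def operator_takes_box(box, operator_free, pick_probability, rng):
    """Logic1 + Logic2: take an unscanned box only when the operator has
    capacity, and even then only with probability pick_probability."""
    if box.scanned or not operator_free:
        return False
    return rng.random() < pick_probability


def typing_time(nominal_minutes, efficiency):
    """Logic4: total typing time divided by an efficiency coefficient."""
    return nominal_minutes / efficiency


def leaves_conveyor(box, pull_out_probability, rng):
    """Logic3: a typed box exits with probability pull_out_probability;
    otherwise it loops around the conveyor again."""
    return box.scanned and rng.random() < pull_out_probability


rng = random.Random(2016)  # seeded so runs are repeatable
box = Box(n_files=8)
if operator_takes_box(box, operator_free=True, pick_probability=0.9, rng=rng):
    box.scanned = True  # marked as scanned when set up at the station
    minutes = typing_time(nominal_minutes=4.0, efficiency=1.25)
```

With an efficiency coefficient above 1, a more capable operator finishes the same box in less time; a coefficient below 1 models a slower operator.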

Output Metrics

The model collects a range of output metrics, including: cumulative busy, idle and waiting time for each data entry operator; the overall number of files and boxes processed; the number of files and boxes processed by each data entry operator; and the number of boxes failing to leave the system.
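As an illustration of how these metrics relate to each other, the sketch below accumulates per-operator counters and derives the overall totals from them. The class `OperatorStats`, its field names and the sample numbers are assumptions for illustration and are not taken from the SIMIO model.

```python
from dataclasses import dataclass


@dataclass
class OperatorStats:
    """Hypothetical per-operator counters mirroring the listed metrics."""

    busy_min: float = 0.0
    idle_min: float = 0.0
    waiting_min: float = 0.0
    boxes: int = 0
    files: int = 0

    def record_box(self, n_files, typing_min):
        # One processed box adds its files and typing time to the totals.
        self.boxes += 1
        self.files += n_files
        self.busy_min += typing_min


def overall_totals(stats):
    """Overall (boxes, files) processed across all operators."""
    return sum(s.boxes for s in stats), sum(s.files for s in stats)


operators = [OperatorStats(), OperatorStats()]
operators[0].record_box(n_files=8, typing_min=4.0)
operators[1].record_box(n_files=3, typing_min=1.5)
operators[0].record_box(n_files=2, typing_min=1.0)
total_boxes, total_files = overall_totals(operators)
```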


The model has supported the implementation of several changes. An additional, faster engine and an Arduino PLC were installed to fix the pull-out system, which now works at close to 100% efficiency. In addition, the bonus scheme was updated to level out the differences between boxes and eliminate the data entry operators' urge to wait for a "better" box. The loss of marginal efficiency with each additional data entry operator is being investigated in order to support the decision on opening a second conveyor during peak season.