A Laboratory Manual for Simulation with GPSS/H for Computer Science Majors: an Integrated Approach Ruth Silverman, University of the District of Columbia Supported by NSF Grant #DUE-9554863, Simulation for Computer Science Majors, to Ruth Silverman, PI, University of the District of Columbia. November 18, 1996, revised and edited October 1, 1997. Consultants: Maurice F. Aburdene, Bucknell University Richard Elnicki, University of Florida Richard Fujimoto, Georgia Institute of Technology Thomas J. Schriber, University of Michigan Copyright 1997, all rights reserved. Table of Contents Section 0. Introduction to Laboratory Manual Section 1. Getting started with GPSS/H simulation software 1.0 Introduction to GPSS/H. 1.1 Software installation 1.2 Setting up a program file 1.3 Running a program file. Section 2. Primer on computer systems and communication network simulation problems 2.0 Background 2.1 Queues, Probability and Statistics 2.2 Operating Systems and CPU's 2.3 Networks and Communications 2.31 Computer Networks 2.32 Performance Prediction and Analysis Section 3. Exercises using simulation 3.0 Introductory problems (using GENERATE, TERMINATE and ADVANCE blocks) 3.1 Problems using single queues (also using SEIZE and RELEASE blocks) 3.2 Problems in simulation theory or concepts 3.3 Simulation problems using GPSS/H including queues (making substantial use of material from Schriber beyond Chapter 6) 3.4 Problems on communications networks requiring extensive theory 4. Instructor's Aids 4.0 Guidelines for developing 4.01 Team Projects 4.02 Supplementary Exercises 4.1 Course Outline 4.2 UDC Syllabus 4.3 Preliminary Student Questionnaire 4.4 Final Student Questionnaire 5. Bibliography and References 6. Resources Appendix 1. RUN Software 0. Introduction to the laboratory manual Introduction to the project The primary author (Ruth Silverman) has revised and restructured an existing simulation course designed for senior computer science majors. This project has been undertaken at UDC (the University of the District of Columbia). It is supported by the NSF (National Science Foundation) under Grant #DUE-9554863 for Course and Curriculum Development (CCD) from its Department of Undergraduate Education entitled "Simulation for Computer Science Majors". The project consultants are Maurice Aburdene, Richard Elnicki, Richard Fujimoto, and Thomas Schriber. This manual contains: 1) instructions in the use of GPSS/H software; 2) an integrated set of graded laboratory exercises using GPSS/H; 3) related computer science reference materials; and 4) instructor's aids, including syllabi, course outlines keyed to the manual and the text, and student questionnaires to evaluate the course. The author taught a pilot course in the Fall of 1996 and 1997 using this manual together with a single text [Schriber]. The lectures and lab manual emphasized computer science-oriented aspects and applications of simulation, with lecture materials closely integrated with the lab exercises. The author presented a paper describing this project at the Winter Simulation Conference of 1996 [Silverman1], and a second follow-up paper to that Conference [Silverman2] has been accepted. She also disseminated the preliminary materials from this project on the internet to newsgroups and mailing lists, including comp.simulation, SIGCSE, and facsup. Initial Course The initial version of the course was developed by the author, and was taught at UDC for several years as an upper division elective. The author selected GPSS/H [Wolverine] for the simulation lab exercises. The manufacturer of this widely used simulation software provides free site license to educational institutions. The textbook used with the GPSS/H software [Schriber] includes diskettes for IBM-compatible machines with the book. This book is not aimed at undergraduates in computer science; ROSS was used for the more theoretical aspects of simulation. Some instructors might prefer to use notes on probability, statistics and modeling to obviate the need of a second text. The pair of books selected encompassed the material for the initial course. However, the textual and laboratory emphasis on computer science material needed significant expansion. Curriculum Needs [COMPUTER CURRICULA 91] identifies special needs in the computer science curricula, including: the need for a capstone course; the need for more hands-on laboratory experience; the need to integrate abstraction, theory and practice at an upper division level in the curriculum; and the need for learning by doing. The report also defines knowledge units and subject areas of the field of computing. Computer Science graduates need experience in the use of simulation as a tool in solving problems in these subject areas, e.g. networks, operating systems, performance analysis and scheduling. Also needed are instructor's materials for such a course. The manual and the simulation course have been designed to respond to all these needs. The optimized course is a strong pedagogical tool because it requires that students integrate their previous studies while learning simulation, and applying statistical methods. Simulation texts are made more useful for computer science majors when used with this laboratory manual. The author believes that a simulation course customized for computer science using the instructional materials in this manual can be a model for the type of capstone course that might be required for all computer science majors. Use of the manual Section 1: Getting started with GPSS/H Because this manual is based on a low-cost, widely used and easily available commercial software, it is of particular value to students and educators in underfunded colleges. Section 2: Primer on computer system and communication network simulation This section enables students (or faculty) to develop simulation models and run them on simulation software. This section also includes sufficient material on probability and statistics, followed by queues, so that juniors can take the course as well as seniors. Section 3: Exercises and experiments This section is graded according to the skills required. In the more advanced exercises the student will: clearly define and analyze the problem, developing a "lumped model" or abstraction; solve the abstract problem; implement the solution by writing the GPSS/H program file and running the simulation; and answer questions about the simulation. Some of the more advanced exercises are designed for use by groups of two or three students. Section 4: Instructor's aids These are: a syllabus and a detailed outline, keyed to the text and the laboratory manual, as well as questionnaires for students to answer at the beginning and end of the semester course. These questionnaires, to be used to establish student scientific development over the period of the course, were somewhat loosely modeled on those used by Abt Associates in connection with the NSF REU (Research Experiences for Undergraduates) Program [NSF1, NSF2]. 1. Getting started with GPSS/H Simulation Software 1.0 Introduction to GPSS/H The exercises are arranged in lessons, based on how much knowledge of GPSS/H is required. The exercises are of two types: 1. The word problem and model file are given. The student is to type in the model file, run it in GPSS, and answer questions about the simulation output, including evaluation of the performance of the computer system. 2. A word problem is given. The student is provided with some suggestions, such as what Blocks to use, and is to develop the model file, run it, and answer questions about the simulation output. (Unless otherwise specified, references are to "An Introduction to Simulation using GPSS/H)" [Schriber]. 1.0 Software Installation Instructions on installation and use of GPSS/H software References: See Fishwick, Simulation Modeling, Design and Execution, pp 413-418, for a Minigpss compiler, a version of GPSS which can be run in C on a Unix system. In what follows, it is assumed the student will be using GPSS on a personal computer with a recent version of DOS and Windows. Before starting the exercises, the student must install the student copy of the GPSS/H software. ( See Schriber p. 374 or other directions.) Schriber's Section A.2 describes the steps necessary to uncompress the software included on the diskette in his text, move required files, and run a test GPSS/H model. See p 42 for the recommended format of GPSS/H statements. Equivalent installation procedures are included in "Getting Started with GPSS/H" [Banks, et al] and in [Fishwick]. 1.2 Setting up a program file Create a model file using the DOS editor. Type EDIT at the command prompt in the GPSSH directory. (* EDLIN within DOS can also be used*) It is recommended that the following fixed column format be used. Col. 1 * (only used for a comment line) Col. 2-9 Label Col. 10 blank Col. 11-20 Operation ( Control Statement or Block Statement) Col. 21 blank Col. 22-72 Operands Read Chapter 2 and Appendix B of Schriber or other text concerning control block usage, including parameters (or operands) and how to run a GPSSH model file. 1.3 Running a program file Here are some abbreviated directions. See your GPSS reference material for further instructions. To initiate an interactive mode run: Assuming you are in the subdirectory GPSSH, and your file is \fn.gps: Then type gpssh \fn.gps tvtnw (ret) In response to ready:! type step (ret). The simulation will then proceed until a first block execution has taken place. The software will then write a message and again say ready:! For further possibilities and a complete example see Schriber Section 2.22, pp 47ff. A procedure to run a GPSS/H file is in Appendix 1. It gives the following options: 1. Execute the file 2. Edit the file 3. Print the output, or 4. Exit the run procedure. When using the RUN procedure, the execution command shown above to initiate an interactive mode run, namely gpssh fn.gps tvtnw (ret) is replaced by run fn tvtnw (ret) This RUN procedure will execute under any DOS (3.0 or higher) or a DOS shell under Windows 3.x, Windows 95 or NT. 2. Primer on computer systems and communications networks simulation problems 2.0 Background Computer system users, administrators, and designers usually have a goal of highest performance at lowest cost. Student analysis and simulation of system design trade - offs is good preparation for design and engineering decisions in real world jobs. In the context of computer systems, we might consider access to a processor, memory, or a printer, and contention for a network, processor or memory. The systems we will consider are primarily: CPUs, databases, and networks. We need a proper knowledge of both the techniques of simulation modeling and the simulated systems themselves. Consider the following scenario. You are the designer of a new switch for asynchronous transfer mode (ATM) networks, a new switching technology that has appeared on the marketplace in recent years. In order to help ensure the success of your product in this highly competitive field, it is important that you design the switch to yield the highest possible performance while maintaining a reasonable manufacturing cost. How much memory should be built into the switch? Should the memory be associated with incoming communication links to buffer messages as they arrive, or should it be associated with outgoing links to hold messages competing to use the same link? Moreover, what is the best organization of hardware components within the switch? These are but a few of the questions that you must answer in coming up with a design. The scenario described above is but one situation where computer simulation can be effectively used. In addition to its use as a tool to better understand and optimize performance and/or reliability of systems, simulation is also extensively used to verify the correctness of designs. Most if not all digital integrated circuits manufactured today are first extensively simulated before they are manufactured to identify and correct design errors. Simulation early in the design cycle is important because the cost to repair mistakes increases dramatically the later in the product life-cycle that the error is detected. Another important application of simulation is in developing virtual environments, e.g., for training. Analogous to the holodeck in the popular science-fiction television program Star Trek, simulations can generate dynamic environments with which users can interact ``as if they were really there.'' Such simulations are used extensively today to train military personnel for battlefield situations, at a fraction of the cost of running exercises involving real tanks, aircraft, etc. The focus of this manual is on the use of simulation as a tool to evaluate the performance of computer and communication systems. Simulation is one of three principal techniques commonly used today to evaluate systems. A second, pencil-and-paper approach is to use mathematical models to characterize the behavior of the system. This approach has the advantage that it is less time consuming and costly than simulation, however, it is often severely constrained because strong assumptions are usually required in order to be able to derive a mathematical solution of the model. Introductory treatment of an important category of analytic models, networks of queues, is included in this manual. A third approach is to develop a prototype implementation of the system under investigation, e.g., a hardware realization of a switch if one is examining the design of these network components. Prototype systems are designed to provide some flexibility in examining different design choices (e.g., different amounts of memory in a switch), but are still necessarily limited in the range of experiments that can be performed. This approach has the advantage that it provides a very realistic ``model'' for the system under investigation. However, in addition to being somewhat inflexible, it is the most costly among the three approaches. In many cases development of a full scale prototype may be impractical or dangerous, e.g., consider the design of a large factory. Actual measurement of a system in real time is generally far more expensive and time-consuming than simulation, and is beyond the scope of this manual. Simulation provides a useful middle ground between analytic modeling and prototypes that when used correctly, provides reasonably accurate models at a moderate expense. The simulation software used in this manual is GPSS/H, a common, widely used simulation package that has been in use in industry for many decades. Performance evaluation techniques include: measurement, simulation, and analytical modeling. We study queues as an example of analytical modeling, and simulation of a system using GPSS/H software on a personal computer. See Fishwick, pp 413-418, for a Minigpss compiler, a version of GPSS which can be run in C on a Unix system. Possible quantities to measure include: 1. number of instructions executed by a processor 2. degree of multi-programming on a time sharing system 3. response time of packets on a network 4. response time (time to service a request) 5. throughput - TPS (trans/second). The following outcomes can also be considered: 1. a computer is up or down 2. a packet reaches or doesn't reach the destination 3. a bit in a packet is affected by noise and may arrive in error. 2.1. Queues, Probability and Statistics The general material in these areas is included in appropriate textbooks. The student is urged to refer to the main course textbook, or other reference materials suggested by the instructor. What follows is the application of these subjects to computer systems. Why use queuing analysis? For example, if the load on a system is expected to increase, the designer may want to project the effect of a change in a design. Queues occur in many computer systems, for example in waiting lines of computer jobs awaiting service from a processor. Consider again the design of a switch in a data communication network. The switch contains some number of incoming and outgoing communication links that carry messages from and to host computers or other switches. The switch is responsible for routing incoming messages to the appropriate output link to (usually) move the message closer to its final destination. Suppose two messages simultaneously arrive on two different incoming links that must be forwarded on (the same) outgoing link. Assuming the two messages cannot simultaneously utilize the outgoing link, most switches will transmit one message on the outgoing link and buffer the other messages in the switch's internal memory until the link becomes available. Scenarios such as the above where multiple ``users'' (messages in the aforementioned example) compete for a shared resource (the outgoing communication link) and are queued until the resource becomes available are common in computer and communication systems. The amount of time the message must wait in a buffer (queue) waiting to use the link clearly affects the latency (delay) observed by the user in transmitting the message from one computer to another. Thus, systems such as this represent an important class of applications deserving consideration. An M/M/1 Queue is the most commonly used type of queue. It is used to model single-processor systems or to model individual devices in a computer system. An M/M/m queue is used to model multiprocessor systems or devices that have several identical servers, where all jobs waiting for these servers are kept in one queue. For each of the m servers, we stipulate a service rate and an arrival rate in jobs/time. The state of the system is the number n of jobs in the system. Queueing networks are an abstract representation of systems intended to capture the dynamic behavior of users competing for shared resources. As the name implies, a queueing network is simply a network of queues (sometimes called stations in the literature). Jobs dynamically move from one queue to another before they are removed from the system. A queue consists of the jobs waiting to use a resource. The system includes one or more servers and a pool of jobs waiting to use the server(s). Queueing network structures are also commonly used to model computer systems, as well as telecommunication networks. For example, a network system might be used to represent a central processing unit, its I/O buffering devices, peripheral equipment and jobs or transactions using these devices. The student is urged to read p 132 of Schriber or another text, to make sure he or she understands the distinction between queue, queueing system and resource. Some analytical solutions for D/D/1, M/M/1 and M/G/1 have been obtained. With other values of the parameters, these queueing problems have not been solved analytically. This situation motivates the use of simulation to tackle those problems not in the above category. An important question in any performance evaluation study is the nature of the workload driving the system. For example, in a simulation of a computer system, when are new programs (jobs) introduced to the system? How much time is required for the CPU to execute each one? How often does the program access an I/O device? Such details could be specified by an exact schedule indicating when jobs enter the system, what resources they access and for how long, etc. If the particular scenario captured by the schedule is one that often arises in the system being modeled, then this is an entirely appropriate choice. However, a simulation using such a description of the workload will only produce performance results for that particular scenario; this says very little about how the system will perform for a different scenario. Even a scenario that is only slightly different from the one that was examined may produce dramatically different results. Often, rather than providing a specific scenario, the workload is specified by one or more probability distributions. For example, one might flip a coin every 15 minutes during the operation of the system under investigation, and if the coin comes up heads, a new job is introduced into the system at that point in time. If the coin comes up tails, no job is inserted. In a computer simulation, ``flipping a coin'' can be implemented by generating a random number. If probability distributions are used to specify the workload, and proper statistical analysis techniques are applied to analyze the results of the simulation, the results apply to a much broader range of workloads than the approach using a specific scenario. Parameters that are typically used to characterize the behavior of a queue include (among others): 1. the average rate (jobs per second) or the equivalent, interarrival time and probability distribution (e.g., Poisson) that determines the arrival of successive jobs at the queue (this is determined by the network traffic load and distribution of traffic in a model for a telecommunications system), 2. the average time and probability distribution that determines the amount of time each job uses the server (determined by the size of the message and the bandwidth, i.e., number of bits per second that can be transmitted on the link), 3. the number of servers (the number of processors in a multiprocessor system for processing user jobs), the discipline used to select the next job to use the server (e.g., first-come-first-serve), and the size of the queue (often, particularly when analytic solutions are being developed, unlimited queue size is assumed). At any instant, the state of the queueing network includes some indication of the number of jobs at each queue of the network. The state may include other information depending on details of the model. A shorthand that is often used to capture the first three parameters listed above is X/Y/Z, where X indicates the arrival time distribution, Y indicates the service time distribution, and Z determines the number of servers. The most well known model is the so-called M/M/1 queue system where the arrival and service time distributions are Markov (more about this later) and there is one server. For example in the M/M/1 system, the interarrival times are exponentially distributed and the service times, too, are exponentially distributed. This queue is sometimes used to model the CPU in a single-processor computer system, or to model I/O devices (e.g., a disk). A D/D/1 system is deterministic, while a D/M/1 system is mixed. If little is known about the system, it might be denoted general (G/G/n). An M/M/m queue system may be used to model a multiple processor system where there are m identical CPUs, each of which can execute incoming programs. Jobs waiting to receive service are all placed into a single queue, independent of which CPU eventually services (executes) it. It cannot be overemphasized that when using a particular queue system (e.g., M/M/1) it is imperative that the appropriate parameters and distributions be used to characterize the behavior of the queue and how the system will be used. Serious design errors (leading to production of equipment with unacceptable performance) have resulted when analysts used inappropriate assumptions in developing the model for the system. In many cases, there is no general agreement concerning probabilistic workload models, forcing modelers to use a set of scenarios to characterize the workload. For example, detailed simulations of memory systems in high performance computers often use a set of benchmark programs to generate sequences of memory references as the workload because there are no widely accepted probabilistic models of program behavior. Workloads that do not include a random element are often referred to as deterministic, while workloads that use random elements are called stochastic. High server utilization may lead to long waiting times in queue and to long system residence times. System planners typically attempt to optimize systems. In a for-profit organization this would involve balancing the benefits of higher utilization, e.g. , lower average costs resulting in higher profits, with the costs of higher utilization, e.g., dissatisfied customers resulting in lower revenues and profits. Returning to the M/M/m queue, Markovian processes are important for two reasons: (1) this often captures the behavior of independent customer arrivals, and (2) exact mathematical solutions can be derived for many queueing network models assuming this distribution, whereas solution is often impossible once this assumption is relaxed. There is obviously little value in developing a mathematical model for a system that cannot be solved; simulation is often used to analyze such systems. If the arrival process is Markovian, this means the time between successive arrivals may be obtained by selecting a random number from an exponential distribution. The exponential distribution is important because it is frequently found in reality and has the ``memoryless'' property. This means the amount of time until the next job arrival does not depend on the amount of time that has elapsed since the last arrival. This would be an appropriate assumption if each job represents say a user walking up to an automatic teller machine and generating a transaction on a bank's database because the behavior of the user is typically independent of the last user to walk up to the teller machine. Of course, one could argue that in a real system, customers will be less inclined to get in line if there are already many individuals waiting to use the machine. If this is the case, the time between new arrivals should increase if the number of jobs waiting for service increases. In this case, the Markovian assumption breaks down because the time between customer arrivals depends on the state of the system, and is therefore not memoryless. One could argue that the exponential distribution is not an appropriate model for the arrival distribution. A pragmatic approach to this problem would be to specify a Markovian distribution for arrivals, but include the caveat that simulation results should only be considered valid for a ``light to moderate'' rate of customer arrivals (where the number of customers waiting to use the machine does not affect a new customer's behavior). Mean time in system is the sum of the mean waiting time and the mean service time. One can assume that arrivals are random, while processing time is deterministic. (Hint-- use ADVANCE block in GPSS/H, with operand B = 0.) Here are some useful measures of performance: - Mean waiting time in queue - Mean waiting time in system - Mean number waiting for service - Mean number in system - 90th percentile of time in system Find: mean number of messages waiting mean waiting time in queue mean waiting time for delayed messages mean number of messages in system Several probabilistic distributions will be used throughout the exercises described in this manual. Consider flipping a coin n times, where the coin has a probability of p for coming up heads. Count the number of times the coin comes up heads. The number of heads that are observed is said to follow a binomial distribution. This might be an appropriate distribution to determine the number of CPUs that are available in a multiprocessor system, where there are a total of n CPUs, and p is the probability each CPU is available (thus each coin flip indicates whether or not a particular CPU is available). Here, an important assumption is that the failure of one CPU does not affect the likelihood that another CPU will fail. This would not be the case if one CPU generates excessively high voltages on the system bus that causes others to fail, or if a single lightning strike causes many CPUs to fail simultaneously. A binomial distribution can be used for: 1. number of processors up in a multiprocessing system 2. number of packets that reach destination without loss 3. number of items in a batch with certain characteristics A majority of queuing models used in computer system performance analysis assume exponential interarrival times and exponential service times. Exponential distributions are used extensively in queuing models. An exponential distribution is memoryless. It can be used for: 1. time between successive request arrivals to a device 2. time between failures of a device Poisson distribution can be used for: 1. number of requests to a server in a given time interval 2. number of component failures/time 3. number of queries to data bases/time in seconds A Poisson arrival rate distribution corresponds to a exponential interarrival time distribution. The Poisson distribution is related, and is often used to characterize the number of arrivals from a large number of independent sources over a given interval. For instance, the number of transactions issued to a large banking system from automated tellers or the number of integrated circuits that fail in a large system over a certain time interval. In queueing systems, an arrival process is said to be Poisson if the times between subsequent arrivals are independent and follow an exponential distribution. Another useful distribution is the Uniform distribution (continuous). The Uniform distribution can be used for: 1. Distribution between the source and destination of messages on a network. 2. seek time on a disk Uniform distribution (discrete) 1. track number for seeks on a disk 2. I/O device number selected for next I/O 3. Source and destination node for next packet on the network The Normal distribution is also used. A uniform distribution involves selecting one choice from among several, where each possible choice is as likely to be selected as any other. In the continuous uniform distribution, the number of choices is infinite, e.g., selecting a number between 0.0 and 1.0. In the discrete uniform distribution, the number of choices is finite, e.g., select a card from a deck of playing cards. The continuous distribution might be used in selecting the seek time of a disk (time to move the disk head to the track to be read). This assumes there is no ``locality of reference'' on the disk, i.e., successive references are independent, and are not likely to be clustered together. This may or may not be the case in practice. For example, the blocks making up a large file may be all written at the same time, so one would expect them to be clustered together on the disk. Moreover, the operating system may attempt to cluster blocks that it expects will be referenced together in close physical proximity on the disk to minimize seek times. A discrete uniform distribution might be used to select the destination processor to receive messages in a computer network. The implicit assumption here is that there are no ``favorites'', i.e., each destination is equally likely to be selected. If there are favorites (e.g., a file server might be expected to receive a higher proportion of messages than a user workstation) or if there is a dependence between messages (e.g., sending a large file might involve sending a sequence of messages to the same destination) then this model breaks down. >From the above discussion it should be clear that the selection of distributions, or whether or not to use a stochastic workload at all, is a non-trivial matter. Any selection is almost certainly going to be a compromise that captures certain types of usages of the system, but not others. As a developer and user of simulation tools, it is the responsibility of the modeler to select appropriate workload models and distributions according to the envisioned usage of the system, and to clearly document these assumptions in all reports describing results of the simulation study. As a consumer of simulation results, it is the reader's responsibility to take all results with a grain of salt, and understand the underlying assumptions and caveats (which may or may not be adequately documented) that speak to the applicability of the results. Our use of GPSS/H will be primarily for problems in discrete event simulation. In discrete event simulation, the simulation clock is event driven. References: Fishwick has a good bibliography pp 429ff. Stallings has full, up-to-date coverage of the theory of queues-single and multiserver. See Stallings or Jain or other text for well-known formulas for queue parameters such as traffic intensity in terms of the mean service time and mean interarrival time. 2.2. Operating Systems and CPUs Much of this manual is concerned with modeling computer systems. These include the central processing unit (CPU), memory, I/O devices , and peripheral devices such as disks, printers and network connections. Computer simulation is used extensively to evaluate computer systems, and is perhaps the principal method used to evaluate these systems. CPU simulations are used to evaluate machine instructions and alternate designs of the CPU itself. Simulations are used extensively to evaluate the operation of memory systems at all levels (cache memory, main memory, secondary or tertiary storage). At a higher level, the operating system is responsible for managing the resources in the computing systems. Typical functions include determining which program should execute on the CPU next, scheduling I/O operations, and evaluating different approaches to implementing virtual memory systems. Database management systems are often modeled to evaluate their performance (number of transactions that are performed per second). For the purposes of the exercises described in this manual, the computing system will be modeled as a network of queues. Metrics of interest typically include the time required to complete the program (time from when the job enters the queueing network until when it departs), sometimes referred to as the system residence time, and the number of programs processed through the system per unit time (system throughput). Students can simulate the entire operating system of a computer, or a portion, and also can simulate various implementation strategies. Some functions of an operating system are: 1. to manage the operation of the CPU 2. to control I/O, and 3. to control storage Portions that can be simulated include: device management such as secondary storage, I/O peripherals, , e.g. tape or disk drive, keyboard, printer or display. Memory Management: virtual paging systems can study alternative allocation schemes. For example, schemes to swap part of programs and data between memory and secondary storage. An overload can result in slower response time. File management: saves and retrieves files A simulation problem might require the analyst to simulate the workload and print the throughput. The workload can be simulated by using input from a random number generator. A sample output might include such job statistics as: id, (identification), time job entered system, job turnaround time, job process time, total job waiting time , number I/O operations, and amount of memory requested by the job. 2.3 Networks and Communications 2.31 Computer Networks Advances in fiber optics technology have revolutionized the telecommunications industry, analogous to the way advances in integrated circuits revolutionized the computer industry in the 1970's and 1980's. Historically, networking technology has evolved from two separate communities: the telephone industry providing voice communications and the data communication industry supporting communications between computers (remote logins, file transfers, electronic mail, etc.). Today, the distinctions between voice and data communications have become blurred, as highlighted by the increased number of multi-media applications that combine voice, data, still images, and video into a single software package. These trends have given rise to so-called Broadband Integrated Services Digital Networks (B-ISDNs) where a single networking infrastructure is used to support all of these diverse types of communications. Networks are sometimes classified by their geographic extent. Local area networks (LANs) are limited to a small collection of buildings in close physical proximity, e.g., a college campus. Metropolitan area networks (MANs) cover a single city and its surrounding suburban areas. Wide area networks (WANs) cover wider areas, and may include sites across the globe. An important challenge in B-ISDN networks is satisfying the requirements of different types of traffic within a single network, e.g., voice and video traffic require low, predictable delays in transmitting information from one location to another or the result may be a distorted voice or videos received at the destination. Some amount of data loss can be tolerated because such losses may not be perceptible to the human receiving the signal. On the other hand, data transmissions (e-mail, file transfers) are not as sensitive to timing constraints, but are usually very sensitive to lost data. There are specialized commercial services available in: 1) methods of overload and congestion control in broadband networks; 2) methods of telecommunications network topological design; and 3) techniques of telecommunications system simulation and modeling. The problems in this manual are simplified versions of the real- life problems facing designers of networks. The unit of data transmitted over the network varies from one networking technology to another. In datagram networks, the unit of data that is transmitted is an entire message (subject to maximum size constraints). The size of the data unit thus varies from one message to another. Another approach is to place each message into fixed sized packets, and transmit packets through the network. Messages larger than a single packet must be first divided into packets before they are sent into the network. Packets may be routed separately through the network, so they may not arrive at the destination in the same order that they were sent. The packets must be reassembled at the destination before the message is delivered to the user. Similarly, a protocol is required to retransmit lost packets. The physical medium used to transmit information within the network may be point-to-point links or a broadcast medium. In a point-to-point link there is a dedicated physical connection (e.g., a wire or fiber optic cable) between each pair of components (switches, computers, I/O devices, etc.). Networks based on these types of links are referred to as point-to-point networks. If the connection supports communications in only one direction or allows simultaneous transmission of data in both directions, there is no contention to access the link. If the link supports communications in both directions, but only in one direction at a time, a protocol is required to ``turn the link around'' to allow both parties to use the link. A broadcast medium allows multiple components to use the communication medium. A protocol is required to prevent transmitters from interfering with each other. This is referred to as the medium access control (MAC) protocol. For example, a widely used LAN technology based on broadcast medium is Ethernet. Here, all components are connected to a single wire (e.g., a co-axial cable). The MAC protocol uses a few simple rules. A transmitter does not transmit data until the wire is not being used by another transmitter. If two transmitters simultaneously begin to transmit data, each transmitter detects such a collision by monitoring the wire and observing that the data on the wire does not match what it is sending. When a collision is detected, the transmitter stops transmitting, waits a random amount of time, and retransmits the message. A similar protocol is often used to control access in wireless communication networks. Another widely-used LAN technology is the token ring. A token ring is a point-to-point network where the network is configured as a ring, with data transmitted in one direction around the ring. The ring is similar to a conveyor belt that carries data sequentially from one computer to another on the ring. The conveyor belt is divided into ``slots'' where each slot can hold a single fixed-sized block (e.g., a packet) of data. Each computer examines the data passing by on the conveyor belt. If the destination address in the data matches that of the computer, the data is removed from the belt, and the slot is marked as empty. If the address does not match, the data continues along the conveyor belt. To transmit data, the computer waits for an empty slot, and upon finding one, fills that slot with the data to be transmitted. The network simulation exercises in this manual include examination of both simple systems that include a network, and data transmission within the network itself. Here, a general purpose simulation language is used to develop the models. Although beyond the scope of this manual, commercial simulation packages do exist that are designed specifically to model networks. These packages typically include models for different networking protocols and technologies currently being used, and can substantially reduce the time to develop a simulation model for a network. 2.32 Performance Prediction and Analysis See Stallings for other questions relating to communications lines whose answers are important to designers of networks, such as: 1. what happens to terminal response time when line utilization goes up? 2. does response time change if both line speed and the number of terminals on line are doubled? 3. how many terminals are needed in an on-line inquiry center, and how much idle time will the operators have? Designing a network to satisfy differing requirements is a challenging task, and simulation is an indispensable tool that is widely used for this purpose. When the objective is to understand the performance of a distributed computing system that includes the network as one component (as well as computers, printers, and other I/O devices), the network may be modeled as an aggregated component, perhaps a single queue. A typical question that might be asked is how will the average delay to perform a file transfer increase if five new computers (creating additional traffic on the network) are added to the system. On the other hand, if the objective is to understand and optimize the performance of internal components within the network itself (e.g., switches), the network may be modeled as complex queueing network, perhaps containing hundreds or even thousands of queues. Typical questions a network designer might ask include: how should congestion in the network be controlled? How much buffer space should be provided in each switch? How should priorities be assigned to different types of traffic utilizing the network? Still another set of questions might be asked by the individuals configuring a network for a specific site, e.g., a company or city. What network topology should be used? How much bandwidth should be allocated to individual links? Typical questions and problems designers might consider include: contention - access to a network; performance analysis - network response to different packet distribution under different service schemes. To resolve these issues, the designer will want 1) to determine: waiting time, utilization end-to-end transit time; and 2) to consider tradeoffs: between quantum length and quantum switching, overhead and response Here is a sample design problem. A number of terminals, pc's and workstations are used on a 4 mega bps token ring LAN. An increase in load is expected. The modeler can model analytically or simulate or wait and see. The last may be bad, because it might result in dissatisfied customers. In some of the problems we consider, there will be end systems interconnected by intermediate systems. We will assume infinite buffering capacity in the intermediate systems. When packets enter a system, there are several different possible outcomes. The packets may be delivered in order, delivered out of order, duplicated or lost. If the packets are delivered in order, useful metrics to consider when evaluating the outcome are: response time, throughput and processor time per packet. Response time can have several different definitions. A simplified definition might be the time from the user's request to the system's response. This definition ignores the time to input the request and to output the response. In a time-sharing system for interactive users, the definition of response time might be from the end of the input to the end of the output, while the corresponding definition for batch systems might be from the time of submission of the request to the completion of the output. Two important metrics that are often used to characterize network performance (or for that matter, system performance in general) are latency and throughput. Latency means the delay associated with performing an action, e.g., the time that elapses from when the first bit of a message is transmitted into the network until the time the last bit appears at its final destination. The exact definition depends on the objectives of the simulation study. Latency is often referred to as response time, particularly when discussing system performance. A simple definition for response time is the time from when the user requests the operation be performed, until the time the system produces a response, e.g., the time from typing return on a file transfer request until the entire file has been received. Throughput is usually defined as the rate in requests per unit of time at which requests can be serviced by the system. Interactive throughput might be defined in requests per second, while batch throughput could be defined in jobs per second. Throughput for a CPU is usually given in millions of instructions per second (MIPS) or millions of floating point operations per second (MFLOPS). The throughput for a transaction processing system is usually given in TPS, i.e., transactions per second. Throughput is usually defined as the number of user requests that can be satisfied per unit time. In a communication network, this may refer to the number of bits (in total or for each user) of information that can be transmitted through the network per second. When examining the performance of a system, throughput may be characterized as the number of jobs (say print jobs) that can be performed per second. It is important to keep in mind that latency and throughput are distinct performance metrics. By analogy, consider the pipe that transmits water to your home. Latency refers to the time required for a drop of water to reach your house from the pumping station, and depends (among other factors) on the distance between the pumping station and your house. Throughput indicates the amount of water that flows out of your faucet each second, and depends on the size (cross section) of the pipe. High latency does not necessarily imply low throughput. For example, your house may be very far from the pumping station, but the pipe from the station to your house may be very large. Similarly, networks using satellites may provide high latency, high bandwidth, and high throughput. A third, useful, metric that is frequently used is the utilization of a resource. The utilization of a resource is usually defined as the fraction of the time it is busy servicing requests, e.g., the fraction of time that a printer is busy printing jobs. The remainder of the time is called idle time. For processors, we can define utilization as the ratio of busy time to idle time. For memory, only a fraction of the resource may be used at a given time, so we define utilization as the average fraction used over an interval. Clearly, utilization must be between 0.0 ( completely idle) and 1.0 (always busy). This metric can be used to quantitatively identify bottlenecks in the system that limit performance. Some resources may not be completely used at any instant. For example, not all of the memory in a computer may be used at one time. In this case, utilization is defined as the average fraction of the resource that is used over an interval of time, e.g., if a memory is utilized 50%, on average, half of the computer's memory is being used and the rest is available for use by other programs. Other frequently used terms are reliability, expressed in MTBE (mean time between errors) and availability expressed as MTBF (mean time between failures). Simulation is often used to evaluate the reliability of a system. MTBF indicates the average time between failures of a component or system. For a complex such as a network, an important question is what constitutes a failure, because some portions of the network may fail while other portions continue to operate, e.g., a phone customer in Chicago may not notice a failure of the portion of the phone system in Los Angeles. If the goal is to provide service to a customer, then MTBF might be defined as the average time between failures as perceived by a single customer. Similarly, data transmitted through networks may be corrupted during transmission, e.g., external noise such as a lightning strike near a transmission line may cause the data being transmitted to be changed, or lost entirely. MTBE is often used to characterize performance with respect to this aspect. An example of cost per performance unit for a comparison of transaction processing systems might be calculated as dollars per TPS. Congestion control might be studied in terms of packet sizes and service times.  Section 3. Exercises using simulation 3.0 Introductory exercises using the GENERATE, TERMINATE and ADVANCE blocks only (refer to Chapters 2-6 Schriber) Exercises Solutions are included for many of the problems. Exercise #1. References : Schriber, An Introduction to Simulation, 2.1 to 2.18 Banks, Carson, and Sy, Getting Started with GPSS/H, Chapters 2 and parts of Chapter 3. Objective : The objective of this lab exercise is to use an editor or word processor to create a GPSSH model file, run the simulation, and answer questions on the automatically produced postsimulation report. Procedure: Type in the file listed below and run it. New Blocks: GENERATE, TERMINATE New Control statements: SIMULATE, START, END Additional statement type : Comment statement * Model File: SIMULATE * The SIMULATE control statement indicates to the compiler to * compile and run the support system to run the simulation model GENERATE 2 * Create an arrival every two time units beginning at time 2 *units TERMINATE 1 * Remove each arrival,one by one * START 5 * * Simulate until 5 arrivals have been removed * END indicates to the compiler the end of model file * END Obtain and submit your postsimulation report. Questions 1. At what simulated time was the postsimulation report produced? 2. What is the number of the GENERATE BLOCK ? 3. What is the number of the TERMINATE BLOCK ? 4. What is the total number of arrivals? Hint: Look at the TOTAL count at the GENERATE BLOCK. 5. What is the CURRENT number of arrivals at each BLOCK ? 6. Modify and execute a model to determine the time of the 13th arrival. Exercise #2. References: Same as Exercise #1. Also, Schriber 13.6, Banks et. al. 13.5.7 Emphasis on options of the GENERATE Block. Objective: Same objectives as Exercise #1 and focus on random time between arrivals and using additional features of the GENERATE BLOCK. Procedure : Type in the file listed below and run it. New: Additional Features of GENERATE Block Model File: SIMULATE * The SIMULATE control statement indicates to the support system * compiler to compile and run the simulation model GENERATE RVEXPO(1,12),,0 * Create arrivals randomly with mean time between arrivals 12 * units using random number generator 1 beginning at time 0 units * TERMINATE 1 * Remove each arrival, one by one * START 1000 * * Simulate until 1000 arrivals have been removed * * END indicates to the compiler the end of model file * END Obtain your postsimulation report. Questions 1. At what simulated time was the postsimulation report produced? 2. What is the number of the GENERATE BLOCK ? 3. What is the number of the TERMINATE BLOCK ? 4. What is the total number of arrivals? Hint: Look at the TOTAL count at the GENERATE BLOCK. 5. What is the CURRENT number of arrivals at each BLOCK ? 6. Repeat the simulation using random number generator 2 and answer questions 1-5. (Hint: use RVEXPO (2, 12)). What are the differences if any between the two simulations ? Exercise #3. Objective: Develop and test a simulation model for the following situation using the concepts of Exercises #1 and #2: Assume students arrive at a workstation every 13.9 minutes starting at time 7. At what time will the 11th student arrive ? Build and execute a model to verify your answer. Hint: What should the START statement be ? Exercise #4. Objective: Become familiar with using two GENERATE and TERMINATE Blocks Assume that juniors arrive at a computer room every 15 minutes, and seniors randomly as in exercise 2 with mean time between arrivals 19 minutes. How much time is required for a total of 25 arrivals ? Which random number generator did you use? Did you specify the first arrival time of the juniors and/or the seniors? Solution: SIMULATE GENERATE 15 TERMINATE 1 GENERATE RVEXPO(2,19) TERMINATE 1 START 25 END Exercise #5. Objective : Specify time-based condition to stop the simulation Procedure: Consider a modified model file for Exercise #4. Here, we use simulated time to determine the end of the simulation run. SIMULATE * GENERATE 15 TERMINATE 0 GENERATE RVEXPO(2,19) TERMINATE 0 * GENERATE 480 TERMINATE 1 * START 1 END Questions 1. What is the purpose of the GENERATE 480 block ? 2. At what simulated time do you predict the simulation would stop ? 3. Run the model and compare the results with your answer to Question 2. 4. How many juniors arrive before the simulation is stopped ? 5. How many seniors arrive until the simulation is stopped ? 6. Modify and run the model until exactly 25 seniors have arrived. 7. How long did take for the 25 seniors to arrive ? Exercise #6. References : Schriber Chapter 3, sections 3.1 to 3.2; Banks et al, Chapter 3.4.2 New BLOCK : ADVANCE In a point-to-point computer communication network messages are sent every 19 seconds starting at time 0. The transmission time to the destination is fixed at 12 seconds. Simulate and run until 37 messages have been sent and received. Solution SIMULATE GENERATE 19,,0 * The ADVANCE block applies a simulated time delay to a * message(transmission time) ADVANCE 12 TERMINATE 1 * START 37 END Questions 1. When will the 37th message be received ? Hint: What is the ending time of the simulation ? 2. How many messages were created by the GENERATE block ? Hint: What is the TOTAL COUNT at the GENERATE block ? 3. What is the CURRENT COUNT at the GENERATE block ? 4. If the simulation had not stopped then when would the 38th message be sent ? 5. What is the maximum number of messages sent simultaneously ? Why ? 6. There are several ways to modify the model to obtain simultaneous transmissions. Modify and run the model to demonstrate simultaneous transmissions. Hint: Simultaneous transmissions occur when two or more messages are in the ADVANCE block. 7. Modify the GENERATE 19,,0 Block to GENERATE RVEXPO(1,19),,0 and the ADVANCE 12 to ADVANCE RVEXPO(2,12). Repeat questions 1, 2 and 3. Can you determine the answers to questions 4 and 5 ? Why not ? 3.1 Problems using single queues (also using SEIZE and RELEASE blocks) New Blocks: SEIZE, RELEASE New Concept: Average Utilization. Reference: Schriber Chapter 6, especially pps. 132 and 136. Fig 6.5, Facility Report Exercise #7. Programs are submitted for execution from a computer every 15 to 25 sec. The execution time of a program is 4 to 14 sec. 7a. Run this sample model file. SIMULATE GENERATE 20,5 PROGRAMS ARRIVE AT COMPUTER ADVANCE 0 SEIZE CPU PROGRAM GAINS CONTROL OF CPU ADVANCE 9,5 PROGRAM IS EXECUTED RELEASE CPU IS FINISHED WITH PROGRAM TERMINATE 1 PROGRAM IS REMOVED FROM SYSTEM START 100 SIMULATE FOR 100 PROGRAMS * EXECUTED END PHYSICAL END OF GPSS/H PROGRAM By studying the output from the GPSS/H simulator program, the utilization of the cpu, and the total elapsed time can be determined. Note: this is a simplified problem, and the program file has a lack of detail. Other portions of systems that can be simulated include: multiprogramming scheduling methods, page fetch schemes, page replacement techniques FIFO, LRU (least recently used)), etc. The next file is a Model File for students queuing to see their faculty advisor for registration. (a modification of a program written by student Tariquil Khan). The laboratory exercise should include hand calculation of waiting time, service time, and total time for each student, as well as average waiting time and average service time for a group of 10 students. Note: the xact is the student, and the server is the professor. The time unit in the simulation model is minutes. 7b. Run this sample model file and answer questions 1-5. SIMULATE GENERATE 30,4.5,5,10 CREATE STUDENT ARRIVALS ADVANCE 0 SEIZE PROF CONSULTATION BEGINS IF * PROFESSOR IS FREE ADVANCE 15 CONSULTING TIME RELEASE PROF CONSULTING PERIOD ENDS TERMINATE 1 STUDENT LEAVES START 10 SIMULATE FOR 10 STUDENTS * END Refer to the model, and answer: 1. What time does the first student arrive for consultation? 2. How many students arrive per hour? 3. What is the average consulting time? 4. How long is the professor idle before a student arrives for consultation? 5. What is the longest interarrival time between students? NEW Blocks : SEIZE, RELEASE New Concept : single server models Reference : Schriber Chapters 6, 9 Exercise #8. Packets arrive at computer communication network node A randomly using a mean interarrival time of 5 milliseconds (use RVEXPO(3,2)) and are transmitted to the destination nodes. The message lengths are 256 bytes per message. Compute the average message delay in the network assuming that only one message can be transmitted at a time from node A. The line capacities are 56 kilobytes per second (kbps). If node A is busy transmitting a message then the node cannot receive a message until it has completed its transmission. Solution. SIMULATE * GENERATE RVEXPO(3,),,0 * Another message arrives at node A QUEUE ONE * SEIZE NODEA * Capture node for transmission DEPART ONE * A message has been transmitted ADVANCE 256./(56.*1000.) * RELEASE NODEA * Give up node A for another message TERMINATE 1 * START 1000 * END 3.2 Problems in simulation theory or concepts **think about this section** 3.2 contains theoretical and computational exercises. Many of these computations about queues can then be written as GPSS/H model files and simulated. Theoretical exercises Questions on networks. Exercise #9. What distributions would you use to model the time between successive arrivals at a computer network? (Answer: Any distribution is possible, including exponential.) The following problems are reprinted from Stallings, pp 767-770, with the permission of the author. Exercise #10. A computer has 10 leased telephone lines, and 10 remote terminals. The average transmission time of messages on line is 6 seconds. The standard deviation equals the mean. At peak time, the message rate reaches 2/min. 1. Find the average response time ignoring line overhead. 2. Calculate response time as a function of message load, and of utilization. Exercise #11. Messages arrive at a switching center. Assume the message length is exponentially distributed. Assume the arrival rate = 5/sec, the average message length = 144 characters, and the line speed = 9600 bps. Calculate the mean queueing time in the switching center, and how many messages are in the switching center, including those waiting for transmission and the one currently being transmitted, on the average. Also do the calculations for 90%, and 95% (Note to the instructor: the calculations are shown in Stallings pp767-770, and can be used in the lecture material). Exercise #12. During a 1-second observation period, 400 packets were serviced by a gateway whose CPU can service 200 pps. What was the utilization of the gateway CPU? Exercise #13. The throughput of a timesharing system was observed to be five jobs per second over a 10-minute observation period. If the average number of jobs in the system was four during this period, what was the average response time? Exercise #14. During a 10-second observation period, 40 requests were serviced by a file server. Each request requires two disk accesses. The average service time at the disk was 30 milliseconds. What was the average disk utilization during this period? Exercise #15. A distributed system has a print server with a printing speed of 60 pages per minute. The server was observed to print 500 pages over a 10-minute observation period. If each job prints five pages on the average, what was the job completion rate of the system? The next few problems on networks are based on Jain, pp 524-526. See Chapter 31 of Jain and the rest of pp 519-546 for analysis of a single queue, and for exercises and materials on computation of queue parameters. Exercise #16. On a network gateway, measurements show that the packets arrive at a mean rate of 125 packets per second (pps) and the gateway takes about 2 milliseconds to forward them. Using a M/M/1 model, analyze the gateway. What is the probability of buffer overflow if the gateway had only 13 buffers? How many buffers do we need to keep packet loss below one packet per million? Solution: Arrival rate = 125 pps, service rate = 1/0.002 = 500 pps. Gateway utilization = 0.25. Probability of n packets in gateway = (1-r)*r**n = 0.75*(0.25)**n. Mean number of packets in gateway = r/(1-r) = 0.25/0.75 = 0.33 Mean time spent in gateway = 1/500*(1-0.25) = 2.66 milliseconds. Probability of buffer overflow = P(more than 13 packets in gateway)= r**13 = 0.25**13 = 1.49*10**(-8) = (about) 15 packets per billion packets. To limit the probability of loss to less than 10**(-6), r**n <=10**(-6), or n > log(10**(-6))/log (0.25) = 9.96. We therefore need about 10 buffers. Problems on database systems. Exercise #17. The execution time of queries on a database is normally distributed with a mean of 5 seconds and a standard deviation of 1 second. Using normal tables, determine the following: 1. What is the probability of the execution time being more than 8 seconds? 2. What is the probability of the execution time being less than 6 seconds? 3. What percent of responses will take between 4 and 7 seconds? 4. What is the 95-percentile execution time? Exercise #18. The average response time on a database system is 3 seconds. During a 1-minute observation interval, the idle time on the system was measured to be 10 seconds. Using an M/M/1 model for the system, determine the following: 1. System utilization 2. Average service time per query 3. Number of queries completed during the observation interval 4. Average number of jobs in the system 5. Probability of number of jobs in the system being greater than 10 6. 90-percentile response time 7. 90-percentile waiting time 8. Also write a GPSS/H model file for this system, and run it. Operating Systems problem. Exercise #19. A storage system consists of three disk drives sharing a common queue. The average time to service an I/O request is 50 milliseconds. The I/O requests arrive to the storage system at the rate of 30 requests per second. Using an M/M/3 model for this system, determine the following: 1. Average disk drive utilization 2. Probability of the system being idle 3. Probability of queuing 4. Average number of jobs in the system 5. Average number of jobs waiting in the queue. 6. Also write a GPSS/H model file for this system, and run it. Time sharing problems. Exercise #20. A problem concerning a University Computer Center. Students use one of five terminals, with an average time of 20 minutes spent at a terminal. You may assume Poisson arrivals and exponentially distributed service times. An M/M/m queue with m = 5 is used to model the system. The arrival rate at the computer center is given as 1/6 per minute, and the service rate at 1/20 per minute. Questions. 1. What is the traffic intensity? 2. What is the probability that all terminals are idle? 3. What is the probability that all terminals are busy? 4. What is the average terminal utilization? 5. What is the average number of students in the Center? 6. What is the average number of students waiting in queue? 7. What is the mean and the variance of the time spent in the Center? 8. What is the 90-percentile of the waiting time? 9. Calculate the utilization, service time, number of queries and jobs 10. Calculate the probabilities of 90% response time, and 90% waiting time. Exercise #21. In the previous problem, consider the situation if the five terminals were in five separate locations on the campus, with a separate queue for each. Calculate the mean time in a terminal room, and compare to that in the previous example, when all five terminals were at the same location. Is the single-queue alternative better? Write model files for both the problems concerning a University time sharing system, and run them both. 3.3 Simulation problems on queues (making use of substantial material from Schriber in addition to Chapter 6.) The following exercises use more advanced statistical knowledge. References: Schriber Chapter 14; Banks et al 2.2.6 and 9.10, GPSSH manual A-19 New Concept : Transaction Id numbers or Standard Numerical Attributes (SNA). New Block : BPUTPIC The Control Statement PUTPIC produces customized output by modifying the display commands, including showing where in the output to place text, and directing whether output is to go to the screen or to a designated external file. BPUTPIC is the Block form of the Control Statement PUTPIC. Note that the line following BPUTPIC starts with a 0 in column 1 to format double spacing of output. See Exercise #25 for further explanation of use of BPUTPIC. Study the references for further details. Exercise #22. In our earlier point-to-point computer communication network ( see section 3.0 Exercise #6) messages are sent every 19 seconds starting at time 0. The transmission time to the destination is fixed at 12 seconds. Simulate and run until 5 messages have been sent and received. The BPUTPIC block is used in this model to report the times at which each message is sent and received. This allows the student to trace the times at which significant events occur during the simulation. SIMULATE * GENERATE 19,,0 * * BPUTPIC Block allows us to write information to an external file * or the screen * * XID1 is the message number and AC1 means the simulated time or * simulation clock time * BPUTPIC FILE=SYSPRINT,(XID1,AC1) * * Message Number * is being sent at time ***.* * ADVANCE 12 * BUTPIC FILE=SYSPRINT,(XID1,AC1) * * Message number * is being received at time ***.* * TERMINATE 1 * START 5 * END Questions 1. When was the transmission of the second message started and when was the transmission completed ? 2. When were messages 3,4, and 5 started and completed ? 3. What is the ending time of the simulation ? 4. How many messages were created by the GENERATE block ? 5. What is the TOTAL COUNT at the GENERATE block ? 6. If the simulation had not stopped then when would the 6th message be sent ? 7. What is the maximum number of messages sent simultaneously ? Why ? 8. There are several ways to modify the model to obtain simultaneous transmissions. Modify and run the model to demonstrate simultaneous transmissions. Hint: change the Block operands GENERATE and/or ADVANCE. 9. Modify the GENERATE 19,,0 Block to GENERATE RVEXPO(1,19),,0 and the ADVANCE 12 to ADVANCE RVEXPO(2,12). Repeat questions 1,2,3,4,5, and 7. 10. Modify and run the model of question 9 to determine the number of messages transmitted completely in an eight hour period. References: Chapters 9, 14 Schriber Exercise #23. Modify and run the model of the previous exercise to determine the number of messages transmitted in an eight hour period. Hint: Refer to Section 3.1, Exercise #5. Exercise #24. In a three node computer network, each computer has an attached printer. The first computer generates a print job every 41 seconds and it takes 23 seconds to print the job. The second and third computers generate print jobs every 51 and 67 seconds respectively and these print jobs take 13 and 11 seconds respectively. Develop and run a model file. Use BPUTPIC blocks to trace the model execution through the first 30 minutes of system operation. Hint: be sure to have the BPUTPIC blocks identify the source computer of the print jobs. Solution SIMULATE * GENERATE 41,,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 1 produces print job ** at time ***.** ADVANCE 23 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 1 finished print job ** at time ***.** TERMINATE 0 * GENERATE 51,,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 2 produces print job ** at time ***.** ADVANCE 13 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 2 finished print job ** at time ***.** TERMINATE 0 * GENERATE 67,,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 3 produces print job ** at time ***.** ADVANCE 11 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 3 finished print job ** at time ***.** TERMINATE 0 * GENERATE 30*60 * Thirty minutes expressed in seconds * TERMINATE 1 * START 1 * END Questions: 1. How many print jobs are generated in 30 minutes ? 2. What are the transaction ID numbers of print jobs for the first computer ? 3. At what time does the second computer generate its 4th job and at what time is the print job completed ? 4. At what time is the 6th print job of the third computer completed ? New Concept: Queues and statistics New Block: QUEUE and DEPART References: Schriber Chapters 9, 14; Banks et al Chapter 5 Exercise #25. In our earlier point-to-point computer communication network messages are now sent to node A randomly with mean interarrival rate of 19 seconds starting at time 0 using RVEXPO(1,19) as shown in the model file. The transmission time to the destination is fixed at 18 seconds. Simulate and run until 10 messages have been sent and received. The QUEUE and DEPART blocks are used to gather statistics about messages during the transmission process. Solution. SIMULATE * GENERATE RVEXPO(1,19),,0 * * BPUTPIC Block allows to write information to an external file or the screen * XID1 is the message number and AC1 means the simulated time or simulation clock time BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Message Number * is being sent at time ***.* * QUEUE ONTHEWAY * Another message is on the way ADVANCE 18 * DEPART ONTHEWAY * Another message has been transmitted BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Message number * is being received at time ***.* * TERMINATE 1 * START 10 * END Questions 1. What is the maximum number of messages that needed to be transmitted from node A simultaneously ? Hint: what was the MAXIMUM CONTENT of the ONTHEWAY queue ? 2. What was the average number of messages being transmitted over the simulation time ? Hint: what was the AVERAGE CONTENTS of the ONTHEWAY queue ? 3. What is the average transmission time of the messages ? Hint: what was the AVERAGE TIME/UNIT OF THE ONTHEWAY queue? 4. How many messages were in the process of being transmitted when the simulation stopped ? Hint: what was the CURRENT CONTENTS of the ONTHEWAY queue ? Reference: Chapter 14, Schriber Exercise #26. In a three node computer network, each computer has an attached printer. The first computer generates a print job with mean interarrival time of 41 seconds randomly using RVEXPO(1,41) and it takes 23 seconds to print the job. The second and third computers generate print jobs with mean interarrival times of 51 and 67 seconds respectively and these print jobs take 13 and 11 seconds respectively. Here, we use BPUTPIC blocks to trace the model execution. Hint: be sure to have the BPUTPIC blocks identify the source computer of the print jobs. Run the model file. Solution SIMULATE * GENERATE RVEXPO(1,41),,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 1 produces print job ** at time ***.** ADVANCE 23 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 1 finished print job ** at time ***.** TERMINATE 0 * GENERATE RVEXPO(2,51),,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 2 produces print job ** at time ***.** ADVANCE 13 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 2 finished print job ** at time ***.** TERMINATE 0 * GENERATE RVEXPO(9,67),,0 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 3 produces print job ** at time ***.** ADVANCE 11 BUTPIC FILE=SYSPRINT,(XID1,AC1) 0 Computer 3 finished print job ** at time ***.** TERMINATE 0 * GENERATE 30*60 * Thirty minutes expressed in seconds * TERMINATE 1 * START 1 * END Questions 1. Modify and run the model file to include QUEUE and DEPART blocks for each ADVANCE block. 2. How many print jobs are generated in 30 minutes ? 3. What are the transaction ID numbers of print jobs for the first computer ? 4. At what time does the second computer generate its 4th job and at what time is the print job completed ? 5. At what time is the 6th print job of the third computer completed ? 6. What is the average number of jobs being printed simultaneously by each printer? Note that a real printer can print one job at a time indicating that our model is not realistic. In a more advanced problem we will introduce the new block TRANSFER to remedy the simulation. ----------------------------------------------------------------- 3.4 Problems in computer systems and communications networks requiring extensive theory References: Schriber Chapter 12.2; Banks et al 7.3.1; also see the indicated pages of Aburdene. New Block: Statistical Mode Transfer: TRANSFER New concept: use of block labels and probability Exercise #27 (Aburdene, p32 #9). Let p be the probability of an error occurring during the transmission of a character from a file on your hard drive to your computer network. Using simulation, estimate the number of characters that are transmitted correctly in a 10,000 character file ? Let p = 0.01. The following model file simulates the transmission of one file. Run the model file. SIMULATE GENERATE 1,,0 TRANSFER 0.01,SUCCESS,ERROR SUCCESS TERMINATE 1 ERROR TERMINATE 1 START 10000 END Questions 1. How many characters did you expect to be transmitted correctly? 2. How many characters were transmitted correctly ? Hint: What is the TOTAL COUNT at the SUCCESS block ? 3. How many characters were transmitted incorrectly ? 4. Answer questions 1,2, and 3 and plot the number of characters that are transmitted correctly and the expected number of characters to be transmitted correctly vs. p for p = 0.1, 0.05, 0.01, .005, and 0.001. 5. Modify the START 10000 control statement successively to START 1000 , START 100, START 10. Repeat questions 1,2, and 3. 6. Modify the START statement to START 1. What do you expect the answers to questions, 1, 2, and 3 will be ? Exercise #28. Develop a simulation model to predict the probability that a message is transmitted correctly from your computer to your professor's computer if the probability of transmitting the message correctly is 0.001. Assume a message is 320 bytes. Solution The program modifies the TRANSFER Block in the previous exercise. Note that message length has nothing to do with the solution. It is useless information. Exercise #29. (Aburdene, p.32 #10). In an ethernet local network, a 10-megabit-per-second data link is used to transmit packets from one node to another. Tests show that the probability of a packet requiring retransmission due to errors is p = 10**(-6). Let us define the effective data rate as the number of data bits transmitted per second. Using simulation, determine the effective data rate of the link when the packet length is 320 bits in total with 256 data bits. (See Figure #1). **Insert Fig. #1 here p32 fig 2.8** Solution Modify the model in Exercise #27 by changing the p value and start value. Indicate to students that they need to make some computation offline. Exercise #30. Students working on an ethernet network send print jobs (email messages) to a network printer every 10 seconds. Periodically, the printer runs out of toner and the file must be rescheduled for the printer. Assume that the probability of a printer running out of toner is 0.001. Run the model file. SIMULATE * GENERATE 10,,0 * RETRY ADVANCE 1 * QUEUE PRINTER * SEIZE PRINTER * DEPART PRINTER * ADVANCE RVEXPO(2,10) * RELEASE PRINTER * TRANSFER 0.001,JOBDONE,RETRY * JOBDONE TERMINATE 1 START 1000 * END Questions 1. Can you determine how many times the printer ran out of toner ? How many ? 2. On average, how many print jobs were waiting to be printed ? 3. What was the utilization of the printer ? 4. What was the maximum number of jobs waiting to be printed ? 5. What fraction of jobs did not have to wait for the printer ? Hint: Look at the PERCENT ZEROS in the postsimulation report. 6. Modify the GENERATE block to GENERATE RVEXPO(3,10),,0 and answer questions 1-5. 7. Modify the GENERATE block to GENERATE RVEXPO(3,9),,0 and answer questions 1-5. Would a second printer help reduce the waiting time for the print jobs ? Is it cost effective ? Exercise #31. (Aburdene, p. 168 #8). References: Schriber Chapter 11; Banks et al 6.3.2; Aburdene pp. 167-168. NEW BLOCK : ENTER, LEAVE New concept : Modeling groups of identical servers and transaction attributes NEW Control statement: Storage The distributed computing system shown (Fig. #2) is used by a computer services firm. Assume that jobs are submitted to a front end, which assigns each job to one of the parallel computers. All processors are identical and have a deterministic service rate of 10 mega instructions per second (MIPS). Assume that the number of instructions in a job is one million. The arrival rate of jobs is Poisson distributed with a mean of 10 jobs per second. By simulation determine the average job delay from entry to completion. **INSERT FIGURE #2 pp167-8 Aburdene fig 6.22** Model File. SIMULATE * Specify four identical processors PARALLEL STORAGE 4 * For a Poisson arrival rate use RVEXPO with interarrival rate = *1/10 GENERATE RVEXPO(2,1./10.) * QUEUE ONE *Request and capture an idle processor ENTER PARALLEL * ADVANCE 1./10. * Free a processor LEAVE PARALLEL * DEPART ONE * TERMINATE 1 * START 1000 * END Exercise #32. (Aburdene, p. 167 #7). A computer communication network (Figure #3) has Poisson arrival rates of 40 and 5 messages per second. The message lengths are exponentially distributed with a mean of 256 bytes per message. Compute the average message delay in the network. The line capacities are 56 kilobytes per second (kbps). Assume an infinite storage buffer. In this case, your computer must contend for access to the network before transmitting your message since three of your classmates will be trying to send a message as well. **insert Figure #3 here pp167#7 ** Exercise #33. Consider the computer network shown (Figure #4). The probability of a failure of a link between any two nodes in a 1-hour period is 10**(-8). A network is defined as disconnected if at least one node is unreachable from all other nodes. Determine the probability that the network will become disconnected within a year. Assume that it takes 24 hours to repair a failed link. **Insert Figure #4 p33 prob 11** Exercise #34. (Aburdene, p. 119, #25). In a simple point-to-point computer communication network, a packet is transmitted from the source host to the destination host as shown (Figure #5). The probability that a packet could become garbled along links 1 and 2 is p, and the probability that it becomes garbled along links 3 and 4 is q. If garbled, the packet will have to be re-transmitted from the source node. The probability that it travels along link 1 is r. 1. Compute the probability of a packet reaching the destination node without error. 2. Compute the average number of times a packet must be transmitted before it reaches its destination. 3. Simulate the system above and compare your simulation results with the computations in parts 1 and 2. **Insert Fig. 5 p119 prob 25 fig 5.5** Exercise #35. (Aburdene p. 167 #8). The distributed computing system shown (Figure #4) is used by a computer services firm. Assume that jobs are submitted to a front end, which assigns each job to one of the parallel computers. Using simulation, determine which of the following strategies will provide the higher throughput. Define throughput as the number of jobs completed per hour. Assign a job to each computer in round-robin fashion. Assign jobs to the computer with the smallest queue length. In case of a tie, choose randomly among the computers. All processes have a deterministic service rate of 10 mega instructions per second (MIPS). Assume that the number of instructions in a job are exponentially distributed with a mean of one million. The arrival rate of jobs is Poisson distributed with a mean of 10 jobs per second. ** Insert Figure #4 Aburdene, p. 167#8, problem 13** Exercise #36. (See Aburdene, p. 207 #13). Consider the computer network described. (Figure #6). The probability that a message in computer A is routed to computer B is p and to node G is 1-p. In general, when a message is at a computer it is forwarded clockwise with probability p and counterclockwise with probability 1-p. Note, this is an advanced problem. Ignore it if you have not studied Markov chains. 1. Can you model this as a Markov chain? If yes, define the states and the probability transition matrix. If no, specify why not. Assume that when a message reaches its destination, it is removed from the network. 2. Using simulation, determine the average number of hops (links) a message travels before it reaches its destination. Assume that all destinations are equally likely for any message entering the network. **Insert Figure 6 p208, fig 7.4 **  4. Instructor's Aids 4.1 Team Projects and Supplementary Exercises 4.11 Team Projects Other computer systems to study: the instructor can design a project n computers and networks using the concepts we've learned to analyze these other computer systems. In general, note that: jobs = transactions; resources = processors, disks, main memory data blocks. As an alternative model, the xact is the user; the server is a support system. Study such measures as: Utilization, service time, number queries and jobs. Also study: Probabilities of numbers of jobs, 90% response time, 90% waiting time. Consider such metrics as appropriate, include: time/call or call rate/time, bytes/call and parameters. Study resources--local computer-client, remote computer-service, network links, as well as performance metrics such as: response time, throughput, process time/packet, turn around time and MTTF Factors: CPU type: 8086, 80286, 80386, OS type: MS-DOS, UNIX, Disk drive type: A, B, C Memory Size: 512KB, 2MB, 8MB Number of peripherals and cost 1. microprocessor or PC 2. disk drive 3. floppy drive 4. MS-DOS version 5. Operating System 6. window system 7. text editor 8. languages compiler--C or Pascal server - compiler;xact - program or programmer 9. spreadsheet package 10. data flow architecture 11. transaction processing system 12. database system--1) airline travel agent; 2) fast food inventory; 3) fast food orders 13. network program 14. Processor interconnection network 15. packet retransmission algorithm 16. Characterize workload of a campus timeshare system 17. Simulate communication control for intelligent terminal system (1000 workstations) 18. Internet; server = homepage ( = set of files); xact = user 19. Printer Buffer; server = buffer (FCFS or other);xact = file 20. ATM access 21. Elevator Computer (real-time) Team Projects. 1-17 above could be used as projects for student teams to analyze a pair of systems. 4.12 Supplementary Exercises Exercise #1 Topics: Networks, Scheduling, Queues An office has a local area network. There are 20 workers on the staff, each with a PC on the LAN. However, there is only one printer. Each worker will want to print 1 file. Consider different times of the day, with different levels of utilization of the system. Assume the system is busy enough so that a queue will form, and assume a reasonable distribution of sending times. Assume the files are of various lengths. Make any reasonable assumption about printer speed. Justify your assumptions. Refer to your materials on queues and modeling of single servers, both in your text and in handouts, to answer the questions in Part I. Only then, proceed to Part II. Part I. Setting up model and theory. 1. Draw a spatial representation of the system, showing arrivals, waiting time, server and departures. 2. Choose a suitable distribution for your arrival times from: deterministic, exponential, or other distribution. Explain your choice. 3. When you model the server, should you use FCFS (first come, first served) or should you establish priority levels? Explain your choice. Make any reasonable assumptions about the office use of the printer. You might want to solve the problem both ways. Part II. Build and run a GPSS/H model file for this system. 1. Consider the GPSS/H operations: GENERATE, ADVANCE, SEIZE, RELEASE, TRANSFER. What is the sequence of steps for your model file? What are suitable values for your Operands? 2. What is the default distribution in GPSS/H? Should you use that distribution, or is there a better one for this problem? What is the server and what are the transactions for your system? 3. Set up your GPSS/H file and run it. 4. Interpret the data in the Postsimulation report with respect to your system. The problem above is to develop the theory and Model File and run the file, and answer the questions. All further problems require answering the questions in Parts I and II of the last problem. Exercise #2 The following material from Sankar's paper could be made into 1 or more exercises involving LANs. Model for a Token Ring LAN. Simulation and performance analysis. Uses CSIM, a process-oriented discrete event language based on C. Performance measures: average access delay at each node; ring utilization. Token ring topology: allows only 1 user at a time to access the medium. Assess effect of fixed and exponentially distributed packets on network performance. Network parameters used for performance evaluation: nbr stations: 50; ring length: 2 km; ring capacity: 1 M bps; propagation delay: 5 ms/km; station/mode latency: 2 bits. Do with packets of 1000 bits. Performance: study average access delay (m sec) against arrival rate per node (packets/sec). Assumptions: 1. Interarrival times for the message at each node are exponentially distributed. 2. Times for message arrival at each node are statistically independent. 3. All nodes have equal access to the network and have equal priority. 4. Neglect transmission error. The average arrival rate in packets per sec is Poisson distributed. Assume infinite buffer at each node. M/G/1 queue. T = effective service time. Token rotation time = elapsed time between arrivals of a free token at a specified node. The active station has a waiting packet when it receives the token. N = the number of active stations during a token rotation. GPSS/H Code for a token ring LAN. GENERATE T1 TTIM TABLE M1, A, B, C SEIZE "F" ADVANCE T2 RELEASE "F" TABULATE TTIM TERMINATE 1 START NAR 4.2 Course Outline An Introduction to Simulation with GPSS/H for Computer Science: an Integrated Approach Week 1: Introduction to Simulation. Definitions, uses, problems, simulation languages. Lumped models, verification, validation. Week 2. Simulation examples: Discrete and continuous systems, computer networks, manufacturing systems, physical systems, queues and Monte Carlo methods. Week 3. Introduction to GPSS/H. Transactions, blocks, software installation. Computer screen during various stages of runs. Week 4. Continuation of GPSS/H. Generate and terminate blocks. Week 5. Probability and statistics. Basic concepts. Week 6. Monte Carlo method and simulation. Week 7. Discrete event simulation. Queues. Examples. Week 8. Continuation of GPSS/H. Advance, Seize, Release blocks. Week 9. GPSS/H. Examples of programs. Transaction movement, chains: CEC (Current events chain), FEC (Future events chain). Stepping through a simulation. Week 10. GPSS/H. Practice in setting up and running program files. Modelling a server: queues. Week 11. Discrete systems and Markov processes. Week 12. Probability distributions available in GPSS/H: Uniform (default), exponential, poisson, normal. Random number generators, sampling. Week 13. Parallel simulation--an introduction. 4.3 UDC Syllabus COURSE SYLLABUS I. COURSE TITLE and NUMBER, SEMESTER and YEAR SYSTEMS SIMULATION I 2529.461 3 credits Fall 1997 II. DESCRIPTION for CATALOG Methods of simulation for computers, including Monte Carlo simulation, discrete event simulation and Markov processes. Examples of physical systems including computer communications networks. Simulation languages, including study and use of GPSS (General Purpose Simulation Software). III. PRE- and CO-REQUISITES pre-requisites: 2529.232, 2529.234, Introduction to Computer Science II, 1535.254 Differential Equations or 1535.225 Linear Algebra IV. RATIONALE for COURSE This course is designed to meet the need for computer literacy and for a Computer Science Elective. It comprises part of the requirements for a bachelor's degree in Computer Science. V. COURSE OBJECTIVES A. To extend the student's disciplined approach to programming and problem solving B. To provide a survey of system simulations of engineering models. C. To provide the opportunity for gaining experience in the use of simulation software. VI. STUDENT OBJECTIVES (COMPETENCIES and LEVEL) A. To learn the major methods of applying the computer to simulation of systems B. To gain experience in the use of simulation software. VII. REQUIRED TEXTS ``An Introduction to Simulation Using GPSS/H", Thomas J. Schriber, Wiley-Interscience, New York. VIII. SUPPLEMENTAL READINGS There will be substantial handouts, developed during the period of the Instructor's NSF Grant. IX. TOPICS for COURSE A. Introduction to Simulation B. Simulation Examples C. System Description D. Probability Theory E. Monte Carlo Method F. Discrete-Event Simulation G. Discrete Systems and Markov Processes H. Use of GPSS/H Software I. Tests (3 classes) X. POLICY on CHEATING The stated policy of the College of Physical Science and Technology is rigorously adhered to. XI. GRADING STANDARDS LECTURE EXAMINATIONS = 80% OUTSIDE ASSIGNMENTS = 20% XII. INSTRUCTIONAL RESOURCES and METHODS A. LECTURE: use of transparencies and copies of programs B. OUTSIDE ASSIGNMENTS: The UDC IBM PS-2 Personal Computer Laboratory is used XIII. FACULTY OFFICE HOUR INFORMATION Faculty coordinator: Dr. Ruth Silverman rm 213 KK, phone 274-6280 Office hours: Tuesday and Thursday 12.00-2.00 pm. XIV. INFORMATION ON SUPPLEMENTAL READINGS See Item VIII. 4.4 Preliminary Student Questionnaire National Science Foundation Simulation for Computer Science Majors Program Evaluation Preliminary Student Questionnaire Introduction You have been identified as a participant in the above Program. Please complete this preliminary questionnaire. At the end of the semester you will be asked to complete another questionnaire concerning your experience. Please answer the following questions in terms of your experiences and expectations before taking the course. The first series of questions deals with your decision to participate in the course. 1. How did you first find out about the Course? Circle one number My advisor 01 An instructor other than my advisor 02 The Chairman of the Department 03 A friend 04 The published schedule of courses 05 Other (Please specify)_________________ 09 2. How important were each of the following in your decision to participate in the course? Check one answer for each item. Very Moderately Not Import. Import. Import. I wanted the course for an elective ____ ____ ____ The reputation of the project director attracted me ___ ____ ___ A friend suggested I take the course ___ ____ ____ I wanted to participate in a pilot course ___ ____ ___ I expect to improve my marketability with this course ___ ____ ____ Other (Please specify)______________________________________ 3. How important was it that you learn about each of the following during your course experience? Check one answer for each item. Very Moderately Not Import. Import. Import. Familiarity with computer applications in a particular field ___ ____ ___ Familiarity with theory in a particular field ___ ____ ___ Familiarity with specific techniques or procedures ___ ____ ___ How to integrate knowledge acquired over previous undergraduate program ___ ____ ___ How to communicate more effectively orally and in writing ___ ____ ___ How to work more effectively on a professional team ___ ____ ___ 4. What is your current major as of the beginning of this semester? Circle one Computer Science 01 Information Science 02 Mathematics 03 Electrical Engineering 04 Other Engineering 05 Other Science 06 Business 07 Other Non-science(please specify)____________ 09 5. What is your Grade Point Average as of the beginning of this semester? ______ Questions 6-11 ask about your education and career plans as of the beginning of this semester. 6. What is the highest degree you expect to receive? Circle one. Bachelor's _________________________________________1 MBA ------------------------------------------------2 Other Master's _____________________________________3 Ph.D. or D. Sc. ____________________________________4 MD _________________________________________________5 Other (Please specify) ____________________________6 7. As of the beginning of this semester, how likely do you think it is you will go to graduate or professional school? Circle one. Very likely ___________________________1 Somewhat likely ______________________2 Uncertain_____________________________3 Somewhat unlikely_____________________4 Very unlikely (skip to question 11)___5 8. As of the beginning of this semester, how likely do you think you are to start graduate or professional school immediately after receiving your Bachelor's Degree? Very likely ___________________________1 Somewhat likely ______________________2 Uncertain_____________________________3 Somewhat unlikely_____________________4 Very unlikely (skip to question 11)___5 9. As of the beginning of this semester, what do you think your graduate school major will be? Computer Science 01 Information Science 02 Mathematics 03 Electrical Engineering 04 Other Engineering 05 Other Science 06 Business 07 Other Non-science (please specify)___________ 09 10. As of the beginning of this semester, how certain are you of this choice of major? Positive ___________________________1 Very certain_ ______________________2 Somewhat ___________________________3 Somewhat uncertain__________________4 Very uncertain______________________5 11. As of the beginning of this semester, in what field do you anticipate working after you receive your final degree? Circle one. Computer Science 01 Information Science 02 Mathematics 03 Electrical Engineering 04 Other Engineering 05 Other Science 06 Business 07 Other Non-science (please specify)___________ 09 Teaching - College 11 Teaching - Other (please specify) ___________ 19 12. When do you expect to graduate? ______________ Today's date____________ Your Student Number______________ 4.5 Final Student Questionnaire National Science Foundation Simulation for Computer Science Majors Program Evaluation Student Questionnaire #2 Introduction You have been identified as a participant in the above Program. You have completed a preliminary questionnaire. Now, at the end of the semester you are asked to complete this second questionnaire concerning your experience. Please answer the following questions in terms of your experiences and expectations at the end of the course. 1. How satisfied were you with what you learned about each of the following during this course ? Check one answer for each item. Very Moderately Not Import. Import. Import. Familiarity with computer applications in a particular field ___ ____ ___ Familiarity with theory in a particular field ___ ____ ___ Familiarity with specific techniques or procedures ___ ____ ___ How to integrate knowledge acquired over previous undergraduate program ___ ____ ___ How to communicate more effectively orally and in writing ___ ____ ___ How to work more effectively on a professional team ___ ____ ___ Questions 2-9 ask about your education and career plans as of the end of this semester. 2. What is the highest degree you expect to receive? Circle one. Bachelor's _________________________________________1 MBA ------------------------------------------------2 Other Master's _____________________________________3 Ph.D. or D. Sc. ____________________________________4 MD _________________________________________________5 Other (Please specify) ____________________________6 3. As of the end of this semester, how likely do you think it is you will go to graduate or professional school? Circle one. Very likely ___________________________1 Somewhat likely ______________________2 Uncertain_____________________________3 Somewhat unlikely_____________________4 Very unlikely ........................5 4. As of the end of this semester, how likely do you think you are to start graduate or professional school immediately after receiving your Bachelor's Degree? Very likely ___________________________1 Somewhat likely ______________________2 Uncertain_____________________________3 Somewhat unlikely_____________________4 Very unlikely (skip to question 11)___5 5. As of the end of this semester, what do you think your graduate school major will be? Computer Science 01 Information Science 02 Mathematics 03 Electrical Engineering 04 Other Engineering 05 Other Science 06 Business 07 Other Non-science (please specify)___________ 09 6. As of the end of this semester, how certain are you of this choice of major? Positive ___________________________1 Very certain_ ______________________2 Somewhat ___________________________3 Somewhat uncertain__________________4 Very uncertain______________________5 7. As of the end of this semester, in what field do you anticipate working after you receive your final degree? Circle one. Computer Science 01 Information Science 02 Mathematics 03 Electrical Engineering 04 Other Engineering 05 Other Science 06 Business 07 Other Non-science (please specify)___________ 09 Teaching - College 11 Teaching - Other (please specify) ___________ 19 8. What effect, if any, did your course experience have on your choice of field of (current or planned) employment? Circle one. Caused me to question my original choice No effect Caused me to feel more confident in my original choice 9. In what employment sector do you plan on working, or are you already working? Circle one. Academic Industrial Non-profit Military Government 10. How did your experience in this course affect your interest in science or engineering as a career? circle one. Increased No effect Decreased 11. Was the work of the course Too challenging About right Not challenging enough (circle one) 12. How could your experience in this course have been improved? 13. Please use this space to make any comments you would like about your experience in this course. 14. What is your sex? Circle one. Male Female 15. What is your race/ethnicity? Circle one White, not of Hispanic origin Black, not of Hispanic origin Hispanic Native American or Alaskan native Asian or Pacific Islander 16. Do you have a disability or handicap that limits a major life activity? Circle one. Yes No Today's date____________ Your Student Number______________ 5. Bibliography and References July 9, 1997 **improve format** Aburdene, Maurice, "Computer Simulation of Dynamic Systems", Brown, 1988. Out of print. Banks, J., Carson and Sy, "Getting Started with GPSS/H",2nd ed., 1996, Wolverine Press. Fishwick, Paul, Simulation Modeling, Design and Execution, 1995, Prentice Hall. Henriksen, J. SLX and Proof Animation: Improved Integration between Simulation and Animation, Simulation and Animation '97, ed. by Deussen, O. and Lorent, P., pp 287-294, 1997, SCS International, San Diego. Henriksen, J., An Introduction to SLX, Proceedings of the Winter Simulation Conference '96, pp 468-475, 1996, San Diego. Jain, Raj, Art of Computer Systems and Performance Analysis, 1991, Wiley Publishers. Law, Averill M. and Kelton, W. David, Simulation Modeling and Analysis, 2nd ed., McGraw-Hill 1991. National Science Foundation Program Evaluation Staff, Report 90-58, NSF's Research Experiences for Undergraduates (REU) Program: An Assessment of the First Three Years, May 1990, Washington, DC. National Science Foundation Program Evaluation Staff, Abt Associates Report, A Preliminary Evaluation of the Research Experience for Undergraduates (REU) Program, March 1990, Washington, DC. Ross, Sheldon M., "A course in simulation", Macmillan, 1990. Samadzadeh, Farideh A., Using Simulation to teach operating system functions, Proc. of the Inter. Conf. on Simulation in Engg Educ, (ICSEE '94), pp205-210. [Note: targeted to computer science graduate students]. Sankar, R., Edwards, G. and Roth, P., CSIM Simulation of a Token Ring LAN, Trans. of the Society for Computer Simulation, 1992. v. 9, #1, pp 39-58. Schriber, Thomas J., "An Introduction to Simulation using GPSS/H", Wiley, 1991. Schruben, L., "Sigma", Scientific Press, 1991. Silverman, R., Simulation for Computer Science Majors: A Preliminary Report, Proceedings of the Winter Simulation Conference, 1996, pp 1400-1404. Silverman, R., Simulation for Computer Science Majors: A Progress Report, Accepted by the Winter Simulation Conference, 1997. Stallings, Wm., "Data and Computer Communications", 5th ed., 1996, Macmillan. Student GPSS/H, Release 3.0, Copyright 1996, Wolverine Software Corp, 7617 Little River Turnpike, Annandale, Va. 22003-2603. Tucker, A., editor, CACM (Communications of the ACM), June 1991, v.34 #6, pp69-84. A summary of the ACM/IEEE-CS joint curriculum task force report Computing Curricula 1991. 6 Resources Appendix 1. RUN Software 6.0 describes what RUN does, and 6.1 explains how to obtain RUN. 6.0 What RUN does for the user You can download a set of files that makes it very easy to create, edit, execute, and review the standard GPSS.LIS output from a GPSSH- based program. It uses the DOS editor EDIT to do these tasks. Many variations in a GPSS program are easily and quickly done with it. Given that all required files are in your GPSSH subdirectory and that you can access DOS's EDIT.EXE via your path command, the procedure to begin RUNning Schriber's Figure 2.18 example is given by entering run fig218 The following 4 choices are immediately displayed: Press "e" to execute fig218.GPS under GPSS/H. Press "c" to change/fix fig218.GPS with the editor and then execute it. Press "v" to review the fig218.GPS output and print it if desired. Press "x" to exit this work session with fig218.GPS. Press e to execute, c to change, v to view/print or x to exit. _ Press the letter "c" to begin an EDIT session on FIG218.GPS. At this point all functions for editing, printing, and/or saving the file under EDIT.EXE are available. A new GPSS program file, e.g., TESTX.GPS, can be created by entering run testx When any GPSS program is ready to be executed, Alt with F and then pressing X results in a prompt to save the new or changed file. After Y(es) is selected, the 4 choices are again displayed. The following 3 standard files from GPSS/H are required as noted in Schriber's appendix to AN INTRODUCTION TO SIMULATION USING GPSS/H, Section A.2.5: GPSSH.EXE GPSSHERR.MSG PROFILE.HDB If you have run Schriber's HINSTALL, per Section A.2.4, you will have about 75 files including the 3 shown directly above. Then when you pick "e" from the choice list, GPSSH.EXE will be executed with the file you have edited. It will -- usually in a few seconds -- again give the four choices. One would normally next press the "v" key to view the output. The "v" choice results in the normal GPSS/H output file, FIG218.LIS in the example above, to be displayed in an EDIT session. This file can be edited and printed as desired. A copy of FIG218.LIS is used for this EDIT session; the original created by GPSSH.EXE is kept. 6.1 How to download RUN. You can obtain all the required RUN procedure files via the Web. Enter the following URL: http://nersp.nerdc.ufl.edu/~dicke/runzip.exe When connected to it, your system will prompt you for a subdirectory to store RUNZIP.EXE, e.g., C:\GPSS. When it is downloaded, enter the filename, "runzip", and it will dearchive itself. Put at least the 3 required GPSS/H files in that subdirectory. Then at a DOS prompt enter, run fig218 and you will have the RUN procedure active. The RUN procedure files are as follows. README.1ST Description and requirements. RUN.BAT The main RUN file. RUNEDI.BAT The file CALLed by RUN.BAT to create or edit a file. RUNPRT.BAT The file CALLed by RUN.BAT to view &/or print output. PICK.EXE The procedure giving the user choices. GPSSBLK.ICO A GPSS icon that can be used in Win 3.x, Win 95 or NT. TOP.GPS A header put on new GPSS program files. FIG218.GPS File containing Figure 2.18 from Schriber's text. You can change to the editor you prefer by modifying one line in the two files invoking the editor. The first is RUNEDI.BAT; the second is RUNPRT.BAT: rem This is RUNEDI.BAT with DOS's EDIT. edit %1.gps rem Bye! rem This is RUNPRT.BAT with DOS's EDIT. copy/a %1.lis %1.ptr edit %1.ptr erase %1.ptr rem Bye Change the editor by replacing the name "edit" in the two files. You must have your alternative editor *.EXE in your path command. The file GPSSBLK.ICO is an icon with GPSS blocks in it. It can be used with Win 3.xx, Win 95, or NT. In the former, create a GPSS.PIF file and include a question mark "?" on the third line to have the system prompt you for a file name. In Win 95 and NT, add the "?" on the command line itself. That is, in the icon's Properties click on Program and enter on "Cmd line:" c:\gpss\run.bat ? Then click on the "Apply" box and then the O.K. box. You may also wish to put a check in the "Close on exit" box to always return to your Windows desktop after completing a GPSS/H exercise under RUN.