Overview

Many computer programs, especially those involving scientific computing, are long running and rely on parallel processing. The long run times, as well as the increased probability of hardware failures as the number of processors increases and semiconductor feature sizes shrink, demand a high level of recoverability from hardware failures. To address this, we describe a novel approach to parallel programming based on the large grain dataflow model of computing. This approach provides a number of fault-tolerance features, including two forms of application-transparent rollback recovery, process restart and distributed checkpoint/rollback. We describe a simulator for a large grain dataflow system named COSMOS that was originally developed at NASA's Jet Propulsion Laboratory and was based on a distributed-memory architecture. Using the COSMOS simulator, performance comparisons and tradeoffs are made between process restart and checkpoint/rollback, and an analytical model is developed to validate the empirical results. This is then used to predict the behavior of COSMOS programs in a multi-core environment, with very favorable results.

Full Product Details

Author: David Cummings
Publisher: VDM Verlag Dr. Muller Aktiengesellschaft & Co. KG
Imprint: VDM Verlag Dr. Muller Aktiengesellschaft & Co. KG
Dimensions: Width: 22.90cm, Height: 1.30cm, Length: 15.20cm
Weight: 0.352kg
ISBN: 9783639210194
ISBN 10: 3639210190
Pages: 236
Publication Date: 27 November 2009
Audience: General/trade, General
Format: Paperback
Publisher's Status: Active
Availability: In Print

This item will be ordered in for you from one of our suppliers. Upon receipt, we will promptly dispatch it out to you. For in store availability, please contact us.

Countries Available: All regions