The Collaboratory Inc. Reaction Draft Proposal For A Research Initiative In The Computational Arts, Sciences and Engineering A Reaction Draft Proposal Prepared for the Office of the Vice-President (Research) 11 April, 1994 Foreword In November of 1993, The Collaboratory Inc. was retained by the Office of the Vice-President (Research) to prepare this proposal, in part to address Objective 10 of the Memorial University Information Technology Strategic Plan. The present document is a pre-release "reaction draft" version of the required document, due to be submitted to the University at the end of April, 1994. The format of the present document is somewhat like a business plan and is intended to be used as supporting documentation for targeted applications to external agencies in search of funding. All members of the university community are invited to contribute their input. Comments may be returned to the author via the Office of the Vice-President (Research), c/o the Dept. of Physics (Rm C2044), or by electronic mail (jstacey@morgan.ucs.mun.ca). Business plans are living documents; in other words, they are useless if not accompanied by action. The present document has benefited tremendously from the activities in high performance computing and communications that have been taking place in parallel with the writing of this proposal. There appears to be an imperative within the research community at Memorial University to move forward with or without the present proposal. Ultimately, this is what is required in order to be successful in this venture. Several significant events signal that guarded optimism regarding the future of the present initiative may be appropriate. The present initiative has been ongoing now for two years and represents only the most recent attempt to enhance and enrich the research computing environment at MUN. A standing committee on HPC&C was struck and initiated the present study. 
An HPC&C Interest Group of academic researchers has formed and has participated with the Department of Computing and Communications in a procurement of computing equipment. Wider participation in this group is being sought and, with time, it is hoped that this group will become an HPC&C Users' Group, charged with the ultimate authority to direct the activities of HPC&C at MUN as proposed in this document. This most recent phase of the MUN HPC&C Initiative would not have been possible without the financial support of the Department of Computing and Communications (Mr. Wilf Bussey, Director), the support of the Office of the Vice-President (Research), and the participation of the research community. Several individuals have provided support to the preparation of this document above and beyond that required as part of their professional activities. These include Ms. Gayle Tapper (C&C), Prof. George Miminis (Computer Science), Mr. Randy Dodge (C&C), Ms. Donna Osborne (Office of the VP/Research), Mr. Steven Millan (C-CORE), Mr. Tony Kocurko (Earth Sciences), Mr. Lloyd Little (Engineering), Mr. Allan Goulding (Physics), Mr. Michael Rayment (Computer Science), Prof. Jeremy Hall (CERR), Prof. Paul Gillard (Computer Science), Prof. Maynard Clouter (Physics), Mr. Wayne Bussey (CCMC), Mr. Harvey Weir (STEM*Net), Dr. Jacek Pawlowski (NRC/IMD), Mr. Wilf Bussey (C&C), and Prof. Kevin Keough (Vice-President/Research). To them all I'd like to express my thanks. 
James Stacey, The Collaboratory Inc., March 31, 1994

Table of Contents

Foreword
Table of Contents
Executive Summary
Research Computing Surveys
Needs
Opportunities
Proposed Facility
Mission Statement
Organization and Management
Proposed Location
Personnel
Financial Projections
Risks and Benefits
Risks
Benefits of a Newfoundland Initiative in HPC&C
Introduction
Memorial University's Information Technology Plan
STRATEGY
OBJECTIVE 09
OBJECTIVE 10
OBJECTIVE 11
OBJECTIVE 12
Defining HPC&C
Requirements For A Healthy Initiative
History
The NSF Supercomputer Centres Program
Supercomputing In Canada: The First Generation
Gillespie, Folkner & Associates Study of OCLSC
Unrealistic Expectations For Full Cost Recovery
Regional Initiatives In Canada
HPC High Performance Computing Centre (HPCC)
WurcNet
Ontario Initiatives
Quebec Initiatives
Atlantic Canada Infrastructure Lags Behind Rest of Country
Memorial University of Newfoundland
Federal Initiatives
CANARIE
Super*CAN
Research Computing Surveys
MUN Survey
High Performance Computing (HPC) Technology in Education
NSERC Survey
Research Computing Expenditures at MUN
Needs
Career Prospects Bleak for Research Personnel
Computational Research At Memorial University
Technology Assessment
Preliminary Observations
Migration to Parallel Computing
The Promise of Parallel Computing
Difficulties With Parallel Computing
Transforming Newfoundland's Economy
An Atlantic Canada Regional HPC&C Initiative?
Canada Lacking In Coordinating A National Initiative
No Canadian Facilities Are Internationally Competitive
Personnel Most Significant Part of Infrastructure Support
Canada Unique In Not Supporting Large Scale Computational Facilities
NSERC Initiative on Scientific Computing
Canada Falling Behind In Transition To Knowledge-Based Economy
Canadian Meteorological And Climate Centres
Geographic Economic Disparity And Computer Networks
Attitudinal Change
Opportunities
Cross-Disciplinary Co-operation and Collaboration
Technology Opportunities
Integrated Voice, Video and Data With ATM
What is ATM?
CANARIE
Division of Educational Technology
National Initiative for Scientific Computing (NISC)
Research and Technology Building
Collaboration With Other Institutions
Developing Ph.D.s as Computational Scientists
Products and Services
Deliverables to the University Community
Infrastructure Support For University Research
Expanding Co-op programs to Physics, Chemistry and Computer Science
Deliverables to Government
Support of the NISC
Deliverables to Industry
The University's Role In Technology Transfer
Outreach Programs
Industry Partnerships
Software Development and Commercialization
Proposed Facility
Mission Statement
Decentralized Organizational Model
Department of Computing and Communications
Standing Committee on HPC&C
Proposed Location
Research and Technology Building
High Performance Computing Requirements
Competitive Procurement
Visualization Facility and Teleconferencing
Personnel
Financial Projections
Striking The Appropriate Balance
Reaction Draft Cash Flow Projections
Cash Receipts
Industry Programs
Consulting Fees
Computer Services
Software Sales and/or Royalties
Cash Disbursements
Salaries
Benefits
Overheads
Space
Purchase of Computer Time
Purchase of Computer Equipment
Power
Maintenance
Software
Communications
Programs
Audit Fees
Taxes (PST + GST)
Miscellaneous
Financing
Memorial University
Research Grants
User Fees or Contributions
Sources of Funding
Rationalizing Existing Expenditures
Alternative Cost Savings For Personnel
Sources of External Funding
Vendors As A Source of In-kind Funding
Leasing Versus Buying
Extraordinary Funding From Government Is Required
Risks and Benefits
Risks
Unrealistic Expectations
Avoid "Planning" For Failure
Advances in Technology
Lack of "Critical Mass" for Programs
Political Imponderables
Benefits
NSF Centers Program "An Extraordinary Success"
A National Metacenter
A Newfoundland Initiative in HPC&C
Appendix 1: Research Projects
Appendix 2: Mission Statements
The U.S. High Performance Computing and Communications Program
Mission Statement: Ontario Centre for Large Scale Computation (OCLSC)
Mission Statement: Western University Research Consortium
Appendix 3: Technology Assessment
Methodology
Preliminary Benchmark Results on Compact Applications
Cost-Benefit of High Performance Computers
Appendix 4: The Collaboratory
A. James Stacey: Biographical Sketch
Professional Services
Technical Expertise

Executive Summary

The Memorial University of Newfoundland's High Performance Computing and Communications (HPC&C) Users' Group, the HPC&C Standing Committee and the Department of Computing and Communications (C&C), through the office of the Vice-President (Research), propose a research initiative in the computational arts, sciences and engineering. A fundamentally new model of research activity will be enabled through the power of high performance computation, communications and people. The proposed budget totals $5.3M over 5 years, of which 44% is for personnel, 37% for computers and 9% for a campus-wide high-speed (ATM) voice, video and data network.
Existing budget and in-kind contributions of the University total 18% of the budget. Budget realities introduce the constraint that additional support for this proposal from the university budget is not available. The balance of the budget, $4.4M over 5 years, will be sought from external agencies. The primary emphasis of this proposal is on research programs that build on partnerships and collaborations with other disciplines, with other universities, with industry and with government. The activities outlined in this proposal span a spectrum from the pure arts and sciences, to the applied sciences and engineering, to the research, development and commercialization of products and services in the industrial and government sectors. We advocate an appropriate balance between infrastructure and technology. The infrastructure component proposed in this document is not less than the technology component of the overall budget. In this proposal, people are the agents of change (not technology). A fiscally responsible, modest approach can yield benefits of true HPC&C at a cost that appropriately leverages current expenditures on researchers' salaries, that remains within current university budget commitments, and with a "no hidden costs" balanced approach that will not become an unexpected financial burden. The strategy proposes initially to have a modest local capability for high performance computation linked together by a campus-wide high performance ATM communications network. High-speed regional network linkages feeding into the national CANARIE initiative provide local researchers access to the full range of available high performance computing facilities elsewhere in Canada and in the United States. There have been several attempts since 1985 to transplant the successful U.S. model of high performance computing and communications to Canada.
These attempts have met with (at best) limited success to date, indicating that we have yet to find the appropriate "Made in Canada" model. The experiences gained to date have nevertheless been valuable and the lessons learned have been incorporated in the present proposal. It must be emphasized that the model being proposed in this document has its own potential for both successes and failures. An important feature of this proposal is a process of regular evaluations and the introduction of corrective measures as an expected and normal part of the process.

Research Computing Surveys

An extensive process of consultation with the wider university community was conducted to establish the needs of those in the research community at Memorial University who currently depend on HPC&C, to open the door for potential new users of the technology (particularly from outside of the traditional science and engineering communities), and to initiate an ongoing process of inviting input from those who have concerns about the potential effect of the present initiative on other parts of the university community. The Vice-President (Research) solicited a survey of the Memorial University community to identify researchers with applications requiring access to high performance computers. Based on the results of this survey, the following recommendations to the Office of the Vice-President (Research) were made:

1. That a high performance computing facility or capability be established on campus. The facility would support a number of high-end workstation class servers (for example, the fastest machines available for under $100K), and include the appropriate level of infrastructure support in the form of personnel who have expertise in the area of Scientific Computation and/or Visualization.
2. That the national funding agencies (primarily NSERC) be lobbied to establish a good national high performance computing facility and put in place a high speed network connecting all universities and other research centres.

In 1992 the Vice-President (Research) at Memorial University commissioned a companion study on "High Performance Computing (HPC) Technology in Education: An evaluation of needs, status and potential for Memorial University of Newfoundland, and potential benefits for education in Newfoundland." [1] This report, the Miminis-Tapper Report, complemented the above survey in its evaluation of research computing and its potential utility to both the research and academic infrastructure of the university. This report was released to the university community in 1993. Several concerns and priorities were raised by the university community in response to this report:

* Networking is, and should remain, a high priority.
* There exists a general concern that inadequately funded HPC activity could siphon funds from other areas.
* Technology will not by itself provide a solution to Newfoundland's economic situation.
* HPC activity at MUN should be driven by need.
* There is a need for better infrastructure support for research computing.

Preliminary results from an informal survey of the research community indicate that research computing expenditures at Memorial are on the order of $700K-$750K/year. Research computing is therefore a very important component of the research infrastructure of the university that must be maintained, enriched and enhanced as new technologies emerge. This value was used to establish the baseline for the present proposal.

Needs

This proposal to enhance the computational and communications infrastructure of the province is driven by several distinct local, regional and national needs. These can be summarized as:

1. Developing, promoting and protecting investments in people. A high priority area is developing the infrastructure that encourages highly skilled individuals to remain in the university community and in the province.
2. Supporting local researchers at Memorial University whose research programs are severely bottlenecked at the computation stage, with turnaround times measured in months and, in one case, years.
3. Continuous evaluation of technology options to identify the most cost-effective solutions to keep researchers at or near the forefront of technological developments.
4. Need to transform the Newfoundland economy from its primary resource based industries to a more diversified economy that includes information technology and knowledge-based industries.
5. Need for an Atlantic Canada regional initiative in HPC&C to match similar developments in Western Canada, Ontario and Quebec.
6. Need for a national strategy in HPC&C to eliminate geographic and economic disparities. The current lack of a coherent national strategy strongly affects the economic development of smaller provinces like Newfoundland.
7. Need for attitudinal change.

Opportunities

Research groups at MUN with the greatest need for computation are seeking increased capability at approximately the same time. There is therefore a window of opportunity for these groups to cooperate in a joint venture to obtain significantly greater computational capability than each could obtain on its own. We are now at the threshold of a new era in the computational arts, sciences and engineering. The power of the traditional "supercomputer," which dominated research computing in the 1980s, is available on (or beside) the desktop at a cost that is now less than the cost of people. We can then ask what level of expenditure on computer technology maximally leverages the university's large investment in salaries and infrastructure support for its "computational scientists"?
This is a fundamentally different view of research computing than that represented by the traditional mainframe-based supercomputer centres of a few years ago. Networking enhances the research infrastructure of all sectors of the university community. As high-speed computer networks are also essential to high performance computing, there is an opportunity to accelerate the development of computer networking at MUN and in Newfoundland. The long-range plan of MUN's Department of Computing and Communications includes a migration to ATM (Asynchronous Transfer Mode) as the carrier of backbone traffic on campus to integrate voice, video and data. The installation of glass fibre on campus is well advanced, with more than 20 buildings having access to this new medium. With funding, the migration to an ATM-based backbone extending across the MUN campus and ultimately into the surrounding research and industrial community could be greatly accelerated. There are no longer any nationally-accessible university-based high performance computing centres in Canada. An initiative based in Newfoundland could stimulate a pan-Atlantic Canada collaboration (similar in scope to those in central and western Canada) that could then mount a credible bid to funding agencies for a level of funding ($1M/yr) similar to that proposed in NSERC's National Initiative in Scientific Computing (NISC). Spencer Hall, recently acquired by the university, is to become the focus of the university's efforts to build a bridge between the university research community and the private and public sectors and could be an ideal location for the facility outlined in this proposal. The building will be the new home for Memorial's Office of Research, Seabright Corporation and the Canadian Centre for Fisheries Innovation.
The building will also include provisions to support a "technology cluster" (or incubator) aimed at high-technology companies, with emphasis on biotechnology, computers and communications, environmental services, medical devices and ocean technologies. The proposed location will ultimately become a hub for high technology activity in Newfoundland. There is also an opportunity to enhance the professional stature and career prospects of Ph.D. graduates and research computing professionals who have acquired high levels of skill in the application of high performance computers in their respective disciplines. The unique qualities these research staff bring to research computing are an intimate knowledge of the needs of researchers (having themselves been a part of the research community) as well as extensive experience in computing.

Proposed Facility

Mission Statement

Author's note: To help focus subsequent discussions and stimulate the development of a mission statement, some underlying philosophical issues in this reaction draft are presented below (in no particular order).

* Transforming Newfoundland's economy through science and technology; subscribe to and promote the required partnership between industry, government and academe to effect real change; promote attitudinal change.
* Place primary emphasis on comprehensive programs developing the human resources pipeline needed to support the knowledge- and information-based economy of Newfoundland in the next century; technology transfer; develop programs and linkages to provide wider access to leading edge technology.
* Invest in, promote, and protect the knowledge capital represented in people; promote collaboration and multidisciplinary approaches; provide the required computational tools to maximally leverage this knowledge capital.
* Play an appropriate role in advancing the information infrastructure of Newfoundland and Atlantic Canada; display leadership where appropriate.
* Promotion of the computational arts, sciences and engineering as a third mode of research enquiry (with theory and experiment); strive for excellence in research computation.
* Lobby for and promote a national strategy in HPC&C; support national initiatives; promote "citizenship" through collaboration and cooperative programs with other institutions and facilities across Canada.
* Achieving and maintaining balance between the research components, between pure and applied research, between the arts and sciences.
* Full scientific and fiscal accountability; social and fiscal responsibility.
* Implement a full and open process of evaluation, renewal, and ongoing improvement, including mechanisms that permit the identification of deficiencies and the implementation of corrections as an integral part of the process.

Organization and Management

The proposed organizational model is as "flat" as possible. Projects will be managed by principal investigators (PIs) on an autonomous basis. All PIs and their co-investigators are members of the HPC&C Users' Group, which is charged with the authority to set general policies and procedures affecting the operations of the facilities comprising HPC&C activities. As projects in the University community are identified, resources are directed to their support, in the form of both technology and infrastructure support. Long-term multi-disciplinary collaborations are encouraged. Initially, the proposed facility will be a cost centre within the University's Department of Computing and Communications (C&C). C&C provides the required facility management functions following the recommendations of the Users' Group. Significant cost savings are realized by not duplicating many of the services already provided by C&C.
It is proposed that there be some separation between the activities of HPC&C at MUN and the activities of C&C in other areas of university computing. This will give the HPC&C initiative an opportunity to establish its own identity. Stakeholders are represented by the University's HPC&C Committee, which represents the interests of the researchers as well as the wider community. The HPC&C Committee includes members of the HPC&C Users' Group, the Director of C&C, the Vice-President (Research) and members of the wider university community. This body reports to the Vice-President (Research). With funding, this body could then include representation from industry and government.

Proposed Location

It is proposed that the facility will have offices in Spencer Hall, currently being renovated as the new Research and Technology Building, a centre for research and technological innovation focussing on high-technology research, development and technology transfer. To reduce artificial barriers of location between the Centre and its user community, satellite offices will also be supported throughout the university community.

Personnel

The primary resource of the facility will be a complement of highly skilled computational scientists with research experience. The computational scientists will be either recent Ph.D.s with extensive experience with a wide range of high performance computing architectures or research computing professionals with equivalent experience. Visualization specialists will enter into an apprenticeship program with the Division of Educational Technology and will gain a wide range of graphics and numeric skills in all areas of TV, video and telemedia productions as well as scientific visualization. Operations and systems support will initially be provided under facility management with the Department of Computing and Communications.
Significant technical expertise is also available in the research community and this initiative will coordinate its activities with the strategic priorities of the research community. To obtain the maximum benefits from the proposed facility a minimal management and administrative structure will oversee the day-to-day operations of the facility.

Financial Projections

External funding will be sought to match the research community's own commitments to HPC&C in a coordinated strategy to more cost-effectively leverage these significant research investments. A five year cash flow summary is shown in the following table. To clarify the presentation of the base budget, unsubstantiated sources of revenue and financial contributions are set to zero.

Reaction Draft Cash Flow Summary ('000s)

                                     Year 1   Year 2   Year 3   Year 4   Year 5  5 Year Totals  (%)
Cash Disbursements
  Personnel                            $448     $457     $461     $471     $485         $2,323  (44%)
  (Number of people)                    (8)      (8)      (8)      (8)      (8)
  Computers                            $399     $360     $384     $408     $432         $1,983  (37%)
  Communications                       $100     $100     $100     $100     $100           $500  (9%)
  Miscellaneous                        $112     $100     $100     $100     $100           $512  (10%)
  Total Disbursements                $1,058   $1,017   $1,046   $1,079   $1,117         $5,317  (100%)
Opening Bank                             $0       $0       $0       $0       $0             $0
Cash Receipts                            $0       $0       $0       $0       $0             $0
Total Cash Available                     $0       $0       $0       $0       $0             $0  (0%)
Cash Receipts Over Disbursements    -$1,058  -$1,017  -$1,046  -$1,079  -$1,117        -$5,317  (100%)
Financial Contributions
  Memorial University                  $191     $192     $192     $192     $192           $959  (18%)
  Research Grants                        $0       $0       $0       $0       $0             $0  (0%)
  User Contributions                     $0       $0       $0       $0       $0             $0  (0%)
  Other                                  $0       $0       $0       $0       $0             $0  (0%)
  Total Financial Contributions        $191     $192     $192     $192     $192           $959  (18%)
Total Funding Required                 $866     $826     $854     $887     $925         $4,358  (82%)
Closing Bank                             $0       $0       $0       $0       $0             $0

Under cash disbursements, the budget is summarized into personnel, computers, communications and miscellaneous categories.
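The arithmetic behind the cash flow summary can be cross-checked with a short script. The sketch below is an illustration only and is not part of the original proposal: it transcribes the rounded yearly figures (in $'000s) from the table and recomputes the category shares, the University's contribution and the external funding requirement. All variable and function names are hypothetical. Because the table entries are rounded to the nearest thousand, a row sum can differ from its stated total by one unit (the personnel row sums to 2,322 against a stated 2,323).

```python
# Yearly figures transcribed from the Reaction Draft Cash Flow Summary ($'000s).
# Names below are illustrative assumptions, not taken from the proposal.
disbursements = {
    "Personnel": [448, 457, 461, 471, 485],
    "Computers": [399, 360, 384, 408, 432],
    "Communications": [100, 100, 100, 100, 100],
    "Miscellaneous": [112, 100, 100, 100, 100],
}
university = [191, 192, 192, 192, 192]  # Memorial University contributions
stated_total = 5317                     # stated 5-year total disbursements


def share(part, whole):
    """Return part as a whole-number percentage of whole."""
    return round(100 * part / whole)


# Five-year total and percentage share for each disbursement category.
category_totals = {name: sum(years) for name, years in disbursements.items()}
shares = {name: share(total, stated_total) for name, total in category_totals.items()}

# University contribution and the balance to be sought externally.
university_total = sum(university)
external = stated_total - university_total

for name, total in category_totals.items():
    print(f"{name:15s} ${total:>5,}K  ({shares[name]}%)")
print(f"University      ${university_total:>5,}K  ({share(university_total, stated_total)}%)")
print(f"External        ${external:>5,}K  ({share(external, stated_total)}%)")
```

Under these transcribed figures, the recomputed shares agree with the percentages stated in the table, and the external requirement agrees with the stated Total Funding Required.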
Personnel costs (44% of the total budget) include salaries (32%), benefits (at 15% of salaries), overheads (at 20% of salaries), and space (at 4% of salaries). Computer costs (37% of the total budget) include purchase of computer time at remote facilities (5%), purchase of computer equipment (20%), maintenance (7%), power (1%), and software (5%). [2] Communications costs are estimated as the cost of implementing a campus wide ATM backbone network ($350K in equipment and $150K in installation costs over 5 years). Miscellaneous costs (as a percentage of total budget) include the costs of preparation of seminars and workshops (5%), PST and GST (5%), auditor's fees (<1%), etc. Under cash receipts, there is no strong component in this proposal for generating revenues through the sale of products and services. Nominal fees for programs like industry workshops and seminars, or sponsorships for interns and/or co-op students, for example, could generate cash receipts to offset some of the costs of delivered programs. Financial contributions to the initiative include the following. Memorial University's contribution to the overall budget includes current line items for HPC&C in the university budget plus full cost accounting for in-kind contributions (primarily space and power). Budget realities introduce the constraint that additional support for this proposal from the university budget is not available. Additional contributions to the initiative may also come from the research community through research grants and/or user contributions. For example, this proposal is eligible for funding under NSERC's infrastructure or collaborative special projects grant programs. The balance of the funding required to meet the proposed budget will be sought externally.

Risks and Benefits

Risks

The primary risks in HPC&C ventures are:

1. Unrealistic expectations
2. The pace of technological change
3. Underfunding
4. Political imponderables
5. Doing nothing

Unrealistic expectations have crippled previous HPC&C initiatives in Canada. Obsolete models such as the centralized service bureau model have not been successful in the academic environment and there is no evidence that commercial HPC&C ventures will fare any better. Unrealistic expectations have the crippling effect of imposing impossible goals, introducing constraints that reduce flexibility in operations, and introducing a cycle of failure. Only after experience has been gained in this venture can realistic goals be set. It will take many years of effort to place the facility proposed here on a firm footing. This is in marked contrast with the very short times required to procure and install technology. The 5 years proposed here should be considered a minimum.

Providing the maximum benefit for the minimum cost requires continuous technology renewal. Large systems must demonstrate economies of scale and must be renewed or replaced on a 2-3 year cycle. Since large acquisitions have a political cycle that commonly exceeds the technology cycle, it is extremely difficult to pursue this strategy without strong political will and leadership. The strategy of following the rate of change in microprocessor technology requires continuous investment in technology renewal on a 6-12 month (or less) time scale.

The strength of this proposal is a balance between people and technology. This balance is very sensitive to funding levels. The technology component is set at a lower threshold for providing access to new technological options such as parallel processing, for example. This is balanced with a minimum level of infrastructure support through trained personnel. Inadequate levels of funding will eliminate both technology options and jobs.

Political imponderables will have an effect on this initiative. Misconceptions and myths concerning HPC&C abound. The marketing hype surrounding HPC&C can be intense.
Through the active involvement of the researchers, the danger that the proposed venture could become isolated from the research community is minimized.

The biggest risk is doing nothing. In the final analysis, doing nothing is the most insidious of "strategies." With nothing ventured, there is no debate of issues and no pressure to make difficult decisions. This proposal encourages a full and open process for debate on difficult issues relating to adopting and implementing new technology, but with an imperative that decisions to move forward ultimately must, and will, be made.

Benefits of a Newfoundland Initiative in HPC&C

Americans have hailed their NSF Centers Program as "an extraordinary success." According to a review of the program in 1992, the program has reaped the following benefits:

* Enabled an enormous body of research by providing the national research community with access to state of the art facilities.
* Stimulated the development of the computational arts, sciences and engineering by involving many who had not previously used high performance computers.
* Contributed in important ways to education ranging from the graduate level to K-12.
* Strengthened the infrastructure of the computational arts, sciences and engineering through software development, and through its major role in the development of the national network.
* Played an important role in technology transfer through its interaction with vendors and its outreach to the industrial user community.

We propose an initiative based in Newfoundland that emulates the successful U.S. model on a small scale for building a high performance computing, communications and people-oriented infrastructure in support of computational research to reap similar benefits for this community. The scale of the initiative is appropriate for the local environment and leverages existing resources as much as possible.
The primary benefit of adopting a balanced approach to technology acquisition and infrastructure support is that the benefits delivered to the local community scale well with the expenditures. A knowledge-based economy is a people-based economy above all else and the emphasis in this proposal on balancing people with technology embraces this philosophy. In Newfoundland, we are driven by circumstance to adopt a sense of urgency in reshaping our economy. The level of collaboration and cooperation required across all sectors of the Newfoundland economy to meet this challenge is unprecedented in our history but tremendously exciting as well. This proposal highlights a modest strategy for a local community of researchers in the computational arts, sciences and engineering to participate in meeting this challenge.

Introduction

The Memorial University of Newfoundland's High Performance Computing and Communications (HPC&C) Users' Group, the HPC&C Standing Committee and the Department of Computing and Communications, through the office of the Vice-President (Research), propose to build an initiative at Memorial University of Newfoundland that will inspire the faculty and students of this institution, entrepreneurs in industry and agencies in government to explore new models of conducting research enquiries in the arts, sciences and engineering. These new models of research activity will be enabled through the power of high performance computation, communications and people. Computers, networks and researchers skilled in their use are revolutionizing research in the arts, sciences and engineering. The range of applications to which these tools can now be applied is vast. The qualitative changes to research across all disciplines are incredibly diverse. Technology is changing our lives and the pace of change is quickening.
One imperative of this proposal is putting in place the required infrastructure not only to cope with rapid technological change but also to exploit it. But technology is not a panacea. The technology component of this proposal is of secondary importance to the focus on people and research programs. These research programs are increasingly built on partnerships and collaborations with other disciplines, with other universities, with industry and with government. The activities outlined in this proposal span a spectrum from the pure arts and sciences, to the applied sciences and engineering, to the research, development and commercialization of products and services in the industrial and government sectors. Choosing an appropriate balance between infrastructure and technology will be crucial to the viability of this initiative. The "people component" proposed in this document is not less than the "technology component" in the overall budget. This approach leverages existing as well as future investments in people and technology more cost-effectively. This proposal shows how a fiscally responsible, modest approach can yield the benefits of true HPC&C at a cost that appropriately leverages current expenditures on researchers' salaries, that remains within current university budget commitments, and with a "no hidden costs" balanced approach that will not become an unexpected financial burden. The technology strategy adopted in this proposal aims to provide the widest range of benefits to the local community at the lowest cost. The strategy proposes initially to have a modest local capability for high performance computation linked together by a campus-wide high performance ATM communications network integrating voice, video and data, with high-speed regional network linkages feeding into the national CANARIE initiative to provide local researchers access to the full range of available high performance computing facilities across the country.
The above model is a reasonable one given Newfoundland's size and population, but a national strategy for HPC&C to support it does not yet exist. This initiative therefore also proposes to build linkages with similar initiatives across the country to lobby the appropriate provincial, regional and national agencies to establish this model of computing and communications in Canada. Initiatives in HPC&C around the world are typically modelled after the successful high performance computing and communications program in the U.S. There have been several attempts since 1985 to transplant this model of computing and communications here in Canada. These attempts have met with (at best) limited success, indicating that we have yet to find an appropriate "Made in Canada" model. The experiences gained to date have nevertheless been valuable and the lessons learned have been incorporated in the present proposal. It must be emphasized that the model being proposed in this document has its own potential for both successes and failures. An important feature of this proposal is a process of regular evaluations and the introduction of corrective measures as an expected and normal part of the process.

Memorial University's Information Technology Plan [3]

Computational science incorporating the use of complex computer-driven models is revolutionizing research into many traditional problems and providing opportunities to tackle challenges previously beyond reach. The key components of computational science are computational scientists, high performance computers, high bandwidth networks, simulation algorithms optimized for specialized hardware, and visualization of computational results. While the number of researchers using high performance computing is relatively small, it is growing. High performance computing is an essential component of their work, enabling them to tackle a class of problems that cannot be solved any other way.
This technology is increasingly applied in many diverse fields including electronics design, oil and gas recovery, pharmaceutical design, human genome biology, computational ocean sciences, computational chemistry, and speech and vision study. In some scientific disciplines the rate of progress is now governed by the availability of this technology. In addition, advanced graphics/visualization systems are important to many teaching and research activities in the University, and they are compute-intensive. Memorial's community includes a number of researchers who are computational scientists. These researchers are working in biology, biochemistry, earth sciences, engineering, mathematics, medicine, and physics. To acquire the resources necessary to carry out research, these individuals have computed around the globe, purchased high-powered workstations, and used whatever general computing resources could be made available to them from any source. But their work is impeded by the lack of high performance computing access and support, which limits the simulations that can be studied. Consequently, models run much slower, employ 2-D when 3-D analysis is otherwise possible, and depend upon less refined analysis than otherwise feasible with more powerful and advanced equipment. Further, researchers lack the tools to visualize the product of simulations, a capability essential to rapid understanding of the phenomena being studied. There are numerous obstacles to the implementation of high performance computing. Hardware and software complexities, insufficient training, relatively slow wide-area communications speeds, difficult algorithm development, lack of funding, and attitudinal barriers abound. In addition, the issues of computational science are national in scope as Canada addresses the issue of global competition. In recognizing this important research need, Memorial University must also be positioned to attract and retain the best students and faculty.
As high performance computing capabilities proliferate, the lack of high performance computing access and support will become an increasingly significant impediment to recruitment and retention. High performance computing will also affect educational programs. Whether in the fields of computer science, physics, engineering, or another discipline, the availability of high performance computing will enable faculty to incorporate techniques in advanced courses and teach the use of leading edge technology.

STRATEGY Secure access to high performance computing resources and provide the support necessary for effective use.

OBJECTIVE 09 As an interim measure, provide access to resources capable of serving the low end of Memorial's high performance computing requirements.

OBJECTIVE 10 Working through a users group, create a needs and opportunity assessment, examine models for the provision of high performance computing, assess the technologies available, identify human resource requirements, and create a proposal for funding.

OBJECTIVE 11 Actively participate in national initiatives leading to the development of high performance computing infrastructure and the high speed regional and national networks necessary to eliminate communications bottlenecks in resource access and real time visualization.

There is a growing population of relatively high performance multi-tasking workstations distributed around the University. With notable exceptions, these systems are typically not used at full capacity, especially during off-hours. Advances in computer performance offer an opportunity to harness those free computing cycles by aggregating their combined capability into one (virtual) homogeneous resource. Emerging software products automate the distribution of processes across the various computers to create a high performance (parallel) processing capability. The available software will be investigated and implemented where feasible.
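The idea of pooling idle workstation cycles described above can be illustrated with a small scheduling sketch. It is not drawn from the Information Technology Plan: the job durations, the function name `makespan` and the longest-job-first greedy rule are all assumptions chosen for illustration, and real distributed batch software must also cope with differing machine speeds, data movement and owners reclaiming their workstations.

```python
import heapq


def makespan(job_hours, n_workstations):
    """Return the hours needed to finish all jobs on n identical workstations.

    Greedy list scheduling (an illustrative assumption): jobs are taken
    longest first, and each job goes to the workstation that frees up earliest.
    """
    if n_workstations < 1:
        raise ValueError("at least one workstation is required")
    free_at = [0.0] * n_workstations  # time at which each workstation is idle
    heapq.heapify(free_at)
    for hours in sorted(job_hours, reverse=True):
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + hours)
    return max(free_at)


# Hypothetical batch of independent overnight jobs, in hours.
jobs = [4.0, 3.0, 3.0, 2.0, 2.0, 1.0, 1.0]

single = makespan(jobs, 1)  # one researcher's own workstation
pooled = makespan(jobs, 4)  # four otherwise idle workstations
print(f"single workstation: {single} h, pooled: {pooled} h")
```

Under these assumptions the seven jobs, which total 16 hours on one machine, complete in 4 hours when spread over four idle workstations. This is the kind of turnaround improvement the virtual resource is meant to deliver.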
OBJECTIVE 12 Implement a network-wide virtual high-performance computing environment incorporating distributed workstations and servers.

Defining HPC&C

What do we mean by high performance computing and communications (HPC&C)? In this proposal, it is defined by the following characteristics:

1. Capability: high performance computers are those machines which have a higher level of capability (in speed, memory size, numbers of processors, applications software or other specialized functionality) than is accessible to individual researchers on their desktop machines.
2. Capacity: the machine(s) made available have a higher level of capacity (or throughput) such that there is a net decrease in the turnaround time relative to the same job running on an individual researcher's desktop machine.
3. Resource Sharing: as the machines represent capability and capacity greater than any individual or group can typically afford themselves, the resources of the community have to be pooled to meet the need.
4. Communications: Equitable access to the resources at a level matching the promise of high performance computing technology can only be provided through high performance networking.

It is important to avoid a rigid and limiting definition for HPC&C given the pace of technological change. For example, a particularly innovative software program such as a symbolic algebra application running on a PC could significantly accelerate a researcher's investigations and would be worthy of being labelled "high-performance." A particularly innovative piece of hardware such as a virtual reality headset might similarly represent a dramatic change in how a researcher might conduct research. To allow both approaches to have a suitable place in this initiative and allow the programs developed under this proposal to have the widest range of flexibility, we highlight one additional characteristic of HPC&C:

5. Innovation: A focus on innovative applications of computing and communications technology that enhances the productivity of individual researchers.

Requirements For A Healthy Initiative

The Program Advisory Committee examining future directions of the National Science Foundation's Supercomputer Centers Program outlined the general requirements for a healthy centre: [4]

* "Stable, proven technology capable of meeting the needs of the bulk of its users.
* A fully configured new technology machine capable of enabling significant progress on Grand Challenge problems.
* The staff to support its hardware and its user community."

The Natural Sciences and Engineering Research Council's (NSERC) Committee on Research Computation (CORC) was more emphatic in its recommended requirements. [5]

"The single most important requirement for researchers is a stable, reliable computing environment that is affordable and easy to use."

"This requirement has a number of aspects that impinge on how best to provide adequate computing power:

* The hardware itself must be reliable and stable. This is really not much of an issue, since today most hardware is very reliable.
* Software must be available and reliable. The quality of standard software libraries varies widely among vendors, particularly the more recently established vendors.
* The underlying technology must advance at a high enough rate that the changing computational needs of researchers can be accommodated. Alternately, the environment provided for the researcher must accommodate convenient migration from one computing facility to another of greater power.
* The cost of acquiring and supporting the technology must be affordable."

An NSERC Feasibility Study described in more detail the desired profile of an HPC&C facility, placing more emphasis on the surrounding infrastructure. [6]

"Available resources must include the following:

a) staff for training, education, administration, marketing, technical programs, and systems support;
b) accessible facilities for teaching, training, machine housing, files storage, offices, and visualization and work stations;
c) a range of leading edge technologies: at least high power workstations, vector and parallel machines; current, developing and specialized software;
d) networks which permit off site transparent and interactive use of the on site technology;
e) both central and local support and expertise;
f) fast production and turnaround speed;
g) maximum memory and storage capacity;
h) a resource of knowledge about special (complementary) features of centres elsewhere;
i) facilities for maintaining confidentiality of programs and service;"

This NSERC study recognized that it was unlikely that any one centre could be all things to all users. In this respect, the report states "If a number of centres emerge in Canada they should not be precise replicas of each other but part of a synergistic, collaborative network designed to meet a variety of computing needs with great efficiency." [7]

History

The NSF Supercomputer Centres Program

Most (if not all) activities in HPC&C around the world are modelled after the National Science Foundation's Supercomputer Centers Program in the United States. The impetus for the program was the observation that American researchers in the early 1980s had to travel abroad (typically Europe) to gain access to American-built supercomputers. In 1985/86 five supercomputer centres were established as nationally accessible facilities and linked by a "high-speed" 56 kilobits/s backbone network. The driving force behind the program was the recognition that unprecedented computational power and capability were needed to investigate a wide range of scientific and engineering "grand challenge" problems.
The so-called Grand Challenges were identified by Nobel Laureate Kenneth Wilson as being those problems whose solutions were critical to national economic and scientific competitiveness. Examples of grand challenges included: prediction of weather, climate, and global change; determination of molecular, atomic, and nuclear structure; understanding turbulence, pollution dispersion, and combustion systems; mapping the human genome and understanding the structure of biological macromolecules; improving research and education communications; understanding the nature of new materials; and problems applicable to national security needs. [8] The NSF's Supercomputer Centres Program has been hailed as an "extraordinary success." [9] The program quickly exploded to become much more than purely providing researchers with access to supercomputers. The Centres have been at the forefront of hardware and software technology and networking development since their inception. This program now embraces the vision of a single unified computational resource called the "metacomputer" that integrates all computing and communications resources, from desktop PC to massively parallel supercomputers, and makes all resources transparently and seamlessly available on the desktop. The program enjoyed the support of several champions in government, one of whom was then-Senator Al Gore, who pressed forward in establishing high performance computing and communications as the cornerstone of a U.S. strategy to maintain its pre-eminent economic standing into the 21st century. These efforts culminated in the U.S. High Performance Computing and Communications Act of 1991 promoting the development of a National Information Infrastructure (NII). With $2,500 million in funding over 5 years, the U.S. is emphasizing information technology as the cornerstone of its economy in the 21st century. Similar levels of funding underlie the long-range infrastructure development of Europe and Japan.
Supercomputing In Canada: The First Generation

Two initiatives in Canada attempted to emulate the U.S. model. The first of these was established in 1984 at the University of Calgary's SuperComputing Services (SCS) with the installation of a Cyber 205 supercomputer from Control Data Corporation. The second initiative was born in Ontario in 1986 with the establishment of the Ontario Centre for Large Scale Computation (OCLSC) at the University of Toronto. It is difficult to gauge whether the first generation of high performance computing initiatives in Canada was a success or a failure. Both the SCS and OCLSC facilities failed to obtain renewed funding in time to prevent their closing in 1990 and 1992 respectively. Yet both Alberta and Ontario did ultimately approve the funding of follow-on initiatives to establish what are now completely new centres with government funding of $10 million and $29.4 million respectively over five years.

Gillespie, Folkner & Associates Study of OCLSC

The OCLSC facility, in particular, has been extensively studied and reviewed. One of the last reviews in the life of the Centre was "A Large Scale Computation Evaluation Study For The Council of Ontario Universities" by Gillespie, Folkner And Associates, Inc. [10] Since this study was commissioned by the Ontario Provincial Government, which subsequently approved funding of $29.4 million to continue its support of high performance computing in the province, its findings and recommendations provide an important snapshot (circa 1991) of the lessons learned from OCLSC. While some recommendations presuppose the procurement of a state-of-the-art supercomputer costing many millions of dollars (which is not the focus of the present proposal), a significant subset of the recommendations is directly relevant to this initiative. All the recommendations of this study are reproduced below (those components deemed important are italicized):

1. ***Computational facilities for high performance computing must continue to exist*** as an Ontario activity;
2. Ontario should have its own supercomputer activity;
3. ***Networking at high speeds must continue to receive very high priority***, but not in this initiative;
4. ***Outreach programs to industry researchers should be a high priority;***
5. ***Activities for the large scale computation organization should be distributed as much as possible to university, government and industrial sites;***
6. The large scale computation organization should be a not-for-profit organization closely affiliated with the Ontario universities;
7. ***A project to train Ontario researchers in parallel computing must be started as soon as possible;***
8. ***The Ontario large scale computation organization will be required to develop strategic and evaluation plans using a process that involves members and is well publicized. (In such joint projects, it is important that all reasonable ideas are heard and discussed, that participants know the process and that ownership be established);***
9. ***There should be a remote visualization activity;***
10. ***There should be an ongoing investigation of the techniques for moving complex models (or parts of models) from expensive large scale computation equipment to RISC systems;***
11. ***Procurements should be based on a stable funding policy. (Stable funding means a consistent amount for each year of the funding period. This allows (encourages) the responsible organization to make arrangements with vendors to keep modern and up-to-date systems, rather than a large purchase which is then out of date before the next system can be obtained.);***
12. A decision should be made for the long term on the future of the Ontario large scale computation organization as soon as possible;
13. ***The Executive Director of the Ontario large scale computation organization should be a respected researcher;***
14. ***An evaluation plan and process should be developed.***

Unrealistic Expectations For Full Cost Recovery

In 1986, OCLSC's business plan proposed full cost recovery through sale of computer time to industry. This source of revenue was expected to offset the subsidies delivered to academic researchers. The Gillespie, Folkner and Associates study reported that for the period of 1 January 1987 to 1 July 1991, a total of 541.73 cpu-hours were delivered to industry, with benchmarking totalling 166.3 cpu-hours. The balance of 375.43 cpu-hours represented billable revenue. Assuming billing rates somewhere intermediate between the published contract research rate of $300/cpu-hour and the commercial rate of $500/cpu-hour, the total revenue from industry sources is something on the order of $113K-$190K over the first 4.5 years of operations. (Commercial activity was most significant in the final year of operations and much of this is not included in this reported figure. We believe that a reasonable uncertainty in this number could be a factor of 2. Another significant industrial contribution not accounted for in revenues was the value of specialized equipment purchased by industry partners and installed at OCLSC.) The expectation of generating revenue from the sale of computer services to industry at OCLSC proved to be unrealistic. More than any other factor, the expectation that the facility at the University of Toronto could sustain itself through revenues proved to be the primary measure by which the centre was judged. This unrealistic expectation was not unique to OCLSC. The same assumption has been, and still is, part of the business plans of many other centres both elsewhere in Canada and in the U.S.
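The revenue estimate in this section follows directly from the reported cpu-hours and the two published billing rates. The short calculation below is a sketch for checking that arithmetic; the function name and its structure are illustrative assumptions, and only the input figures come from the text.

```python
def industry_revenue_range(total_cpu_hours, benchmark_cpu_hours, low_rate, high_rate):
    """Return (billable hours, revenue at the low rate, revenue at the high rate).

    Benchmarking time is treated as unbilled, as in the study figures quoted
    in the text. The function itself is a hypothetical helper.
    """
    billable = total_cpu_hours - benchmark_cpu_hours
    return billable, billable * low_rate, billable * high_rate


# Figures reported for 1 January 1987 to 1 July 1991 (4.5 years).
billable, low, high = industry_revenue_range(
    total_cpu_hours=541.73,
    benchmark_cpu_hours=166.3,
    low_rate=300.0,   # published contract research rate, $/cpu-hour
    high_rate=500.0,  # published commercial rate, $/cpu-hour
)
years = 4.5

print(f"billable cpu-hours: {billable:.2f}")
print(f"total revenue: ${low:,.0f} to ${high:,.0f}")
print(f"per year: ${low / years:,.0f} to ${high / years:,.0f}")
```

The bounds work out to roughly $113K and $188K, consistent with the rounded range quoted above. Even allowing for the stated factor-of-2 uncertainty, the result remains small for a facility whose business plan assumed full cost recovery.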
Regional Initiatives In Canada

HPC High Performance Computing Centre (HPCC)

In Alberta, an industry-led consortium consisting of Pulsonic Corporation, ACTC Technologies Inc., and Fujitsu Systems Business of Canada (FSBC) submitted a $37 million proposal to the Alberta government to establish a supercomputing service bureau in Calgary. Both the Alberta and Canadian governments (through the Western Economic Diversification Fund) contributed $5 million each to establish the facility. The technology installed was the Fujitsu VPX240 with a peak processing capability of 2.5 Gflop/s. The new Centre, called the HPC High Performance Computing Centre (HPCC), opened its doors in June of 1993. At present, HPCC is the only publicly accessible supercomputing facility in Canada. HPCC is actively marketing the VPX240 to clients across Canada and is participating in several national initiatives.

WurcNet

The first university-based supercomputing centre was created at the University of Calgary in 1985 with a Cyber 205 vector supercomputer. This centre, which offered services primarily to the Alberta research community, was unable to secure renewed funding and was closed in 1990. As described above, an industry-led consortium subsequently submitted a $37 million proposal to the Alberta and Canadian governments which created the HPC High Performance Computing Centre (HPCC). The shortage of infrastructure support for the academic community at HPCC has provided the impetus for a consortium of Western Canadian universities to prepare a $5.5 million proposal to NSERC to support academic high performance computing. The consortium, called the Western University Research Consortium Network or WurcNet, is seeking support from NSERC ($1.1M/yr over 5 years) and CANARIE ($954,000 in 1993). [11] WurcNet's submission to CANARIE proposes the building of an ATM trial network that will provide 45 Mbits/s of network bandwidth in early 1994 connecting six sites in Alberta, Saskatchewan and Manitoba.
Later in 1994 this initial network will be expanded to provide full ATM communications at 155 Mbits/s to all of these sites plus 4 new sites in British Columbia. The goals of WurcNet researchers and WurcNet Inc. companies are:

* High Performance Networking: to connect the universities in Western Canada and the WurcNet Inc. members by a 1.2 gigabit per second ATM network (initially to be a 45 megabit per second network);
* High Performance Computing: to provide access on this high performance network to a variety of scarce computing resources, initially HPCC's Fujitsu VPX240, the BC Systems Corp.'s 8096 node DEC MasPar MP-2 and the University of Alberta's IBM workstation farm;
* Commercialization: to develop HPC and HPC applications that will be commercialized through WurcNet Inc. members; and
* Collaboration: to promote collaboration among producers and consumers of high performance networking and computing, and between university researchers and WurcNet Inc. members.

The WurcNet proposal is supported by 89 researchers representing 75 research projects.

Ontario Initiatives

In Ontario, $29.4 million in funding for high performance computing has recently been signed off by two departments of the Ontario government. This proposal was initiated in 1989 by users of the Ontario Centre for Large Scale Computation (OCLSC), initially for ongoing support and upgrading of that facility. This academically-based centre, which provided access to a CRAY X-MP/28 to the Canadian research community, was closed in March of 1992. Shortly afterwards, the Ontario government announced that it would continue to support high performance computing, and a collaboration of industry, government and university representatives has been preparing the required business plan for what is now a completely new centre. When the OCLSC was closed, the remaining staff of that centre were absorbed by the University of Toronto.
After some internal reorganization, the University of Toronto Instructional and Research Computing (UTIRC) department was established. After some experimentation with cluster computing, the University of Toronto research community supported the procurement of a $0.7 million 32-processor parallel computer from Kendall Square Research with infrastructure support provided by UTIRC.

Quebec Initiatives

In Quebec, there are two notable initiatives. One of these is the Centre de Recherche Informatique de Montreal (CRIM) with approximately $8 million per year of provincial government funding. The primary focus for CRIM is information technology but there is a small component of activity in parallel computing. A second high performance computing initiative in Quebec, le CEntre de Recherche en Calcul Appliqué (CERCA), specializing in computational fluid dynamics applications, is funded by the provincial government at the level of $12 million over 5 years. The members of CERCA comprise 8 principal investigators, 6 invited members, 7 industrial partners and 1 industrial associate. In addition to the principal investigators, their students and post-doctoral fellows, CERCA employs a research staff of 12 employees.

Atlantic Canada Infrastructure Lags Behind Rest of Country

In Atlantic Canada, there has been only one previous initiative in high performance computing, a facility established at Dalhousie University in Halifax in 1991 with $1.6 million of funding over 2 years from the Atlantic Canada Opportunities Agency. The technology adopted was an Alliant FX2800/16, a parallel computer with 16 processors using the 1989-vintage i860 microprocessor. This initiative was handicapped almost from the outset when Alliant declared bankruptcy shortly after this facility opened. Nevertheless, the installed system is currently used at capacity, primarily by researchers at Dalhousie University. There is no funding available to upgrade the Dalhousie facility.
Since the original vendor no longer exists, this site must ultimately replace the current system with a completely new computational platform. A software development laboratory was also established in conjunction with the above high performance computing facility. One milestone achieved for this facility was the development of a prototype parallel processing development environment called SPEFY. This software product was demonstrated at Supercomputing Symposium '93 in Calgary. A company called Hypercomp was also formed to promote high performance computing in industry. This organization markets itself commercially in advising clients in the application of high performance computing with a special focus on cluster computing. The University of New Brunswick employs an IBM 3090 mainframe computer with an attached vector facility for its high performance computing needs. Even with a vector facility, this system is no longer competitive with desktop or deskside systems. At UNB, the department of computer science is active in exploring the capabilities of parallel computing.

Memorial University of Newfoundland [12]

The Memorial University of Newfoundland's (MUN's) high performance computing and communications initiative is the only current initiative in Atlantic Canada committing significant resources to the development of the high performance computing and communications infrastructure of Atlantic Canada. Prior to 1980, computing at MUN was performed on an IBM 360 operated by the Newfoundland and Labrador Computer Services (NLCS). This mainframe computer was inadequate for research computing and researchers had to travel elsewhere to obtain access to the appropriate facilities. The purchase of a VAX 8800 in the mid '80s provided faculty, for the first time, with sufficient computational power to conduct serious research at Memorial.
At the same time, the Department of Computing and Communications (C&C) at Memorial provided and supported various mathematical and graphical libraries that contributed significantly to the computing resources. This hardware and software support was provided at no cost to researchers. Two years after the purchase of the 8800, the demands placed on it by the university community had greatly increased. Unfortunately these demands were not met by any sustained upgrading of the facilities and the university-supported facilities rapidly became "hopelessly inadequate." Through a combination of departmental and individual NSERC operating and equipment grants, researchers began purchasing their own computer systems in the late '80s. A key element in the creation and continued use of these systems was the communications network installed and operated by C&C. While these local systems dramatically enhanced the computing available to individual researchers, they conferred no particular competitive advantage relative to other researchers at other universities in Canada or abroad. A significant step forward was the procurement of a Convex C-1 mini-supercomputer in 1986 by the department of Earth Sciences. Surprisingly, this machine is still in use today as a tape server for researchers but has fallen well behind even the smallest of desktop workstations in its computational power. These acquisitions marked a profound shift in the manner in which research based on high-end computing was conducted at Memorial. In the mid 1980s, hardware, software and support were provided (at no cost) by the university. In recent years, by contrast, hardware and software together with maintenance costs are paid mainly by NSERC funds, with some support from departmental and decanal funds. Significant personal and departmental resources are devoted to routine support activities. Only a small fraction of the total research computing needs is currently being provided by C&C.
Several researchers at MUN have attempted to access and use remote high performance computing facilities at Atmospheric Environment Services (AES), Toronto, Calgary and Halifax. Network linkages from Newfoundland at that time were very rudimentary and by the time a national network was put in place across the country these facilities either had closed (Toronto and Calgary) or were no longer available. Presently, access to HPCC in Calgary is severely impeded by the low 56 kilobits/s bandwidth of Canada's national network CA*Net. This situation may be alleviated somewhat when CA*Net is upgraded to 1.544 megabits/s in June 1994, but demand for network bandwidth to support the client-server model of computing with graphical user interfaces has been pent up for many years and is expected to saturate the upgraded link quickly.

Federal Initiatives

There is no counterpart in Canada of the very strong federal commitment to high performance computing and communications programs that is provided to U.S. researchers by the National Science Foundation. The combined federal contribution to high performance computing in Canada (when the centres in Montreal, Toronto and Calgary were open) was typically less than 3% of the U.S. contributions on a per capita basis (i.e. the actual federal expenditures are less than 0.3% of those in the U.S.). This is in marked contrast to the corresponding provincial contributions to high performance computing, which were on par with state contributions in the U.S. Recently, isolated examples of federal support have been provided to a Western Canada initiative ($5 million to the HPC High Performance Computing Centre from the Western Economic Diversification Fund) and a Nova Scotia initiative ($1.6 million to Dalhousie University from the Atlantic Canada Opportunities Agency).
Unfortunately, these expenditures are not part of any over-arching federal strategy for the support of high performance computing and communications in Canada. Intense lobbying of the Natural Sciences and Engineering Research Council (NSERC) by supporters of the previous generation of high performance computing centres raised awareness of the need for NSERC to adopt a policy on the support of HPC&C in Canada. Its response was to strike an ad hoc Committee on Research Computation (CORC), which issued a report in September of 1990. [13] This committee concluded that:

* "the provision of an internationally-competitive computational and communications infrastructure is essential to the future vitality of research in Canada;
* such an environment is also critical to the ability of the university sector to prepare future generations to enter a highly trained work force ready to participate in an increasingly information and information-technology dominated world; and
* a research and university sector that has a competitive computing and communications environment can more effectively contribute to the future health of Canadian industry and the Canadian economy through both training and technology transfer."

The CORC concluded by recommending that "a National Initiative for Scientific Computing (NISC) should be undertaken, with leadership from NSERC, with the goal of assuring the availability to Canadian researchers of a comprehensive support package for research computation. In the Committee's view, the provision of support for research computation as outlined in this proposal is essential to the conduct of innovative science now and in the foreseeable future." [14]

Elements of this National Initiative for Scientific Computing (NISC), all of which were deemed essential, include:

* at least one significant, large-scale computation facility offering services to researchers from across the country, to be established following a peer-reviewed competition;
* a national data communications research network;
* increased funding for the infrastructure supporting the use of computers and networks by the research community; and
* the provision of facilities based on newer architectures to ensure the orderly and informed exploitation of these machines as their software systems mature and become fully useful and production-oriented research environments.

CANARIE

CANARIE, the CAnadian Network for the Advancement of Research, Industry and Education, is Canada's answer to developing the high-speed communications infrastructure needed to compete in the 21st century. CANARIE's mission is [15] "To support the development of the communications infrastructure for a knowledge-based Canada, and thereby contribute to Canadian competitiveness in all sectors of the economy, to prosperity and job creation and to our quality of life."
The objectives of the CANARIE business plan are

* To upgrade the existing R&D/E network to gigabit/s capabilities
* To promote the use of the network to users
* To establish a high-speed experimental gigabit/s test network
* To stimulate the development of new networking technologies, products, services, software and applications by Canadian organizations
* To support the migration to operational networks of commercially viable networking technologies, products, services, software and applications

CANARIE's goals are

* To enhance the competitiveness of Canadian industry through the development and use of communications technologies
* To accelerate the development of future generations of standards-based, open systems networking products, services, software and applications by the Canadian IT sector
* To support more effective research, development and education through enhanced collaboration and access to information and resources worldwide

The president of CANARIE Inc., Dr. Andrew Bjerring of the University of Western Ontario, is one of the authors of the CORC Report and the proposed NISC. Because CANARIE is the only national initiative currently addressing the serious shortcomings in Canada's computing and communications infrastructure, there is a danger that expectations of it are too high. To place CANARIE in perspective, the proposed 1.544 megabits/s upgrade in June 1994 of the CA*net backbone is itself a small fraction of the old 10 megabits/s Ethernet standard. 45 megabits/s (T3) is the bandwidth of choice for upgrades to regional networks such as the one now being implemented by WurcNet in Western Canada. T3 will not be available in CANARIE, according to its current plan, until Phase 2 in 1997.

Super*CAN

Super*CAN is Canada's national association for high performance computing. Established in 1987 by Supercomputing Services (SCS) at the University of Calgary, Super*CAN has since grown to include participants from coast to coast.
Super*CAN has no sources of funding, and members of the executive contribute their time voluntarily with the financial support of their employers (for travel to meetings, for example). Since 1987, this association has sponsored a national symposium which has served as a forum to promote high performance computing and communications in Canada.

Research Computing Surveys

An extensive process of consultation with the wider university community was conducted to establish the needs of the research community at Memorial University that currently depends on HPC&C, to open the door for potential new users of the technology (particularly from outside of the traditional science and engineering communities), and to initiate an ongoing process of inviting comments from those who have concerns about the potential effect of the present initiative on other parts of the university community.

MUN Survey

A survey of the Memorial University community, seeking researchers with applications requiring access to high performance computers, was conducted at the request of the Vice-President (Research). [16] This study highlighted many users of HPC who currently conduct their research with difficulty using the resources available on campus, at other universities, in the U.S. or abroad. 34 researchers or groups were contacted regarding their research computing needs. Each researcher was given a questionnaire. 16 submissions were returned: 14 from individuals (representing Engineering, Chemistry, Physics, Computer Science, Medicine, Psychology, and OSC) and 2 from research groups (Earth Sciences and the Condensed Matter Group in Physics), representing a total of 18 researchers. One submission was received from the NRC/Institute for Marine Dynamics.
Of the 16 researchers not returning a submission, 7 indicated that they currently had no need for HPC (researchers in Earth Sciences, Mathematics, Engineering, Medicine) and did not return their questionnaires, 2 expressed interest and/or support (Chemistry, Sociology), and the balance did not respond (Computer Science, Business, Engineering). If the latter two groups are categorized as "no response", a total of 25 of 34 researchers responded to the above survey, a response rate of 74%.

The 16 submissions were separated into 3 categories. Category A (8 researchers) had demonstrated need for access to a supercomputer, based on stated needs for high speed computations and/or large memory requirements. Category B (9 researchers) had needs that could be met by workstations. The balance, Category C (3 researchers), had needs that were difficult to categorize.

Based on the results of the above survey, the following recommendations to the Office of the Vice-President (Research) were made:

1. That a high performance computing facility or capability be established on campus. The facility would support a number of high-end workstation class servers (for example, the fastest machines available for under $100K), and include the appropriate level of infrastructure support in the form of personnel who have expertise in the area of Scientific Computation and/or Visualization.
2. That the national funding agencies (primarily NSERC) be lobbied to establish a good national high performance computing facility and to put in place a high speed network connecting all universities and other research centres.

High Performance Computing (HPC) Technology in Education

In 1992 the Vice-President (Research) at Memorial University commissioned a companion study on "High Performance Computing (HPC) Technology in Education: An evaluation of needs, status and potential for Memorial University of Newfoundland, and potential benefits for education in Newfoundland."
[17] This report complemented the above survey in its evaluation of research computing and its potential utility to both the research and academic infrastructure of the university. This report is attached as Appendix XX. (The report is available to the University through "Gopher.")

The report on HPC in Education was distributed to the University community and responses to the report were invited. Many of the responses observed that the report dealt with only a small part of the question of research computing needs on campus. The responses were nevertheless very useful in indicating the priorities of the wider university community. The following passages summarize the essence of the responses received. For brevity, only selected passages from each author are reproduced (sometimes edited), and these are presented anonymously.

"I am in favour of high performance computing in general because it can help bring Newfoundland and Memorial University into the 21st century. I am very skeptical about the intent of the report and its orientation to education. Given the high cost (of HPC), the university might better be served by ... enhanced networking, connecting all offices to the Ethernet, and encouraging faculty to become technologically literate, functional and productive."

"We should provide access to HPC to those who are able to demonstrate a capability to contribute some share of the costs: the facility does not have to be locally situated. The most important thing is suitable networking. The next most important thing is to be able to bring the excitement of learning through computation into the classroom..."

"The issue of service quality is CRITICAL, and must be addressed before we consider adding to the responsibilities of what seems to be an understaffed (C&C)."

"The members of the Committee ranked HPC as number three on their priority list, with the completion of a campus wide network and computer literacy placed in first and second positions respectively."

"My only concern is a practical (and possibly parochial) one. If large sums of money are dedicated to the development of HPC, it is possible that money might not be available to the Department and other Departments in the faculty to renew and/or develop our own projects."

"I just do not think that a technological fix will provide the type of solutions that we so desperately need."

"If someone needs high end computing, or any other leading edge technology, for their research, it is up to them to go out and get it from external sources."

"If this project is not driven by need it is doomed to fail. Who can't get their work done now without such a facility?"

"My only comments on this are my usual howls about the inadequacy of the social science component."

"I would like to have a HPC facility here because I would use it myself. However, I would rather have an incoming population of high school students who can read and write properly and who have been exposed to some rigorous math, physics and chemistry... It would be very useful to start a national dialogue on the topic of national HPC facilities (if such a dialogue has not already started). I am unconvinced that MUN is the obvious place to house such a facility but we certainly need access to one."

To summarize the concerns and priorities in the above responses, the following conclusions can be drawn:

* Networking is a high priority at the university.
* There exists a general concern that inadequately funded HPC activity could siphon funds from other areas.
* Technology will not by itself provide a solution to Newfoundland's economic situation.
* HPC activity at MUN should be driven by need.
* There is a need for better infrastructure support for research computing.

NSERC Survey

The NSERC Committee on Research Computation performed a survey of its research community regarding the need for high performance computing. Of a total of 7000 NSERC-supported researchers, there were 300 respondents to the survey.
The following table is a summary of the responses. [18]

Total NSERC-supported researchers:   7000
Sample size:                         920
Respondents (response rate):         300 (33%)
Supercomputer users:                 5% (rising to 8% in a few years)

Priorities of researchers:
    Exclusive use of the machine
    Fast turnaround
    Large memory capacity
    Availability of display graphics
    Mass storage
    Interactive graphics

Coding practices:
    66% write own codes or use third-party software
    25% fine-tuned codes for specific machine(s)
    10% used each of vectorization and parallel processing

Consulting needs and assistance:
    83% needed some consultation and assistance in using computer hardware and software from on-campus sources
    47% needed some from off-campus sources

Average expenditures:
    $16K on direct computing activities. Hardware was the biggest expenditure, followed by software and purchase of CPU time, in that order.

Anticipated growth:
    Migration away from mainframes
    Increasing use of workstations and superworkstations
    Increase from 5% to 8% in users of supercomputers

Use of major and minor networks:
    ~66% made use of networks
    ~66% reported email as most common application
    Need for more training and support
    Inadequate bandwidth (for Space/Astronomy and Subatomic Physics)

Research Computing Expenditures at MUN

External funding will be sought to match the research community's own commitments to HPC&C and thereby provide new capability in a coordinated strategy to leverage these significant research commitments more cost-effectively. An informal survey of the major research computing groups on campus was carried out to gauge the approximate level of research computing activity on campus. The idea was not to perform an audit of the individual groups but to arrive at a reasonable estimate of the current scale of research computing expenditures. This approximate figure is then used as a baseline to establish a reasonable scale for the present proposal.
For this initial survey, the major research groups or departments engaged in computing activities that could be labelled as research oriented were the Oceanography and Condensed Matter Groups in Physics, and the departments of Earth Sciences, Chemistry, Engineering and Computer Science. Of these groups/departments, only Oceanography, Condensed Matter, Earth Sciences and Engineering had computer system(s) that could be labelled as research computing servers. Of the major computer systems, the Convex C-1 in Earth Sciences had the highest purchase price (in 1986), but this system was deemed to have been fully depreciated by FY93 and FY94. Nevertheless, this system still contributes significantly to expenditures through its continued maintenance charges. For the remaining systems, yearly costs were assigned as the purchase price spread over 3 years.

With respect to research personnel, 6 research assistants and 4 more senior research computing specialists were identified whose duties were oriented entirely to research computing support. To preserve a measure of confidentiality for this survey, the research assistants were arbitrarily assigned salaries of $27K and the research computing specialists were assigned salaries of $35K. Of the 4 senior specialists, we excluded the one specialist from Computer Science, since this department did not support a single identifiable server system but rather many more distributed systems. The research assistants were typically paid from research grants and the specialists were typically paid from departmental or university budgets.

A more detailed breakdown of expenditures provided by the research community was used to calculate overhead contributions. These figures included costs of office equipment, furniture, telephones, etc. and were then indexed to salaries. The value obtained from this was 20% of salaries for overheads.
A survey of office space occupied by support personnel was done and similarly indexed to salaries (200 sq.ft. for $60K of salary), with the space valued at the standard rate of $12/sq.ft./yr (yielding 4% of salaries). Benefits were calculated at the standard rate of 15%. Total overheads then come to 39% of salaries.

The Oceanography group provided a detailed breakdown of power used by their compute server (4.8 kW), workstations (10 at 3.0 kW) and X terminals (15 at 1.0 kW). We arbitrarily assumed that this group represented 25% of the power requirements of research computing on campus. Further uncertainty is introduced by not knowing what fraction is sustained power consumption. Computers are generally left on 24 hrs/day but monitors can be switched off outside of working hours. It was thought reasonable to estimate the power as being consumed only during a standard 8 hr day, 240 days per year. Thus 4 major consumers at 50 kW each, 8 hrs/day, 240 days/yr and $0.06/kW-hr give a total of $23K/yr in power expenditures by the research computing component of the university.

Not included in this survey are the contributions to research computing in equipment and technical support provided by the Department of Computing and Communications. The FY94 contribution exceeds $130K. The following table is a summary of these activities.
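Before turning to the summary table, the overhead and power estimates above can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and it assumes that the workstation and X terminal figures (3.0 kW and 1.0 kW) are per-unit values, since that reading reproduces the roughly 50 kW per major consumer used in the text.

```python
# Sketch of the overhead and power estimates; all inputs are taken from the text.

# Overheads, expressed as fractions of salaries
office_overhead = 0.20                    # office equipment, furniture, telephones
space = (200 * 12) / 60_000               # 200 sq.ft. at $12/sq.ft./yr per $60K salary
benefits = 0.15                           # standard benefits rate
total_overhead = office_overhead + space + benefits

# Oceanography group load in kW.
# ASSUMPTION: the quoted workstation and X terminal figures are per unit.
ocean_kw = 4.8 + 10 * 3.0 + 15 * 1.0

# Campus-wide estimate: 4 major consumers at roughly 50 kW each
consumers = 4
kw_each = 50
hours_per_year = 8 * 240                  # 8 hr day, 240 days per year
rate = 0.06                               # dollars per kW-hr
annual_power_cost = consumers * kw_each * hours_per_year * rate

print(f"space overhead:    {space:.0%} of salaries")
print(f"total overhead:    {total_overhead:.0%} of salaries")
print(f"Oceanography load: {ocean_kw:.1f} kW")
print(f"annual power cost: ${annual_power_cost:,.0f}")
```

Under these assumptions the calculation reproduces the 4% space figure, the 39% total overhead, and a power bill of about $23K per year.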
FY93 And FY94 Research Computing Expenditures ('000s)

                                            FY93    FY94
Cash Disbursements
  Salaries                                   267     267
  Benefits (@15% of salaries)                 40      40
  Overheads (@20% of salaries)                53      53
  Space (@4% of salaries)                     11      11
  Computer Equipment                         134     154
  Power                                       23      23
  Maintenance                                 88     100
  Software                                    46      55
  Communications                              24      24
  Taxes (PST+GST)                             28      32
  Miscellaneous
Total Disbursements                         $714    $759

Financing
  Memorial University (In-kind & budget)     175     175
  Research Grants                            509     554
  User fees or contributions                  30      30
  Other Funding
Total Financial Contributions               $714    $759

The conclusion that can be drawn from the above figures is that there is a substantial commitment to research computation by the MUN research community. The majority of the expenditures are on non-computer related items, with research support personnel accounting for the biggest fraction of the total. If one includes the value of the research programs themselves being leveraged by the available computing power, the expenditures on computing resources are extremely modest in relation to the total value. Since total investments in computing technology are modest, overall benefits delivered to the research programs are very sensitive to the performance of the computing technologies.

Needs

This proposal to enhance the computational and communications infrastructure of the province is driven by several distinct local, regional and national needs. These can be summarized as:

1. Developing, promoting and protecting investments in people. A high priority area is developing the infrastructure that encourages highly skilled individuals to remain in the university community and in the province.
2. Supporting local researchers at Memorial University whose research programs are severely bottlenecked at the computation stage, with turnaround time measured in months and, in one case, years.
3. Continuous evaluation of technology options to identify the most cost-effective solutions to keep researchers at or near the forefront of technological developments.
4. Need to transform the Newfoundland economy from its primary resource based industries to a more diversified economy that includes information technology and knowledge-based industries.
5. Need for an Atlantic Canada regional initiative in HPC&C to match similar developments in Western Canada, Ontario and Quebec.
6. Need for a national strategy in HPC&C to eliminate geographic and economic disparities. The current lack of a coherent national strategy strongly impacts on the economic development of smaller provinces like Newfoundland.
7. Need for attitudinal change.

Career Prospects Bleak for Research Personnel

One of the core values of this proposal is a higher regard for issues relating to personnel than issues of technology. The career prospects of many Ph.D. graduates and research computing professionals are bleak despite the fact that these individuals represent some of the most talented and highly skilled members of our work force. One of the skill areas that many graduates in the pure and applied sciences and engineering acquire over many years of training is a high level of skill in the application of high performance computers to solving research problems. Frequently, these talented individuals can obtain no permanent employment in their fields of expertise but, rather, must subsist on a year-to-year basis with contractual employment in the university environment, either through post-doctoral fellowships or through low-paying research assistantships. These individuals comprise a sub-class of the academic research infrastructure that isn't generally visible or appreciated by society at large, yet many highly valued research programs could not proceed without this support.
A fraction of this community of highly skilled but low-paid personnel provides substantial infrastructure support to research computing at MUN. To the individual researchers whose research funding is used to pay their salaries, they make an indispensable contribution to the computational research being conducted at Memorial. Although the salary of each individual may be low, these salaries are a significant fraction of many researchers' total grant funding. Frequently, these individuals abandon these low-level positions whenever an appropriate opportunity arises, with the effect that the substantial training costs incurred by the researchers are then lost. The lack of any career path for these highly trained professionals leads to a self-defeating employment cycle.

There is a need to raise awareness of the value of the research computing support staff. The dependence on these individuals in both the research and educational areas of the university is great and is increasing. The present system of contract employment is not a dependable mechanism, since critical areas in both research and education can be severely compromised if key individuals suddenly leave. Research programs alone at MUN approach a million dollars per year in value, and much of this value hinges on the support of a few individuals. In the research community, research computing staff working under contract do not enjoy the same level of remuneration, professional recognition or job security that is enjoyed by those employed in supporting administrative computing at the Department of Computing and Communications, for example.
Inequity in the present system is causing a brain drain from the research computing community into other parts of the computing community. This is creating unnecessary tensions that will compromise future endeavours, such as the present initiative, that depend on creating and sustaining a collegial relationship between all sectors of the university community.

There is therefore a need to implement a system of job classifications for research computing staff that parallels the classifications of C&C staff but also recognizes the unique and distinctly different roles these research computing staff play in the support of computational research at the university. The challenge is to incorporate a classification structure that recognizes both experience and educational background, incorporating both the graduate in Computer Science and the Ph.D. in Chemistry.

Such a system was found to be necessary at the University of Toronto in the early years of OCLSC. At OCLSC, the need was established when the decision was made to employ Ph.D.s to provide the highest level of research support. Without an appropriate level of professional recognition for the unique qualities of both a research background and the highest level of expertise in a wide range of computational skills, it would not have been possible to staff the facility and provide the level of support required. At that facility, the position of Computational Scientist was created and incorporated within the job classification scheme of the university. Once implemented, this system worked well in retaining valued staff at the centre. Although the exact classification scheme would still need to be devised, other candidate positions could be Research Computing Specialists, Visualization Specialists, Scientific Programmer/Analysts, and so on.
The value of introducing a system of job classifications for research computing staff at Memorial lies in the university's recognition of the value of these professionals through the provision of well-defined professional positions and a visible career path. Current contract positions employing valued research support staff could become permanent positions. This initiative seeks to attract a significant level of external funding to build on the foundation that the present research computing staff represent. This cannot be done if the foundation within the university is structurally unsound. None of the additional staff in this proposal are intended to replace any current positions; this would be counterproductive. On the contrary, it is essential that this proposal build on and extend the existing infrastructure of the research community. It is universally acknowledged that computing has moved away from a centralized model to a distributed model. It is necessary for human resource policies to make a similar shift.

Computational Research At Memorial University

The NSERC CORC report emphasizes that "The most critical need for the researcher is to get computational results back quickly enough that the entire research cycle of pre-processing, computation, post-processing, analysis, pre-processing, etc., is not so severely bottlenecked at the computation stage, that either the research suffers or the scope of research has to be restricted." [19]

A significant number of researchers at MUN are conducting research on problems that can be classified as "grand challenge" problems. These are applications in computational chemistry, rational materials design, oil reservoir modelling and global ocean/climate modelling. These researchers must compete with colleagues in the U.S. and abroad who have ready access to a full range of computational capability, from desktop workstations to the most powerful supercomputers.
The research programs currently being conducted at Memorial are described in a companion document entitled "Research Projects at MUN." The following table identifies some of the research programs currently being attempted with available technology. The extremely long turnaround times in some cases (measured in months) have arisen because many of these applications were forced onto workstation platforms when the centres in Toronto (CRAY X-MP) and Calgary (Cyber 205) closed. Most such research programs have simply been abandoned or suspended until the appropriate computing capability becomes available.

Principal     Discipline     Application                             Platform             CPU Time
Investigator                                                                              (typical job)

Bass          Engineering    Ship dynamics and vortex dynamics       DEC 5000             >100 hrs

Brooker       Chemistry      Ab initio quantum chemistry with
                             MUNGAUSS, G90, AMBER

de Bruyn      Physics        Study of phase transitions, chaotic     SGI R4000            Not feasible on
                             and ordered states                                           current machines

Greatbatch/   Physics        Circulation and variability in the      SGI R3000 (6 cpus),  >1 mth
de Young                     North Atlantic Ocean: modelling and     IBM RS/6000 320
                             data studies, global ocean modelling

Lamb          Physics        Internal gravity waves and stratified   CRAY X-MP            Not feasible on
                             flow over topography                                         current machines

Lagowski      Physics        Rational materials design               CRAY X-MP,           >5 mths
                                                                     SGI R4000

Lines         Earth Sciences 3-D oil reservoir modelling             Convex C-1,          >1 mth
                                                                     SUN SuperSparc

McNamara      Educational    Rendering of graphics images for        486/66 MHz           1 day/frame
              Technology     slides, video and TV (30 frames/second)

Miminis       Computer       Inverse eigenvalue problem (pole
              Science        placement) in Control Engineering,
                             numerical simulation in cancer research

Orr           Medicine       Molecular modelling of small steroids
                             to study steroid chemistry. Large
                             macromolecules involved in steroid
                             metabolism and transport

Pickup        Chemistry      Ab initio calculation of the                                 Not feasible on
                             electronic structures of conducting                          current machines
                             polymers

Poirier       Chemistry      Ab initio computational chemistry,      CRAY X-MP,           >1 yr
                             full gradient geometry optimizations    SGI R4000
                             of transition states

Rabinowitz    Psychology                                             SUN Sparc10

Richardson    Medicine       Molecular modelling of polysaccharide
                             structures

Sharan        Engineering    Neural networks in robotic manipulations

Swamidas      Engineering    Fatigue crack initiation and
                             propagation in planar and multiplanar
                             joints, detection and prediction of
                             cracking in structures

Taylor        Biochemistry   Molecular modelling and dynamics of     x486                 Days or weeks
                             peptides and lipids, rational drug
                             design

Whitehead     Physics        Magnetic properties of high             SGI R4000
                             temperature superconductors

Whitmore      Physics        Equilibrium statistical mechanics of    SGI R4000
                             phospholipid-water systems and
                             polymer blends

Technology Assessment

An important area of need is the continuous and ongoing evaluation of technology options to identify the most cost-effective solutions to keep researchers at or near the forefront of technological developments. A preliminary technology assessment is given in Appendix 2. [20] Rather than being a review of current technology, which would be out of date almost before this proposal could be released, the technology assessment reviews the important architectural features of a wide range of modern high performance computing platforms, including the reduced instruction set computer (RISC), the pipelined ("vector") computer, and parallel computers. More importantly, the technology assessment includes a suite of benchmark codes, extracted from research applications at MUN, that measures the performance of research computers at the University. The benchmarks highlight features of systems that could enhance or hamper the performance of high performance computers on researchers' applications.
As experience with these systems increases, the benchmarks being performed will themselves become more sophisticated. It is important to note that performance analysis is an emerging discipline in its own right and is undergoing continuous development and refinement. The motivation behind the development of a benchmark suite is to introduce some measure of objectivity into an area in which scientific rigour has traditionally been sorely lacking. The marketing hype surrounding high performance computing is enormous. Since high performance computers represent the leading edge of technology developments there is little information available apart from the claims of vendors and marketing brochures on which to base a purchasing decision. Another important conclusion that can be drawn from the technology assessment being performed is a strategy for enhancing performance. Although the performance can obviously be enhanced by purchasing a faster computer (yielding a factor of 2 or more), it may not be so obvious without a performance analysis that one may be able to attain similar or greater performance improvements through changes in the code itself through optimization techniques or calls to appropriate subroutine libraries. Preliminary Observations On the basis of a small number of research codes which were instrumented to measure absolute performance, it was possible to make the following tentative observations: 1. Research computing expenditures at MUN are approximately $700K-$750K per year. With a total of 14 CPUs at 5000 cpu-hours/year each, the average cost for a cpu-hour on campus is $10/cpu-hour. If only the purchase price amortized over 3 years is considered, the average "machine" cost drops to $2/cpu-hour. 2. The average performance of the few research codes at MUN studied to date can be well approximated as half the Linpack DP 100 by 100 performance rating of the computer. 3. 
The average cost/performance of research codes on local microprocessor-based servers at MUN for the period 1990-93 at the full cost of $10/cpu-hour and half the weighted Linpack performance rating of 7.5 Mflop/s is not significantly less than the cost/performance of a CRAY Y-MP8 at a market rate of $100/cpu-hour operating at its measured average performance levels of 70 Mflop/s. 4. Given two computers of equal cost-performance, the computer with the faster turnaround (this isn't necessarily the computer with the higher performance!) leverages research investments more effectively; i.e. delivers higher benefit per unit cost. 5. Paying a premium cost for a computer to obtain faster turnaround can be justified, but only when the costs of the research programs being leveraged are higher than the costs of the computer. 6. A computer delivering a factor 10 faster turnaround time at 10 times the cost delivers higher benefit per unit cost for only 2-3 years (its effective "lifetime"), assuming alternative computers are doubling in performance per unit cost every year and assuming the value of the research programs being leveraged is at least equal to the value represented by a principal investigator's salary. 7. An overloaded computer is instantly obsolete. The rationale is that, with computer performance doubling every year, a stretch time (defined as the ratio of wall-clock time to cpu-time) of 4 is equivalent to 2 years of depreciation. 8. In the absence of saturation and assuming that the value of the research can be related to the performance of the available technology, maximum benefits per unit cost are obtained with a policy of continuous technology renewal; "big" computers on a 2-3 year cycle and "small" computers as frequently as possible (at least every 6-12 months). 
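The arithmetic behind observations 1, 3 and 7 can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the technology assessment; the function names are hypothetical. It also makes one assumption that the text leaves open: the 7.5 Mflop/s in observation 3 is read as the performance actually delivered to research codes (i.e. already half of the Linpack rating).

```python
import math

# Figures quoted in the observations above.
NUM_CPUS = 14
CPU_HOURS_PER_CPU_PER_YEAR = 5000


def cost_per_cpu_hour(annual_spend, cpus=NUM_CPUS,
                      hours_per_cpu=CPU_HOURS_PER_CPU_PER_YEAR):
    """Observation 1: full annual cost spread over all available cpu-hours."""
    return annual_spend / (cpus * hours_per_cpu)


def dollars_per_mflop_hour(rate_per_cpu_hour, delivered_mflops):
    """Observation 3: hourly rate divided by delivered Mflop/s."""
    return rate_per_cpu_hour / delivered_mflops


def depreciation_years(stretch_time, doubling_period_years=1.0):
    """Observation 7: a stretch time S costs log2(S) doubling periods."""
    return doubling_period_years * math.log2(stretch_time)


if __name__ == "__main__":
    # $700K per year over 14 CPUs x 5000 cpu-hours each.
    print("campus rate ($/cpu-hour):", cost_per_cpu_hour(700_000))
    # ASSUMPTION: 7.5 Mflop/s is the delivered performance of local servers.
    print("local servers ($/Mflop-hour):", dollars_per_mflop_hour(10, 7.5))
    print("CRAY Y-MP8 ($/Mflop-hour):", dollars_per_mflop_hour(100, 70))
    print("years lost at stretch time 4:", depreciation_years(4))
```

Under that reading the two cost/performance figures differ by less than 10 per cent, which is consistent with the statement that the local servers are "not significantly less" costly per unit of delivered performance.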
For many, it may be surprising that, at least for the period 1990-1993, we are still in a state of transition between centralized mainframe and distributed microprocessor systems where we have not yet realized either substantial cost/performance savings or increased benefits per unit cost, particularly when a full cost hourly rate is calculated. The faster turnaround, if this is delivered, favors the mainframe environment. The higher level of responsiveness of local computer support and the higher level of local control is favored in the distributed environment. The reason for this is that it is not sufficient for a mythical computer to deliver 1/10th the performance for 1/100th the cost since the majority of research expenditures are not computer related. It is debatable whether a mythical computer delivering the same unit of performance for 1/10th the cost is truly cost-effective, for the same reason. The mythical computer will deliver increased benefits when we can derive 10x the computing for the same unit of cost. The rate at which microprocessor based technologies are overtaking mainframe based technologies is moderated by the pace at which mainframe technologies are also advancing. The Y-MP technology alluded to above has long since been superseded by new technology (the C90 introduced in 1992) that is 2.5 times more powerful per processor with a factor of 2 increase in the number of processors. Regardless of the technology, one trend is clearly emerging: the trend towards increased levels of parallelism. Emerging technology now promises significantly increased performance (nearly a 1989 Y-MP equivalent of computing capability) per CPU for the same level of cost as the current generation of technology. If the downsizing revolution is to be truly culminated, there is a need to aggressively move to these new technologies that provide significantly higher benefits per unit cost or we will fail to realize the favorable cost-benefits of these technologies. 
Migration to Parallel Computing In 1991 a paradigm shift in computer architectures occurred when parallel computers entered the mainstream of high performance computing technology. [21] Accompanying this shift is the potential for a corresponding "revolution" in computational science. All of the NSF Centres now feature parallel machines to supplement their traditional HPC mainframes. Two of the Centres (NCSA and CTC) have indicated that they will be phasing out their traditional platforms in favor of parallel platforms over the next two years. Any transition to parallel computing is risky. The technology is immature and the technical challenges are great. There is little or no prior experience to draw upon in evaluating the technology. Parallel computers are easy to build and there is a plethora of parallel computer vendors, many more than the current market can support. There is no indication that more than a small fraction of the current crop of parallel vendors will survive. On a small number of real world benchmarks, parallel machines are only beginning to be cost-effective relative to mainframes. [22] No standard parallel architecture has yet emerged, meaning that applications on one parallel platform are not generally portable to other parallel platforms. Cost-effectiveness of parallel systems is highly application-dependent. Nevertheless, parallel computing is universally regarded as the next evolutionary step in high performance computing. The Promise of Parallel Computing The accompanying figure illustrates the promise of parallelism. [23] Difficulties With Parallel Computing Early experience with the use of parallel computers has shown that it is a difficult technology. Woodward et al. summarize the difficulties this way: [24] "The software issues arising in using scalable parallel systems are as immense as the performance potential of these systems. 
Scalable parallel processors require entirely new classes of algorithms from those that have been successful on vector processors. Problems are solved in very different ways. The performance achieved with parallel processors is highly sensitive to software details to an extent almost unimaginable by someone who has not had direct experience with these systems. These software details affect the programmer, the compiler, the runtime support, and the operating system. One area of particular importance is the mapping of the program to the processors and memories of the machine. Parallel architectures differ from one another to a much greater extent than do vector architectures. Thus, not only are new algorithms required, and not only are their efficient implementations extremely subtle, but the porting of these applications between architectures, for example, as one architecture supercedes another, is difficult. *These software issues can only be addressed by a coordinated, multidisciplinary effort.* (our emphasis)" Transforming Newfoundland's Economy According to the Annual Report of the Economic Recovery Council (ERC), "Newfoundland and Labrador is at a crucial crossroads in its economic development." [25] It has been recognized for many years that Newfoundland must diversify and transform its economy from one that is natural resource based to one based more in the information technology and knowledge-based industries. According to the ERC, transforming the Newfoundland and Labrador economy requires "first and foremost that we strengthen our human resource base through education and training and through the timely acquisition of new knowledge and technology." [26] Furthermore, the ERC advocates that "as much capital expenditure as possible be devoted to what we can call 'strategic infrastructure.' By this we mean infrastructure that can contribute to long term economic development." 
One example of strategic infrastructure given by the ERC was "improving the communications links between the province and other parts of the world." [27] The importance of computing and communications as strategic infrastructure has also been recognized by the new federal Liberal government. The Government's focus on creating jobs through its $2 billion program in building up the infrastructure of the country now includes provisions for building up the information infrastructure of the country. An Atlantic Canada Regional HPC&C Initiative? If one compares expenditures in high performance computing and communications in Atlantic Canada with Quebec, Ontario and Western Canada, it is immediately evident that the development of the high performance computing and communications infrastructure in Atlantic Canada seriously lags behind developments in the rest of the country. The following graph summarizes the regional expenditures on HPC&C in the four regions of the country as outlined in the historical survey: Note that in this chart, we include only the following contributions:
Western Canada: Actual $37M (HPCC); Proposed $12M (WurcNet)
Ontario: Actual ?; Proposed $29.4M (Approved by gov't)
Quebec: Actual $40M (CRIM over 5 years), $12M (CERCA over 5 years); Proposed ?
Atlantic Canada: Actual $1.6M (Dalhousie over 2 yrs); Proposed ?
Not included in the above are contributions due to provincial or federal centres of excellence which contribute to the HPC&C infrastructure of Ontario and Quebec (although CRIM in Quebec has most of its emphasis in the IT sector, not HPC&C) or the contribution of Research Computing British Columbia in B.C. It should also be noted that the above are 5-6 year commitments currently in place or proposed. The only exception is the Atlantic Canada contribution at Dalhousie where the $1.6M in funding expired in 1993 (hence the actual current commitment to HPC&C in Atlantic Canada is zero). 
This proposal is based on Memorial University initially being the hub of an expanding sphere of activity extending first out into the local community, then elsewhere in the province and ultimately, when a solid foundation has been laid, building linkages elsewhere in Atlantic Canada and across the country. In our view, the ideal model would be one where similar initiatives elsewhere in Atlantic Canada would be similarly supported with an eye towards the stronger integration of each component into a well-identified regional activity. Only such a regional initiative could justify lobbying for levels of support for HPC&C equivalent to those currently enjoyed elsewhere in Canada. In Newfoundland, local industry does not have any local base of support in the development of information and knowledge based technologies that corresponds with the well funded "centres of excellence" in other parts of Canada. Canada Lacking In Coordinating A National Initiative In the U.S. the High Performance Computing and Communications Act of 1991 is promoting a National Information Infrastructure (NII). With $2,500 million over 5 years, the U.S. is emphasizing information technology as the cornerstone of its economy in the 21st century. Similar levels of funding underlie the long-range infrastructure development of Europe and Japan. The U.S.'s National Science Foundation Supercomputing Centres program has been hailed as a great success. These centres have been playing a leadership role in the development of the NII and have spawned similar initiatives at the State level throughout the U.S. In a drive to bring all the nation's supercomputing resources to every researcher's desktop, the four NSF centres have fostered a closer relationship through cooperative agreements and integrated programs to promote the idea of a "national machine room" or "metacentre". The emphasis is balanced between computation and communications. 
The development of such a large collaborative venture is underpinned by strong federal government support through the NII. In Canada there is currently no government program to integrate the nation's computational resources as part of a corresponding initiative. Dr. Larry Smarr, Director of the National Centre for Supercomputing Applications at Urbana-Champaign, Illinois, commented prior to Supercomputing Symposium '93 in Calgary that "Canada is at least five years behind the United States in high performance computing." [28] No Canadian Facilities Are Internationally Competitive The authors of NSERC's proposed National Initiative for Scientific Computing observed that "None of these facilities (in Toronto or Calgary) or approaches have provided Canada with an internationally competitive supercomputing facility, and there is legitimate concern that Canadian scientists and engineers who require this type of computing capability do not have secure access to appropriate computational facilities." [29] Of the three existing academic supercomputing centres at that time, considered by the CORC as not being "competitive" with their counterparts in the U.S. and around the world, two have since been closed. According to the CORC Report, the problems at the Ontario Centre for Large Scale Computation and SuperComputing Services at the University of Calgary were due in part to "budgetary problems" and "uncertainty of funding." [30] Personnel Most Significant Part of Infrastructure Support One of the benefits of technological change is that the power of the traditional mainframe supercomputer (the 1977 vintage CRAY-1 or 1982 vintage CRAY X-MP for example) is now available on (or beside) the desktop for a cost that is now less than the salary of the person sitting at that desk. 
One of the drawbacks of these developments is that this desktop technology requires levels of support similar to those the previous mainframe technology required and, because of the increased diversity of technologies available, the demands for infrastructure support are actually increasing. The NSERC CORC Report examined the question of infrastructure to support research computing in universities and observed that support personnel were the most significant component of infrastructure support: [31] "The most obvious components of infrastructure support are hardware maintenance, hardware for connecting systems to local networks, the local networks themselves, licence fees, software upgrades and support personnel. ... The most significant component of this list in terms of both cost and competitive advantage, however, is personnel. ... Much of the desk-top computing equipment acquired by researchers is being under-supported and ineffectively used. ... Possibly worst of all, scientists and engineers sometimes are forced to use their own time (or the time of their students) to perform the functions of system managers and technicians, time that could otherwise be used to conduct productive research activity." For the majority of researchers the capability delivered to their applications by powerful desktop technology has actually increased their research productivity. For a minority of researchers at MUN (and, by extension, across Canada) the closures of previous high performance computing centres have had a severe impact on their research programs, with desktop machines typically now delivering only about 1/10 of the capability they had employed only 2 or 3 years ago with no significant savings in costs. The closure of the academic centres in Calgary and Toronto had another effect: that of eliminating infrastructure support for research computing. 
Many users of these previous centres were introduced to high performance computing for the first time and, with the help of these centres' staff, were able to measure the performance of their applications and receive assistance in optimizing their applications for higher performance. Extensive optimization on the traditional supercomputers also led to improved performance on advanced RISC workstations since these machines employed many of the architectural features of their mainframe predecessors. The majority of researchers in Canada today, and particularly their students, have not been exposed to the high performance machines of the previous centres and have little knowledge of the performance levels or efficiencies at which their applications are currently running. When these researchers attempt to use new-generation technology with far higher levels of pipelining or parallelism in their architectures, their unoptimized codes typically perform "poorly." The most worrisome aspect of this is that most researchers are unaware of the absolute performance of their applications since performance is not measurable on the majority of current architectures. Not knowing the absolute performance of an application can be expensive. The following anecdote illustrates the problem. In the course of instrumenting a benchmark code for the technology assessment part of this proposal (by counting the number of floating point operations by hand and inserting the appropriate timing calls), a highly efficient code obtained a performance level of 1.3 Mflop/s on one machine and 6.0 Mflop/s on a second, newer, machine. The second machine appeared to be a far better buy. But the performance on the first machine was a smaller fraction of its peak performance than expected and, in trying to discover the reason, it was found that a change in compiler switches (a few minutes' work) yielded a performance of 7.4 Mflop/s, an improvement by a factor of almost 6. 
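The instrumentation just described (a hand count of floating point operations plus timing calls around the kernel) can be sketched in a few lines. The following Python example is a minimal illustration under stated assumptions, not the benchmark code used in the technology assessment: the kernel (a simple "a*x + y" loop), the function names and the problem size are all hypothetical, and an interpreted language will report far lower rates than compiled Fortran or C.

```python
import time


def daxpy(a, x, y):
    """Illustrative kernel: y <- a*x + y, one multiply and one add per element."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]


def daxpy_flops(n):
    """Hand count of floating point operations: 2 per element."""
    return 2 * n


def measure_mflops(n=100_000, repeats=5):
    """Time the kernel several times and report the best rate in Mflop/s."""
    x = [1.0] * n
    y = [2.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()      # timing call before the kernel
        daxpy(0.5, x, y)
        elapsed = time.perf_counter() - start  # timing call after the kernel
        best = min(best, elapsed)
    return daxpy_flops(n) / best / 1e6


def speedup(before_mflops, after_mflops):
    """Ratio of two measured rates, e.g. before and after new compiler switches."""
    return after_mflops / before_mflops


if __name__ == "__main__":
    print("measured rate (Mflop/s):", measure_mflops())
    # Figures from the anecdote: first machine, before and after the switches.
    print("first machine speedup:", speedup(1.3, 7.4))
```

Taking the best of several runs is a design choice made here to reduce the effect of other load on the machine; an average or a median would be equally defensible.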
Similar switches were found for the second machine, raising its performance level to 12.7 Mflop/s, an additional factor of 2. Instrumenting a second benchmark code led to the discovery that a public domain C compiler (obtainable free on the network) yielded a speedup of 1.7 times the performance of the vendor's own compiler for which a rather substantial fee was paid. The question is: how many tens of thousands of dollars are being spent in gaining a factor of 2 improvement in performance through the buying of new technology, through lack of knowledge of the potential performance of their current machine? Note that nothing was done with the code itself as yet. Optimization efforts can yield similar or greater performance improvements with very modest expenditures in terms of infrastructure support. The key to higher performance is obviously a balanced strategy of technology procurement and appropriate infrastructure support. Canada Unique In Not Supporting Large Scale Computational Facilities The issue of recruiting and retaining first-class researchers in Canada is a strategic priority at many institutions across the country, particularly at Memorial University which must overcome some measure of geographic isolation. The NSERC CORC Report warns that "Canadian institutions already are losing promising faculty members to the U.S. and other centres." [32] One reason given for the exodus was the observation "Canada is unique among developed countries in not providing substantial support for large-scale computing facilities for scientists and engineers..." [33] The NSERC CORC Report also linked the availability of large scale computation facilities to the quality of Canadian science, observing that "one third of the NSERC Steacie Fellowship holders in the last ten years have been large-scale computer users." 
[34] NSERC Initiative on Scientific Computing Following the release of the NSERC CORC Report, NSERC drafted a proposal entitled "National Initiative on Scientific Computing - A Proposal For Implementation." In particular, recommendation number 4 states: "That NSERC allocate $4.0 million per year for partial support of a small number of Canadian large-scale computational facilities. The funding would be used to help provide the necessary staff and support infrastructure to ensure that the facility can be used effectively by the Canadian research community. The number and level of grants to be awarded will be determined following peer-reviewed competition." [35] The above proposal, more commonly known as the NISC, was subsequently approved by NSERC Council but, to date, no line of funding has been formally established. The NISC Proposal concludes with the following statements: "The fundamental conclusion reached by CORC relating to large-scale computational facilities (LSCFs) is that certain types of research require ready access to such a resource to be competitive." [36] "The Committee itself concluded that (the research fields requiring this kind of investment) were too important to simply ignore, and that NSERC should ensure that Canadian scientists and engineers had guaranteed access to at least one state-of-the-art facility." [37] At present, no such facility exists in Canada. Two previous university-based facilities in Calgary and Toronto closed in 1991 and 1992 respectively. The pace of technological change has been so great, particularly in the increased performance of desktop machines, that the consequences of the complete lack of a national strategy in HPC&C have been moderated by the ability of individual researchers to use their operating grants or to obtain equipment grants from their granting agencies to purchase their own personal computing resources. 
However, it appears that even this trickle of funding for computational support may be drying up. Of four recent equipment grant applications for upgraded computing resources in the Departments of Physics and Chemistry at MUN, for example, none were funded. Canada Falling Behind In Transition To Knowledge-Based Economy In a memo to university presidents outlining NSERC's strategy for the next five years, the President of NSERC observed that "Canada's transition to a knowledge-based economy ... has fallen behind that of our major competitors." [38] NSERC's President linked technical literacy with the country's future prosperity and noted the mounting recognition on the part of the public that Canada's position relative to its competitors has been deteriorating. "Public anxiety over Canada's competitiveness and the level of technical and scientific skill in the country's workforce is mounting, with good reason, since never in our history has a technically-literate population been more crucial to our prosperity." [39] The NSERC Strategy Document distributed to university presidents outlined the essential building blocks for a knowledge-based economy as: 1) coherent research and development strategy, 2) research and training programs, 3) optimal use of major national facilities. In the context of the present initiative in high performance computing and communications, Canada is seriously lacking in all three areas. Canada cannot hope to muster the resources that would be necessary to emulate the U.S. initiatives in high performance computing and communications without an unprecedented level of collaboration and cooperation in all regions of the country. It can be argued that CANARIE is Canada's response to providing a comprehensive strategy on the communications side. A drive at the grass-roots level has promoted some movement towards developing a Canadian equivalent of the U.S.'s "national machine room" concept. 
Some of these developments provide the context for any new initiative, such as one based in Newfoundland. Canadian Meteorological And Climate Centres Canada enjoys world-class stature in the application of high performance computing to weather prediction and climate modelling. The Canadian Meteorological Centre (CMC) has been Canada's premier centre for high performance computing, competitive with any centre in the world. With a budget of approximately $40 million [40] per year, the CMC has been able to acquire state-of-the-art high performance computers. In 1991, the CMC leased an NEC SX-3/44 machine with a peak performance of 22 Gflop/s. Although this machine is devoted to its primary mission of weather prediction and is not generally open to the university research community, some researchers in the academic ocean and climate modelling community are able to obtain some time on this machine. The lease arranged with NEC is very aggressive, with substantial upgrades every 2-3 years over a span of 7 years which guarantees that AES will be at the forefront of technology over the lifetime of their contract with NEC. Infrastructure support for the CMC SX-3 is less than ideal for researchers not directly supporting the operational models. Because of the mission critical nature of the CMC's mandate, all applications other than the operational weather model are of lower priority. This includes some departments even within AES itself; for example, the Canadian Climate Centre (CCC). For external organizations that have strong ties to AES, such as the Bedford Institute of Oceanography, the lower priority accorded their applications and the lack of infrastructure support makes using this machine more difficult. Government restructuring has led to the relocation of the CCC from Toronto to Canada's west coast. 
An important reason behind the move is to integrate the CCC's Global Climate Model (GCM) with ocean models developed by a strong ocean modelling community at the University of Victoria. The goal is to predict long-term global climate trends (such as global warming and the ozone hole) with increased accuracy. The negative effect of the CCC move to the west coast is the potential that the strong ocean modelling community in St. John's will become increasingly isolated from the mainstream of development of coupled global ocean/climate models in Canada. There is currently strong collaboration between the Oceanography group at Memorial (Greatbatch, deYoung, Lamb) and the University of Victoria group that must be maintained. The need is to provide equivalent levels of support to the east-coast ocean modelling community as will be enjoyed by the west coast community. Stronger network links such as those that might be provided by CANARIE could facilitate collaboration between the east and west coast ocean communities. A strong computational platform for the continuing development of the coupled climate/ocean models at Memorial could also promote a peer relationship with the new west coast based CCC. Geographic Economic Disparity And Computer Networks Regional need has driven the implementation of 45 Mbits/s (T3) links in some regions of our country today (e.g. WurcNet in Western Canada). It has been only days since fibre-optics became available from coast to coast, but the capabilities of this new medium and the almost limitless communications bandwidths this offers are driving these exciting developments. One of the dangers of this accelerated progress in CANARIE is that there is no corresponding move to eliminate the geographic disparities in network communications with an appropriate network topology. The consequences to Newfoundland could be severe if high bandwidths are available in some regions (i.e. Central or Western Canada) and not in others (Atlantic Canada). 
Key networking technologies in Newfoundland in distance learning and telemedicine, for example, are state-of-the-art products developed in Newfoundland that could establish a significant Newfoundland presence in Canadian networking. Newfoundland currently is located far from the "centre" of the network and economic opportunity. There is no technical reason why communications links between Newfoundland and the mainland need to be constrained by geography as they are today. It may well be that maximum economic opportunities for the province are given by direct links into major metropolitan markets like Montreal, Toronto or Calgary. For example, British Columbia enjoys a direct connection to Toronto. From the point of view of high performance computing, Canada's only publicly accessible high performance computing centre is HPCC in Calgary. Access to HPCC from Newfoundland via the present network (56 Kilobits/s) is extremely poor, primarily because the "logical distance" (number of shared communications links) of the computer network between St. John's and Calgary mimics the "geographical distance" between the two cities. Because computer networks are currently shared (like having a "party line" stretching across the country), the network bandwidth available deteriorates significantly with distance. Attitudinal Change The Economic Recovery Council recognizes an additional need for Newfoundlanders, as well as others in the rest of Canada, to modify their perceptions of Newfoundland, which are very damaging to efforts to attract business and investors to the province. Media attention typically focusses on the negative and little attention is given to some exciting areas of development in the province. High technology and the knowledge-based industries are high growth industries throughout Atlantic Canada, yet the general public does not get a sense of the excitement that comes from meeting the challenges in these industries. 
Opportunities Research groups at MUN with the greatest need for computation are seeking increased capability at approximately the same time. There is therefore a window of opportunity for these groups to cooperate in a joint venture to obtain significantly greater computational capability than each could obtain by themselves. In the past year or two several significant technology milestones have occurred. One of the most significant is that the cost of a so-called "CRAY equivalent" has dropped below the costs of infrastructure (i.e. people). The question that then becomes most important is the following: what level of expenditure on computer technology maximally leverages the university's large investment in salaries and infrastructure support for its "computational scientists"? Answering this question leads to a fundamentally different definition of high performance computing than that represented by the traditional mainframe-based supercomputer centres of a few years ago. It would now seem reasonable to find a level for high performance computing expenditures somewhere intermediate between the costs of a PC and total budget expenditures on research programs. Networking enhances the research infrastructure of all sectors of the university community. As high-speed computer networks are also essential to high performance computing, the opportunity to accelerate the development of computer networking at MUN and Newfoundland is a significant component of this proposal. Long-range planning by MUN's Department of Computing and Communications includes a migration to ATM (Asynchronous Transfer Mode) as the carrier of backbone traffic on campus. The installation of glass fibre on campus is well advanced, with more than 20 buildings having access to this new medium. With funding, the eventual migration to an ATM-based backbone extending across the MUN campus and ultimately into the surrounding research and industrial community could be greatly accelerated. 
With the closures of SCS in Calgary and OCLSC in Toronto, there are no longer any nationally accessible university-based high performance computing centres in Canada. An initiative based in Newfoundland could stimulate a pan-Atlantic Canada collaboration which, with an initiative similar in scope to that of WurcNet, could then mount a credible bid to NSERC for a level of funding ($1M/yr) similar to that proposed in the NISC. Spencer Hall, recently acquired by the university, is to become the focus of the university's efforts to build a bridge between the university research community and the private and public sectors. The building will be the new home for the university's Office of Research, Seabright Corporation and the Canadian Centre for Fisheries Innovation. The building will also include provisions to support a "technology cluster" (or incubator) aimed at high-technology companies, with emphasis on biotechnology, computers and communications, environmental services, medical devices and ocean technologies. [41] The proposed location will ultimately become the hub of high technology activity in Newfoundland and could be an ideal location for the facility outlined in this proposal. There is the opportunity to use computational science to enhance the professional stature and career prospects of those Ph.D. graduates who have acquired high levels of skill in the application of high performance computers to their respective disciplines. The unique qualities these support staff bring to the position of computational scientist are an intimate knowledge of the needs of researchers (having themselves been a part of the research community), extensive experience in computing, and a significantly different attitude to client support. Cross-Disciplinary Co-operation and Collaboration Historically, computational researchers at MUN obtained computer cycles elsewhere to pursue their research programs. 
In recent years, research groups in Earth Science, Oceanography, Chemistry and Condensed Matter Physics have attempted to satisfy their computational needs by acquiring and supporting their own computer systems. These systems quickly were saturated. The useful life of these systems is typically only 3 years. In the case of the Convex C-1 in Earth Sciences, seven years of useful service (bridging two generations of mini-supercomputer technology) has been squeezed from this system through several upgrades to its memory and peripherals. This system is still performing useful work primarily because important software for the department is available on the C-1 and no other platforms. The Silicon Graphics Inc. multiprocessor system in Oceanography is at the end of its useful life and the recent acquisitions by the Condensed Matter Group in Physics will need significant upgrading within a year. All of these research groups at MUN with the greatest need for computation are seeking increased capability at approximately the same time. There is therefore a window of opportunity for these groups to cooperate in a joint venture to obtain significantly greater computational capability than each could obtain by themselves. Technology Opportunities The following figure illustrates the new economics of high performance computing: Traditional so-called "supercomputer" costs are universally perceived to be relatively constant at about $30 million (or $6-7 million per year for 4-5 years). The performance of this class of machines has been growing at a power law with increasing use of parallel processing for a relatively fixed level of expenditure. To justify the expense, very large communities of researchers must be supported on these machines (typically communities up to about 3000 users). Similarly, the costs of people and the costs of personal workstations or PCs are themselves relatively constant. 
Technology developments have yielded significant decreases in costs for a fixed level of performance. When performance is fixed, cost then decreases by a power law. The efficiency of application codes on high performance computer architectures tends to decrease steadily as the performance of these machines increases, primarily because some fraction of almost all codes is unable to take advantage of the high performance features of the new architecture (an observation that has come to be known as Amdahl's law). With cost decreasing as a power law and performance limited by Amdahl's law, the result is that most applications tend to migrate to increasingly cost-effective platforms: first to departmental minicomputers in the early 80s, then to RISC platforms in the late 80s and early 90s and, ultimately, to PCs in the late 90s. The new economics of high performance computing then becomes one of determining what fraction of total budget expenditures is to be invested in ongoing technology renewal which, if one fixes the costs, yields performance improvements at a power law rate and appropriately leverages current expenditures in the research program. It would now seem reasonable to find a level for high performance computing expenditures somewhere intermediate between the costs of a PC and total budget expenditures on research programs. Integrated Voice, Video and Data With ATM Networking enhances the research infrastructure of all sectors of the university community. As high-speed computer networks are also essential to high performance computing, the opportunity to accelerate the development of computer networking at MUN and Newfoundland is a significant component of this proposal. The MUN backbone was originally planned to be upgraded to support "Fast Ethernet" technology at 100 Mbits/s.
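As an aside on the Amdahl's law argument made earlier in this section, the following minimal sketch shows why efficiency falls as machine performance rises. It is illustrative only: the function names and the 90% figure are assumptions chosen for the example and are not taken from any MUN application code.

```python
def amdahl_speedup(accelerated_fraction: float, feature_speedup: float) -> float:
    """Overall speedup when only part of the run time benefits from new hardware.

    accelerated_fraction: share of the original run time (0..1) that can use
        the high performance features (vector units, parallel processors).
    feature_speedup: how much faster that share runs on the new architecture.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie between 0 and 1")
    if feature_speedup <= 0.0:
        raise ValueError("feature_speedup must be positive")
    unaccelerated = 1.0 - accelerated_fraction
    return 1.0 / (unaccelerated + accelerated_fraction / feature_speedup)


def efficiency(accelerated_fraction: float, feature_speedup: float) -> float:
    """Fraction of the hardware's peak gain that the application realizes."""
    return amdahl_speedup(accelerated_fraction, feature_speedup) / feature_speedup


# Hypothetical code in which 90% of the run time can exploit the new features.
for peak in (2, 10, 100, 1000):
    print(f"peak x{peak}: speedup {amdahl_speedup(0.9, peak):.2f}, "
          f"efficiency {efficiency(0.9, peak):.1%}")
```

Under this assumption the overall speedup can never exceed a factor of 10, however fast the new features become, which is the sense in which expensive peak performance is progressively wasted on most codes.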
Long-range planning by MUN's Department of Computing and Communications includes a migration to ATM (Asynchronous Transfer Mode) as the carrier of backbone traffic on campus. The installation of glass fibre on campus is well advanced, with more than 20 buildings having access to this new medium. With funding, this "fast ethernet" upgrade could be skipped in favor of going directly to an ATM-based backbone extending across the MUN campus and ultimately into the surrounding research and industrial community. What is ATM? Asynchronous transfer mode, or ATM, is a technology that promises to integrate voice, video and data in a common format. The technology promises to provide a wide range of opportunities in the development of multimedia applications, computer-supported collaborative work, as well as local and wide area computer networking. High speed networks are essential to support the distributed computing model of high performance computing as well as the client-server model. As bandwidths of communications increase and the latencies decrease, applications that can exploit multiple computing platforms can be more tightly integrated. CANARIE CANARIE promises, through high speed computer networks across the country, to provide researchers in Newfoundland with improved access to research facilities across the country. MUN is now a member of CANARIE and intends to participate fully in this national initiative. This proposal provides a means of supporting local researchers using remote facilities across the country wherever these may be available and accessible. CANARIE will be the mechanism supporting the emergence of a Canadian "national machine room." Division of Educational Technology [42] The trend today in communications is to talk about the converging technologies. Speculation is that television signals will be delivered on the same electronic pathway as voice and computer data. Signal "transmission" as we know it today will disappear. 
While much attention is being paid to the technical implications of such communications convergence, little attention is being paid to the implications these changes will have on the creative aspects. At MUN's Division of Educational Technology, the quality of the product is still paramount, regardless of the method of delivery. Computer graphics is the area that seems to be the most affected. The digital environment has greatly enhanced the role of the computer in graphic arts and animation. The division of Educational Technology is fortunate in having two of the finest graphic artists in Newfoundland and the work they are producing is outstanding. The methods and the computer systems they utilize have changed dramatically in just the past few years. Using graphic arts as an example, there is an opportunity to provide more training for those involved in the creative aspects. The Division could provide space for up to three "apprentices" or "interns" to work with our graphic artists. After a year of working with our staff in the wide variety of activities that Educational Technology is involved in, including scientific visualization, any graphic arts specialist would no doubt find immediate employment. Such a project should not be expensive. Funding is required to finance the students. Space and facilities could be provided by the Division. In conclusion, nurturing the creative development of our human resources is necessary to take advantage of technological change, else we will miss out on the primary benefits of such developments. National Initiative for Scientific Computing (NISC). With the closures of SCS in Calgary and OCLSC in Toronto, there are no longer any nationally accessible university-based high performance computing centres in Canada. 
An initiative based in Newfoundland could stimulate a pan-Atlantic Canada collaboration which, with an initiative similar in scope to that of WurcNet, could then mount a credible bid to NSERC for a level of funding ($1M/yr) similar to that proposed in the NISC. Unfortunately, the closure of the Calgary and Toronto centres removed any coordinated organizational response to the window of opportunity opened by the NISC proposal. Despite the fact that both the NSERC CORC Report and the NISC were approved by NSERC Council, no funds were secured for the program. The authors of the NISC proposal, one of whom was the NSERC Projects officer responsible for dispensing NSERC funds, have since moved on to other responsibilities. Depressingly, the new NSERC project officer, when contacted about the NISC, had never heard of the program. This indicates that, despite many years of lobbying by the SCS and OCLSC research communities, much of the ground that was gained in the first generation of high performance computing centres in Canada has been lost. A new generation of centres will have to renew efforts to secure federal support for HPC. The following excerpts from the CORC Report highlight NSERC's thoughts on the support of large scale computational facilities (LSCFs): "the realities of the NSERC budget suggest strongly that NSERC funds should not be used to support the purchase of large-scale computational hardware." "NSERC funds could, however, be used to help defray the costs (of) specialised technical staff to operate the facility and provide user support, maintenance costs, costs of providing network access." (p.13) "the potential cost to NSERC for each LSCF supported could average in the order of $1M per year." (p.13) "Support could be provided either through a sub-program of the Infrastructure Grants Program or through the Collaborative Special Projects Program."
(p.14) "The choice of Centres to be supported would be made following peer review of proposals, and would take into account factors such as: * the calibre of the researchers who currently propose to use the facility, the quality of their research proposals and the need for a LSCF to undertake the research; * the provision of national access; * the community's anticipated need for the type of equipment to be made available; * the nature and quality of support services to be provided to users; * cost to users; * management structure and business plans for the facility; * cost-sharing with and degree of involvement of other partners." (p.14) Research and Technology Building Spencer Hall, recently acquired by the university, is to become the focus of the university's efforts to build a bridge between the university research community and the private and public sectors. The building will be the new home for the university's Office of Research, Seabright Corporation and the Canadian Centre for Fisheries Innovation. The building will also include provisions to support a "technology cluster" (or incubator) aimed at high-technology companies, with emphasis on biotechnology, computers and communications, environmental services, medical devices and ocean technologies. [43] The proposed location will ultimately become the hub of high technology activity in Newfoundland and could be an ideal location for the facility outlined in this proposal. Up to 1200 square feet of fully serviced office space in Spencer Hall is tentatively available to this initiative. Other facilities include a meeting room (25-30 people), an attached kitchen, an audio-visual room, a seminar room, a reception area as well as service space (closets, bathrooms, etc.). The offices will all be serviced with telecommunications links (twisted pair cable for up to 100 Mbit/s communications capacity) with fibre optic cable connecting each floor to the University backbone and CANARIE. 
Seabright Corporation and the Canadian Centre for Fisheries Innovation are to share the 2nd floor of Spencer Hall, the technology cluster will occupy the 3rd, with the Office of Research occupying the main floor. Collaboration With Other Institutions A significant impetus in the development of the present proposal was provided by the institutional support being provided by research institutions in and around Memorial University as well as elsewhere in Atlantic Canada. In preparing this proposal, discussions were initiated with the following institutions: * Seabright Corporation * STEM~Net * Institute for Socio-Economic Research (ISER) * NRC/Institute for Marine Dynamics (NRC/IMD) * Bedford Institute For Oceanography (BIO) * Department of Fisheries and Oceans (DFO) * Canadian Centre for Marine Communications (CCMC) * Centre for Cold Oceans Resource Engineering (C-CORE) * Newfoundland Alliance of Technical Industries (NATI) * Newfoundland Ocean Industries Association (NOIA) The present proposal has initially focussed on the immediate needs of the local university community. Nevertheless there are significant opportunities for developing linkages with the research institutions currently in Newfoundland as well as with local industry through their technical associations. Many partnership programs exist between the university, government institutions and local industry. As the proposal process moves forward it will be a priority to forge the appropriate linkages for inviting wider participation in this initiative. A secondary objective of this proposal is to create the infrastructure that could support a credible bid for an Atlantic Canada based nationally accessible facility with a more substantial technology platform. In this case the systems acquired as part of this proposal could then serve as a frontend file server or gateway machine to the larger pan-Atlantic Canada facility linked via a very high speed computer network. 
Developing Ph.D.s as Computational Researchers In the U.S. and Canada, the computational arts, sciences and engineering are becoming increasingly recognized as a distinct mode of research enquiry. However, the application of computational techniques to the arts, sciences and engineering spans far too many disciplinary boundaries to realistically expect a new discipline to emerge in a manner similar to Computer Science, for example. Although there does not appear to be a sufficiently well-defined focus to establish a separate discipline, there is the opportunity to enhance the professional stature and career prospects of those Ph.D. graduates who have acquired high levels of skill in the application of high performance computers to their respective disciplines. This model for the development of computational researchers in supporting high performance computing was employed successfully at the Ontario Centre for Large Scale Computation, where Ph.D.s from several disciplines were used to provide research support to researchers in their use of the Cray X-MP. The unique qualities these support staff brought to the position of computational scientist were an intimate knowledge of the needs of researchers (having themselves been a part of the research community), extensive experience in computing, and a significantly different attitude to client support. Products and Services Deliverables to the University Community Infrastructure Support For University Research One of the significant deliverables to the research community at Memorial University is the commitment to provide a high level of infrastructure support directly to targeted research activity at the university. The training and development of a core group of highly trained peer level research staff provides a measure of infrastructure support to researchers above and beyond the traditional systems and operations support.
Computational scientists who acquire expertise in new technologies can provide very high levels of technical support to individual researchers or groups, minimizing the start-up time and effort required to exploit a new technology or migrate from an old technology. The high performance model now emerging requires an almost continuous investment in optimization and improvements to one's application codes in order to track technological developments. In most cases, optimization efforts reap benefits across all platforms as well as enhancing portability. Expanding Co-op programs to Physics, Chemistry and Computer Science Memorial's Faculty of Engineering and Faculty of Business have highly successful co-operative education programs. There is a real opportunity to encourage the extension of these programs into physics, chemistry and computer science. As at other institutions in and around the university, this proposal includes continuing expenditures to support co-op students. The major difficulty is securing adequate levels of employer participation in the program. It may well be that a full co-op program for the sciences is premature at this time. But this doesn't preclude the possibility of initiating an internship program which doesn't require the same level of commitment. Deliverables to Government This initiative must be a well-crafted program that measurably provides benefits to the Newfoundland economy. Some of the objectives of an HPC&C initiative would be to 1) attract students to enter a career in the computational arts, sciences, and engineering, 2) attract graduate students and researchers into the local research community, 3) retain graduates in well-paying jobs in the local community, and 4) provide opportunities for graduates to create their own businesses as well as nurture them during the difficult startup phase (high technology incubator). A significant need of government is for quantitative measures of the effectiveness of its investments.
Many proposals reflect on the importance of science and technology to the economy and to our global competitiveness. A well-crafted program of the sort outlined above would have to implement the appropriate mechanisms from the outset to measure its effectiveness. A significant effort could be justified to retain expertise in critical areas here in Newfoundland. The importance of science and technology to the economy is undeniable. Yet it is critical for the long-term viability and sustainability of this project to be able to justify its sustained support across several generations of technology renewal. It is not sufficient to requote U.S. assertions of 40% return on investment for similar investments. Support of the NISC One deliverable to NSERC could be support for the NISC. If funding under the NISC could be provided to this proposal, direct support could be provided to our most highly skilled graduates as they participate in the most challenging computational science projects in the country. The strategic plan outlined by the NSERC ad hoc Committee on Research Computation (CORC) as implemented under the NISC is supported in this proposal on several fronts. The emphasis in this proposal on leveraging existing resources and supporting the emergence of a national infrastructure for research computation through the model of a national "meta-centre" is at the heart of this proposal. Deliverables to Industry The University's Role In Technology Transfer The primary "product" of the university is people. Exposing students to state-of-the-art technology prepares them for increasingly demanding roles in both the public and private sector. Relatively few go on to academic careers. "university researchers train students, who enter the labour market and bring new ideas to Canadian industry and society." [44] "Without this investment in high quality research and people, no new strategy for knowledge transfer can be sustained."
[45] "Canadian university research expertise remains relatively underexploited." [46] "the need for increased use of the university "knowledge capital" has never been greater." [47] Outreach Programs One of the important roles that this facility can play is that of providing industry with access to new technology. Small companies in particular do not have the resources to undertake significant levels of research and development on their own and must rely on universities and government to assist them in these areas. Experience at other centres has shown that outreach activities must be a high priority. Industry is very hard pressed to devote significant amounts of time to learning about new technology. It is important that the process of implementing new technology be accelerated as much as possible. Locating the proposed facility in Spencer Hall (the proposed Research and Technology Building) provides access to additional programs. These will include additional services such as organizing seminars and workshops, intellectual property management, access to market intelligence databases and, through linkages with the Faculty of Business Administration, access to training and business planning. Planning for the Research and Technology Building is well advanced with significant linkages into the local business community. Industry Partnerships Probably the best model is one where the nature of the partnership is defined by industry itself. This could be via a model where projects are proposed by industry in the area of modelling or simulation and separate business plans are written for each project. Resources could then be allocated on a yearly basis to selected projects. Accountability is assured throughout. Software Development and Commercialization Implementing a distributed model of high performance computing exploiting state-of-the-art communications technologies provides numerous opportunities for development of software products.
A great deal of the research output from the computational science being performed at Memorial is in the form of computer software, some of it in the form of software packages of general utility. Significant effort is required to bring the products of research from the laboratory into the marketplace. This effort is not traditionally recognized as part of the academic's role and, thus, many promising technological developments remain unrealized. Similarly, a small company frequently lacks the resources to do extensive development of a product. As the proposed facility gains critical mass, more resources can be devoted to assisting the commercialization of products that could be licensed to the private sector. Successful institutional models for this type of activity exist in the Newfoundland community. The mechanism by which the research output of the university is made available to the private sector could be through licensing or the products could be made available on a royalty basis. The latter method requires no significant financial outlay on the part of the private company and has been quite an attractive means of technology transfer. Proposed Facility Mission Statement Author's note: To help focus subsequent discussions and stimulate the development of a mission statement, some underlying philosophical issues in this reaction draft are presented below (in no particular order).
* Transform Newfoundland's economy through science and technology; subscribe to and promote the required partnership between industry, government and academe to effect real change; promote attitudinal change.
* Place primary emphasis on comprehensive programs developing the human resources pipeline needed to support the knowledge- and information-based economy of Newfoundland in the next century; technology transfer; education.
* Invest in, promote, and protect the knowledge capital represented in people; promote collaboration and multidisciplinary approaches; provide the required computational tools to maximally leverage this knowledge capital.
* Play an appropriate role in advancing the information infrastructure of Newfoundland and Atlantic Canada; display leadership where appropriate.
* Promote the computational arts, sciences and engineering as a third mode of research enquiry (with theory and experiment); strive for excellence in research computation.
* Lobby for and promote a national strategy in HPC&C; support national initiatives; promote "citizenship" through collaboration and cooperative programs with other institutions and facilities across Canada.
* Achieve and maintain balance between the research components, between pure and applied research, between the arts and sciences.
* Promote a role in developing the human resources pipeline; develop programs and linkages to provide wider access to leading edge technology.
* Full scientific and fiscal accountability; social and fiscal responsibility.
* Implement a full and open process of evaluation, renewal, and ongoing improvement, including mechanisms that permit the identification of deficiencies and the implementation of corrections as an integral part of the process.
Decentralized Organizational Model The proposed organizational model is as "flat" as possible. The following figure illustrates the peer relationships within the university community. There is significant overlap between the HPC&C Users' Group, the HPC&C Committee and the Department of Computing and Communications. The Director of C&C and the Vice-President (Research) and several representatives of the HPC&C Users' Group are members of the HPC&C Committee, for example. Projects are managed by principal investigators (PIs) on an autonomous basis.
All PIs and their co-investigators are members of the HPC&C Users' Group, which is charged with the authority to set general policies and procedures affecting the operations of the facilities comprising HPC&C activities. Subcommittees of the Users' Group are charged with the task of formulating policies regarding technology acquisition, time allocation, etc. for subsequent review by the Users' Group. As projects in the University community are identified, resources are directed to their support both in technological resources and in infrastructure support. Long-term multi-disciplinary collaborations are encouraged. Department of Computing and Communications Initially, the proposed facility will be a cost centre within the University's Department of Computing and Communications (C&C). C&C provides the required facility management functions following the recommendations of the Users' Group. Significant cost savings are realized by not duplicating many of the services already provided by C&C. It is proposed that there be some separation between the activities of HPC&C at MUN and the activities of C&C in other areas of university computing. The rationale for this is to give this HPC&C initiative an opportunity to establish its own identity and "culture." Standing Committee on HPC&C Stakeholders are represented by the University's HPC&C Committee, which represents the interests of the researchers as well as the wider community. The HPC&C Committee includes members of the HPC&C Users' Group, the Director of C&C, the Vice-President (Research) and members of the wider university community. This body reports to the Vice-President (Research). With funding, this body could then include representation from industry and government.
Proposed Location Research and Technology Building It is proposed that the facility will maintain some offices in Spencer Hall, currently being renovated as the new Research and Technology Building, a centre for research and technological innovation focussing on high-technology research, development and technology transfer. The Research and Technology Building will also house the Office of Research, Seabright Corporation and a Technology Cluster. The HPC&C facility will thus have intimate associations with the university's research infrastructure, its technology transfer arm and the industrial community through the technology cluster and its outreach activities into local industry. To reduce artificial barriers of location between the Centre and its user community, satellite offices will also be supported throughout the user community. Advanced personal workstations supporting multi-media and the CSCW (computer supported cooperative work) model of workgroup computing can potentially be exploited to further reduce barriers between the centralized and distributed components of the facility. High Performance Computing Requirements The technology strategy adopted will subscribe to the following requirements: [48] 1) superior price/performance, to support (the) need to reduce costs 2) speed or ease of programming, to support just-in-time application code development 3) small overall flow time to execute problems, to support (the) requirement for reduced cycle time 4) mature systems that work as advertized, with high reliability and good system software 5) adherence to widely accepted standards, to ease the problem of software portability 6) available as distributed systems that can be under local control of owners. 7) some ability to solve very large problems, but the latest and greatest supercomputer will not, in general, be needed. 
Competitive Procurement In conformance with university and government regulations, system(s) will be acquired through a rigorous competitive procurement process. Visualization Facility and Teleconferencing Existing facilities for visualization and analysis of data in the Division of Educational Technology at Memorial University are excellent. This facility has the best TV and related equipment east of Montreal. The facility has already produced many state-of-the-art visualizations and animations of computational science research in Newfoundland. A fully functional teleconferencing facility with fibre access and satellite links is available. The Division is capable of producing a wide range of media support, including documentation and manuals, video-based course-ware and access to TETRA and its distance education programs. Personnel The primary resource of the facility will be a complement of highly skilled computational scientists with research experience. The computational scientists will be either recent Ph.D.s with extensive experience with a wide range of high performance computing architectures or persons of equivalent experience. Visualization specialists can enter into an apprenticeship program with the Division of Educational Technology and would gain a wide range of graphics and numeric skills in all areas of TV, video and telemedia productions as well as scientific visualization. Operations and systems support will initially be provided under facility management with the Department of Computing and Communications. Significant technical expertise is currently available in the research community and this initiative will coordinate its activities with the strategic priorities of the research community. To obtain the maximum benefits from the proposed facility a minimal management and administrative structure will oversee the day-to-day operations of the facility. 
An important component of the programs that could be supported by the facility would be to participate fully in the co-operative education and/or internship programs of the university. Co-op students would be paired with computational scientists and given hands-on training in a wide range of research activities at the university. An outreach program with local industry could then promote the transfer of skills and knowledge into the local community. Since funding is uncertain, positions will initially be contract positions (for example, a 2 year contract for Ph.D. level computational scientists would conform with the normal 2 year post-doctoral fellowship cycle and a 1 year apprenticeship for visualization specialists). These positions would be anticipated to become permanent if the initiative reaches a level of sustainability that could support this. Financial Projections Striking The Appropriate Balance Problems with previous high performance computing centres in Canada can be traced in part to an inappropriate balance of expenditures between technology and infrastructure. High performance computers, being new and sometimes untried technology, require significant infrastructure support before the machines can be effectively used. So what is the appropriate balance? Typical budget allocations in the data processing (DP) sector indicate that only a small fraction of the full cost of providing computing services is actually devoted to hardware. [49] 26% of expenditures is represented by hardware and 43% is in personnel. In our survey of the local university HPC&C community, about 35% of expenditures were in computers with about 50% of expenditures in personnel. In the HPC&C budget proposed here, 42% of the budget is in personnel and 32% is in hardware.
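The shares quoted for the proposed budget can be checked against the five-year totals in the reaction draft cash flow projection. The sketch below is a rough check only. It assumes that "personnel" means salaries, benefits and overheads together, a grouping the proposal does not spell out, and it leaves the hardware share alone because the rows that make it up are not identified.

```python
# Five-year totals in thousands of dollars, copied from the
# "Cash Disbursements" rows of the reaction draft cash flow projection.
DISBURSEMENTS = {
    "Salaries": 1671,
    "Benefits": 251,
    "Overheads": 334,
    "Space": 67,
    "Purchase of Computer Time": 250,
    "Purchase of Computer Equipment": 1050,
    "Power": 30,
    "Maintenance": 390,
    "Software": 263,
    "Communications": 500,
    "Programs": 250,
    "Audit Fees": 15,
    "Taxes": 247,
}

# ASSUMPTION: the proposal's "personnel" category is taken to be these rows.
PERSONNEL_ROWS = ("Salaries", "Benefits", "Overheads")


def share_percent(rows):
    """Return the given rows as a percentage of all disbursements."""
    total = sum(DISBURSEMENTS.values())
    return 100.0 * sum(DISBURSEMENTS[row] for row in rows) / total


print(f"Personnel:      {share_percent(PERSONNEL_ROWS):.1f}%")
print(f"Software:       {share_percent(('Software',)):.1f}%")
print(f"Communications: {share_percent(('Communications',)):.1f}%")
```

On this grouping the personnel share rounds to the 42% quoted. The row totals add to 5,318 rather than the stated 5,317, presumably because the published figures are rounded to the nearest thousand.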
The following table illustrates the comparisons:

                  DP                     MUN                    HPC&C
                  (Computer Economics)   (Research Computing)   (Proposed)
Personnel         43%                    50%                    42%
Hardware          26%                    35%                    32%
Software          12%                     7%                     5%
Communications     9%                     3%                     9%
Other             10%                     5%                    10%

Reaction Draft Cash Flow Projections

A five year cash flow projection is presented below. Explanatory notes follow in subsequent sections. To clarify the presentation of the base budget, unsubstantiated sources of revenue and financial contributions are set to zero.

Reaction Draft Cash Flow Projections ('000s)

                                                                                Projected
                                          Year 1  Year 2  Year 3  Year 4  Year 5   Totals
Opening Bank                                  $0      $0      $0      $0      $0       $0

Cash Receipts
  Industry Programs                            0       0       0       0       0        0
  Consulting Fees                              0       0       0       0       0        0
  Computer Services                            0       0       0       0       0        0
  Software Sales and/or Royalties              0       0       0       0       0        0
Total Cash Available                          $0      $0      $0      $0      $0       $0  (0%)

Cash Disbursements
  Salaries                                   322     329     332     339     349    1,671  (31%)
  Benefits (@15% of salaries)                 48      49      50      51      52      251  (5%)
  Overheads (@20% of salaries)                64      66      66      68      70      334  (6%)
  Space                                       13      13      13      14      14       67  (1%)
  Purchase of Computer Time                   50      50      50      50      50      250  (5%)
  Purchase of Computer Equipment             250     200     200     200     200    1,050  (20%)
  Power                                        6       6       6       6       6       30  (1%)
  Maintenance                                 30      54      78     102     126      390  (7%)
  Software                                    63      50      50      50      50      263  (5%)
  Communications                             100     100     100     100     100      500  (9%)
  Programs (e.g. seminars/workshops)          50      50      50      50      50      250  (5%)
  Audit Fees                                   3       3       3       3       3       15  (0%)
  Taxes (PST+GST)                             59      47      47      47      47      247  (5%)
  Miscellaneous                                                                            (0%)
Total Disbursements                       $1,058  $1,017  $1,046  $1,079  $1,117    5,317  (100%)

Cash Receipts Over Disbursements         -$1,058 -$1,017 -$1,046 -$1,079 -$1,117   -5,317  (100%)

Financing
  Memorial University (Inkind & budget)      191     192     192     192     192      959  (18%)
  Research Grants                              0       0       0       0       0        0  (0%)
  User Contributions                           0       0       0       0       0        0  (0%)
  Other                                        0       0       0       0       0        0  (0%)
Total Financial Contributions               $191    $192    $192    $192    $192     $959  (18%)

Total Funding Required                      $866    $826    $854    $887    $925    4,358  (82%)

Closing Bank                                  $0      $0      $0      $0      $0       $0

Cash Receipts

Industry Programs

Industry programs could, for example, include industry sponsorships for co-op students, industrial post-doctoral fellows, graduate student awards, contributions to joint research projects and similar programs.

Consulting Fees

Computer Services

No academically-based HPC facility of the type being proposed here has succeeded in sustaining its operations through sales of computer time to government or industry. Recognizing this, this proposal includes no dependence on revenues from this source. The Gillespie, Folkner & Associates Final Report on OCLSC reported that OCLSC delivered a total of 375 cpu-hours to government and industry for the period 1 January 1987 to 1 July 1991 (excluding 166.3 hours of benchmarking) [50]. While the contracted rates were confidential, it can be assumed that the actual commercial rates charged for the bulk of this time were between the contract research rate of $300/cpu-hr and the nominal commercial rate of $500/cpu-hr, implying total revenue over 4.5 years of between $113,000 and $190,000. Industry will not be a "cash cow" for HPC&C. Industry participants in joint programs might be charged (at nominal cost) for computer services. Any funds from this source should then be applied to enhance the technology used by industry or subsidize industry outreach programs.
In this way, industry sees that its contributions are then reinvested for their benefit. Serious consideration of the option of not charging industry for use of computer equipment could also be entertained, provided the time is spent in research and development of modelling applications in partnership with a university researcher. If a viable commercial opportunity does arise from these activities, there are resources available through Seabright Corporation and other agencies that could then be used to exploit this opportunity through an independent business entity. Software Sales and/or Royalties The computationally intensive research programs being supported by this proposal combined with new projects submitted by the research community are expected to provide significant opportunities for the development of unique software with potential commercial value. The organizational structure outlined in this proposal is intended to nurture pre- commercial R&D. A close association with Seabright Corp., the technology transfer arm of the university, promotes the transfer of such technology into the private sector. CCMC and C-CORE are successful models for this type of commercialization activity. The mechanism of royalties for software rather than pursuing software sales might be a revenue generating option. The advantage of collecting royalties on software over software sales is that it facilitates transfer of the technology into industry. Companies can then acquire the technology easily with no upfront commitment of resources. Cash Disbursements Salaries The following table indicates the level of additional staffing proposed for the facility. The staff complement is then combined with the corresponding salaries to obtain the net salaries. The current complement of research staff supported through research grants for selected research groups are included for comparison. 
Note that the proposed staff are intended to complement the research staff currently in the research community. Research Staff Proposed Staff Personnel Categories Median FY93 FY94 Year Year Year Year Year Salary 1 2 3 4 5 Management and Administration Director $75 Exec./Acad. Director $60 1 1 1 1 1 Administrator $35 1 1 1 1 1 Admin. Secretary $23 /Pub'ns/Media Scientific Staff Sr. Computational Scientist $60 1 Computational Scientist II $50 1 1 2 1 Computational Scientist I $43 2 1 1 Visualization Specialist II $46 1 1 1 Visualization Specialist I $43 1 1 Research Comp. Spec. $35 3 3 Research Asst. II $30 Research Asst. I $27 6 6 Post-doctoral Fellow $28 Graduate Student $15 Co-op Students $27 2 2 2 2 2 Systems and Operations Staff Manager(s) $48 1 1 1 1 1 Sr. Sys. Eng./Sys. An. $46 Sys. Eng./Sys. An. $43 Sr. Prog. An./Sys. Prog. $39 Sr. Prog. $35 Prog. $33 Facility Management Contract $100 Consulting Contract $100 Total Personnel 9 9 8 8 8 8 8 Computational scientists are compensated at roughly equivalent industrial rates (such as NSERC's industrial post-doctoral fellowship program). Research staff with Ph.D.s are compensated similarly to senior analysts and managers with equivalent years of experience. The salaries for research staff do not include any consideration of academic staff salaries. Note that the research personnel funded here does not include salaries for professors, post-docs, or graduate students. Research staff are typically paid out of research grants with a small contribution from departmental budgets. Academic staff are paid out of university budgets. This facility intends to participate fully in the University's co-op education programs. The equivalent FTEs in the above table should be multiplied by 3 work terms per calendar year to obtain the total number of students participating in the HPC&C program. Benefits Benefits are calculated at the standard rate of 15% of salaries. 
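The benefits figures and the co-op head count follow directly from the two rules just stated. As a minimal check in Python, assuming the annual salary totals (in thousands of dollars) are those shown in the cash flow projections and that each year's benefits are rounded to the nearest thousand (the rounding rule and all names below are assumptions made for illustration):

```python
# Annual salary totals in $'000s for Years 1-5, copied from the cash flow projections.
SALARIES = [322, 329, 332, 339, 349]
BENEFIT_RATE_PERCENT = 15   # standard benefits rate stated in the proposal
COOP_FTES = 2               # co-op FTEs per year in the staffing table
WORK_TERMS_PER_YEAR = 3     # work terms per calendar year


def benefits(salary_thousands: int) -> int:
    """Benefits in $'000s, rounded to the nearest thousand (assumed rounding rule)."""
    return (salary_thousands * BENEFIT_RATE_PERCENT + 50) // 100


benefits_by_year = [benefits(s) for s in SALARIES]
coop_students_per_year = COOP_FTES * WORK_TERMS_PER_YEAR

print("Benefits by year ($'000s):", benefits_by_year)
print("Co-op students per year:", coop_students_per_year)
```

Under these assumptions the yearly values match the Benefits row of the projections. Their sum is 250 rather than the tabulated 251, presumably because the table total was computed before rounding.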
Overheads Overheads include provisions for office equipment at 11% of salaries (e.g. workstations/PCs amortized over 3 years), training and professional development at 5% of salaries (1 five day conference with tutorials modelled after the IEEE Supercomputing conferences and 1 three day conference modelled after Canada's supercomputing symposia, or approximately $3000 for every $60K of salary), furniture at 1% of salaries ($2000/200 sq.ft./5yrs), telephones at 2% of salaries, and office supplies, etc. at 1% of salaries. Total overheads excluding benefits and space is then 20%. Space Space is broken out of overheads since space is a significant inkind contribution of the university. Space is calculated at equivalent commercial rates corresponding to fully serviced space. The rates used are $6.33/sq.ft/yr. for the office space and $5.70/sq.ft./yr. for operating costs (heat, light, cleaning, etc.). [51] These costs do not include costs of land acquisition or development (e.g. costs of building renovations). Space rates are indexed to salaries assuming 200 square feet of office is allocated for every $60K of salary. With this breakdown, space comes to 4% of salaries. Purchase of Computer Time Time at remote facilities in Canada or the U.S. is estimated at $50,000/yr. Use of remote facilities will be an option for those research applications that can obtain higher benefits per unit cost than can be provided by the local facilities or that require significantly shorter turnaround time. Purchase of Computer Equipment Technology procurements are directed by the Users' Group. The following description of technology is illustrative only. A configuration that could be labelled as a high performance platform using the current generation of technology (1994) might be a configuration of perhaps 4-8 processors with aggregate performance of approximately 1-2 peak Gflop/s with an aggregate memory of 1 GByte and 60 GBytes of disk. 
Cpus are estimated at $250,000 for an initial system. 1 GByte of memory at $150/MByte is $150,000. 60 GBytes of disk at $1500/GByte is $90,000. Total system cost is then $490,000. Academic discounts are typically 35%. Special rates for this level of purchase could be higher (up to 50%). This corresponds approximately to the $250K expenditure in Year 1. We propose a model of technology acquisition where maximum leverage of technology developments is provided by a stable funding policy and expenditures spread over the life of this proposal. The optimistic assumption of performance per unit dollar doubling each year implies that, if a stable funding policy is implemented, 1 unit of performance in Year 1 could become 1+2+4+8+16=31 units of performance by Year 5. With emerging technology, one unit of performance is roughly equivalent to the power of the two-processor CRAY X-MP installed at OCLSC in 1986. Power Power consumption is arbitrarily assumed to be equivalent to 1/4 of the estimated power used by the research computing community, or $6K/yr. This makes the installed equipment in this proposal draw the same level of power as one large research group or department on campus. Maintenance Maintenance is assumed at 1% of hardware costs per month. As the hardware investment accumulates the maintenance charges increase accordingly. After three years, maintenance charges typically decrease dramatically (or optionally these charges can be deferred as equipment is phased out). This is not modelled. If commodity microprocessor systems are "farmed," foregoing maintenance charges in favor of increased numbers of processors (i.e. so-called "hot spares") is an aggressive strategy that exploits the high reliability of modern technology. This could be an option to consider. Software Software requirements in the academic environment are significantly less than for a commercial environment (where they typically are about 50% of machine costs). 
We assume a budget for software indexed at 25% of computer costs. Communications Line items for communications will be examined in coordination with the Department of Computing and Communications. The proposed budget includes upgrading all buildings serviced by fibre to support ATM over 5 years. Some very tentative cost estimates follow: [52] * A one-time equipment cost of $95,000 for implementing an ATM backbone with 2 hubs (one in Physics/Chemistry, the other in Engineering). The cost of approximately $50,000 per hub is dependent on the network fan-out at the hub. * A hub in telemedicine is estimated at 2/3 the cost of a Physics/Chemistry hub; a hub in Computing and Communications is estimated at 1/2 the above cost; with the balance of the 17 buildings on campus connected with fibre estimated at 1/4 the above cost. * The total one-time cost for installing ATM throughout the campus is therefore on the order of $350,000. Installation charges are not included. We arbitrarily assume $150,000 for installation expenditures. Programs A flat budget of $50,000/yr is proposed for symposia, workshops, courses and media preparation. Audit Fees Professional fees of an auditor are estimated to be $3000/yr. Taxes (PST + GST) PST at 12% and GST at the academic rate of 2.3%. Miscellaneous Miscellaneous expenditures include consulting fees applied to this proposal. Financing Memorial University Memorial University provides significant support in the form of space, power and communications (assumed at $22,500/yr). C&C has a $98,000 line item in their FY94 budget allocated to HPC&C. Beginning in FY95, an allocation for an HPC&C specialist (assumed at $50,000/yr) is included in their budget. We cannot assume this level of ongoing commitment from C&C over the life of the proposal. 
Nevertheless, direct university support at the level of $150K is consistent both with the current C&C budget and, independently, with the university support of research computing professionals, space, and power now committed to the research community. Thus we believe $150K/yr is a reasonable estimate of the level of commitment that MUN traditionally devotes to research computing.

Research Grants

In the last two fiscal years, researchers have covered most of the costs for research computing out of research grants (apart from space, power and communications as noted above). This is a marked departure from the early '80s when research computing was provided to researchers at no charge by the university. It is expected that through the auspices of the HPC&C Users' Group an application to NSERC for either an infrastructure grant or collaborative special projects grant could be made in support of this proposal. Under the NISC, up to $1 million/yr was proposed for up to 4 regional facilities, for example. It is therefore advantageous to promote a pan-Atlantic HPC&C initiative (similar to WurcNet in Western Canada) in which the present initiative in Newfoundland could play a part. In the past, NSERC has not funded HPC&C activities anywhere near the levels proposed in the NISC. OCLSC, for example, represented a community of 500 academic users from coast to coast but never obtained more than $200,000/yr of funding despite many years of intense lobbying by upwards of 60 high profile researchers.

User Fees or Contributions

The issue of user fees is a policy issue for the HPC&C Users' Group to decide. The Department of Earth Sciences has the policy of requiring its researchers to pay $20/cpu-hr for computer time on departmental servers. Elsewhere in the university, researchers have never subscribed to a user fee structure. The level of discretionary funds available in individual operating grants is not expected to be very large.
With a (justifiably) higher priority being applied to the support of graduate students, this implies that only a few thousand dollars per researcher at most could be expected from user fees in the wider university community. If user fees were to be instituted, they could be structured similarly to the bulk purchase plan implemented at OCLSC, with relatively large blocks of time being allocated for very modest expenditures. The attraction of assigning a nominal cost to computer usage is that computer time then has "value," which encourages efforts in optimization to improve the efficiencies of research codes. For similar reasons, contributions from external institutions and agencies are also expected to be small due to budgetary pressures. Budgets for capital expenditures are typically significantly smaller than budgets for operations. It would therefore be easier for external institutions to allocate small portions of their operating budgets in support of their internal computational needs, but these expenditures would themselves be limited to a small fraction of the threshold costs of hiring staff or procuring workstation technology. This implies external sources of revenues will not likely exceed a few tens of thousands of dollars per agency or institution if commitments are not secured from the outset.

Sources of Funding

Rationalizing Existing Expenditures

Memorial University has had a long-standing relationship with the Newfoundland and Labrador Computing Services (NLCS) for the provision of computing services to the university. The university's expenditures on administrative computing with NLCS are substantial. These expenditures have, in part, tracked recent advances in technology through substantial increases in delivered administrative computing services to the university over the past several years for a fixed level of expenditure. On the other hand, support for research computing through the university and NLCS had declined to zero many years ago.
The opportunity to redress this imbalance may be provided once again by emerging technology. With NLCS being privatized, collaborative arrangements between MUN and NLCS in technical computing could open up creative financing opportunities in computing for researchers. It has been suggested that this type of arrangement could potentially eliminate the need to seek external funding for this proposal. Alternative Cost Savings For Personnel There are other sources of "funding" which could be applied against the costs of this proposal. In the area of offsetting costs of personnel, in-kind support could be provided in the operations and systems support by both C&C and through the technical expertise that is currently distributed around the university. Similarly, the appointment of an academic researcher as director for the facility could then save the cost of the director's salary. This would of course increase the in-kind contribution of the university to this proposal since academic salaries are paid out of the university budget. The computational scientist positions proposed here have salaries roughly equivalent to entry level academic positions. A purely academic research computing facility has the option of expanding personnel through the funding of post-doctoral fellowships. Fellowships are typically contractual appointments for two years and provide recent Ph.D.s with valuable research experience and the opportunity to stay within the academic employment stream. Salaries are typically about 30% lower for post-doctoral fellows and there is additional flexibility provided by the use of contract employment. The risk of using post-doctoral fellowships as a means of staffing the proposed facility is that these positions are highly transient. Experience has shown that it takes several years for a recent Ph.D. to acquire the breadth of experience necessary to be effective as a facilitator and collaborator in the research computing community. 
Although contract employment may be necessary during the start up phase, it is not recommended as a long-term strategy since the training costs that are lost through turnover outweigh the costs of providing the more permanent jobs that attract and retain highly skilled individuals. Sources of External Funding It is expected that the balance of the proposed funding would come from external sources. In Newfoundland, there are numerous provincial and federal government programs under which this proposal may be eligible. Some of these agencies and programs are: 1) Atlantic Canada Opportunities Agency (ACOA) 2) CANARIE 3) The Human Resource Development (HRD) Agreement 4) NSERC 5) Offshore Development Fund 6) Strategic Investment and Industrial Development (SIID) Agreement With the possible exception of ACOA, none of the above sources would be expected to fund this proposal in its entirety. The following table indicates those parts of the present proposal that could potentially be eligible for funding from the corresponding agency: Personnel Computers Communications Other ACOA X X X X CANARIE X HRD X NSERC ? ODF ? ? SIID ? X Vendors As A Source of Inkind Funding Vendors of computing and communications equipment themselves represent a potential "partner" in HPC&C initiatives. Special relationships with the vendor provided the foundation for many HPC&C initiatives in Canada, including HPCC in Calgary (Fujitsu), Dalhousie (Alliant), NRC's Institute for Marine Dynamics (Alex Informatique), University of Toronto (Kendall Square Research), University of Victoria (IBM), etc. In some cases, benefits come back to the university in the form of research grants, either in cash or in terms of computer time. Academic discounts for both hardware (35-50%) and software (up to 100%) represent an almost "funny money" source of "funding" of HPC&C. These "discounts," however, become extremely important if the focus of the facility becomes in any way a commercial venture. 
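To give a sense of scale for these discounts, the sketch below applies the 35% and 50% rates to the illustrative 1994 configuration costed in the Purchase of Computer Equipment section (processors at $250,000, 1 GByte of memory at $150/MByte, 60 GBytes of disk at $1500/GByte). It assumes 1 GByte = 1000 MBytes, as the proposal's own memory figure implies; the function and variable names are hypothetical.

```python
# Illustrative list prices in dollars, taken from the proposal's equipment estimate.
CPU_COST = 250_000
MEMORY_COST = 1000 * 150   # 1 GByte treated as 1000 MBytes (assumption) at $150/MByte
DISK_COST = 60 * 1500      # 60 GBytes at $1500/GByte
LIST_PRICE = CPU_COST + MEMORY_COST + DISK_COST


def discounted(price: int, discount_percent: int) -> int:
    """Price in dollars after an academic discount given in whole percent."""
    return price * (100 - discount_percent) // 100


typical_price = discounted(LIST_PRICE, 35)    # typical academic discount
best_case_price = discounted(LIST_PRICE, 50)  # special rate for a large purchase

print("List price:        ", LIST_PRICE)
print("At 35% discount:   ", typical_price)
print("At 50% discount:   ", best_case_price)
print("Value of discount: ", LIST_PRICE - typical_price, "to", LIST_PRICE - best_case_price)
```

On these assumptions the hardware discount alone is worth between $171,500 and $245,000 on a single system, which is why its loss would matter so much to a commercially oriented facility.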
There are also the costs of systems integration. The major vendors can provide complete solutions and absorb the costs of systems integration provided the scale of the proposal justifies it. This approach has the advantage of providing new initiatives with a quick start and provides the opportunity of developing local expertise at a more leisurely pace. This approach is exploited by the major centres like AES in Canada and the major centres in the U.S., where the size of the contracts leverages far more than a simple computer purchase. Creative financing arrangements can be exploited more readily when larger fractions of the overall facility are bundled together in the procurement process. Hidden costs can quickly escalate with a piecemeal approach to HPC&C. The major danger of the "do it yourself" approach is that the users of the facility become the "guinea pigs" of the experimental approach. This approach must be considered carefully since the support of the user community is critical to the success of the venture.

Leasing Versus Buying

Another option that merits consideration is leasing computer equipment as opposed to buying computer equipment. The leasing option should definitely be considered if a large equipment acquisition is being considered. The idea would be to negotiate the most favorable lease contract possible, with clauses that ensure that equipment is maintained at the state-of-the-art. This is the arrangement that AES has entered into with HNSX Supercomputers Inc. It is not clear that there is any advantage to the leasing of microprocessor based computer equipment since, in effect, its lifetime is so short.

Extraordinary Funding From Government Is Required

All experience in similar ventures has shown that HPC&C initiatives can only proceed with government funding outside normal research or commercial avenues (i.e. extraordinary funding).

Risks and Benefits

Risks

The primary risks in HPC&C ventures are:

1. Unrealistic expectations
2. The pace of technological change
3. Underfunding
4. Political imponderables
5. Doing nothing

The strength of this proposal is a balance between people and technology. This balance is very sensitive to funding levels. The technology component is set at the lower threshold for providing access to new technological options such as parallel processing. This is then balanced with a minimum level of infrastructure support through trained personnel. Inadequate levels of funding will eliminate both technology options and jobs. Political imponderables will have some effect on this initiative. Misconceptions and myths concerning HPC&C abound. The marketing hype surrounding HPC&C can be intense. The biggest risk is doing nothing. In the final analysis, doing nothing is the most insidious of "strategies." With nothing ventured, there is no debate of issues and no pressure to make difficult decisions. This proposal encourages a full and open process for debate on difficult issues relating to adopting and implementing new technology, but with an imperative that decisions to move forward ultimately must, and will, be made.

Unrealistic Expectations

Unrealistic expectations have crippled previous HPC&C initiatives in Canada. Obsolete models such as the centralized service bureau model have not been successful in the academic environment and there is no evidence that commercial HPC&C ventures will fare any better. Unrealistic expectations have the crippling effect of imposing impossible goals, introducing constraints that reduce flexibility in operations, and setting in motion the cycle of failure. Only after experience has been gained in this venture can realistic goals be set. The start up phase is particularly expensive and difficult since it is at this time that demands are high and the ability to meet the demand is low.
Many of the problems that plague similar initiatives in later years arise in the start up phase, which is unrealistically thought to be finished with the delivery of the first machine. Many of the problems that will arise are unavoidable issues that can only be resolved through open debate. Many of the problems cannot be anticipated until after the technology is installed. Contentious issues that cannot be avoided include:

* choice of technology
* resource allocation
* organization and management
* general vs restricted access
* accounting policies

Ownership of resources is a primary driving social force. Difficult tradeoffs and tensions occur whenever resources are centralized. These occur regardless of whether the scale is resource sharing in a small group, a university, or across the country in a national facility. Current technology trends favor empowerment of individuals, not groups. Centralized service bureaus in the academic environment have commonly failed to be responsive to the needs of their clients. If a centralized service is being provided, the same level of "ownership" must be provided to the client as is available in a desktop workstation. This is very difficult. The appropriate focus must be established from the outset of the venture with a clear client-oriented mission statement. Processes that encourage consensus building must be set in place. Regular evaluation and modification of policies should be an integral part of the process. It is difficult to reconcile the few hours it takes to install technology and the many years it takes to develop the appropriate infrastructure surrounding that technology.

Avoid "Planning" For Failure

It is very important to avoid burdening the initiative with crippling constraints that stem from unrealistic expectations and carry a very high probability of failure. The clearest example of a crippling constraint is the requirement of full-cost recovery through sales of computer time.
No academic facility has ever succeeded in this venture. This is not to say that no revenues will come through use of technology by outside users. It just shouldn't be a priority area. More benefits accrue from focussing on high-quality outreach programs and collaborative activities. The only model that has worked (at NCSA, and this at a level of only about 10-15% of total operational expenditures) is one of collaborative special projects. It is this mechanism of providing technology transfer to industry that is part of this proposal. Conducting highly productive research programs in exciting areas of the arts, sciences and engineering using state-of-the-art tools also develops highly skilled people who transfer their knowledge to the outside community through subsequent employment. If cost recovery through revenues becomes the focus, these activities would have to be cut. Once the "cycle of failure" is set in place by an unrealistic expectation, no amount of expenditures on operational reviews, consultancy studies, or high- powered management board appointments will rectify the situation. In many cases, a more realistic expectation would have shown that a so-called "failed venture" could more justifiably have been labelled a "qualified success." It is only possible to set targets once sufficient experience has been gained to be able to define a reasonable expectation. The key is to build on small successes and avoid setting oneself up for a grand failure. Advances in Technology Providing the maximum benefit for the minimum cost requires continuous technology renewal. Large systems must demonstrate economies of scale and must be renewed or replaced on a 2-3 year cycle. Since large acquisitions have a political cycle that commonly exceeds the technology cycle it is extremely difficult to pursue this strategy without strong political will and leadership. 
The easier strategy of following the rate of change in microprocessor technology requires continuous investment in technology renewal on a 6-12 month (or less) time scale. 7 years ago, Canadian initiatives in high performance computing (at SCS in Calgary and OCLSC in Toronto) were based on creating a research centre delivering a "CRAY equivalent" with a capital expenditure on the order of $10 million. 3 years ago, a Nova Scotia initiative proposed the creation of a research centre delivering a "CRAY equivalent" with a capital expenditure on the order of $1.6 million. This proposal at Memorial proposes to deliver the same "CRAY equivalent" with a proposed capital expenditure in the first year of about $0.1-0.2 million. The so-called "CRAY equivalent" has become synonymous with the term "supercomputer." At OCLSC, the average performance delivered to researchers on the CRAY X-MP exceeded 50 Mflop/s, or 0.050 Gflop/s. With workstation technology, it is now possible to match this performance level and hence it is tempting to call workstations "supercomputers." The current state of the art in high performance computing was demonstrated at Supercomputing '93 in Portland in November 1993, with numerous real world applications delivering 6-10 Gflop/s (a factor exceeding 120 times) on so-called "traditional" supercomputers (the CRAY C90/16) and several real-world applications delivering upwards of 60 Gflop/s (a factor exceeding 1200 times) on massively parallel machines. What this means is that the modest approach in this proposal is still a factor of 100-1000 behind the state-of-the-art. The present proposal is only aimed at preparing a foundation for beginning the process of closing this gap. It is nonsense to think that simply purchasing the same technology as the U.S. will allow us to close this gap. Previous experiences in Canada have amply illustrated the fallacy of this approach. A balanced strategy is needed. 
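The size of the gap quoted above is simple arithmetic on the delivered performance figures. A short Python sketch, working in Mflop/s so that the ratios stay exact integers (the dictionary keys and variable names are illustrative only):

```python
# Average performance delivered to researchers on the CRAY X-MP at OCLSC, in Mflop/s.
CRAY_XMP_DELIVERED = 50

# Delivered performance reported at Supercomputing '93, in Mflop/s.
SC93_DELIVERED = {
    "traditional supercomputer, low end": 6_000,
    "traditional supercomputer, high end": 10_000,
    "massively parallel machine": 60_000,
}

# Ratio of each 1993 figure to the "CRAY equivalent" baseline.
gap_factor = {
    name: mflops // CRAY_XMP_DELIVERED for name, mflops in SC93_DELIVERED.items()
}

for name, factor in gap_factor.items():
    print(f"{name}: {factor}x the CRAY X-MP baseline")
```

The ratios span roughly two to three orders of magnitude, which is the basis for the statement that a "CRAY equivalent" facility remains a factor of 100-1000 behind the state of the art.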
With the emergence of parallel processing as the new paradigm of high performance computing, learning curves will only get steeper and the gap more costly to close. Without appropriate levels of investment in infrastructure support, migrating to MPP will likely be impossible for many researchers. Advancing technology is also removing support for the bridge technologies. Users are in danger of finding it increasingly expensive to migrate to higher performance architectures. Without a constant investment in application development and optimization, the choice of the most cost-effective platform will be limited to desktop machines. The significant benefits to one's research program that accrue from orders of magnitude improvements in turnaround time will then be lost. Technology is not a panacea. In fact, high performance computing technology by itself will yield few benefits to the research community or the surrounding economy (as compared to the potential benefits) unless a corresponding emphasis is placed on programs and people.

Lack of "Critical Mass" for Programs

There is a minimum size for a high performance facility below which lack of a "critical mass" of personnel and infrastructure causes initiatives of high promise to be stillborn. Some of the danger signs of an undersized initiative are:

* lack of balance between technology acquisition and infrastructure support
* almost continuous technology debates
* a facility that is reactive to day-to-day crises involving isolated parts of the research community, rather than proactive in the development of programs of wide ranging benefit
* lack of a long-range plan

What staffing level constitutes a critical mass? The previous academic based centres in Calgary and Toronto each had approximately 10-12 people with additional support provided through maintenance and facility management contracts. State-funded facilities in the U.S. have approximately 20-25 staff.
The NSF funded centres have approximately 150-200 staff. Experience has shown that 10-12 people do not constitute the required critical mass for a national facility, whereas the other extreme represented by the NSF-funded centres with 150-200 staff definitely does. This proposal currently has fewer people than would naively be required for establishing and sustaining a standalone HPC&C venture. The imperative for any long-range plan would be to integrate the activities undertaken in this initiative with parallel activities throughout the university community, in the other Atlantic provinces and in the rest of Canada.

Political Imponderables

Technological obsolescence and the political process strongly bias against procurement of "expensive" technology, regardless of the cost-performance. Large procurements take a long time to move from inception of a proposal to procurement of technology. In Ontario, the proposal process to renew funding for OCLSC began in 1989 and, at the end of 1993, was still incomplete, the Centre itself having closed in March of 1992. By the time the political cycle is completed, the technology aspect of the proposal is obsolete.

If this initiative is to succeed in transcending the level of simple technology procurement(s), there must be a commitment on the part of the research community to support this initiative. This proposal is advocating a fundamentally different model of computation, one that is driven directly by the needs of researchers. To be successful, this model must have the active support of the research community as signatories to this initiative and, in return, the organizational model must empower the research community to control all aspects of the process. Through the active involvement of the researchers in the initiative, the danger that the venture could become isolated from the research community is minimized.
Regardless of the merit of the research programs, the management and organizational structure of the facility, the cost-effectiveness of the technology, or a host of other quantifiable measures, high performance computing and communications ventures in Canada are vulnerable to negative perceptions and competing political agendas. The best defense is a clear mission statement, less emphasis on the technology, and more emphasis on people, programs, and education, with clear leadership supported by a unified strategic vision.

Benefits

NSF Centers Program "An Extraordinary Success"

The Americans have hailed their NSF's Supercomputer Centers Program as "an extraordinary success." According to a review of the program in 1992, the program has reaped the following benefits:

* Enabled an enormous body of research by providing the national research community with access to state of the art facilities.
* Stimulated the development of the computational arts, sciences and engineering by involving many who had not previously used supercomputers.
* Contributed in important ways to education ranging from the graduate level to K-12.
* Strengthened the infrastructure of the computational arts, sciences and engineering through software development, and through its major role in the development of the national network.
* Played an important role in technology transfer through its interaction with vendors and its outreach to the industrial user community.

A National Metacenter

The NSF Centers' Directors have proposed to build a "National Metacenter" consisting of:

* A "national machine room" to allow users seamless access to all of the program's resources.
* A mechanism for involving other organizations.
* A framework for cooperation among the Centers and for planning on a national basis.

A Newfoundland Initiative in HPC&C

We propose an initiative based in Newfoundland that emulates the successful U.S.
model for building a high performance computing, communications and people-oriented infrastructure to support research. The scale of the initiative is appropriate for the local environment and leverages existing resources as much as possible. The primary benefit of adopting a balanced approach to technology acquisition and infrastructure support is that the benefits delivered to the local community scale well with the expenditures. A knowledge-based economy is a people-based economy above all else, and the emphasis in this proposal on balancing people with technology embraces this philosophy.

It is tremendously daunting when the pre-eminent economy in the world today is driven by the highest sense of urgency to embrace high performance computing and communications in building its national information infrastructure to maintain its economic leadership into the 21st century. Attempts in Canada to emulate these activities have met with limited success to date. In Newfoundland, we are driven by circumstance to adopt a similar sense of urgency in reshaping our economy. The level of collaboration and cooperation required across all sectors of the Newfoundland economy to meet this challenge is unprecedented in our history but tremendously exciting as well. This proposal highlights a modest strategy for a local community of researchers in the computational arts, sciences and engineering to participate in meeting this challenge.

Appendix 1: Research Projects

The following information is of great value to the proposed document "Research Projects at MUN." As there are insufficient resources to edit responses, please provide information in a format that could be released for publication.

Identification

Name, Institution, Department, Address, Phone, Electronic Mail Address.

Description of Research Project

Project Title. Field of Study.
Summary of research program, including objectives and goals (suggested length not greater than a single page, oriented to an audience that is not necessarily familiar with the field of study). Indicate the significance of the research being performed. Identify collaborators/co-investigators with affiliations. If appropriate, describe cross-disciplinary aspects of the work and/or industry associations that arise naturally from the research programs being conducted. Attach a recent list of publications. Include copies of publications that directly acknowledge research computing support or contributions provided by staff of Memorial University, either through joint authorship or acknowledgement. If a publication should have credited a significant contribution but did not, please so note.

Computational Methodology

Describe the computational methodology or algorithm(s). Describe your current computational resources. Specify machine(s) with model number if available (e.g. IBM RS/6000 Model 560) to aid in determining rated performance on industry standard benchmarks (if known, specify the Linpack DP 100x100 rating of your computer(s)). Give the performance of your code in Mflop/s (if the performance level is not known, indicate "Not known") or, alternatively, the typical turnaround time for a single job, or for completing a given computational study. Indicate what, if any, limitations are encountered with your current computational environment. Mention efforts, accomplishments and experiences or plans in optimizing codes. If known, indicate what type of computer architecture is currently most suitable for executing your code. If available, briefly relate experiences with other architectures.

Training of Professionals

Describe the contributions of your research program(s) to the training of professionals. Indicate the number of research associates, post-doctoral fellows, graduate students, and summer/co-op students.
Identify the level of research support staff directly supported by your research program whose duties are primarily devoted to support of your local research computing.

High Performance Computing Requirements

Describe needs for computational resources. Specify hardware requirements (cpu(s), memory, disk, tape(s), graphics, networking, etc.), software requirements (operating system, compiler(s), third-party applications, subroutine libraries, etc.) and infrastructure support requirements (systems management, operations support, high-level computational support, etc.). If possible, indicate requirements in terms of capabilities delivered to your research applications (for example, cpu performance could be related to expected decreases in turnaround time, memory in terms of larger model sizes using 3-D vs 2-D that could be addressed, high-level computational support for optimization and/or migration to new architecture, and so on). List other computational support received (for example, from other facilities or centres, elsewhere in Canada, the U.S., or abroad).

Appendix 2: Mission Statements

(Author's note: Some mission statements in HPC&C from elsewhere follow below; these will be deleted from the proposal after a mission statement is drafted.)

The U.S. High Performance Computing and Communications Program

"The goals of the High Performance Computing and Communications Program are to:

* Extend U.S. technological leadership in high performance computing and computer communications.
* Provide wide dissemination and application of the technologies both to speed the pace of innovation and to serve the national economy, national security, education, and the global environment.
* Spur gains in U.S. productivity and industrial competitiveness by making high performance computing and networking technologies an integral part of the design and production process.
These goals will be realized by achieving: computational performance of one trillion operations per second (10**12 ops, or teraops) on a wide range of important applications; development of associated system software, tools, and improved algorithms for a wide range of problems; a national research network capable of one billion bits per second (10**9 bits, or gigabits); sufficient production of Ph.D.s and other trained professionals per year in computational science and engineering to enable effective use and application of these new technologies." The goals will be met through coordinated government, industry, and university collaboration to: * Support solutions to important scientific and technical challenges through a vigorous R&D effort. * Reduce the uncertainties to industry for R&D and use of this technology through increased cooperation between government, industry, and universities and by the continued use of government and government-funded facilities as a prototype user for early commercial HPCC projects. * Support the underlying research, network, and computational infrastructures on which U.S. high performance computing technology is based. * Support the U.S. human resource base to meet the needs of industry, universities, and government. [53] Mission Statement: Ontario Centre for Large Scale Computation (OCLSC) To make the research and development communities -- universities, business, and industry in Ontario -- internationally competitive, through the provision of facilities for, and development of expertise in, applications of supercomputing technology. Goals and Objectives * To provide a centre of excellence with expert staff to enable researchers to use new high performance computer hardware, software, applications and graphics. * To increase the number of leading researchers who have direct experience in the application and use of computationally intensive techniques. 
* To provide a level of computational resource which is unlikely to be available within individual universities or small consortia within the next five years. * To ensure that all Ontario university researchers with need for the best supercomputing capacity can access the Centre, and that university researchers outside Ontario can access the centre on the basis of availability. * To assist in developing a high bandwidth communication network which will provide Ontario university users with the equivalent of local access. Mission Statement: Western University Research Consortium "The mission statement for the proposed Western Canadian Research Consortium (WURC) reflects the need for both HPC users and computer scientists to see themselves as full partners, and the need to facilitate communications at all levels. Its mission is thus threefold: * to support access to HPC by high quality university researchers; * to support research in parallel software algorithms and technologies that advance the field of high performance computing; and * to support research in, and access to, high performance networks capable of sustaining rapid communication among geographically distributed researchers and HPC machines." [54] (p.9) Appendix 3: Technology Assessment An important need area is the continuous and ongoing evaluation of technology options to identify the most cost-effective solutions to keep researchers at or near the forefront of technological developments. An important companion study accompanying this proposal is this technology assessment. Rather than being a review of current technology which would be out of date almost before this proposal could be released, the technology assessment reviews the important architectural features of a wide range of modern high performance computing platforms, including the reduced instruction set computer (RISC), the pipelined ("vector") computer, as well as parallel computers. 
But more importantly, the technology assessment includes a suite of benchmarks that measures the performance of research computers and attempts to highlight some features of systems that could prove to enhance or hamper the performance of these machines on researchers' applications. As experience with these systems increases, the technological assessments being performed will themselves become more sophisticated. It is important to note that performance analysis is an emerging discipline in its own right and is undergoing continuous development and refinement.

The motivation behind the development of a benchmark suite is to introduce some measure of objectivity into an area in which scientific rigour has traditionally been sorely lacking. The marketing hype surrounding high performance computing is enormous. Since high performance computers represent the leading edge of technology developments, there is little information available apart from the claims of vendors and marketing brochures on which to base a purchasing decision.

Another important conclusion that can be drawn from the technology assessment being performed is a strategy for enhancing performance. Although the performance can obviously be enhanced by purchasing a faster computer (yielding a factor of 2 or more), it may not be so obvious without a performance analysis that one may be able to attain similar or greater performance improvements through changes in the code itself, either through optimization techniques or through calls to appropriate subroutine libraries.

Methodology

For the technology assessment we have adopted the philosophy and methodology of the PARKBENCH (PARallel Kernel BENCHmarks) Committee. [55]

Preliminary Benchmark Results on Compact Applications

The ultimate test of a computer system is how the system performs on a user's research application. For this reason, the best benchmark is the full research code itself running a typical research problem.
Unfortunately, for many researchers the run-time for a full research problem is measured in days, weeks, months and, occasionally, years. For this reason, a compact application based on a subset of the full research application is extracted to represent the application on a typical problem size. One of the problems with modern high performance desktop systems is that there is no way of measuring the absolute performance of these systems without a hardware (or software) performance monitor, and the majority of systems do not have this capability [56]. One cannot make an informed decision to purchase a system without knowing whether one can expect to obtain 50% or 5% of the available performance of the system.

The compilation of a suite of compact applications to accompany this proposal has begun. To establish an approximate level of performance relative to the peak performance of the machine (i.e. measure the efficiency), the tedious hand counting of arithmetic operations in the compact applications has been started. For some applications with thousands of lines of code, this process takes some time. The arithmetic operations count provides a normalization factor for the true measure of performance, the inverse of the execution time.

The following table indicates candidate compact applications being considered for inclusion in the technology assessment (others are being sought):

    Symbol   Description            Discipline
    OM       Ocean modelling        Oceanography
    XY       2-D magnetic system    Superconductivity
    ST       Spectral transform     Climate modelling
    FF       Fuzzy-flops            Earth Science
    DM       DeMon                  Molecular Modeling
    AB       ab initio              Quantum chemistry

The following table indicates the technology now used as computational servers at MUN:

    Group/Dept       System         CPU                               Number    Linpack DP 100x100
                                                                      of CPUs   (Mflop/s)
    Oceanography     SGI 4D/360     R3000 (2 x 25 MHz, 4 x 33 MHz)    6         3.9 (25 MHz), 5.0 (33 MHz)
                     IBM 320H       RS/6000 320H (25 MHz)             1         12
    Cond. Matter     SGI Crimson    R4000 (50 MHz)                    1         16
                     SGI Indigo     R4000 (50 MHz)                    1         16
    Earth Sciences   SUN SS1000     SuperSparc (50 MHz)               4         27
    Engineering      DEC 4000/610   Alpha AXP (160 MHz)               1         35

The Linpack performance averaged over the 14 cpus in the above list is 15 Mflop/s. The Linpack 100 by 100 double precision benchmark is the industry standard. It is a "good" benchmark in that it correlates well with the sustainable performance of the machine on codes with above average performance, but with a problem size "small enough" to clearly illustrate the diminishing returns provided by the more expensive high performance machines.

The compact applications were run on platforms currently available on campus and, in some cases, on the Fujitsu VPX240/10 supercomputer at HPCC. The following table is a preliminary summary of the relative performance of servers on campus:

    Benchmarks: om.f (Mflop/s), st.f (Mflop/s), xy.f (Mflop/s), ff.c (Mflop/s), tr.f (MW/s)

    SGI 4D/360 (R3000 1 cpu): 2.0
    SGI 4D/360 (R3000 2 cpus): 2.4
    SGI 4D/360 (R3000 4 cpus): 2.6
    SUN SS1000: 6.3 9.9
    SGI Crimson (R4000): 7.8 7.4 2.23
    Challenge XL (R4400): 12.4 10.4 3.65
    DEC 3000/400 (133 MHz): 14.7 12.7 6.9 4.70
    DEC 4000/610 (160 MHz): 11.0 10.5 5.5
    IBM RS/6000 370: 23.0 12.2 *Single precision
    IBM RS/6000 590: 59.2 20.5 *Single precision
    Fujitsu VPX240/10: 17.0 143.3 225.7

The only preliminary conclusion drawn from these figures so far is the observation that performance on real-world applications is typically about half of the machine's Linpack rating. This is consistent with the measured performance levels at supercomputing centres in Canada and the U.S. The weighted average performance is then approximately 7.5 Mflop/s, or 1/10th of the average measured performance of a 1989 vintage CRAY Y-MP.

Cost-Benefit of High Performance Computers

The costs of technology have traditionally dominated discussions in high performance computing, primarily because expenditures in technology have been very much greater than expenditures on everything else.
A new generation of microprocessor technology has now attained a significant fraction of (or even exceeds in isolated cases) the performance of the single-processor traditional supercomputer (circa 1984) at a cost ($100K amortized over 3 years at 5000 cpu-hrs/yr yields $7/cpu-hr) that is now less than the costs of people. Another factor is the value of the research program being leveraged by the computing technology. Traditional cost/performance calculations of computers typically do not include this. A survey of research computing expenditures at MUN indicates that more than $0.8 million/yr is currently being spent to support computational research. This includes the cost [57] of approximately 14 cpus of various vintages used primarily as servers (excluding workstations sitting on desks), the salaries of computing support staff, maintenance, software, space, power, and networking. This figure does not include the salaries and overheads for academic staff, post-doctoral fellows, graduate students or the like, or any other costs pertaining to the research programs being pursued (this is the factor being leveraged). If the above cost of computing is divided by the aggregate throughput of the 14 cpus (at 5000 cpu-hrs per cpu per year), the average cost per cpu-hr is about $10/cpu-hr. To most, this number will appear high since there are significant infrastructure and overhead costs now included: hidden costs not normally accounted for. How does this compare with the traditional supercomputer? In 1993, it was possible to purchase an 8 processor 1990-vintage CRAY Y-MP for about $5 million. [58] (Original list price in 1990 was about U$30 million. Current rates of depreciation are 50%/yr, yielding U$3.75 million after 3 years. The price quoted by the vendor is thus consistent with market value.) The operational and support costs for this class of machine are high.
A good approximation for the full cost is provided by the actual $2.5 million/yr required to support the CRAY at OCLSC in Toronto. This includes a staff of 16, on-site maintenance, facility management, space, power, front-end computers, and networking. Amortizing the machine costs over 3 years in the same way as for the workstations and summing the throughput of 8 cpus, the cost for the Y-MP is about $100 per cpu-hour. Preliminary benchmarks indicate that the average performance of the research codes at Memorial is very well approximated as 1/2 the rated Linpack double precision performance rates for the cpus, or about 7.5 Mflop/s per cpu. The actual average performance reported at OCLSC on the X-MP (scaled to the Y-MP) as well as the actual reported average performance per cpu of the Y-MP at U.S. centres is about 70 Mflop/s. This latter number is somewhat less than half the Linpack performance rating for this machine (161 Mflop/s). These numbers indicate only that an appropriately priced mainframe "supercomputer" can have the same order of magnitude cost/performance (at $100/cpu-hr delivering 70 Mflop/s) as the microprocessor-based server (at $10/cpu-hr delivering 7.5 Mflop/s).

Which system provides the best cost-benefit for a given research program? This is an impossibly subjective question. What we can give is the value of the research that yields equal cost-benefit for the two systems, using the formula

    (a + $10/hr)/7.5 = (a + $100/hr)/70

where a is the value of the investment being leveraged. The "breakeven" point is $0.80/hr. What this means is: if the value of the research being leveraged is greater than $0.80/hr, the faster system provides the best cost-benefit to the research program.

Assuming we start with equivalent cost/performance in year 0, what would the lifetime of the above mainframe be if workstations doubled in performance every year for the same cost and the mainframe were required to deliver the higher benefit over the entire life of the system?
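The break-even relation above can be solved for a in closed form. The following is a minimal sketch rather than part of the original study: the function names are illustrative, and the default inputs are the rounded figures quoted in the text ($10/cpu-hr at 7.5 Mflop/s for the workstation server, $100/cpu-hr at 70 Mflop/s for the Y-MP). It also evaluates the scenario posed in the question above, in which workstation performance doubles each year at constant cost.

```python
def cost_per_cpu_hour(capital, years, annual_support, cpus, hours_per_cpu_year=5000):
    """Amortized capital plus yearly support, spread over delivered cpu-hours."""
    yearly_cost = capital / years + annual_support
    return yearly_cost / (cpus * hours_per_cpu_year)


def break_even_value(cost_slow, perf_slow, cost_fast, perf_fast):
    """Solve (a + cost_slow)/perf_slow == (a + cost_fast)/perf_fast for a.

    Returns None when the nominally faster system is no longer faster,
    in which case the cheaper system is ahead for every value of a.
    """
    if perf_fast <= perf_slow:
        return None
    return (perf_slow * cost_fast - perf_fast * cost_slow) / (perf_fast - perf_slow)


def equal_leverage_by_year(years, cost_ws=10.0, perf_ws=7.5,
                           cost_mf=100.0, perf_mf=70.0):
    """Break-even value for each year, with workstation performance
    doubling yearly at constant cost (the assumption stated in the text)."""
    return [break_even_value(cost_ws, perf_ws * 2 ** year, cost_mf, perf_mf)
            for year in range(years)]


if __name__ == "__main__":
    # Y-MP: $5 million amortized over 3 years plus $2.5 million/yr support, 8 cpus.
    print(f"Y-MP cost per cpu-hr: {cost_per_cpu_hour(5.0e6, 3, 2.5e6, 8):.2f}")
    for year, value in enumerate(equal_leverage_by_year(5)):
        label = "workstation ahead for any a" if value is None else f"{value:.2f} $/hr"
        print(year, label)
```

Amortizing the quoted Y-MP figures directly gives roughly $104 per cpu-hour, which the text rounds to $100; the sketch uses the rounded values in the break-even calculation so that it matches the figures in the text.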
To calculate this we perform the same calculation as above, but multiply 7.5 Mflop/s by a power of two for each year. This gives the following table for equal leverage:

    Year    Equal Leverage ($/hr)
    0         0.80
    1        14.55
    2        57.50
    3       530.00

To put the numbers in the table in perspective, a salary of $27,500/yr (post-doctoral fellow) is equivalent to $16/hr. With benefits and overheads (at 50%), the cost to the research program is higher, or about $24/hr. The cost to the university of a university professor including salary ($60,000), overheads and benefits (at 50%) is about $54/hr. When the yearly research expenditures of the principal investigators using high performance computers in whole or in part to leverage their research programs are included, the equivalent "hourly rate" easily doubles or triples. Of course, not all of the "value" of these research programs is leveraged by computers (this is very difficult to calculate), but the working assumption used in this proposal is that a substantial fraction (say 25%) of the total cost of the research programs outlined in this proposal is leveraged by high performance computers and the expenditures on computing technology in the proposed budget are indexed to this value.

The higher performance computer in this example leverages these research costs more effectively for a period of only 2-3 years. With performance doubling every year, the lifetime of the mainframe is therefore no more than 2-3 years in this example. Technology renewal on a 2-3 year cycle is therefore essential.

Note that as soon as the systems become saturated and the "stretch time" for a job increases, the cost/performance of both the workstations and supercomputers increases dramatically. With the exception of compute servers recently acquired in Earth Sciences and Engineering, the servers at MUN are saturated with stretch times of 3-4 times the CPU time.
A stretch time of 4 is equivalent to 2 years of depreciation (two performance doublings), which means that a workstation with a lifetime of 2 years becomes instantly obsolete. Workstation servers must therefore be continuously upgraded.

Can a premium in cost be justified in order to get increased performance to better leverage a given research program? To answer this question, we can assume a simple model in which the cost of increased performance grows as a power of the unit performance (for example, cost growing as the square of the performance). This model encompasses both extremes in technology, the mythical ultra-efficient computer delivering 1/10th the desired performance for 1/100th the cost and the mythical "supercomputer" delivering 10 times the performance for 100 times the cost. The cost/performance in this model is then

    c/p = (a + b*p*p/2)/p

Here a is the unit cost of the research program being leveraged, b is the unit cost of the computer system, the factor of 2 is the single year depreciation cost of the computer (50%/yr) assuming performance doubles every year, c/p is the cost/performance and p is the performance. The minimum in the cost/performance occurs for

    p*p = a/(b/2)

What this means is that, when the yearly expenditures on research programs (a) are greater than the yearly costs (b/2) of the computer, you can justify a premium cost for the computer if it provides faster turnaround time that directly leverages the value of a research program. This is partly why computational scientists procure RISC systems rather than PCs. If the computer costs are significantly greater than the cost of the research programs being leveraged, a highly undesirable situation, then the cost-effectiveness of the computer is the over-riding concern. The value of the research being leveraged by the technology is crucial to any arguments of cost-benefit.
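The location of the minimum quoted above can be checked with a short derivation. This is a sketch that assumes a and b are constants independent of p, and that p is positive:

```latex
\begin{align*}
\frac{c}{p} &= \frac{a}{p} + \frac{b\,p}{2}, \\
\frac{d}{dp}\left(\frac{c}{p}\right) &= -\frac{a}{p^{2}} + \frac{b}{2} = 0
  \quad\Longrightarrow\quad p^{2} = \frac{a}{b/2} = \frac{2a}{b}, \\
\frac{d^{2}}{dp^{2}}\left(\frac{c}{p}\right) &= \frac{2a}{p^{3}} > 0
  \quad\text{(so the stationary point is a minimum)}, \\
\left.\frac{c}{p}\right|_{\min} &= \sqrt{2ab}.
\end{align*}
```

At this optimum the computing term b*p*p/2 equals a. Within this simple model, the best cost/performance is therefore reached when yearly spending on the computer matches the yearly value of the research being leveraged.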
The model shows that the most cost-effective technology may not provide the best leverage of a particular investment (i.e. the most cost-effective computer may be a false economy). It also shows that paying a premium to deliver a significant performance gain can be justified if one doesn't spend more on technology than on the research programs. What is also necessary is that when research programs are limited by the performance of the available technology, the technology must be continually upgraded to provide minimum costs with maximum benefits; otherwise, the research programs themselves quickly become non-competitive.

Appendix 4: The Collaboratory

A. James Stacey: Biographical Sketch

Dr. Stacey is a computational scientist with more than 10 years of experience working with state-of-the-art computing technologies. Dr. Stacey obtained his Ph.D. in Physics from the University of Toronto in 1987. During his tenure as a Ph.D. candidate, he obtained a broad range of experience in the development of high performance computing applications in the high energy physics research community. Most notable amongst these experiences was his involvement in an early generation parallel computing project at the University of Toronto based on emulating the IBM 370/168 and harnessing a network of such parallel processing engines in particle physics event reconstruction and Monte Carlo simulations.

From January 1987 through January 1992, Dr. Stacey was Computational Scientist at the Ontario Centre for Large Scale Computation, Canada's de facto national supercomputing centre until it was closed in March of 1992. While the closing of OCLSC marked the end of the first generation of academic supercomputing centres, this hiatus in the Ontario supercomputing programme provided the impetus for the foundation of The Collaboratory Inc., which continues to support national initiatives promoting state-of-the-art technologies in addressing pressing scientific grand challenge problems.
In January of 1992, Dr. Stacey founded The Collaboratory Inc., which became a Canadian corporation, registered in Ontario, in February of 1992. In January of 1993, The Collaboratory relocated to St. John's and registered in the province of Newfoundland and Labrador. In addition to his tenure at OCLSC, Dr. Stacey has participated in the commissioning of the NEC SX-3/44 supercomputer at AES under contract with HNSX Supercomputers Inc. as well as participating in the startup of the HPC High Performance Computing Centre in Calgary. The HPC&C initiative at MUN is Dr. Stacey's fourth involvement in a high performance computing and communications venture.

Dr. Stacey has acquired a wide range of expertise in parallel computing on parallel computing arrays and pipelined CRAY, NEC and Fujitsu supercomputers; RISC; networking; UNIX and VAX/VMS; FORTRAN and C; numerical methods (most notably spectral methods in the solution of non-linear differential equations) in ab initio quantum chemistry and atmospheric physics; distributed computing and user interfaces with the X Windows System; and scientific visualization.

Prior to entering graduate studies in 1981, Dr. Stacey was employed as a Jr. Engineer in the Nuclear Generation Division of Ontario Hydro. At Ontario Hydro, Dr. Stacey performed reliability modelling and analysis of safety systems in the Reactor Safety Section of the Radioactivity Management and Environmental Protection Department. Dr. Stacey is a member of the Executive Board of Super*CAN (Canada's national association for high performance computing and communications) and an associate member of the Institute of Electrical and Electronics Engineers (IEEE).

Professional Services

The range of professional services provided by The Collaboratory Inc. is oriented to the immediate needs of industrial, government and academic clients.
Some of the areas in which the Collaboratory is active include: * Contract research and development; * Supercomputing consulting; * Technology assessment; * Proposal preparation; * Project management; * Strategic partnerships and collaborations; * Industry, government and academic liaison; * Venture founding and enterprise development; * Support of national initiatives. Technical Expertise With the combination of a strong educational background at the Ph.D. level in the physical sciences, over 10 years of experience in state-of-the-art computing technology as well as his own research contributions in particle physics, computational chemistry and atmospheric physics, Dr. Stacey provides unique capabilities and knowledge to clients of the Collaboratory. To remain at the leading edge of the quickly changing supercomputing marketplace, the Collaboratory is committed to maintaining the highest level of technical expertise on behalf of its clients. The following is a partial list of the areas in which Dr. Stacey has acquired expertise: * Emerging Standards for High-Speed Local Area Networking, Supercomputing '92, Minneapolis. * Archival Storage for Supercomputing Environments; Supercomputing '92, Minneapolis. * Interconnected Heterogeneous Multiprocessors, Supercomputing '92, Minneapolis. * Computational Services in the UNIX Supercomputing Center, Supercomputing '92, Minneapolis. * Programming Massively Parallel MIMD Computers, Supercomputing '91, Albuquerque. * Super and Parallel Computers: Architecture and Performance, Supercomputing '90, New York. Footnotes and References 1 "High Performance Computing (HPC) Technology in Education: An evaluation of needs, status and potential for Memorial University of Newfoundland, and potential benefits for education in Newfoundland," George Miminis and Gayle Tapper, Memorial University of Newfoundland, March 1993. 2 Rounding errors contribute 1% to the sum of the individual entries. 
3 Author's note: In 1992, Memorial University's Department of Computing and Communications (C&C) initiated a process of extensive consultation with the university community in drafting an information technology strategic plan. High performance computing forms a small part of this ambitious document, and the part of the plan dealing with HPC&C is reproduced in this section.
4 "Report of the Program Advisory Committee to The Division of Advanced Scientific Computing on Future Directions for the NSF Supercomputer Centers Program," Paul Woodward, Chair, June 1992.
5 "Report of the ad hoc Committee on Research Computation," A Report to The Natural Sciences and Engineering Research Council, J. Alan George, Chairman, September 1990, Section 1.2, p. 3.
6 "Feasibility Study: High Performance Computing Facilities in Canada," (A study commissioned by the Natural Sciences and Engineering Research Council), Elizabeth Pearce, November 1992.
7 Ibid, p. 9.
8 "Grand Challenges: High Performance Computing and Communications, The FY 1992 U.S. Research and Development Program," A Report by the Committee on Physical, Mathematical, and Engineering Sciences, Federal Coordinating Council for Science, Engineering and Technology, Office of Science and Technology Policy.
9 "Report of the Program Advisory Committee to The Division of Advanced Scientific Computing on Future Directions for the NSF Supercomputer Centers Program," Paul Woodward, Chair, June 1992, p. 1.
10 A Large Scale Computation Evaluation Study For The Council of Ontario Universities, Final Report, Gillespie, Folkner & Associates, Inc., 6 September 1991.
11 A Proposal for an NSERC Collaborative Special Project Grant, Volume 1, Western University Research Consortium on High Performance Computing and Networking, November 1993.
12 This section is largely based on a document entitled Computing at Memorial, by J.P. Whitehead, J.B. Lagowski, M.D. Whitmore, Dept. of Physics, September 1992.
13 "Report of the ad hoc Committee on Research Computation," A Report to The Natural Sciences and Engineering Research Council, J. Alan George, Chairman, September 1990.
14 Ibid, p. 2.
15 CANARIE Workshop, presented by Thomas Grandy, NGL Nordicity Group Ltd., December 1993.
16 K.G. Lamb, Department of Physics, 1993 (unpublished).
17 "High Performance Computing (HPC) Technology in Education: An evaluation of needs, status and potential for Memorial University of Newfoundland, and potential benefits for education in Newfoundland," George Miminis and Gayle Tapper, Memorial University of Newfoundland, March 1993.
18 "Report of the ad hoc Committee on Research Computation," A Report to The Natural Sciences and Engineering Research Council, J. Alan George, Chairman, September 1990, Section 1.2, p. 10.
19 "Report of the ad hoc Committee on Research Computation," A Report to The Natural Sciences and Engineering Research Council, J. Alan George, Chairman, September 1990, p. 8.
20 The real work of writing the technology assessment will begin only after this document is released.
21 For many observers, the era of parallel computing was ushered in by Danny Hillis of Thinking Machines Corp. in his Keynote Address at Supercomputing '91 in New York.
22 NAS Parallel Benchmark Results 10-93, David H. Bailey, Eric Barszcz, Leonardo Dagum and Horst D. Simon, RNR Technical Report RNR-93-016, October 27, 1993.
23 "The 1992 MPCI Yearly Report: Harnessing the Killer Micros," Lawrence Livermore National Laboratory, August 1992.
24 "Report of the Program Advisory Committee to The Division of Advanced Scientific Computing on Future Directions for the NSF Supercomputer Centers Program," Paul Woodward, Chair, June 1992, p. 10.
25 Towards a New Economy in Newfoundland and Labrador, Economic Recovery Commission Annual Report, September 1 1992 to August 31 1993, from the Foreword.
26 Ibid, p. 3.
27 Ibid, p. 16.
28 Calgary Sun, Sunday, June 6, 1993.
29 "National Initiative for Scientific Computing (NISC): A Proposal For Implementation," A Submission to the Natural Sciences and Engineering Research Council, J. Alan George, Andrew Bjerring, Pardeep Ahluwalia, June 1991, p. 13.
30 Op. cit., p. 9.
31 Op. cit., p. 27.
32 Op. cit., p. 23.
33 Ibid, p. 23.
34 Ibid, p. 23.
35 "National Initiative for Scientific Computing (NISC): A Proposal For Implementation," A Submission to the Natural Sciences and Engineering Research Council, J. Alan George, Andrew Bjerring, Pardeep Ahluwalia, June 1991, p. ii.
36 Ibid, p. 11.
37 Ibid, p. 12.
38 NSERC Strategy Document, Memo to University Presidents, Vice-Presidents (Research), and Deans of Graduate Studies, by Peter Morand, NSERC President, November 12, 1993, p. 2.
39 Ibid, p. 2.
40 B. Attfield, iAi Consulting Inc., private communication. (Mr. Attfield is a former Director of Informatics at AES.)
41 "New Life for Spencer Hall as a technology access facility," by Sarah Drinkwater, The Gazette, Jan. 13, 1994.
42 Craig McNamara, Director, Division of Educational Technology, Memorial University of Newfoundland, January 17, 1994.
43 "New Life for Spencer Hall as a technology access facility," by Sarah Drinkwater, The Gazette, Jan. 13, 1994.
44 NSERC Strategy Document, Memo to University Presidents, Vice-Presidents (Research), and Deans of Graduate Studies, by Peter Morand, NSERC President, November 12, 1993, p. 3.
45 NSERC Strategy Document, op. cit., p. 3.
46 NSERC Strategy Document, op. cit., p. 4.
47 NSERC Strategy Document, op. cit., p. 4.
48 "Statement of Paul E. Rubbert before the Subcommittee on Science, House Committee on Science, Space and Technology," October 26, 1993.
49 DP Budget, Computer Economics, January 1990.
50 A Large Scale Computation Evaluation Study For The Council of Ontario Universities, Final Report, Gillespie, Folkner & Associates, Inc., 6 September 1991, p. 23.
51 Figures provided by Mr. Aidan Kiernan, Director, University Works.
52 Cost estimates were provided by the Department of Computing and Communications.
53 "Grand Challenges: High Performance Computing and Communications, The FY 1992 U.S. Research and Development Program," A Report by the Committee on Physical, Mathematical, and Engineering Sciences, Federal Coordinating Council for Science, Engineering, and Technology, Office of Science and Technology Policy, Supplement to the President's Fiscal Year 1992 Budget.
54 Western University Research Consortium On High Performance Computing, Draft Proposal, July 20, 1993.
55 Public International Benchmarks for Parallel Computers, PARKBENCH Committee: Report-1, assembled by Roger Hockney (chairman) and Michael Berry (secretary), Computer Science Department, University of Tennessee, CS-93-213, November 1993.
56 Performance monitors are available on CRAY and NEC supercomputers; none of the RISC or PC systems based on commodity microprocessors have performance monitors.
57 Cost is defined as the purchase price amortized over 3 years.
58 Cray Research Canada Inc., private communication.