N. N. Mirenkov, Professor
S. G. Sedukhin, Associate Professor
Shietung Peng, Assistant Professor
Sergey V. Ten, Assistant Professor
Our general program is high-performance computing and networking. We are undertaking research efforts in parallel and distributed algorithms and architectures, as well as visual languages and tools. We lead the joint research project ``Massively Parallel Computation: Synchronous vs. Asynchronous Designs'' and take part in two other projects: Aizu Supercomputer and Efficient Algorithmic Design Techniques for Parallel Computing. We are also developing a top-down education project related to distributed parallel processing. Our facilities include the university workstation network, a CM-5, and a multi-transputer system, together with the corresponding software: PVM, Linda, p4, LAM, HeNCE, PORTAL, AVS, Khoros, etc. The S4CAD system, developed in the laboratory for the systematic design and analysis of systolic algorithms and structures, is also used. We are studying the problem of parallel program portability and developing the supporting tools. Our theoretical work concerns efficient algorithms for parallel numerical computation, including matrix multiplication, the Fourier transform, and singular value decomposition, as well as fault-tolerant routing on interconnection networks such as hypercubes, star graphs, and tori.
An important question in our research is parallel program transparency and the accessibility of massively parallel computers to large user populations. We are developing a visual language paradigm for the interactive specification of application algorithms. The approach is based on a set of computational schemes (``shapes'' of computation) represented by color figures, pictures, and animation films with sound accompaniment. Each film consists of a series of frames (computational steps) and reflects some knowledge about data processing. Each frame ``brightens up'' a substructure of data for which operations should be specified. As a rule, this substructure is a set of points and/or moving objects in a multi-dimensional space-time. Users embed their algorithms into computational schemes by making these schemes more precise; in effect, a user defines the specification by creating a new film. The corresponding program (sequential or parallel) is generated automatically.
Each computational scheme is itself considered a special-purpose application for which a parallel program is developed in advance as an item of the system library. The specification of a problem is treated as input data for such a parallel program. However, this data may involve a composition of several computational schemes, irregularity of computation, special data sizes, etc. This means that, to achieve efficiency, a single variant of the scheme implementation is not enough. Each computational scheme is therefore supported by at least a few parallel programs from the system library, with different programs corresponding to different conditions defined by the specification. In other words, a computational scheme has a special decision-making space. This space takes into account the level of problem granularity (coarse-grained, medium-grained, and fine-grained), irregularity (small, essential, great), the dynamics of the data decomposition (static, hybrid, and fully dynamic), and other algorithm features. Because each computational scheme reflects certain knowledge about data processing, we select the best possible solutions for implementing this data processing under different conditions, on the basis of the best hand-made algorithms. As a result, not all points in a scheme's decision space are used; only a subset, often just a few, are. For these selected points we are developing scalable parallel programs as items of the system library. It is important to note that for each such program a conventional environment is chosen for the final stage of implementation. Currently, extensions of Fortran and C, as well as AVS, PVM, and Linda, are considered as such environments.
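As a purely illustrative sketch of such a decision space (all program names and table entries below are invented for this example, not taken from the actual system library), the selection of a library program from the conditions defined by a specification can be pictured as a lookup in a sparsely populated table:

```python
# Hypothetical sketch: the decision space of one computational scheme.
# Axes follow the dimensions named in the text.
GRANULARITY = ("coarse", "medium", "fine")
IRREGULARITY = ("small", "essential", "great")
DECOMPOSITION = ("static", "hybrid", "dynamic")

# Only a few points of the full space are populated, each by the best
# hand-made algorithm for those conditions (names are invented).
LIBRARY = {
    ("coarse", "small", "static"): "pvm_block_variant",
    ("fine", "great", "dynamic"): "linda_worker_pool_variant",
}

def select_program(granularity, irregularity, decomposition):
    """Pick the library program matching the specification's conditions,
    falling back to a generic variant when no tuned point exists."""
    assert granularity in GRANULARITY
    assert irregularity in IRREGULARITY
    assert decomposition in DECOMPOSITION
    return LIBRARY.get((granularity, irregularity, decomposition),
                       "generic_variant")
```

The point of the sketch is only that the space is sparse: tuned programs exist at selected points, and other specifications fall back to a generic implementation.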
Refereed Journal Papers
A brief overview of parallel programming languages developed in the former USSR is given. First, some historical remarks and early results are described. Then application-oriented languages are presented, and programming systems supporting a partnership of universality and specialization are considered.
This paper defines the k-tree core of a tree, a natural generalization of the core of a tree. Two efficient algorithms for finding a k-tree core of a given tree are presented, with time complexities O(kn) and O(n log n), respectively, where n is the size of the tree. The algorithms have applications to resource allocation in distributed databases.
A core of a tree T is a path P in T which minimizes the sum of the distances from the nodes in T to path P. An optimal parallel algorithm for finding a core of a tree is given in this paper. The algorithm runs in O(log n) time using O(n/(log n)) processors on the EREW PRAM model.
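To make the definition concrete, here is a brute-force illustration in Python. This is not the O(log n) EREW PRAM algorithm of the paper, only a direct enumeration over all node pairs (each pair determines a unique path in a tree) that computes a core of a small tree straight from the definition:

```python
from collections import deque

def bfs(adj, src):
    """BFS distances and parents from src; in a tree, parents give the
    unique path back to src."""
    dist, parent = {src: 0}, {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                q.append(v)
    return dist, parent

def core(adj):
    """Return (cost, path): a path minimizing the sum, over all nodes v,
    of the distance from v to the path. Brute force over all endpoints."""
    nodes = list(adj)
    dists = {v: bfs(adj, v)[0] for v in nodes}  # all-pairs distances
    best = None
    for a in nodes:
        _, parent = bfs(adj, a)
        for b in nodes:
            path, u = [], b            # unique a-b path via parents
            while u is not None:
                path.append(u)
                u = parent[u]
            cost = sum(min(dists[p][v] for p in path) for v in nodes)
            if best is None or cost < best[0]:
                best = (cost, path)
    return best

# A small "spider": center 0 with three legs 0-1-2, 0-3-4, 0-5-6.
adj = {0: [1, 3, 5], 1: [0, 2], 2: [1], 3: [0, 4], 4: [3], 5: [0, 6], 6: [5]}
cost, path = core(adj)
```

On this spider any path spanning two full legs is a core: the third leg contributes distances 1 + 2 = 3, and no path does better.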
In this paper, efficient parallel algorithms for finding an optimal path-shaped or tree-shaped facility of a specified size in a tree network are presented. The optimization criteria considered are minimizing or maximizing the sum of distances or the eccentricity. The parallel computational model used is the EREW PRAM.
Refereed Proceeding Papers
The subject of the paper is pyramid structures (topologies) for multiprocessors, which show good properties with respect to scalability, adaptability, generality, performance and fault tolerance. The conceptual basis of our approach is the following: tight coupling, restricted neighbourhood, distributed shared memory, regularity, homogeneity, hierarchy and layer specialization. We consider a model of a connection-network, a construction kit for multiprocessors as well as potential efficiency of the pyramid system, features of pyramid links, and functional specialization of pyramid subsystems.
The performance of prospective parallel computers is approaching the teraflops level. An important feature of the architectures of these computers is scalability. However, scalability is usually limited to the number of processors and memory units and does not take into account the word length of operands. The idea behind the SCORE project is the development of a few prototypes of parallel architectures that make use of dynamically changing operand lengths, interval analysis, and symbolic transformations. This kind of architecture is considered a basis for high-accuracy computation through the elimination of round-off errors. In this paper we describe one of the SCORE prototypes, developed within the framework of the vertical processing architecture. The description includes the philosophy of the project and the architecture of the system. We also briefly consider the software environment, as well as some experiments and comparisons.
In this paper, we present parallel algorithms to recognize k-trees and to find a collection of k vertex-disjoint paths (cables) between any two nodes of a k-tree. We give a tree representation of a k-tree and a parallel construction of this representation, and analyze the time and processor complexity of both the recognition and the cable-finding algorithms.
This paper presents optimal algorithms for finding disjoint paths in a star graph, an interesting network for fault-tolerant computation. Our k-pairwise routing algorithm runs in time that improves significantly on the previous result. The lengths of the longest paths constructed by our algorithms also improve over previous results.
Set-to-set routing in the presence of certain node failures is considered, and efficient algorithms for set-to-set routing in hypercubes and star graphs are given. For n-dimensional hypercubes, the algorithm finds disjoint paths of length at most 2n-2; for n-dimensional star graphs, it finds disjoint paths of length at most d(G)+5, where d(G) is the diameter of the graph G.
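For the hypercube case, the flavor of such constructions can be illustrated with the classical single-pair result (a sketch of a standard textbook construction, not the set-to-set algorithm of the paper): between any two nodes of the n-cube there are n internally node-disjoint paths, obtained by correcting the differing bits in cyclically rotated orders and by detouring through each non-differing dimension:

```python
def disjoint_paths(s, t, n):
    """n internally node-disjoint s-t paths in the n-cube Q_n.
    Nodes are n-bit integers; each edge flips exactly one bit."""
    delta = s ^ t
    D = [i for i in range(n) if delta >> i & 1]  # differing dimensions
    paths = []
    # |D| paths: fix the differing bits in cyclically rotated orders;
    # each path's interior nodes form a distinct "cyclic interval" of D.
    for r in range(len(D)):
        node, path = s, [s]
        for i in D[r:] + D[:r]:
            node ^= 1 << i
            path.append(node)
        paths.append(path)
    # n - |D| paths: detour out along an unused dimension j, correct all
    # differing bits, then come back; interiors all carry bit j.
    for j in range(n):
        if delta >> j & 1:
            continue
        node = s ^ (1 << j)
        path = [s, node]
        for i in D:
            node ^= 1 << i
            path.append(node)
        path.append(node ^ (1 << j))  # back to t
        paths.append(path)
    return paths
```

The detour paths have length dist(s, t) + 2, so all n paths stay within two steps of the shortest-path distance.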
The paper shows how a rectangular array of N = N1 x N2 processing elements (PEs), where N1 and N2 are relatively prime, can be used to carry out an efficient two-dimensional systolic implementation of the N-point DFT, offering highly attractive throughput rates relative to other solutions, such as the conventional linear systolic array. The systematic approach allows one to choose, among all possible systolic processors for the 2-D DFT algorithm, an optimal design with the minimum number of locally connected PEs, good coordination between the processes of computation and communication, a small number of I/O pins, the minimum possible processing time, and the minimum amount of input data.
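The role of the coprime factorization N = N1 x N2 can be sketched with the classical prime-factor (Good-Thomas) index mapping on which such two-dimensional DFT decompositions rest. The following illustration (not the paper's systolic design) verifies that, under this mapping, the N-point DFT reduces to independent row and column DFTs over an N1 x N2 grid, with no twiddle factors between the two stages:

```python
import cmath
from math import gcd

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(N))
            for k in range(N)]

def prime_factor_dft(x, N1, N2):
    """N-point DFT via an N1 x N2 two-dimensional DFT (Good-Thomas
    mapping); valid when gcd(N1, N2) == 1."""
    N = N1 * N2
    assert len(x) == N and gcd(N1, N2) == 1
    # Input mapping: n = (N2*n1 + N1*n2) mod N is a bijection onto the grid.
    grid = [[x[(N2 * n1 + N1 * n2) % N] for n2 in range(N2)]
            for n1 in range(N1)]
    # Row DFTs (length N2), then column DFTs (length N1) -- no twiddles.
    rows = [dft(row) for row in grid]
    cols = [dft([rows[n1][k2] for n1 in range(N1)]) for k2 in range(N2)]
    # Output mapping by the Chinese Remainder Theorem: k -> (k mod N1, k mod N2).
    return [cols[k % N2][k % N1] for k in range(N)]
```

Because the two stages are independent, the N1 x N2 grid of length-N1 and length-N2 DFTs maps naturally onto a rectangular array of PEs, which is what makes the coprime factorization attractive for systolic implementation.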
This article presents a systematic approach for the formal synthesis of a set of systolic array processors for matrix algorithms. This systematic approach is the theoretical background of the visual S4CAD software tool, which is also described. Starting from the source algorithm specification in the form of linear recurrence equations, the formal approach and the S4CAD tool allow us to obtain and examine the set of admissible array processors. The tool uses the media technologies of the Windows 3 graphical environment, making it more comfortable for the user to evaluate and choose an optimal solution. A number of basic algorithms from linear algebra and graph theory have been treated with the formal approach and added to the library.
Technical Reports
Doctoral Dissertations Advised
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: S. G. Sedukhin.
Academic Activities
Reviewer for and member of the American Mathematical Society.
Member of Editorial Board of Journal PROGRAMMING, Nauka.
Member of Editorial Board of the Journal SCIENTIFIC PROGRAMMING.
Member of Editorial Board of the new series of books in Parallel and Distributed Computing.
Member of Program Committee and Member of Standing Committee.
Referee of IEEE Trans. Computers.
Referee of IEEE Trans. Parallel and Distributed Systems.
Referee of Journal of Algorithms.
Referee of Journal of Parallel and Distributed Computing.
Member of Editorial Board of the International Journal Parallel Processing Letters.
Member of Program Committee.
Editor-in-Chief of the Computer Science series.