N. N. Mirenkov, Professor
S. G. Sedukhin, Associate Professor
Shietung Peng, Assistant Professor
Sergey V. Ten, Assistant Professor
Our general program is high-performance computing and networking. We are undertaking research efforts in parallel and distributed algorithms and architectures, as well as visual languages and tools. We lead the joint research project ``Massively Parallel Computation: Synchronous vs. Asynchronous Designs'' and take part in two other projects: Aizu Supercomputer and Efficient Algorithmic Design Techniques for Parallel Computing. We are also developing a top-down education project related to distributed parallel processing. Our facilities include the university workstation network, a CM-5, and a multi-transputer system, together with the corresponding software: PVM, Linda, p4, LAM, HeNCE, PORTAL, AVS, Khoros, etc. The S4CAD system, developed in the laboratory for the systematic design and analysis of systolic algorithms and structures, is also used. We are studying the problem of parallel program portability and developing the supporting tools. Our theoretical work concerns efficient algorithms for parallel numerical computation, including matrix multiplication, the Fourier transform, and singular value decomposition, as well as fault-tolerant routing on interconnection networks such as hypercubes, star graphs, and tori.
An important question in our research is parallel program transparency and the accessibility of massively parallel computers to large user populations. We are developing a visual language paradigm for the interactive specification of application algorithms. The approach is based on a set of computational schemes (``shapes'' of computation) represented by color figures, pictures, and animation films with sound accompaniment. Each film consists of a series of frames (computational steps) and reflects some knowledge about data processing. Each frame ``brightens up'' a substructure of data for which operations should be specified. As a rule, this substructure is a set of points and/or moving objects in a multi-dimensional space-time. Users embed their algorithms into computational schemes by making these schemes more precise; in effect, a user defines the specification by creating a new film. The corresponding program (sequential or parallel) is generated automatically.
Each computational scheme is itself considered a special-purpose application for which a parallel program is developed in advance as an item of the system library. The specification of a problem is treated as input data for such a parallel program. However, this data may involve a composition of several computational schemes, irregularity of computation, special data sizes, etc. This means that, to achieve efficiency, a single variant of the scheme implementation is not enough. Each computational scheme is therefore supported by at least a few parallel programs from the system library, with different programs corresponding to different conditions defined by the specification. In other words, a computational scheme has a special decision-making space. This space takes into account the level of problem granularity (coarse-grained, medium-grained, and fine-grained), irregularity (small, essential, great), the dynamics of the data decomposition (static, hybrid, and fully dynamic), and other algorithm features. Because each computational scheme reflects certain knowledge about data processing, we select the best possible solutions for implementing this data processing under different conditions, on the basis of the best hand-made algorithms. As a result, not all points in a scheme's decision space are used; only a subset, often just a few, are. For these selected points we are developing scalable parallel programs as items of the system library. It is important to note that for each such program a conventional environment is chosen for the final stage of implementation. Currently, extensions of Fortran and C, as well as AVS, PVM, and Linda, are considered as such environments.
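As a purely illustrative sketch of such a decision space (all program names and table entries below are invented for this example, not taken from the actual system library), the selection of a library program from the conditions defined by a specification can be pictured as a lookup in a sparsely populated table:

```python
# Hypothetical sketch: the decision space of one computational scheme.
# Axes follow the dimensions named in the text.
GRANULARITY = ("coarse", "medium", "fine")
IRREGULARITY = ("small", "essential", "great")
DECOMPOSITION = ("static", "hybrid", "dynamic")

# Only a few points of the full space are populated, each by the best
# hand-made algorithm for those conditions (names are invented).
LIBRARY = {
    ("coarse", "small", "static"): "pvm_block_variant",
    ("fine", "great", "dynamic"): "linda_worker_pool_variant",
}

def select_program(granularity, irregularity, decomposition):
    """Pick the library program matching the specification's conditions,
    falling back to a generic variant when no tuned point exists."""
    assert granularity in GRANULARITY
    assert irregularity in IRREGULARITY
    assert decomposition in DECOMPOSITION
    return LIBRARY.get((granularity, irregularity, decomposition),
                       "generic_variant")
```

The point of the sketch is only that the space is sparse: tuned programs exist at selected points, and other specifications fall back to a generic implementation.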
Refereed Journal Papers
A brief overview of parallel programming languages developed in the former USSR is given. First, some historical remarks and early results are described. Then application-oriented languages are presented, and programming systems supporting a partnership of universality and specialization are considered.
This paper defines the k-tree core of a tree, a natural generalization of the core of a tree. Two efficient algorithms for finding a k-tree core of a given tree are presented, with time complexities O(kn) and O(n log n), respectively, where n is the size of the tree. The algorithms have applications to resource allocation in distributed databases.
A core of a tree T is a path P in T which minimizes the sum of the distances from the nodes in T to path P. An optimal parallel algorithm for finding a core of a tree is given in this paper. The algorithm runs in O(log n) time using O(n/(log n)) processors on the EREW PRAM model.
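To make the definition concrete, here is a brute-force illustration in Python. This is not the O(log n) EREW PRAM algorithm of the paper, only a direct enumeration over all node pairs (each pair determines a unique path in a tree) that computes a core of a small tree straight from the definition:

```python
from collections import deque

def bfs(adj, src):
    """BFS distances and parents from src; in a tree, parents give the
    unique path back to src."""
    dist, parent = {src: 0}, {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                q.append(v)
    return dist, parent

def core(adj):
    """Return (cost, path): a path minimizing the sum, over all nodes v,
    of the distance from v to the path. Brute force over all endpoints."""
    nodes = list(adj)
    dists = {v: bfs(adj, v)[0] for v in nodes}  # all-pairs distances
    best = None
    for a in nodes:
        _, parent = bfs(adj, a)
        for b in nodes:
            path, u = [], b            # unique a-b path via parents
            while u is not None:
                path.append(u)
                u = parent[u]
            cost = sum(min(dists[p][v] for p in path) for v in nodes)
            if best is None or cost < best[0]:
                best = (cost, path)
    return best

# A small "spider": center 0 with three legs 0-1-2, 0-3-4, 0-5-6.
adj = {0: [1, 3, 5], 1: [0, 2], 2: [1], 3: [0, 4], 4: [3], 5: [0, 6], 6: [5]}
cost, path = core(adj)
```

On this spider any path spanning two full legs is a core: the third leg contributes distances 1 + 2 = 3, and no path does better.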
In this paper, efficient parallel algorithms for finding an optimal path-shaped or tree-shaped facility of a specified size in a tree network are presented. The optimization criteria considered are minimizing or maximizing the sum of distances or the eccentricity. The parallel computational model used is the EREW PRAM.
Refereed Proceeding Papers
The subject of the paper is pyramid structures (topologies) for multiprocessors, which show good properties with respect to scalability, adaptability, generality, performance and fault tolerance. The conceptual basis of our approach is the following: tight coupling, restricted neighbourhood, distributed shared memory, regularity, homogeneity, hierarchy and layer specialization. We consider a model of a connection-network, a construction kit for multiprocessors as well as potential efficiency of the pyramid system, features of pyramid links, and functional specialization of pyramid subsystems.
The performance of prospective parallel computers is approaching the teraflops level. An important feature of the architectures of these computers is scalability. However, scalability is usually limited to the number of processors and memory units and does not take into account the word length of operands. The idea behind the SCORE project is the development of a few prototypes of parallel architectures that make use of dynamically changing operand lengths, interval analysis, and symbolic transformations. This kind of architecture is considered a basis for high-accuracy computation through the elimination of round-off errors. In this paper we describe one of the SCORE prototypes, developed within the framework of the vertical processing architecture. The description includes the philosophy of the project and the architecture of the system. We also briefly consider the software environment, as well as some experiments and comparisons.
In this paper, we present parallel algorithms to recognize k-trees and to find a collection of k vertex-disjoint paths (cables) between any two nodes of a k-tree. We give a tree representation of a k-tree and a parallel construction of this representation, and analyze the time and processor complexity of both the recognition and the cable-finding algorithms.
This paper presents optimal algorithms for finding disjoint paths in a star graph, an interesting network for fault-tolerant computation. Our k-pairwise routing algorithm runs in time that improves significantly on the previous result. The lengths of the longest paths constructed by our algorithms also improve over previous results.
Set-to-set routing in the presence of certain node failures is considered, and efficient algorithms for set-to-set routing in hypercubes and star graphs are given. For n-dimensional hypercubes, the algorithm finds disjoint paths of length at most 2n-2; for n-dimensional star graphs, it finds disjoint paths of length at most d(G)+5, where d(G) is the diameter of the graph G.
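For the hypercube case, the flavor of such constructions can be illustrated with the classical single-pair result (a sketch of a standard textbook construction, not the set-to-set algorithm of the paper): between any two nodes of the n-cube there are n internally node-disjoint paths, obtained by correcting the differing bits in cyclically rotated orders and by detouring through each non-differing dimension:

```python
def disjoint_paths(s, t, n):
    """n internally node-disjoint s-t paths in the n-cube Q_n.
    Nodes are n-bit integers; each edge flips exactly one bit."""
    delta = s ^ t
    D = [i for i in range(n) if delta >> i & 1]  # differing dimensions
    paths = []
    # |D| paths: fix the differing bits in cyclically rotated orders;
    # each path's interior nodes form a distinct "cyclic interval" of D.
    for r in range(len(D)):
        node, path = s, [s]
        for i in D[r:] + D[:r]:
            node ^= 1 << i
            path.append(node)
        paths.append(path)
    # n - |D| paths: detour out along an unused dimension j, correct all
    # differing bits, then come back; interiors all carry bit j.
    for j in range(n):
        if delta >> j & 1:
            continue
        node = s ^ (1 << j)
        path = [s, node]
        for i in D:
            node ^= 1 << i
            path.append(node)
        path.append(node ^ (1 << j))  # back to t
        paths.append(path)
    return paths
```

The detour paths have length dist(s, t) + 2, so all n paths stay within two steps of the shortest-path distance.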
The paper shows how a rectangular array of N = N1 x N2 processing elements (PEs), where N1 and N2 are relatively prime, can be used to carry out an efficient two-dimensional systolic implementation of the N-point DFT, offering highly attractive throughput rates relative to other solutions, such as the conventional linear systolic array. The systematic approach allows one to choose, among all possible systolic processors for the 2-D DFT algorithm, an optimal design with the minimum number of locally connected PEs, good coordination between the processes of computation and communication, a small number of I/O pins, the minimum possible processing time, and the minimum amount of input data.
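The role of the coprime factorization N = N1 x N2 can be sketched with the classical prime-factor (Good-Thomas) index mapping on which such two-dimensional DFT decompositions rest. The following illustration (not the paper's systolic design) verifies that, under this mapping, the N-point DFT reduces to independent row and column DFTs over an N1 x N2 grid, with no twiddle factors between the two stages:

```python
import cmath
from math import gcd

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(N))
            for k in range(N)]

def prime_factor_dft(x, N1, N2):
    """N-point DFT via an N1 x N2 two-dimensional DFT (Good-Thomas
    mapping); valid when gcd(N1, N2) == 1."""
    N = N1 * N2
    assert len(x) == N and gcd(N1, N2) == 1
    # Input mapping: n = (N2*n1 + N1*n2) mod N is a bijection onto the grid.
    grid = [[x[(N2 * n1 + N1 * n2) % N] for n2 in range(N2)]
            for n1 in range(N1)]
    # Row DFTs (length N2), then column DFTs (length N1) -- no twiddles.
    rows = [dft(row) for row in grid]
    cols = [dft([rows[n1][k2] for n1 in range(N1)]) for k2 in range(N2)]
    # Output mapping by the Chinese Remainder Theorem: k -> (k mod N1, k mod N2).
    return [cols[k % N2][k % N1] for k in range(N)]
```

Because the two stages are independent, the N1 x N2 grid of length-N1 and length-N2 DFTs maps naturally onto a rectangular array of PEs, which is what makes the coprime factorization attractive for systolic implementation.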
This article presents a systematic approach for the formal synthesis of a set of systolic array processors for matrix algorithms. This systematic approach is the theoretical background of the visual S4CAD software tool, which is also described. Starting from the source algorithm specification in the form of linear recurrence equations, the formal approach and the S4CAD tool allow us to obtain and examine the set of admissible array processors. The tool uses the media technologies of the Windows 3 graphical environment, making it more comfortable for the user to evaluate and choose an optimal solution. A number of basic algorithms from linear algebra and graph theory have been treated with the formal approach and added to the library.
Technical Reports
Doctoral Dissertations Advised
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: N. N. Mirenkov.
Thesis Advisor: S. G. Sedukhin.
Academic Activities
Reviewer for and member of the American Mathematical Society.
Member of Editorial Board of Journal PROGRAMMING, Nauka.
Member of Editorial Board of the Journal SCIENTIFIC PROGRAMMING.
Member of Editorial Board of the new series of books in Parallel and Distributed Computing.
Member of Program Committee and Member of Standing Committee.
Referee of IEEE Trans. Computers.
Referee of IEEE Trans. Parallel and Distributed Systems.
Referee of Journal of Algorithms.
Referee of Journal of Parallel and Distributed Computing.
Member of Editorial Board of the International Journal Parallel Processing Letters.
Member of Program Committee.
Editor-in-Chief of the Computer Science series.