Local DNA sequence alignment in a cluster of workstations: Algorithms and tools
Journal of the Brazilian Computer Society, volume 10, pages 81–88 (2004)
Distributed Shared Memory (DSM) systems allow the shared memory programming paradigm to be used in distributed architectures where no physically shared memory exists. Scope-consistent software DSMs provide a relaxed memory model that reduces coherence overhead by ensuring consistency only at synchronization operations, on a per-lock basis. Much of the work on DSM systems is validated by benchmarks, and there are only a few examples of real parallel applications running on DSM systems. Sequence comparison is a basic operation in DNA sequencing projects, and most sequence comparison methods in use are based on heuristics, which are faster but do not produce optimal alignments. Recently, many organisms have had their DNA entirely sequenced, which creates the need to compare long DNA sequences, a challenging task because of its high demands for computational power and memory. In this article, we present and evaluate a parallelization strategy for implementing a sequence alignment algorithm for long sequences. This strategy was implemented in JIAJIA, a scope-consistent software DSM system. Our results on an eight-machine cluster showed good speedups, indicating that our parallelization strategy and programming support were appropriate.
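To make the cost of exact local alignment concrete, the following is a minimal sequential sketch of the standard dynamic programming recurrence for local alignment scoring. It is not the parallel JIAJIA implementation evaluated in the article. The function name `local_alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration. Only two rows of the score matrix are kept at a time, which hints at why memory as well as time becomes a concern for long sequences.

```python
def local_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Return (best_score, (i, j)) for the best local alignment of s and t.

    (i, j) are the 1-based positions in s and t where that alignment ends.
    Scoring values are illustrative assumptions, not taken from the article.
    """
    prev = [0] * (len(t) + 1)          # row i-1 of the score matrix
    best, best_pos = 0, (0, 0)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)      # row i of the score matrix
        for j in range(1, len(t) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_pos = curr[j], (i, j)
        prev = curr
    return best, best_pos


if __name__ == "__main__":
    print(local_alignment_score("TTACGTT", "GGACGGG"))
```

Every cell depends on its left, upper, and upper-left neighbours, so the work grows with the product of the two sequence lengths. That quadratic growth is what makes long DNA sequences expensive to compare exactly.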