The purpose of this post is to examine the introduction of parallel computing and the challenges of software development for parallel execution environments. First I will introduce the idea of parallel computing, then I will present and evaluate the challenges of parallel computing along with their solutions, and finally some conclusions will be drawn.
Vertical & Horizontal Development in Computing
The question arises when we think about how the complex scientific problems of the twenty-first century, including climate modeling, genomic research, and artificial intelligence, are testing the limits of the von Neumann model of sequential processing.
In the past, computer scientists extended the power of computers in a vertical manner, meaning they worked on producing huge supercomputers. But recent advances in technology, the falling cost of resources, and the arrival of multi-core processors have helped us think about new ways to solve huge and complex problems in a parallel manner.
Introduction to Parallel Computing
Along with a host of new research questions that have arisen in the last decade, a significant challenge remains today. Parallel processing offers the promise of the computational speed required to solve important large-scale problems. At the same time, parallel processing requires a big shift in how we think about solving problems.
Regardless of new hardware technologies, we should think about a new approach to developing software systems, and also about the way we think about our problems and present our solutions (Design and Analysis of Computer Algorithms).
Challenges of Parallel Computing
To apply the power and flexibility of multi-core processors, we should think about a new approach to breaking huge problems down into smaller elements. A good illustration of parallel processing is when a divide-and-conquer model is used to solve a task.
In this approach the problem is successively partitioned into smaller and smaller parts and sent off to other processors, until each one has only a trivial job to perform. Each processor then completes its trivial operation and returns the result to the processor that sent it the task. Those processors in turn do a little work and pass their results back to the processors that gave them the tasks, and so on, all the way back to the originating processor. In this model there is far more communication between processors.
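As a minimal sketch of this idea, the following Python example (my own illustration, not from any particular framework) divides a list into parts, sums each part on a separate worker process, and combines the partial results at the originating process:

```python
from concurrent.futures import ProcessPoolExecutor

def chunked(data, n_parts):
    """Partition the data into roughly equal parts (the 'divide' step)."""
    size = max(1, len(data) // n_parts)
    return [data[i:i + size] for i in range(0, len(data), size)]

def parallel_sum(data, workers=4):
    """Sum each part on a separate process, then combine the results."""
    parts = chunked(data, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, parts))  # 'conquer' in parallel
    return sum(partial_sums)                       # combine at the originator

if __name__ == "__main__":
    print(parallel_sum(list(range(1_000_001))))    # → 500000500000
```

Even in this toy example, the communication cost is visible: every chunk of data must be shipped to a worker and every partial sum shipped back, which is exactly the overhead the divide-and-conquer model trades for parallel speed.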
In the next step, we should think about how to express a program so that it can be executed in a parallel computing environment. Functional programming plays a vital role in this area, since it lets programmers solve their problems in a functional rather than sequential manner. Simple principles of functional programming, such as avoiding mutable state, lambdas, closures, and more importantly the declarative paradigm, help programmers free their minds from concurrency, synchronization, race conditions, and other multi-core computation issues.
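To make this concrete, here is a small hypothetical example of the functional style in Python. Because the mapped function is pure (no shared mutable state, no side effects), the map step can be handed to worker processes without any locks or synchronization:

```python
from functools import reduce
from multiprocessing import Pool

def square(x):
    """A pure function: depends only on its input, mutates nothing."""
    return x * x

def sum_of_squares(xs):
    # A declarative map/reduce pipeline: no loops, no shared counters,
    # so the map step is safe to farm out to separate processes.
    with Pool(4) as pool:
        squares = pool.map(square, list(xs))
    return reduce(lambda a, b: a + b, squares, 0)

if __name__ == "__main__":
    print(sum_of_squares(range(10)))  # → 285
```

Had `square` updated a shared counter instead of returning a value, the parallel version would have needed locks and could have suffered race conditions; purity is what makes the parallelism free.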
Although parallel functional programming helps us represent a program in a declarative manner so that it is suitable for parallel execution, the problem remains unsolved without thinking about how to manage data in a parallel computing environment.
Industrial Revolution of Data – Age of Big Data
We’re now entering a new age of computing, named the “Industrial Revolution of Data”. In fact, the majority of data will be produced automatically by different kinds of machines, such as software logs, video cameras, RFID tags, wireless sensors, and so on.
Due to the considerable decrease in the cost of computer resources, storing this data is cheap, so companies tend to collect it and keep it in huge data warehouses for the future, when it can be mined for valuable information. This is where Big Data comes into play: working with such distributed, huge, and complex data would be impossible, or better to say inefficient, with existing software and database systems.
We should think about other approaches for storing large data sets spread across different computers, and then for effectively mining and executing queries over those sources. Perhaps the biggest game-changer to come along is MapReduce, the parallel programming framework that has gained prominence thanks to its use at web search companies.
Research in parallel computing has had its greatest success and influence in parallel databases. Instead of breaking a large problem into smaller elements executed by different threads simultaneously, parallel databases help us store, query, and retrieve data from distributed sources over a network effectively.
MapReduce as Parallel Programming Framework
The MapReduce algorithm was invented at Google to cope with the Big Data in its search engine system. In fact, MapReduce consists of two simple primitive functions that are available in Lisp and other functional languages. The computation involves two basic operations: a map operation, which executes on input records expressed as key/value pairs, and a reduce operation, which collects and aggregates all the intermediate results from the different nodes.
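The classic introductory example of these two primitives is word counting. The sketch below is a single-machine simulation of the model, not a real distributed framework; the `shuffle` step stands in for the grouping-by-key that a framework such as Hadoop would perform between the two phases:

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (key, value) pair for every word in the input record."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: aggregate all the values emitted for a single key."""
    return key, sum(values)

documents = ["the quick brown fox", "the lazy dog", "the fox"]
intermediate = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(counts["the"])  # → 3
```

Note that both phases are pure functions over key/value pairs, which is precisely what allows a real framework to run many map and reduce tasks on different machines at once.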
Many implementations exist in different programming languages and are used in industry for processing large data sets. In fact, most NoSQL databases use this algorithm to collect data from different sources in distributed, heterogeneous environments.
The biggest advantage of MapReduce is that it allows the map and reduce functions themselves to be distributed. In fact, it allows us to collect and process data stored on different machines simultaneously.
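To hint at that distribution on a single machine, the word count above can be rewritten (again as my own illustrative sketch) so that each document is mapped by a separate worker process, with the reduce step merging the per-worker results:

```python
from collections import Counter
from multiprocessing import Pool

def count_words(document):
    """Map step: each worker counts words in its own document independently."""
    return Counter(document.split())

def merge(counters):
    """Reduce step: combine the per-worker counts into a single result."""
    total = Counter()
    for c in counters:
        total.update(c)
    return total

if __name__ == "__main__":
    documents = ["the quick brown fox", "the lazy dog", "the fox"]
    with Pool(3) as pool:            # one worker process per document
        partial = pool.map(count_words, documents)
    print(merge(partial)["the"])     # → 3
```

In a real deployment the documents would already live on different machines, and the framework would move the computation to the data rather than the data to the computation.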
Conclusion
Parallel computing can help us solve huge, complex problems in a more efficient way. In order to parallelize our tasks, we should think about the different challenges we face in developing software for a parallel execution environment.
However, we should bear in mind that parallel computing is useful when we are facing a big problem that can be distributed among different computing agents. In addition, we should think deeply about the nature of the problem and the time involved, as well as the limits and costs of parallel programming.