Wednesday, February 3, 2016
[Repost] Several Multicore Programming Problems and Coping Strategies (Problem 1)
Note: This article is reposted in full from drzhouweiming's column (http://blog.csdn.net/drzhouweiming). For a better reading experience, you can view the original here: http://blog.csdn.net/drzhouweiming/archive/2007/04/10/1559698.aspx
Several Multicore Programming Problems and Coping Strategies (Problem 1)
With the birth of multicore CPUs, multicore programming has landed on every programmer's agenda. Many veteran programmers argue that multi-CPU machines have been around for a long time, that the industry has accumulated a great deal of experience programming them, and that programming multicore CPUs will be much the same: it is enough to draw on existing experience in multitasking programming, parallel programming, and parallel algorithms.
What I want to say is that multicore machines are very different from the earlier multi-CPU machines. Earlier multi-CPU machines were used in specific domains, such as servers or large-scale parallel computing, where the advantages of multiple CPUs are easy to exploit. Multicore machines are now being used by ordinary users everywhere. Client machines in particular will use multicore CPUs, and for a lot of client software, parallelizing to exploit multiple cores is probably not as simple as it is on servers or in domains suited to large-scale parallel computing.
When I attended the CSDN conference I chatted with Mr. Meng Yan about multicore programming. Mr. Meng Yan felt very pessimistic about its future, a complete change from the view he held when I saw him last year. I imagine Mr. Meng Yan has a very deep understanding of multicore programming, but because of time constraints I was unable to discuss the subject with him in depth. On the way back I thought again about the difficulties multicore programming faces today, and as soon as I got home I wrote them down and posted them here to share with everyone.
Problem 1: Serialization
1) The speedup factor
When measuring the performance of multiprocessor systems, a commonly used metric called the speedup factor is defined as follows:
S(p) = (execution time on a single processor, using the best sequential algorithm) / (execution time using p processors)
2) Amdahl's Law
Parallel processing is subject to Amdahl's law, which is expressed by the following equation:
S(p) = p / (1 + (p - 1) * f)
where S(p) is the speedup factor,
p is the number of processors, and
f is the fraction of the program's total execution time taken by the serial part.
When f = 5% and p = 20, S(p) is about 10.256.
When f = 5% and p = 100, S(p) is about 16.8.
That is, with a serial part of only 5%, going from 20 processors to 100 raises the speedup factor only from about 10.256 to about 16.8: five times as many processors yield a speed increase of just over 60%. Even with an infinite number of processors, the speedup factor is limited to 20.
If you go by Amdahl's law, multicore has almost no prospects: even if only 1% of a program cannot be parallelized, the maximum speedup is 100, and no number of additional CPUs can improve performance beyond that. By this law, the development of multicore CPUs could keep Moore's Law going for only a few more years before it hits its limit.
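The figures above can be checked with a few lines of Python. This is only a sketch of the formula S(p) = p / (1 + (p - 1) * f), commonly known as Amdahl's law; the function names `amdahl_speedup` and `amdahl_limit` are made up for this illustration and do not come from the original article.

```python
def amdahl_speedup(p: int, f: float) -> float:
    """Speedup S(p) = p / (1 + (p - 1) * f) for p processors and serial fraction f."""
    if p < 1 or not 0.0 <= f <= 1.0:
        raise ValueError("need p >= 1 and 0 <= f <= 1")
    return p / (1 + (p - 1) * f)


def amdahl_limit(f: float) -> float:
    """Upper bound of S(p) as p grows without bound, which is 1 / f."""
    if not 0.0 < f <= 1.0:
        raise ValueError("need 0 < f <= 1")
    return 1.0 / f


if __name__ == "__main__":
    # Reproduce the cases discussed in the text: a 5% serial part.
    for p in (20, 100):
        print(f"f = 5%, p = {p}: S(p) = {amdahl_speedup(p, 0.05):.3f}")
    # Limits for a 5% and a 1% serial part.
    print(f"limit for f = 5%: {amdahl_limit(0.05):.1f}")
    print(f"limit for f = 1%: {amdahl_limit(0.01):.1f}")
```

Dividing numerator and denominator by p shows why the limit is 1 / f: as p grows, S(p) = 1 / (1/p + (1 - 1/p) * f) approaches 1 / f.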
3) Gustafson's Law
Gustafson's law starts from assumptions different from Amdahl's and shows that the speedup factor can exceed the limit set by Amdahl's law. Gustafson assumes that the serial portion of the software is fixed and does not grow as the problem size grows, and that the execution time of the parallel portion is fixed (which may well be the case for server software). Gustafson's law is described by the following formula:
S(p) = p + (1 - p) * fts
where fts is the fraction of execution time that is serial.
If the serial fraction is 5% and there are 20 processors, the speedup factor is 20 + (1 - 20) * 5% = 19.05.
If the serial fraction is 5% and there are 100 processors, the speedup factor is 100 + (1 - 100) * 5% = 95.05.
Under Gustafson's law the speedup factor is almost proportional to the number of processors. If reality matches Gustafson's assumptions, software performance will grow as the number of processors grows.
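The two worked examples can be reproduced with a short Python sketch of the formula S(p) = p + (1 - p) * fts. The function name `gustafson_speedup` is an assumption made for this illustration.

```python
def gustafson_speedup(p: int, fts: float) -> float:
    """Scaled speedup S(p) = p + (1 - p) * fts for p processors and serial fraction fts."""
    if p < 1 or not 0.0 <= fts <= 1.0:
        raise ValueError("need p >= 1 and 0 <= fts <= 1")
    return p + (1 - p) * fts


if __name__ == "__main__":
    # Same 5% serial fraction as in the worked examples.
    for p in (20, 100):
        print(f"fts = 5%, p = {p}: S(p) = {gustafson_speedup(p, 0.05):.2f}")
```

Rewriting the formula as S(p) = fts + p * (1 - fts) makes the near-linear growth visible: each added processor contributes (1 - fts) to the speedup, with no upper bound.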
4) Analysis of serialization in practice
The results given by Amdahl's law and Gustafson's law differ enormously, so which law does reality actually follow? I personally think reality is neither as pessimistic as Amdahl's law nor as optimistic as Gustafson's law. Why do I say so? Let us make a simple analysis.
First we need to determine how much of a program cannot be parallelized, so that we can estimate the proportion of the serial part. Back in the 1960s, Bernstein gave three conditions under which two computations, C1 and C2, cannot run in parallel:
Condition 1: C1 writes a storage location and then C2 reads it. This is called a "read-after-write" conflict.
Condition 2: C1 reads a storage location and then C2 writes it. This is called a "write-after-read" conflict.
Condition 3: C1 writes a storage location and then C2 writes it. This is called a "write-after-write" conflict.
Computations that satisfy any one of these three conditions cannot be executed in parallel. Unfortunately, such situations are very common in real software: they are what we usually call shared data that must be protected by locks.
For serialization caused by lock protection, with a fixed number of tasks the serialized proportion shrinks as the scale of the software grows. Unfortunately, it grows as the number of tasks grows. In other words, the more processors there are, the more severe the serialization caused by lock contention becomes, so the serialized proportion rises sharply with the number of processors. (I will explain how lock contention intensifies serialization in another article.) Serialization is therefore a major challenge facing multicore programming.
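The three conditions above can be expressed as set intersections over what each computation reads and writes. The sketch below is only an illustration: the `Task` class and the helper names `conflicts` and `can_run_in_parallel` are hypothetical, and real programs rarely have their read and write sets spelled out this neatly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    """A computation described by the storage locations it reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(c1: Task, c2: Task) -> List[str]:
    """Return which of the three conditions hold when c1 is ordered before c2."""
    found = []
    if c1.writes & c2.reads:      # condition 1: C1 writes, then C2 reads
        found.append("read-after-write")
    if c1.reads & c2.writes:      # condition 2: C1 reads, then C2 writes
        found.append("write-after-read")
    if c1.writes & c2.writes:     # condition 3: C1 writes, then C2 writes
        found.append("write-after-write")
    return found


def can_run_in_parallel(c1: Task, c2: Task) -> bool:
    """Two tasks are independent only if none of the three conditions holds."""
    return not conflicts(c1, c2)


if __name__ == "__main__":
    c1 = Task("C1", reads=frozenset({"a"}), writes=frozenset({"x"}))
    c2 = Task("C2", reads=frozenset({"x"}), writes=frozenset({"y"}))
    c3 = Task("C3", reads=frozenset({"a"}), writes=frozenset({"z"}))
    print(conflicts(c1, c2))            # C2 reads x, which C1 writes
    print(can_run_in_parallel(c1, c3))  # both only read a, so no conflict
```

Note that two tasks reading the same location do not conflict; only a write on at least one side creates a dependency.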
5) Possible solutions
For the serialization problem, the first solution that comes to mind is to use fewer locks, or even to use lock-free programming. But this is almost impossible for ordinary programmers, because lock-free algorithms are too complicated and are error-prone when used improperly. Many lock-free algorithms published in professional journals were later proved to be wrong, which shows how difficult the work is.
The second solution is to replace locks with atomic operations. Atomic operations do not fundamentally solve the serialization problem; they just make the serialized part run much faster, so that the share of execution time spent in serial execution drops significantly. But the atomic operations that chip manufacturers currently provide are very limited and apply in only a few situations. Chip makers may need to keep working in this area and provide somewhat more powerful atomic operations so that locks can be avoided in more places.
The third solution is to reduce the serialized proportion at the design and algorithm level. This probably means finding practical parallel design patterns that reduce the use of locks. The industry has already accumulated some experience here, such as the task decomposition pattern, the data decomposition pattern, and the data sharing pattern. I believe that as multicore CPUs come into widespread use, more new and efficient parallel design patterns and algorithms will emerge.
The fourth solution lies in chip design. Since I know nothing about chip design, this may be just wishful speculation on my part. The main idea is to design new instructions at the chip level. Unlike the instructions of earlier single-core CPUs, which are carried out by a single CPU, these would be parallel instructions carried out by several CPUs working in parallel. Programmers would call these parallel instructions just as they do when writing serial programs, yet the program would take full advantage of multiple CPUs.
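As one illustration of how a data decomposition pattern can reduce lock use, the Python sketch below sums a list in two ways: first with every update going through one shared lock, then with each worker summing its own chunk and the partial results combined at the end, which needs no lock at all. The function names and the chunking scheme are assumptions made for this example, not taken from the article or the books it cites.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of roughly equal size."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_with_shared_lock(data, workers=4):
    """Every addition takes the same lock, so the updates are fully serialized."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for x in chunk:
            with lock:          # shared data protected by a lock
                total += x

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, split(data, workers)))  # wait for all workers
    return total


def sum_with_data_decomposition(data, workers=4):
    """Each worker owns its chunk; results are merged once, with no lock."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, split(data, workers)))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1000))
    print(sum_with_shared_lock(data), sum_with_data_decomposition(data))
```

The sketch shows the structure of the pattern rather than a speed measurement: in standard CPython builds the global interpreter lock keeps these threads from running Python bytecode in parallel, so an actual speedup would need processes or a language with truly parallel threads.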
About the author: Wei-Ming Zhou is a freelancer who has worked in the software industry for more than ten years. He currently focuses on software testing, multicore programming, software design, and related fundamentals. He is the author of the book "Data Structures and Algorithms under Multitasking", is currently writing "Software Testing Practice", and plans to write a book on multicore programming in the near future.
References:
"Parallel Programming Patterns", Timothy Mattson et al., translated by Ao Fujiang
"A Comprehensive Discussion of Parallel Computing", Jack Dongarra et al. (eds.), translated by Mo Zeyao et al.
"Parallel Programming", Barry Wilkinson et al., translated by Lu Xinda et al.
"Multicore Programming Techniques", Shameem Akhter et al., translated by Li Baofeng
"Parallel Algorithms Practice", Chen Guoliang et al. (eds.)
Reply:
Hello, may I ask a rather naive question that I have never managed to figure out: does a multicore parallel program set up a parallel relationship between the different cores of the CPU, just as with multiple CPUs, where both instructions and data are parallel and every CPU is treated equally? After all, a multicore CPU is two separate computing units placed on one piece of silicon, sharing memory or a common channel. In practical terms, is it the arithmetic units or the control units of the two CPUs that take part in the computation?
Reply:
Mark. I will come back to study this later.
Reply:
Learning! Thinking!
This is a new treasure trove.
Reply:
To su32fn:
When writing multicore programs, you can generally treat a multicore CPU as multiple CPUs. Multicore merely adds a dedicated bus for the interaction between cores (data, instruction exchange, and so on), which is faster than in a multi-CPU system.
Reply:
Today's multicore is only one kind of multicore processor design; there are many multicore processor architectures, such as IBM's Cell and Sun's SPARC. Which design is better has not been settled, and academia is sharply divided as well. There are many unknowns in the future development of parallel algorithms and parallel system architectures.
Reply:
mark
Reply:
Somehow I cannot tell what the problem here really is.
Can't the issue be understood in separate layers?
1. First layer:
The operating system takes care of threads and processes; at most you control which core they run on.
Here you only need to consider the relationship between processes, threads, and cores.
2. Second layer:
Usually a transaction is completed within one thread, right?
If you want to split things up, you only need to consider the relationship between transactions and threads or processes, independent of the cores.
So use locks where locks are needed, use multiple threads and multiple processes, and leave the rest to the operating system.
The people from Intel's multicore event phoned me, but I did not go and join in.
Reply:
If you listen to Meng Yan, then there is no future. I have also read some of his articles. His attitude is "everyone is drunk and I alone am sober."
In this technology, once you master the method, you get twice the result with half the effort.
Multi-CPU and multicore have a lot in common.
I cannot claim to understand it; I have only scratched the surface.