Synchronization primitives

OS concept 2005. 2. 28. 20:38

Note that there are no "official" definitions for these terms, so different texts and implementations associate slightly different characteristics with each primitive.

semaphores

Two operations, performed by any thread:

source	decrement (may block)	increment
original Dijkstra	P()	V()
Tanenbaum	down()	up()
POSIX	sem_wait()	sem_post()
Silberschatz	wait()	signal()
operation	`while (s==0) {wait}; s--`	`s++`

Note that signals are saved: a signal that arrives before any thread waits is remembered in the count, unlike a signal sent to a condition variable. Useful for counting resources (initial value > 1), to implement mutexes (initial value 1) or to signal completion of code across threads (initial value 0). Some semaphore implementations allow decrementing by more than one.
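The "signal completion" use (initial value 0) can be sketched with POSIX unnamed semaphores. This is only an illustration: the names done, result, worker and wait_for_worker are made up for the example, and it assumes a platform where sem_init() is supported (e.g., Linux).

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t done;     /* counts completion signals; starts at 0 */
static int result;     /* written by the worker, read after sem_wait() */

static void *worker(void *arg)
{
    (void)arg;
    result = 6 * 7;    /* the work whose completion is signaled */
    sem_post(&done);   /* V(): s++, saved even if nobody waits yet */
    return NULL;
}

/* Starts a worker, waits for its signal, and returns what it computed. */
int wait_for_worker(void)
{
    pthread_t t;
    int value;

    sem_init(&done, 0, 0);                  /* initial value 0 */
    pthread_create(&t, NULL, worker, NULL);
    sem_wait(&done);                        /* P(): blocks while s == 0, then s-- */
    value = result;
    pthread_join(t, NULL);
    sem_destroy(&done);
    return value;
}
```

Because the count is saved, it does not matter whether the worker posts before or after the caller reaches sem_wait().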

mutex

Also known as a lock. Supports lock/unlock operation. In many implementations, the same thread must lock and unlock, but different threads can share the same mutex. POSIX says: "Mutexes have ownership, unlike semaphores. Although any thread, within the scope of a mutex, can get an unlocked mutex and lock access to the same critical section of code, only the thread that locked a mutex can unlock it." Can be implemented via a semaphore.
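A minimal sketch of the lock/unlock pattern with a POSIX mutex, where each thread unlocks only what it locked itself. The names counter_lock, counter, add_rounds and count_with_mutex are assumptions made for this example.

```c
#include <pthread.h>

#define THREADS 4
#define ROUNDS  10000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;   /* shared state protected by counter_lock */

static void *add_rounds(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&counter_lock);    /* the locking thread ... */
        counter++;                            /* critical section */
        pthread_mutex_unlock(&counter_lock);  /* ... is also the one that unlocks */
    }
    return NULL;
}

/* Runs THREADS threads that share one mutex and returns the final count. */
long count_with_mutex(void)
{
    pthread_t t[THREADS];

    counter = 0;
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, add_rounds, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```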

binary semaphore

Semaphore that can take two values, 0 and 1. Can be used as a mutex or to signal completion of code across threads.

locks

Same as mutex.

events (from Tanenbaum, Nutt)

Nutt: similar to condition variables, without the mutex

signals (from Tanenbaum)

Signals are "interrupts" that cause a process to asynchronously either abort or jump to a signal handler designated by the process. Pending system calls are interrupted and return an error indication. Also used in conjunction with semaphores and condition variables.

condition variables

Condition variables allow several threads to share a monitor (region). Condition variables support two operations, signal() and wait(). Some implementations (Java, POSIX) also support a "broadcast" signal (notifyAll() in Java, pthread_cond_broadcast in POSIX). A thread that reaches a wait in a monitor always waits until another thread sends a signal. If several threads are waiting on a condition variable, signal awakens one of them, while the broadcast signal awakens all.

The choice of threads being signaled depends on the implementation. Most implementations make no guarantees.

Condition variables are not counters (i.e., unlike semaphores). Signals do not accumulate. If there is nobody waiting for a signal, the signal is lost.

AND synchronization (Nutt, p. 222)

Obtains all required semaphores or none at all. Can possibly be implemented using a test-and-set (T&S) instruction with a bit mask.
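The "all or none" idea can be approximated with sem_trywait() and back-off. Note that this is not the test-and-set bit-mask implementation mentioned above, and it is not atomic: other threads can briefly observe semaphores that are taken and then given back. The function names are hypothetical.

```c
#include <semaphore.h>
#include <stddef.h>

/* Tries to decrement every semaphore in sems[0..n-1] without blocking.
 * Returns 1 if all were obtained; returns 0 with none held otherwise. */
int acquire_all_or_none(sem_t *sems[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (sem_trywait(sems[i]) != 0) {
            while (i > 0)                  /* back off: return what we took */
                sem_post(sems[--i]);
            return 0;
        }
    }
    return 1;
}

/* Single-threaded walk-through; returns 1 if the behavior is as described. */
int and_sync_demo(void)
{
    sem_t a, b;
    sem_t *both[] = { &a, &b };
    int first, second, value_a;

    sem_init(&a, 0, 1);
    sem_init(&b, 0, 0);                    /* b is unavailable */
    first = acquire_all_or_none(both, 2);  /* fails, a is given back */
    sem_getvalue(&a, &value_a);            /* a is available again */
    sem_post(&b);                          /* b becomes available */
    second = acquire_all_or_none(both, 2); /* succeeds */
    sem_destroy(&a);
    sem_destroy(&b);
    return first == 0 && value_a == 1 && second == 1;
}
```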

monitors

Class or segment of code that can only be executed by one thread at a time. Variables within the monitor can only be accessed by routines in the monitor. Monitors often use condition variables to allow blocking within the monitor.

Test-and-set machine instructions

Atomic instruction that sets the value of a memory location to "true". It returns the value the memory location had before setting it. It can be used to implement semaphores and locks. It does not block.
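As a sketch of how test-and-set can implement a lock, the C11 atomic_flag type offers the same "set and return the old value" operation. The spin_lock/spin_unlock names and the counting demo are assumptions for illustration; a real lock would usually stop spinning and suspend the thread after a while.

```c
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_THREADS 2
#define SPIN_ROUNDS  50000

static atomic_flag busy = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static long total;                           /* protected by busy */

static void spin_lock(void)
{
    /* Test-and-set returns the previous value: true means already held. */
    while (atomic_flag_test_and_set(&busy))
        ;  /* the instruction itself never blocks, so the caller loops */
}

static void spin_unlock(void)
{
    atomic_flag_clear(&busy);
}

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < SPIN_ROUNDS; i++) {
        spin_lock();
        total++;
        spin_unlock();
    }
    return NULL;
}

/* Runs the threads against the spinlock and returns the final total. */
long count_with_spinlock(void)
{
    pthread_t t[SPIN_THREADS];

    total = 0;
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_create(&t[i], NULL, bump, NULL);
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_join(t[i], NULL);
    return total;
}
```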

property	lock	semaphore	condition variable
method that blocks calling thread (ANSI)	`acquire()`, `mutex_lock()`	`sem_wait()`	`pthread_cond_wait()`
method that releases	`mutex_unlock()` (same thread)	`sem_post()` (same or other thread)	`pthread_cond_signal()` (other thread)
method that probes without blocking	`pthread_mutex_trylock()`	`sem_trywait()`	not available
behavior of first thread to reach wait	doesn't block	doesn't block unless semaphore initialized to zero	blocks
data members	List *threads; /* waiting threads */ boolean locked;	List *threads; unsigned int count;	List threads; Lock mutex;

FAQ

Is a lock variable and a mutex the same thing?

Yes. Unfortunately, you'll also find mutex used as the name for a semaphore variable (e.g., Silberschatz, p. 174).

Is an event and a condition variable the same thing, except an event signals all processes waiting on it and a condition variable signals only one process?

Condition variables are used with monitors, i.e., in association with a mutex; events are used by themselves. Both can use "broadcast" signaling, as in notifyAll in Java.

Is a mutex and a conditional variable the same thing except a mutex has a value associated with it and a conditional variable does not?

No, if no other process has locked the mutex, the thread will "pass by" the mutex without waiting. For a condition variable, the thread will always wait until being signaled.

How is thread blocking really implemented?

Textbooks typically show some kind of waiting operation that blocks the thread when it needs to wait for a resource. However, this doesn't really work unless you have some kind of message system and a single thread per processor. Thus, in reality, a thread will put itself on the queue for a synchronization primitive and then suspend itself. When the synchronization variable unlocks (gets signaled, etc.), the calling process removes the first waiting thread from the queue and puts it back on the ready queue for scheduling. With condition variables, the signaling thread may also suspend itself.

What is the typical use of condition variables?

Condition variables are not used to protect anything. The associated mutex does, but that's no different from a lock. A CV is a mechanism to temporarily release the lock until some event occurs that makes it sensible for the waiting thread to continue.

Thus, the thread calling the wait() often looks something like

lock to protect variables;
while (1) {
  wait (until work/message/condition/event arrives, 
    possibly from one of several sources; unlock mutex while waiting);
  get new work, protected by mutex;
}

This is sometimes called a worker thread.

Any of the threads that are sources for work then do

while (1) {
  read(); /* some blocking operation on a channel/file/network socket */
  get lock;
  put work on some queue protected by lock;
  release lock; /* do this *before* signaling - the wait() in the 
    worker thread can't return until the lock is released */
  signal to worker thread;  /* we got work for you! */
}
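The worker and source loops above can be sketched with POSIX threads roughly as follows. This is a simplified illustration under stated assumptions: a counter named pending stands in for the work queue, the blocking read() is left out, and all names (q_lock, q_ready, worker, source, run_worker_demo) are made up. The worker re-checks pending in a loop because signals do not accumulate and POSIX allows spurious wakeups.

```c
#include <pthread.h>

#define ITEMS 5

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_ready = PTHREAD_COND_INITIALIZER;
static int pending;    /* queued work items, protected by q_lock */
static int processed;  /* items handled so far, protected by q_lock */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&q_lock);               /* lock to protect variables */
    while (processed < ITEMS) {
        while (pending == 0)                   /* nothing to do yet */
            pthread_cond_wait(&q_ready, &q_lock);  /* unlocks while waiting */
        pending--;                             /* get new work, under the mutex */
        processed++;
    }
    pthread_mutex_unlock(&q_lock);
    return NULL;
}

static void *source(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&q_lock);           /* get lock */
        pending++;                             /* put work on the queue */
        pthread_mutex_unlock(&q_lock);         /* release lock */
        pthread_cond_signal(&q_ready);         /* we got work for you! */
    }
    return NULL;
}

/* Runs one source and one worker; returns the number of items processed. */
int run_worker_demo(void)
{
    pthread_t w, s;

    pending = processed = 0;
    pthread_create(&w, NULL, worker, NULL);
    pthread_create(&s, NULL, source, NULL);
    pthread_join(s, NULL);
    pthread_join(w, NULL);
    return processed;
}
```

If the source signals before the worker starts waiting, the signal is lost, but the worker still sees pending > 0 and does not wait, so no work is missed.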

R4000 synchronization primitives

Posted by '김용환'

Becoming A Real Programmer

paper and essay 2005. 2. 28. 20:34

(Reposted from) http://users.actcom.co.il/~choo/lupg/essays/becoming-a-real-programmer.html

v1.0.0

Becoming A Real Programmer (thinking about thinking...)

So you want to become a programmer: an independent programmer, who does not depend on specific people to hold their hand, and who can design a project and lead its development. A real programmer.

Assuming that's what you want to become, and assuming you're not there yet - how do you go about doing that?

How long will it take?

Quite often, it will take you several years to become a real programmer. If you happen to have around you good programmers who are also good teachers, the time could shrink considerably. But nothing will make this time shrink more than thinking about what you do, trying to develop a plan for your self-advancement, and criticizing your progress, your plan, and your changes to the plan.

How critical is self-criticism?

If you do something, and you think you did well - you limit your ability to do it better next time - because you don't see what wasn't good enough.

If you leave self-criticism to certain occasions (such as after fully completing a given project), you miss the opportunity to learn during this project. Learning means seeing something that wasn't good enough, thinking how to do it better, trying the new way, checking if it improved something, and continuing that until you feel this isn't your worst problem any more. The longer you wait between cycles, the longer the overall process takes to converge.

Of course, changing things all the time won't work either - because some methods of operation only justify themselves if used for a long period of time.

Assuming that initially even usage of simple methods of operation will make us work better, we can start with using rather quick cycles, and as we make progress and master the simple methods, we can increase the time between cycles, in order to assess more complicated methods of operation.

This is too theoretical - how about some examples?

Let us take the following example - we were told that documenting code is useful. How useful is it? How much and where to document? We need to write 'according to the book' in our first project. Leave the code for one week, and then come back and try to add a feature. Did we need to read something several times in order to remember what's going on? Then let's add a comment there. Some comments were not clear? Let's try to make them clear - perhaps write more? Perhaps write less? Perhaps improve our English grammar?

After we have played with documentation for a few months, we want to get to work on a larger document - something that explains the general design of a software module, for example. This takes longer to write, and it is hard to assess its quality without fully writing it, leaving that software project for a month, and then getting back to it. Our cycles slowed, because we already learned the simple things (documentation of single functions), and the more complicated methods take longer to complete, and require waiting longer in order to assess their usefulness (you don't forget the design of a module after 1 week, but you easily forget the internals of a specific function after a few days).

Letting ideas "sink"

The method of leaving something and getting back to it later is based on two principles:

We need to wait a while in order to be able to look at something from a new perspective - we might have got locked into one manner of thinking, and simply leaving the material for a while often helps us to unlock - because we forgot what it was that kept us locked in.
When we need to deal with something new, we often don't fully understand it. If we leave it for a while and then come back to it, we suddenly see things we didn't see before - understand little nuances we overlooked when we were busy grasping the major principles. It could also be that we worked on a different project meanwhile, and when we got back to the original project, we saw how something we learned during the second project could have been applied to the first project. We correlate.

Broadening the scope of lessons learned

When we find a mistake we made, we can simply find a fix for it and use this fix next time, or we can instead try to check if there is a more general class of mistakes, to which this specific mistake belongs. In the latter case, we can find a solution to a whole class of mistakes, and thus not have to learn from different mistakes in the same class - we will avoid making those other mistakes in the first place.

For example, suppose that we made an error of forgetting to allocate the extra null character when allocating a string in a "C" program. We could just fix this bug and check the program again. We could think "oh, we didn't think about it", and go over all string allocations in our program to check they don't contain the same bug. We might decide that memory allocation is error-prone in general, and instead of doing the allocation and calculation of required space all over the code - we should write a function that allocates strings. We can broaden this to other types of allocations - and write functions that allocate objects of other types. If we had this problem with allocation, perhaps we have a similar problem with object initialization - let's write functions that initialize objects, and use them, instead of writing the same initialization code all over the place. We start to see that even very small code re-use reduces the number of bugs in our code. We move from "ah, this initialization only requires 2 lines of code - why write it in a function?", to "repeating these 2 lines of code again and again will mean that if we make a mistake in 1 out of 4 locations in the code - then with 20 such code locations, we will have 5 bugs - all because we were too lazy to write a 2-line function".
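A function that allocates strings might look like the sketch below. The helper names str_alloc and str_copy are hypothetical, chosen only to illustrate keeping the "+ 1" for the null character in a single place.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: room for len characters plus the terminating '\0'. */
char *str_alloc(size_t len)
{
    char *s = malloc(len + 1);   /* the + 1 lives in exactly one place */
    if (s != NULL)
        s[0] = '\0';             /* always hand back a valid empty string */
    return s;
}

/* Hypothetical helper: returns a newly allocated copy of src, or NULL. */
char *str_copy(const char *src)
{
    size_t len = strlen(src);
    char *s = str_alloc(len);
    if (s != NULL)
        memcpy(s, src, len + 1); /* copies the terminator too */
    return s;
}
```

Callers then write str_copy(name) instead of repeating the size calculation, and release the result with free().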

See how we got from a specific memory-allocation error, to learning something about avoiding code repetition, which we would have dismissed as "a waste of time" if we were directly asked to do this in the first place.

Short-cuts don't let lessons sink-in

We can argue that an experienced programmer could have shown us all of this immediately, and with enough patience, shown us why something that looks trivial can save us a lot of time in the total development process. However, using such a short-cut prevents the lesson from sinking in. We will still be tempted to choose the quicker path over the slower path that makes the overall journey shorter. We have to make our own mistakes in order to learn. Or we have to see how our friends made mistakes and learn from them...

Learning from other people's mistakes

When we learn of a mistake that someone else made, it is easy to think "this won't happen to me, I'm smarter" and dismiss it. But if we already invested time in seeing this mistake, why not stop and remind ourselves that we are only human, and that in certain situations we might make a similar (stupid) mistake? After we admit this to ourselves, we can go about learning from this mistake, as if it was our own mistake.

One possible conclusion that follows is that helping other programmers fix their own bugs can help us learn, and thus is a useful activity - not just an annoying burden.

The better programmer you become - the more humble you get

Pride is a major obstacle to learning. "I am smart, I won't make this mistake again" is an error that will lead us to make a very similar mistake soon after. "I write good code, so I don't need to check it" will mean our simple local bugs will be hidden, and get exposed later when our module is used by other modules, which make it behave differently than our feeble tests did. By then, the bugs will be harder to track down, because the test-case is more complicated. Thus, we should reverse the cause and the result - "I check my code, and thus I write good code".

Pride will also make us think we can learn everything on our own. Given infinite time this might be true - but life is short. We want to learn faster. Let us, then, learn how to listen to other people's criticism, and pick out the useful parts of their criticism, even if we think the majority of this criticism is wrong or pointless. For example, if someone keeps telling us that we did not test our code properly, while we spent a lot of time on testing, instead of dismissing the criticism - let us listen to it. Several examples that we will be given will be wrong - we did test those parts of the code properly. But once in a while some claim for lack of testing will be correct - aha, we indeed forgot to check that input option. Now that we see it, we can start our generalization process - how do we find all possible inputs and test all of them? If there are too many possible inputs - how do we choose selected test-cases that will help identify most of the probable bugs? How do we choose test-cases in general, not just for testing inputs - what about testing timing problems?

Being lazy the smart way

"I will find the bugs when I debug the code" is the claim of the sloppy lazy programmer, who refuses to read their own code again after they wrote it. The smart lazy programmer will claim "if I put more time into re-reading my code now when it is fresh, I will probably spend less time over-all on writing and testing this code module - if I consider the time I spend on it now, as well as during system integration and system testing later on".

The smart ultra-lazy programmer will think even further than this - "if I spend time now on learning lessons, in the long run I will be able to finish programming tasks in a shorter duration, and the time saved in the future will be much more than the extra time I spend now on learning my lessons".

Conclusions

In order to become real programmers, we need to think of what we plan to do. We need to think of what we just did, and of what we did a short while ago, of what we did a long while ago... Then we need to learn lessons. And then we need to broaden the lessons to more generalized classes of mistakes.

We also need to listen to other people - they might see now something that we will only see next year - even if they are not smarter than us. They are simply not locked in by the things locking us in, that we did not identify yet. Or they had the chance to learn a lesson due to a mistake they made in the past, that we didn't come across yet. Perhaps they come from a completely different background, and have completely different methodologies of thinking, that make some bug easier to spot. So we humble ourselves, and listen.

And we must avoid letting our short-term laziness hurt our much-more-useful long-term laziness.

Posted by '김용환'

Why Do Universities Fail Teaching Software Engineering?

paper and essay 2005. 2. 28. 20:33

(Reposted from) http://users.actcom.co.il/~choo/lupg/essays/software-engineering-and-uni.html

v1.0.2

Abstract

Many universities and other institutions try to teach software engineering concepts to their students, either as part of their regular programming courses, in courses designed to teach just software engineering, or as a set of such courses.

In this short essay, I'll supply a loose definition of "Software Engineering", show why it is so hard (or impossible) to teach it to inexperienced programmers, and try to illustrate a few methods that may still be used to put some sense of engineering into inexperienced programmers.

What Is Software Engineering?

Software engineering is a methodological process of writing software, in a manner that can be repeated for various software projects with a high level of accuracy, and that produces good software, under given time and budget constraints.

Let's explore this definition a bit. The methodology means that the software creation process is broken down into several steps, each of which is self-contained, and can be described easily, without using "magical" terms, as is common in the software industry ("I didn't know where to start, so I just hacked something together, and somehow it worked").

For example, one may break software creation into steps such as "requirements specification", "architectural design", "interface design", "detailed design", "implementation", "integration" and "testing". Each of these steps can be properly defined (e.g. "requirements specification is a step in which we define the set of functionalities our software will support, without defining yet how it will support them, or what any kind of user-interface would look like").

Good software would imply that the software does what it is supposed to do, that it has few bugs, that it can be easily maintained, and that it is relatively cheap to extend and add new features without introducing many new bugs.

The software doing what it is supposed to do might sound trivial, but in many cases the original requirements get massively cut down because the software fails to perform various operations properly. Easy maintenance would mean that it would be easy for a new programmer to learn how the software works and how to fix bugs in it, and that fixing bugs does not require rewriting complete parts of the software. Easy extension would mean, for example, that the software can be easily ported to new platforms, can easily have its user interface replaced (text UI to graphical UI, or web UI) without rewriting the rest of the system, and so on.

Finally, every software development project has some limits on the amount of time until it has to be ready, and the amount of money that can be spent on the software. The engineering process should make it possible to supply a rather accurate forecast for the time and effort it will take to develop different modules of the software, using different designs. The time limits often come at the expense of the future maintainability and extensibility of the software, and these issues need to be weighed against each other, and weighted appropriately, depending on the software requirements.

Software Engineering And Students

From the above description, one can see that software engineering introduces a lot of overhead to software development, and that this overhead might only be worthwhile for large software development. Using such a process for small class exercises of a few hundred lines of code is overkill and makes students despise or even ridicule the idea of software engineering.

Another problem with software engineering is that it can't be practiced without understanding that small decisions have big implications for large programs. For example, suppose that a method of a communications class returns a pointer to an object it has allocated. Who should be in charge of freeing that memory? The caller of the method, or the object that initially allocated the memory? What happens if in one place it's easier to have the caller free the memory, and in another place it's easier to make the allocating object free the memory? An inexperienced programmer would most likely do the easier thing in each case, claiming that the simplicity of the code is important, and would overlook the importance of consistency. For a small program, it won't do much harm - since the program is written in a short time, the programmer can remember who should free the memory in each case. Even if they don't, and have a memory leak, they are not going to run this program long enough to notice - once it works, it gets dumped.

The same little decision would have a much bigger impact on a large piece of software that is supposed to last. Large software tends to have hundreds or thousands of classes, and not each one of them is used very often. If you have consistent rules (e.g. "the object allocating memory is always in charge of freeing it"), it will be easy to remember when to free allocated memory, and you will thus have fewer memory leaks and memory corruptions to chase.
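As an illustration of the rule "the object allocating memory is always in charge of freeing it", here is a minimal C sketch. The struct channel type and its functions are hypothetical stand-ins for the communications class in the example; returning a const pointer is one way to remind callers that they must not free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical communications object that owns every buffer it hands out. */
struct channel {
    char *last_message;   /* owned by the channel; callers never free it */
};

struct channel *channel_create(void)
{
    return calloc(1, sizeof(struct channel));
}

/* Stores a copy of incoming and returns it. The pointer stays valid
 * until the next channel_receive() or channel_destroy() call. */
const char *channel_receive(struct channel *ch, const char *incoming)
{
    size_t len = strlen(incoming);
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, incoming, len + 1);
    free(ch->last_message);   /* the allocator frees its own memory */
    ch->last_message = copy;
    return ch->last_message;
}

void channel_destroy(struct channel *ch)
{
    if (ch != NULL) {
        free(ch->last_message);
        free(ch);
    }
}
```

With this convention a caller only ever pairs channel_create() with channel_destroy(), no matter how many messages it received in between.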

More On Students

As we saw, software engineering requires large projects to make sense. It also requires experience to make sense. In particular, bad experience - if you write software in the right way, you don't get to see how wrong badly written software can get, and thus don't learn to appreciate the right ways. Thus, part of learning software engineering is achieved by seeing how lack of it can hurt large software projects. For students "a large project" might be something they have written in one semester, in which they also studied a few other courses.
Given software made of a few thousand lines of code, or a few tens of classes, inexperienced programmers consider it to be a large project. Later on, when they get out to work in the industry, they will begin to realize those were in fact rather small projects. In such small projects, lack of proper engineering won't usually reveal the problems it causes, since those projects only need to work once or twice (when demonstrating them to the teacher) and don't get developed further afterwards.

Another aspect of software engineering is designing the software to be easy to maintain. Since most student projects don't get tested by end users, many of their bugs remain hidden, and thus the programmer isn't exposed to the amount of time it would take to debug and fix the software until it reaches the level of usable release software - something which often would take more than the amount of time it took to originally write the code and make it run. The original developer of the code tends not to stress-test their code in its weak spots, since they subconsciously know it'll cause the software to crash or malfunction - something which will require more work, that isn't really appreciated when the project is graded. Thus, the programmer cannot see how their software design affected the ability to isolate bugs in the code, and gets the false impression that their design generated bug-free software.

Another aspect of learning about software engineering is seeing how it affects the life cycle of a software project. Sometimes, experienced software engineers make decisions that look like small nit-picking to the naked eye. Only after a while, as the project evolves in different directions, do these original decisions begin to make sense to an inexperienced programmer. Some people will claim that this ability to see the need for those small decisions in advance comes with experience - and yet, this is what software engineering strives for - the ability to make these decisions using a methodological process, that generates repeated success.

Finally, there are students who understand the rules of thumb, and believe that they might be useful, but still prefer not using them for their exercises on the grounds that understanding is enough, and when they get involved in large projects, they will know how to deal with them. The problem is that by the time they get to working on large projects, they might have gathered bad habits and find it hard to free themselves of these bad habits. They also overlook the little things that can only be achieved with real practice, and thus will stay behind fellow students, who have spent a few years of their undergraduate studies actually applying these rules, and gained some experience, and thus more insight.

So the conclusion we come to is that you need good familiarity with, and hands-on involvement in, large and lasting software projects, both successes and failures, in order to grasp the essence of software engineering, and appreciate it. And in university environments (or various programming courses, for that matter) the ability to participate in such activities is rather limited.

More On Universities

What do universities do in order to try and teach software engineering? One thing they do is try to teach it as a set of rules of thumb, hoping that students will follow them, even if only because they are being enforced somehow when grading exercises and exams. The problem with this approach is that often the people who do the actual grading are graduate students, who themselves haven't had the chance to grasp the concept of software engineering (especially if they entered graduate school directly after finishing undergraduate school).

Even if some of the teachers, or teaching assistants, do have experience with large and lasting software projects, often their students don't know about that, and hence don't trust them. When you don't trust someone, you don't listen to advice they give you if you cannot see an immediate benefit (and in software engineering, there is no benefit for very small exercises). Thus, the students actually tend to ignore their teachers' rules of thumb, seeing them as a burden, and this causes more damage than if those rules were never given in the first place.

At other times, there might be a good software engineering teacher, who indeed has experience in the field, and tries to show some real life examples to their students. These efforts, however, might be lost if the students don't get their own hands-on experience with such software. Learning software engineering on a theoretical basis, and using small code examples (since there is no time to really delve into large code examples) makes no sense, except for people who already understand software engineering in the first place - kind of a chicken-and-egg problem.

How To Make It Work?

After seeing what is needed to make students appreciate software engineering, we might as well spell out a few things that will make teaching it possible. First, we need to have accredited teachers. These teachers may be people with past or current experience in the industry, who can use it to back their claims to students. They could also be people who participated in large academic projects, that got enough credit for being large and lasting. A good example would be the work done at MIT on Project Athena (see also A review of - MIT project Athena: a model for distributed campus computing). Another good example is the work done at Washington University in St. Louis by the Distributed Object Computing (DOC) group. There exist various other such examples. The important factor is that they are large projects, involve quite a few staff members (including graduate students), and last for quite a few years, and thus carry a scope similar to that of large industrial projects.

It is also important that the students know that their teachers have that experience. This is not to be used as a method of bragging, but rather to assure the students that their teacher is not just talking about theoretical software engineering methods; that the teacher has actually applied them, and can show them good, real-life examples of why these methods are useful, and should be practiced even for smaller projects.

Carrying out large projects by university staff members is also good as it allows graduate students to participate in such projects, and thus be more credible to serve as teaching assistants in software engineering related courses. With good project management, it is also possible to allow undergraduate students to take part in such projects, and witness, from first hand, the complexity of such a project. When they have to delve into code created by other programmers, possibly code that is 2-3 years old, they will learn to appreciate how hard it is to get into code parts that weren't properly engineered, and how relatively easy it is to get into parts that were properly engineered. And having specific parts of the code that are badly engineered on purpose, would serve the teaching goal quite well.

Of course, getting students involved in large software projects should be done gradually. At first, they may be introduced to small modules, learn them, and be guided in making small changes to them. It is a very useful quality to be able to delve into existing source bases, and inexperienced programmers often find it hard to do. At later phases, these students will be able to write new classes or modules. Getting credit for such projects will be more desirable than letting these students design their own software in a software project course, which will turn out to be small (they don't have enough time to work on it, and usually only 2-3 of them work on the code of such a project) and will most likely not last, nor be a part of a lasting project.

Financing Academic Software Projects

One of the major problems with carrying out large software projects in universities is financing them. You need more equipment, more system management staff, and more researchers than for theoretical research.

The equipment needed is often not so hard to get as a donation from large hardware manufacturers and resellers. They already have such donation relationships with universities, sometimes donating full labs for students to work on, in the hope that these students will get used to their development environments, and endorse them when they get some influence in their future workplace.

Another option is carrying out software research projects that are useful for large industrial companies. By showing these companies how this research can help them, it is possible to convince them to sponsor such projects. Financing a project in which graduate students and undergraduate students perform large parts of the work would be cheaper than financing it inside the industry, which might make it look appealing to industrial companies. The TAO project carried out by the DOC group at Washington University and the University of California is a good example of such a relationship.

Another major problem is attracting good software engineers who both wish to carry out research work in their field, and have the skills to manage large projects at the same time. The success of such a project in fact often relies on one or more such enthusiastic leaders, who carry a good reputation in the academic field as well as in the industry. It would be easier to attract such people to academia if they know they will get a supportive environment, and financing for projects that seem feasible and are personally appealing to them. Sometimes, it does not require paying them better than in the industry. The fact that they get more freedom, without the pressure of marketing personnel, would be enough to attract a few of them to move to the academic world.


This document is copyright (c) 2001-2002 by guy keren.

The material in this document is provided AS IS, without any expressed or implied warranty, or claim of fitness for a particular purpose. Neither the author nor any contributers shell be liable for any damages incured directly or indirectly by using the material contained in this document.

permission to copy this document (electronically or on paper, for personal or organization internal use) or publish it on-line is hereby granted, provided that the document is copied as-is, this copyright notice is preserved, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

Permission to make translations of this document is also granted, under these terms - assuming the translation preserves the meaning of the text, the copyright notice is preserved as-is, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

For any questions about the document and its license, please contact the author.


Posted by '김용환'

Multi-Threaded Programming With POSIX Threads

OS concept 2005. 2. 28. 20:31


v1.2


Before We Start...

This tutorial is an attempt to help you become familiar with multi-threaded programming with the POSIX threads (pthreads) library, and attempts to show how its features can be used in "real-life" programs. It explains the different tools defined by the library, shows how to use them, and then gives an example of using them to solve programming problems. There is an implicit assumption that the user has some theoretical familiarity with parallel programming (or multi-processing) concepts. Users without such background might find the concepts harder to grasp. A separate tutorial will be prepared to explain the theoretical background and terms to those who are familiar only with normal "serial" programming.

I would assume that users who are familiar with asynchronous programming models, such as those used in windowing environments (X, Motif), will find it easier to grasp the concepts of multi-threaded programming.

When talking about POSIX threads, one cannot avoid the question "Which draft of the POSIX threads standard shall be used?". As this threads standard has been revised over a period of several years, one will find that implementations adhering to different drafts of the standard have different sets of functions, different default values, and different nuances. Since this tutorial was written using a Linux system with the kernel-level LinuxThreads library, v0.5, programmers with access to other systems, using different versions of pthreads, should refer to their system's manuals in case of incompatibilities. Also, since some of the example programs use blocking system calls, they won't work with user-level threading libraries (refer to our parallel programming theory tutorial for more information).
Having said that, I'll try to check the example programs on other systems as well (Solaris 2.5 comes to mind), to make them more "cross-platform".

What Is A Thread? Why Use Threads?

A thread is a semi-process that has its own stack, and executes a given piece of code. Unlike a real process, the thread normally shares its memory with other threads (whereas for processes we usually have a different memory area for each one of them). A thread group is a set of threads all executing inside the same process. They all share the same memory, and thus can access the same global variables, same heap memory, same set of file descriptors, etc. All these threads execute in parallel (i.e. using time slices, or if the system has several processors, then really in parallel).

The advantage of using a thread group instead of a normal serial program is that several operations may be carried out in parallel, and thus events can be handled immediately as they arrive (for example, if we have one thread handling a user interface, and another thread handling database queries, we can execute a heavy query requested by the user, and still respond to user input while the query is executed).

The advantage of using a thread group over using a process group is that context switching between threads is much faster than context switching between processes (context switching means that the system switches from running one thread or process, to running another thread or process). Also, communication between two threads is usually faster and easier to implement than communication between two processes.

On the other hand, because threads in a group all use the same memory space, if one of them corrupts the contents of its memory, other threads might suffer as well. With processes, the operating system normally protects processes from one another, and thus if one corrupts its own memory space, other processes won't suffer. Another advantage of using processes is that they can run on different machines, while all the threads have to run on the same machine (at least normally).

Creating And Destroying Threads

When a multi-threaded program starts executing, it has one thread running, which executes the main() function of the program. This is already a full-fledged thread, with its own thread ID. In order to create a new thread, the program should use the pthread_create() function. Here is how to use it:


#include <stdio.h>       /* standard I/O routines                 */
#include <pthread.h>     /* pthread functions and data structures */

/* function to be executed by the new thread */
void*
do_loop(void* data)
{
    int i;                      /* counter, to print numbers */
    int j;                      /* counter, for delay        */
    int me = *((int*)data);     /* thread identifying number */

    for (i=0; i<10; i++) {
        for (j=0; j<500000; j++) /* delay loop */
            ;
        printf("'%d' - Got '%d'\n", me, i);
    }

    /* terminate the thread */
    pthread_exit(NULL);
}

/* like any C program, program's execution begins in main */
int
main(int argc, char* argv[])
{
    int        thr_id;         /* pthread_create() result: 0 on success, else an error code */
    pthread_t  p_thread;       /* thread's structure                     */
    int        a         = 1;  /* thread 1 identifying number            */
    int        b         = 2;  /* thread 2 identifying number            */

    /* create a new thread that will execute 'do_loop()' */
    thr_id = pthread_create(&p_thread, NULL, do_loop, (void*)&a);
    /* run 'do_loop()' in the main thread as well */
    do_loop((void*)&b);
    
    /* NOT REACHED */
    return 0;
}

A few notes should be mentioned about this program:

Note that the main program is also a thread, so it executes the do_loop() function in parallel to the thread it creates.
pthread_create() gets 4 parameters. The first parameter is used by pthread_create() to supply the program with information about the thread. The second parameter is used to set some attributes for the new thread. In our case we supplied a NULL pointer to tell pthread_create() to use the default values. The third parameter is the name of the function that the thread will start executing. The fourth parameter is an argument to pass to this function. Note the cast to a 'void*'. It is not required by ANSI-C syntax, but is placed here for clarification.
The delay loop inside the function is used only to demonstrate that the threads are executing in parallel. Use a larger delay value if your CPU runs too fast, and you see all the printouts of one thread before the other.
The call to pthread_exit() causes the current thread to exit and free any thread-specific resources it is taking. There is no need to use this call at the end of the thread's top function, since when it returns, the thread would exit automatically anyway. This function is useful if we want to exit a thread in the middle of its execution.
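
The notes above can be sketched as a small variation of the program. This is only an illustration under a few assumptions: the names double_value() and run_in_thread() are made up for this sketch, and pthread_join() (which waits for a thread to finish) is used here even though it has not been introduced yet. The thread function simply returns instead of calling pthread_exit(), and the return value of pthread_create() is checked.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* thread function: doubles the int it is given, then simply returns */
static void *double_value(void *data)
{
    int *value = (int *)data;
    *value *= 2;
    return NULL;    /* returning ends the thread; no pthread_exit() needed */
}

/* hypothetical helper: runs double_value() in a new thread and waits for it. */
/* returns 0 on success, or the pthreads error code on failure.               */
int run_in_thread(int *value)
{
    pthread_t thread;
    int rc = pthread_create(&thread, NULL, double_value, value);

    if (rc != 0) {
        /* pthread functions return the error code instead of setting errno */
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        return rc;
    }
    /* wait until the thread has finished before looking at '*value' */
    return pthread_join(thread, NULL);
}
```

Calling run_in_thread(&v) with v set to 21 should leave 42 in v once the helper returns 0.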

In order to compile a multi-threaded program using gcc, we need to link it with the pthreads library. Assuming you have this library already installed on your system, here is how to compile our first program:

gcc pthread_create.c -o pthread_create -lpthread

Note that for some of the programs later in this tutorial, one may need to add a '-D_GNU_SOURCE' flag to this compile line, to get the source compiled.

The source code for this program may be found in the pthread_create.c file.

Synchronizing Threads With Mutexes

One of the basic problems when running several threads that use the same memory space, is making sure they don't "step on each other's toes". By this we refer to the problem of using a data structure from two different threads.

For instance, consider the case where two threads try to update two variables. One tries to set both to 0, and the other tries to set both to 1. If both threads try to do that at the same time, we might end up in a situation where one variable contains 1, and the other contains 0. This is because a context switch (we already know what this is by now, right?) might occur after the first thread zeroed out the first variable. The second thread would then set both variables to 1, and when the first thread resumes operation, it will zero out the second variable, leaving the first variable set to '1' and the second set to '0'.

What Is A Mutex?

A basic mechanism supplied by the pthreads library to solve this problem, is called a mutex. A mutex is a lock that guarantees three things:

Atomicity - Locking a mutex is an atomic operation, meaning that the operating system (or threads library) assures you that if you locked a mutex, no other thread succeeded in locking this mutex at the same time.
Singularity - If a thread managed to lock a mutex, it is assured that no other thread will be able to lock the mutex until the original thread releases the lock.
Non-Busy Wait - If a thread attempts to lock a mutex that was locked by a second thread, the first thread will be suspended (and will not consume any CPU resources) until the lock is freed by the second thread. At this time, the first thread will wake up and continue execution, having the mutex locked by it.

From these three points we can see how a mutex can be used to assure exclusive access to variables (or in general critical code sections). Here is some pseudo-code that updates the two variables we were talking about in the previous section, and can be used by the first thread:

lock mutex 'X1'.
set first variable to '0'.
set second variable to '0'.
unlock mutex 'X1'.

Meanwhile, the second thread will do something like this:

lock mutex 'X1'.
set first variable to '1'.
set second variable to '1'.
unlock mutex 'X1'.

Assuming both threads use the same mutex, we are assured that after they both ran through this code, either both variables are set to '0', or both are set to '1'. Note that this requires some work from the programmer - if a third thread were to access these variables via some code that does not use this mutex, it still might mess up the variables' contents. Thus, it is important to enclose all the code that accesses these variables in a small set of functions, and always use only these functions to access these variables.
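
As a rough C version of the pseudo-code above, the following sketch keeps the two variables behind a small set of functions that all use the same mutex. The names pair_mutex, set_pair(), pair_is_consistent() and run_setters() are assumptions made for this illustration, and pthread_join() is used only to wait for the threads to finish.

```c
#include <pthread.h>

static pthread_mutex_t pair_mutex = PTHREAD_MUTEX_INITIALIZER;
static int first_var;
static int second_var;

/* set both variables to 'value' as one indivisible step */
void set_pair(int value)
{
    pthread_mutex_lock(&pair_mutex);
    first_var = value;
    second_var = value;
    pthread_mutex_unlock(&pair_mutex);
}

/* returns 1 if both variables hold the same value, 0 otherwise */
int pair_is_consistent(void)
{
    int same;

    pthread_mutex_lock(&pair_mutex);
    same = (first_var == second_var);
    pthread_mutex_unlock(&pair_mutex);
    return same;
}

/* thread body: repeatedly sets both variables to the int passed in 'data' */
static void *setter(void *data)
{
    int value = *(int *)data;
    int i;

    for (i = 0; i < 100000; i++)
        set_pair(value);
    return NULL;
}

/* runs two competing setter threads; returns 1 if the pair ends up consistent */
int run_setters(void)
{
    pthread_t t0, t1;
    int zero = 0;
    int one = 1;

    if (pthread_create(&t0, NULL, setter, &zero) != 0)
        return 0;
    if (pthread_create(&t1, NULL, setter, &one) != 0) {
        pthread_join(t0, NULL);
        return 0;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return pair_is_consistent();
}
```

Because every access goes through set_pair() or pair_is_consistent(), no code path can touch the variables without holding the mutex.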

Creating And Initializing A Mutex

In order to create a mutex, we first need to declare a variable of type pthread_mutex_t, and then initialize it. The simplest way is by assigning it the PTHREAD_MUTEX_INITIALIZER constant. So we'll use code that looks something like this:


pthread_mutex_t a_mutex = PTHREAD_MUTEX_INITIALIZER;

One note should be made here: This type of initialization creates a mutex called 'fast mutex'. This means that if a thread locks the mutex and then tries to lock it again, it'll get stuck - it will be in a deadlock.

There is another type of mutex, called 'recursive mutex', which allows the thread that locked it to lock it several more times, without getting blocked (but other threads that try to lock the mutex now will get blocked). If the thread then unlocks the mutex, it'll still be locked, until it is unlocked the same number of times as it was locked. This is similar to the way modern door locks work - if you turned it twice clockwise to lock it, you need to turn it twice counter-clockwise to unlock it. This kind of mutex can be created by assigning the constant PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP to a mutex variable. The '_NP' suffix marks this constant as non-portable: it is an extension that not every pthreads implementation provides.
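
On systems that follow later versions of the POSIX standard, a recursive mutex can also be requested through a mutex attributes object, using pthread_mutexattr_settype() with PTHREAD_MUTEX_RECURSIVE. The sketch below assumes such a system; the helper names init_recursive_mutex() and lock_twice() are made up for this example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask the headers for PTHREAD_MUTEX_RECURSIVE */
#endif
#include <pthread.h>

/* initializes 'mutex' as a recursive mutex; returns 0 or an error code */
int init_recursive_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    /* the attributes object is no longer needed once the mutex exists */
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* locks the mutex twice from the same thread, then unlocks it twice. */
/* returns 0 if every call succeeded, or the first error code.        */
int lock_twice(pthread_mutex_t *mutex)
{
    int rc = pthread_mutex_lock(mutex);

    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(mutex);     /* would block forever on a 'fast' mutex */
    if (rc != 0) {
        pthread_mutex_unlock(mutex);
        return rc;
    }
    pthread_mutex_unlock(mutex);        /* still locked once */
    return pthread_mutex_unlock(mutex); /* now really unlocked */
}
```

After init_recursive_mutex() succeeds, lock_twice() should return 0 instead of hanging on the second lock.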

Locking And Unlocking A Mutex

In order to lock a mutex, we may use the function pthread_mutex_lock(). This function attempts to lock the mutex, or blocks the thread if the mutex is already locked by another thread. In this case, when the mutex is unlocked by the other thread, the function will return with the mutex locked by our thread. Here is how to lock a mutex (assuming it was initialized earlier):


int rc = pthread_mutex_lock(&a_mutex);
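if (rc) { /* an error has occurred; 'rc' holds the error code */
    /* pthread functions do not set errno, so use strerror() from <string.h> */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));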
    pthread_exit(NULL);
}
/* mutex is now locked - do your stuff. */
.
.

After the thread did what it had to (change variables or data structures, handle file, or whatever it intended to do), it should free the mutex, using the pthread_mutex_unlock() function, like this:


rc = pthread_mutex_unlock(&a_mutex);
if (rc) { /* 'rc' holds the error code */
    fprintf(stderr, "pthread_mutex_unlock: %s\n", strerror(rc));
    pthread_exit(NULL);
}

Destroying A Mutex

After we finish using a mutex, we should destroy it. "Finished using" means no thread needs it at all. If only one thread has finished with the mutex, it should leave it alive, for the other threads that might still need to use it. Once all threads have finished using it, the last one can destroy it using the pthread_mutex_destroy() function:


rc = pthread_mutex_destroy(&a_mutex);

After this call, this variable (a_mutex) may not be used as a mutex any more, unless it is initialized again. Thus, if one destroys a mutex too early, and another thread tries to lock or unlock it, that thread may get an EINVAL error code from the lock or unlock function (POSIX leaves the behavior undefined, so do not rely on getting an error).

Using A Mutex - A Complete Example

After we have seen the full life cycle of a mutex, let's see an example program that uses a mutex. The program introduces two employees competing for the "employee of the day" title, and the glory that comes with it. To simulate that at a rapid pace, the program employs 3 threads: one that promotes Danny to "employee of the day", one that promotes Moshe to that position, and a third thread that makes sure that the employee of the day's contents is consistent (i.e. contains exactly the data of one employee).
Two copies of the program are supplied. One that uses a mutex, and one that does not. Try them both, to see the differences, and be convinced that mutexes are essential in a multi-threaded environment.

The programs themselves are in the files accompanying this tutorial. The one that uses a mutex is employee-with-mutex.c. The one that does not use a mutex is employee-without-mutex.c. Read the comments inside the source files to get a better understanding of how they work.

Starvation And Deadlock Situations

Again we should remember that pthread_mutex_lock() might block for an indefinite duration, in case the mutex is already locked. If it remains locked forever, it is said that our poor thread is "starved" - it was trying to acquire a resource, but never got it. It is up to the programmer to ensure that such starvation won't occur. The pthread library does not help us with that.

The pthread library might, however, figure out some kinds of "deadlock". A deadlock is a situation in which a set of threads are all waiting for resources taken by other threads, all in the same set. Naturally, if all threads are blocked waiting for a mutex, none of them will ever come back to life again. Detection is limited, though: the case POSIX defines is a thread trying to re-lock an error-checking mutex (type PTHREAD_MUTEX_ERRORCHECK) that it already holds, in which case pthread_mutex_lock() fails with an error of type EDEADLK. Do not count on the library to detect a cycle of several threads waiting for each other's mutexes. The programmer should check for such a value where it can occur, and take steps to solve the deadlock somehow.
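
A minimal sketch of the EDEADLK case, assuming a system that supports the PTHREAD_MUTEX_ERRORCHECK mutex type. The helper names init_errorcheck_mutex() and relock_result() are invented for this illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask the headers for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <errno.h>
#include <pthread.h>

/* initializes 'mutex' as an error-checking mutex; returns 0 or an error code */
int init_errorcheck_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* locks the mutex, then tries to lock it again from the same thread. */
/* returns the result of the second attempt (EDEADLK is expected).    */
int relock_result(pthread_mutex_t *mutex)
{
    int rc = pthread_mutex_lock(mutex);

    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(mutex);   /* a 'fast' mutex would hang here */
    pthread_mutex_unlock(mutex);      /* release the first, successful lock */
    return rc;
}
```

With an error-checking mutex the second lock attempt returns EDEADLK right away, so the caller can report the bug instead of hanging.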

Refined Synchronization - Condition Variables

As we've seen before with mutexes, they allow for simple coordination - exclusive access to a resource. However, we often need to be able to make real synchronization between threads:

In a server, one thread reads requests from clients, and dispatches them to several threads for handling. These threads need to be notified when there is data to process, otherwise they should wait without consuming CPU time.
In a GUI (Graphical User Interface) Application, one thread reads user input, another handles graphical output, and a third thread sends requests to a server and handles its replies. The server-handling thread needs to be able to notify the graphics-drawing thread when a reply from the server arrived, so it will immediately show it to the user. The user-input thread needs to be always responsive to the user, for example, to allow her to cancel long operations currently executed by the server-handling thread.

All these examples require the ability to send notifications between threads. This is where condition variables are brought into the picture.

What Is A Condition Variable?

A condition variable is a mechanism that allows threads to wait (without wasting CPU cycles) for some event to occur. Several threads may wait on a condition variable, until some other thread signals this condition variable (thus sending a notification). At this time, one of the threads waiting on this condition variable wakes up, and can act on the event. It is possible to also wake up all threads waiting on this condition variable by using a broadcast method on this variable.

Note that a condition variable does not provide locking. Thus, a mutex is used along with the condition variable, to provide the necessary locking when accessing this condition variable.

Creating And Initializing A Condition Variable

Creation of a condition variable requires defining a variable of type pthread_cond_t, and initializing it properly. Initialization may be done with either a simple use of a macro named PTHREAD_COND_INITIALIZER or the usage of the pthread_cond_init() function. We will show the first form here:

pthread_cond_t got_request = PTHREAD_COND_INITIALIZER;

This defines a condition variable named 'got_request', and initializes it.

Note: since the PTHREAD_COND_INITIALIZER is actually a structure initializer, it may be used to initialize a condition variable only when it is declared. In order to initialize it during runtime, one must use the pthread_cond_init() function.
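
For the runtime case, here is a sketch of pthread_cond_init() used on a condition variable that lives inside a heap-allocated structure. The structure request_queue and the functions queue_create() and queue_destroy() are hypothetical names chosen for this example; passing NULL as the second argument asks for default attributes.

```c
#include <pthread.h>
#include <stdlib.h>

/* hypothetical structure bundling a condition variable with its mutex */
struct request_queue {
    pthread_mutex_t mutex;
    pthread_cond_t  got_request;
    int             num_requests;
};

/* allocates and initializes a queue; returns NULL on failure */
struct request_queue *queue_create(void)
{
    struct request_queue *q = malloc(sizeof *q);

    if (q == NULL)
        return NULL;
    if (pthread_mutex_init(&q->mutex, NULL) != 0) {
        free(q);
        return NULL;
    }
    /* the static initializer cannot be used here, so call the init function */
    if (pthread_cond_init(&q->got_request, NULL) != 0) {
        pthread_mutex_destroy(&q->mutex);
        free(q);
        return NULL;
    }
    q->num_requests = 0;
    return q;
}

/* releases a queue; no thread may be waiting on it any more */
void queue_destroy(struct request_queue *q)
{
    pthread_cond_destroy(&q->got_request);
    pthread_mutex_destroy(&q->mutex);
    free(q);
}
```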

Signaling A Condition Variable

In order to signal a condition variable, one should use either the pthread_cond_signal() function (to wake up only one thread waiting on this variable), or the pthread_cond_broadcast() function (to wake up all threads waiting on this variable). Here is an example using signal, assuming 'got_request' is a properly initialized condition variable:

int rc = pthread_cond_signal(&got_request);

Or by using the broadcast function:

int rc = pthread_cond_broadcast(&got_request);

When either function returns, 'rc' is set to 0 on success, and to a non-zero value on failure. In case of failure, the return value denotes the error that occurred (EINVAL denotes that the given parameter is not a condition variable; ENOMEM denotes that the system has run out of memory).

Note: success of a signaling operation does not mean any thread was awakened - it might be that no thread was waiting on the condition variable, and thus the signaling does nothing (i.e. the signal is lost).
It is also not remembered for future use - if after the signaling function returns another thread starts waiting on this condition variable, a further signal is required to wake it up.

Waiting On A Condition Variable

If one thread signals the condition variable, other threads would probably want to wait for this signal. They may do so using one of two functions, pthread_cond_wait() or pthread_cond_timedwait(). Each of these functions takes a condition variable, and a mutex (which should be locked before calling the wait function), unlocks the mutex, and waits until the condition variable is signaled, suspending the thread's execution. If this signaling causes the thread to awake (see discussion of pthread_cond_signal() earlier), the mutex is automagically locked again by the wait function, and the wait function returns.

The only difference between these two functions is that pthread_cond_timedwait() allows the programmer to specify a timeout for the waiting, after which the function always returns, with a proper error value (ETIMEDOUT) to notify that the condition variable was NOT signaled before the timeout passed. The pthread_cond_wait() function would wait indefinitely if the variable was never signaled.

Here is how to use these two functions. We make the assumption that 'got_request' is a properly initialized condition variable, and that 'request_mutex' is a properly initialized mutex. First, we try the pthread_cond_wait() function:


/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred; 'rc' holds the error code */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked - wait on the condition variable.             */
/* During the execution of pthread_cond_wait, the mutex is unlocked. */
rc = pthread_cond_wait(&got_request, &request_mutex);
if (rc == 0) { /* we were awakened due to the cond. variable being signaled */
               /* The mutex is now locked again by pthread_cond_wait()      */
    /* do your stuff... */
    .
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);

Now an example using the pthread_cond_timedwait() function:


#include <sys/time.h>     /* struct timeval and gettimeofday()   */
#include <errno.h>        /* ETIMEDOUT, EINTR                    */
#include <stdio.h>        /* fprintf()                           */
#include <string.h>       /* strerror()                          */

struct timeval  now;            /* time when we started waiting        */
struct timespec timeout;        /* timeout value for the wait function */
int             done;           /* are we done waiting?                */

/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred; 'rc' holds the error code */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked */

/* get current time */
gettimeofday(&now, NULL);
/* prepare timeout value.              */
/* Note that we need an absolute time. */
timeout.tv_sec = now.tv_sec + 5;
timeout.tv_nsec = now.tv_usec * 1000; /* timeval uses micro-seconds.         */
                                      /* timespec uses nano-seconds.         */
                                      /* 1 micro-second = 1000 nano-seconds. */

/* wait on the condition variable. */
/* we use a loop, since a Unix signal might stop the wait before the timeout */
done = 0;
while (!done) {
    /* remember that pthread_cond_timedwait() unlocks the mutex on entrance */
    rc = pthread_cond_timedwait(&got_request, &request_mutex, &timeout);
    switch(rc) {
        case 0:  /* we were awakened due to the cond. variable being signaled */
                 /* the mutex was now locked again by pthread_cond_timedwait. */
            /* do your stuff here... */
            .
            .
            done = 1;
            break;
        case ETIMEDOUT: /* our time is up (the error is returned in 'rc') */
            done = 1;
            break;
        default:        /* some other error occurred */
            if (rc != EINTR) /* older libraries return EINTR on a Unix signal */
                done = 1;    /* anything else is a real error - stop waiting  */
            break;      /* on EINTR, break this switch and re-do the loop.    */
    }
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);

As you can see, the timed wait version is far more complex, and thus is better wrapped up in some function, rather than being re-coded in every place it is needed.
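
One possible shape for such a wrapper is sketched below. The function name cond_wait_seconds() is an assumption made for this example; it takes a relative timeout in seconds, converts it to the absolute time that pthread_cond_timedwait() expects, and retries only if the wait was interrupted (EINTR, as returned by some older libraries).

```c
#include <errno.h>
#include <pthread.h>
#include <sys/time.h>
#include <time.h>

/* Waits on 'cond' for at most 'seconds' seconds. 'mutex' must already be */
/* locked by the caller, and is locked again when the function returns.   */
/* Returns 0 if awakened, ETIMEDOUT if the time ran out, or another       */
/* error code.                                                            */
int cond_wait_seconds(pthread_cond_t *cond, pthread_mutex_t *mutex, int seconds)
{
    struct timeval  now;
    struct timespec timeout;
    int rc;

    /* pthread_cond_timedwait() needs an absolute time, not a duration */
    if (gettimeofday(&now, NULL) != 0)
        return errno;
    timeout.tv_sec  = now.tv_sec + seconds;
    timeout.tv_nsec = now.tv_usec * 1000;   /* micro-seconds to nano-seconds */

    do {
        rc = pthread_cond_timedwait(cond, mutex, &timeout);
    } while (rc == EINTR);                  /* interrupted: wait again */

    return rc;
}
```

A caller locks the mutex, calls cond_wait_seconds(&got_request, &request_mutex, 5), checks whether the result is 0 or ETIMEDOUT, and then unlocks the mutex.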

Note: it might be that a condition variable that has 2 or more threads waiting on it is signaled many times, and yet one of the threads waiting on it is never awakened. This is because we are not guaranteed which of the waiting threads is awakened when the variable is signaled. It might be that the awakened thread quickly comes back to waiting on the condition variable, and gets awakened again when the variable is signaled again, and so on. The situation for the un-awakened thread is called 'starvation'. It is up to the programmer to make sure this situation does not occur if it implies bad behavior. Yet, in our server example from before, this situation might indicate requests are coming in at a very slow pace, and thus perhaps we have too many threads waiting to service requests. In this case, this situation is actually good, as it means every request is handled immediately when it arrives.

Note 2: when the condition variable is being broadcast (using pthread_cond_broadcast), this does not mean all threads are running together. Each of them tries to lock the mutex again before returning from its wait function, and thus they'll start running one by one, each one locking the mutex, doing its work, and freeing the mutex before the next thread gets its chance to run.
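
The one-by-one behavior after a broadcast can be sketched with a simple "gate" that several threads wait for. All names here (gate_mutex, gate_open, gate_is_open, waiter(), open_gate_for_all()) are assumptions made for this illustration, and the flag gate_is_open is the kind of real condition discussed later in this tutorial.

```c
#include <pthread.h>

#define NUM_WAITERS 4   /* arbitrary number of waiting threads for this sketch */

static pthread_mutex_t gate_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_open  = PTHREAD_COND_INITIALIZER;
static int gate_is_open = 0;    /* the condition the threads wait for     */
static int passed = 0;          /* how many threads got through the gate  */

/* thread body: waits for the gate to open, then counts itself */
static void *waiter(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&gate_mutex);
    while (!gate_is_open)
        pthread_cond_wait(&gate_open, &gate_mutex);
    /* an awakened thread holds the mutex here, so they pass one at a time */
    passed++;
    pthread_mutex_unlock(&gate_mutex);
    return NULL;
}

/* starts the waiters, opens the gate with a single broadcast, */
/* and returns how many threads passed through it.            */
int open_gate_for_all(void)
{
    pthread_t threads[NUM_WAITERS];
    int created;
    int i;

    for (created = 0; created < NUM_WAITERS; created++)
        if (pthread_create(&threads[created], NULL, waiter, NULL) != 0)
            break;

    pthread_mutex_lock(&gate_mutex);
    gate_is_open = 1;
    pthread_cond_broadcast(&gate_open);   /* wake up every waiting thread */
    pthread_mutex_unlock(&gate_mutex);

    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return passed;
}
```

A single broadcast is enough for all the waiters, yet the increment of 'passed' is never executed by two threads at once, since each thread re-acquires the mutex before leaving its wait.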

Destroying A Condition Variable

After we are done using a condition variable, we should destroy it, to free any system resources it might be using. This can be done using the pthread_cond_destroy() function. In order for this to work, there should be no threads waiting on this condition variable. Here is how to use this function, again, assuming 'got_request' is a pre-initialized condition variable:


int rc = pthread_cond_destroy(&got_request);
if (rc == EBUSY) { /* some thread is still waiting on this condition variable */
    /* handle this case here... */
    .
    .
}

What if some thread is still waiting on this variable? Depending on the case, it might imply some flaw in the usage of this variable, or just a lack of proper thread cleanup code. It is probably good to alert the programmer, at least during the debug phase of the program, of such a case. It might mean nothing, but it might be significant.

A Real Condition For A Condition Variable

A note should be taken about condition variables - they are usually pointless without some real condition checking combined with them. To make this clear, let's consider the server example we introduced earlier. Assume that we use the 'got_request' condition variable to signal that a new request has arrived that needs handling, and is held in some requests queue. If we had threads waiting on the condition variable when this variable is signaled, we are assured that one of these threads will awake and handle this request.

However, what if all threads are busy handling previous requests when a new one arrives? The signaling of the condition variable will do nothing (since all threads are busy doing other things, NOT waiting on the condition variable now), and after all threads finish handling their current request, they come back to wait on the variable, which won't necessarily be signaled again (for example, if no new requests arrive). Thus, there is at least one request pending, while all handling threads are blocked, waiting for a signal.

In order to overcome this problem, we may set some integer variable to denote the number of pending requests, and have each thread check the value of this variable before waiting on the condition variable. If this variable's value is positive, some request is pending, and the thread should go and handle it, instead of going to sleep. Furthermore, a thread that handles a request should reduce the value of this variable by one, to keep the count correct.
Let's see how this affects the waiting code we have seen above.


/* number of pending requests, initially none */
int num_requests = 0;
.
.
/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred; 'rc' holds the error code */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked - wait on the condition variable */
/* if there are no requests to be handled.              */
rc = 0;
if (num_requests == 0)
    rc = pthread_cond_wait(&got_request, &request_mutex);
if (num_requests > 0 && rc == 0) { /* we have a request pending */
    /* decrease count of pending requests while the mutex is   */
    /* still locked, so no other thread claims the same one.   */
    num_requests--;
    /* unlock mutex - so other threads would be able to handle */
    /* other requests waiting in the queue in parallel.        */
    rc = pthread_mutex_unlock(&request_mutex);
    /* do your stuff... */
    .
    .
    /* and lock the mutex again - to remain symmetrical. */
    rc = pthread_mutex_lock(&request_mutex);
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);
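
A common way to write the same check is to re-test the condition in a 'while' loop around pthread_cond_wait(), since POSIX allows a wait to return even though nobody signaled (a "spurious wakeup"). The sketch below is one self-contained way to do that; the functions add_request(), take_request() and run_demo(), and the counter 'handled', are names invented for this illustration and are not taken from the tutorial's source files.

```c
#include <pthread.h>

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;
static int num_requests = 0;    /* number of pending requests, initially none */
static int handled = 0;         /* number of requests handled so far          */

/* called by the receiver: records one new request and notifies a handler */
void add_request(void)
{
    pthread_mutex_lock(&request_mutex);
    num_requests++;
    pthread_cond_signal(&got_request);
    pthread_mutex_unlock(&request_mutex);
}

/* called by a handler: blocks until a request is pending, then claims it */
void take_request(void)
{
    pthread_mutex_lock(&request_mutex);
    while (num_requests == 0)   /* re-check the condition after every wakeup */
        pthread_cond_wait(&got_request, &request_mutex);
    num_requests--;             /* claim the request while holding the mutex */
    pthread_mutex_unlock(&request_mutex);
}

/* handler thread body: handles the number of requests given in 'data' */
static void *handler(void *data)
{
    int count = *(int *)data;
    int i;

    for (i = 0; i < count; i++) {
        take_request();
        /* the real work would be done here, with the mutex unlocked */
        pthread_mutex_lock(&request_mutex);
        handled++;
        pthread_mutex_unlock(&request_mutex);
    }
    return NULL;
}

/* runs one handler thread against 'count' requests; returns how many */
/* requests were handled, or -1 if the thread could not be created.   */
int run_demo(int count)
{
    pthread_t thread;
    int i;

    if (pthread_create(&thread, NULL, handler, &count) != 0)
        return -1;
    for (i = 0; i < count; i++)
        add_request();
    pthread_join(thread, NULL);
    return handled;
}
```

Even if every request is added before the handler thread starts waiting, none is lost: the handler sees a positive num_requests and skips the wait.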

Using A Condition Variable - A Complete Example

As an example of the actual usage of condition variables, we will show a program that simulates the server we have described earlier - one thread, the receiver, gets client requests. It inserts the requests into a linked list, and a horde of threads, the handlers, handle these requests. For simplicity, in our simulation, the receiver thread creates requests and does not read them from real clients.

The program source is available in the file thread-pool-server.c, and contains many comments. Please read the source file first, and then read the following clarifying notes.

The 'main' function first launches the handler threads, and then performs the chores of the receiver thread, via its main loop.
A single mutex is used both to protect the condition variable, and to protect the linked list of waiting requests. This simplifies the design. As an exercise, you may think how to divide these roles into two mutexes.
The mutex itself MUST be a recursive mutex. In order to see why, look at the code of the 'handle_requests_loop' function. You will notice that it first locks the mutex, and afterwards calls the 'get_request' function, which locks the mutex again. If we used a non-recursive mutex, we'd get locked indefinitely in the mutex locking operation of the 'get_request' function.
You may argue that we could remove the mutex locking in the 'get_request' function, and thus remove the double-locking problem, but this is a flawed design - in a larger program, we might call the 'get_request' function from other places in the code, and we'll need to check for proper locking of the mutex in each of them.
As a rule, when using recursive mutexes, we should try to make sure that each lock operation is accompanied by a matching unlock operation in the same function. Otherwise, it will be very hard to make sure that after locking the mutex several times, it is being unlocked the same number of times, and deadlocks would occur.
The implicit unlocking and re-locking of the mutex on the call to the pthread_cond_wait() function is confusing at first. It is best to add a comment regarding this behavior in the code, or else someone that reads this code might accidentally add a further mutex lock.
When a handler thread handles a request - it should free the mutex, to avoid blocking all the other handler threads. After it finished handling the request, it should lock the mutex again, and check if there are more requests to handle.

"Private" thread data - Thread-Specific Data

In "normal", single-thread programs, we sometimes find the need to use a global variable. Ok, so good old teach' told us it is bad practice to have global variables, but they sometimes do come in handy. Especially if they are static variables - meaning, they are recognized only in the scope of a single file.

In multi-threaded programs, we also might find a need for such variables. We should note, however, that the same variable is accessible from all the threads, so we need to protect access to it using a mutex, which is extra overhead. Furthermore, we sometimes need to have a variable that is 'global', but only for a specific thread. Or the same 'global' variable should have different values in different threads. For example, consider a program that needs to have one globally accessible linked list in each thread, but not the same list. Further, we want the same code to be executed by all threads. In this case, the global pointer to the start of the list should point to a different address in each thread.

In order to have such a pointer, we need a mechanism that enables the same global variable to have a different location in memory. This is what the thread-specific data mechanism is used for.

Overview Of Thread-Specific Data Support

In the thread-specific data (TSD) mechanism, we have notions of keys and values. Each key has a name, and a pointer to some memory area. Keys with the same name in two separate threads always point to different memory locations - this is handled by the library functions that allocate memory blocks to be accessed via these keys. We have a function to create a key (invoked once per key name for the whole process), a function to allocate memory (invoked separately in each thread), a function to de-allocate this memory for a specific thread, and a function to destroy the key, again, process-wide. We also have functions to access the data pointed to by a key, either setting its value, or returning the value it points to.

Allocating Thread-Specific Data Block

The pthread_key_create() function is used to allocate a new key. This key now becomes valid for all threads in our process. When a key is created, the value it points to defaults to NULL. Later on each thread may change its copy of the value as it wishes. Here is how to use this function:


/* rc is used to contain return values of pthread functions */
int rc;
/* define a variable to hold the key, once created.         */
pthread_key_t list_key;
/* cleanup_list is a function that can clean up some data   */
/* it is specific to our program, not to TSD                */
extern void cleanup_list(void*);

/* create the key, supplying a function that'll be invoked  */
/* on each thread's value for the key when the thread exits */
rc = pthread_key_create(&list_key, cleanup_list);

Some notes:

After pthread_key_create() returns, the variable 'list_key' holds the newly created key.
The function pointer passed as second parameter to pthread_key_create(), will be automatically invoked by the pthread library when a thread exits, with that thread's value for the key as its parameter (provided this value is not NULL). We may supply a NULL pointer as the function pointer, and then no function will be invoked for the key. Note that the function will be invoked once in each thread that set a value for the key, even though we created this key only once, in one thread.
If we created several keys, their associated destructor functions will be called in an arbitrary order, regardless of the order of keys creation.
If the pthread_key_create() function succeeds, it returns 0. Otherwise, it returns some error code.
There is a limit of PTHREAD_KEYS_MAX keys that may exist in our process at any given time. An attempt to create a key when PTHREAD_KEYS_MAX keys already exist, will cause a return value of EAGAIN from the pthread_key_create() function.
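To illustrate the note about the destructor being invoked once in each thread, here is a minimal sketch. The names 'counter_key', 'free_counter' and 'count_destructor_calls' are hypothetical, and the sketch assumes at most 16 threads; the key is created once, each worker thread stores its own block of memory under it, and the destructor counts how many times it was run.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_WORKERS 16               /* arbitrary limit for this sketch  */

static pthread_key_t counter_key;    /* hypothetical key name            */
static pthread_mutex_t stats_mutex = PTHREAD_MUTEX_INITIALIZER;
static int destructor_calls = 0;

/* destructor: gets the exiting thread's value for the key. */
static void
free_counter(void* value)
{
    free(value);
    pthread_mutex_lock(&stats_mutex);
    destructor_calls++;
    pthread_mutex_unlock(&stats_mutex);
}

/* each worker stores its own block of memory under the same key. */
static void*
worker(void* arg)
{
    int* counter = (int*)malloc(sizeof(int));

    (void)arg;
    if (counter != NULL) {
        *counter = 0;
        if (pthread_setspecific(counter_key, counter) != 0)
            free(counter);
    }
    return NULL;   /* on exit, the destructor runs for a non-NULL value */
}

/* create the key once, run 'num_threads' workers, and return the  */
/* number of times the destructor was invoked (-1 on error).       */
int
count_destructor_calls(int num_threads)
{
    pthread_t threads[MAX_WORKERS];
    int i;
    int created = 0;

    if (num_threads < 0 || num_threads > MAX_WORKERS)
        return -1;
    destructor_calls = 0;
    if (pthread_key_create(&counter_key, free_counter) != 0)
        return -1;
    for (i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    pthread_key_delete(counter_key);
    return (created == num_threads) ? destructor_calls : -1;
}
```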

Accessing Thread-Specific Data

After we have created a key, we may access its value using two pthread functions: pthread_getspecific() and pthread_setspecific(). The first is used to get the value of a given key, and the second is used to set the value of a given key. A key's value is simply a void pointer (void*), so we can store in it anything that we want. Let's see how to use these functions. We assume that 'a_key' is a properly initialized variable of type pthread_key_t that contains a previously created key:


/* this variable will be used to store return codes of pthread functions */
int rc;

/* define a variable into which we'll store some data */
/* for example, an integer.                           */
int* p_num = (int*)malloc(sizeof(int));
if (!p_num) {
    fprintf(stderr, "malloc: out of memory\n");
    exit(1);
}
/* initialize our variable to some value */
(*p_num) = 4;

/* now lets store this value in our TSD key.            */
/* note that the key stores the pointer held in 'p_num' */
/* (the address of the allocated integer), not a copy   */
/* of the integer itself.                               */
rc = pthread_setspecific(a_key, (void*)p_num);

.
.
/* and somewhere later in our code... */
.
.
/* get the value of key 'a_key' and print it. */
{
    int* p_keyval = (int*)pthread_getspecific(a_key);

    if (p_keyval != NULL) {
        printf("value of 'a_key' is: %d\n", *p_keyval);
    }
}

Note that if we set the value of the key in one thread, and try to get it in another thread, we will get a NULL, since this value is distinct for each thread.

Note also that there are two cases where pthread_getspecific() might return NULL:

The key supplied as a parameter is invalid (e.g. the key was never created).
The value of this key is NULL. This means it either wasn't initialized, or was set to NULL explicitly by a previous call to pthread_setspecific().
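The per-thread nature of a key's value can be checked with a small sketch like the following. The names 'name_key', 'peek_value' and 'other_thread_sees_null' are hypothetical. The calling thread sets a value, and a second thread then looks up the same key.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t name_key;   /* hypothetical key name */

/* runs in a second thread: reports whether this thread  */
/* sees NULL as the value of name_key.                   */
static void*
peek_value(void* arg)
{
    int* saw_null = (int*)arg;

    *saw_null = (pthread_getspecific(name_key) == NULL);
    return NULL;
}

/* returns 1 if the second thread saw NULL while the calling  */
/* thread still sees its own value, 0 if not, -1 on error.    */
int
other_thread_sees_null(void)
{
    static int my_value = 4;
    pthread_t thr;
    int saw_null = 0;
    int ok;

    if (pthread_key_create(&name_key, NULL) != 0)
        return -1;
    if (pthread_setspecific(name_key, &my_value) != 0 ||
        pthread_create(&thr, NULL, peek_value, &saw_null) != 0) {
        pthread_key_delete(name_key);
        return -1;
    }
    pthread_join(thr, NULL);

    /* our own value is untouched by whatever the other thread did. */
    ok = saw_null && (pthread_getspecific(name_key) == &my_value);
    pthread_key_delete(name_key);
    return ok;
}
```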

Deleting Thread-Specific Data Block

The pthread_key_delete() function may be used to delete keys. But do not be confused by this function's name: it does not delete memory associated with this key, nor does it call the destructor function defined during the key's creation. Thus, you still need to do memory cleanup on your own if you need to free this memory during runtime. However, given the way global variables (and thus also thread-specific data) are normally used, you usually don't need to free this memory until the thread terminates, in which case the pthread library will invoke your destructor functions anyway.

Using this function is simple. Assuming list_key is a pthread_key_t variable pointing to a properly created key, use this function like this:

int rc = pthread_key_delete(list_key);

The function will return 0 on success, or EINVAL if the supplied variable does not point to a valid TSD key.
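Since pthread_key_delete() neither frees the memory nor calls the destructor, a thread that wants to delete a key at runtime has to clean up its own value first. A minimal sketch of this, with the hypothetical names 'count_and_free' and 'delete_key_manually', might look as follows. It returns the number of destructor invocations seen by the time the key is gone.

```c
#include <pthread.h>
#include <stdlib.h>

static int destructor_runs = 0;   /* only touched by the calling thread here */

/* destructor registered with the key. it is NOT invoked  */
/* by pthread_key_delete(), only when a thread exits.     */
static void
count_and_free(void* value)
{
    destructor_runs++;
    free(value);
}

/* create a key, store a block in it, then delete the key,  */
/* cleaning up by hand. returns the number of destructor    */
/* invocations observed (-1 on error).                      */
int
delete_key_manually(void)
{
    pthread_key_t key;
    void* block;

    destructor_runs = 0;
    if (pthread_key_create(&key, count_and_free) != 0)
        return -1;
    block = malloc(64);
    if (block == NULL || pthread_setspecific(key, block) != 0) {
        free(block);
        pthread_key_delete(key);
        return -1;
    }

    /* pthread_key_delete() will not free the block for us, */
    /* so free it and reset the value before deleting.      */
    free(pthread_getspecific(key));
    pthread_setspecific(key, NULL);

    if (pthread_key_delete(key) != 0)
        return -1;
    return destructor_runs;
}
```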

A Complete Example

None yet. Give me a while to think of one...... sorry. All i can think of right now is 'global variables are evil'. I'll try to find a good example for the future. If you have a good example, please let me know.

Thread Cancellation And Termination

As we create threads, we need to think about terminating them as well. There are several issues involved here. We need to be able to terminate threads cleanly. Unlike processes, where a very ugly method of using signals is used, the folks that designed the pthreads library were a little more thoughtful. So they supplied us with a whole system of canceling a thread, cleaning up after a thread, and so on. We will discuss these methods here.

Canceling A Thread

When we want to terminate a thread, we can use the pthread_cancel function. This function gets a thread ID as a parameter, and sends a cancellation request to this thread. What this thread does with this request depends on its state. It might act on it immediately, it might act on it when it gets to a cancellation point (discussed below), or it might completely ignore it. We'll see later how to set the state of a thread and define how it acts on cancellation requests. Let's first see how to use the cancel function. We assume that 'thr_id' is a variable of type pthread_t containing the ID of a running thread:


pthread_cancel(thr_id);

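The pthread_cancel() function returns 0 on success, or an error number on failure (e.g. ESRCH if no thread with the given ID could be found). Note that a return value of 0 only tells us that the cancellation request was sent. It does not tell us whether, or when, the target thread actually acted on it.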
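One way to find out whether a thread really was canceled is to join it and inspect its exit value. Here is a minimal sketch, assuming a thread that does nothing but sleep (sleep() being a cancellation point); the names 'sleeper' and 'cancel_and_confirm' are hypothetical.

```c
#include <pthread.h>
#include <unistd.h>

/* thread body: sleeps forever. sleep() is a cancellation point, */
/* so a pending cancellation request is acted upon there.        */
static void*
sleeper(void* arg)
{
    (void)arg;
    for (;;)
        sleep(1);
}

/* returns 1 if the thread was canceled, 0 if it ended in  */
/* some other way, -1 on error.                            */
int
cancel_and_confirm(void)
{
    pthread_t thr_id;
    void* result = NULL;

    if (pthread_create(&thr_id, NULL, sleeper, NULL) != 0)
        return -1;

    /* a zero return value only means the request was sent. */
    if (pthread_cancel(thr_id) != 0)
        return -1;

    /* joining tells us how the thread actually ended. */
    if (pthread_join(thr_id, &result) != 0)
        return -1;
    return (result == PTHREAD_CANCELED);
}
```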

Setting Thread Cancellation State

A thread's cancel state may be modified using several methods. The first is by using the pthread_setcancelstate() function. This function defines whether the thread will accept cancellation requests or not. The function takes two arguments: one that sets the new cancel state, and one into which the previous cancel state is stored by the function. Here is how it is used:


int old_cancel_state;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_cancel_state);

This will disable canceling this thread. We can also enable canceling the thread like this:


int old_cancel_state;
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old_cancel_state);

Note that you may supply a NULL pointer as the second parameter, and then you won't get the old cancel state.

A similar function, named pthread_setcanceltype() is used to define how a thread responds to a cancellation request, assuming it is in the 'ENABLED' cancel state. One option is to handle the request immediately (asynchronously). The other is to defer the request until a cancellation point. To set the first option (asynchronous cancellation), do something like:


int old_cancel_type;
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_cancel_type);

And to set the second option (deferred cancellation):


int old_cancel_type;
pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_cancel_type);

Note that you may supply a NULL pointer as the second parameter, and then you won't get the old cancel type.

You might wonder - "What if I never set the cancellation state or type of a thread?". Well, in such a case, the pthread_create() function automatically sets the thread to enabled deferred cancellation, that is, PTHREAD_CANCEL_ENABLE for the cancel state, and PTHREAD_CANCEL_DEFERRED for the cancel type.
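These defaults can be observed from inside a new thread, since both functions hand back the previous setting. The following is a minimal sketch with hypothetical names ('cancel_defaults', 'report_defaults', 'new_thread_has_default_cancelability'); it sets the values that are supposed to be the defaults anyway, just in order to read back the old ones.

```c
#include <pthread.h>

/* holds what a freshly created thread reports about itself. */
struct cancel_defaults {
    int state;
    int type;
};

/* thread body: sets the (supposedly default) values, only in */
/* order to get the previous values stored for us.            */
static void*
report_defaults(void* arg)
{
    struct cancel_defaults* out = (struct cancel_defaults*)arg;

    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &out->state);
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &out->type);
    return NULL;
}

/* returns 1 if a new thread starts with enabled, deferred */
/* cancellation, 0 if not, -1 on error.                    */
int
new_thread_has_default_cancelability(void)
{
    pthread_t thr;
    struct cancel_defaults found = { -1, -1 };

    if (pthread_create(&thr, NULL, report_defaults, &found) != 0)
        return -1;
    if (pthread_join(thr, NULL) != 0)
        return -1;
    return (found.state == PTHREAD_CANCEL_ENABLE &&
            found.type == PTHREAD_CANCEL_DEFERRED);
}
```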

Cancellation Points

As we've seen, a thread might be in a state where it does not handle cancel requests immediately, but rather defers them until it reaches a cancellation point. So what are these cancellation points?

In general, any function that might suspend the execution of a thread for a long time, should be a cancellation point. In practice, this depends on the specific implementation, and how conformant it is to the relevant POSIX standard (and which version of the standard it conforms to...). The following set of pthread functions serve as cancellation points:

pthread_join()
pthread_cond_wait()
pthread_cond_timedwait()
pthread_testcancel()
sem_wait()
sigwait()

This means that if a thread executes any of these functions, it'll check for deferred cancel requests. If there is one, it will execute the cancellation sequence, and terminate. Out of these functions, pthread_testcancel() is unique - its only purpose is to test whether a cancellation request is pending for this thread. If there is, it executes the cancellation sequence. If not, it returns immediately. This function may be used in a thread that does a lot of processing without getting into a "natural" cancellation point.

Note: In real conformant implementations of the pthreads standard, normal system calls that cause the process to block, such as read(), select(), wait() and so on, are also cancellation points. The same goes for standard C library functions that use these system calls (the various printf functions, for example).
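As an illustration of the use of pthread_testcancel(), here is a minimal sketch of a thread that only crunches numbers and never calls a blocking function. The names 'crunch_numbers' and 'cancel_busy_thread' are hypothetical, and the chunk size of 100000 iterations is an arbitrary choice. Without the pthread_testcancel() call, this thread (being in deferred cancellation mode) would never notice the cancellation request.

```c
#include <pthread.h>

/* thread body: pure computation, with no "natural" cancellation */
/* point. we poll for cancellation once per chunk of work.       */
static void*
crunch_numbers(void* arg)
{
    volatile unsigned long* progress = (volatile unsigned long*)arg;

    for (;;) {
        unsigned long i;

        for (i = 0; i < 100000; i++)
            (*progress)++;

        /* terminates the thread here if a request is pending. */
        pthread_testcancel();
    }
}

/* returns 1 if the busy thread was canceled, 0 if it ended */
/* in some other way, -1 on error.                          */
int
cancel_busy_thread(void)
{
    pthread_t thr;
    unsigned long progress = 0;
    void* result = NULL;

    if (pthread_create(&thr, NULL, crunch_numbers, &progress) != 0)
        return -1;
    if (pthread_cancel(thr) != 0)
        return -1;
    if (pthread_join(thr, &result) != 0)
        return -1;
    return (result == PTHREAD_CANCELED);
}
```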

Setting Thread Cleanup Functions

One of the features the pthreads library supplies is the ability for a thread to clean up after itself, before it exits. This is done by specifying one or more functions that will be called automatically by the pthreads library when the thread exits, either due to its own will (e.g. calling pthread_exit()), or due to it being canceled.

Two functions are supplied for this purpose. The pthread_cleanup_push() function is used to add a cleanup function to the set of cleanup functions for the current thread. The pthread_cleanup_pop() function removes the last function added with pthread_cleanup_push(). When the thread terminates, its cleanup functions are called in the reverse order of their registration. So the last one to be registered is the first one to be called.

When the cleanup functions are called, each one is supplied with one parameter, that was supplied as the second parameter to the pthread_cleanup_push() function call. Let's see how these functions may be used. In our example we'll see how these functions may be used to clean up some memory that our thread allocates when it starts running.



/* first, here is the cleanup function we want to register.        */
/* it gets a pointer to the allocated memory, and simply frees it. */
void
cleanup_after_malloc(void* allocated_memory)
{
    if (allocated_memory)
        free(allocated_memory);
}

/* and here is our thread's function.      */
/* we use the same function we used in our */
/* thread-pool server.                     */
void*
handle_requests_loop(void* data)
{
    .
    .
    /* this variable will be used later. please read on...         */
    int old_cancel_type;

    /* allocate some memory to hold the start time of this thread. */
    /* assume MAX_TIME_LEN is a previously defined macro.          */
    char* start_time = (char*)malloc(MAX_TIME_LEN);

    /* push our cleanup handler. */
    pthread_cleanup_push(cleanup_after_malloc, (void*)start_time);
    .
    .
    /* here we start the thread's main loop, and do whatever is desired.. */
    .
    .
    .

    /* and finally, we unregister the cleanup handler. our method may seem */
    /* awkward, but please read the comments below for an explanation.     */

    /* put the thread in deferred cancellation mode.      */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_cancel_type);

    /* supplying '1' means to execute the cleanup handler */
    /* prior to unregistering it. supplying '0' would     */
    /* have meant not to execute it.                      */
    pthread_cleanup_pop(1);

    /* restore the thread's previous cancellation mode.   */
    pthread_setcanceltype(old_cancel_type, NULL);

    return NULL;
}

As we can see, we allocated some memory here, and registered a cleanup handler that will free this memory when our thread exits. After the execution of the main loop of our thread, we unregistered the cleanup handler. This must be done in the same function that registered the cleanup handler, and in the same nesting level, since both pthread_cleanup_push() and pthread_cleanup_pop() functions are actually macros that add a '{' symbol and a '}' symbol, respectively.

As to the reason that we used that complex piece of code to unregister the cleanup handler, this is done to assure that our thread won't get canceled in the middle of the execution of our cleanup handler. This could have happened if our thread was in asynchronous cancellation mode. Thus, we made sure it was in deferred cancellation mode, then unregistered the cleanup handler, and finally restored whatever cancellation mode our thread was in previously. Note that we still assume the thread cannot be canceled in the execution of pthread_cleanup_pop() itself - this is true, since pthread_cleanup_pop() is not a cancellation point.
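The reverse order in which cleanup handlers are called can be seen in a small sketch such as the one below. The names 'record_handler', 'exiting_thread' and 'cleanup_order' are hypothetical. The thread registers two handlers tagged "A" and "B", and then exits via pthread_exit() while both are still registered.

```c
#include <pthread.h>
#include <string.h>

/* filled in by the handlers, in the order in which they are called. */
static char call_order[8];

/* cleanup handler: appends its one-letter tag to call_order. */
static void
record_handler(void* tag)
{
    size_t len = strlen(call_order);

    if (len + 1 < sizeof(call_order)) {
        call_order[len] = *(const char*)tag;
        call_order[len + 1] = '\0';
    }
}

static void*
exiting_thread(void* arg)
{
    (void)arg;
    pthread_cleanup_push(record_handler, (void*)"A");
    pthread_cleanup_push(record_handler, (void*)"B");

    /* exit with both handlers registered: "B" is called first. */
    pthread_exit(NULL);

    /* never reached, but each push needs its matching pop */
    /* in the same function and the same nesting level.    */
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    return NULL;
}

/* runs the thread and returns the recorded order of the */
/* handler calls (NULL on error).                        */
const char*
cleanup_order(void)
{
    pthread_t thr;

    call_order[0] = '\0';
    if (pthread_create(&thr, NULL, exiting_thread, NULL) != 0)
        return NULL;
    if (pthread_join(thr, NULL) != 0)
        return NULL;
    return call_order;
}
```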

Synchronizing On Threads Exiting

Sometimes it is desired for a thread to wait for the end of execution of another thread. This can be done using the pthread_join() function. It receives two parameters: a variable of type pthread_t, denoting the thread to be joined, and an address of a void* variable, into which the exit code of the thread will be placed (or PTHREAD_CANCELED if the joined thread was canceled).
The pthread_join() function suspends the execution of the calling thread until the joined thread is terminated.

For example, consider our earlier thread pool server. Looking back at the code, you'll see that we used an odd sleep() call before terminating the process. We did this since the main thread had no idea when the other threads finished processing all pending requests. We could have solved it by making the main thread run a loop of checking if no more requests are pending, but that would be a busy loop.

A cleaner way of implementing this, is by adding three changes to the code:

Tell the handler threads when we are done creating requests, by setting some flag.
Make the threads check, whenever the requests queue is empty, whether or not new requests are supposed to be generated. If not, then the thread should exit.
Make the main thread wait for the end of execution of each of the threads it spawned.

The first 2 changes are rather easy. We create a global variable named 'done_creating_requests' and set it to '0' initially. Each thread checks the contents of this variable every time before it intends to go to wait on the condition variable (i.e. the requests queue is empty).
The main thread is modified to set this variable to '1' after it finished generating all requests. Then the condition variable is being broadcast, in case any of the threads is waiting on it, to make sure all threads go and check the 'done_creating_requests' flag.

The last change is done using a pthread_join() loop: call pthread_join() once for each handler thread. This way, we know that only after all handler threads have exited, this loop is finished, and then we may safely terminate the process. If we didn't use this loop, we might terminate the process while one of the handler threads is still handling a request.
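The three changes described above can be sketched roughly as follows. This is not the code of thread-pool-server-with-join.c, only a stripped-down illustration under simplifying assumptions: the requests queue is reduced to a counter, "handling" a request just increments another counter, and the names 'handler_loop' and 'run_server' are hypothetical.

```c
#include <pthread.h>

#define NUM_HANDLERS 3

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;
static int num_requests = 0;            /* stands in for the queue */
static int num_handled = 0;
static int done_creating_requests = 0;  /* the new flag            */

static void*
handler_loop(void* arg)
{
    (void)arg;
    pthread_mutex_lock(&request_mutex);
    for (;;) {
        if (num_requests > 0) {
            num_requests--;
            /* a real handler would unlock the mutex while working. */
            num_handled++;
        } else if (done_creating_requests) {
            break;      /* queue is empty and nothing more will come */
        } else {
            /* unlocks the mutex while waiting, re-locks on return. */
            pthread_cond_wait(&got_request, &request_mutex);
        }
    }
    pthread_mutex_unlock(&request_mutex);
    return NULL;
}

/* generates 'total' requests, waits for the handler threads to */
/* exit, and returns the number of requests that were handled.  */
int
run_server(int total)
{
    pthread_t handlers[NUM_HANDLERS];
    int created = 0;
    int i;

    num_requests = num_handled = done_creating_requests = 0;
    for (i = 0; i < NUM_HANDLERS; i++) {
        if (pthread_create(&handlers[i], NULL, handler_loop, NULL) != 0)
            break;
        created++;
    }

    for (i = 0; i < total; i++) {
        pthread_mutex_lock(&request_mutex);
        num_requests++;
        pthread_cond_signal(&got_request);
        pthread_mutex_unlock(&request_mutex);
    }

    /* set the flag, and wake up every thread so it re-checks it. */
    pthread_mutex_lock(&request_mutex);
    done_creating_requests = 1;
    pthread_cond_broadcast(&got_request);
    pthread_mutex_unlock(&request_mutex);

    /* the join loop: returns only after all handlers have exited. */
    for (i = 0; i < created; i++)
        pthread_join(handlers[i], NULL);

    return (created > 0) ? num_handled : -1;
}
```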

The modified program is available in the file named thread-pool-server-with-join.c. Look for the word 'CHANGE' (in capital letters) to see the locations of the three changes.

Detaching A Thread

We have seen how threads can be joined using the pthread_join() function. In fact, threads that are in a 'join-able' state, must be joined by other threads, or else their memory resources will not be fully cleaned out. This is similar to what happens with processes whose parents didn't clean up after them (so-called 'zombie' processes).

If we have a thread that we wish would exit whenever it wants without the need to join it, we should put it in the detached state. This can be done either with appropriate flags to the pthread_create() function, or by using the pthread_detach() function. We'll consider the second option in our tutorial.

The pthread_detach() function gets one parameter, of type pthread_t, that denotes the thread we wish to put in the detached state. For example, we can create a thread and immediately detach it with a code similar to this:


pthread_t a_thread;   /* store the thread's structure here              */
int rc;               /* return value for pthread functions.            */
extern void* thread_loop(void*); /* declare the thread's main function. */

/* create the new thread. */
rc = pthread_create(&a_thread, NULL, thread_loop, NULL);

/* and if that succeeded, detach the newly created thread. */
if (rc == 0) {
    rc = pthread_detach(a_thread);
}

Of course, if we wish to have a thread in the detached state immediately, using the first option (setting the detached state directly when calling pthread_create()) is more efficient.
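For completeness, here is a minimal sketch of the first option, using a thread attributes object with the detach state set before the thread is created. The helper name 'create_detached' and the stub thread function 'short_task' are hypothetical.

```c
#include <pthread.h>

/* a stub thread function, for illustration only. */
void*
short_task(void* arg)
{
    (void)arg;
    return NULL;
}

/* create a thread that is detached from the very start. */
/* returns 0 on success, or an error number.             */
int
create_detached(pthread_t* thr, void* (*start)(void*), void* arg)
{
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(thr, &attr, start, arg);

    /* the attributes object is not needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}
```

A thread created this way must not be passed to pthread_join() later on.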

Threads Cancellation - A Complete Example

Our next example is much larger than the previous examples. It demonstrates how one could write a multi-threaded program in C, in a more or less clean manner. We take our previous thread-pool server, and enhance it in two ways. First, we add the ability to tune the number of handler threads based on the requests load. New threads are created if the requests queue becomes too large, and after the queue becomes shorter again, extra threads are canceled.

Second, we fix up the termination of the server when there are no more new requests to handle. Instead of the ugly sleep we used in our first example, this time the main thread waits for all threads to finish handling their last requests, by joining each of them using pthread_join().

The code is now split into 4 separate files, as follows:

requests_queue.c - This file contains functions to manipulate a requests queue. We took the add_request() and get_request() functions and put them here, along with a data structure that contains all the variables previously defined as globals - pointer to queue's head, counter of requests, and even pointers to the queue's mutex and condition variable. This way, all the manipulation of the data is done in a single file, and all its functions receive a pointer to a 'requests_queue' structure.
handler_thread.c - this contains the functions executed by each handler thread - a function that runs the main loop (an enhanced version of the 'handle_requests_loop()' function), and a few local functions explained below. We also define a data structure to collect all the data we want to pass to each thread. We pass a pointer to such a structure as a parameter to the thread's function in the pthread_create() call, instead of using a bunch of ugly globals: the thread's ID, a pointer to the requests queue structure, and pointers to the mutex and condition variable to be used.
handler_threads_pool.c - here we define an abstraction of a thread pool. We have a function to create a thread, a function to delete (cancel) a thread, and a function to delete all active handler threads, called during program termination. We define here a structure similar to that used to hold the requests queue, and thus the functions are similar. However, because we only access this pool from one thread, the main thread, we don't need to protect it using a mutex. This saves some overhead caused by mutexes. The overhead is small, but for a busy server, it might begin to become noticeable.
main.c - and finally, the main function to rule them all, and in the system bind them. This function creates a requests queue, creates a threads pool, creates a few handler threads, and then starts generating requests. After adding a request to the queue, it checks the queue size and the number of active handler threads, and adjusts the number of threads to the size of the queue. We use a simple water-marks algorithm here, but as you can see from the code, it can easily be replaced by a more sophisticated algorithm. In our water-marks algorithm implementation, when the high water-mark is reached, we start creating new handler threads, to empty the queue faster. Later, when the low water-mark is reached, we start canceling the extra threads, until we are left with the original number of handler threads.

After rewriting the program in a more manageable manner, we added code that uses the newly learned pthreads functions, as follows:

Each handler thread created puts itself in the deferred cancellation mode. This makes sure that when it gets canceled, it can finish handling its current request, before terminating.
Each handler thread also registers a cleanup function, to unlock the mutex when it terminates. This is done, since a thread is most likely to get canceled when calling pthread_cond_wait(), which is a cancellation point. Since the mutex is locked again on behalf of the thread before it is canceled inside pthread_cond_wait(), the thread would exit with the mutex still locked, and cause all other threads to 'hang' on the mutex. Thus, unlocking the mutex in a cleanup handler (registered with the pthread_cleanup_push() function) is the proper solution.
Finally, the main thread is set to clean up properly, and not brutally, as we did before. When it wishes to terminate, it calls the 'delete_handler_threads_pool()' function, which calls pthread_join for each remaining handler thread. This way, the function returns only after all handler threads finished handling their last request.
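The cleanup handler that unlocks the mutex can be sketched as follows. This is not the code found in handler_thread.c, only a minimal illustration with the hypothetical names 'unlock_mutex', 'waiting_handler' and 'cancel_waiting_handler'. The thread waits forever on the condition variable, gets canceled there, and the main thread then verifies that the mutex was left unlocked.

```c
#include <pthread.h>

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;

/* cleanup handler: unlocks the mutex given as its parameter. */
static void
unlock_mutex(void* mutex)
{
    pthread_mutex_unlock((pthread_mutex_t*)mutex);
}

static void*
waiting_handler(void* arg)
{
    (void)arg;
    pthread_mutex_lock(&request_mutex);
    pthread_cleanup_push(unlock_mutex, &request_mutex);

    /* no request ever arrives in this sketch, so the thread stays */
    /* here until canceled. when that happens, the mutex is locked */
    /* again before the cleanup handler gets called.               */
    for (;;)
        pthread_cond_wait(&got_request, &request_mutex);

    pthread_cleanup_pop(1);   /* never reached, balances the push */
    return NULL;
}

/* returns 1 if the thread was canceled and left the mutex */
/* unlocked, 0 if not, -1 on error.                        */
int
cancel_waiting_handler(void)
{
    pthread_t thr;
    void* result = NULL;

    if (pthread_create(&thr, NULL, waiting_handler, NULL) != 0)
        return -1;
    if (pthread_cancel(thr) != 0)
        return -1;
    if (pthread_join(thr, &result) != 0)
        return -1;

    /* without the cleanup handler, this would fail with EBUSY. */
    if (pthread_mutex_trylock(&request_mutex) != 0)
        return 0;
    pthread_mutex_unlock(&request_mutex);
    return (result == PTHREAD_CANCELED);
}
```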

Please refer to the source code for the full details. Reading the header files first will make it easier to understand the design. To compile the program, just switch to the thread-pool-server-changes directory, and type 'gmake'.

Exercise: our last program contains some possible race condition during its termination process. Can you see what this race is all about? Can you offer a complete solution to this problem? (hint - think of what happens to threads deleted using 'delete_handler_thread()').

Exercise 2: the way we implement the water-marks algorithm might come up too slow on creation of new threads. Try thinking of a different algorithm that will shorten the average time a request stays on the queue until it gets handled. Add some code to measure this time, and experiment until you find your "optimal pool algorithm". Note - Time should be measured in very small units (using the getrusage system call), and several runs of each algorithm should be made, to get more accurate measurements.

Using Threads For Responsive User Interface Programming

One area in which threads can be very helpful is in user-interface programs. These programs are usually centered around a loop of reading user input, processing it, and showing the results of the processing. The processing part may sometimes take a while to complete, and the user is made to wait during this operation. By placing such long operations in a separate thread, while having another thread to read user input, the program can be more responsive. It may allow the user to cancel the operation in the middle.

In graphical programs the problem is more severe, since the application should always be ready for a message from the windowing system telling it to repaint part of its window. If it's too busy executing some other task, its window will remain blank, which is rather ugly. In such a case, it is a good idea to have one thread handle the message loop of the windowing system and always be ready to get such repaint requests (as well as user input). Whenever this thread sees a need to do an operation that might take a long time to complete (say, more than 0.2 seconds in the worst case), it will delegate the job to a separate thread.

In order to structure things better, we may use a third thread, to control and synchronize the user-input and task-performing threads. If the user-input thread gets any user input, it will ask the controlling thread to handle the operation. If the task-performing thread finishes its operation, it will ask the controlling thread to show the results to the user.

User Interaction - A Complete Example

As an example, we will write a simple character-mode program that counts the number of lines in a file, while allowing the user to cancel the operation in the middle.

Our main thread will launch one thread to perform the line counting, and a second thread to check for user input. After that, the main thread waits on a condition variable. When any of the threads finishes its operation, it signals this condition variable, in order to let the main thread check what happened. A global variable is used to flag whether or not a cancel request was made by the user. It is initialized to '0', but if the user-input thread receives a cancellation request (the user pressing 'e'), it sets this flag to '1', signals the condition variable, and terminates. The line-counting thread will signal the condition variable only after it finished its computation.

Before you go read the program, we should explain the use of the system() function and the 'stty' Unix command. The system() function spawns a shell in which it executes the Unix command given as a parameter. The stty Unix command is used to change terminal mode settings. We use it to switch the terminal from its default, line-buffered mode, to a character mode (also known as raw mode), so the call to getchar() in the user-input thread will return immediately after the user presses any key. If we hadn't done so, the system would buffer all input to the program until the user presses the ENTER key. Finally, since this raw mode is not very useful (to say the least) once the program terminates and we get the shell prompt again, the user-input thread registers a cleanup function that restores the normal terminal mode, i.e. line-buffered. For more info, please refer to stty's manual page.

The program's source can be found in the file line-count.c. The name of the file whose lines it reads is hardcoded to 'very_large_data_file'. You should create a file with this name in the program's directory (large enough for the operation to take enough time). Alternatively, you may un-compress the file 'very_large_data_file.Z' found in this directory, using the command:

uncompress very_large_data_file.Z

Note that this will create a 5MB(!) file named 'very_large_data_file', so make sure you have enough free disk-space before performing this operation.

Using 3rd-Party Libraries In A Multi-Threaded Application

One more point, and a very important one, should be taken by programmers employing multi-threading in their programs. Since a multi-threaded program might have the same function executed by different threads at the same time, one must make sure that any function that might be invoked from more than one thread at a time, is MT-safe (Multi-Thread Safe). This means that any access to data structures and other shared resources is protected using mutexes.

It may be possible to use a non-MT-safe library in a multi-threaded program in two ways:

Use this library only from a single thread. This way we are assured that no function from the library is executed simultaneously from two separate threads. The problem here is that it might limit your whole design, and might force you to add more communications between threads, if another thread needs to somehow use a function from this library.
Use mutexes to protect function calls to the library. This means that a single mutex is used by any thread invoking any function in this library. The mutex is locked, the function is invoked, and then the mutex is unlocked. The problem with this solution is that the locking is not done in a fine granularity - even if two functions from the library do not interfere with each other, they still cannot be invoked at the same time by separate threads. The second thread will be blocked on the mutex until the first thread finishes the function call. You might call for using separate mutexes for unrelated functions, but usually you've no idea how the library really works and thus cannot know which functions access the same set of resources. More than that, even if you do know that, a new version of the library might behave differently, forcing you to modify your whole locking system.
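The second approach can be sketched as a thin wrapper around the library call. In the sketch below, 'legacy_next_id' is a made-up stand-in for a function from some non-MT-safe library (it updates hidden static state with no locking), and 'safe_next_id', 'library_mutex' and 'ids_after_concurrent_use' are hypothetical names as well.

```c
#include <pthread.h>

#define NUM_CALLERS      4
#define CALLS_PER_THREAD 1000

/* stand-in for a function from a non-MT-safe library: it  */
/* updates hidden static state without any locking.        */
static int
legacy_next_id(void)
{
    static int last_id = 0;

    last_id = last_id + 1;
    return last_id;
}

/* one mutex guards every call into the library. */
static pthread_mutex_t library_mutex = PTHREAD_MUTEX_INITIALIZER;

/* MT-safe wrapper: lock, invoke the library function, unlock. */
int
safe_next_id(void)
{
    int id;

    pthread_mutex_lock(&library_mutex);
    id = legacy_next_id();
    pthread_mutex_unlock(&library_mutex);
    return id;
}

static void*
caller(void* arg)
{
    int i;

    (void)arg;
    for (i = 0; i < CALLS_PER_THREAD; i++)
        safe_next_id();
    return NULL;
}

/* lets several threads use the wrapper at the same time, and   */
/* returns by how much the ID advanced between a call made      */
/* before the threads started and one made after they finished  */
/* (-1 on error).                                               */
int
ids_after_concurrent_use(void)
{
    pthread_t threads[NUM_CALLERS];
    int before = safe_next_id();
    int created = 0;
    int i;

    for (i = 0; i < NUM_CALLERS; i++) {
        if (pthread_create(&threads[i], NULL, caller, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    if (created != NUM_CALLERS)
        return -1;
    return safe_next_id() - before;
}
```

Since every call goes through the same mutex, no update is lost, at the price of serializing all the callers.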

As you can see, non-MT-safe libraries need special attention, so it is best to find MT-safe libraries with a similar functionality, if possible.

Using A Threads-Aware Debugger

One last thing to note - when debugging a multi-threaded application, one needs to use a debugger that "sees" the threads in the program. Most up-to-date debuggers that come with commercial development environments are thread-aware. As for Linux, gdb as shipped with most (all?) distributions seems to be not thread-aware. There is a project, called 'SmartGDB', that added thread support to gdb, as well as a graphical user interface (which is almost a must when debugging multi-threaded applications). However, it may be used to debug only multi-threaded applications that use the various user-level thread libraries. Debugging LinuxThreads with SmartGDB requires applying some kernel patches, that are currently available only for Linux kernels from the 2.1.X series. More information about this tool may be found at http://hegel.ittc.ukans.edu/projects/smartgdb/. There is also some information about availability of patches to the 2.0.32 kernel and gdb 4.17. This information may be found on the LinuxThreads homepage.

Side-Notes

water-marks algorithm: An algorithm used mostly when handling buffers or queues: start filling in the queue. If its size exceeds a threshold, known as the high water-mark, stop filling the queue (or start emptying it faster). Keep this state until the size of the queue becomes lower than another threshold, known as the low water-mark. At this point, resume the operation of filling the queue (or return the emptying speed to the original speed).
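Applied to the thread pool described earlier, the water-marks decision might be sketched as a single function. The threshold values and the names below are assumptions made for the sake of the example, not the values used in the tutorial's main.c.

```c
/* hypothetical thresholds, chosen only for illustration. */
#define HIGH_WATER_MARK 15   /* queue length above which we add threads    */
#define LOW_WATER_MARK   3   /* queue length below which we remove threads */
#define BASE_THREADS     3   /* the original number of handler threads     */
#define MAX_THREADS     10   /* never grow the pool beyond this            */

/* decide how to adjust the pool after a request was queued.   */
/* returns +1 to create a thread, -1 to cancel one, 0 to leave */
/* the pool as it is.                                          */
int
watermark_adjust(int queue_len, int num_threads)
{
    /* queue too long: empty it faster by adding a handler. */
    if (queue_len > HIGH_WATER_MARK && num_threads < MAX_THREADS)
        return 1;

    /* queue short again: cancel extras, down to the original count. */
    if (queue_len < LOW_WATER_MARK && num_threads > BASE_THREADS)
        return -1;

    /* between the two marks, keep the current state. */
    return 0;
}
```

Between the two marks the pool is left alone, which keeps it from oscillating when the queue length hovers around a single threshold.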


This document is copyright (c) 1998-2002 by guy keren.

The material in this document is provided AS IS, without any expressed or implied warranty, or claim of fitness for a particular purpose. Neither the author nor any contributers shell be liable for any damages incured directly or indirectly by using the material contained in this document.

permission to copy this document (electronically or on paper, for personal or organization internal use) or publish it on-line is hereby granted, provided that the document is copied as-is, this copyright notice is preserved, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

Permission to make translations of this document is also granted, under these terms - assuming the translation preserves the meaning of the text, the copyright notice is preserved as-is, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

For any questions about the document and its license, please contact the author.


Posted by '김용환'

Application Development Guide --Conditional variable

OS concept 2005. 2. 28. 20:30

Application Development Guide --Core Components

Synchronization Objects

In a multithreaded program, you must use synchronization objects whenever there is a possibility of corruption of shared data or conflicting scheduling of threads that have mutual scheduling dependencies. The following subsections discuss two kinds of synchronization objects: mutexes and condition variables.

Mutexes

A mutex (mutual exclusion) is an object that multiple threads use to ensure the integrity of a shared resource that they access, most commonly shared data. A mutex has two states: locked and unlocked. For each piece of shared data, all threads accessing that data must use the same mutex; each thread locks the mutex before it accesses the shared data and unlocks the mutex when it is finished accessing that data. If the mutex is locked by another thread, a thread that requests the lock with pthread_mutex_lock() blocks until the mutex is released (see Figure 9). A thread that calls pthread_mutex_trylock() instead is not blocked: the call returns immediately and reports whether the lock was acquired.

Figure 9. Only One Thread Can Lock a Mutex

Each mutex must be initialized. (To initialize mutexes as part of the program's one-time initialization code, see "One-Time Initialization Routines".) To initialize a mutex, use the pthread_mutex_init() routine. This routine allows you to specify an attributes object, which allows you to specify the mutex type. The following are types of mutexes:

A fast mutex (the default) is locked only once by a thread. If the thread tries to lock the mutex again without first unlocking it, the thread waits for itself to release the first lock and deadlocks on itself.
This type of mutex is called fast because it can be locked and unlocked more rapidly than a recursive mutex. It is the most efficient form of mutex.
A recursive mutex can be locked more than once by a given thread without causing a deadlock. The thread must call the pthread_mutex_unlock() routine the same number of times that it called the pthread_mutex_lock() routine before another thread can lock the mutex. Recursive mutexes have the notion of a mutex owner. When a thread successfully locks a recursive mutex, it owns that mutex and the lock count is set to 1. Any other thread attempting to lock the mutex blocks until the mutex becomes unlocked. If the owner of the mutex attempts to lock the mutex again, the lock count is incremented, and the thread continues running. When an owner unlocks a recursive mutex, the lock count is decremented. The mutex remains locked and owned until the count reaches 0 (zero). It is an error for any thread other than the owner to attempt to unlock the mutex.
A recursive mutex is useful if a thread needs exclusive access to a piece of data, and it needs to call another routine (or itself) that needs exclusive access to the data. A recursive mutex allows nested attempts to lock the mutex to succeed rather than deadlock.
This type of mutex requires more careful programming. Never use a recursive mutex with condition variables because the implicit unlock performed for a pthread_cond_wait() or pthread_cond_timedwait() may not actually release the mutex. In that case, no other thread can satisfy the condition of the predicate.
A nonrecursive mutex is locked only once by a thread, like a fast mutex. If the thread tries to lock the mutex again without first unlocking it, the thread receives an error. Thus, nonrecursive mutexes are more informative than fast mutexes because fast mutexes block in such a case, leaving it up to you to determine why the thread no longer executes. Also, if someone other than the owner tries to unlock a nonrecursive mutex, an error is returned.

To lock a mutex, use one of the following routines, depending on what you want to happen if the mutex is locked:

The pthread_mutex_lock() routine
If the mutex is locked, the thread waits for the mutex to become available.
The pthread_mutex_trylock() routine
If the mutex is locked, the thread continues without waiting for the mutex to become available. The thread immediately checks the return status to see if the lock was successful, and then takes whatever action is appropriate if it was not.
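
The two locking routines can be contrasted in a short sketch. It assumes the return convention of current POSIX threads, where pthread_mutex_trylock() returns EBUSY if the mutex is already locked; the DCE version described here may report the result differently. The names data_mutex, shared_counter, increment_blocking() and increment_if_free() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t data_mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;

/* Blocking variant: waits until the mutex becomes available. */
void increment_blocking(void)
{
    pthread_mutex_lock(&data_mutex);
    shared_counter++;
    pthread_mutex_unlock(&data_mutex);
}

/* Non-blocking variant: returns true if the counter was updated,
 * false if the mutex was busy and the caller should do something else. */
bool increment_if_free(void)
{
    int rc = pthread_mutex_trylock(&data_mutex);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    shared_counter++;
    pthread_mutex_unlock(&data_mutex);
    return true;
}
```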

When a thread is finished accessing a piece of shared data, it unlocks the associated mutex by calling the pthread_mutex_unlock() routine.

If another thread is waiting on the mutex, its execution is unblocked. If more than one thread is waiting on the mutex, the scheduling policy and the thread scheduling priority determine which thread acquires the mutex. (See "Scheduling Priority Attribute" for additional information.)

You can delete a mutex and reclaim its storage by calling the pthread_mutex_destroy() routine. Use this routine only after the mutex is no longer needed by any thread. This routine may require serialization. Mutexes are automatically deleted when the program terminates.

Note:

Never include DCE APIs (such as pthread_mutex_destroy()) in the termination routine of a DLL. Doing so can result in an error (such as return code 6 -- invalid handle) when termination occurs out of sequence.

Condition Variables

A condition variable allows a thread to block its own execution until some shared data reaches a particular state. Cooperating threads check the shared data and wait on the condition variable. For example, one thread in a program produces work-to-do packets and another thread consumes these packets (does the work). If the work queue is empty when the consumer thread checks it, that thread waits on a work-to-do condition variable. When the producer thread puts a packet on the queue, it signals the work-to-do condition variable.

A condition variable is used to wait for a shared resource to assume some specific state (a predicate). A mutex, on the other hand, is used to reserve some shared resource while the resource is being manipulated. For example, a thread A may need to wait for a thread B to finish a task X before thread A proceeds to execute a task Y. Thread B can tell thread A that it has finished task X by using a variable they both have access to: the predicate associated with the condition variable. When thread A is ready to execute task Y, it checks the predicate to see if thread B is finished (see Figure 10).

Figure 10. Thread A Waits on Condition Ready, Then Wakes Up and Proceeds

First, thread A locks the mutex named mutex_ready that is associated with the condition variable. Then it reads the predicate associated with the condition variable named ready. If the predicate indicates that thread B has finished task X, then thread A can unlock the mutex and proceed with task Y. If, however, the predicate indicates that thread B has not yet finished task X, then thread A waits for the condition variable to be signaled by calling the pthread wait primitive, pthread_cond_wait(). Waiting on the condition variable automatically unlocks the mutex, allowing thread B to lock the mutex when it has finished task X. The lock is automatically reacquired before thread A wakes up (see Figure 11).

Figure 11. Thread B Signals Condition Ready

Thread B updates the predicate named ready associated with the condition variable to the state thread A is waiting for. It also executes a signal on the condition variable while holding the mutex mutex_ready.

Thread A wakes up, verifies that the condition variable (predicate) is in the correct state, and proceeds to execute task Y (see Figure 10).

Note that, although the condition variable is used for explicit communications among threads, the communications are anonymous. Thread B does not necessarily know that thread A is waiting on the condition variable that thread B signals. And thread A does not know that it was thread B that woke it up from its wait on the condition variable.
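
The exchange between thread A and thread B can be sketched with standard POSIX calls. The names mutex_ready and ready follow the text; the predicate variable task_x_done, the placeholder "tasks" and the function run_task_y_after_x() are assumptions made for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t mutex_ready = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready       = PTHREAD_COND_INITIALIZER;
static bool task_x_done   = false;   /* the predicate */
static int  task_x_result = 0;

/* Thread B: perform task X, update the predicate, signal. */
static void *thread_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex_ready);
    task_x_result = 21;              /* placeholder for task X */
    task_x_done = true;
    pthread_cond_signal(&ready);     /* signal while holding the mutex */
    pthread_mutex_unlock(&mutex_ready);
    return NULL;
}

/* Thread A: wait until B has finished task X, then do task Y. */
int run_task_y_after_x(void)
{
    pthread_t b;
    int task_y_result;
    int rc = pthread_create(&b, NULL, thread_b, NULL);
    assert(rc == 0);

    pthread_mutex_lock(&mutex_ready);
    while (!task_x_done)             /* re-check the predicate after every wakeup */
        pthread_cond_wait(&ready, &mutex_ready);
    task_y_result = task_x_result * 2;   /* placeholder for task Y */
    pthread_mutex_unlock(&mutex_ready);

    pthread_join(b, NULL);
    return task_y_result;
}
```

The while loop is the "verifies the predicate" step: a woken thread checks the shared state again rather than trusting the wakeup alone.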

Use the pthread_cond_init() routine to create a condition variable. To create condition variables as part of the program's one-time initialization code, see "One-Time Initialization Routines".

Use the pthread_cond_wait() routine to cause a thread to wait until the condition is signaled or broadcast. This routine specifies a condition variable and a mutex that you have locked. (If you have not locked the mutex, the results of pthread_cond_wait() are unpredictable.) This routine unlocks the mutex and causes the calling thread to wait on the condition variable until another thread calls one of the following routines:

The pthread_cond_signal() routine to wake one thread that is waiting on the condition variable
The pthread_cond_broadcast() routine to wake all threads that are waiting on a condition variable

If you want to limit the time that a thread waits for a condition to be signaled or broadcast, use the pthread_cond_timedwait() routine. This routine specifies the condition variable, mutex, and absolute time at which the wait should expire if the condition variable is not signaled or broadcast.

You can delete a condition variable and reclaim its storage by calling the pthread_cond_destroy() routine. Use this routine only after the condition variable is no longer needed by any thread. Condition variables are automatically deleted when the program terminates.

Other Synchronization Methods

There is another synchronization method that is not anonymous: the join primitive. This allows a thread to wait for another specific thread to complete its execution. When the second thread is finished, the first thread unblocks and continues its execution. Unlike mutexes and condition variables, the join primitive is not associated with any particular shared data.
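
A small sketch of the join primitive using the POSIX pthread_join() call. The functions square() and squared_in_thread() are hypothetical names for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Thread body: square the integer it was handed and return its address. */
static void *square(void *arg)
{
    int *value = arg;
    *value = *value * *value;
    return value;
}

/* Start one specific thread and wait for exactly that thread to finish. */
int squared_in_thread(int x)
{
    pthread_t t;
    void *result = NULL;
    int rc = pthread_create(&t, NULL, square, &x);
    assert(rc == 0);
    rc = pthread_join(t, &result);   /* blocks until thread t completes */
    assert(rc == 0);
    return *(int *)result;
}
```

No mutex is needed around x here, since the creating thread does not touch it between pthread_create() and the return of pthread_join().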


unix vi commands

unix and linux 2005. 2. 28. 20:09

How to use UNIX vi

Quick-reference vi
==================

Author : 이재용 KREONet, SERI

Ways to move the cursor

                        k(-)
                         △
                         ||
               h(bs)     <-----          ----->   l(sp)
                         ||
                         ▽
                         j(+)

     Press h, j, k, or l to move the cursor.
     The keys shown in () work as well.

     bs : BackSpace                         sp : SPace bar

Ways to scroll the screen

     ^f     - (FORWARD) scroll forward one page
     ^b     - (BACKWARD) scroll back one page
     ^d     - (DOWN) scroll forward half a page
     ^u     - (UP) scroll back half a page

        * ^ denotes the Ctrl key on the keyboard.

Ways to move the cursor within a screen

     H     - HOME, move to the top line of the screen
     M     - MIDDLE, move to the middle of the screen
     L     - LAST, move to the last line of the screen
     G     - GOTO, move to the last line of the file
     nG     - GOTO nth line of the file (or :n)
     ^G     - GIVES file status

Ways to move the cursor within a line

     w     - WORD, forward one word
     b     - BACKWARD, back one word
     e     - END, to the end of the current word
     0     - zero, to the beginning of the line (or ^)
     $     - end, to the end of the line

Searching

     /pattern - scan (/) forward from the cursor for the next
              occurrence of 'pattern'
     ?pattern - scan (?) backward from the cursor for the previous
              occurrence of 'pattern'
     n      - repeat the last search for 'pattern' in the same direction
     N      - repeat the last search for 'pattern' in the opposite direction

Ways to exit vi

     :q!     - exit without saving the file
     :w     - WRITE, save without leaving vi
     :wq     - WRITE and QUIT, save and leave vi
     ZZ     - same as :wq

Insert mode

Note :     ESC (escape key) ends insertion and returns to command mode.

     i     - INSERT, insert text at the cursor
     I     - INSERT, insert text at the beginning of the line
     a     - APPEND, insert text after the cursor
     A     - APPEND, insert text at the end of the current line
     o     - OPEN line, open a new line below the current line
     O     - OPEN line, open a new line above the current line
     r     - REPLACE, replace a single character (does not require ESC)
     R     - REPLACE, overwrite until ESC is pressed
     cw     - CHANGE word, replace up to the end of the word under the cursor
                (cnw - change n number of words)
     C     - CHANGE, replace from the cursor to the end of the line
     u     - UNDOES, undo the last command
     U     - UNDOES, restore the whole line to its original state

Note : After any command that enters INSERT mode (i, a, o, r, c, s,
     including their uppercase forms), you can discard what you typed
     by pressing ESC and then u to undo.

Yanking : (Copying)

     Y     - YANKS (copies) the current line into a hidden buffer
     nY     - YANKS n, copies n lines into a hidden buffer

Deleting :

     x     - deletes, delete one character (also 'd sp')
     dw     - DELETES words, delete one word
     D     - DELETES, delete from the cursor to the end of the line
     dd     - DELETES lines, delete one line and save it in the hidden buffer
     ndd     - DELETES n, delete n lines and save them in the hidden buffer
             (i.e., 10dd deletes 10 lines)

Putting :

     p     - PUTS, insert the hidden buffer's contents on the line
                after the cursor
     P     - PUTS, insert the hidden buffer's contents on the line
                before the cursor
     xp     - swap the character under the cursor with the next character

Interactive edit : (search and replace)

     /pattern - find pattern to be replaced (as above)
     cw      - use a replacement command (cw, dw, r, s, etc.)
     n      - find next occurrence of 'pattern'
     .      - repeat command

     * See the sections above

Global replacement :

        :1,$s/string1/string2/g
        replace string1 with string2 from line 1 to the end
     e.g.,     :1,$s/sun/SUN/g

Global delete :

        :g/pattern/d
        delete every line containing pattern, from line 1 to the end
     e.g.,     :g/###/d (to delete lines inserted by cc file.c | & error-v)

Reading in files :

     :r file2 - insert file2 on the line after the cursor

Editing between files : (not needed for SUN system users)

     :w     - save the current file before reading another one (file1)
     :e file2 - load the second file for editing (file2)
     :w     - save the second file (file2)
     :e #     - load the original file for editing (file1)

     example :w          /* save file1 before leaving it */
          :e file2     /* load file2      */
               "x4Y     /* yank the top 4 lines of file2 into buffer 'x' */
          :e #          /* load file1 (no changes) */
               "xP     /* put the contents stored in buffer 'x' */

Miscellaneous commands :

     :! cmd     - run a shell command from within the editor
     ~     - (tilde or 'wavy'), toggle case: uppercase to lowercase, lowercase to uppercase
     %     - find the matching (,),{,},[,] within a line
     mx     - mark the current position with the letter x
        d'x     - delete from the position marked x to the current cursor position
     ^V     - allows for insertion of control characters (e.g., ^L)
     ?string - scan (?) backward for 'string'
     :n,m w file - write lines n through m to a file named file
                 (e.g., 15,25 w file)
     J     - JOINS, join the current line with the next line
     :set ai     - have the editor indent new lines automatically
     :set list - show special characters
               (i.e., non-printable characters)
     :set nows - stop wraparound search
     :set ts=n - set tab stops to be other than the default (8)
     :set wm=n - set wrap margin (automatic carriage return insert at n)

example :     setenv EXINIT 'set ai wm=8 ts=4|map F W|map @ :w^M:e#^M'


Mutual exclusion with System V semaphores

OS concept 2005. 2. 18. 22:16

http://www.sunyzero.com/zboard/view.php?id=sunycomputer&page=3&sn1=&divpage=1&sn=off&ss=on&sc=on&select_arrange=headnum&desc=asc&no=82

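Subject		[C] Mutual exclusion with System V semaphores: with examples

[ Semaphores and mutual exclusion : a System V semaphore example ]

* Author : Steven Kim <sunyzero@yahoo.com>
* Last modified : 2003-08-20
* If you find a problem, typo, or anything odd in this article, please leave a comment and I will fix it. (if idle...)

1. Concept
A semaphore is one of the facilities an operating system provides, built on the theory of mutual exclusion (Mutual Exclusion: MUTEX).
You know that Dijkstra was the first to propose the semaphore, right? The mathematician who came up with the shortest path algorithm and plenty of other theory... (for shortest paths, his algorithm is the most widely used along with the Bellman-Ford algorithm. It is the first algorithm you learn when you study Linear Programming.)
So what is a semaphore? Let's take a look. In practice, a semaphore lets only one party at a time into some region.
Put simply, if B writes to a region while A is still writing to it, then afterwards there is no way to tell whether the contents are what A wrote or what B wrote. The two may even end up interleaved. To prevent this, it is best to keep everyone else out while the data is being changed. Reads, too, should of course wait until the write has completely finished.

The P/V operations on semaphores are covered in operating systems textbooks, so read about them there. P decrements the semaphore (taking a binary semaphore to zero), and V moves it back toward positive. A process therefore waits while the value is 0; when its turn comes, the process that was just in the critical section performs V on its way out, making the value positive so that the waiting process can enter.

2. A simple semaphore example (1)
Now let's look at how it is actually used. Think it through while reading the code below.
(This article uses System V semaphores, the most widely used kind.)

* Setup
- The code below runs in a section that 10 pre-forked servers can reach at the same time.
- shm_userinfo is a shared memory region. The 10 servers share it.
- sem_buf.sem_op is the value to add to the semaphore. -1 performs the P operation (lock); 1 performs the V operation (unlock).
- The last (third) parameter of semop() is the number of elements in the sem_buf array.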

[code]
....(omitted)....
    sem_buf.sem_num = sem_idx;        /* semaphore element index */
    sem_buf.sem_flg = SEM_UNDO;        /* SEM_UNDO flag */
    sem_buf.sem_op = -1;
    semop(sem_id, &sem_buf, 1); /* P operation : decrease */

    shm_userinfo->person[i].id = cur_id; /* critical section */

    sem_buf.sem_num = sem_idx;
    sem_buf.sem_flg = SEM_UNDO;
    sem_buf.sem_op = 1;
    semop(sem_id, &sem_buf, 1); /* V operation : increase */
....(omitted)....
[/code]

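Calling the 10 servers a, b, c, ... for short: a performs P first, enters the critical section, and starts modifying the data. While it is doing so, processes b and c reach this section and attempt P. a's P operation has already set the semaphore value to 0 to signal that no slot is free, so they block. When a finishes modifying the shm_userinfo region and leaves the critical section, it sets sem_buf.sem_op = 1 and calls semop() to increase the semaphore value. The value goes from 0 to 1, so the next process to be woken knows it may enter the critical section. It enters, and its P brings the value back to 0.

Note) The semaphore is used to prevent the shm_userinfo region from being modified concurrently, or read while a modification is in progress. Mutual exclusion is therefore about ruling out simultaneous access to a region; it does not determine the order of operations.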
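
The P/V pair from the example above can be wrapped in two small helpers. This is only a sketch assuming a Linux-style System V IPC environment: sem_create_binary(), sem_p() and sem_v() are hypothetical names, and union semun_arg is a locally defined stand-in for the union semun that the caller has to supply on Linux.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Local stand-in for union semun (hypothetical name). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private one-element semaphore set initialised to 1 (unlocked).
 * Returns the semaphore id, or -1 on failure. */
int sem_create_binary(void)
{
    union semun_arg arg;
    int sem_id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (sem_id == -1)
        return -1;
    arg.val = 1;
    if (semctl(sem_id, 0, SETVAL, arg) == -1) {
        semctl(sem_id, 0, IPC_RMID);
        return -1;
    }
    return sem_id;
}

/* Add delta to one semaphore of the set, with SEM_UNDO as in the text. */
static int sem_change(int sem_id, unsigned short sem_idx, short delta)
{
    struct sembuf sem_buf;
    assert(delta != 0);
    sem_buf.sem_num = sem_idx;
    sem_buf.sem_op  = delta;
    sem_buf.sem_flg = SEM_UNDO;
    return semop(sem_id, &sem_buf, 1);
}

/* P operation: decrease (lock). Blocks while the value is 0. */
int sem_p(int sem_id, unsigned short sem_idx) { return sem_change(sem_id, sem_idx, -1); }

/* V operation: increase (unlock). */
int sem_v(int sem_id, unsigned short sem_idx) { return sem_change(sem_id, sem_idx, 1); }
```

A critical section then reads as sem_p(id, 0); ...; sem_v(id, 0);, and the set should eventually be removed with semctl(id, 0, IPC_RMID) because System V semaphores outlive the processes that created them.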

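3. System V semaphore functions
Having seen briefly how a semaphore is used, let's look at how to create, modify, remove, and query one.

3.1 Creating a semaphore: semget

Syntax : int semget ( key_t key, int nsems, int semflg )

- Return value : on success: an identifier (semaphore id) through which the semaphore set is accessed.
          on failure: -1

- key   : the key of the semaphore set to create. When a semaphore is used together with shared memory, the same key value is commonly used for both to avoid confusion, although some systems do not allow the same key to be shared.
          To get a set that cannot collide with any existing key, use IPC_PRIVATE (in that case, save the returned ipc id and access the set through that semid).
        ex) shared memory key : 0x35001000 -> semaphore key : 0x35001000
- nsems : the number of semaphores in the set at creation. It does not matter when attaching to an existing set (0 is accepted).
- semflg: flags applied at creation. Several flags can be combined with OR. These are generally the standard IPC flags, so they have the same meaning as for shared memory and message queues. Independently of the flags, the low 9 bits of semflg give the permissions of the created set. With 0777, every user can read, remove, and access the set.
  + semflg values

  IPC_CREAT : create the semaphore set.
  IPC_EXCL  : can be used together with IPC_CREAT. If the set already exists, semget returns -1 and errno is set to EEXIST.

Example) Create a set of 5 semaphores with key 0x44001100 and permissions 0660. Because IPC_EXCL is given, the call returns -1 with errno set to EEXIST if the set already exists.

sem_id = semget(0x44001100, 5, IPC_CREAT|IPC_EXCL|0660);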
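
One common way to use IPC_EXCL is a create-or-attach helper: try to create the set exclusively, and fall back to attaching if it already exists. A minimal sketch, assuming the caller wants to know which of the two happened (for instance, so that only the creator initialises the values); sem_open_or_create() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Create the set if it does not exist yet, otherwise attach to it.
 * *created is set to 1 if this call created the set, 0 if it attached.
 * Returns the semaphore id, or -1 on failure. */
int sem_open_or_create(key_t key, int nsems, int *created)
{
    int sem_id;

    assert(created != NULL);
    sem_id = semget(key, nsems, IPC_CREAT | IPC_EXCL | 0660);
    if (sem_id != -1) {
        *created = 1;
        return sem_id;
    }
    if (errno != EEXIST)
        return -1;                 /* a real error, not "already there" */

    *created = 0;
    return semget(key, 0, 0);      /* attach to the existing set */
}
```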

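3.2 Controlling a semaphore: semctl (semaphore control)

Syntax : int semctl (int semid, int semnum, int cmd, union semun arg)

- Return value : on success: a non-negative value (its meaning depends on cmd)
          on failure: -1

- semid : semaphore identifier (semaphore id) returned by semget()
- semnum: index within the set of the semaphore to control
- cmd   : semaphore control command
- arg   : union holding the information stored or returned, depending on the control command (cmd)

arg is defined as follows. The union is interpreted differently depending on which cmd is used.
union semun {
    int val;                    /* value for SETVAL */
    struct semid_ds *buf;       /* buffer for IPC_STAT, IPC_SET */
    unsigned short int *array;  /* array for GETALL, SETALL */
    struct seminfo *__buf;      /* buffer for IPC_INFO */
};

The values for cmd are listed below. Naturally, the caller needs the corresponding permission on the semaphore set: read permission for commands that read, write permission for commands that change or remove.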

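IPC_STAT    Copies information about the semaphore set into the structure pointed to by arg.buf. The argument semnum is ignored.
            struct semid_ds is defined as follows.

        struct semid_ds
        {
            struct ipc_perm sem_perm;     /* operation permission struct   */
            __time_t sem_otime;           /* last semop() time             */
            __time_t sem_ctime;           /* last time changed by semctl() */
            unsigned long int sem_nsems;  /* number of semaphores in set   */
        };

IPC_RMID    Immediately removes the semaphore set and its data structures, waking all processes waiting on it.
            The effective user ID of the calling process must be that of the superuser, or of the creator or owner of the set.
            The argument semnum is ignored.

GETALL      Returns semval for every semaphore in the set in arg.array.
            The argument semnum is ignored.

GETNCNT     The call returns the value of semncnt for the semnum-th semaphore of the set: the number of processes waiting for its value to increase.

GETPID      The call returns the value of sempid: the pid of the process that last performed an operation on the semaphore.

GETVAL      The call returns the value of semval for the semnum-th semaphore of the set.

GETZCNT     The call returns the value of semzcnt for the semnum-th semaphore of the set.
            semzcnt is the number of processes currently blocked waiting for semval to become 0.

SETALL      Sets semval for every semaphore in the set from arg.array and updates sem_ctime in the semid_ds structure associated with the set.
            Undo entries for the changed semaphores are cleared, and processes blocked in semop() are woken if the new values
            allow them to proceed. The argument semnum is ignored.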
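
As an illustration of how IPC_STAT and GETALL combine, the sketch below reads the size of a set and then all of its values. It assumes a Linux-style environment; union semun_arg is a locally defined stand-in for union semun, and sem_snapshot() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Local stand-in for union semun (hypothetical name). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Copy the values of every semaphore in the set into out[0..cap-1].
 * Returns the number of semaphores in the set, or -1 on failure
 * (including when cap is too small). */
int sem_snapshot(int sem_id, unsigned short *out, int cap)
{
    struct semid_ds ds;
    union semun_arg arg;

    assert(out != NULL);
    arg.buf = &ds;
    if (semctl(sem_id, 0, IPC_STAT, arg) == -1)   /* semnum is ignored */
        return -1;
    if (cap < 0 || (unsigned long)cap < ds.sem_nsems)
        return -1;

    arg.array = out;
    if (semctl(sem_id, 0, GETALL, arg) == -1)     /* semnum is ignored */
        return -1;
    return (int)ds.sem_nsems;
}
```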

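3.3 Changing semaphore values

Syntax : int semop ( int semid, struct sembuf *sops, unsigned nsops )

- Return value : on success: returns 0
          on failure: returns -1

- semid : semaphore identifier (semaphore id) returned by semget()
- sops  : array of structures describing the semaphore operations; the members are:
            unsigned short sem_num;  /* semaphore index in the set: 0 = first */
            short sem_op;   /* semaphore operation          */
            short sem_flg;  /* operation flags              */
- nsops : number of elements in sops; when several semaphores are to be changed, this is the number of operations in the sops array.

Each member of struct sembuf *sops has the following meaning.

sem_num : index of the semaphore within the set on which to operate.
sem_op  : value to add to the semaphore value (semval). Usually 1 or -1.
          -1 decreases the value and so corresponds to the P operation; 1 increases it and corresponds to the V operation.
sem_flg : optional flags for the operation, with the meanings below.

IPC_NOWAIT : Normally, when P is called and the semaphore value is 0, the semaphore is unavailable and the caller blocks until the process currently in the critical section finishes and calls V. With IPC_NOWAIT set, semop() instead returns immediately with errno set to EAGAIN. As the name says, there is no waiting.
SEM_UNDO : With this flag set, the operation is automatically undone when the process exits. So if a process dies inside the critical section, its operation is rolled back and the next process can perform P and enter the critical section.

Example) The following changes two semaphores. It operates on semaphore 0 (the first in the set) and semaphore 3 (the fourth in the set) at the same time. Because sem_op is -1, it performs the P operation on both.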
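
A sketch of a non-blocking P built on IPC_NOWAIT, again assuming a Linux-style environment. The helpers sem_set_value() and sem_try_p() and the union semun_arg stand-in are hypothetical names for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Local stand-in for union semun (hypothetical name). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Set one semaphore of the set to an explicit value. */
int sem_set_value(int sem_id, unsigned short sem_idx, int value)
{
    union semun_arg arg;
    arg.val = value;
    return semctl(sem_id, sem_idx, SETVAL, arg);
}

/* Try the P operation without waiting.
 * Returns 1 if the semaphore was acquired, 0 if it was busy,
 * and -1 on any other error. */
int sem_try_p(int sem_id, unsigned short sem_idx)
{
    struct sembuf op;
    op.sem_num = sem_idx;
    op.sem_op  = -1;
    op.sem_flg = IPC_NOWAIT | SEM_UNDO;   /* do not block; undo on exit */
    if (semop(sem_id, &op, 1) == 0)
        return 1;
    return (errno == EAGAIN) ? 0 : -1;
}
```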
struct sembuf semops[2];

semops[0].sem_num = 0;
semops[0].sem_op  = -1;
semops[0].sem_flg = SEM_UNDO;
semops[1].sem_num = 3;
semops[1].sem_op  = -1;
semops[1].sem_flg = SEM_UNDO;
semop(semid, semops, 2);

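4. Usage example
Below is a real example of using semaphores: several processes are fork()ed and use a semaphore for mutual exclusion.

4.1 Note!
I meant to include the example here, but I will just attach the source files instead. Go into the src/ipc directory and build with make to get two binaries, sysv_sem and sysv_nosem. Both are built from sysv_sem.c, one with the semaphore enabled and one without.

To run, pass the number of processes to fork as the argument.
Also, when 3 or more processes are forked, the third is deliberately terminated with abort() to show that SEM_UNDO works properly when the semaphore is used. Without SEM_UNDO, all the other processes would block once the third process was terminated by abort().

PS) If you copy this elsewhere, please cite the original article. It's fine if you don't; it's just a courtesy... ^^*

Ref. It is worth studying exclusive resource use in detail. It is essential for asynchronous processing and kernel programming.
* TAS(Test-and-Set) operation : an atomic operation. When two functions try to access one resource at the same time, the first to arrive sets the flag to 1, which keeps the other out. Resetting it to 0 on the way out lets the next function enter. The 68000 CPU provides TAS as a hardware instruction.
* Spinlock : used by multiple CPUs to contend exclusively for a resource. Used mainly inside the kernel.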
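
A user-space sketch of a spinlock built on test-and-set, using the C11 atomic_flag type rather than a CPU-specific TAS instruction. The type spinlock and the functions spin_lock(), spin_trylock() and spin_unlock() are hypothetical names; real kernel spinlocks do considerably more (for example, disabling preemption).

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_flag busy;   /* set = locked, clear = free */
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Spin until test-and-set reports that the flag was previously clear. */
void spin_lock(spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        ;   /* busy-wait: somebody else set the flag first */
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was busy. */
int spin_trylock(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire);
}

/* Clear the flag on the way out so the next caller can enter. */
void spin_unlock(spinlock *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}
```

Unlike a semaphore or mutex, a waiter here burns CPU while it waits, which is why the technique suits very short critical sections.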


pthread concepts - Application Development Guide --Core Components

c or linux 2005. 2. 18. 21:00

IBM Distributed Computing Environment for AIX, Version 2.2; (C) IBM Corporation

Application Development Guide --Core Components




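Multi-Threaded Programming With POSIX Threads

c or linux 2005. 2. 18. 20:59

v1.1.1

Multi-Threaded Programming With POSIX Threads

Translator: 차현진(terminus@kldp.org)
Original: http://users.actcom.co.il/~choo/lupg/tutorials/multi-thread/multi-thread.html

Contents

Before We Start...

This tutorial aims to make you comfortable with multi-threaded programming using POSIX threads (pthreads) and to show how thread features are used in real programs. It explains the various tools the library defines, how to use them, and how to apply them in practice to solve programming problems. To read it you should already know the concepts of parallel programming (or multi-processing); otherwise the concepts will be somewhat hard to grasp. Each tutorial begins by explaining the theoretical background and terminology for readers who are only familiar with "serial" programming.

We will assume readers are familiar with asynchronous programming environments such as X or Motif. Familiarity with such environments makes the concepts of multi-threaded programming easier to understand.

A question that always comes up when talking about POSIX threads is "Which draft of the POSIX threads standard should be used?" Because the threads standard was revised over several years, there are various implementations with different functions, different defaults, and different nuances. This tutorial was written using version 0.5 of the kernel-level LinuxThreads library on a Linux system, so programmers using other systems or other versions of pthreads should consult their system's manuals when problems arise. Some examples use blocking system calls and so will not work with user-level thread libraries (for more information, see the parallel programming theory tutorial on our web site).
That said, an effort was made to have the examples here work on systems other than Linux as well (Solaris 2.5).

What Is a Thread? Why Use Threads

A thread is similar to a process. It has its own stack and executes a given piece of code. Unlike a real process, however, it shares its memory with other threads (a process has its own memory space). A thread group is the set of all threads executing inside one process; because they share memory, they share global variables, heap memory, file descriptors, and so on. The threads of a thread group also run in parallel (that is, by time slicing, or truly in parallel if the system has several processors).

The advantage of using a thread group over an ordinary sequential program is that several things can be done at once, so events can be handled as soon as they arrive (for example, if one thread handles the user interface and another handles database queries, the program can respond to user input even while a very heavy query is being processed).

The advantage of using a thread group over a process group is that context switching between threads is much faster than context switching between processes (context switching means moving from the currently running thread or process to another thread or process). Communication between two threads is also usually faster and easier to implement than communication between two processes.

On the other hand, because all threads in a group use the same memory space, one thread that corrupts memory can affect the other threads. With processes, the operating system protects each process from the others, so one process cannot harm another in this way. Another advantage of processes is that different processes can run on different machines, whereas threads normally have to run on the same machine.

Creating And Destroying Threads

When a multi-threaded program starts executing, it has a single thread, which runs main(). This is a full-fledged thread with its own thread ID. To create a new thread, use the pthread_create() function. Here is how to use it.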

#include <stdio.h>       /* standard I/O routines */
#include <pthread.h>     /* pthread functions and data structures */

/* function to be executed by the new thread */
void*
do_loop(void* data)
{
    int i;			/* counter, to print numbers */
    int j;			/* counter, for delay        */
    int me = *((int*)data);     /* thread identifying number */

    for (i=0; i<10; i++) {
	for (j=0; j<500000; j++) /* delay loop */
	    ;
        printf("'%d' - Got '%d'\n", me, i);
    }

    /* terminate the thread */
    pthread_exit(NULL);
}

/* like any C program, execution begins in main */
int
main(int argc, char* argv[])
{
    int        thr_id;         /* return value of pthread_create() */
    pthread_t  p_thread;       /* thread's structure               */
    int        a         = 1;  /* thread 1 identifying number      */
    int        b         = 2;  /* thread 2 identifying number      */

    /* create a new thread that will execute 'do_loop()' */
    thr_id = pthread_create(&p_thread, NULL, do_loop, (void*)&a);
    if (thr_id != 0) {
        fprintf(stderr, "pthread_create failed: %d\n", thr_id);
        return 1;
    }
    /* run 'do_loop()' in the main thread as well */
    do_loop((void*)&b);
    
    /* NOT REACHED */
    return 0;
}

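A few notes about this program:

The main program is itself a thread, so it executes do_loop() in parallel with the do_loop() run by the thread it creates.
pthread_create() takes 4 parameters. The first is a pointer to a pthread_t variable in which the ID of the new thread is stored. The second is used to set attributes for the new thread; here a NULL pointer is passed so that default values are used. The third parameter is the function in which the thread starts executing, and the fourth is the argument passed to that function. The cast to 'void*' is not required by ANSI-C syntax and is there only for clarity.
The delay loop is there only to make it visible that the threads really execute in parallel. If your CPU is so fast that all the output of one thread appears before any output of the other, increase the delay value.
pthread_exit() terminates the current thread and releases the thread-specific resources it holds. There is no need to call it at the end of the thread's top function, since returning from that function terminates the thread anyway. It is useful when a thread needs to exit from somewhere in the middle of its execution.

To compile a multi-threaded program with gcc, it has to be linked with the pthread library. Assuming the library is already installed on your system, the program is compiled like this:

gcc pthread_create.c -o pthread_create -lpthread

Some of the programs that follow may need '-D_GNU_SOURCE' added to the command line in order to compile properly. Keep this in mind.

The source code for this program is in the file pthread_create.c.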

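Synchronizing Threads With Mutexes

One of the basic problems of running several threads at the same time is making sure they do not step on each other's toes, since they all use the same memory space. This section looks at the problem that arises when two threads access the same data structure.

For example, consider two threads that try to update two variables. One thread tries to set both variables to 0, and the other tries to set both to 1. If both threads do this at the same time, we might end up with one variable set to 1 and the other set to 0. This happens when a context switch (you already know what that is by now) occurs right after the first thread has set the first variable to 0: the second thread then sets both variables to 1, and when the first thread resumes it sets only the second variable to 0. The result is that the first variable is 1 and the second is 0.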

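What Is A Mutex?

The basic mechanism supplied by the pthread library to solve this problem is called a mutex. A mutex is a lock that guarantees three things (translator's note: mutex stands for MUTual EXclusion):

Atomicity - Locking a mutex is an atomic operation: if one thread locks the mutex, no other thread can succeed in locking it at the same time.
Singularity - If a thread has locked a mutex, no other thread can lock that mutex until the original thread releases it.
Non-Busy Wait - If thread A attempts to lock a mutex that is already locked by thread B, thread A is suspended (and uses no CPU resources) until B releases the lock. When B unlocks the mutex, A wakes up, locks the mutex itself and continues execution.

These three points show how a mutex can be used to ensure exclusive access to variables (or to critical sections of code in general). Here is pseudo code that updates the two variables discussed above. This is the first thread:

lock mutex 'X1'.
set first variable to '0'.
set second variable to '0'.
unlock mutex 'X1'.

And the second thread looks like this:

lock mutex 'X1'.
set first variable to '1'.
set second variable to '1'.
unlock mutex 'X1'.

Assuming both threads use the same mutex and run at the same time, both variables will end up either '0' or '1', never a mix. The programmer still has to take some care: if a third thread accesses these two variables from another part of the code without locking 'X1', the contents can still get mixed up. It is therefore wise to wrap all code that accesses these variables in a small set of functions and to access the variables only through those functions.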
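The pseudo code above could be written in C roughly as follows. This is only a sketch: the names pair_mutex (standing in for 'X1'), first_var, second_var, set_pair() and count_inconsistent_pairs() are made up for illustration, and error checking on the lock calls is omitted to keep it short.

```c
#include <pthread.h>

/* Hypothetical names: pair_mutex plays the role of 'X1' in the pseudo code. */
static pthread_mutex_t pair_mutex = PTHREAD_MUTEX_INITIALIZER;
static int first_var;
static int second_var;

/* The only function that writes the pair; both writes happen under the mutex. */
static void set_pair(int value)
{
    pthread_mutex_lock(&pair_mutex);
    first_var = value;
    second_var = value;
    pthread_mutex_unlock(&pair_mutex);
}

/* Thread body: repeatedly set both variables to the value passed in. */
static void *setter(void *arg)
{
    int value = *(int *)arg;
    for (int i = 0; i < 100000; i++)
        set_pair(value);
    return NULL;
}

/* Runs the two setter threads while sampling the pair under the mutex.
 * Returns the number of samples in which the two variables differed,
 * or -1 if a thread could not be created. */
int count_inconsistent_pairs(void)
{
    pthread_t t0, t1;
    int zero = 0, one = 1, mismatches = 0;

    if (pthread_create(&t0, NULL, setter, &zero) != 0)
        return -1;
    if (pthread_create(&t1, NULL, setter, &one) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&pair_mutex);   /* readers take the mutex too */
        if (first_var != second_var)
            mismatches++;
        pthread_mutex_unlock(&pair_mutex);
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return mismatches;
}
```

Because every reader and writer goes through the same mutex, the sampling loop should never observe a mixed pair. Removing the lock calls from set_pair() is one way to see the race described earlier, although the outcome then depends on scheduling.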

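Creating And Initializing A Mutex

To create a mutex, first declare a variable of type pthread_mutex_t and initialize it. The simplest way is to assign it the PTHREAD_MUTEX_INITIALIZER constant, as in the following code:


pthread_mutex_t a_mutex = PTHREAD_MUTEX_INITIALIZER;

One thing to note: this kind of initialization creates what is called a 'fast mutex'. This means that if a thread locks the mutex and then tries to lock it again, it simply gets stuck; that is, it deadlocks.

There is another kind, called a 'recursive mutex', which allows the thread that locked it to lock it several more times without blocking, so the deadlock described above does not occur (any other thread that tries to lock the mutex will still block). When the thread unlocks the mutex, it stays locked until it has been unlocked as many times as it was locked. This is similar to the way modern door locks work: if you turned the key twice clockwise to lock the door, you need to turn it twice counter-clockwise to unlock it. A mutex of this kind can be created by assigning the constant PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP to the mutex variable. The '_NP' suffix marks this constant as non-portable, so it may be missing on systems other than Linux.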
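Where the non-portable initializer is not available, a recursive mutex can also be set up at run time through a mutex attribute object. The following is a minimal sketch of that approach; the helper names init_recursive_mutex() and lock_twice_demo() are invented for this illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_RECURSIVE on strict compilers */
#endif
#include <pthread.h>

/* Initializes *m as a recursive mutex; returns 0 or a pthread error code. */
int init_recursive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);      /* the mutex keeps its own copy of the settings */
    return rc;
}

/* Locks the same mutex twice from one thread, then unlocks it twice.
 * Returns 0 if every locking step succeeded. */
int lock_twice_demo(void)
{
    pthread_mutex_t m;
    int rc = init_recursive_mutex(&m);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&m);            /* lock count 1 */
    if (rc == 0) {
        rc = pthread_mutex_lock(&m);        /* lock count 2, no deadlock */
        if (rc == 0)
            pthread_mutex_unlock(&m);       /* back to 1, still locked */
        pthread_mutex_unlock(&m);           /* back to 0, now free */
    }
    pthread_mutex_destroy(&m);
    return rc;
}
```

With a default ('fast') mutex, the second pthread_mutex_lock() call in lock_twice_demo() would be the point where the thread gets stuck.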

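Locking And Unlocking A Mutex

To lock a mutex, use the function pthread_mutex_lock(). It attempts to lock the mutex, and blocks the calling thread if the mutex is already locked by another thread. In that case, once the thread that holds the mutex unlocks it, the function locks the mutex and returns. Assuming the mutex has already been initialized, here is how to lock it:


int rc = pthread_mutex_lock(&a_mutex);
if (rc) { /* an error has occurred */
    /* pthread functions return the error code instead of setting errno, */
    /* so strerror(rc) is used here (needs <stdio.h> and <string.h>)     */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked - do your stuff. */
.
.

After the thread has done what it needed to (changing variables or data structures, handling a file, and so on), it should free the mutex with pthread_mutex_unlock(), like this:


rc = pthread_mutex_unlock(&a_mutex);
if (rc) {
    fprintf(stderr, "pthread_mutex_unlock: %s\n", strerror(rc));
    pthread_exit(NULL);
}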

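Destroying A Mutex

When we are done with a mutex, we should destroy it. Being done means that no thread needs the mutex any more. If only one thread has finished with the mutex, it must not destroy it, because other threads might still use it. Once all threads are certain to have finished using it, the last one can destroy it with pthread_mutex_destroy():


rc = pthread_mutex_destroy(&a_mutex);

After this call, the variable a_mutex can no longer be used as a mutex unless it is initialized again. So if one thread destroys the mutex too early and another thread then tries to lock or unlock it, that call may fail with the EINVAL error code (POSIX leaves the behavior of using a destroyed mutex undefined, so this should not be relied upon).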

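Using A Mutex - A Complete Example

Having seen the full life cycle of a mutex, let's look at an example. The program simulates two employees competing for the glorious "employee of the day" title. To simulate this at a rapid pace, it uses 3 threads: one promotes Danny to "employee of the day", the second promotes Moshe, and the third makes sure that the contents of the "employee of the day" record are consistent (that is, it contains the data of exactly one employee).
There are two versions of the program, one that uses a mutex and one that does not. Try both, look at the difference, and convince yourself that mutexes are essential in a multi-threaded environment.

The programs are supplied as files. The one that uses a mutex is employee-with-mutex.c, and the one without is employee-without-mutex.c. Read the comments in the source to get a better understanding of how they work.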

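Starvation And Deadlock Situations

Recall that pthread_mutex_lock() may block for an unbounded time if the mutex is already locked. If it stays locked forever, our poor thread is "starved": it tries to acquire a resource but never gets it. It is up to the programmer to make sure such starvation does not occur; the pthread library cannot help with that.

The pthread library can, however, help with some cases of "deadlock". A deadlock is a situation in which a set of threads are all waiting for resources taken by other threads, all in the same set. Naturally, if every thread is blocked waiting for a mutex, none of them will ever run again. In such a situation pthread_mutex_lock() can fail and return the EDEADLK error. Note that this detection is limited: POSIX specifies EDEADLK for an error-checking mutex (type PTHREAD_MUTEX_ERRORCHECK) that the calling thread has already locked, and a deadlock that involves several threads and several mutexes generally goes undetected. The programmer should check the return value where it applies, and otherwise design the locking so that deadlocks cannot occur.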
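As a small illustration of the EDEADLK case, the sketch below creates an error-checking mutex and tries to lock it twice from the same thread. The function name relock_errorcheck_mutex() is made up for this example; only the self-relock case is shown, not a deadlock between several threads.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK on strict compilers */
#endif
#include <errno.h>
#include <pthread.h>

/* Locks an error-checking mutex twice from the same thread and returns
 * the result of the second pthread_mutex_lock() call (-1 on setup failure). */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int second;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }
    second = pthread_mutex_lock(&m);   /* a default mutex could block here forever */
    pthread_mutex_unlock(&m);          /* release the single lock we actually hold */
    pthread_mutex_destroy(&m);
    return second;
}
```

The error-checking type trades a little speed for this diagnostic, which is why it tends to be used during debugging rather than everywhere.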

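Refined Synchronization - Condition Variables

The mutexes we have seen so far allow simple synchronization, namely exclusive access to a resource. Sometimes, however, real synchronization between threads is needed:

In a server, one thread reads requests from clients, interprets them, and hands them to several other threads for processing. These handler threads need to be notified when there is data to process; otherwise they should wait without consuming CPU time.
In a GUI (Graphical User Interface) application, one thread reads user input, another handles graphical output, and a third sends requests to a server and handles its replies. The server-handling thread must be able to notify the graphics thread when a reply arrives, so that it can be shown to the user immediately. The user-input thread must always respond quickly to the user, for example to let the user cancel a very long operation that the server-handling thread is currently carrying out.

In all these situations, threads need the ability to notify one another of events. This is what condition variables are for.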

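What Is A Condition Variable?

A condition variable is a mechanism that lets threads wait (without wasting CPU cycles) for some event to occur. Several threads may wait on a condition variable until another thread signals it (that is, sends a notification). At that point one of the waiting threads wakes up and acts on the event. It is also possible to wake up all threads waiting on the condition variable by broadcasting on it.

Note that a condition variable does not provide locking. A mutex is therefore used together with the condition variable to provide the locking needed when accessing it.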

Creating And Initializing A Condition Variable

To create a condition variable, declare a variable of type pthread_cond_t and initialize it properly. Initialization can be done either with the PTHREAD_COND_INITIALIZER macro or with the pthread_cond_init() function. Here is an example using the macro:

pthread_cond_t got_request = PTHREAD_COND_INITIALIZER;

This declares a condition variable named 'got_request' and initializes it.

Note: since PTHREAD_COND_INITIALIZER is actually a structure initializer, it can only be used when the condition variable is declared. To initialize a condition variable at run time, use the pthread_cond_init() function.
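Run-time initialization is typically needed when the condition variable lives inside a dynamically allocated structure. The sketch below assumes a hypothetical struct request_queue with helper functions queue_create() and queue_destroy(); none of these names come from the tutorial's source files.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical queue type used only for this illustration. */
struct request_queue {
    pthread_mutex_t mutex;
    pthread_cond_t  got_request;
    int             num_requests;
};

/* Allocates a queue and initializes its mutex and condition variable
 * at run time. Returns NULL on failure. */
struct request_queue *queue_create(void)
{
    struct request_queue *q = malloc(sizeof *q);
    if (q == NULL)
        return NULL;
    if (pthread_mutex_init(&q->mutex, NULL) != 0) {
        free(q);
        return NULL;
    }
    /* NULL attributes request the default condition variable settings */
    if (pthread_cond_init(&q->got_request, NULL) != 0) {
        pthread_mutex_destroy(&q->mutex);
        free(q);
        return NULL;
    }
    q->num_requests = 0;
    return q;
}

/* Destroys the synchronization objects and frees the queue.
 * No thread may be using the queue at this point. */
void queue_destroy(struct request_queue *q)
{
    pthread_cond_destroy(&q->got_request);
    pthread_mutex_destroy(&q->mutex);
    free(q);
}
```

Keeping the mutex and the condition variable in the same structure as the data they guard makes it harder to pair the wrong mutex with a wait call.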

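Signaling A Condition Variable

There are two ways to signal a condition variable: calling pthread_cond_signal() (to wake up only one thread waiting on the variable) or calling pthread_cond_broadcast() (to wake up all threads waiting on it). Assuming 'got_request' has been properly initialized, signaling looks like this:

int rc = pthread_cond_signal(&got_request);

Or, using the broadcast function:

int rc = pthread_cond_broadcast(&got_request);

Both functions return 0 on success and a non-zero value on failure, in which case the return value indicates the cause of the error (for example, EINVAL if the parameter is not a valid condition variable). The ENOMEM error, meaning the system has run out of memory, is specified by POSIX for pthread_cond_init() rather than for these two functions.

Note: a successful signal does not mean that any thread was woken up. If no thread was waiting on the condition variable, nothing happens (that is, the signal is lost).
Signals are not saved for later use either. If a thread starts waiting on the condition variable after the signaling function has returned, it will only be woken up by a subsequent signal.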

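Waiting On A Condition Variable

While one thread signals a condition variable, other threads usually wait for that signal, using one of two functions: pthread_cond_wait() or pthread_cond_timedwait(). Each takes a condition variable and a mutex (which must be locked before the call), unlocks the mutex, and puts the thread to sleep until the condition variable is signaled. When a signal, such as one sent by pthread_cond_signal(), wakes the thread up, the function automatically locks the mutex again and then returns.

The difference between the two functions is that pthread_cond_timedwait() is given a time limit (an absolute point in time); if it passes, the function returns the ETIMEDOUT error value to show that it returned because the time ran out and not because the condition variable was signaled. pthread_cond_wait() waits indefinitely until it is signaled.

Here is how to use the two functions. We assume that 'got_request' is a properly initialized condition variable and that 'request_mutex' is a properly initialized mutex. First, pthread_cond_wait():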


/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked - wait on the condition variable.             */
/* During the execution of pthread_cond_wait, the mutex is unlocked. */
rc = pthread_cond_wait(&got_request, &request_mutex);
if (rc == 0) { /* we were awakened, normally because the variable was signaled. */
               /* The mutex has been locked again by pthread_cond_wait().       */
    /* do your stuff... */
    .
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);

Here is an example that uses pthread_cond_timedwait():


#include <errno.h>        /* ETIMEDOUT                       */
#include <sys/time.h>     /* struct timeval, gettimeofday()  */

struct timeval  now;            /* time when we started waiting        */
struct timespec timeout;        /* timeout value for the wait function */
int             done;           /* are we done waiting?                */

/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked */

/* get current time */ 
gettimeofday(&now, NULL);
/* prepare timeout value: an absolute time, 5 seconds from now */
timeout.tv_sec = now.tv_sec + 5;
timeout.tv_nsec = now.tv_usec * 1000; /* timeval uses microseconds.        */
                                      /* timespec uses nanoseconds.        */
                                      /* 1 microsecond = 1000 nanoseconds. */

/* wait on the condition variable. */
/* we use a loop, since a Unix signal might stop the wait before the timeout */
done = 0;
while (!done) {
    /* remember that pthread_cond_timedwait() unlocks the mutex on entrance */
    rc = pthread_cond_timedwait(&got_request, &request_mutex, &timeout);
    switch(rc) {
        case 0:  /* we were awakened due to the cond. variable being signaled */
                 /* the mutex was locked again by pthread_cond_timedwait.     */
            /* do your stuff here... */
            .
            .
            done = 1;
            break;
        case ETIMEDOUT: /* our time is up */
            done = 1;
            break;
        default:        /* some error occurred (e.g. we got a Unix signal) */
            break;      /* break this switch, but re-do the while loop.    */
    }
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);

As you can see, the timed wait version is considerably more complex, so it is best wrapped in a function rather than re-coded every time it is needed.
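One possible shape for such a wrapper is sketched below. It is not taken from the tutorial's sources: the name wait_for_flag(), the millisecond timeout parameter and the use of a plain int flag as the awaited condition are all assumptions made for the example. It uses clock_gettime() with CLOCK_REALTIME, the clock that pthread_cond_timedwait() measures against by default.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes clock_gettime() on strict compilers */
#endif
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Waits until *flag becomes non-zero or timeout_ms milliseconds pass.
 * The caller must hold 'mutex', which also protects *flag.
 * Returns 0 if the flag was set, ETIMEDOUT on timeout, or another error code. */
int wait_for_flag(pthread_cond_t *cond, pthread_mutex_t *mutex,
                  const int *flag, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() expects an absolute point in time */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {   /* keep tv_nsec below one second */
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Re-check the flag after every wake-up; stop on timeout or error. */
    while (!*flag && rc == 0)
        rc = pthread_cond_timedwait(cond, mutex, &deadline);

    return *flag ? 0 : rc;
}
```

A caller would lock the mutex, call wait_for_flag(&got_request, &request_mutex, &some_flag, 5000), act on the result and unlock the mutex. Because the deadline is computed once, repeated wake-ups do not extend the total waiting time.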

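Note: When a condition variable on which two or more threads are waiting is signaled very often, one of the waiting threads may never be awakened. This is because there is no way to know which of the waiting threads is woken when the variable is signaled. A thread that has just been awakened may go back to waiting and be awakened again by the next signal, over and over. The thread that never gets woken up is said to suffer from "starvation". It is entirely up to the programmer to avoid situations in which this unwanted behavior can occur. In the server example above, however, this is not a concern: requests arrive slowly and there are many threads available to handle them, so each request is handled as soon as it arrives.

Note 2: When a condition variable is broadcast with pthread_cond_broadcast, the threads waiting on it do not all start running at the same time. Each of them tries to lock the mutex again before its wait function returns, so they run one at a time: each locks the mutex, does its work, and unlocks the mutex.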

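Destroying A Condition Variable

When we are done with a condition variable, we should destroy it, which releases any system resources it holds. This is done with pthread_cond_destroy(). For this to work, no thread may be waiting on the condition variable. Here is how to use it, again assuming 'got_request' is an initialized condition variable:


int rc = pthread_cond_destroy(&got_request);
if (rc == EBUSY) { /* some thread is still waiting on this condition variable */
    /* handle this case here... */
    .
    .
}

If some thread is still waiting on the condition variable, then, depending on the situation, there may be a flaw in how the condition variable is used, or some thread termination code may be missing. At least during debugging it is worth reporting this to the programmer: it may mean nothing, or it may be a serious defect.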

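A Real Condition For A Condition Variable

One point about condition variables must be made: they are of little use without an actual condition check associated with them. To make this clear, consider the server example again. We assumed that the 'got_request' condition variable is signaled whenever a new request arrives that needs handling; the requests themselves are kept in some request queue. If a thread is waiting when the condition variable is signaled, we can be sure it will wake up and handle the request.

But what if all threads are busy handling previous requests when a new one arrives? At that moment no thread is waiting on the condition variable, so the signal is ignored. And when the threads finish their work and go back to waiting, the ignored signal is not raised again (assuming no further requests arrive). So at least one request remains pending while all threads are blocked waiting for a signal.

To solve this, we keep the number of pending requests in an integer variable. Each thread checks this value before waiting on the condition variable; if it is positive (some request is pending), the thread handles a request instead of going to sleep. A thread that handles a request must also decrease the variable by one, to keep the count correct.
Let's see how these considerations change the code shown above.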


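/* number of pending requests, initially none */
int num_requests = 0;
.
.
/* first, lock the mutex */
int rc = pthread_mutex_lock(&request_mutex);
if (rc) { /* an error has occurred */
    fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
    pthread_exit(NULL);
}
/* mutex is now locked - wait on the condition variable   */
/* as long as there are no requests to be handled.        */
/* the loop re-checks the count after every wake-up.      */
rc = 0;
while (num_requests == 0 && rc == 0)
    rc = pthread_cond_wait(&got_request, &request_mutex);
if (num_requests > 0 && rc == 0) { /* we have a request pending */
    /* do your stuff... */
    .
    .
    /* decrease count of pending requests */
    num_requests--;
}
/* finally, unlock the mutex */
pthread_mutex_unlock(&request_mutex);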
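To see the counter and the condition variable working together, here is a small self-contained sketch with one handler thread and a producer. The functions handler() and run_request_demo() are invented for this illustration; "handling" a request is reduced to decrementing the counter, and return codes of the lock calls are not checked.

```c
#include <pthread.h>

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;
static int num_requests = 0;      /* pending requests, protected by request_mutex */

/* Handler thread: processes exactly *(int *)arg requests, then returns. */
static void *handler(void *arg)
{
    int remaining = *(int *)arg;

    pthread_mutex_lock(&request_mutex);
    while (remaining > 0) {
        /* Sleep only while there is nothing to do. Requests posted while
         * this thread was busy are still counted, so none are lost. */
        while (num_requests == 0)
            pthread_cond_wait(&got_request, &request_mutex);
        num_requests--;           /* "handle" one request */
        remaining--;
    }
    pthread_mutex_unlock(&request_mutex);
    return NULL;
}

/* Posts 'total' requests to one handler thread. Returns the number of
 * requests still pending at the end (0 is expected), or -1 on error. */
int run_request_demo(int total)
{
    pthread_t tid;
    int left;

    if (pthread_create(&tid, NULL, handler, &total) != 0)
        return -1;
    for (int i = 0; i < total; i++) {
        pthread_mutex_lock(&request_mutex);
        num_requests++;                       /* record the request first...     */
        pthread_cond_signal(&got_request);    /* ...then notify a waiting thread */
        pthread_mutex_unlock(&request_mutex);
    }
    pthread_join(tid, NULL);
    pthread_mutex_lock(&request_mutex);
    left = num_requests;
    pthread_mutex_unlock(&request_mutex);
    return left;
}
```

Many of the signals in this run are sent while the handler is not waiting and are therefore lost, yet every request is still handled because the handler consults num_requests before it sleeps.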

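Using A Condition Variable - A Complete Example

To show condition variables in actual use, here is a program that simulates the server described earlier. One thread, the receiver, gets client requests and inserts them into a linked list. The handler threads process these requests. For simplicity, the receiver thread generates the requests itself instead of reading them from real clients.

The source is available in thread-pool-server.c. It contains very detailed comments, so read the source first and then the notes below.

The 'main' function first creates the handler threads and then takes the role of the receiver thread in its main loop.
A single mutex protects both the condition variable and the linked list of waiting requests. This simplifies the overall design. As an exercise, change the example to use two mutexes.
The mutex used here MUST be a recursive mutex. To see why, look at the 'handle_requests_loop' function in the source: it first locks the mutex and then calls 'get_request', which locks the mutex again. With a non-recursive mutex, the thread would block forever the moment 'get_request' tried to lock the mutex.
You might argue that the double locking could be avoided by removing the locking from 'get_request', but that would be a flawed design: in a larger program, 'get_request' might be called from other places in the code, and each use would then need its own check that the mutex is properly locked.
In general, when using recursive mutexes, try to make sure that each lock and its matching unlock happen in the same function. Otherwise it becomes very hard to unlock the mutex as many times as it was locked, and deadlocks will follow.
The way pthread_cond_wait() unlocks the mutex internally and then locks it again may be confusing at first. The best remedy is to document this behavior in a comment in the code, so that a reader does not add a redundant lock by mistake.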

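"Private" thread data - Thread-Specific Data

Ordinary single-threaded programs sometimes need global variables. Yes, good old teachers say that using global variables is a very bad habit, but sometimes they are handy, especially static variables that are visible only within a single file.

Multi-threaded programs may need such global variables too. Note that a single variable accessible from all threads has to be protected by a mutex, which adds some overhead. Beyond that, we may need a variable that is "global" but only within a specific thread, or the same "global" variable holding a different value in each thread. For example, suppose each thread needs a linked list that is globally accessible within that thread but is not the same list as in other threads. Furthermore, the code executed by all threads must be identical. In that case, the global pointer to the head of the list has to point to a different address in each thread.

To have such a pointer, we need the same global variable to be located at a different place in memory for each thread. This is what the thread-specific data mechanism is for.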

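Overview Of Thread-Specific Data Support

The thread-specific data (TSD) mechanism is built on the notions of keys and values. Each key has a name and points to some memory area. Keys with the same name in two different threads always refer to different memory locations. This is handled by library functions that allocate the memory blocks accessed through these keys. There is a function to create a key (invoked once per key for the whole process), a function to allocate memory (invoked in each thread), a function to release that memory for a specific thread, and a function to destroy the key for the whole process. There are also functions to access the data a key points to, either setting its value or retrieving it.

Allocating Thread-Specific Data Block

The pthread_key_create() function is used to create a new key. The key is then valid in all threads of the process. When a key is created, its value is NULL in every thread. Each thread can later change its own copy of the value as it wishes. Here is how to use it: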


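/* rc is used to contain return values of pthread functions         */
int rc;
/* define a variable to hold the key                                */
pthread_key_t list_key;
/* cleanup_list is a function that cleans up the data.              */
/* It is supplied by our program, not by the TSD mechanism itself.  */
/* A destructor takes the key's value and returns nothing.          */
extern void cleanup_list(void*);

/* create the key, supplying a function to be invoked at thread exit */
rc = pthread_key_create(&list_key, cleanup_list);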

Some notes:

After pthread_key_create() returns, the variable 'list_key' refers to the newly created key.
The function pointer passed as the second parameter of pthread_key_create() is invoked by the pthread library when a thread exits, with the key's value in that thread as its argument. A NULL pointer may be passed instead, in which case no function is invoked for the key on exit. Note that the function is invoked once in each thread that exits, even though the key was created only once, in one thread.
If several keys were created, their destructor functions are invoked in an order unrelated to the order in which the keys were created.
pthread_key_create() returns 0 on success and an error code on failure.
There is a limit of PTHREAD_KEYS_MAX keys. Once this limit is exceeded, pthread_key_create() returns the EAGAIN error value.

Accessing Thread-Specific Data

After a key has been created, it is accessed with two pthread functions: pthread_getspecific() and pthread_setspecific(). The first retrieves the value of a given key, and the second sets the data of a given key. A key's value is simply a void pointer (void*), so anything can be stored in it. Here is how to use them, assuming 'a_key' is a properly initialized variable of type pthread_key_t:


/* this variable will be used to store return codes of pthread functions */
int rc;

/* define a variable into which we'll store some data; here, an integer  */
/* (malloc() and exit() need <stdlib.h>, fprintf() needs <stdio.h>)      */
int* p_num = (int*)malloc(sizeof(int));
if (!p_num) {
    fprintf(stderr, "malloc: out of memory\n");
    exit(1);
}
/* initialize our variable to some value */
(*p_num) = 4;

/* now store the pointer in our TSD key.                 */
/* note that the key holds the pointer 'p_num' itself,   */
/* not a copy of the integer it points to.               */
rc = pthread_setspecific(a_key, (void*)p_num);

.
.
/* and somewhere later in our code... */
.
.
/* get the value of key 'a_key' and print it. */
{
    int* p_keyval = (int*)pthread_getspecific(a_key);

    if (p_keyval != NULL) {
        printf("value of 'a_key' is: %d\n", *p_keyval);
    }
}

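Note that if we set the value of the key in one thread and try to read it in another thread, we get NULL, because the value is distinct for each thread.

There are two cases in which pthread_getspecific() may return NULL:

The key supplied is invalid (for example, the key has not been created).
The value of the key is NULL. This means either that it was never set, or that it was explicitly set to NULL by an earlier call to pthread_setspecific().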
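The per-thread nature of a key's value can be checked with a short experiment. In the sketch below the main thread stores a value under a key, and a second thread then looks the same key up. The names num_key, worker() and tsd_is_per_thread() are made up for the example, and free() is registered as the destructor on the assumption that worker threads only store malloc()ed memory.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t num_key;     /* hypothetical key name for this sketch */

/* Thread body: reports (through arg) whether the key started out as NULL,
 * then stores its own heap-allocated value under the key. */
static void *worker(void *arg)
{
    int *saw_null = arg;
    int *p_num;

    *saw_null = (pthread_getspecific(num_key) == NULL);

    p_num = malloc(sizeof *p_num);
    if (p_num != NULL) {
        *p_num = 7;
        if (pthread_setspecific(num_key, p_num) != 0)
            free(p_num);          /* not stored, so the destructor will not see it */
    }
    return NULL;                  /* at exit, free() is applied to this thread's value */
}

/* Returns 1 if the worker saw NULL while the main thread's value stayed
 * intact, 0 if not, and -1 on error. */
int tsd_is_per_thread(void)
{
    static int main_value = 4;
    pthread_t tid;
    int saw_null = 0, ok;

    if (pthread_key_create(&num_key, free) != 0)
        return -1;
    if (pthread_setspecific(num_key, &main_value) != 0 ||
        pthread_create(&tid, NULL, worker, &saw_null) != 0) {
        pthread_setspecific(num_key, NULL);
        pthread_key_delete(num_key);
        return -1;
    }
    pthread_join(tid, NULL);

    ok = saw_null && pthread_getspecific(num_key) == &main_value;

    /* main_value is not heap memory, so clear it before anything could free() it */
    pthread_setspecific(num_key, NULL);
    pthread_key_delete(num_key);
    return ok;
}
```

The worker's value of 7 never becomes visible to the main thread, and the main thread's pointer never becomes visible to the worker, even though both use the same key variable.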

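Deleting Thread-Specific Data Block

The pthread_key_delete() function is used to delete a key, but don't be misled by its name: it neither frees the memory associated with the key nor calls the destructor function registered when the key was created. So if this memory has to be freed at run time, you must do it yourself. Usually, though, data kept in global variables (and thread-specific data likewise) does not need to be freed until the thread terminates, and in that case the thread library invokes the destructor function.

Using the function is simple. Assuming list_key is a pthread_key_t variable that refers to a properly created key, it is called like this:

int rc = pthread_key_delete(list_key);

It returns 0 on success, or EINVAL if the supplied variable does not refer to a valid TSD key.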

A Complete Example

None yet. It needs some more thought, sorry. All I can think of right now is 'global variables are evil'. I'll try to find a good example in the future. If you have one, please let me know.

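Thread Cancellation And Termination

Having created threads, we should also think about terminating them. There are a few issues here. We need to be able to terminate threads cleanly. Unlike processes, where the rather crude mechanism of signals is used, the pthread library was designed more carefully and provides a complete system for cancelling threads, cleaning up after them, and so on. Let's look at it.

Canceling A Thread

To terminate a thread, use the pthread_cancel function. It takes a thread ID as its parameter and sends a cancellation request to that thread. What the thread does with the request depends on its state: it may act on it immediately, act on it when it reaches a cancellation point (discussed below), or ignore it completely. How to set a thread's state and define how it reacts to cancellation requests is covered later. For now, here is how to use the cancel function, assuming 'thr_id' is a variable of type pthread_t that holds the ID of a running thread:


pthread_cancel(thr_id);

pthread_cancel() returns 0 once the request has been sent (or an error code such as ESRCH if no such thread exists), so its return value does not tell us whether the thread was actually cancelled.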

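Setting Thread Cancellation State

A thread's cancellation behavior can be changed in several ways. The first is the pthread_setcancelstate() function, which determines whether the thread accepts cancellation requests at all. It takes two parameters: the new cancel state, and a pointer to a variable in which the previous cancel state is stored. Here is how it is used:


int old_cancel_state;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_cancel_state);

After this call the calling thread cannot be cancelled. To enable cancellation again:


int old_cancel_state;
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old_cancel_state);

If NULL is passed as the second parameter, the previous cancel state is simply not reported.

Similarly, the pthread_setcanceltype() function defines how a thread responds to a cancellation request, assuming it is in the 'enabled' cancel state. The possible responses are to handle the request immediately (asynchronously), or to defer it until a cancellation point is reached. This sets asynchronous cancellation:


int old_cancel_type;
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_cancel_type);

And this defers cancellation until a cancellation point:


int old_cancel_type;
pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_cancel_type);

Calling this function with NULL as the second parameter means the previous cancel type is not reported.

You might ask what happens if the cancel state and type are never set. In that case pthread_create() automatically sets the new thread to PTHREAD_CANCEL_ENABLE (cancellation requests are honored) and PTHREAD_CANCEL_DEFERRED (cancellation is deferred).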

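Cancellation Points

As we have seen, a thread may not handle a cancellation request immediately; it can defer the request until it reaches a cancellation point. So what are cancellation points?

Generally, any function that may suspend the execution of a thread for a long time can be a cancellation point. In practice this depends on the specific implementation and on how closely it conforms to the POSIX standard (and to which version of it). The following functions are cancellation points:

pthread_join()
pthread_cond_wait()
pthread_cond_timedwait()
pthread_testcancel()
sem_wait()
sigwait()

This means that when a thread executes one of these functions, it checks for a deferred cancellation request; if one is pending, the thread carries out the cancellation sequence and terminates. While none of these functions is being executed, the only other option is to call pthread_testcancel(). This function checks whether a cancellation request is pending for the current thread; if so, it carries out the cancellation, and if not, it simply returns. It is typically used in a thread that performs a long computation without ever reaching a cancellation point.

Note: In implementations that fully conform to the pthread standard, system calls that may block the process, such as read(), select(), wait() and so on, are cancellation points too. The same may hold for standard C library functions that use these system calls (the various printf functions, for example).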
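The following sketch shows deferred cancellation together with pthread_testcancel(). A worker spins in a computation that contains no blocking calls, so it polls for cancellation explicitly; the main thread cancels it and then joins it. The names busy_worker() and cancel_busy_worker() are invented for the example.

```c
#include <pthread.h>

/* Stand-in for real work; static so it outlives any early return below. */
static volatile unsigned long work_counter = 0;

/* Thread body: a long computation with no blocking calls, so it polls
 * for cancellation explicitly. New threads start with cancellation
 * enabled and deferred, so nothing needs to be configured here. */
static void *busy_worker(void *arg)
{
    (void)arg;
    for (;;) {
        work_counter++;
        pthread_testcancel();     /* a pending cancel request takes effect here */
    }
    return NULL;                  /* not reached */
}

/* Starts the worker, cancels it and joins it.
 * Returns 1 if the join reported PTHREAD_CANCELED, 0 if not, -1 on error. */
int cancel_busy_worker(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, busy_worker, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)         /* only queues the request */
        return -1;
    if (pthread_join(tid, &result) != 0)  /* wait until the thread is really gone */
        return -1;
    return result == PTHREAD_CANCELED;
}
```

Without the pthread_testcancel() call the loop would contain no cancellation point, and the join would wait forever.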

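Setting Thread Cleanup Functions

One of the features the pthread library provides is the ability of a thread to clean up after itself before it terminates. This is done through functions that are called automatically by the pthread library when the thread exits, whether of its own will (that is, it called pthread_exit()) or because it was cancelled by another thread.

Two functions are provided for this purpose. pthread_cleanup_push() adds a new cleanup function to the set of cleanup functions of the current thread. pthread_cleanup_pop() removes the function most recently added with pthread_cleanup_push(). When the thread terminates, its cleanup functions are called in the reverse order of their registration, so the one registered last is called first.

The value given as the second parameter of pthread_cleanup_push() is passed to the cleanup function as its parameter when it is called. Let's see how these functions are used. In this example they are applied to free memory that the thread allocates when it starts.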



/* first, here is the cleanup function we want to register.        */
/* it gets a pointer to the allocated memory, and simply frees it. */
void
cleanup_after_malloc(void* allocated_memory)
{
    if (allocated_memory)
        free(allocated_memory);
}

/* and here is our thread's function.                      */
/* the same function we used in the thread-pool server...  */
void*
handle_requests_loop(void* data)
{
    .
    .
    /* this variable will be used later. please read on...         */
    int old_cancel_type;

    /* allocate some memory to hold the start time of this thread. */
    /* assume MAX_TIME_LEN is a previously defined macro.          */
    char* start_time = (char*)malloc(MAX_TIME_LEN);

    /* push our cleanup handler. */
    pthread_cleanup_push(cleanup_after_malloc, (void*)start_time);
    .
    .
    /* here we start the thread's main loop, and do whatever is desired.. */
    .
    .
    .

    /* and finally, we unregister the cleanup handler. our method may seem */
    /* awkward, but please read the comments below for an explanation.     */

    /* put the thread in deferred cancellation mode.      */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_cancel_type);

    /* supplying '1' means to execute the cleanup handler before removing it. */
    /* supplying '0' would have meant not to execute it.                      */
    pthread_cleanup_pop(1);

    /* restore the thread's previous cancellation type.   */
    pthread_setcanceltype(old_cancel_type, NULL);

    return NULL;
}

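As we can see, the thread allocates some memory and registers a cleanup handler that frees it when the thread exits. Once the main loop is finished, the cleanup handler is removed. This must be done in the same function and at the same nesting level in which the handler was registered, because pthread_cleanup_push() and pthread_cleanup_pop() (translator's note: the original text reads "pthread_cleanup_pop() and pthread_cleanup_pop()", which is a typo) are actually macros that contain a '{' and a '}' respectively.

The reason for the more elaborate code around removing the cleanup function is to make sure the thread does not get cancelled in the middle of the cleanup function. The thread might be in asynchronous cancel mode, so to be safe we switch to deferred cancel mode, remove the cleanup function, and finally restore the previous cancel mode. Note that we still assume the thread cannot be cancelled inside pthread_cleanup_pop() itself, since pthread_cleanup_pop() is not a cancellation point.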
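Cleanup handlers are also a common way to release a mutex when a thread is cancelled while it holds one. The sketch below is an illustration under assumed names (unlock_mutex(), handler(), cancel_waiting_handler()): a thread is cancelled while blocked in pthread_cond_wait(), and its cleanup handler unlocks the mutex that the wait re-acquires during cancellation.

```c
#include <pthread.h>

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;
static int waiting = 0;       /* set once the handler holds the mutex */

/* Cleanup handler: releases the mutex passed as its argument. */
static void unlock_mutex(void *mutex)
{
    pthread_mutex_unlock(mutex);
}

/* Handler thread: waits for a request that never arrives. */
static void *handler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&request_mutex);
    pthread_cleanup_push(unlock_mutex, &request_mutex);
    waiting = 1;
    for (;;)
        /* Cancellation point: when cancelled here, the mutex is
         * re-acquired first and then unlock_mutex() runs. */
        pthread_cond_wait(&got_request, &request_mutex);
    pthread_cleanup_pop(1);   /* not reached; must pair with the push lexically */
    return NULL;
}

/* Cancels the waiting handler and checks that the mutex was released.
 * Returns 0 if the mutex could be locked afterwards, EBUSY if it was
 * left locked, or -1 if the thread could not be created. */
int cancel_waiting_handler(void)
{
    pthread_t tid;
    int ready = 0, rc;

    if (pthread_create(&tid, NULL, handler, NULL) != 0)
        return -1;
    while (!ready) {          /* simple polling, acceptable for a demo only */
        pthread_mutex_lock(&request_mutex);
        ready = waiting;
        pthread_mutex_unlock(&request_mutex);
    }
    pthread_cancel(tid);
    pthread_join(tid, NULL);
    rc = pthread_mutex_trylock(&request_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&request_mutex);
    return rc;
}
```

If the pthread_cleanup_push() line were removed, the cancelled thread would exit while still owning request_mutex, and every other thread that tried to lock it would be stuck.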

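Synchronizing On Threads Exiting

Sometimes a thread needs to wait for another thread to finish. This can be done with the pthread_join() function. It takes two parameters: a variable of type pthread_t denoting the thread to be joined, and the address of a void* variable into which the exit code of that thread is placed (PTHREAD_CANCELED if the joined thread was cancelled). pthread_join() suspends the calling thread until the joined thread has terminated.

For example, consider the thread-pool server seen earlier. At the end of its code there is a sleep() call that waits before the process exits. It is there because the main thread has no way of knowing whether the other threads have finished handling the pending requests. We could make the main thread loop until no requests are pending, but that would be a busy loop.

A cleaner implementation needs the following three changes:

Once all requests have been created, tell the handler threads so with a flag.
Whenever the request queue is empty, have each handler thread check whether more requests are going to be created. If not, the thread exits.
Have the main thread wait until all the threads it created have finished.

The first two changes are rather easy. We create a global variable named 'done_creating_requests' and initialize it to '0'. Each thread checks the value of this variable before it waits on the condition variable (that is, whenever the request queue is empty).
The main thread sets this variable to '1' after it has generated all the requests. It then broadcasts on the condition variable, so that any thread waiting on it is sure to re-check the 'done_creating_requests' flag.

The third change is done with a pthread_join() loop: call pthread_join() once for each handler thread. The loop ends only after all handler threads have exited, so the whole process can then be terminated safely. Without this loop, the process might be terminated while a handler thread is still in the middle of handling a request.

The modified program is available in thread-pool-server-with-join.c. The three changes can be found by searching the source for the word 'CHANGE' (in capital letters).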
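The three changes described above can be condensed into a short sketch. This is not the code of thread-pool-server-with-join.c: the request queue is reduced to a counter, and NUM_HANDLERS, handled and run_pool() are names chosen for this illustration only.

```c
#include <pthread.h>

#define NUM_HANDLERS 3

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  got_request   = PTHREAD_COND_INITIALIZER;
static int num_requests = 0;             /* pending requests            */
static int done_creating_requests = 0;   /* set to 1 by the main thread */
static int handled = 0;                  /* total requests processed    */

static void *handle_requests_loop(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&request_mutex);
    for (;;) {
        if (num_requests > 0) {
            num_requests--;              /* "handle" one request */
            handled++;
        } else if (done_creating_requests) {
            break;                       /* queue empty and nothing more will come */
        } else {
            pthread_cond_wait(&got_request, &request_mutex);
        }
    }
    pthread_mutex_unlock(&request_mutex);
    return NULL;
}

/* Creates the handlers, posts 'total' requests, announces the end and
 * joins every handler. Returns the number of requests handled. */
int run_pool(int total)
{
    pthread_t tids[NUM_HANDLERS];
    int started = 0;

    for (int i = 0; i < NUM_HANDLERS; i++)
        if (pthread_create(&tids[started], NULL, handle_requests_loop, NULL) == 0)
            started++;

    for (int i = 0; i < total; i++) {
        pthread_mutex_lock(&request_mutex);
        num_requests++;
        pthread_cond_signal(&got_request);
        pthread_mutex_unlock(&request_mutex);
    }

    pthread_mutex_lock(&request_mutex);
    done_creating_requests = 1;              /* change 1: raise the flag          */
    pthread_cond_broadcast(&got_request);    /* wake every waiter to re-check it  */
    pthread_mutex_unlock(&request_mutex);

    for (int i = 0; i < started; i++)        /* change 3: the pthread_join() loop */
        pthread_join(tids[i], NULL);

    return handled;                          /* safe to read: all handlers are gone */
}
```

The broadcast matters here: with pthread_cond_signal() only one idle handler would notice the flag, and the join loop would hang on the others.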

Detaching A Thread

We have seen how a thread is joined with pthread_join(). In fact, a thread that is in the join-able state must be joined by some other thread; otherwise the memory resources it holds are not fully released. This resembles a parent process that never reaps its child process (which is then left as a 'zombie' process).

If we want a thread to be able to exit at any time without another thread having to join it, the thread must be put in the detached state. This can be done either by passing the appropriate flag (thread attribute) to pthread_create() or by calling the pthread_detach() function. We look at the second option here.

pthread_detach() takes one parameter of type pthread_t, which denotes the thread to be put in the detached state. For example, the following code creates a thread and detaches it immediately:


pthread_t a_thread;   /* holds the new thread's ID                     */
int rc;               /* return value of pthread functions             */
extern void* thread_loop(void*); /* declare the thread's main function */

/* create the new thread */
rc = pthread_create(&a_thread, NULL, thread_loop, NULL);

/* and if that succeeded, detach the newly created thread. */
if (rc == 0) {
    rc = pthread_detach(a_thread);
}

Of course, if we want a thread to be in the detached state from the very start, the first option (setting the detached state when calling pthread_create()) is more efficient.
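The first option uses a thread attribute object. A minimal sketch follows; create_detached() and noop_thread() are hypothetical helper names, not functions from the tutorial's sources.

```c
#include <pthread.h>

/* Trivial thread body used to exercise create_detached(). */
void *noop_thread(void *arg)
{
    return arg;
}

/* Creates a thread that starts out detached.
 * Returns 0 on success or a pthread error code. */
int create_detached(pthread_t *tid, void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(tid, &attr, start, arg);
    pthread_attr_destroy(&attr);   /* the attribute object is no longer needed */
    return rc;
}
```

A thread created this way must not be passed to pthread_join() later; its resources are released automatically when it exits.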

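Threads Cancellation - A Complete Example

The next example is much larger than the previous ones. It shows how a reasonably clean multi-threaded program can be written in C. We take the thread-pool server from before and enhance it in two ways. First, the number of handler threads is adjusted to the request load: new threads are created when the request queue grows, and surplus threads are cancelled when the queue shrinks again.

Second, we fix the way the server terminates when there are no more requests to handle. Instead of the ugly sleep(), the main thread uses pthread_join() to wait until every handler thread has handled its last request and exited.

The implementation is split into 4 files:

requests_queue.c - This file contains the functions that manipulate the request queue. The add_request() and get_request() functions were moved here, together with a structure that bundles what used to be global variables: the pointer to the queue's head, the request counter, the queue's mutex and the condition variable. All manipulation of the data thus happens in a single file, and every function in it takes a pointer to a 'requests_queue' structure.
handler_thread.c - This file contains the functions executed by each handler thread: this version's 'handle_requests_loop()' function and a few local functions described below. It also defines a structure for the data passed to each thread; a pointer to this structure is passed as the thread's parameter to pthread_create(), which replaces the ugly use of global variables. The structure holds the thread's ID, a pointer to the request queue structure, and the mutex and condition variable.
handler_threads_pool.c - This file defines the thread pool abstraction. It contains functions to create a thread, to cancel a thread, and to terminate all active threads when the program exits. As with the request queue, a structure is defined to hold the pool's data. Only the main thread accesses it, so it does not need to be protected by a mutex. This saves a little mutex overhead, which, small as it is, can matter on a very busy server.
main.c - Finally, the main function that ties everything together and manages it. It creates the request queue, the thread pool and the handler threads, and then starts generating requests. After adding a request to the queue, it checks the queue size and the number of active threads and adjusts the number of threads to the queue size. It uses a simple water-mark algorithm, which, as the code shows, can easily be replaced by a more sophisticated one. The water-mark algorithm used here is simple: when the high water-mark is passed, new threads are created to empty the queue faster, and when the queue falls below the low water-mark, all threads except the original handler threads are cancelled.

After restructuring the program to make it more manageable, we applied the newly learned pthread functions as follows:

Each handler thread is created in deferred cancellation mode. This makes sure that when a thread is cancelled, it can finish handling its current request before terminating.
Each handler thread also registers a cleanup function that unlocks the mutex when the thread terminates. This works properly because a thread will most likely receive the cancellation while it is in pthread_cond_wait(), which is a cancellation point, and at that point it holds the mutex. If a thread were cancelled or terminated while holding the locked mutex, all other threads would be 'stuck' on that mutex. Registering a function that unlocks the mutex as a cleanup handler (with pthread_cleanup_push()) is therefore a reliable solution.
Finally, the main thread is set up to terminate cleanly rather than abruptly. When it is time to terminate, it calls the 'delete_handler_threads_pool()' function, which calls pthread_join for each remaining handler thread. This way, the function returns only after all handler threads have finished handling their last request.

Now go through the source code to see it all. Reading the header files first makes the overall design easier to understand. To compile, change to the thread-pool-server-changes directory and type 'gmake'.

Exercise 1: The last example program contains a small race condition at termination time. Can you see what this race is about? Can you offer a complete solution to the problem? (Hint: think about what happens to threads that are deleted using 'delete_handler_thread()'.)

Exercise 2: The water-mark algorithm we used seems to be too slow in creating new threads. Try to think of a different algorithm that shortens the average time a request stays in the queue before it is handled, and add code that measures this time. Keep experimenting until you find your "optimal pool algorithm". Note: time can be measured with the getrusage system call. Run each algorithm several times to get accurate measurements.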

Using Threads For Responsive User Interface Programming

One area in which threads can be very useful is user-interface programs. Such programs usually run a single loop that reads user input, processes it, and shows the results. If the processing takes a long time, the user is left waiting until it is over. If such long operations run in a separate thread while another thread keeps reading user input, the program becomes more responsive, and the user can cancel a long operation in the middle.

The problem is more severe in graphical programs, because the application must always be listening for messages from the windowing system that ask it to redraw its window. If it is too busy doing something else, its window stays blank, which looks rather bad. In such cases it is a good idea to have one thread run the loop that handles messages from the windowing system, so that redraw requests (and user input) are always answered. Whenever an operation might take a long time (say, more than 0.2 seconds in the worst case), run it in a separate thread.

A still better approach uses a third thread, which controls and synchronizes the user-input thread and the task-performing thread. When the user-input thread receives input, it asks the controlling thread to handle it; when the task-performing thread finishes its work, it asks the controlling thread to show the results to the user.

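User Interaction - A Complete Example

As an example, we will write a simple character-mode program that counts the number of lines in a file and lets the user cancel the counting in the middle.

The main thread creates one thread that counts the lines and then a second thread that checks for user input. After that, the main thread waits on a condition variable. Whichever thread finishes its work signals this condition variable to let the main thread know. A global variable records whether the user has requested cancellation. It is initialized to '0'; if the user-input thread receives a cancel request (the user presses 'e'), it sets this variable to '1', signals the condition variable, and terminates. The line-counting thread signals the condition variable only after it has finished its computation.

Before reading the program, a word about the system() function and the 'stty' Unix command. system() spawns a shell that executes the Unix command given as its parameter. The stty command changes terminal mode settings. We use it to switch the terminal from its default line-buffered mode to character mode (also known as raw mode), so that getchar() in the user-input thread returns as soon as the user presses a key. Otherwise the system would buffer the input until the user pressed ENTER. Finally, since character mode is not very useful once the program has terminated and the shell prompt is back, the user-input thread registers a cleanup function that restores the normal (line-buffered) terminal mode. See the stty manual page for details.

The program source is available in line-count.c. The name of the file it reads is hard-coded as 'very_large_data_file'. You can create a file with this name (make it large enough for the operation to take a while), or download our 'very_large_data_file.Z' file and uncompress it with the following command:

uncompress very_large_data_file.Z

Uncompressing produces a 'very_large_data_file' of about 5 megabytes (!), so make sure you have enough free disk space first.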

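Using 3rd-Party Libraries In A Multi-Threaded Application

One more very important point for programmers who want to add multi-threading to their programs. Because a multi-threaded program may execute the same function in several threads at once, any function that may be invoked from more than one thread at a time must be MT-safe (Multi-Thread Safe). This means that any access to data structures and other shared resources inside the function is protected by a mutex.

There are two ways to use a library that is not MT-safe in a multi-threaded program:

Use the library from a single thread only. This guarantees that no function of the library is ever executed simultaneously from two threads. The drawback is that it may constrain the overall design, and it may require extra inter-thread communication if another thread needs something done by the library.
Use a mutex to protect calls into the library. That is, a single mutex is used by every thread that calls any function of the library: lock the mutex, call the function, unlock the mutex. The drawback is that the locking is not fine-grained. Two functions of the same library that do not interfere with each other still cannot be called at the same time from different threads; the second thread is blocked on the mutex until the first thread's call has finished. Unrelated functions could be given separate mutexes, but we usually do not know how the library works internally and so cannot tell which functions belong together. Even if we do know, a new version of the library may behave differently and force us to redo the whole locking scheme.

As you can see, libraries that are not MT-safe need special attention, so whenever possible it is best to find an MT-safe library with similar functionality.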
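The second approach, one mutex around every call into the library, might look like the sketch below. The function legacy_next_id() merely stands in for a non-MT-safe library routine and is not a real library call; safe_next_id(), caller() and next_id_after_stress() are likewise names invented for the illustration.

```c
#include <pthread.h>

/* Stand-in for a non-MT-safe library call: it updates hidden static
 * state without any locking. (Hypothetical; not a real library function.) */
static int legacy_next_id(void)
{
    static int last_id = 0;
    return ++last_id;
}

/* One mutex guards every call into the "library". */
static pthread_mutex_t legacy_lib_mutex = PTHREAD_MUTEX_INITIALIZER;

/* MT-safe wrapper: lock, call, unlock. */
int safe_next_id(void)
{
    int id;
    pthread_mutex_lock(&legacy_lib_mutex);
    id = legacy_next_id();
    pthread_mutex_unlock(&legacy_lib_mutex);
    return id;
}

/* Thread body: 10000 calls through the wrapper. */
static void *caller(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        safe_next_id();
    return NULL;
}

/* Runs 4 threads of 10000 calls each and returns the next id handed out
 * afterwards, or -1 if a thread could not be created. */
int next_id_after_stress(void)
{
    pthread_t t[4];
    int started = 0;

    for (int i = 0; i < 4; i++) {
        if (pthread_create(&t[started], NULL, caller, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return started == 4 ? safe_next_id() : -1;
}
```

If no increment was lost, the id returned after the 40000 wrapped calls is 40001. Calling legacy_next_id() directly from the threads would make that result unreliable.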

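Using A Threads-Aware Debugger

One last word of caution. Debugging a multi-threaded application requires a debugger that is "aware" of threads. Most up-to-date debuggers that come with commercial development environments can handle threads. On Linux, the gdb shipped with most distributions is not thread-aware. There is a project called 'SmartGDB' that adds thread support to gdb, along with a graphical user interface (which is close to essential when debugging multi-threaded applications). However, it can only be used with multi-threaded programs built on the various user-level thread libraries; debugging LinuxThreads requires a kernel patch, which is available only for the 2.1.X kernel series. For more information, see http://hegel.ittc.ukans.edu/projects/smartgdb/. There is also information about patching kernel 2.0.32 and using gdb 4.17, which can be found on the LinuxThreads homepage.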

Side-Notes

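Water-mark algorithm: an algorithm used mostly when handling buffers or queues. Data is added to the buffer or queue until its size exceeds an upper limit (the high water mark), at which point filling stops (or draining is sped up). This state is kept until the size drops to or below a lower limit (the low water mark), and then filling resumes (or draining returns to its original speed).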
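The water-mark idea is language-neutral; as a rough illustration it could be sketched in Java as below. The class name WaterMarkQueue and its methods are hypothetical, and this sketch stops accepting items as soon as the size reaches the high mark, which is one reasonable reading of the description rather than the only one.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical helper: a queue that stops accepting items at the high water
// mark and only starts accepting again once it has drained to the low mark.
class WaterMarkQueue {
    private final Queue<Integer> items = new ArrayDeque<>();
    private final int lowMark;
    private final int highMark;
    private boolean accepting = true; // false while draining toward lowMark

    WaterMarkQueue(int lowMark, int highMark) {
        if (lowMark < 0 || lowMark >= highMark) {
            throw new IllegalArgumentException("need 0 <= lowMark < highMark");
        }
        this.lowMark = lowMark;
        this.highMark = highMark;
    }

    // Stores the item and returns true, or returns false while filling is stopped.
    synchronized boolean offer(int item) {
        if (!accepting) {
            return false;
        }
        items.add(item);
        if (items.size() >= highMark) {
            accepting = false; // high water mark reached: stop filling
        }
        return true;
    }

    // Removes one item (null if empty); filling resumes at the low water mark.
    synchronized Integer poll() {
        Integer item = items.poll();
        if (!accepting && items.size() <= lowMark) {
            accepting = true;
        }
        return item;
    }

    synchronized boolean isAccepting() {
        return accepting;
    }

    public static void main(String[] args) {
        WaterMarkQueue q = new WaterMarkQueue(1, 3);
        for (int i = 0; i < 5; i++) {
            System.out.println("offer " + i + " accepted: " + q.offer(i));
        }
    }
}
```

The gap between the two marks is what keeps the producer from being switched on and off for every single item.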


Posted by '김용환'

Java condition variable

java core 2005. 2. 18. 08:51

Authors: Scott Oaks and Henry Wong; translated by 한동훈

Original: http://www.onjava.com/pub/a/onjava/excerpt/jthreads3_ch6/index1.html

Editor's note: J2SE 5.0 brings something new: the earlier options for coordinating threads with wait() and notify() are now supplemented by classes that express new and more sophisticated strategies for threaded work. This first excerpt from Java Threads, 3rd Edition by Scott Oaks and Henry Wong covers the java.util.concurrent package.

Chapter 6. Advanced Synchronization Topics

In this chapter we look at some of the more advanced issues related to data synchronization, specifically the timing issues. When you write a Java program that uses several threads, data synchronization issues are the ones most likely to complicate the design of the program. Errors in data synchronization are also often the hardest to detect, because they depend on events happening in a specific order. Such an error is frequently masked by timing dependencies. You may notice data corruption during a normal run of your program, yet when you run it under a debugger or add debugging statements to the code, the timing of the program changes completely and the synchronization error no longer occurs.

Java Threads, 3rd Edition

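These problems cannot simply be solved. Instead, developers need to design their programs with them in mind. Developers need to understand the different threading issues: what causes them, what to look for, and which techniques help avoid them. Developers should also consider using higher-level synchronization tools, that is, tools that provide the kind of synchronization the program needs and that are known to be threadsafe. We examine both of these ideas in this chapter.

Synchronization Terms

Programmers with a background in a particular threading system tend to use that system's terms for some of the concepts discussed in this chapter, and programmers without such a background may not necessarily understand the terms used here. So here is a comparison of terms you may be familiar with and how they relate to the terms used in this chapter.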

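Barrier: A barrier is a rendezvous point for multiple threads: every thread must arrive at the barrier before any of them is allowed to proceed past it. J2SE 5.0 supplies a barrier class, and a barrier class for earlier versions of Java can be found in Appendix A.
Translator's note: In Korean, "barrier" is rendered both as 장벽 and as 배리어, and "rendezvous point" is usually just called 랑데뷰.
Condition variable: A condition variable is not actually a lock; it is a variable associated with a lock. Condition variables are often used in the context of data synchronization. They generally provide the same functionality as Java's wait-and-notify mechanism; in that mechanism, the condition variable is actually the object lock it is protecting. J2SE 5.0 also supplies explicit condition variables, and an implementation for previous versions of Java is in Appendix A. Both kinds of condition variables are discussed in Chapter 4.
Critical section: A critical section is a synchronized method or block. Critical sections do not nest the way synchronized methods or blocks do.
Translator's note: In other words, a Java synchronized method or block can be re-entered by the thread that already holds the lock, whereas a critical section in the general sense cannot be nested this way.
Event variable: Another term for a condition variable.
Lock: This term refers to the access granted to a particular thread that has entered a synchronized method or block. We say that a thread that has entered such a method or block has acquired the lock. As discussed in Chapter 3, a lock is associated with either a particular instance of an object or a particular class.
Monitor: A term used inconsistently between threading systems. In some systems a monitor is simply a lock; in others, a monitor is similar to the wait-and-notify mechanism.
Mutex: Another term for a lock. Mutexes do not nest the way synchronized methods or blocks do, and they can generally be used across processes at the operating system level.
Reader/writer locks: A lock that can be acquired by multiple threads simultaneously as long as they only read from the shared data, or by a single thread that wants to write to the shared data. J2SE 5.0 supplies a reader/writer lock class, and a similar class for previous versions of Java is in Appendix A.
Semaphores: Semaphores are used inconsistently in computer systems. Many developers use semaphores to lock objects in the same way Java locks are used. A more sophisticated use takes advantage of the counter associated with a semaphore to nest acquisitions of critical sections of code; in this usage Java locks are exactly equivalent to semaphores. Semaphores are also used to gain access to resources other than code. Semaphore classes that implement most of these features are available in J2SE 5.0.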

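Synchronization Classes Added in J2SE 5.0

While reading through the list of terms above, you may have noticed a strong pattern: starting with J2SE 5.0, almost everything listed there is included in the core Java libraries. We now take a brief look at these J2SE 5.0 classes.

Semaphore

In Java, a semaphore is basically a lock with an attached counter. It is similar to the Lock interface in that it can be used to prevent access while the lock is held; the difference is the counter.

In those terms, a semaphore with a counter of one is the same thing as a lock, except that the semaphore does not nest, whereas the lock (depending on its implementation) might.

The Semaphore class keeps track of the number of permits it can issue. It allows multiple threads to grab one or more permits; what the permits are actually used for is up to the developer. A semaphore can therefore represent the number of locks that can be granted. It can also be used to throttle the number of threads working in parallel because of resource limits such as network connections or disk space.

Let's look at the Semaphore interface: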


public class Semaphore {
   public Semaphore(int permits);
   public Semaphore(int permits, boolean fair);
   public void acquire( ) throws InterruptedException;
   public void acquireUninterruptibly( );
   public void acquire(int permits) throws InterruptedException;
   public void acquireUninterruptibly(int permits);
   public boolean tryAcquire( );
   public boolean tryAcquire(long timeout, TimeUnit unit)
                      throws InterruptedException;
   public boolean tryAcquire(int permits);
   public boolean tryAcquire(int permits,
                      long timeout, TimeUnit unit)
                      throws InterruptedException;
   public void release(int permits);
   public void release( );
   public int availablePermits( );
}

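The Semaphore interface is very similar to the Lock interface. The acquire() and release() methods, which grab and return permits, are similar to the lock() and unlock() methods of the Lock interface. The tryAcquire() methods are similar to the tryLock() methods in that they let the developer try to grab the lock or permits. These methods can also specify how long to wait if the permits are not immediately available, and the number of permits to acquire or release (the default is one).

Semaphores differ from locks in a few ways. First, the constructor requires the number of permits that can be granted. There is also a method, availablePermits(), that returns the number of permits still free. The class implements only a grant-and-release algorithm: unlike the Lock interface, no condition variables are attached to a semaphore. There is no concept of nesting; multiple acquisitions by the same thread simply take multiple permits from the semaphore.

If a semaphore is constructed with its fair flag set to true, it tries to allocate permits in the order in which the requests were made, as close to first-come-first-served as possible. The downside of this option is speed: it takes the virtual machine more time to order the acquisition of permits than to let an arbitrary thread acquire a permit.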
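As a minimal sketch of the throttling use described above, the following runs several worker threads but lets only a fixed number of them into the guarded region at a time. The class name SemaphoreThrottle, the method maxConcurrent, and the 10 ms of simulated work are all assumptions made for the illustration; only the java.util.concurrent.Semaphore calls come from the real API.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreThrottle {
    // Runs `threads` workers, allowing at most `permits` of them inside the
    // guarded region at once. Returns the highest concurrency observed.
    static int maxConcurrent(int threads, int permits) {
        final Semaphore semaphore = new Semaphore(permits, true); // fair ordering
        final AtomicInteger inside = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    semaphore.acquire(); // blocks until a permit is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // no permit was taken, so nothing to release
                }
                try {
                    int now = inside.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(10); // stand-in for work on a limited resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    semaphore.release(); // always hand the permit back
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrent(8, 3));
    }
}
```

Because a semaphore has no notion of ownership, nothing stops a thread from calling release() without a matching acquire(); pairing the two in try/finally, as here, is a convention the caller has to keep.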

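Barrier

Of all the thread synchronization tools, the barrier is probably the easiest to understand and the least used. When we think of synchronization, our first thought is of a group of threads each executing part of an overall task, followed by a point at which they must synchronize their results. The barrier is simply a waiting point where all the threads can sync up, either to merge results or to move on safely to the next part of the task. It is generally used when an application operates in phases. For example, many compilers make multiple passes, with many interim files, between loading the source and generating the executable. A barrier used in this way can make sure that all of the threads are in the same phase.

Given this simplicity, why is the barrier not more widely used? The functionality is simple enough that it can be accomplished with the low-level tools Java provides. We can solve the problem in two ways without a barrier. First, we can make the threads wait on a condition variable; the last thread to arrive releases the barrier by notifying all of the other threads. The second option is to wait for the threads to terminate by using the join() method. Once all threads have been joined, we start new threads for the next phase of the program.

However, in some cases it is preferable to use a barrier. With the join() method, the threads all exit and we start new ones. The new threads therefore lose any state stored in the previous thread objects, so that state must be saved before the threads terminate. Furthermore, if we must always create new threads, the logic cannot be kept together: since a new thread is created for each subtask, the code for each subtask has to live in a separate run() method. It may be easier to code all of the logic as one method, particularly if the subtasks are very small.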
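The first low-level alternative mentioned above (threads wait on a condition, and the last arrival notifies everyone) might look roughly like this with plain wait() and notifyAll(). SimpleBarrier and SimpleBarrierDemo are hypothetical names, and this sketch deliberately omits the timeout and broken-barrier handling that a production barrier needs.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical hand-rolled barrier built on wait()/notifyAll().
class SimpleBarrier {
    private final int parties;
    private int waiting = 0;
    private int generation = 0; // bumped each time the barrier trips

    SimpleBarrier(int parties) {
        this.parties = parties;
    }

    synchronized void await() throws InterruptedException {
        int myGeneration = generation;
        waiting++;
        if (waiting == parties) {
            // Last thread to arrive: reset for reuse and release the others.
            waiting = 0;
            generation++;
            notifyAll();
            return;
        }
        // Loop guards against spurious wakeups.
        while (myGeneration == generation) {
            wait();
        }
    }
}

class SimpleBarrierDemo {
    // Each thread finishes "phase one", meets the others at the barrier, and
    // then checks that every thread's phase-one work is visible.
    static boolean allSawPhaseOneDone(int threads) {
        SimpleBarrier barrier = new SimpleBarrier(threads);
        AtomicInteger phaseOneDone = new AtomicInteger();
        AtomicInteger sawAll = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    phaseOneDone.incrementAndGet();
                    barrier.await();
                    if (phaseOneDone.get() == threads) {
                        sawAll.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sawAll.get() == threads;
    }

    public static void main(String[] args) {
        System.out.println(allSawPhaseOneDone(4));
    }
}
```

Note that the same threads keep running across the barrier, so their local state survives from one phase to the next, which is the advantage over join() described above.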

Here is the interface of the barrier class:


public class CyclicBarrier {
   public CyclicBarrier(int parties);
   public CyclicBarrier(int parties, Runnable barrierAction);
   public int await( ) throws InterruptedException, BrokenBarrierException;
   public int await(long timeout, TimeUnit unit) throws InterruptedException,
               BrokenBarrierException, TimeoutException;
   public void reset( );
   public boolean isBroken( );
   public int getParties( );
   public int getNumberWaiting( );
}

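The core of the barrier is the await() method. This method basically behaves like a condition variable's await() method. A thread can either wait until the barrier releases it or wait with a timeout. There is no need for a signal() method, because the barrier itself performs the notification once the correct number of threads are waiting.

When the barrier is constructed, the developer must specify the number of threads (parties) that use it. This number is what triggers the barrier: the threads are all released only when the number of threads waiting at the barrier equals the number of parties specified. There is also an option to specify an action, an object that implements the Runnable interface.

When the condition for releasing the barrier is met, the run() method of the barrierAction object is called before the threads are released. This makes it possible to run code that is not threadsafe; typically it is cleanup code for the previous phase or setup code for the next phase. The last thread to reach the barrier is the thread that executes the action.

Each thread that calls the await() method gets back a unique return value. This value reflects the order in which the thread arrived at the barrier. It is needed when the individual threads have to decide how to divide up the work for the next phase of the process. The first thread to arrive gets a value one less than the number of parties; the last thread to arrive gets zero.

In normal usage the barrier is very simple. All the threads wait until the required number of threads has arrived. As soon as the last thread arrives, the action is executed, the threads are released, and the barrier can be reused. However, exception conditions can occur and cause the barrier to fail. When the barrier fails, the CyclicBarrier class breaks the barrier and releases all threads waiting in the await() method with a BrokenBarrierException. The barrier can break for several reasons: a waiting thread may be interrupted, a thread may break out of the barrier because of a timeout, or the barrierAction may throw an exception.

In every exception condition the barrier simply breaks, so the individual threads have to resolve the situation. Furthermore, the barrier cannot be reused until it has been reinitialized. That is, whatever (possibly complex) algorithm resolves the situation must also cover reinitializing the barrier. The reset() method reinitializes the barrier, but if threads are already waiting at the barrier, the reset does not go cleanly: those waiting threads are released with a BrokenBarrierException. Reinitializing a barrier is complicated enough that it may be easier to create a new one.

Finally, the CyclicBarrier class provides a few helper methods. They report the number of threads waiting at the barrier and whether the barrier has already been broken.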
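To make the description above concrete, here is a small sketch in which several threads pass through the same CyclicBarrier for a number of phases. The class BarrierPhases and its run method are hypothetical; the sketch counts how often the barrier action runs and records the arrival indexes returned by await().

```java
import java.util.Set;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierPhases {
    // Sends `parties` threads through `phases` phases separated by a barrier.
    // Arrival indexes returned by await() are added to `arrivalIndexes`.
    // Returns the number of times the barrier action ran.
    static int run(int parties, int phases, Set<Integer> arrivalIndexes) {
        AtomicInteger actionRuns = new AtomicInteger();
        // The action runs once per trip, in the last thread to arrive,
        // before any waiting thread is released.
        CyclicBarrier barrier =
                new CyclicBarrier(parties, () -> actionRuns.incrementAndGet());
        Thread[] workers = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int p = 0; p < phases; p++) {
                        // ... this thread's share of phase p would go here ...
                        int index = barrier.await(); // 0 for the last arrival
                        arrivalIndexes.add(index);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (BrokenBarrierException e) {
                    // Another party was interrupted or timed out; give up.
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return actionRuns.get();
    }

    public static void main(String[] args) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        int runs = run(4, 3, seen);
        System.out.println("action ran " + runs + " times, indexes " + seen);
    }
}
```

A thread that receives index 0 knows it was the last to arrive, so the return value can also be used to elect a single thread for per-phase work when no barrier action is supplied.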

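Countdown Latch

The countdown latch is a synchronization tool that is very similar to a barrier. In fact, it can be used instead of a barrier. It can also be used to implement functionality that some threading systems (but not Java) provide through semaphores. Like the barrier class, it lets threads wait for a condition. The difference is that the release condition is not the number of threads waiting. Instead, the threads are released when a specified count reaches zero.

The CountDownLatch class provides a method to decrement the count. The same thread can call it many times, and a thread that is not waiting can call it as well. When the count reaches zero, all waiting threads are released. It may be that no threads are waiting, or that more threads than the specified count are waiting. Any thread that tries to wait after the latch has triggered is released immediately. The latch does not reset, and later attempts to lower the count have no effect.

Here is the interface of the countdown latch: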


public class CountDownLatch {
   public CountDownLatch(int count);
   public void await( ) throws InterruptedException;
   public boolean await(long timeout, TimeUnit unit)
                  throws InterruptedException;
   public void countDown( );
   public long getCount( );
}

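The interface is very simple. The constructor specifies the initial count. The overloaded await() methods make a thread wait until the count reaches zero. The countDown() and getCount() methods control the count: countDown() decrements it and getCount() reads it. The boolean return value of the timed await() method indicates whether the latch was triggered: it returns true if the latch was released, and false if the timeout elapsed first.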
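A common use of this interface is to have one thread wait for a group of workers to finish. The sketch below assumes a hypothetical LatchDemo class and a 5-second timeout chosen only for the example; it also exercises the two properties noted above, that extra countDown() calls are harmless and that await() returns at once after the latch has triggered.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    // Starts `workers` threads, waits for all of them to count down, and
    // returns how many had finished by the time the latch released us.
    static int runWorkers(int workers) {
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                finished.incrementAndGet(); // the "work"
                done.countDown();           // called by a non-waiting thread
            }).start();
        }
        try {
            // Timed await: true if the count reached zero, false on timeout.
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
            done.countDown(); // count is already zero: no effect
            done.await();     // latch already triggered: returns immediately
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (done.getCount() != 0) {
            throw new IllegalStateException("count should stay at zero");
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished workers: " + runWorkers(5));
    }
}
```

Since the latch cannot be reset, a new CountDownLatch has to be created for every round of work; a CyclicBarrier is the better fit when the same rendezvous repeats.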

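Exchanger

The exchanger implements a synchronization tool that has no real equivalent in other threading systems. The easiest way to describe it is as a combination of a barrier and data passing. It is a barrier that lets pairs of threads rendezvous with each other; once two threads have met, they exchange data and then go their separate ways.

***
Translator's note: Readers who remember Polyphonic C#, which is part of Comega (Cω), currently being tested for the next version of C#, will know that the exchanger is not something found only in Java; it is just rare among commercial languages. Consider the following code: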


public class Buffer {
   public string Get() & public async Put( string s ) {
      return s;
   }
}

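Get() runs on one thread and Put() on another. No matter how many times Put() is called, nothing happens until Get() is called. When Get() runs once, it is paired with the earliest Put(), the body executes, and the data is handed over. The example above is just one thing that can be implemented in Polyphonic C# (reportedly to be included in C# 3.0), and Java's Exchanger class corresponds to one specific case of what Polyphonic C# can express.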
***

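Because the exchanger class is mainly used to pass data between threads, it is closer to a collection class than to a synchronization tool. It is also very specific: threads must be paired up, and a specific data type must be exchanged. Even so, the class has its advantages. Here is its interface: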


public class Exchanger<V> {
   public Exchanger( );
   public V exchange(V x) throws InterruptedException;
   public V exchange(V x, long timeout, TimeUnit unit) 
          throws InterruptedException, TimeoutException;
}

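The exchange() method is called with a data object to be exchanged with another thread. If another thread is already waiting, exchange() returns with that thread's data. If no thread is waiting, exchange() waits for one to arrive. The timeout option controls how long the calling thread is willing to wait.

Unlike the barrier class, the Exchanger class is very safe in that it does not break. It does not matter how many threads use it, because they are paired up as they come in. A timeout or interrupt simply raises an exception in the affected thread; the exchanger keeps pairing up the remaining threads around the exception.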
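The pairing behaviour can be seen in a two-thread sketch such as the following, where each side hands over a string and receives the other side's string. ExchangerDemo and swap are hypothetical names used only for this illustration.

```java
import java.util.concurrent.Exchanger;

class ExchangerDemo {
    // Two threads meet at the exchanger and swap strings.
    // Returns { what thread A received, what thread B received }.
    static String[] swap(String fromA, String fromB) {
        Exchanger<String> exchanger = new Exchanger<>();
        String[] received = new String[2];
        Thread b = new Thread(() -> {
            try {
                received[1] = exchanger.exchange(fromB); // waits for a partner
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        b.start();
        try {
            received[0] = exchanger.exchange(fromA); // pairs up with thread b
            b.join(); // makes b's write to received[1] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        String[] r = swap("ping", "pong");
        System.out.println("A received " + r[0] + ", B received " + r[1]);
    }
}
```

In practice the exchanged objects are often buffers: one thread fills a buffer while the other drains one, and they swap when both are done.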

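Reader/Writer Locks

Sometimes you need to read information from an object in an operation that takes a fairly long time. You need a lock so that the information you read does not change underneath you, but you do not need to prevent other threads from reading it at the same time. When all the threads are only reading, they do not affect each other, so there is no reason to keep them from running.

In fact, the only time data needs to be locked is when it is being changed, that is, when it is being written. Changing the data introduces the possibility that a thread reading it sees it in an inconsistent state. Until now we have been content with a lock that allows only a single thread to access the data, whether that thread is reading or writing, on the theory that the lock is held only for a very short time.

If the lock has to be held for a long time, it makes sense to consider letting multiple threads read the data simultaneously, so that these threads do not have to compete with each other for the lock. Of course, we must still allow only a single thread to write the data, and no reader may be active while the data is being written; readers among themselves need no such restriction.

In J2SE 5.0, this type of lock is provided by the following interface and class: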


public interface ReadWriteLock {
   Lock readLock( );
   Lock writeLock( );
}

public class ReentrantReadWriteLock implements ReadWriteLock {
   public ReentrantReadWriteLock( );
   public ReentrantReadWriteLock(boolean fair);
   public Lock writeLock( );
   public Lock readLock( );
}

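A reader-writer lock is created by instantiating the ReentrantReadWriteLock class. As with the ReentrantLock class, there is an option to hand out the lock in a "fair" fashion. "Fair" here means that the lock is granted as close to first-come-first-served as possible. When the lock is released, the next set of readers, or the next writer, is granted the lock based on arrival time.

Usage of the lock is predictable. Reader threads acquire the read lock, while writer threads acquire the write lock. Both locks are objects that implement the Lock interface. One major difference, however, is that the two locks support condition variables differently. Calling newCondition() on the write lock returns a condition variable associated with it; calling newCondition() on the read lock throws an UnsupportedOperationException.

These locks also nest, which means that the owner of a lock can acquire it repeatedly as necessary. This allows callbacks and other complex algorithms to execute safely. Furthermore, a thread that holds the write lock can also acquire the read lock. The reverse is not true: a thread that holds the read lock cannot acquire the write lock, so upgrading the lock is not allowed. Downgrading the lock, however, is allowed. This is done by acquiring the read lock before releasing the write lock.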
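The rules in the paragraphs above (shared reads, downgrading allowed, upgrading not allowed, no conditions on the read lock) can be checked with a short sketch. CachedValue and its method names are hypothetical; the locking calls are those of ReentrantReadWriteLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder for a value guarded by a reader-writer lock.
class CachedValue {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private int value;

    // Many threads may hold the read lock at the same time.
    int read() {
        rw.readLock().lock();
        try {
            return value;
        } finally {
            rw.readLock().unlock();
        }
    }

    // Writes, then downgrades: the read lock is taken before the write lock
    // is released, so no other writer can slip in between.
    int writeThenRead(int newValue) {
        rw.writeLock().lock();
        try {
            value = newValue;
            rw.readLock().lock(); // allowed while holding the write lock
        } finally {
            rw.writeLock().unlock(); // now holding only the read lock
        }
        try {
            return value;
        } finally {
            rw.readLock().unlock();
        }
    }

    // Upgrading is not allowed: with the read lock held, the write lock
    // cannot be obtained, so tryLock() reports failure.
    boolean canUpgrade() {
        rw.readLock().lock();
        try {
            boolean got = rw.writeLock().tryLock();
            if (got) {
                rw.writeLock().unlock();
            }
            return got;
        } finally {
            rw.readLock().unlock();
        }
    }

    // Only the write lock supports newCondition().
    boolean readLockSupportsConditions() {
        try {
            rw.readLock().newCondition();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        CachedValue c = new CachedValue();
        System.out.println("downgraded read: " + c.writeThenRead(42));
        System.out.println("upgrade possible: " + c.canUpgrade());
    }
}
```

The sketch uses tryLock() for the upgrade check on purpose: calling writeLock().lock() while holding the read lock would block forever.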

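Later in this chapter we look at lock starvation in depth. Reader-writer locks raise special issues in that regard.

In this section we have examined the higher-level synchronization tools provided by J2SE 5.0. In earlier versions of Java, this functionality had to be implemented by hand with the basic tools Java provides, or obtained from third-party libraries. These classes do not offer anything that could not be accomplished in the past, and they are written entirely in Java. In that sense they are convenience classes: they are designed to make development easier and to let applications be developed at a higher level.

There is also a lot of overlap between these classes. A Semaphore can partially simulate a Lock simply by being declared with one permit. The write lock of a reader-writer lock is practically a mutually exclusive lock (mutex). A semaphore can also be used to simulate a reader-writer lock with a limited number of readers, by having each reader acquire one permit while a writer acquires all of them. A countdown latch can be used like a barrier by having each thread decrement the count before waiting.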
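As one example of this overlap, a reader-writer lock with a bounded number of readers can be approximated with a single Semaphore: each reader takes one permit and a writer takes all of them. SemaphoreRwLock and its methods are hypothetical names, and the sketch is an illustration of the idea rather than a replacement for ReentrantReadWriteLock.

```java
import java.util.concurrent.Semaphore;

// Hypothetical reader-writer lock simulated with one semaphore.
class SemaphoreRwLock {
    private final int maxReaders;
    private final Semaphore permits;

    SemaphoreRwLock(int maxReaders) {
        this.maxReaders = maxReaders;
        // Fair ordering keeps a writer waiting for all permits from being
        // overtaken indefinitely by blocking readers that arrive later.
        this.permits = new Semaphore(maxReaders, true);
    }

    void lockRead() {
        permits.acquireUninterruptibly(); // one permit per reader
    }

    void unlockRead() {
        permits.release();
    }

    void lockWrite() {
        permits.acquireUninterruptibly(maxReaders); // all permits: exclusive
    }

    void unlockWrite() {
        permits.release(maxReaders);
    }

    // Non-blocking variants, handy for tests and diagnostics.
    boolean tryLockRead() {
        return permits.tryAcquire();
    }

    boolean tryLockWrite() {
        return permits.tryAcquire(maxReaders);
    }

    public static void main(String[] args) {
        SemaphoreRwLock lock = new SemaphoreRwLock(3);
        lock.lockRead();
        System.out.println("writer can enter while reading: " + lock.tryLockWrite());
        lock.unlockRead();
    }
}
```

Unlike ReentrantReadWriteLock, this version has no ownership and does not nest: a thread that calls lockWrite() twice deadlocks itself, and any thread can release permits it never acquired.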

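The major advantage of using these classes is that they take threading and data synchronization issues off the developer's hands. Developers can design their programs at as high a level as possible without worrying about low-level threading issues. The possibility of deadlock, lock and CPU starvation, and other very complex problems is mitigated somewhat. Using these libraries, however, does not remove the developer's responsibility for these problems entirely.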


Posted by '김용환'
