I Have No Idea What This Java Code Does (Concurrency)


Kraetos
2009-03-01, 20:18
So my professor isn't the greatest commenter in the world, and there's this class on the CVS server, and I have no idea what it does. (It's just 40 lines; I imagine it's not too complicated if you're a more experienced programmer than I am. The comments are mine, not the professor's.)


public class Buffer
{
    private final static int BUFFSZ = 10;
    private int buffer[] = new int[BUFFSZ];

    private int pIn = 0, pOut = 0, nItems = 0, value;

    public synchronized int read()
    {
        while (nItems == 0) // wait while the buffer is empty
        {
            try
            {
                wait();
            } catch (InterruptedException e) {}
        }
        value = buffer[pOut];
        pOut = (pOut + 1) % BUFFSZ;
        System.out.println(" ~~~ pOut = " + pOut); // for debugging
        nItems--;
        System.out.println(" ~~~ After Read, nItems = " + nItems); // for debugging
        notifyAll();
        return value;
    }

    public synchronized void write(int value)
    {
        while (nItems == BUFFSZ) // wait while the buffer is full
        {
            try
            {
                wait();
            } catch (InterruptedException e) {}
        }
        buffer[pIn] = value;
        pIn = (pIn + 1) % BUFFSZ;
        System.out.println(" ~~~ pIn = " + pIn); // for debugging
        nItems++;
        System.out.println(" ~~~ After Write, nItems = " + nItems); // for debugging
        notifyAll();
    }
}


No idea might be a little extreme. I know that it's preventing four other threads - two readers, two writers - from interfering with each other. The consumers/producers sleep for a random amount of time between 0 and 1 second after each read/write.

I think those wait() calls are necessary so they don't mess up the item count (nItems) and potentially cause the read()/write() methods to interfere with each other.

I think that the loops in the two synchronized methods are designed to wait when they hit the bound of the buffer. i.e. when the buffer is empty, read() waits until notified. Conversely, if the buffer is full, write() will wait to avoid overflowing the array.

What's the deal with this line in the read() method:


pOut = (pOut + 1) % BUFFSZ;


and its counterpart in the write() method? I don't understand why/how it changes the value of pOut, and why it doesn't cause problems if the other reader tries to read from the same index. (Or if the other writer tries to write to the same index.)

I can post the producer/consumer/main classes if they'll help, but the buffer is the important one. The other three are pretty normal.
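
Roughly, they look something like this (a sketch from memory, not the exact code - class names and details may not match the real files):


class Producer extends Thread
{
    private final Buffer buffer;
    Producer(Buffer buffer) { this.buffer = buffer; }

    public void run()
    {
        for (int i = 0; ; i++) {
            buffer.write(i);                                 // blocks while the buffer is full
            try {
                Thread.sleep((long) (Math.random() * 1000)); // sleep 0-1 s after each write
            } catch (InterruptedException e) { return; }
        }
    }
}

class Consumer extends Thread
{
    private final Buffer buffer;
    Consumer(Buffer buffer) { this.buffer = buffer; }

    public void run()
    {
        while (true) {
            int value = buffer.read();                       // blocks while the buffer is empty
            System.out.println("consumed " + value);
            try {
                Thread.sleep((long) (Math.random() * 1000)); // sleep 0-1 s after each read
            } catch (InterruptedException e) { return; }
        }
    }
}

public class Main
{
    public static void main(String[] args)
    {
        Buffer shared = new Buffer();
        new Producer(shared).start();   // two writers
        new Producer(shared).start();
        new Consumer(shared).start();   // two readers
        new Consumer(shared).start();
    }
}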

ShadowOfGed
2009-03-01, 20:33
That's basically using the array as a circular list.

This lets you continuously loop through elements [0, BUFFSZ-1] on both the read and write end in O(1) time. The modulus operator (%) just makes it so that if (pOut + 1) is beyond the end of the array, the "next" element becomes 0 instead.
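
To make it concrete, here's what happens as pOut walks off the end (a tiny sketch, using the BUFFSZ = 10 from the class above):


public class WrapDemo
{
    public static void main(String[] args)
    {
        final int BUFFSZ = 10;           // same size as the Buffer class above
        int pOut = 8;
        pOut = (pOut + 1) % BUFFSZ;      // 9 % 10  -> 9, still inside the array
        System.out.println(pOut);        // prints 9
        pOut = (pOut + 1) % BUFFSZ;      // 10 % 10 -> 0, wraps back to the first slot
        System.out.println(pOut);        // prints 0
    }
}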

Kraetos
2009-03-01, 21:16
The modulus operator (%) just makes it so that if (pOut + 1) is beyond the end of the array, the "next" element becomes 0 instead.

:o Obviously. Not sure why I didn't get that. Thanks :)

I suppose the larger question is: is there anything more complicated going on here w/r/t keeping the reader/writer objects from interfering with each other than my analysis above? (The class is a concurrent programming class.)

adamb
2009-03-02, 05:09
This is an example of using a Java monitor to provide safe concurrent access to, as mentioned before, a circular buffer.

The wait() suspends (or sleeps, I'm not entirely sure which) that thread of execution until the thread is woken by a notify() or notifyAll() call. E.g. if you are attempting to read but there are no items to read (nItems == 0), that thread will suspend until an item is written to the buffer and write() calls notifyAll(). This wakes your suspended thread that had called read(), and it can exit the while loop and execute the useful code.

The wait() is inside a while loop for safety. The while condition obviously determines when the thread should wait, but when notify is called it may not be this specific thread that was meant to be woken, so the condition may still be false. Alternatively, because Java uses the notify-and-continue method of waking threads, it's possible that the condition was true when the thread was woken, but was set back to false between the time the thread was told to wake and the time it actually begins executing. In the example above, that could be multiple readers waiting on an item to be written to the buffer. Basically: use a while loop around your wait(), as above, and not an if().
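
A minimal, self-contained sketch of the pattern (class and field names are mine, purely for illustration):


public class Latch
{
    private boolean ready = false;

    public synchronized void awaitReady()
    {
        while (!ready) {                        // re-test after every wakeup, because:
            try {                               //  - notifyAll() wakes every waiter, not just "us"
                wait();                         //  - another thread may have consumed the condition first
            } catch (InterruptedException e) {} //  - wait() can even return spuriously
        }
        // here we hold the lock AND the condition is known to be true
    }

    public synchronized void setReady()
    {
        ready = true;
        notifyAll();                            // each woken thread re-tests its while condition
    }
}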

Hope that helps

Enki
2009-03-02, 22:47
The above posts are correct. There are subtleties, though, which aren't covered.

First, mod is a poor choice for resetting the buffer index. Mod uses repeated division to generate the remainder, and while that is O(1) with respect to the number of items in the buffer, the time it spends in the mod circuitry when the divisor is not a clean power of two is far greater than the very small O(1) time a simple conditional and assignment takes.

if (pOut >= BUFFSZ) {
    pOut = 0;
}

That takes two short machine instructions. The pOut = 0; assignment is already in both versions; the mod version just takes longer to get the answer ready for the assignment. And if it isn't time to reset the buffer, pOut is left alone, which is faster still.

The wait() statements aren't really monitors. Java's synchronized is monitor-like, not a monitor. This means there is a thread-scheduling free-for-all every time notify() or notifyAll() is called, which is the reason that hackish while loop is needed. Frankly, this is a solution prone to starvation at worst or very sporadic performance at best. Far better is to use the Monitor classes provided in chapter four of High-Performance Java Platform Computing (http://code.google.com/p/hpjpc/downloads/detail?name=HPJPC%20Christopher%20and%20Thiruvathukal.pdf&can=2&q=); the code is provided here (http://hpjpc.googlecode.com/files/hpjpc-2007-11-07.zip). The book is out of print, but the authors have posted the PDF at the first link along with the examples and code; the second link is just the classes, not all the examples.
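
If you can't pull in those classes, the standard java.util.concurrent.locks package gets you most of the way there: explicit Condition objects give readers and writers separate wait queues instead of one notifyAll() free-for-all. A rough sketch of the same buffer in that style (this is the built-in API, not the HPJPC Monitor classes):


import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class LockBuffer
{
    private static final int BUFFSZ = 10;
    private final int[] buffer = new int[BUFFSZ];
    private int pIn = 0, pOut = 0, nItems = 0;

    private final ReentrantLock lock = new ReentrantLock(true); // 'true' = fair ordering, which also addresses starvation
    private final Condition notEmpty = lock.newCondition();     // readers wait here
    private final Condition notFull  = lock.newCondition();     // writers wait here

    public int read() throws InterruptedException
    {
        lock.lock();
        try {
            while (nItems == 0)
                notEmpty.await();            // only readers queue on this condition
            int value = buffer[pOut];
            pOut = (pOut + 1) % BUFFSZ;
            nItems--;
            notFull.signal();                // wake exactly one waiting writer
            return value;
        } finally {
            lock.unlock();
        }
    }

    public void write(int value) throws InterruptedException
    {
        lock.lock();
        try {
            while (nItems == BUFFSZ)
                notFull.await();             // only writers queue on this condition
            buffer[pIn] = value;
            pIn = (pIn + 1) % BUFFSZ;
            nItems++;
            notEmpty.signal();               // wake exactly one waiting reader
        } finally {
            lock.unlock();
        }
    }
}


(In practice, java.util.concurrent.ArrayBlockingQueue already implements exactly this, fairness option included.)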

I use those in my OS course; we cover the differences, code them, and then compare the results for correctness, ordering and performance. The rest is style, and I don't particularly like this style -- I happen to think good style comes from clear thought, and messy code very easily flows from unclear thought. It also shows a lack of attention to detail and sloppiness: quick and dirty in the worst sense of the words.

My advice is: don't write code like this professor. If code is well written and has a reasonable style, it tells its own story with a minimum of comments. You write code for humans to read, not computers to execute -- unless you are programming in assembler or machine code. This code 'looks' like it was written for a machine to understand, not a person, and I have found that type of 'look' to be a universal bad sign. We also need to give students code examples that show care and thought for what is being done, not the shortest number of lines we can get away with.

ShadowOfGed
2009-03-03, 00:17
First, mod is a poor choice for resetting the buffer index. Mod uses repeated division to generate the remainder, and while that is O(1) with respect to the number of items in the buffer, the time it spends in the mod circuitry when the divisor is not a clean power of two is far greater than the very small O(1) time a simple conditional and assignment takes.

On some architectures (x86 included), integer modulus and division are actually the same instruction; it's just a matter of which of the result(s) you consume. If you're compiling directly to the hardware, your tradeoffs are weighed against things like:

How often would my branch be mis-predicted? (Dictated by BUFFSZ.)
What is the penalty for said mis-prediction?
How tight is my loop?

But this is Java, so you're not even that close to the hardware. Also, most optimizing compilers will see the modulus operator and choose the best implementation for you. If the divisor is a compile-time constant, powers of two (and other known constants) will be replaced by more efficient arithmetic. If it's not a compile-time constant, the compiler can determine whether a conditional or a modulus is more efficient.

In either case, you've only written one piece of code that says (very concisely) what it does.

One of the better pieces of advice I've heard: avoid premature optimization. In cases like this, where the compiler can determine the best choice, you don't want to force less optimal code by "optimizing" it yourself.
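
To illustrate the constant power-of-two case, here's a toy sketch (note that BUFFSZ above is 10, where the compiler would typically fall back on a multiply/shift reduction instead, but the idea is the same):


public class ModDemo
{
    static final int SIZE = 16;                     // a power-of-two size (unlike BUFFSZ = 10 above)

    static int next(int index)       { return (index + 1) % SIZE; }         // what you write
    static int nextMasked(int index) { return (index + 1) & (SIZE - 1); }   // what the compiler/JIT can reduce it to

    public static void main(String[] args)
    {
        for (int i = 0; i < 40; i++) {
            if (next(i) != nextMasked(i)) {
                throw new AssertionError("mismatch at " + i);   // never fires for non-negative indices
            }
        }
        System.out.println(next(15) + " == " + nextMasked(15)); // both wrap to 0
    }
}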

Enki
2009-03-03, 16:58
This isn't premature optimization. This is the choice of an algorithm that guarantees the shortest execution time across all architectures. Choice of algorithm IS what Hoare?/Knuth?/Dijkstra? advocated, not obtuse "I think I can beat the compiler" optimizations. What I think you are missing is that I am not advocating picking some clever assembler routine over another to do the same task. I advocate picking an algorithmically different approach that has provably superior performance in all but the single most trivial case, where performance is equivalent.

A simple ALU bitwise comparison between a literal and a value is a bit-parallel operation completed in a single pipeline step in every ALU I have run across. Repeated division and/or modulo nearly always takes multiple pipeline steps, except for trivially small powers of 2 (like 2 mod 2 or 8 mod 2), where the number of shifts is very small.

A more important consideration is, as you mention, conciseness of code. It is impossible to beat this:

if (index >= BUFFER_SIZE) {
    index = 0;
}
I would be shocked to find anyone who doesn't recognize EXACTLY what that does in a buffer's code. Meanwhile, a third to half of my OS class students usually don't get that a clock is equivalent to a modulo-12 buffer until I explicitly explain it to them. Being cute and overly concise, to the point where you need to be "in the know" to understand some bit of code, just plain isn't helpful. So, with a big win in conciseness and provably better performance, why would using modulo or repeated division be better for this particular task?
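
For what it's worth, the clock analogy is literally the same arithmetic the buffer index uses; a tiny illustration (writing 12 o'clock as 0):


public class ClockDemo
{
    public static void main(String[] args)
    {
        int hour = 10;                 // 10 o'clock
        int later = (hour + 4) % 12;   // four hours later: (10 + 4) % 12 == 2
        System.out.println(later);     // prints 2, the same wrap-around as (pOut + 1) % BUFFSZ
    }
}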