{ by david linsin }

May 13, 2009

Handling Concurrency in Domain Models

I'm working on a project right now that has a rich domain model, which we plan to use in different environments. One group of consumers are mobile clients, the others are highly concurrent servers. We try to cater to both environments in terms of parallelization, as well as keeping the domain model free of technical details.

When I design a domain model, I tend to keep it as technology independent as possible, that's probably the reason why I'm not the biggest fan of Annotations. You kind of lock yourself into a certain type of technology and you probably introduce dependencies on 3rd party code. Overall I think you are polluting your core classes with details that don't belong there.

One of those technical aspects I'm referring to is concurrency. You can either choose to prepare your domain model for a concurrent environment or leave it to the consumer of the core classes. Take a look at the following class as an example:

public class Foo {
private Map<Long,Bar> bars;

public Foo() {
bars = new HashMap<Long,Bar>;
}

public void addBar(Long argId, Bar argBar) {
bars.put(argId, argBar);
}

public Map<Long,Bar> getBars() {
return bars;
}
}


Obviously, this class is not meant to be used in multithreaded code. There are a couple of issues, which could lead to an unpredictable state of the Collection bar. Here's another version:

public class Foo {
private final Map<Long,Bar> bars;

public Foo() {
bars = new HashMap<Long,Bar>;
}

public void addBar(Long argId, Bar argBar) {
bars.put(argId, argBar);
}

/**
* @return immutable Map implementation
*/
public Map<Long,Bar> getBars() {
return Collections.immutableMap(bars);
}
}


As you can see there are a few easy things, you can do to prepare this class for usage in a parallelized environment. In fact, I think you should always do them, even if your code won't run in multiple threads. However, this code is not thread-safe either and you need to jump through a couple of more hoops until it is:

public class Foo {
private final Map<Long,Bar> bars;

public Foo() {
bars = new HashMap<Long,Bar>;
}

public synchronized void addBar(Long argId, Bar argBar) {
bars.put(argId, argBar);
}

/**
* @return immutable Map implementation
*/
public synchronized Map<Long,Bar> getBars() {
return Collections.unmodifiableMap(bars);
}
}


I think this version gets pretty close and you could call it thread-safe. I say close, because a) I'm not Brian Goetz and b) we don't know more about Bar. However, we can be sure, that there shouldn't be any problems with the Collection bars. You can add new Bar instances and retrieve them concurrently without affecting the Collection itself in any way.

The price we paid for this is clarity. The code gets a bit more complicated. I know there are other ways to make the Collection bar safely concurrent modifiable, e.g. using a ConcurrentHashMap. The point is, that if you have a complex and fairly large domain class, making it thread-safe would mean littering it with all those technical details, mixing responsibilities and making it way more complicated to maintain.

So where is the right place to handle concurrency?

While researching this topic, I found a great discussion on stackoverflow, where Alex Miller suggests the following:

Start from the data. Decide which data is explicitly shared and protect it. If at all possible, encapsulate the locking with the data. Use pre-existing thread-safe concurrent collections.

Whenever possible, use immutable objects. Make attributes final, set their values in the constructors. If you need to "change" the data consider returning a new instance. Immutable objects don't need locking.

For objects that are not shared or thread-confined, do not spend time making them thread-safe.


This doesn't answer my question directly, but it suggest to handle concurrency on the level of data. That means using the java.util.concurrent classes in your domain model and only for data which is shared between threads.

So what I take away from all of this, is that, it isn't really a question of where to handle concurrency, but rather when. I think if you can minimize the technical pollution of your domain model to the data that needs it, you can contain the damage of clarity. With the java.util.concurrent Collections you have an abstraction that hides the concurrency aspect quite nicely. I guess the combination of both is the way to go: Only optimize data that needs it, with a good abstraction, that hides the technical details.

10 comments:

Moritz Post said...

Good thinking. Although some people properbly know it, the ussage of Collection.immutable*(*) does actually not create a new *. It creates a wrapper for the given *. So it does not copy the content and you have to be aware that by adding new items through the class you also modify the gotten *.

(Written on the Asus 1000HE ;) )

Alex Miller said...

Hey good advice! But I may be biased. :)

I think I would use a ConcurrentHashMap here by default, *unless* there were methods that required updating bars along with some other attribute.

That would help avoid the concurrency problem you've created in the final version. (You've got immutableMap but there is no such method - I assume you mean unmodifiableMap.)

The unmodifiableMap just prevents you from calling through to the original map with methods that modify state. But the bug is that you did not specify any synchronization on reading the value. Wrapping it doesn't alleviate the need to supply synchronization on both write and read. Without the synchronization on read, the get is never guaranteed to see puts from another thread.

I blogged some more info on why this is particularly bad for HashMap here.

And that's why I default to using concurrent collections with encapsulated synchronization policies instead of doing it outside the data structure. :)

Alex Miller said...

Oops, that link to dzone was totally an accident: direct link.

David Linsin said...

Thanks for the hints Alex, I fixed the stuff.

I think I would use a ConcurrentHashMap here by default, *unless* there were methods that required updating bars along with some other attribute.That was my final conclusion as well. I just wanted to highlight how ugly the synchronized stuff can make your domain code.

And that's why I default to using concurrent collections with encapsulated synchronization policies instead of doing it outside the data structure.That's the statement I was looking for in the first place! Thanks!

David Linsin said...

Thinking about it now, I guess declaring "bars" volatile , should have done the job as well...I should be able to omit the synchronized block.

Khalil said...

I believe the modifier to the bars field should be private not public, otherwise it would negate adding synchronized. Volatile on its own won't be sufficient, volatile deals with visibility, and declaring the bars field as final would give bars the same visibility garenties. However what you want is exclusive thread access in add and get operations, and for that you would need locking.

David Linsin said...

Thanks Khalil, didn't see that one either!

Toko Perlengkapan Bayi said...

Thanks for taking this opportunity to discuss this, I feel strongly about it and I take pleasure in learning about this topic. If possible, as you gain information, please add to this blog with new information. I have found it extremely useful.

jakarta hotel said...

I often do not comment on blogs but your blog has such a method and writing model that I have no choices but to remark here. Nice submit, keep it up.

Anonymous said...

I'm always terrified I'm going to smash bottle of perfume in my bag, so I reckon a travelo would be a good investment! I also love how your to-do list book has become a blogging notebook! xxx


com_channels

  • mail(dlinsin@gmail.com)
  • jabber(dlinsin@gmail.com)
  • skype(dlinsin)

recent_postings

loading...