Sunday, December 9, 2007

Search Engine and Ranking

Crawling

Start from an HTML page, save the file, then crawl its links

def collect_doc(doc)
-- save doc
-- for each link in doc
---- collect_doc(link.href)
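
To make the pseudocode concrete, here is a minimal Java sketch of the same crawl loop. It is only an illustration: fetch(), save() and extractLinks() are assumed helper stubs (not any particular library), and a real crawler would also need URL normalization, politeness delays and robots.txt handling.

import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; the helper methods are stubs, not a real library API.
public class SimpleCrawler {
  private final Set<String> visited = new HashSet<String>();

  public void collectDoc(String url) {
    if (!visited.add(url)) {
      return;                              // already crawled; avoid cycles
    }
    String html = fetch(url);              // download the page
    save(url, html);                       // persist the raw document
    for (String link : extractLinks(html)) {
      collectDoc(link);                    // recurse into outgoing links
    }
  }

  private String fetch(String url) { return ""; }                                    // HTTP GET, omitted
  private void save(String url, String html) { }                                     // write to disk, omitted
  private List<String> extractLinks(String html) { return Collections.<String>emptyList(); }  // parse <a href>, omitted
}

In practice the recursion would be replaced by an explicit work queue so that a deep link graph doesn't blow the stack.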

Build Word Index

Build an inverted index: a map from each word to the documents (and word positions) that contain it

{
word1 => {doc1 => [pos1, pos2, pos3], doc2 => [pos4]}
word2 => {...}
}

def build_index(doc)
-- for each word in doc.extract_words
---- index[word][doc] << word.position

Search

Given a set of words, locate the relevant docs

A simple example ...

def search(query)
-- Find the most selective word within query
-- for each doc in index[most_selective_word].keys
---- if doc.has_all_words(query)
------ result << doc
-- return result
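
A compact Java sketch of the inverted index and the search routine described above; the class and method names are illustrative, not from any particular library.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: word -> (doc -> list of positions), plus a simple AND search.
public class InvertedIndex {
  private final Map<String, Map<String, List<Integer>>> index =
      new HashMap<String, Map<String, List<Integer>>>();

  public void buildIndex(String docId, List<String> words) {
    for (int pos = 0; pos < words.size(); pos++) {
      String word = words.get(pos);
      Map<String, List<Integer>> postings = index.get(word);
      if (postings == null) {
        postings = new HashMap<String, List<Integer>>();
        index.put(word, postings);
      }
      List<Integer> positions = postings.get(docId);
      if (positions == null) {
        positions = new ArrayList<Integer>();
        postings.put(docId, positions);
      }
      positions.add(pos);                       // record where the word occurs in the doc
    }
  }

  public List<String> search(List<String> query) {
    if (query.isEmpty()) {
      return Collections.<String>emptyList();
    }
    // Pick the most selective query word (the one matching the fewest docs).
    String mostSelective = null;
    for (String word : query) {
      Map<String, List<Integer>> postings = index.get(word);
      if (postings == null) {
        return Collections.<String>emptyList(); // some query word matches nothing
      }
      if (mostSelective == null || postings.size() < index.get(mostSelective).size()) {
        mostSelective = word;
      }
    }
    // Keep only the docs that contain every query word.
    List<String> result = new ArrayList<String>();
    for (String docId : index.get(mostSelective).keySet()) {
      boolean hasAllWords = true;
      for (String word : query) {
        if (!index.get(word).containsKey(docId)) {
          hasAllWords = false;
          break;
        }
      }
      if (hasAllWords) {
        result.add(docId);
      }
    }
    return result;
  }
}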


Ranking

There are many scoring functions, and the ranking needs to be able to combine these scoring functions in a flexible way

def scoringFunction(query, doc)
-- compute something from query and doc
-- return score

The scoring function can be based on word counts, page rank, the location of words within the doc, etc.
Scores need to be normalized, say into the same range.
Each weight is between 0 and 1, and the sum of all weights equals 1.

def score(query, weighted_functions, docs)
-- scored_docs = {}
-- for each weighted_function in weighted_functions
---- for each doc in docs
------ scored_docs[doc] += (weighted_function[:func](query, doc) * weighted_function[:weight])
-- return scored_docs.sort_by(:score)
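
A hedged Java sketch of combining several normalized scoring functions, where each function is paired with a weight and the weights are assumed to sum to 1 (the ScoringFunction interface is made up for illustration).

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of combining normalized scoring functions by weight.
public class Ranker {
  public interface ScoringFunction {
    double score(List<String> query, String docId);  // expected to return a normalized score, e.g. 0..1
  }

  public static class WeightedFunction {
    final ScoringFunction func;
    final double weight;                              // 0..1; all weights should sum to 1
    public WeightedFunction(ScoringFunction func, double weight) {
      this.func = func;
      this.weight = weight;
    }
  }

  public List<Map.Entry<String, Double>> score(List<String> query,
                                               List<WeightedFunction> weightedFunctions,
                                               List<String> docs) {
    Map<String, Double> scoredDocs = new HashMap<String, Double>();
    for (WeightedFunction wf : weightedFunctions) {
      for (String doc : docs) {
        double prev = scoredDocs.containsKey(doc) ? scoredDocs.get(doc) : 0.0;
        scoredDocs.put(doc, prev + wf.func.score(query, doc) * wf.weight);
      }
    }
    // Sort by combined score, highest first.
    List<Map.Entry<String, Double>> ranked =
        new ArrayList<Map.Entry<String, Double>>(scoredDocs.entrySet());
    Collections.sort(ranked, new Comparator<Map.Entry<String, Double>>() {
      public int compare(Map.Entry<String, Double> a, Map.Entry<String, Double> b) {
        return Double.compare(b.getValue(), a.getValue());
      }
    });
    return ranked;
  }
}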

Collaborative Filtering

Given a set of users' ratings on movies, how do we recommend movies to a new user ?

Ideas from the book: Collective Intelligence chapter one

Given a set of ratings per user
[ person1 => {movieA => rating-1A, movieB => rating-1B},
person2 => {movieX => rating-2X, movieY => rating-2Y, movieA => rating-2A}
...
]

Determine similarity

person1 and person2 are similar if they have rated the same movies with similar ratings

person_distance =
square_root of sum of
-- for each movie_name in person1.ratings.keys
---- if (person2.ratings.keys contains movie_name)
------ square(person1.ratings[movie_name] - person2.ratings[movie_name])


Person similarity =
0 if no common movies in corresponding ratings
1 / (1 + person_distance) otherwise

How to find similar persons to personK ?
Calculate every other person's similarity to personK, and sort by similarity.
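
A small Java sketch of the distance and similarity calculations above, assuming each person's ratings are kept in a Map from movie name to rating (the data structures are illustrative).

import java.util.Map;

// Illustrative sketch: a person's ratings are a Map of movie -> rating.
public class Similarity {
  public static double personDistance(Map<String, Double> ratings1,
                                      Map<String, Double> ratings2) {
    double sum = 0.0;
    for (Map.Entry<String, Double> e : ratings1.entrySet()) {
      Double other = ratings2.get(e.getKey());
      if (other != null) {
        double diff = e.getValue() - other;
        sum += diff * diff;                 // squared difference on commonly rated movies
      }
    }
    return Math.sqrt(sum);
  }

  public static double personSimilarity(Map<String, Double> ratings1,
                                        Map<String, Double> ratings2) {
    boolean hasCommon = false;
    for (String movie : ratings1.keySet()) {
      if (ratings2.containsKey(movie)) { hasCommon = true; break; }
    }
    if (!hasCommon) {
      return 0.0;                           // no common movies => similarity is 0
    }
    return 1.0 / (1.0 + personDistance(ratings1, ratings2));
  }
}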

How about movie similarity ?

Invert the set of rankings to ...
[ movieA => {person1 => rating-1A, person2 => rating-2A},
movieB => {person1 => rating-1B}
movieX => {person2 => rating-2X}
movieY => {person2 => rating-2Y}
...
]

movie_distance =
square_root of sum of
-- for each person_name in movieX.ratings.keys
---- if (movieY.ratings.keys contains person_name)
------ square(movieX.ratings[person_name] - movieY.ratings[person_name])


Movie similarity =
0 if no common persons in corresponding ratings
1 / (1 + movie_distance) otherwise

How to find similar movies to movieX ?
Calculate every other movie's similarity to movieX, and sort by similarity.

Making recommendations

Let's say a new personK provides his ratings. How do we recommend movies that may interest him ?

User-based filtering

For each person in persons
-- similarity = person_similarity(personK, person)
-- For each movie_name in person.ratings.keys
---- weighted_ratings[movie_name] += (similarity * person.ratings[movie_name])
---- sum_similarity[movie_name] += similarity

For each movie_name in weighted_ratings.keys
-- weighted_ratings[movie_name] /= sum_similarity[movie_name]

return weighted_ratings.sort_by(:rating)
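
A Java sketch of this user-based loop, reusing the personSimilarity() sketch from the previous section; the shape of the input maps is assumed, and movies the target person has already rated are skipped, which is a common refinement rather than part of the original pseudocode.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of user-based collaborative filtering.
public class UserBasedRecommender {
  // allRatings: person -> (movie -> rating); personK: movie -> rating for the new person.
  public static Map<String, Double> recommend(Map<String, Double> personK,
                                              Map<String, Map<String, Double>> allRatings) {
    Map<String, Double> weightedRatings = new HashMap<String, Double>();
    Map<String, Double> sumSimilarity = new HashMap<String, Double>();

    for (Map<String, Double> person : allRatings.values()) {
      double similarity = Similarity.personSimilarity(personK, person);
      if (similarity <= 0.0) continue;                 // nothing in common, skip
      for (Map.Entry<String, Double> e : person.entrySet()) {
        String movie = e.getKey();
        if (personK.containsKey(movie)) continue;      // commonly one skips already-rated movies
        add(weightedRatings, movie, similarity * e.getValue());
        add(sumSimilarity, movie, similarity);
      }
    }
    // Normalize by the total similarity contributing to each movie.
    for (Map.Entry<String, Double> e : weightedRatings.entrySet()) {
      e.setValue(e.getValue() / sumSimilarity.get(e.getKey()));
    }
    return weightedRatings;                            // caller sorts by predicted rating
  }

  private static void add(Map<String, Double> map, String key, double value) {
    Double prev = map.get(key);
    map.put(key, (prev == null ? 0.0 : prev) + value);
  }
}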


Item-based filtering

Pre-calculate the movie similarity

For each movie in movies
-- neighbors[movie] = rank_similarity(movie, no_of_close_neighbors)

{
movie1 => {movieA => similarity1A, movieB => similarity1B}
movie2 => {movieC => similarity2C, movieD => similarity2D}
}

At run time ...
personK => {movieX => rating-kX, movieY => rating-kY}

For each movie in personK.ratings.keys
-- for each close_movie in neighbors[movie]
---- weight_ratings[close_movie] += neighbors[movie][close_movie] * personK.ratings[movie]
---- similarity_sum[close_movie] += neighbors[movie][close_movie]

For each target_movie in weight_ratings.keys
-- weight_ratings[target_movie] /= similarity_sum[target_movie]

return weight_ratings.sort_by(:rating)
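
A matching Java sketch of the item-based run-time step, assuming the neighbors map (movie to its close movies and their similarities) has already been pre-computed as described above.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the run-time step of item-based filtering.
public class ItemBasedRecommender {
  public static Map<String, Double> recommend(
      Map<String, Double> personKRatings,                   // movie -> rating by personK
      Map<String, Map<String, Double>> neighbors) {         // movie -> (close movie -> similarity)
    Map<String, Double> weightedRatings = new HashMap<String, Double>();
    Map<String, Double> similaritySum = new HashMap<String, Double>();

    for (Map.Entry<String, Double> rated : personKRatings.entrySet()) {
      Map<String, Double> close = neighbors.get(rated.getKey());
      if (close == null) continue;
      for (Map.Entry<String, Double> n : close.entrySet()) {
        String closeMovie = n.getKey();
        if (personKRatings.containsKey(closeMovie)) continue;   // already rated by personK
        add(weightedRatings, closeMovie, n.getValue() * rated.getValue());
        add(similaritySum, closeMovie, n.getValue());
      }
    }
    // Normalize each candidate movie by the similarities that contributed to it.
    for (Map.Entry<String, Double> e : weightedRatings.entrySet()) {
      e.setValue(e.getValue() / similaritySum.get(e.getKey()));
    }
    return weightedRatings;                                  // caller sorts by predicted rating
  }

  private static void add(Map<String, Double> map, String key, double value) {
    Double prev = map.get(key);
    map.put(key, (prev == null ? 0.0 : prev) + value);
  }
}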

Monday, November 26, 2007

Distributed UUID Generation

How do we generate a unique ID within a distributed environment, with scalability in mind ?

Also, the following feature requirements are given ...
  • The ID must be guaranteed globally unique (this rules out probabilistic approaches)
  • The assigned ID will be used by the client for an unbounded time period (this means that once assigned, the ID is gone forever and cannot be reused for a subsequent assignment)
  • The length of the ID is small (say 64-bit) compared to a machine-unique identifier (this rules out the possibility of using a combination of MAC address + process id + thread id + timestamp to create a unique ID)
Architectural goals ...
  • The solution needs to be scalable with respect to growth in request rate
  • The solution needs to be resilient to component failure

General Solution

Using a general distributed computing scheme, there will be a pool of ID generators (called "workers") residing in many inter-connected, cheap, commodity machines.

Depending on the application, the client (which requests a unique ID) may be collocated on the same machine as the worker, or the client may reside on a separate machine. In the latter case, a load balancer will sit between the clients and the workers to make sure the client workload is evenly distributed.

Central Bookkeeper

One straightforward (maybe naive) approach is to have the worker (i.e. the ID generator) make a request to a centralized book-keeping server which maintains a counter. Obviously this central book-keeper can be a performance bottleneck as well as a single point of failure.

The performance bottleneck can be mitigated by having the worker request a "number range" from the centralized book-keeper rather than the "id" itself. In this case, id assignment is done locally by the worker within this number range; only when the whole range is exhausted will the book-keeper be contacted again.

When the book-keeper receives a request for a number range, it persists the allocated number range to disk before returning it to the caller. So if the book-keeper crashes, it knows where to start allocating number ranges after reboot. To prevent the disk itself from being a SPOF, mirrored disks should be used.

The SPOF problem of the book-keeper can be addressed by having a pair of book-keepers (primary/standby). The primary book-keeper needs to synchronize its counter state to the standby via shared disks or counter change replication. The standby continuously monitors the health of the primary book-keeper and takes over its role when the primary crashes.
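
A hedged Java sketch of the worker side of this range scheme: the worker hands out ids locally and only calls back to the bookkeeper when its leased range is exhausted. The RangeBookkeeper interface is an assumption for illustration; it stands in for whatever RPC the real bookkeeper exposes.

// Illustrative sketch of a worker that allocates ids locally from a leased range.
public class IdWorker {
  // Assumed interface to the central bookkeeper; reserveRange(size) is expected to
  // persist the allocation before returning the start of the newly leased range.
  public interface RangeBookkeeper {
    long reserveRange(long size);
  }

  private final RangeBookkeeper bookkeeper;
  private final long rangeSize;
  private long next;     // next id to hand out
  private long limit;    // exclusive upper bound of the current range

  public IdWorker(RangeBookkeeper bookkeeper, long rangeSize) {
    this.bookkeeper = bookkeeper;
    this.rangeSize = rangeSize;
    this.next = 0;
    this.limit = 0;      // forces a range request on first use
  }

  public synchronized long nextId() {
    if (next >= limit) {
      // Current range exhausted: lease a fresh one from the bookkeeper.
      next = bookkeeper.reserveRange(rangeSize);
      limit = next + rangeSize;
    }
    return next++;
  }
}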


Peer Multicast

Instead of using a central book-keeper to track the number range assignment, we can replicate this information among the workers themselves. Under this scheme, every worker keeps track of its own currently allocated range as well as the highest allocated range overall. So when a worker exhausts its current range, it broadcasts a "range allocation request" to all other workers, waits for all of their acknowledgments, and then updates its current allocated range.

It is possible that multiple workers make requests at the same time. This kind of conflict can be resolved by distributed coordination algorithms (there are many well-known ones; one of them is the bully algorithm).

For performance reasons, the worker doesn't persist its most recently allocated id. If the worker crashes, it will request a brand-new range after boot-up, even though the previous range was not fully utilized.


Distributed Hashtable

By using the "key-based-routing" model of DHT, each worker will create some "DHT Node" (with a randomly generated 64-bit node id) and join a "DHT ring". Under this scheme, the number range is allocated implicitly (between the node id and its immediate neighbor's node id).

Now we can take advantage of a number of nice characteristics of the DHT model, such as being able to spread the workers over a large number of machines with O(logN) routing hops. Also, each DHT node holds replicated data of its neighbors, so that if one DHT node crashes, its neighbor takes over its number range immediately.

Now, what if a DHT node has exhausted its implicitly allocated number range ? When this happens, the worker will start a new DHT node (which will join at some point in the ring and then have a newly assigned number range).

Wednesday, October 31, 2007

How bad code is formed

Ignoring the cases where consultants do this purposefully to keep their contracts, in general no one intentionally creates unmaintainable code. However, bad code silently creeps in over the product's life cycle.

Manager's mind

Let's write the code quickly. We don't have enough time for the next rollout. If we cannot roll out, the project may die quickly. Let's treat long-term maintainability as a secondary priority, which we'll improve after the rollout when we have time. By the way, who knows if I'll still be working on this project in a year.
  • Unfortunately, if this rollout is successful, the next release cycle has an even tighter schedule
  • Code is "written once" and "read several hundred times" throughout its lifecycle. The subsequent maintenance cost is several hundred times more than the gain we made in the initial rollout.
I cannot justify any resources spent on refactoring because it doesn't add any features to my next release. The code is working already. You mention improved code maintainability and I don't know how to quantify that.
  • This is how bad code persists and accumulates. After its momentum has grown to a certain point, everyone just accepts the reality that the code is already so bad that there is no motivation to improve it.
Let's have the developers focus on writing the real code. We'll have the QA folks worry about the testing. This way, they can proceed at full speed in parallel.

Developer's mind

Similar logic is working elsewhere, but it doesn't fit exactly what I want. It doesn't provide enough features for my needs, and it also provides some features that I don't need. It is not worthwhile to spend my time reading and understanding how it works; let's re-implement my own version from scratch.
  • This is how duplicated logic gets into the system. Now there are many ways to do one thing.
Similar logic is already working elsewhere. It is not worthwhile to spend my time reinventing the wheel. Let's copy it here and modify it for my needs.
  • This is another way duplicated code gets into the system. When you change the code in one place, you need to make corresponding changes in every other place where you have made copies, otherwise some copies may not work correctly.
  • Due to "laziness", variable names are kept the same in the copies, making the copied code even harder to understand.
This code is not broken, please don't make any changes.
  • This is how bad code persists and accumulates. After its momentum has grown to a certain point, everyone just accepts the reality that the code is already so bad that there is no motivation to improve it.

Of course the code is hard to read because it is solving a complex problem. If the code is easy to read, it must not be doing anything significant. You just need to be smarter when reading the code.

I am not convinced that this is the right way of doing this kind of thing. Therefore, I am not going to follow the convention even though it has been done in many other places. I am going to do it my way.

I don't know why I have to do it this way. But since this is the way we do this kind of things in many other places, I just follow that convention. Is "consistency" a problem ?

Coding for maintainability

One important criterion for judging the value of any serious software product is how maintainable it is over its life cycle. It is important to notice that code is "written once" but "read many times" as people subsequently maintain it (enhancing features, fixing bugs, optimizing performance). Any cost spent understanding non-trivial code will be multiplied many times over and added to the overall maintenance budget of the software product.

It is therefore important to spend the effort to make the code as readable as possible. Once you do, you will collect the dividends over the product's life cycle.

Let's set out some criteria/characteristics of code quality from a maintainability standpoint.

Good code

Just by looking at its signature, you can guess what a class/method/function is doing without looking at its implementation. Very descriptive names for the class/method/function and its parameters/variables are carefully chosen so that the associated semantics are clear. When you open up the implementation, it is doing exactly what you guessed.

Since the naming is so descriptive, it doesn't take a lot of comments to make the code readable. Comments only appear for code segments that implement a complex algorithm, or in performance hotspots where code readability is intentionally sacrificed.

The responsibility of a class/method/function is very clear. There is only one way to do a particular thing, and it is well-encapsulated within the responsible class/method/function. There is no duplicated code.

Responsibility tends to be distributed evenly, so you don't have a package with too many classes or a class with too many instance variables/methods. And you won't have a method that spans more than two pages of code.

Bad code

Similar code shows up in many places, and it is hard to know which of these are merely "similar" and which are "exactly the same"

Extremely long methods (a couple hundred lines of code within a method), usually accompanied by deeply nested case statements

Only trivial comments, and a lack of useful ones

There are many different ways to do one particular thing although they are supposed to achieve the same result

To do a particular thing, the caller needs to make multiple calls to different objects in a certain sequence.

The class/method/function seems to imply certain responsibilities, but when you open up its implementation, you find it is doing something completely different. You quickly lose confidence in your understanding and have to trace down into all the implementation details to see what it is actually doing in every place.

There are some places in the code where you cannot figure out what it is trying to do


In my next blog, I'll discuss how bad code is formed.

Thursday, October 25, 2007

Understand Legacy Code

The first step in supporting a legacy system is to understand the code. One approach is to read through all the lines of code. Quite often this is ineffective: you may waste a lot of time reading dead code.

What I typically do is quickly scan through the package structure of the code, the class name and public methods. If the code structure is good, then I can get some rough idea about where the logic lies. However, I rely more on analyzing the run time behavior of the code to determine its logic flow.

A technique that I use frequently is to just run the system and determine the execution path. As the first step, I need to determine the entry points of the system.

Locate the entry points

Usually, the entry points are ...
  • main() method where the system boots
  • background maintenance tasks
  • event-driven callbacks
Finding these entry points is non-trivial, especially if the documentation of the overall system is poor. I have built a tool that allows me to instrument the classes I am interested in, so that when I start the system, it prints out the entry points as they execute. An entry point is the lowest method in the stack trace that you are interested in. The tool is based on AspectJ, which allows me to inject print statements when certain methods are executed.
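
As an illustration of the idea (not the author's actual tool), an annotation-style AspectJ aspect along these lines can print the entry point and thread whenever a method in the packages of interest starts executing; the package name below is a placeholder.

import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;

// Illustrative aspect: once woven into the application (compile-time or load-time weaving),
// it reports the outermost instrumented method on the call stack, i.e. the entry point.
// "com.myapp" is a placeholder for the packages you actually care about.
@Aspect
public class EntryPointTracer {

  @Before("execution(* com.myapp..*.*(..)) && !cflowbelow(execution(* com.myapp..*.*(..)))")
  public void traceEntry(JoinPoint joinPoint) {
    // The !cflowbelow(...) clause filters out calls made from within already-instrumented
    // methods, so only the lowest instrumented method in the stack is reported.
    System.out.println("[ENTRY] thread=" + Thread.currentThread().getName()
        + " method=" + joinPoint.getSignature().toLongString());
  }
}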

After I get an idea about where the entry points are, the next step is to drill down into each execution point to study what it is doing.

Drill down into entry points

The first thing I usually do is to read through the code to understand what it is doing. The Eclipse IDE is very effective for navigating through the code to see which method is calling which. The most frequently used Eclipse features are ...
  • "Open Declaration", which jumps to the method being called
  • "References", which lists all the methods that call this method
  • "Call Hierarchy", which shows a tree of callers of this method
  • "Type Hierarchy", which shows the class hierarchy
After that, I add trace statements into the code to verify my understanding. Think of a trace statement as an enhanced logging mechanism which includes the thread id, the calling stack trace and a message. Trace statements are temporary; their purpose is to help me understand the code. Since I don't want anyone to use the trace statements in the production environment, I wrote a "Trace" class so that I can easily remove them when I am done.
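
A minimal sketch of what such a Trace helper might look like; this is my illustration of the idea, not the author's actual class.

// Illustrative Trace helper: prints the thread name, a message, and the calling stack,
// and can be silenced globally so stray calls are harmless before they are removed.
public class Trace {
  private static volatile boolean enabled = true;

  public static void setEnabled(boolean on) {
    enabled = on;
  }

  public static void trace(String message) {
    if (!enabled) return;
    StringBuilder sb = new StringBuilder();
    sb.append("[TRACE] thread=").append(Thread.currentThread().getName())
      .append(" msg=").append(message).append('\n');
    StackTraceElement[] stack = Thread.currentThread().getStackTrace();
    // Skip getStackTrace() and trace() itself; show the callers.
    for (int i = 2; i < stack.length; i++) {
      sb.append("    at ").append(stack[i]).append('\n');
    }
    System.out.print(sb);
  }
}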

Document the understanding

I draw a UML diagram to describe my understanding, and also take notes about any observed issues and what can be improved.

Legacy System Phenomenon

As mentioned in my earlier blog, the software engineering skills we've learned assume a green-field project where we start on a clean sheet of paper. In this case, we can apply a number of practices to build a solid foundation for future evolution and maintenance.
  1. Test Driven Development will force you to document the external behavior of your system in terms of unit tests
  2. Adopt a coding standard and use a tool to enforce it
  3. Make sure the code is regularly reviewed to ensure readability and modularity
In reality, less than 2% of projects are in this situation. Most likely, we are dealing with a piece of software that is not built on this foundation. In many cases, even if a project starts from a good foundation, it deteriorates over time because of schedule pressure, lack of engineering skills, people turnover ... etc.

How can one support such a legacy system ? By support, I mean the following ...
  • Fix bugs when they occur
  • Tune performance when load increases
  • Add features when new requirements arrive
In order to do these things effectively, we need to first "understand" the legacy code and then "refactor" it. "Understanding" is about how to quickly establish an insight into the inner workings of the legacy code without making any modifications that may accidentally change its external behavior and cause bugs. "Refactoring" is about how to improve the understandability of the legacy code by modifying it without changing its external behavior.

In later blogs, I will share my techniques of how to understand a piece of code written by someone else and also how to refactor it safely.

Monday, October 22, 2007

"Test" as the "Spec"

A frequently encountered question in enterprise software development is where the hand-off happens from the architect (who designs the software) to the developers (who implement the software). Usually, the architect designs the software at a higher level of abstraction and then communicates her design to the developers, who break it down into a more concrete, detailed design before turning it into implementation-level code.

The hand off happens typically via an architecture spec written by the architect, composed of UML class diagrams, sequence diagrams or state transition diagrams ... etc. Based on the understanding from these diagrams, the developers go ahead to write the code.

However, the progression is not as smooth as we would expect. There is a gap between the architecture diagrams and the code, and a transformation is required. During this transformation there is a possibility of misinterpretation and wrong assumptions. Quite often, the system ends up wrongly implemented due to miscommunication between the developer and the architect. Such miscommunication can be either because the architect hasn't described her design clearly enough in the spec, or because the developer is not experienced enough to fill in details that the architect considers obvious.

One way to mitigate this problem is to have more design review meetings or code review sessions to make sure what is implemented correctly reflects the design. Unfortunately, I find such review sessions usually don't happen, either because the architect is too busy with other tasks or because she is reluctant to read the developer's code. The implementation ends up not matching the design. Quite often, this discrepancy is discovered at a very late stage and leaves no time to fix. While the developers start patching the current implementation for bug fixes or new features, the architect loses control of the architecture's evolution.

Is there a way for the architect to enforce her design at the early stage given the following common constraints ?
  1. The architect cannot afford frequent progress/checkpoint review meetings
  2. While making sure the implementation is compliant with the design at a higher level, the architect doesn't want to dictate the low-level implementation details
Test As A Specification

The solution is to have the architect write the Unit Tests (e.g. JUnit test classes in Java), which act as the "Spec" of her design.

In this model, the architect focuses on the "interface" aspect and how this system interacts with external parties, such as the client (how this system will be used) as well as the collaborators (how this system uses other systems).



The system will expose a set of "Facade" classes which encapsulate the system's external behavior and act as the entry point to its client. Therefore, by writing Unit Tests against these "Facades", the architect can fully specify the external behavior of the system.

A set of "Collaborator" classes is also defined to explicitly capture the interaction of this system to other supporting systems. This "Collaborator" classes are specified in terms of Mock Objects so that the required behavior of supporting system is fully specified. On the other hand, the interaction sequence with the supporting systems are specified via "expectation" of Mock objects.

The architect will specify the behavior of "Facade" and "Collaborator" in a set of XUnit Test Cases, which acts as the design spec of the system. This way, the architect is defining the external behavior of the system while giving enough freedom for the developers to decide on the internal implementation structure. Typically, there are many "Impl Detail" classes which the Facade delegates to. These "Impl Detail" classes will invoke the "Collaborator interface" to get things done in some cases.

Note that the architect is not writing ALL the test cases. Architecture-level Unit Tests are just a small subset of the overall test cases, specifically focused on the architecture-level abstraction. These tests are deliberately written to ignore implementation details so that their stability will not be affected by changes in implementation logic.

On the other hand, the developers will also provide a different set of TestCases that cover the "Impl Detail" classes. This set of "Impl Level TestCases" changes when the developers change the internal implementation. By separating these two sets of test cases into different categories, they can evolve independently as different aspects of the system change along its life cycle, resulting in a more maintainable system as it evolves.

Example: User Authentication

To illustrate, let's go through an example using a user authentication system.
There may be 40 classes that implement the whole UserAuthSubsystem, but the architecture-level test cases focus only on the Facade classes and specify only the "external behavior" that this subsystem should provide. They don't touch any of the underlying implementation classes, because those are the implementor's choices, which the architect doesn't want to constrain.

** User Authentication Subsystem Spec starts here **

Responsibility:

  1. Register User -- register a new user
  2. Remove User -- delete a registered user
  3. Process User Login -- authenticate a user login and activate a user session
  4. Process User Logout -- inactivate an existing user session

Collaborators:

  • Credit Card Verifier -- Tells whether the user name matches the card holder
  • User Database -- Stores the user's login name, password and personal information

import static org.easymock.EasyMock.*;
import static org.junit.Assert.*;

import org.junit.Before;
import org.junit.Test;

public class UserAuthSubsystemTest {
  UserDB mockedUserDB;
  CreditCardVerifier mockedCardVerifier;
  UserAuthSubsystem uas;

  @Before
  public void setUp() {
    // Set up the Mock collaborators
    mockedUserDB = createMock(UserDB.class);
    mockedCardVerifier = createMock(CreditCardVerifier.class);
    uas = new UserAuthSubsystem(mockedUserDB, mockedCardVerifier);
  }

  @Test
  public void testUserLogin_withIncorrectPassword() {
    String userName = "ricky";
    String password = "test1234";

    // Define the interactions with the collaborators
    expect(mockedUserDB.checkPassword(userName, password)).andReturn(false);
    replay(mockedUserDB, mockedCardVerifier);

    // Check that the external behavior is correct
    assertFalse(uas.login(userName, password));
    assertNull(uas.getLoginSession(userName));

    // Check the collaboration with the collaborators
    verify(mockedUserDB, mockedCardVerifier);
  }

  @Test
  public void testRegistration_withGoodCreditCard() {
    String userName = "Ricky TAM";
    String password = "testp";
    String creditCard = "123456781234";

    expect(mockedCardVerifier.checkCard(userName, creditCard)).andReturn(true);
    mockedUserDB.addUser(userName, password);   // void method: just record the expected call
    replay(mockedUserDB, mockedCardVerifier);

    uas.registerUser(userName, creditCard, password);

    verify(mockedUserDB, mockedCardVerifier);
  }

  @Test
  public void testUserLogin_withCorrectPassword() { .... }

  @Test
  public void testRegistration_withBadCreditCard() { .... }

  @Test
  public void testUserLogout() { .... }

  @Test
  public void testUnregisterUser() { .... }
}

** User Authentication Subsystem Spec ends here **

Summary

This approach ("Test" as a "Spec") has a number of advantages ...
  • There is no ambiguity about the system's external behavior, and hence no room for miscommunication, since the intended behavior of the system is communicated clearly in code.
  • The architect can write the test cases at the level of abstraction she chooses. She has full control over what she wants to constrain and where she wants to give freedom.
  • By elevating architect-level test cases into the spec of the system's external behavior, they become more stable and independent of changes in implementation details.
  • This approach forces the architect to think repeatedly about what the "interface" of the subsystem is and who its collaborators are, so the system design is forced to have a clean boundary.

Open Source Phenomenon

The Open Source Phenomenon has changed drastically the skills set requirement for our software development professionals today.

Traditionally, software development involves not just writing the business logic of your application, but also a lot of plumbing logic (such as data persistence and fault resilience). This plumbing logic is typically independent of your business logic, and also quite complicated (although it addresses well-studied engineering problems). In most projects, a large portion of the engineering effort has gone into re-inventing solutions for these recurring plumbing problems.

The situation is quite different in today's software development world. Thanks to the Open Source Phenomenon, most of the plumbing logic is available in the form of software libraries or frameworks with source code available. There are different kinds of open source licensing models with different degrees of usage or modification restrictions. Nevertheless, the software development game plan has changed drastically. Instead of building all the plumbing from scratch, we can just leverage what is out there as open source libraries/frameworks. The open source phenomenon brings huge savings in project cost and also reduces time to market.

However, surviving in the Open Source world requires a set of software engineering skills that is not taught in college. Most techniques we learn in college assume a green-field development environment: we learn how to design a good system from scratch rather than how to understand or adapt existing code. We learn how to "write code" but not "read code".

Skills to survive in the Open Source world are quite different. Such as ...

"Knowing how to find" is more important than "knowing how to build"

"Google", "Wikipedia" are your friend. Of course, "Apache", "SourceForge" is good place to check. Sometimes I find it is not easy for a hard-core software engineer to look for other people's code to use. Especially before she has certain familiarity of the code base. The mindset need to be changed now. The confidence should be based on its popularity and in some case, you just use it and see how it goes.

"Quick learning" and "start fast"
Looking at the usage examples is usually a good way to get a quick understanding of how things work. Then start writing some unit test code to get a better feel for the area you are interested in.

Be prepared to "switch"
This is very important when you are using less proven open source products. Even with a proven open source product, the level of support may fluctuate in the future, and that is not under your control. Therefore, your application architecture should try hard to minimize the coupling (and dependencies) to the open source products.

In most cases, you need just a few features out of the whole suite of the open source product. A common technique I use to confine my usage is to define an interface that captures all the API calls I need, and then write an implementation that depends on the open source product. Now the dependency is localized in the implementation class; whenever I switch products, I just need to rewrite the implementation of the interface. I can even use an IOC (inversion of control) mechanism to achieve zero dependencies between my application and the open source product.
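
For example, here is a hedged sketch of the technique: the application codes only against its own small interface, and a single adapter class knows about the third-party product. The "third-party" part below is stubbed with a plain ConcurrentHashMap purely for illustration; it stands in for whatever open source cache product is actually used.

// Application-owned interface that captures only the operations the application needs.
public interface SimpleCache {
  void put(String key, Object value);
  Object get(String key);
}

// (In a separate source file.) The only class that knows about the open source product.
// If the product is switched, only this adapter is rewritten; "third-party" is stubbed
// here with a ConcurrentHashMap purely for illustration.
class ThirdPartyCacheAdapter implements SimpleCache {
  private final java.util.Map<String, Object> delegate =
      new java.util.concurrent.ConcurrentHashMap<String, Object>();

  public void put(String key, Object value) {
    delegate.put(key, value);
  }

  public Object get(String key) {
    return delegate.get(key);
  }
}

An IOC container (or a simple factory) can inject ThirdPartyCacheAdapter wherever a SimpleCache is needed, so switching products touches a single class.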

Ability to drill down

Quite often, the open source product provides most of what you need, but not 100%. In other scenarios, you want to make an enhancement or fix a bug. You need to be able to drill down into the code and feel comfortable making changes.

This is a tricky area. I know many good engineers who can write good code, but only a few can pick up code written by other people and become familiar with it. As I said earlier, we are trained to write code, not to read code. In dealing with open source products, code reading skills are very important.

In fact, code reading skills are generally very important, as most projects are in maintenance mode rather than green-field development mode. I will share code reading techniques in more detail in a later blog when I talk about doing surgery on legacy code. But here is a quick summary of an important subset that is relevant to dissecting open source products ...
  1. Look at the package structure, class names and method names to get a rough idea of the code structure and data structures.
  2. Put some trace code that prints out its execution path, then run some examples to verify your understanding.
  3. For further drill down, put in break points and step through the code
  4. Document your findings
  5. Subscribe to the community mail alias and send an email to ask

Reverse Engineering Tools

There are a couple of tools that analyze an existing code base to extract its design or external behavior. For example, Agitar can analyze your Java bytecode and generate JUnit tests, and Altova can analyze your code to generate UML diagrams.

These reverse engineering tools provide value for legacy applications where you have a piece of running production code but very little information about its design, and very few tests from which to understand its expected behavior. These tools can quickly analyze the code and tell you its observed behavior: they "discover" what the code is currently doing and give you a better starting point for further analysis.

A better starting point is all they provide, though. They are typically unable to distinguish the intended behavior (sitting at a higher level of abstraction) from the actual implementation (sitting at a much lower level). In fact, I doubt that automatically extracting the intended behavior is possible at all. The end result is that they swamp you with a lot of low-level, implementation-specific details (because the actual implementation is all they know), and then you need to manually analyze what is important and what is unimportant to you.

In other words, they give you a different starting point in a different format (e.g. a set of detected test methods and class invariants instead of implementation Java code). But they don't raise your level of abstraction, which I argue has to be done manually anyway. So do they give you a better start ? I think it is better, but not much better. By following a set of code surgery and refactoring techniques, we can achieve a much better result. I will share such techniques in later blogs.

Now, are these tools useful for new development ? I don't think so. I am a strong believer in TDD (Test-Driven Development), where the test itself should be regarded as the "spec" and so by definition has to be written first, and written manually, by the application designer. In TDD, you write your tests first and write them manually (not generated). You put all the expected behavior into your test methods, and you also put in Mock objects to test the interactions. You diagram your UML model to capture your major classes and methods at a higher abstraction level.

Unit Testing Async Calls

Most unit testing frameworks assume a synchronous call model in which the calling thread blocks until the called method returns. So a typical test method looks like the following ...
class SyncServiceTest {
  @Test
  public void testSyncCall() {
    Object input = ...
    Object expectedOutput = ...
    SyncService syncService = new SyncService();
    assertEquals(syncService.call(input), expectedOutput);
  }
}

However, this model doesn't fit if the call is asynchronous; i.e. the calling thread is not blocked and the result comes back later on a separate callback thread. For example, suppose you have the following AsyncService, where the call() method returns immediately and the result comes back through a callback interface.
// Executor, ThreadedExecutor (and the Latch used in the test below) are assumed to come
// from a utility library such as Doug Lea's util.concurrent.
class AsyncService {
  public static interface ResultHandler {
    void handleResult(Object result);
  }

  ResultHandler resultHandler;

  public AsyncService(ResultHandler resultHandler) {
    this.resultHandler = resultHandler;
  }

  public void call(final Object request) throws InterruptedException {
    Executor executor = new ThreadedExecutor();
    executor.execute(new Runnable() {
      public void run() {
        Object result = process(request);      // do the real work asynchronously
        resultHandler.handleResult(result);    // deliver the result via the callback
      }
    });
  }
}
The test method needs to change: you need to block the calling thread until the result comes back from the callback thread. The modified test code looks like the following ...
class AsyncServiceTest {

  Object response;

  @Test
  public void testAsyncCall() throws Exception {
    Object input = ...
    Object expectedOutput = ...
    final Latch latch = new Latch();

    AsyncService asyncService =
      new AsyncService(new AsyncService.ResultHandler() {
        public void handleResult(Object result) {
          response = result;       // capture the result delivered on the callback thread
          latch.release();         // wake up the test thread
        }
      });

    asyncService.call(input);
    latch.attempt(3000);           // wait up to 3 seconds for the callback
    assertEquals(response, expectedOutput);
  }
}
In this test method, the calling thread waits for the callback thread; a latch is used for the synchronization. You cannot use Object.wait() and Object.notify() for this purpose because notify may be called before wait, and then the waiting thread would wait indefinitely. A latch is a one-shot switch: if release() is called first, all subsequent attempt() calls return immediately.