Tuesday, December 18, 2012

Java Collections - Synchronization Myth

Introduction:

The Java platform tries to reduce developers' pain by providing synchronization wrappers for the various Java collections. There are 6 flavors of the API, each of which helps developers synchronize a different kind of collection. Here is the list:
  • Collections.synchronizedList()
  • Collections.synchronizedCollection()
  • Collections.synchronizedMap()
  • Collections.synchronizedSet()
  • Collections.synchronizedSortedMap()
  • Collections.synchronizedSortedSet()
The collections returned by these methods synchronize each basic operation on the collection, e.g. get(), set(), add(), and remove().
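
As a minimal sketch of what this per-operation guarantee means (the class and method names here are made up for illustration): two threads add entries to one list wrapped by Collections.synchronizedList(). Because every add() call takes the wrapper's lock, no entry should be lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListBasics {
    // Two threads each add `perThread` entries to one synchronized list;
    // the final size of the list is returned.
    static int addFromTwoThreads(int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Runnable adder = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(i); // each add() locks the wrapper internally
            }
        };
        Thread a = new Thread(adder);
        Thread b = new Thread(adder);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(10_000)); // prints 20000
    }
}
```

With a plain, unwrapped ArrayList the same code could lose entries or fail outright, since ArrayList.add() is not thread-safe.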

The Myth:

As soon as developers wrap their collections using the synchronization APIs, they start to believe that they no longer need to write any code of their own to deal with synchronization for these objects, which is not true in all cases. Let me demonstrate this with a simple example.

"An ArrayList needs to be accessed by two different threads: one thread needs to iterate over the list, and the other will try to add a few new entries."

Here are the thread details:
  • t1 : This thread uses an Iterator to traverse the list.
  • t2 : This thread uses a for loop with an index-based approach.
  • t3 : This thread uses an Iterator inside a synchronized block to traverse the list.
  • t4 : This thread tries to add new entries to the list.
Source Code:
I ran the code with different combinations of threads:

case1:

Use t1 and t4. Start both threads.
Result: a ConcurrentModificationException is thrown by t1's Iterator.

 

case2:

Use t2 and t4. Start both threads.
Result: execution succeeds, and the list is updated concurrently while t2 traverses it.

 

case3:

Use t3 and t4. Start both threads.
Result: execution succeeds, and the list is not updated while t3 traverses it.
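
Cases 1 and 3 can be sketched in a few lines. This is not the original test code; it is a minimal reconstruction under my own assumptions: the class, method, and latch names are hypothetical, and latches are used to force the add to happen in the middle of the traversal so that the outcome does not depend on thread timing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class IterationSketch {
    static List<Integer> newList() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Collections.addAll(list, 1, 2, 3);
        return list;
    }

    // Case 1 (t1 + t4): unguarded Iterator. Returns true if the traversal
    // failed with ConcurrentModificationException.
    static boolean unguardedTraversalFails() {
        List<Integer> list = newList();
        CountDownLatch iteratorCreated = new CountDownLatch(1);
        CountDownLatch entryAdded = new CountDownLatch(1);
        boolean[] failed = {false};
        Thread t1 = new Thread(() -> {
            Iterator<Integer> it = list.iterator(); // the wrapper does not lock iteration
            iteratorCreated.countDown();
            awaitLatch(entryAdded);                 // let t4 add while the iterator is live
            try {
                while (it.hasNext()) {
                    it.next();
                }
            } catch (ConcurrentModificationException e) {
                failed[0] = true;
            }
        });
        Thread t4 = new Thread(() -> {
            awaitLatch(iteratorCreated);
            list.add(4);
            entryAdded.countDown();
        });
        runBoth(t1, t4);
        return failed[0];
    }

    // Case 3 (t3 + t4): Iterator inside synchronized (list). Returns the
    // number of entries t3 saw and the final size of the list.
    static int[] guardedTraversal() {
        List<Integer> list = newList();
        CountDownLatch lockHeld = new CountDownLatch(1);
        int[] seen = {0};
        Thread t3 = new Thread(() -> {
            synchronized (list) {                   // the lock the wrapper's own methods use
                lockHeld.countDown();
                Iterator<Integer> it = list.iterator();
                while (it.hasNext()) {
                    it.next();
                    seen[0]++;
                }
            }
        });
        Thread t4 = new Thread(() -> {
            awaitLatch(lockHeld);
            list.add(4);                            // blocks until t3 leaves the synchronized block
        });
        runBoth(t3, t4);
        return new int[] {seen[0], list.size()};
    }

    static void awaitLatch(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    static void runBoth(Thread a, Thread b) {
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(unguardedTraversalFails());          // prints true
        int[] r = guardedTraversal();
        System.out.println(r[0] + " seen, final size " + r[1]); // prints 3 seen, final size 4
    }
}
```

The guarded version works because t3 synchronizes on the list object itself, so t4's add() has to wait until the whole traversal is finished.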

Conclusion:  

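Developers need to be careful about synchronization even when a collection has been synchronized using the Collections API: the wrapper protects individual operations, but a compound operation such as iteration still needs an explicit lock.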

Tuesday, December 11, 2012

Efficient Logging in Java

Logging is a very important part of any program. It gives developers the information they need to understand the program's behavior, and it is next to impossible to debug issues without logs. So logs are very important, but logging itself is overhead for any program: it takes valuable CPU time, consumes memory, and performs I/O operations. Logs are therefore a necessary evil.

Logging frameworks provide various techniques to reduce this overhead. Common techniques to improve performance:

  • Configuration Techniques
    • Set the default logging level to SEVERE
    • Enable logging on demand for specific modules
  • Programming Techniques
    • Use discretion when choosing the log level for every log statement
    • Check the log level before processing the logging command
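
The techniques above can be sketched with java.util.logging. This is only an illustration under my own assumptions: the logger names "demo.orders" and "demo.billing" and the describe() helper are hypothetical, and the levels are set in code here although they would normally come from a logging configuration file.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LogLevelSketch {
    static int messagesBuilt = 0;

    // Stand-in for a log message that is costly to build.
    static String describe(int orderId) {
        messagesBuilt++;
        return "order=" + orderId + ", thread=" + Thread.currentThread().getName();
    }

    // Returns how many messages were actually built.
    static int run() {
        messagesBuilt = 0;
        Logger orders = Logger.getLogger("demo.orders");
        Logger billing = Logger.getLogger("demo.billing");
        orders.setLevel(Level.SEVERE);  // default: only severe problems are logged
        billing.setLevel(Level.FINE);   // on demand: detailed logs for one module

        // Unguarded: the message is built and then thrown away.
        orders.fine(describe(1));

        // Guarded: the message is not built, because FINE is disabled here.
        if (orders.isLoggable(Level.FINE)) {
            orders.fine(describe(2));
        }

        // Guarded: FINE is enabled for this module, so the message is built.
        if (billing.isLoggable(Level.FINE)) {
            billing.fine(describe(3));
        }
        return messagesBuilt;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2
    }
}
```

Only the first and third messages are built; the guard saves the cost of the second one.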
Recently I came across some other ways to improve logging efficiency, and I ran a few tests around them. Here are the test cases and results:

Test Code using "java.util.logging.Logger"
Test Results "java.util.logging.Logger"
Test Code using "org.apache.logging.log4j.Logger"
Test Results "org.apache.logging.log4j.Logger"

Conclusion
  1. In these tests, the Apache logger (log4j) is more efficient at logging than the default Java logger (java.util.logging).
  2. The string concatenation technique is less efficient than the new placeholder technique. [But for log4j, the placeholder technique is available in log4j 2.0 only.]
  3. It is very efficient to check the log level before the logging command for dynamic logs. [Logs which require string concatenation or placing values in placeholders.]
  4. It is inefficient to check the log level before the logging command for simple/static logs.
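
To illustrate why placeholders can be cheaper than concatenation, here is a small sketch. It is not the test code behind the results above, and it measures no timings. To stay free of external dependencies it uses java.util.logging, whose placeholders are written {0}, {1}, ...; log4j 2 writes them as {} (for example logger.debug("Processing {}", order)). The Order class and the logger name are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class PlaceholderSketch {
    static int toStringCalls = 0;

    // Hypothetical payload; toString() stands in for costly formatting.
    static class Order {
        final int id;

        Order(int id) {
            this.id = id;
        }

        @Override
        public String toString() {
            toStringCalls++;
            return "Order#" + id;
        }
    }

    // Returns how many times the payload was converted to a String.
    static int run() {
        toStringCalls = 0;
        Logger logger = Logger.getLogger("demo.placeholder");
        logger.setLevel(Level.SEVERE); // FINE is disabled
        Order order = new Order(42);

        // Concatenation: toString() runs before fine() can check the level.
        logger.fine("Processing " + order);

        // Placeholder: the logger checks the level first and never formats.
        logger.log(Level.FINE, "Processing {0}", order);

        return toStringCalls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}
```

Only the concatenated call pays for toString(); with the placeholder the argument is passed along untouched and dropped when the level is disabled.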
Reference:
http://logging.apache.org/log4j/2.x/performance.html