Tuesday, August 11, 2009

Basic Math premises..useful to tackle algorithm problems

As I have solved many algorithm problem, either just out of interest or to prepare for interviews, I have realized that these basic math premises are quite handy. We have all learnt these in school/college etc. but we often forget these basics and do not write efficient algorithms. Feel free to suggest more to add to the list..
  1. Every whole number can be represented by the product of a unique combination of prime numbers.    E.g. 48 = 2^4 x 3, 46 = 2 x 23
  2. To confirm if a number is a prime or not, it is enough to check if there exists a factor between 2 and the sqrt(number). No need to check until number-1.
  3. A factorion is an integer which is equal to the sum of factorials of its digits. There are exactly four such numbers:
    1=1!

    2=2!

    145=1!+4!+5!

    40585=4!+0!+5!+8!+5!

Tuesday, July 28, 2009

Decimal precision challenges for developers

No matter how powerful a programming language is, due to the inherent problems in representing precision in a fixed number of bytes in computer memory, accurate calculations are not possible when using data types like 'double' in Java.

I am just summarizing the common mistakes people make when using doubles to do accurate calculations and of course a few solutions and/or things to keep in mind when coding with doubles.

Demonstrating the problem:
Let's take a simple usecase. Say you are writing an order management system, where you want to buy 10.6 shares of some product in total, and you need to check how many fills have already come in and place an order for the remaining quantity only if it is >= 0.1.. lame but.. simple problem..

A naive code will look like this..
double goal = 10.6;
double fill1 = 3.3 + 7.2;
double remaining = goal - fill1;
System.out.println("Remaining Qty: " + remaining);
if (remaining >= 0.1) {
 System.out.println("We have a problem, too many unfilled quantity");
} else {
 System.out.println("Incorrect Conclusion: Too few quantity remaining, so let's buy the remaining!");
}
If you run the above program, the output would be...
Remaining Qty: 0.09999999999999964
No Problem, too few quantity remaining

Problem is obvious from the output. The remaining quantity calculated is not exactly 0.1 as you might have guessed because of precision issues. So, the program ends up concluding that it doesn't have to do anything, but in fact it has to place more orders!

Some solutions and coding guidelines:

1. Use BigDecimal.

  • Pros: Accurate
  • Cons: Slow, may not be a great option in latency sensitive applications, depending on how many calculations you have.
2. Use integers or longs for everything, for e.g. instead of $1.11, use 111 cents.

3. Say, you still need a decimal point and cannot use integer or long for everything, then maintain a decimal point yourself. As of this writing, I don't know if there are any open source libraries that already do it for you.

4. Never compare decimal values directly, always...always use a EPSILON (a very small value that is relevant in your usecase)

E.g. Instead of remaining == 0.1, use Math.abs(remaining - 0.1) <= EPSILON 

where EPSILON could be somethig like 10e-8

        To aid with comparing using EPSILON, you might want to take a look at the MathUtils library from Apache instead of writing those utility methods yourself.


Monday, March 5, 2007

TreeMap vs PriorityQueue

Comparison

Category
TreeMap
PriorityQueue
Implementation
Uses Red-Black tree
Uses Min (max) tree
Remove Time Complexity
O(log(n))
O(log(n))
Insert Time Complexity
O(log(n))
O(log(n))
Duplicates
No
Yes
Search Time Complexity
O(log(n))
O(n)

When to use PriorityQueue?

As the above comparison shows, when you need duplicates and when you don't care about finding a particular elements by key, you should use this.

When to use TreeMap?

Again, as the comparison shows, when you need duplicates and when you need to find an element based on a particular key, use this. There are ways to put duplicates into TreeMap using a custom Comparator, but try to avoid that as it breaks the general understanding of the TreeMap for users.

Tuesday, July 25, 2006

Very handy Unix Commands

If you are like me, an awesome application developer :), only spends time on Unix for the usual process or disk monitoring, not like a system programmer, these commands can be very handy...

List all directories with sizes
du -sh *
h -- displays the information in human readable format, using G (Gigabyte), M (Megabyte) etc.

List of all directories sorted by size
du -sm * | sort -nr

Size of a file system (drives in lay man terms)

--Current File system
   df .

--Current file system in human readable form
   df -h .

Size of all file systems
df -h

Find processes sorted by the maximum amount of RAM / CPU usage
top
Once inside, press Shft+Q
Now, Press the letter next to the option 'RES' (to sort by Resident Memory usage)
Press the letter next to the option 'CPU' (to sort by CPU usage)