
Math algorithms in Java

10 replies
kirillcool
Joined: 2004-11-17

One of the things that has been bothering me about Java's libraries is the complete lack of any mathematical packages. I'm not talking about BigInteger or BigDecimal, or even about hyperbolic trigonometric functions (which took an amazing 10 years to make it into the JDK). I'm talking about matrix, linear algebra and related packages.

If you take a look at the scientific community, it has not been an avid adopter of Java technology (besides the ubiquitous multi-threading examples). There are many reasons, which essentially distill to a single point - Java has no systematically designed and tested mathematical packages.

The majority of the scientific community writes in C++, MatLab and (believe it or not) Fortran :O. The latter has such an extensive algorithmic code base that most modern algorithms are built on it (using an equivalent of JNI to access it). Try to write a sparse-matrix linear equation solver in Java - it doesn't even have a sparse matrix package. For that matter, it doesn't even have a dense matrix package.
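For readers wondering what a "sparse matrix package" amounts to, at its core it is a compact storage format plus kernels like the matrix-vector product that iterative solvers are built on. Here is a minimal sketch in plain Java of the common compressed sparse row (CSR) format; the class and method names are illustrative only, not from any existing package:

```java
// Illustrative CSR sparse matrix with a matrix-vector product, the
// basic building block of iterative linear equation solvers.
public class CsrMatrix {
    final int rows;
    final double[] values;  // non-zero entries, stored row by row
    final int[] colIndex;   // column index of each non-zero entry
    final int[] rowStart;   // where each row begins in values (length rows + 1)

    public CsrMatrix(int rows, double[] values, int[] colIndex, int[] rowStart) {
        this.rows = rows;
        this.values = values;
        this.colIndex = colIndex;
        this.rowStart = rowStart;
    }

    // Computes y = A * x, touching only the non-zero entries.
    public double[] multiply(double[] x) {
        double[] y = new double[rows];
        for (int i = 0; i < rows; i++) {
            double sum = 0.0;
            for (int k = rowStart[i]; k < rowStart[i + 1]; k++) {
                sum += values[k] * x[colIndex[k]];
            }
            y[i] = sum;
        }
        return y;
    }

    public static void main(String[] args) {
        // 3x3 matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
        CsrMatrix a = new CsrMatrix(3,
                new double[]{4, 1, 3, 2, 5},
                new int[]{0, 2, 1, 0, 2},
                new int[]{0, 2, 3, 5});
        double[] y = a.multiply(new double[]{1, 1, 1});
        System.out.println(y[0] + " " + y[1] + " " + y[2]); // 5.0 3.0 7.0
    }
}
```

The point of the complaint stands: nothing like this ships in the JDK, so every project re-implements it.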

Of course, the algorithms are not simple, but neither are XML or NIO. This has really been hurting Java in the academic community, and it has nothing to do with the assumption that Java runs way too slow for them. They simply have nothing to base their work on. When I implemented an image-processing package in Java, I had to implement everything from scratch (including a lot of JUnit tests).

There are a lot of packages out there (like Jama or MTJ). Some of them use native libraries :(, some are way too slow for practical purposes :(, and some offer only a limited number of functions. Compare that with the code base for MatLab or Fortran, and you have a definite loser.

This is not really a suggestion for a JSR or the JCP, just a thought on one of the bigger holes in the vast span of issues covered by the current Java standard libraries :)

jwenting
Joined: 2003-12-02

> Bjorn, regarding your remarks:
>
> The main disadvantage of the existing packages (say,
> Colt, Jama and MTJ) is that they try to accomplish
> similar goals without laying down a common ground of
> data structures, algorithms or API. Take a look at
> the
In other words, anything anyone might ever need should be part of the core API...
Or maybe just anything that you might ever need???

>
> Of course, LAPACK or BLAS are not a part of standard
> C++ or Fortran. However, this is because the standard
> for these two languages is a compiler that can
> understand the syntax and a very limited set of
> auxiliary libraries. Java set out from the beginning

No, it's because they were designed so that people could write their own libraries and use those provided by others.
And so is Java; in fact, Java supports that even better, in that it's compatible without recompilation with libraries created on completely different hardware and OS platforms.

> common tasks (and far beyond). Look at all the
> advanced core packages for 2D and 3D graphics,
> XML-related technologies and so on. Of course there

Which IMO shouldn't be there and originally weren't...

> are tens of "competing" technologies for XML (like
> JDOM, XOM, ...), but i personally prefer to use
> (sometimes) clumsier SAX and DOM parsers (via factory
> of course) of JDK than to bundle third-party
> packages.
>
That's your choice, don't force it on the world.

> And why can't we reuse the API of BLAS from the
> Fortran community. If it took them 10 years to come
> up with it, there's no need to reinvent the wheel,
> just to implement it using Java's design patterns.
>
Go ahead, implement it in Java and make it available.
Don't scream murder because Sun doesn't do it for you and saddle us all with a hundred megabytes of extra code and documentation to download.

>
> To conclude - it is my opinion that packages that are
> not part of Java core are not tested well (just
> because they are not used widely enough). In

And why should Sun invest the resources to implement and test something that no one except you will use afterwards?

> addition, it is very unfortunate that Java doesn't
> have standard matrix package (just because it doesn't
> set a common ground for implementors like you).

Same thing. If enough people were to need it, it might be there.
As it is, use a third-party package or write your own. That's the beauty of a truly modular language, something you don't seem to comprehend.

bjornoh
Joined: 2004-03-04

I do agree that having a standard interface is a good idea, but the problem is finding this standard interface. There are simply too many implementation ideas floating around (do have a look at all the Fortran/C++/Java implementations to get an idea). Of course, the Fortran BLAS could serve as a starting point, but what should the Java calling conventions be?

It is very desirable to have an optimized (read: assembly-coded) BLAS library, so an underlying JNI mechanism could be the first starting point. Secondly, a more user-friendly interface should be provided, and here some classes could be introduced. You have correctly assumed that I am the author of MTJ, and in MTJ I provide a set of classes which hide the details of the BLAS while offering very good performance. Some issues with MTJ might be its large code size (in terms of the number of classes), and also that it calls either a native BLAS library or the JLAPACK one (which is a Java translation of BLAS + LAPACK).
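The layering described here (a native BLAS via JNI when available, a pure-Java fallback otherwise) could be sketched roughly as below. The class, the native library name and the method names are hypothetical illustrations, not MTJ's actual code:

```java
// Hypothetical sketch of a front end that hides whether the heavy
// lifting is done by a native BLAS (via JNI) or by pure Java.
public class BlasBackend {
    private static final boolean NATIVE_AVAILABLE = tryLoadNative();

    private static boolean tryLoadNative() {
        try {
            System.loadLibrary("nni_blas"); // hypothetical native library name
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // no native library on this platform: fall back to Java
        }
    }

    // ddot: the BLAS level-1 dot product, as the simplest example.
    public static double ddot(double[] x, double[] y) {
        if (NATIVE_AVAILABLE) {
            return ddotNative(x, y);
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    private static native double ddotNative(double[] x, double[] y);

    public static void main(String[] args) {
        System.out.println(ddot(new double[]{1, 2, 3}, new double[]{4, 5, 6})); // 32.0
    }
}
```

The caller never sees which path was taken, which is what keeps such a package usable on platforms where no optimized BLAS exists.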

Some effort at standardization is going on with the Multiarray JSR, but it has made no progress for the last few years. Also, it only addresses dense multidimensional arrays, not sparse matrices, nor any algorithms. The latter could of course be provided at a later stage.

So a standard would be nice, but somehow we have to agree on one which addresses all concerns while also being small and manageable.

(btw: I use a different JLAPACK than the one you link to; see www.netlib.org/java/f2j)

ackumar
Joined: 2005-01-24

Hi,

I am a beginner in Java. Currently I am trying to use JLAPACK to calculate the SVD of a huge sparse m x n matrix (m > n). The documentation is not clear enough for me to understand.

I found that the class DGESVD is used to calculate the SVD (is that correct?).

In the documentation, the part which explains the arguments says that JOBU and JOBVT are of character type. But in the method DGESVD they are given as String. When I tried to pass a character ('A') as a parameter, I got an error like "cannot be applied to method DGESVD". If I pass a String ("A") as a parameter, the input matrix A gets overwritten.

I am also not clear about the parameters double[] work, int lwork, intW info.

So I would like to know more details about the problem mentioned above. And if you have any document with clearer JLAPACK documentation, please send it to me.

I used JAMA and MTJ to calculate the SVD, but only with a DenseMatrix for a small matrix, so I am not sure whether these two packages work for a huge sparse matrix.

I would also like to know which package is best suited for calculating the SVD of a huge sparse matrix.

Regards
ackumar

fastjack
Joined: 2004-02-13

True, the JLAPACK API is really far from being user-friendly. Any hint on how to use it is highly appreciated.

And about the general question of which matrix package to use: I looked at JScience.org and found it nice and rather active compared to other libs. But it doesn't seem to be complete enough; for example, I didn't find an SVD decomposer. (And I need one.) But the author's plans for 2005 look promising!

Maybe interested folks should join that project and help bring it forward, so that we have a nice open source linear algebra package for Java sometime in the future.

Does someone have a more general link which explains SVD and its implementations? (I am almost desperate enough to implement it myself, but I still need to understand how it works ;-) So hints in that direction are also appreciated.)

So long
Daniel

vpatryshev
Joined: 2004-06-30

Man, if you check out the sources of Java, you'll discover that they don't even use the standard libraries for trigonometric functions; instead they use some nice homegrown IBM code - the advantage being that it seems to calculate cosine correctly to the last bit, and the disadvantage being that it is 50 times slower than the processor instructions. What a swamp...

mthornton
Joined: 2003-06-10

If you check more carefully you will find that the code you mention is used for StrictMath (in accordance with the spec). For Math, HotSpot replaces it with some faster code. On x86 it still has to check the argument because the x86 argument reduction is not accurate enough to meet the relaxed spec for Java Math.
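The distinction is easy to observe directly. A small sketch (the class name StrictMathCheck is mine, not from the JDK): StrictMath must reproduce the fdlibm results bit for bit on every platform, while Math may use faster code as long as cos stays within 1 ulp of the correctly rounded result, so the two are allowed to differ slightly.

```java
// Compares Math.cos (may be intrinsified by HotSpot) against
// StrictMath.cos (bit-identical fdlibm results on every platform).
public class StrictMathCheck {
    public static void main(String[] args) {
        double x = 1.0e10; // a large argument stresses argument reduction
        double relaxed = Math.cos(x);
        double strict = StrictMath.cos(x);
        System.out.println("Math.cos:       " + relaxed);
        System.out.println("StrictMath.cos: " + strict);
        System.out.println("difference:     " + Math.abs(relaxed - strict));
    }
}
```

On most JVMs the two values agree exactly or differ only in the last bit, which is precisely the slack the relaxed Math spec permits.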

bjornoh
Joined: 2004-03-04

If I understand your post correctly, you're suggesting that numerical/mathematical packages for Java are not as well tested as those written in Fortran or C++? And also, you seem to think it is unfortunate that Java has no standard matrix package.

I would like to (attempt to) address these concerns. Regarding testing, it is natural that packages written in Fortran tend to be better tested, simply because Fortran has been around for so much longer than Java. However, the most prominent library in Fortran for numerics is probably LAPACK, and if you examine its feature set, you will see that it doesn't implement any general sparse matrix algorithms, only algorithms for dense/symmetrical/banded matrices. There are sparse matrix packages in Fortran, but they tend to be focused on sparse direct solvers (e.g. MUMPS), and there are very few general, maintained matrix packages.

Furthermore, C++ doesn't really have extensive matrix libraries (C does, but that's again due to C being an old language, akin to Fortran). Some attempts in C++ are the MTL and uBLAS. But development of MTL has stopped, and uBLAS has performance issues with sparse matrices (and it doesn't provide any higher level algorithms such as solvers).

So to answer your first concern: Java is simply too new to be firmly entrenched in numerics (as Fortran is). Many people in the numerics community are nevertheless using it, especially as people are starting to migrate C++ codes now that the performance of Java is becoming less of a concern.

Next, Java doesn't provide a standard matrix package. Well, neither does Fortran nor C++. BLAS has established itself as an "industry standard", but a Fortran/C++ compiler isn't obliged to provide BLAS (some omit it). I would also like to say that premature standardization can be harmful, since we do not know how to build the "best" matrix library (yet). It took the Fortran community over 10 years to agree on the BLAS as it is today, while there is no standard for C++ yet (valarray was attempted in ANSI C++, but very few people use it due to performance issues). Perhaps uBLAS will become the standard C++ matrix library?

So again, we should develop and test software, try out new ideas for library development, and perhaps in a few years time the Java community can agree on a standard (or recommended) matrix package.

And finally, what are your complaints with Jama or MTJ? Jama is nice for small matrices, is easy to use and learn, and only requires Java 1.1 to run. MTJ doesn't require native libraries (it can use JLAPACK), and it provides more functionality than both Matlab and Fortran (if you're referring to just LAPACK here), especially concerning sparse matrices and solvers.

kirillcool
Joined: 2004-11-17

Bjorn, regarding your remarks:

The main disadvantage of the existing packages (say, Colt, Jama and MTJ) is that they try to accomplish similar goals without laying down a common ground of data structures, algorithms or API. Take a look at the http://hoschek.home.cern.ch/hoschek/colt/V1.0.3/doc/index.html documentation of Colt - it's a mess of various packages which haven't even been brought under a single roof. What is even worse, if I wish to switch implementations, say from Colt to MTJ - guess what? It can't be done. The class names are different, the function names are different. I can't even write a factory to provide me with an implementation, because these two packages have nothing in common. A unified approach, with the Java core providing a set of interfaces and a basic implementation along with a factory to retrieve the implementation of my choice (which could be plugged in using a third-party JAR), could easily address this issue.
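The kind of common ground described above could look roughly like this; all names here (Matrix, DenseMatrix, MatrixFactory) are hypothetical, not from any real JSR or package:

```java
// Sketch of a core matrix interface plus a factory, so that callers
// depend on the interface and implementations remain swappable.
interface Matrix {
    int rows();
    int cols();
    double get(int i, int j);
    void set(int i, int j, double value);
}

// One possible implementation: a dense, row-major matrix.
class DenseMatrix implements Matrix {
    private final int rows, cols;
    private final double[] data;

    DenseMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }
    public int rows() { return rows; }
    public int cols() { return cols; }
    public double get(int i, int j) { return data[i * cols + j]; }
    public void set(int i, int j, double v) { data[i * cols + j] = v; }
}

public class MatrixFactory {
    // A third-party JAR could plug in its own implementation here,
    // e.g. via a system property or a service-provider lookup.
    public static Matrix create(int rows, int cols) {
        return new DenseMatrix(rows, cols);
    }

    public static void main(String[] args) {
        Matrix m = MatrixFactory.create(2, 2);
        m.set(0, 1, 42.0);
        System.out.println(m.get(0, 1)); // 42.0
    }
}
```

With such interfaces in the core, switching from one vendor's matrices to another's would be a configuration change rather than a rewrite.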

Of course, LAPACK and BLAS are not part of standard C++ or Fortran. However, this is because the standard for these two languages is a compiler that can understand the syntax plus a very limited set of auxiliary libraries. Java set out from the beginning to provide us with all we need to accomplish the most common tasks (and far beyond). Look at all the advanced core packages for 2D and 3D graphics, XML-related technologies and so on. Of course there are tens of "competing" technologies for XML (like JDOM, XOM, ...), but I personally prefer to use the (sometimes) clumsier SAX and DOM parsers of the JDK (via a factory, of course) than to bundle third-party packages.

And why can't we reuse the API of BLAS from the Fortran community? If it took them 10 years to come up with it, there's no need to reinvent the wheel - just implement it using Java's design patterns.

And finally, about the native code in MTJ (which I assume is your creation). The http://www.math.uib.no/~bjornoh/mtj/ website talks specifically about NNI (native numeric interface), which has to be compiled for every platform (the sources for them reside in the http://www.math.uib.no/~bjornoh/mtj/src/ directory). Of course, there's the JLAPACK that you are using, but its http://www.cs.unc.edu/Research/HARPOON/jlapack/ homepage says that it was last updated in 1998, so not much progress there is likely (and as I saw from the source code, it doesn't have sparse matrices, so how do you use it to solve sparse equations?). As for Colt - its last version came out in 2002.

To conclude - it is my opinion that packages that are not part of the Java core are not well tested (simply because they are not used widely enough). In addition, it is very unfortunate that Java doesn't have a standard matrix package (precisely because it doesn't set a common ground for implementors like you).

Regards
Kirill
http://jroller.com/page/kirillcool

zander
Joined: 2003-06-13

> Of course there
> are tens of "competing" technologies for XML (like
> JDOM, XOM, ...), but i personally prefer to use
> (sometimes) clumsier SAX and DOM parsers (via factory
> of course) of JDK than to bundle third-party
> packages.

The clumsiness is natural if you want it in the JDK; the JDK has a short feedback window and very, very little feedback during that window.
The thing is, as soon as a package goes into the core, its API is frozen. Additions are the only thing allowed (and those are rare). At the same time, the big crowds only see the code once the API has been frozen, and then you can fix minor, but not major, bugs.
So the theory that being in the JDK means well tested is only true if you wait a couple of years, until several releases have been published.
And even then you will most probably find another package with a better API doing the same stuff.

> To conclude - it is my opinion that packages that are
> not part of Java core are not tested well (just
> because they are not used widely enough).

Take a look at the bug parade; it has thousands and thousands of open bugs. Being well tested does not mean all bugs get fixed.
My experience is the exact opposite: the API of an open source package is allowed to evolve much faster thanks to 'release early, release often'. The turnaround time for a bug fix is days instead of years.

And on top of that, the mathematics community is much smaller than the (for example) EJB community; and since Sun prioritizes bug fixes by number of votes, you can do the math on that one.

> In
> addition, it is very unfortunate that Java doesn't
> have standard matrix package (just because it doesn't
> set a common ground for implementors like [Bjorn]).

I suggest you also check out http://www.jscience.org .

Also remember that "legacy services" (services where little invention is left to be done, since it's been done many times before), like the math area you are concerned about, have a track record of producing far more mature libraries in open source implementations than in commercial settings.
I suggest you broaden your view and take your eyes off Sun.

mthornton
Joined: 2003-06-10