What it means to speak German fluently and to be capable of C++
Several years ago one of our key coders moved from the south of Germany (where our HQ is located in the Black Forest) to the cold and rainy north, so we had to find a suitable substitute. After screening lots of applications, we picked a few to invite for an interview. It declared the candidate's ability to speak German and C++. So she...
on Feb 20, 2010
RESTless about RESTful
These days there is much discussion about REST and HATEOAS, and many people feel compelled to reinterpret what HATEOAS means, what Roy Fielding's often-cited dissertation allegedly says, and therefore how HATEOAS should be implemented. While I was at first amused by this "dispute about nothing" (just ask Mr Fielding if you don't...
on Feb 14, 2010
At my famous company innoQ I currently have the opportunity to work on a really cool tool: Bundle-Bee. It claims to be able to take any OSGi bundle and distribute its computational load across an ad-hoc grid (e.g. all machines in an office) without special setup or configuration.
We just released version 0.5.3, which is still very restricted and far from feature-complete - we don't even...
on Jan 28, 2010
This blog entry describes how WebSphere eXtreme Scale uses memory. This allows customers to better size how much memory they need when storing a large number of key value pairs in a grid.
The text is in my personal blog at this link.
on Oct 28, 2009
The W3C Social Web Incubator Group is organizing a free Bar Camp at the Sun campus in Santa Clara on November 2nd to foster a wide-ranging discussion on the issues required to build the global Social Web.
Imagine a world where everybody could participate easily in a distributed yet secure Social Web. In such a world everyone would be able to control their own information, and every business would...
on Oct 26, 2009
Content at: http://blog.arungupta.me/2009/10/hudson-webinar-and-qa-1014-10am-pt/.
on Oct 13, 2009
Java Champion Alan Williamson posted "A Simple Java class for Amazon SimpleSQS": "With such a beautiful service such as the Amazon Simple Queue Service, it shouldn't be wrapped up with a lot of complicated layers of classes for utilizing. That is why I developed the simple POJO, single class method for utilising Amazon SQS from within Java..."
on Aug 31, 2009
Expanding on the fun from my previous blog entry:
I hereby publicly claim that there exists no Java distributed computing framework that is equally flexible, and as fast, as cajo.
I challenge all wizards: Dare thee make me eat mine own words?
Welcome everyone, even teams, pick thee thine favourite magick: EJB, Spring, Jini, CORBA, JXTA, Terracotta, GridGain, etc... or even craft thine own! Let's...
on Jul 26, 2009
MapReduce is a programming model for processing vast amounts of data. One of the reasons that it works so well is because it exploits a sweet spot of modern disk drive technology trends. In essence MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the transfer rate of the disk. Contrast this to accessing data from a relational database that operates at...
on Mar 18, 2008
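The sort-and-merge idea in the MapReduce excerpt above can be sketched as a toy, in-memory word count. This is only an illustration under stated assumptions: the class and method names (MiniMapReduce, map, sortByKey, reduce, wordCount) are hypothetical, it does not use Hadoop's API, and a real framework would stream the sorted pairs to and from disk across many machines rather than hold them in a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy class: illustrates map -> sort -> merge in memory only.
final class MiniMapReduce {

    // Map phase: emit a (word, 1) pair for every word in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    pairs.add(Map.entry(word, 1));
                }
            }
        }
        return pairs;
    }

    // Sort phase: order the pairs by key so equal keys become adjacent.
    static void sortByKey(List<Map.Entry<String, Integer>> pairs) {
        pairs.sort(Map.Entry.comparingByKey());
    }

    // Reduce phase: one sequential pass that merges each run of equal keys.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> sorted) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String currentKey = null;
        int sum = 0;
        for (Map.Entry<String, Integer> pair : sorted) {
            if (currentKey != null && !currentKey.equals(pair.getKey())) {
                counts.put(currentKey, sum); // the run for currentKey has ended
                sum = 0;
            }
            currentKey = pair.getKey();
            sum += pair.getValue();
        }
        if (currentKey != null) {
            counts.put(currentKey, sum); // flush the final run
        }
        return counts;
    }

    // Runs the three phases in order.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = map(lines);
        sortByKey(pairs);
        return reduce(pairs);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("to be or not to be", "To be")));
    }
}
```

The reduce step deliberately relies on the input being sorted: it never looks anything up by key, it only reads the pairs front to back, which is the access pattern that suits sequential disk transfer.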
For the past twelve months, I have been involved with the Service Component Architecture (SCA) specifications and two of the open source SCA implementations. Now that SCA is gaining industry traction, I would like to use my weblog here to introduce the technology and demonstrate how SCA can be used for building standards-based enterprise class applications using service-oriented principles and...
on Jan 26, 2008
I've bumped into consistent hashing a couple of times lately. The paper that introduced the idea (Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web by David Karger et al) appeared ten years ago, although recently it seems the idea has quietly been finding its way into more and more services, from Amazon's Dynamo to memcached (...
on Nov 27, 2007
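For readers who have not met consistent hashing, the idea from the excerpt above can be sketched with a sorted map acting as the hash ring. This is a minimal sketch, not the implementation used by Dynamo or memcached clients: the class name ConsistentHashRing, the "node#i" naming of virtual points, and the choice of MD5 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a hash ring; names and hash choice are assumptions.
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int replicas; // virtual points per node, to even out the load

    ConsistentHashRing(int replicas) {
        this.replicas = replicas;
    }

    // Place several virtual points for the node on the ring.
    void addNode(String node) {
        for (int i = 0; i < replicas; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    // Remove only the points that belong to this node.
    void removeNode(String node) {
        for (int i = 0; i < replicas; i++) {
            ring.remove(hash(node + "#" + i), node);
        }
    }

    // A key is owned by the first point at or after its hash, wrapping around.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Use the first 8 bytes of an MD5 digest as a 64-bit position on the ring.
    static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing(100);
        ring.addNode("cache-a");
        ring.addNode("cache-b");
        ring.addNode("cache-c");
        System.out.println("user:42 -> " + ring.nodeFor("user:42"));
    }
}
```

The property that makes this attractive for caches is that removing a node only reassigns the keys that node owned; every other key keeps its owner, because its nearest following point on the ring is unchanged.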
I am very pleased to announce a most significant breakthrough from the cajo project, in the ease with which distributed computing can be accomplished in Java; and in only 20 kilobytes. It works with all JREs, 1.3 and later. (And before you Rocket Scientists out there ask; yes, it's also 64-bit clean ;)
Just three methods: (click the link, for greater detail)
void export(Object object);
on Sep 3, 2007
I've raved about the MapReduce parallel programming model in the past, and Apache Hadoop (the framework for running MapReduce applications), and Amazon's compute and storage web services (EC2 and S3). Now I've written an article - Running Hadoop MapReduce on Amazon EC2 and Amazon S3 - about using them all together to do some data crunching.
The nice thing is that you can fire up a fair sized...
on Jul 20, 2007
SalutafugiJMS is a peer-to-peer implementation of the Java Messaging Service specification that uses ZeroConf DNS-SD discovery and TCP sockets to communicate in a distributed computing system. I built it after seeing Daniel Steinberg's JavaOne talk on ZeroConf. SalutafugiJMS uses SomnifugiJMS as a skeleton and Apple's Bonjour implementation of ZeroConf for muscle inside special SomnifugiJMS...
on Jun 24, 2007
There's a small presence at JavaOne... but an advanced research project at Sun, "Project Caroline", is gaining some *real* interest at the conference.
Project Caroline is a hosting platform designed to support SaaS providers in the development and delivery of dynamically scalable Internet-based services. The key idea around the platform is to present a pool of distributed compute, storage, and...
on May 8, 2007
Ok, I think I've spent enough time on preliminaries, so this time I'm gonna show you some UML diagrams and code. I also have to introduce you to Emmanuele Sordini, one of my best friends and co-author of the Mistral project. Emmanuele is an engineer like me (but he's more on the C++ side) and an amateur photographer like me (but he's more on the astronomical photography side) and some months ago told me...
on Nov 21, 2006
In March I wrote of affordable web-scale computing:
I would love an API that exposes Google's MapReduce, a simple programming model for crunching on large datasets. You can write and run MapReduce programs today, using Hadoop, but it's only really useful if you have enough machines at your disposal. The pay-as-you-go model of S3 (and Sun Grid) would be very attractive to developers who want...
on Aug 24, 2006
In case you haven't heard of it, Amazon S3 is a web service for storing data.
The two great things about it are that it's simple (look at its nice REST API), and it's cheap (with a pay-as-you-go charging model).
This latter point explains the growing number of startups that are using it to launch new business ventures: no data silos to maintain, and pay by the gigabyte.
My favourite innovative...
on Aug 13, 2006
With the launch of Amazon S3 (Simple Storage Service) we are seeing a continuation of the trend for the big web companies to monetize their computing infrastructure by opening it up to developers.
It is probably only a matter of time before we see Google create something similar, which would essentially be a limited public interface onto the Google File System.
I would love an API that exposes...
on Mar 17, 2006
In a previous blog I wrote about Nutch's MapReduce implementation, for distributed processing of massive data sets. This, and the closely related Nutch Distributed File System (renamed Hadoop Distributed File System), have now been moved into a standalone project called Hadoop.
According to Doug Cutting, who created Hadoop (as well as Lucene and Nutch), the name comes from:
The name my kid...
on Feb 8, 2006