Skip to main content
View by: Most Recent | Topic | Community | Webloggers   
Monthly Archives:    

Blogs by topic Programming and user alois

• Accessibility • Ajax • Blogging • Business • Community 
• Databases • Deployment • Distributed • Eclipse • Education 
• EJB • Extreme Programming • Games • GlassFish • Grid 
• GUI • IDE • Instant Messaging • J2EE • J2ME 
• J2SE • Jakarta • JavaFX • JavaOne • Jini 
• JSP • JSR • JXTA • LDAP • Linux 
• Mobility • NetBeans • Open Source • OpenSolaris • OSGi 
• P2P • Patterns • Performance • Porting • Programming 
• Research • RMI • RSS Feeds • Search • Security 
• Servlets • Struts • Swing • Testing • Tools 
• Virtual Machine • Web Applications • Web Design • Web Development Tools • Web Services and XML 


Programming

Here is a little code challenge ! I'm actually working on a text-mining/semantic web application focused (for the moment) on biomedical informations and developed in Java. We are using external tools for text-mining analysis and unfortunatly theses tools don't handle HTML pretty well ... If we send raw HTML to the text-mining service, he simply break. So we must convert HTML to plain-text before processing text, and because the tools return identified words by giving their positions, we must translate theses position (or indexes) to find corresponding word in the original HTML. I created a simply implementation and posted it on gist.github.com ... can you make it better ?
on Jun 30, 2010 | Permalink | Discuss
 I'v just published an integration module for using GridGain with Spring Batch. Using this module you can distribute Spring Batch processing inside a GridGain grid with the implementation of remote chunking.
on May 4, 2010 | Permalink | Discuss