This post will help you get started using Apache Spark DataFrames with Scala on the MapR Sandbox. The new Spark DataFrames API is designed to make big data processing on tabular data easier. A Spark DataFrame is a distributed collection of data organized into named columns that provides operations to filter, group, or compute aggregates, and can be used with Spark SQL.
on Jun 28, 2015
SQL will become one of the most prolific use cases in the Hadoop ecosystem, according to Forrester Research. Apache Drill is an open source SQL query engine for big data exploration. REST services and clients have emerged as popular technologies on the Internet. Apache HBase is a hugely popular Hadoop NoSQL database. This blog post discusses combining all of these technologies: SQL, Hadoop, Drill...
on Jan 6, 2015
Web Services and XML
Backbone.js gives structure to web applications by providing models with key-value binding and custom events, collections with a rich API of enumerable functions, views with declarative event handling, and connects it all to your existing API over a RESTful JSON interface. JAX-RS provides a standardized API for building RESTful web services in Java. This example will show how to...
on Sep 16, 2013
This and the next series of blog entries will highlight the Top 10 most critical web application security vulnerabilities identified by the Open Web Application Security Project (OWASP). You can use OWASP's WebGoat to learn more about the OWASP Top Ten security vulnerabilities. WebGoat is an example web application with lessons showing "what not to do" code, how to exploit the code, and...
on Sep 29, 2009
Here is a review of some concurrency tips from Joshua Bloch, Brian Goetz and others.
Prefer immutable objects/data
Immutable objects do not change after construction. They are simpler and safer, require no locks, and are thread safe. To make an object immutable, don't provide setters/mutator methods, make fields private final, and prevent...
on Sep 17, 2009
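The immutability tips in the excerpt above can be sketched in Java roughly as follows. This is a minimal illustration, not code from the original post: the `Route` class, its fields, and the `withName` method are hypothetical names chosen for the example, and marking the class `final` and copying the incoming list are choices made for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; the name and fields are illustrative only.
final class Route {
    // Fields are private and final, so they are assigned once in the constructor.
    private final String name;
    private final List<String> stops;

    Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy: later changes to the caller's list do not affect this object.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    // Getters only; there are no setters or other mutator methods.
    String getName() {
        return name;
    }

    // The returned list is unmodifiable, so callers cannot alter internal state.
    List<String> getStops() {
        return stops;
    }

    // A "change" produces a new object and leaves this one untouched.
    Route withName(String newName) {
        return new Route(newName, stops);
    }
}
```

Because a `Route` never changes after construction, it can be shared between threads without locking. The trade-off is that every modification allocates a new object.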