sanjay_dasgupta's Blog
A Java7 Grammar for VisualLangLab
VisualLangLab now has a Java7 (JDK7) grammar! Read on to find out how you can use the grammar to locate usages of the new Java7 (project coin) language features in the source-code of the Oracle JDK 7u3 itself.
If you are new to VisualLangLab (an easy-to-learn-and-use parser-generator), consider reading the tutorial A Quick Tour.
Java7 (JDK7) Grammar Specification
The Java7 (JDK7) grammar used is based on the contents of Chapter-18. Syntax, of The Java Language Specification (Java SE 7 Edition). A PDF version of this book is available online. The grammar in the book has been changed a little as described below.
Java7 (JDK7) Features
The grammar includes the following Java7 "project coin" language features
- Strings in switch
- Binary integral literals
- Underscores in numeric literals
- Multi-catch
- More precise rethrow (see details below)
- Improved type inference for generic instance creation (diamond)
- try-with-resources statement
- Simplified varargs method invocation (see details below)
The grammar passes samples of code containing features 5 and 8 (More precise rethrow and Simplified varargs method invocation respectively), although no particular changes were made (relative to the Java6 grammar). Because of this, the grammar can not specifically distinguish these constructs (from the containing Java6 feature).
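For readers who have not yet used these features, here is a minimal sketch of the six constructs the grammar can distinguish. The class and method names are invented for illustration and do not come from the JDK sources or from the grammar.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: each member exercises one Java7 "project coin" feature.
class Java7Features {

    // Strings in switch
    static int weekdayNumber(String day) {
        switch (day) {
            case "MON": return 1;
            case "TUE": return 2;
            default:    return 0;
        }
    }

    // try-with-resources: the reader is closed automatically on exit
    static String firstLine(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        }
    }

    // Multi-catch: one handler for two unrelated exception types
    static int parseOrZero(String s) {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return 0;
        }
    }

    public static void main(String[] args) throws IOException {
        int mask = 0b1010;                       // binary integral literal
        int million = 1_000_000;                 // underscores in a numeric literal
        List<String> names = new ArrayList<>();  // diamond
        names.add(firstLine("alpha\nbeta"));
        System.out.println(weekdayNumber("TUE") + " " + mask + " " + million
                + " " + names + " " + parseOrZero(null));
    }
}
```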
Actions to Print Feature-Name and Line-Number
Simple actions have been added to the grammar that print out a short message giving the feature-name and line-number whenever any of the new Java7 language features is recognized. The darkened rectangular areas in Figure-1 below (in which the VisualLangLab GUI is being used as a run-time environment for the Java7 grammar) illustrate some such output. There are no such actions for features 5 and 8 (see list above) since the grammar can not specifically distinguish these features.
The output of the actions (darkened areas in Figure-1 below) appears before the status line of the file it belongs to. Thus, in Figure-1 below, the four Diamond announcements belong to the file PlatformComponent.java, while the next set of announcements (two Multi-Catch and one Diamond) belong to the file ManagementFactory.java.

Figure-1. Action output indicating new language feature use and location
The last part of this article (Where are the Actions?) describes how you can locate and inspect the action-code functions that produce the highlighted output in Figure-1 above.
Grammar Changes
The following changes (to the contents of Chapter-18. Syntax) were required to make the grammar accept all source files of the Oracle JDK 7u3. The changed grammar rules are reproduced below. Additions to the original grammar are underlined, while deletions are struck out. Certain changes were made even after this blog was first published; notes within the grammar identify these changes. The attached grammar file (jls-se7-NN.txt, see below) has also been updated as needed.
Try it Yourself
To try it yourself, proceed as follows:
- Download the latest version of VisualLangLab: VLL4J.jar (you must have version 10.23 or later, as earlier versions do not recover from java bug 5050507). VisualLangLab is started just by double-clicking VLL4J.jar. You must have a JRE (6.0 or later) installed, and users on Linux, UNIX, Mac OS, etc. will need to enable execution (chmod +x ...) first.
- Get the VisualLangLab Java7 grammar: jls-se7-38.txt. After the file has been downloaded, rename it to jls-se7-38.vll (".vll" is the standard file-extension for VisualLangLab grammar files, but java.net blogs do not permit attachment of such files). Within VisualLangLab, open the grammar file by clicking the Open button (near the red "1" in Figure-2 below) or invoking File -> Open from the main menu, selecting jls-se7-38.vll in the file-chooser dialog presented, and clicking the Open button. (Grammar file updated 2012-MAR-09 08:40 IST)
- Unzip the file src.zip from the Oracle JDK into a directory with a well-known name.
- Within VisualLangLab, click the Parse file button (near the red "2" in Figure-2 below) or select Test -> Parse file from the main menu, select the directory containing the JDK source files (from step 3 above), and click the Open button.
VisualLangLab dredges up all the files contained in the chosen directory tree (last step above), and parses them one by one. You should see a growing/scrolling list of status information (one line per file) in the Parser Log area (bottom right of GUI), as in Figure-3 and Figure-4 below. The time taken to complete parsing of all 7485 files will vary depending on the power of your computer. On my desktop computer (Pentium Dual-Core E5700 @ 3.00 GHz with 2 GB memory, running Ubuntu 10.10) it takes approximately 11 minutes.
Important note: The top-level parser rule CompilationUnit must be selected in the toolbar's dropdown-list (as in Figure-2 below) when parsing is started (step 4 above).

Figure-2. VisualLangLab buttons
Analyzing the Results
When parsing the Oracle JDK 7u3's source files you should see 16 failures (see the status line at the bottom of the GUI after all files are parsed). A group of 14 failures occurs because the files contain C (source and header) code belonging to Java's launcher. The red status lines in Figure-3 below show this group.
Figure-3. Parse failures of C source and header files under directory launcher
In addition, you may see a few more failures that occur as a consequence of java bug 5050507 within VisualLangLab's lexical-analyzer. Figure-4 below shows some such failures. The number of these failures is not consistent -- being dependent on the amount of memory available to the JRE, the JRE version, etc.
Figure-4. Parsing failures due to Java bug 5050507
Which Java7 Features are Used in JDK 7u3?
For greater flexibility in analyzing the Parser Log information, you should copy it into a text-file first. The logged information can be copied to the clipboard by clicking the Copy log button (near the red "3" in Figure-2 above) or selecting Log -> Copy log from the main menu. You can then paste (Edit -> Paste in most editors) the copied information into an empty text file.
Source-files that failed to parse can be found by searching for the string ": ERROR" (without the quote marks, and with one blank between the colon and the 'E'). Source-files that use specific Java7 language features can be found by searching for the following strings:
- multi-catch
- try-with-resource
- case-with-string
- diamond
- underscore-numeric-literal
- binary-literal
My own analysis of the Parser Log produced the following results:
- diamond - most used Java7 language feature
- try-with-resource - used in 7 files
- multi-catch - used in 5 files
- case-with-string - used in just 1 file
- underscore-numeric-literal - not used in Oracle JDK 7u3
- binary-literal - not used in Oracle JDK 7u3
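The searches described above can also be automated once the log has been pasted into a text file. The following is a minimal sketch that counts the log lines containing each search string. The class name and the sample log lines used to exercise it are hypothetical, and it counts matching lines, not distinct source files, so treat the numbers as a first approximation.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: tallies Parser Log lines per search string.
class LogFeatureCount {

    // The search strings listed in the article
    static final String[] MARKERS = {
        ": ERROR", "multi-catch", "try-with-resource", "case-with-string",
        "diamond", "underscore-numeric-literal", "binary-literal"
    };

    static Map<String, Integer> countLines(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String marker : MARKERS) {
            counts.put(marker, 0);
        }
        for (String line : logLines) {
            for (String marker : MARKERS) {
                if (line.contains(marker)) {
                    counts.put(marker, counts.get(marker) + 1);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java LogFeatureCount <saved-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : countLines(lines).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue() + " line(s)");
        }
    }
}
```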
Where are the Actions?
If you want to locate and understand the actions that produce the messages shown above, this section is for you.
Parser-rules that contain one or more actions are distinguished with a small, green icon with a white arrow shape as in Figure-5 below (above the red "1" in the figure). After selecting such a parser-rule, look for rule-tree nodes with the action annotation (like the one above the red "2" in Figure-5). Selecting (clicking on) such a node causes its action-code function to be displayed under the Action Code panel (top right of GUI, at red "3" in the figure).
Figure-5. Inspecting action-code functions
Action-code functions are explained fully in Action Code Design. The action-code functions used with this grammar vary widely in complexity. The structure/complexity of the action-code reflects the structure and complexity of the AST produced by the parser (which is explained in AST Structure and Action Code).
A tutorial that explains parser development with VisualLangLab can be found in VisualLangLab - A Quick Tour. If you are a Scala user, you may also find Rapid Prototyping for Scala Parser Combinators interesting.
| Attachment | Size |
|---|---|
| VisualLangLab-Buttons.png | 18.07 KB |
| Launcher-Files-Errors.png | 83.42 KB |
| Action-Output.png | 72.01 KB |
| Java-Bug-5050507-Errors.png | 92.6 KB |
| Inspecting-Actions.png | 84.42 KB |
| jls-se7-35.txt | 49.57 KB |
| jls-se7-37.txt | 49.12 KB |
| jls-se7-38.txt | 49.09 KB |
Vulcan-ized Rhino: Telepathic Power for your Code
In this article we coax the JVM's Rhino (an elusive, misunderstood, and ignored member of the ecosystem) into a mind meld, giving it access to the JVM's thoughts, experiences, memories, and knowledge; and take it where no Rhino has gone before!
Let me set the context with some quick code:
ScriptEngineManager sem = new ScriptEngineManager();
ScriptEngine jsEngine = sem.getEngineByName("javascript");
...
String message = "Hello rhino!";
...
jsEngine.eval("println(message)");
Everyone knows that this code does not work (it produces a "ReferenceError: "message" is not defined"). To make it work the variable message must be put into the script engine's bindings, as described in these articles. That's easily done. But the overhead and distraction of the extra boilerplate makes the body of code much less intuitive. (The 4-line example above already has 2 lines of distracting boilerplate!)
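For reference, the conventional fix looks roughly like the sketch below, which uses the standard javax.script call ScriptEngine.put to bind the variable before evaluating the script. The class and method names are made up for this illustration. The sketch evaluates the bare expression "message" instead of calling println, because println is specific to Rhino, and it returns null when no engine named "javascript" is installed (recent JDKs no longer bundle one).

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Illustrative only: the explicit-binding boilerplate the article wants to avoid.
class BindingsDemo {

    // Binds one Java value under the given name, then evaluates the script.
    // Returns null if this JVM has no JavaScript engine.
    static Object evalWithBinding(String name, Object value, String script) {
        ScriptEngine jsEngine = new ScriptEngineManager().getEngineByName("javascript");
        if (jsEngine == null) {
            return null;
        }
        jsEngine.put(name, value);   // the extra line that makes the variable visible
        try {
            return jsEngine.eval(script);
        } catch (ScriptException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String message = "Hello rhino!";
        Object result = evalWithBinding("message", message, "message");
        System.out.println(result == null
                ? "no JavaScript engine available"
                : String.valueOf(result));
    }
}
```

One put call per variable is manageable here, but it has to be repeated (and kept up to date) for every variable a script might want to see, which is the overhead the rest of the article sets out to remove.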
A Quick Example
What can we do to make something as simple as "println(message)" in a script just work? In fact, let's raise the bar some more. Take a look at Sqrt.java. Let's say you were explaining that code to a novice, and wanted to provide a probe into the while loop of the running program, by adding the two-line if statement at the end of the loop body:
...
while (Math.abs(t - c/t) > epsilon*t) {
    t = (c/t + t) / 2.0;
    if (args.length == 2)
        VulcanRhino.eval(args[1]);
}
...
Think of class VulcanRhino as your friendly telepathic pachyderm, and eval() its static, void JavaScript evaluator. The idea is that a JavaScript snippet could be passed into the program as an optional second command-line argument. That snippet (specified at run time) could contain logic with references to any of the in-scope Java variables. The code above is a simple example. But this approach allows you to include any number of VulcanRhino.eval()s, located wherever the invocation of a static void function would be legal, each executing a different script. Each invocation of VulcanRhino.eval() has access to all in-scope variables at its location.
Our modified Sqrt.java would run normally (doing nothing unusual) if run with just one command-line argument, but giving it a second argument awakens the slumbering telepath. Here are a few sample runs (the different colors separate the command line from the program's output) ...
| See how "t" evolves | Track value of "c/t" |
|---|---|
| (sample run not preserved) | (sample run not preserved) |
The last line of output (struck out) is not from the script, but is the program's normal 1-line output. The examples above use scripts to track the values of "t" and "c/t" respectively. But you are free to pass in any expression that makes sense at the location of VulcanRhino.eval(). You may even use it for something completely unforeseen ...
| Timing the loop | Memory problem? |
|---|---|
| (sample run not preserved) | (sample run not preserved) |
The one thing you can not do with a script in this way is to assign a value to a variable.
The Vulcan-Rhino User Guide
To use this approach, you must pre-process your source-code using a tool described below. This step is the key to the magic -- it augments each VulcanRhino.eval() in your code with something that gives it access to all the in-scope variables. So, proceed as follows:
- edit your program (say Sqrt.java), adding VulcanRhino.eval()s as required, and save it with a different name (say SqrtVR.java)
- pre-process SqrtVR.java following instructions below. Save the output as Sqrt.java. Note: this overwrites any other Sqrt.java
- run as usual, making sure that class VulcanRhino is on the classpath. The VulcanRhino.java source should be compiled and deployed as required.
To pre-process a file use the following command:
java >Sqrt.java -cp VLL4J.jar net.java.vll.vll4j.api.Vll4j VulcanRhino.vll SqrtVR.java
The files used are described below:
- VLL4J.jar is the distribution JAR from the VisualLangLab project (a visual parser-generator)
- VulcanRhino.vll is the transformation grammar described further under Pre-Processor Internals below
- SqrtVR.java is actually Sqrt.java saved with a different name
If you have trouble with the above steps, check the following:
- does a SqrtVR.java file exist?
- have you edited SqrtVR.java to add VulcanRhino.eval(args[1])?
- copy and paste the command line above directly
- ensure VulcanRhino has been compiled and exists on the classpath
How Does it Work?
Let's first get VulcanRhino out of the way. Observe that eval() does nothing special, but there is another function defVars() that enables the caller to inject information about variables into the JavaScript engine.
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class VulcanRhino {

    // Evaluates the script; errors are reported but never propagate to the caller
    public static void eval(String script) {
        try {
            engine.eval(script);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Expects alternating name/value arguments: "name1", value1, "name2", value2, ...
    public static void defVars(Object... args) {
        // Discard variables left over from the previous call site
        engine.getBindings(ScriptContext.ENGINE_SCOPE).clear();
        for (int i = 0; i < args.length; i += 2) {
            String name = (String)args[i];
            Object value = args[i + 1];
            engine.put(name, value);
        }
    }

    // One engine shared by all eval() and defVars() calls
    static ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
}
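The calling convention of defVars() is easy to get wrong, so here is a small sketch of the same name/value pairing loop in isolation. It collects the pairs into a Map instead of a script engine, which means it runs without any JavaScript support. The class NameValuePairs and its toMap method are hypothetical and are not part of VulcanRhino.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for defVars(): same pairing loop, but the result
// is a Map so the convention can be inspected without a script engine.
class NameValuePairs {

    static Map<String, Object> toMap(Object... args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("expected alternating name/value arguments");
        }
        Map<String, Object> vars = new LinkedHashMap<String, Object>();
        for (int i = 0; i < args.length; i += 2) {
            vars.put((String) args[i], args[i + 1]);   // even index: name, odd index: value
        }
        return vars;
    }

    public static void main(String[] args) {
        double c = 2.0;
        double t = 1.5;
        System.out.println(toMap("c", c, "t", t));   // prints {c=2.0, t=1.5}
    }
}
```

Note that the values are passed by value (primitives are boxed), which is consistent with the earlier remark that a script cannot assign to a Java variable.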
Next take a look at the pre-processed version of Sqrt.java.
...
while (Math.abs(t - c/t) > epsilon*t) {
    t = (c/t + t) / 2.0;
    if (args.length == 2)
        {VulcanRhino.defVars("epsilon", epsilon, "c", c, "t", t, "args", args); VulcanRhino.eval(args[1]);}
}
...
The part you added is still there, but the pre-processor has spliced in the VulcanRhino.defVars(...) call and the enclosing braces. The pre-processor makes this change at each occurrence of VulcanRhino.eval(...), injecting information about the locally visible variables into the JavaScript engine.
Pre-Processor Internals
I won't go into all the details here, presuming that not everyone is interested. So the remaining part of the article is a short summary of the technique together with links to all the other material you will need to understand the details.
The pre-processor uses a parser for the Java language to analyze your program and obtain information about which variables are visible at each VulcanRhino.eval(...) location. It then modifies the source code by wrapping each VulcanRhino.eval(...) in a block ({ ... }) preceded by a VulcanRhino.defVars(...) call that injects the information required into the JavaScript engine.
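The shape of that rewrite can be sketched in a few lines of Java. This is only an illustration of the text that gets produced: the real pre-processing is done by JavaScript action code inside the grammar, and the class EvalWrapper and its wrap method are invented for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the text the pre-processor emits for one call site.
class EvalWrapper {

    // inScope: names of the variables visible at the call site
    // evalStatement: the original statement, e.g. "VulcanRhino.eval(args[1]);"
    static String wrap(List<String> inScope, String evalStatement) {
        StringBuilder sb = new StringBuilder("{VulcanRhino.defVars(");
        for (int i = 0; i < inScope.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            String name = inScope.get(i);
            // each variable contributes a quoted name followed by the variable itself
            sb.append('"').append(name).append("\", ").append(name);
        }
        return sb.append("); ").append(evalStatement).append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap(Arrays.asList("epsilon", "c", "t", "args"),
                "VulcanRhino.eval(args[1]);"));
    }
}
```

Run with the four variables visible in the Sqrt.java loop, this reproduces the spliced line shown in the pre-processed listing above. Wrapping the pair of calls in braces matters because the original call may be the single-statement body of an if, as it is in Sqrt.java.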
The parser-generator used is the easily learned, completely visual tool VisualLangLab. For an introductory tutorial look at A Quick Tour. Scala programmers will find Rapid Prototyping for Scala useful too.
The last piece of the puzzle is in the grammar file VulcanRhino.vll. This file contains a Java grammar modified with action functions that perform the pre-processing. To examine the grammar, its rules, and the code in the action functions, proceed as follows:
- double-click VLL4J.jar (the same file used in the pre-processing step described above). This will start up the VisualLangLab GUI as shown in Figure-1 below
- select "File" -> "Open" from the main menu, choose the grammar file VulcanRhino.vll, then click the Open button
- in the rule-tree (the JTree at the left of the GUI) select (click on) the node just below the root node (see red arrow). This will cause the action-code associated with this parser-rule to be displayed under Action Code (right side of the GUI). This is the code (in JavaScript) that pre-processes your code
Figure-1 VisualLangLab GUI with VulcanRhino grammar loaded
The information used by the action-code above is in several global variables (VLL members). That information is gathered by other action-code in other rules. To examine all the remaining code proceed as follows:
- select the rule named block (use the combobox in the toolbar), and click on the reference node labeled blockStatement
- select the rule variableDeclaratorId, and click on the sequence node just below the root node
- select statement, click on the node just below the token node for FOR
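The three rules listed above (block, variableDeclaratorId, and the FOR branch of statement) are the places where scopes open, variables are declared, and scopes close. The sketch below shows one plausible way to keep such bookkeeping, using a stack of scopes. It is an assumption about the general technique, written in Java for readability. The grammar's actual action code is JavaScript and may be organized differently, and the class ScopeTracker is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical scope bookkeeping: one list of names per open block or for statement.
class ScopeTracker {

    private final Deque<List<String>> scopes = new ArrayDeque<List<String>>();

    void enter() {                       // on entering a block or a for statement
        scopes.push(new ArrayList<String>());
    }

    void declare(String name) {          // on seeing a variable declarator
        scopes.peek().add(name);
    }

    void exit() {                        // on leaving the block or for statement
        scopes.pop();
    }

    List<String> visible() {             // names to pass to defVars at this point
        List<String> all = new ArrayList<String>();
        // descendingIterator walks from the outermost scope to the innermost
        Iterator<List<String>> it = scopes.descendingIterator();
        while (it.hasNext()) {
            all.addAll(it.next());
        }
        return all;
    }

    public static void main(String[] args) {
        ScopeTracker tracker = new ScopeTracker();
        tracker.enter();                 // method body
        tracker.declare("args");
        tracker.declare("t");
        tracker.enter();                 // for statement
        tracker.declare("i");
        System.out.println(tracker.visible());   // prints [args, t, i]
        tracker.exit();                  // the index variable leaves scope here
        System.out.println(tracker.visible());   // prints [args, t]
    }
}
```

The for statement needs its own scope, separate from its body block, because its index variable is declared outside the braces yet must disappear when the statement ends.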
If you do want to pursue this further, a thorough reading of A Quick Tour is strongly recommended. You will also need AST and Action Code and Editing the Grammar Tree.
| Attachment | Size |
|---|---|
| Using-VisualLangLab.png | 49.78 KB |
Preview of VisualLangLab Pure-Java Version Available
A preview of the pure Java version of VisualLangLab is available here. The GUI, and other characteristics, remain virtually unchanged (see documentation), but the download is very much smaller as it does not bundle the entire Scala API. The preview does not yet support packrat parsing, and an API for application programs is not yet available. All grammar development and testing features are however available.
The reduction in jar-file size was achieved by rewriting in Java the elements of the Scala API actually used (and not bundling entire jar files). This pure Java version will differ from the previous version in the following ways:
- Action-code must be written in JavaScript only.
- Minor changes to the AST structure.
- The AST is described in terms of Java/JVM data-structures.
- The API will support application programming in all JVM languages uniformly (including Scala).
The documentation, examples, and sample-code all remain applicable with no (or very minor) changes. You can get the preview version here. To start VisualLangLab Java version, just download and double-click the VLL4J.jar file.
Help me test this version by using it for your next parser project. For a comprehensive tutorial, see A Quick Tour. This tutorial was written for the previous version, and uses a few Scala action-code functions (which are not supported in the pure Java preview). But you can find JavaScript versions of these action-code functions in the sample grammars bundled with the preview version. Just select "Help -> Sample grammars -> TDAR-Expr-Action" from the main menu (of the pure Java preview version).
VisualLangLab 7: New Features, Expanded Tutorials
A new tutorial that exercises VisualLangLab using all the examples and techniques in Chapter-3, A Quick Tour for the Impatient, of the book The Definitive ANTLR Reference can be found at this link.
Various other improvements have been made in version 7:
- A new WildCard pseudo-token that matches any defined token has been added to facilitate recovery from errors in the grammar.
- Each part of the AST description is now tagged with the contributing sub-rule's description field to clarify the sub-rule to AST-segment association.
- Intuitive icons have been added to the rule-tree's context menus.
- Use of the token-creation dialogs has been simplified.
These changes make VisualLangLab even more powerful and user friendly.
VisualLangLab supports all JVM Languages
With the release of version 6.01, VisualLangLab can support all -- present & future -- JVM languages.
VisualLangLab's approach of composing parsers at runtime by using combinator functions instead of generating code (as other parser generators do) enables these parsers to be embedded into a host program in any JVM language. Eschewing code generation also eliminates all host-language specificity, so all yet-to-be-invented languages are already supported!
Before 6.01, the parser-generated AST was defined in terms of Scala types, so the API was awkward to use from any other language. Release 6.01 optionally provides ASTs crafted from basic JVM types only, so host programs in any JVM language -- present or future -- will be able to use the API natively.
The VisualLangLab documentation includes example host programs in Scala and Java. A third example featuring Clojure is in the works, and will feature in the documentation soon.




Comments
by sanjay_dasgupta - 2012-01-28 01:21
There was an error in the grammar file (VulcanRhino.vll) that I corrected at around 08:10 hours GMT on 28th Jan. Although the example in the article would still have worked correctly, anyone who tried to use this approach with code containing a for loop would have noticed that the for's index variable was not being removed from scope at the end of the for statement. My apologies for any inconvenience caused.