Thursday, February 28, 2013

Creating a Maven Project in Eclipse

This tutorial demonstrates how to create a simple Maven project in Eclipse. I assume you already have Eclipse installed. If you don't have Maven installed yet, read my article on installing Maven before proceeding. This tutorial also uses the m2eclipse plugin; if you haven't set it up yet, read my tutorial on setting up Maven in Eclipse.

Now that you have Eclipse, Maven, and the m2eclipse plugin installed, let's get started creating a simple Maven project in Eclipse. Within Eclipse, click the "File -> New -> Project..." menu option. Expand the Maven option, select "Maven Project" and click "Next >".



You'll be presented with the "New Maven Project" dialog. Leave the default options as-is and click "Next >".


The next dialog shows you a list of the available archetypes. An archetype is a reusable, redistributable project template. For this example, choose the "maven-archetype-quickstart" archetype and click "Next >".


Next, we need to provide a Group ID and an Artifact ID for our Maven project. The Group ID identifies your project across all of your Maven projects and follows the same naming scheme as a Java package name. Typically, your Group ID will be your reversed domain name followed by the name of your project, and possibly a sub-project name if you plan to have multiple sub-projects or subsystems within your main application. The Artifact ID is the base name of the JAR file produced when your project is packaged and deployed (Maven appends the version to it). For this tutorial, enter "org.examples.quickstart" for the Group ID and "quickstart-app" for the Artifact ID. Leave the version number as-is and enter whatever you'd like for your initial package name. I chose org.examples.quickstart for my project. Click the "Finish" button.
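To make the relationship between these identifiers a little more concrete, here is a minimal Java sketch. The MavenCoordinates class is a hypothetical helper written for this post, not part of Maven or m2eclipse, and the version 0.0.1-SNAPSHOT is only my assumption about what the wizard leaves in the version field. It shows the usual "groupId:artifactId:version" notation and the default JAR file name Maven derives from the Artifact ID and version.

```java
// Hypothetical helper for illustration; not part of Maven or m2eclipse.
final class MavenCoordinates {
    final String groupId;
    final String artifactId;
    final String version;

    MavenCoordinates(String groupId, String artifactId, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    // The conventional "groupId:artifactId:version" notation.
    String gav() {
        return groupId + ":" + artifactId + ":" + version;
    }

    // With jar packaging, the default file name is <artifactId>-<version>.jar.
    String jarFileName() {
        return artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        // The version below is an assumed wizard default, not taken from the tutorial.
        MavenCoordinates c = new MavenCoordinates(
                "org.examples.quickstart", "quickstart-app", "0.0.1-SNAPSHOT");
        System.out.println(c.gav());
        System.out.println(c.jarFileName());
    }
}
```

Note that the Group ID does not appear in the JAR file name; it only determines where the artifact lives in a Maven repository.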


After clicking "Finish", you'll find that your Maven project has been created and is visible in your Eclipse project explorer.



Setting Up Maven in Eclipse

The easiest way to use any tool (in my opinion) is to have it integrated with your IDE. This tutorial shows you how to set up Maven in your Eclipse IDE. If you don't already have Maven installed, read my tutorial on installing Maven before proceeding with this tutorial.

With Maven and Eclipse installed on your PC, the next step is to install the m2eclipse plugin. We'll use the Eclipse marketplace to install the m2eclipse plugin. From within Eclipse, click the Help -> Marketplace menu option. Next, click on the "Search" tab. Enter "maven integration" in the search field and click the "Go" button. When the search is complete, there will be a list of options. Scroll down to find "Maven Integration for Eclipse" and click the "Install" button.


The next dialog will ask you to confirm the installation. Select just the "m2e - Maven Integration for Eclipse" option and click the "Next >" button.




You'll be prompted to accept the terms of the license agreement. Accept the terms and click the "Finish" button. At the end of the installation process, you'll be prompted to restart Eclipse. Proceed by restarting your Eclipse IDE. To begin using m2eclipse, read my tutorial on creating a simple Maven project.

Installing Maven on Windows

It goes without saying that Maven depends on Java. This tutorial assumes that you're running at least version 1.5, so before proceeding, run "java -version" from your command prompt to verify the version of your Java installation.
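If you'd rather check the version from code, the following is a small sketch of the same check in Java. The JavaVersionCheck class and its featureVersion method are hypothetical names I made up for this example; the only real API used is the standard "java.version" system property. It assumes the usual version string formats, such as 1.5.0_22 for older releases and 17.0.2 for newer ones.

```java
// Hypothetical helper for illustration only.
final class JavaVersionCheck {
    // Extracts the feature number: 5 from "1.5.0_22", 8 from "1.8.0_202", 17 from "17.0.2".
    static int featureVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Releases before Java 9 report themselves as 1.x, so use the second number.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int feature = featureVersion(version);
        System.out.println("Java version: " + version);
        System.out.println(feature >= 5
                ? "OK for this tutorial"
                : "Too old; this tutorial assumes Java 1.5 or later");
    }
}
```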

You can download Maven from the Apache website: http://maven.apache.org/download.cgi

Simply download the binary ZIP of the most recent version to your desktop.



After downloading the ZIP file, unzip it to whatever directory you like. For simplicity, I'll assume that you've unzipped it to the root of your C: drive, resulting in C:\apache-maven-3.0.5\

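After you've unzipped the binary, there are two environment variables you need to set. The first is M2_HOME, which should point to the directory where you unzipped Maven (for example, C:\apache-maven-3.0.5). The second is PATH, to which you need to append Maven's bin directory (%M2_HOME%\bin).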





To test your Maven installation, open a new command prompt and type the "mvn -v" command. If you've installed it correctly, you should get a description of your Maven installation. If you don't see the Maven description, double-check that your M2_HOME and PATH environment variables are set correctly.
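If "mvn -v" doesn't work and you want a quick way to double-check those two variables, here is a rough diagnostic sketch in Java. MavenEnvCheck and its methods are hypothetical names invented for this post (Maven ships nothing like this), and the case-insensitive comparison is an assumption that suits Windows paths. It reads M2_HOME and PATH with System.getenv and reports whether Maven's bin directory is on the PATH.

```java
// Hypothetical diagnostic for illustration; not part of Maven.
final class MavenEnvCheck {
    // True if any entry of the PATH-style list equals the given directory,
    // ignoring case and trailing separators (an assumption that suits Windows).
    static boolean pathContains(String path, String dir, String listSeparator) {
        String wanted = normalize(dir);
        for (String entry : path.split(java.util.regex.Pattern.quote(listSeparator))) {
            if (normalize(entry).equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Trims whitespace and strips trailing slashes or backslashes.
    private static String normalize(String dir) {
        String d = dir.trim();
        while (d.length() > 1 && (d.endsWith("\\") || d.endsWith("/"))) {
            d = d.substring(0, d.length() - 1);
        }
        return d;
    }

    public static void main(String[] args) {
        String m2Home = System.getenv("M2_HOME");
        String path = System.getenv("PATH");
        if (m2Home == null) {
            System.out.println("M2_HOME is not set");
            return;
        }
        String bin = normalize(m2Home) + java.io.File.separator + "bin";
        boolean onPath = path != null
                && pathContains(path, bin, java.io.File.pathSeparator);
        System.out.println("M2_HOME = " + m2Home);
        System.out.println(bin + (onPath ? " is on PATH" : " is NOT on PATH"));
    }
}
```

Remember to run it from a new command prompt, since an already-open prompt won't see environment variables you've just changed.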


Thursday, February 14, 2013

Introduction to Java Bytecode

When you compile your Java applications, the Java compiler compiles your code down to Java bytecode that is run on the Java Virtual Machine (JVM). Java bytecode is similar to assembly language in its set of instructions. This is a very simple introduction to Java bytecode, along with an example of how a simple arithmetic expression will look when compiled down to bytecode. To keep this tutorial at a "bytecode 101" level, let's focus on only the following four bytecode instructions.
  1. ldc  integer-constant
     This instruction pushes a constant integer value onto the stack.
  2. imul
     This instruction multiplies the top two integer values on the stack, replacing the two integers with the result of the multiplication operation.
  3. iadd
     This instruction adds the top two integer values on the stack, replacing the two integers with the result of the addition operation.
  4. isub
     This instruction subtracts the top integer value on the stack from the value beneath it, replacing the two integers with the result of the subtraction operation.

Take the following arithmetic expression:

7 + 8 * 9 - 24

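Conceptually, this expression compiles down to the following Java bytecode (in practice, the compiler would fold an all-constant expression like this into a single value, but the unfolded form is more instructive):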

ldc   7
ldc   8
ldc   9
imul
iadd
ldc   24
isub

As you can see, the JVM places information and data onto the stack and performs stack operations on the information. Obviously, this is not a complete coverage of bytecode operations and the JVM, but I hope this helps provide a basic understanding of what Java bytecode looks like, while also showing how it works. In future tutorials, I'll provide more complicated examples. In the meantime, if you're interested in more information about the JVM and some of the optimizations it performs, read my blog post on JVM optimizations.
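To see the stack at work, here is a toy interpreter sketched in Java for just the four instructions covered here. TinyStackMachine is a hypothetical class written for this post; it is not how the JVM is implemented. It only mimics the push and pop behavior, under the assumption that each instruction is given as a line of text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an operand stack; for illustration only.
final class TinyStackMachine {
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<Integer>();
        for (String line : program) {
            String[] parts = line.trim().split("\\s+");
            String op = parts[0];
            if (op.equals("ldc")) {
                // Push the constant operand onto the stack.
                stack.push(Integer.parseInt(parts[1]));
            } else {
                int right = stack.pop(); // value on top of the stack
                int left = stack.pop();  // value beneath it
                if (op.equals("imul")) {
                    stack.push(left * right);
                } else if (op.equals("iadd")) {
                    stack.push(left + right);
                } else if (op.equals("isub")) {
                    stack.push(left - right);
                } else {
                    throw new IllegalArgumentException("Unsupported instruction: " + op);
                }
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // 7 + 8 * 9 - 24
        int result = run("ldc 7", "ldc 8", "ldc 9", "imul", "iadd", "ldc 24", "isub");
        System.out.println(result); // prints 55
    }
}
```

Operand order matters for isub: it subtracts the top value from the one beneath it, so pushing 24 first and subtracting at the end would compute 24 - 79 = -55 rather than 55.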

Tree Grammars - Abstract Syntax Trees in ANTLR

This tutorial shows how to create an abstract syntax tree (AST) from a combined grammar in ANTLR. Let's start with a simple grammar in ANTLR.

grammar someGrammar;

options {
  language = Java;
  output = AST;
}


program
    :
        variableDeclaration+
    ;

variableDeclaration
    :
        type ID ';'
    ;

type
    :
        'int' | 'char'
    ;

ID
    :
        ('a'..'z' | 'A'..'Z' | '_') ('0'..'9' | 'a'..'z' | 'A'..'Z' | '_')*
    ;

WS : ('\t' | '\r' | '\n' | ' ')+ { $channel = HIDDEN; } ;


The first step in creating an AST is to make sure that output = AST; is defined in the options section of the grammar. However, this alone will not create a useful AST. On its own, it creates a parse tree, which is essentially a general form of an AST. A parse tree consists of everything in the source code, including noise that doesn't provide any semantic meaning to the language, such as the semicolons that terminate statements. Moreover, the parse tree will be a flat tree (i.e. just a flat list of tree nodes). What we'd like, however, is a tree with depth and structure.

 

To remove the noise that's inherent in a parse tree, give it depth and structure, and generate an AST that includes only the information we need, the easiest method is to add rewrite rules to our grammar. These rewrite rules tell ANTLR how we want our AST to be structured. Here's how we would add a rewrite rule to the variableDeclaration definition to tell ANTLR to restructure variableDeclaration into a more useful AST node. Note the two changes that I made to the grammar: I added a tokens section, and I added a rewrite rule to variableDeclaration.

grammar someGrammar;

options {
  language = Java;
  output = AST;
}


tokens {
  VAR;
}

program
    :
        variableDeclaration+
    ;

variableDeclaration
    :
        type ID ';'
        ->   ^(VAR type ID)
    ;

type
    :
        'int' | 'char'
    ;

ID
    :
        ('a'..'z' | 'A'..'Z' | '_') ('0'..'9' | 'a'..'z' | 'A'..'Z' | '_')*
    ;

WS : ('\t' | '\r' | '\n' | ' ')+ { $channel = HIDDEN; } ;



The tokens section defines an imaginary token/node called VAR that we later use in the rewrite rule. This was created because we want to know that it's a variable declaration; however, there is no VAR keyword in the grammar. The rewrite rule is defined as ^(VAR type ID). Note that the terminating semicolon isn't part of this rule. That's because we don't want the semicolon in our AST. The semicolon is important to the syntax of our language to terminate statements, but once we've parsed the input source code, we don't need the semicolon, because it doesn't provide any semantic meaning.
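To picture the shape that ^(VAR type ID) describes, here is a small Java sketch that builds that tree by hand for the input "int x;" and prints it in a parenthesized form. AstNode is a hypothetical stand-in class written for this post; it does not use ANTLR's runtime classes, and the variable name x is just an example input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tree node; ANTLR's own runtime is not used here.
final class AstNode {
    final String text;
    final List<AstNode> children = new ArrayList<AstNode>();

    AstNode(String text, AstNode... kids) {
        this.text = text;
        for (AstNode kid : kids) {
            children.add(kid);
        }
    }

    // Leaves print as their text; inner nodes print as "(root child1 child2 ...)".
    String toStringTree() {
        if (children.isEmpty()) {
            return text;
        }
        StringBuilder sb = new StringBuilder("(").append(text);
        for (AstNode child : children) {
            sb.append(' ').append(child.toStringTree());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // VAR is the imaginary root; the type and the identifier are its children.
        // The semicolon from "int x;" has no node at all.
        AstNode decl = new AstNode("VAR", new AstNode("int"), new AstNode("x"));
        System.out.println(decl.toStringTree()); // prints (VAR int x)
    }
}
```

The imaginary VAR node is what gives the tree its depth: without it, the type and the identifier would just sit side by side in a flat list.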

In summary, an AST is a tree with depth and structure that eliminates anything from the input source code that doesn't provide semantic meaning. In addition, it can (as in this example) add imaginary tokens to provide the structure needed to support the semantic meaning of the source code. In a later tutorial, I'll demonstrate how to write a tree walker to process the resulting AST.