Tuesday, June 04, 2013

Deploy Application Binaries (*.war) to OpenShift

Red Hat OpenShift is a PaaS that provides cloud hosting for your applications.

I'd like to share a practice that I use to deploy my Java application to OpenShift.

I only have experience with the Tomcat 7 (JBoss EWS 2.0) cartridge and non-scalable applications, so I will talk about them. However, the same approach may apply to other environments.

I use GitHub to store my application codebase, and I also use Gradle as a build tool.

If you use Maven for your builds and all your dependencies are in public Maven repositories, or in repositories that are accessible from OpenShift, then this blog post is likely not for you.

As of today OpenShift does not support Gradle as a build tool, and some of my dependencies live in private/local repositories that are not reachable from OpenShift. This is why I build my application locally and only deploy binaries to OpenShift.

When you create an OpenShift application, you get a Git repository that you may use to deploy your code. You can also use this repository as your primary source storage (or synchronize it with your GitHub repo), but I don't do this.

This Git repo has a specific directory structure, and OpenShift auto-deployment relies on this structure. This is one of the reasons I don't use this Git repo as my primary code base -- I use multiple deployment targets for my project and OpenShift is only one of them.

The directory structure contains a /webapps folder where you can put your *.war file, and OpenShift will deploy it when you git push.

If you do this, however, you will soon find that your Git repository eats all your server-side disk quota (which is only 1GB on the free plan). This is because the remote Git repository holds all revisions of your binaries. My *.war file is about 50MB -- this is typical for most small-to-medium Java applications. So after about 20 deployments (20 x 50MB = 1GB) you will be out of free space.

Usually you don't need all these revisions of your binaries, so to fix this situation you should first delete your remote Git history and then adopt some other practice for deployments.

Here is how I do this.

Delete old revisions of your binaries from your remote OpenShift Git repo

  1. First do a git clone or a git pull to fetch the latest version of your remote repo. Let's call the folder you've cloned to OLD_REPO. You will need it to restore your Git hooks, which are in the .openshift subfolder, and possibly some other configs, but not your binaries (see step 8 below).
  2. SSH connect to your OpenShift instance.
  3. cd ~/git/.git/objects
  4. rm -rf *
  5. cd ..
  6. rm refs/heads/master
  7. Do a fresh git clone from the remote OpenShift Git repo. Git will tell you that you've cloned an empty repository -- this is correct, your remote repository is now clean. Let's call your new clone folder NEW_REPO.
  8. Copy the contents of OLD_REPO to NEW_REPO. Copy everything except the .git folder, because NEW_REPO already contains its own .git folder.
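  9. Delete NEW_REPO/webapps/*.war -- these are your previous binaries. The copied files are not tracked in the fresh clone yet, so a plain rm is enough (git rm only works on files that are already in the index):
    rm webapps/*.war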
At this stage you have an empty remote Git repository and a local clone holding the latest revision of everything that was in the remote before you cleaned it, except your binaries.

How to deploy new binaries

To deploy new binaries you have to copy them to OpenShift manually. I do this with the scp command.

I created a shell script, upload-war.sh, with the following content:

scp "$PROJECT_X_WORKSPACE_DIR"/project-x/project-x-web/build/libs/*.war "$PROJECT_X_OPENSHIFT_ADDRESS":~/app-root/data/webapps/ROOT.war

As you can see, I use environment variables to tell the script where my local binary is located and where to place it on the remote OpenShift instance. PROJECT_X_OPENSHIFT_ADDRESS is the username@address you use when you connect to OpenShift over SSH.

My project has only one target *.war artifact, and I copy it to the remote data folder under the name ROOT.war. In OpenShift the data folder is the place where you store your custom files.

After copying the file I have to tell OpenShift to deploy it.
To do this I modify the build action hook, which is located at NEW_REPO/.openshift/action_hooks/build, to make it look like this:

#!/bin/bash
# This is a simple build script and will be executed on your CI system if
# available.  Otherwise it will execute while your application is stopped
# before the deploy step.  This script gets executed directly, so it
# could be python, php, ruby, etc.

# Copy the uploaded binaries from the data folder into the deployment folder.
cp "$OPENSHIFT_DATA_DIR"/webapps/* "$OPENSHIFT_REPO_DIR"/webapps/

Here $OPENSHIFT_DATA_DIR and $OPENSHIFT_REPO_DIR are OpenShift built-in environment variables.

Add this script to version control, commit, and push to the OpenShift remote.

When you push, this hook copies the binary you uploaded earlier into the webapps folder, and OpenShift deploys it. So the next time you release a new version, just run upload-war.sh and push some dummy commit to the OpenShift remote, and that's it.

Sunday, April 07, 2013

Render Tapestry5 Block to a string from code

Recently I integrated the Select2 component into my Tapestry5 application.

Select2 can load data from the server using AJAX as the user scrolls through the drop-down.

The AJAX response can be in any format, provided that the developer implements a JavaScript results() function that parses the response into the format expected by Select2.

Select2 provides two more callback functions that developers usually implement to post-process the returned data:
  • id() function that retrieves the id from the choice object;
  • formatResult() function that builds the HTML markup used to display the choice object in the drop-down.
I used Tynamo's tapestry-resteasy to build a REST service that returns data to Select2, and my REST service returned everything needed to implement the id() and formatResult() functions.

I could just implement formatResult() to build the HTML markup for Select2, but I already had a similar Tapestry5 component that builds the same markup, and I wanted to reuse that code.

To do that I built a Tapestry5 service that does just this: EventResponseRenderer (see code below).

To use it:
  1. Declare the block you want to render, as you usually do. In my example I created a separate page, internal/CompanyBlocks.tml, which contains my addressBlock;
  2. Declare an event handler method that handles the "Render" event and returns the block you want to render (note that you may also use Tapestry5's AjaxResponseRenderer.addRender());
    1. In my example this is:
      • public Block onRenderFromCompanyAddress(Company company)
    2. Note that you must specify the id of a Tapestry5 component in the event handler method name. This is a limitation of my EventResponseRenderer implementation and is only required to conform to the Tapestry5 API. It can be the id of any component on the page;
    3. You will probably want to declare some parameters to initialize the page properties required for rendering the block. You don't have to implement type coercers for them, because you will pass the values for these parameters from code;
  3. @Inject an instance of EventResponseRenderer and call its render() method, passing an instance of RenderEvent to it. The RenderEvent constructor accepts pageName (the page where the event handler method is declared), componentId (the id used in the event handler method name, see the previous step), and eventArgs (the list of objects that will be passed as parameters to the event handler method). See the method CompanyResourceImpl.createMatch();
  4. That's it.

Usage Example



EventResponseRenderer Implementation


Sunday, March 11, 2012

Serving Tapestry5 Assets As Static Resources

In Tapestry5 you use assets to reference *.js, *.css, or image files from your templates/code. The reference may look like:

    <link rel="stylesheet" type="text/css" href="${context:/css/all.css}" />

During the render phase Tapestry5 converts the ${context:/css/all.css} part to an asset URL, which may look like the following (see Asset URLs section here):

    <link rel="stylesheet" type="text/css" href="/assets/stage-20120310/ctx/css/all.css" />

Here "stage-20120310" is an application version string, which Tapestry5 adds to asset URLs to manage asset versioning. When running in production, Tapestry5 adds a far-future expires header to the asset, which encourages the client browser to cache it.

When you change one of your assets you have to change the application version number in your AppModule.java, so that Tapestry5 generates new asset URLs and the browser fetches the new assets instead of using the ones from its cache.

One disadvantage of this approach is that the client browser has to download all the assets again, not just the one that changed.

For the majority of assets the asset URL is generated by Tapestry5. The exceptions are assets that are referenced from *.css files by a relative URL, like this (file all.css):

a.external {
    background: transparent url(../images/external.png) no-repeat scroll right center;
    display: inline-block;
    margin-left: 2px;
    height: 11px;
    width: 11px;
    zoom: 1;
}

In this case the browser forms the URL itself, relative to "/assets/stage-20120310/ctx/css/all.css", and the resulting URL is "/assets/stage-20120310/ctx/images/external.png".
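
The resolution the browser performs here can be reproduced with the standard java.net.URI class. This is only an illustrative sketch; the class name RelativeAssetUrl is made up for this example and is not part of Tapestry5.

```java
import java.net.URI;

class RelativeAssetUrl {

    // Resolves a reference found inside a stylesheet against the URL of the
    // stylesheet itself, following the usual relative-reference rules.
    static String resolve(String stylesheetUrl, String relativeRef) {
        return URI.create(stylesheetUrl).resolve(relativeRef).toString();
    }

    public static void main(String[] args) {
        // Uses the stylesheet URL and the url(...) value from the example above.
        System.out.println(resolve(
                "/assets/stage-20120310/ctx/css/all.css",
                "../images/external.png"));
    }
}
```

Because the version string is part of the stylesheet's path, it automatically ends up in the URL of every image the stylesheet references.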

So you have to change the application version in AppModule.java whenever you provide a new version of "external.png".

But for the majority of assets it would be enough to append an MD5/SHA1/... checksum as a GET parameter to the asset URL, so that it looks like:

    <link rel="stylesheet" type="text/css" href="/assets/stage-20120310/ctx/css/all.css?5ef25ac1ec38f119e283f338e6c120a4e53127b1" />

In Tapestry5 you can provide your own implementation of the AssetPathConverter service and append this checksum manually. But this interface only gives you the original asset URL, not the resource itself, so you have nothing to calculate the checksum from.

There are several ways this may be implemented. Ideally, I'd like this to be implemented in Tapestry5 core.
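
As a rough sketch of the checksum part only, the following plain Java computes a SHA-1 digest of an asset's bytes and appends it to the URL as a query string. The class and method names (AssetChecksum, sha1Hex, withChecksum) are hypothetical, and the code deliberately does not touch Tapestry5's AssetPathConverter interface; how the asset bytes are obtained is left open.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class AssetChecksum {

    // Returns the SHA-1 digest of the given content as lowercase hex.
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Appends the checksum of the asset content to its URL as a GET parameter.
    static String withChecksum(String assetUrl, byte[] content) {
        return assetUrl + "?" + sha1Hex(content);
    }

    public static void main(String[] args) {
        // The CSS text here is placeholder content, not a real asset.
        byte[] css = "a.external { zoom: 1; }".getBytes(StandardCharsets.UTF_8);
        System.out.println(withChecksum("/assets/stage-20120310/ctx/css/all.css", css));
    }
}
```

With this scheme the URL of an asset changes only when its content changes, so the browser keeps using cached copies of everything else.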

There's one thing I don't like about Tapestry5 asset handling, though, even if the above solution gets implemented: assets are not static.

This means every asset URL is handled by Java code, while in most cases asset handling is just streaming existing files from the filesystem to the browser (with optional minification and gzip compression).

Once the asset was handled, Tapestry5 caches the response and uses it in further responses, but still this is all done in Java.

In Ping Service we've implemented "assets precompilation", and placed all the rendered assets as static files in the web app root folder.

This is done using a custom implementation of org.apache.tapestry5.internal.services.ResourceStreamer, which is responsible for streaming every asset to the client. During resource streaming we calculate the asset checksum and store it in a static.properties file, with the asset URL as the key and the checksum as the value:

#Static Assets For Tapestry5 Application
#Sat Mar 10 19:42:38 UTC 2012
/assets/stage-20120310/ctx/css/all.css=5ef25ac1ec38f119e283f338e6c120a4e53127b1
/assets/stage-20120310/ctx/css/analytics.css=ee470432c344820e43995fb4632ab4bee3b92e38
/assets/stage-20120310/tapestry/t5-prototype.js=95e30b840a5654b82e6a0334a14a2766c57c4d99
...

Our implementation of AssetPathConverter uses this property file to modify asset URLs.
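
The lookup itself can be sketched with java.util.Properties. This is an assumption-laden illustration, not our actual AssetPathConverter: the class name StaticAssetUrls and the convert() method are hypothetical, and it assumes the checksum is appended as a query string, as in the URL example earlier in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class StaticAssetUrls {

    private final Properties checksums = new Properties();

    // Loads "asset URL = checksum" pairs from the text of a properties file.
    StaticAssetUrls(String propertiesText) {
        try {
            checksums.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Appends the stored checksum to a known asset URL; unknown URLs are returned as-is.
    String convert(String assetUrl) {
        String checksum = checksums.getProperty(assetUrl);
        return checksum == null ? assetUrl : assetUrl + "?" + checksum;
    }

    public static void main(String[] args) {
        StaticAssetUrls urls = new StaticAssetUrls(
                "/assets/stage-20120310/ctx/css/all.css=5ef25ac1ec38f119e283f338e6c120a4e53127b1\n");
        System.out.println(urls.convert("/assets/stage-20120310/ctx/css/all.css"));
        System.out.println(urls.convert("/assets/stage-20120310/ctx/css/unknown.css"));
    }
}
```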

We run our implementation of ResourceStreamer only in production mode, since Google App Engine doesn't allow writing to the filesystem.

We've also implemented it to work only if a special HTTP header is passed with the request. To pass this header and to trigger every asset we have in our application, we use a Selenium-powered integration test that requests every single page. We run this test before deploying a new version to production.

Now Tapestry5 asset URLs and the URLs of static files are the same in our application, so the Google App Engine runtime won't even pass these requests to Java. It also uses its own facilities to serve static files, e.g. gzip compression.

Saturday, January 14, 2012

Simple Sorting Facade for Java (SSF4J)

In Java, to sort two or more lists together you have to write a custom solution.

Say you have a list of names and a corresponding list of weights. There is no API that allows you to sort names by weights (at least none that I know of). However, this is a very common use case, especially when you are analyzing data in your programs.

To achieve this you will most likely implement one of the sorting algorithms with custom swap logic.

A simple sorting facade is a pattern that already contains implementations of sorting algorithm(s) and only requires developers to specify the source list, its bounds, and the compare and swap logic.
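
To make the idea concrete, here is a minimal sketch of the pattern. It is not SSF4J's actual API: the Sortable interface, the sort() helper, and the sortByWeight() method are hypothetical names for this illustration, and insertion sort is used only because it is short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortingFacadeSketch {

    // The caller supplies only the compare and swap logic, addressed by index.
    interface Sortable {
        int compare(int i, int j);
        void swap(int i, int j);
    }

    // Insertion sort over the index range [from, to); it never touches the data directly.
    static void sort(Sortable s, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            for (int j = i; j > from && s.compare(j - 1, j) > 0; j--) {
                s.swap(j - 1, j);
            }
        }
    }

    // Sorts both lists together, ordered by ascending weight.
    static void sortByWeight(final List<String> names, final List<Double> weights) {
        sort(new Sortable() {
            public int compare(int i, int j) {
                return Double.compare(weights.get(i), weights.get(j));
            }
            public void swap(int i, int j) {
                // Swapping in both lists keeps each name paired with its weight.
                Collections.swap(names, i, j);
                Collections.swap(weights, i, j);
            }
        }, 0, names.size());
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("carol", "alice", "bob"));
        List<Double> weights = new ArrayList<>(Arrays.asList(3.0, 1.0, 2.0));
        sortByWeight(names, weights);
        System.out.println(names + " " + weights);
    }
}
```

The sorting algorithm only sees indices, so the same sort() can reorder any number of parallel lists or arrays as long as swap() keeps them in step.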

You can explore SSF4J on GitHub and contribute your implementations of sorting algorithms.

Here's an example of using SSF4J:


Monday, September 19, 2011

Mac OS X Lion HTTP Sniffer


A simple command-line sniffer:

sudo tcpdump -s 0 -A -i en1 port 80

Use ifconfig to look up the interface name (e.g. en1).