Pretty print, Jquery, Styling via code

Monday, February 3, 2014

Grails many to many associations and fetch with join


Lately I have been working on a custom written web solution using Grails MVC framework, that's stacked on top of Java / Groovy, and is parallel to RoR.  Since I've started working with Grails  and it's object-relational mapper GORM that's stacked on top of Hibernate, as someone that comes with no in-depth Hibernate knowledge I came up with following problem:

I had to create many-to-many mapping between entities A and B, let's called them Developer and Project, and wanted to lazy load developers when needed  but, using sql JOIN statement, rather than reading one-by-one which was database-intensive. Examine following definitions:


class Project {
 String title
 
 static hasMany = [ developers : Developer ]
}

class Developer {

 String title
 
 static hasMany = [projects : Project]
 static belongsTo = Project
}

Obviously, there is many to many relationship defined between Developer and Project domain definition, and by Grails official documentation, many to many associations need to have owning end of definition, Project in this case. Owning end means that all saves, and updates to Project will cascade to developer.

Database schema  is created using following statements (taken directly from mysql log)


create table developer (id bigint not null auto_increment, version bigint not null, title varchar(255) not null, primary key (id)) ENGINE=InnoDB
   
create table project (id bigint not null auto_increment, version bigint not 
null, title varchar(255) not null, primary key (id)) ENGINE=InnoDB
 
create table project_developers (project_id bigint not null, developer_id 
bigint not null, primary key (project_id, developer_id)) ENGINE=InnoDB

alter table project_developers add index FK700662CFD8DF43FA (project_id), add constraint FK700662CFD8DF43FA foreign key (project_id) references project (id)

alter table project_developers add index FK700662CF76D8825A (developer_id), 
add constraint FK700662CF76D8825A foreign key (developer_id) 
references developer (id)

Consider following code and log output when inserting values, if Grails is configured to log sql using dataSource.logSql  setting:


 def project = new Project(title:"Awesome web solution")
  
  project.addToDevelopers(new Developer(title:"John Doe"))
  project.addToDevelopers(new Developer(title:"Mark Smith"))
  project.addToDevelopers(new Developer(title:"Joanne Doe"))
  
  project.save()



Hibernate: insert into project (version, title) values (?, ?)
Hibernate: insert into developer (version, title) values (?, ?)
Hibernate: insert into project_developers (project_id, developer_id) values (?, ?)

It simply inserts values into mapped domain object tables, and mapper table. Since repeated rows are ignored in log, there isn't 3 statements for 2nd and 3rd row - there actually present in real mysql log:



 73 Query insert into project (version, title) 
                                      values (0, 'Awesome web solution')
     73 Query insert into developer (version, title) values                                    (0, 'Joanne Doe')
     73 Query insert into developer (version, title) values                                    (0, 'Mark Smith')
     73 Query insert into developer (version, title) values                                    (0, 'John Doe')
     73 Query insert into project_developers 
                                    (project_id, developer_id) values (1, 1)
     73 Query insert into project_developers 
                                    (project_id, developer_id) values (1, 2)
     73 Query insert into project_developers 
                                    (project_id, developer_id) values (1, 3)

And now, what happens with GORM mapping by default if we try to pull all developers that worked on a project. See piece of code below

EXAMPLE 0
def project = Project.findByTitle("Awesome web solution")
  
  project.developers.each { developer ->
   println "${developer.title} works on project '${project.title}'"
  }
  

- and sql log that follows it


// ...
//pulling out developer with given title before

//pulls out all developer IDs that work on a project
Hibernate: select developers0_.project_id as project1_0_0_, developers0_.developer_id as developer2_0_ 
    from project_developers developers0_ 
    where developers0_.project_id=?

//pulls out developer 1
Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ 
    from developer developer0_ 
    where developer0_.id=?

//pulls out developer 2
Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ 
   from developer developer0_ 
   where developer0_.id=?

//pulls out developer 3
Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ 
   from developer developer0_ 
   where developer0_.id=?

Obviously, there is n+1 query executed for project with n developers, which is database-intensive even for projects with small number of developers, since main thread will be constantly put in I/O wait, waiting for database reply, and database will process larger number of queries, so MySQL server CPU is unnecessary utilized. Simple idea for anyone with some knowledge of SQL would be to perform a JOIN instead of row-by-row querying.

Techniques for loading more than one row at a time in GORM, by Grails documentation are

  1. Setting lazy loading to false (pulling all date at once), since GORM mapping is lazy loading associations by default
  2. Specifying 'join' fetch 

 Trying to set lazy loading to false on Project domain, simple does not work, and gives us same sql log:

EXAMPLE 1
class Project {
 String title
 
 static hasMany = [ developers : Developer ]
 
 static mapping = {
  developers lazy:false
 }
}


        select project0_.id as id0_0_, project0_.version as version0_0_, 
               project0_.title as title0_0_ 
        from project project0_ where project0_.id=1
        select developers0_.project_id as project1_0_0_,
             developers0_.developer_id as developer2_0_ 
        from project_developers developers0_ 
        where developers0_.project_id=1

        select developer0_.id as id2_0_, developer0_.version as version2_0_,
               developer0_.title as title2_0_ 
        from developer developer0_ where developer0_.id=2
 select developer0_.id as id2_0_, developer0_.version as version2_0_,
               developer0_.title as title2_0_ 
        from developer developer0_ where developer0_.id=1

        select developer0_.id as id2_0_, developer0_.version as version2_0_,
               developer0_.title as title2_0_ 
        from developer developer0_ where developer0_.id=3

If setting fetching associations using JOIN statement, only one developer is pulled out, since reading project statement is merged with reading developers statement.

EXAMPLE 2
class Project {
 String title
 
 static hasMany = [ developers : Developer ]
 
 static mapping = {
  developers fetch:'join'
 }
}


sql log (directly from MySQL log file):


               select this_.id as id2_0_, 
                       this_.version as version2_0_, 
                       this_.title as title2_0_,
                       developers2_.project_id as project2_2_2_, 
                      developers2_.developer_id as developer1_2_
               from project this_ 
               left outer join project_developers developers2_ 
                     on this_.id=developers2_.project_id 
               where this_.title='Awesome web solution' limit 1
        select developer0_.id as id0_0_, 
                     developer0_.version as version0_0_,
                     developer0_.title as title0_0_ 
               from developer developer0_ 
               where developer0_.id=1

Obviously, when reading entity by findBy, it's translated to 'limit 1' sql statement, and thus limits reading of project developers to only one. Even if reading statement changed to reading project directly, so there's no merging of two Hibernate statements, this approach won't make things better, since every developer is pulled out of database in it's own query, though it will give correct (read ALL) results

EXAMPLE 3:
 def project = Project.get(1)
  
  project.developers.each { developer ->
   println "${developer.title} works on project '${project.title}'"
  }

MySQL  LOG file:


select developer0_.id as id0_0_,
       developer0_.version as version0_0_, 
       developer0_.title as title0_0_ 
from developer developer0_ 
        where developer0_.id=1
 
select project0_.id as id2_0_, project0_.version as version2_0_, 
       project0_.title as title2_0_,
       developers1_.project_id as project2_2_2_, 
       developers1_.developer_id as developer1_2_ 
from project project0_ 
     left outer join project_developers developers1_
          on project0_.id=developers1_.project_id 
 where project0_.id=1
       
select developer0_.id as id0_0_, developer0_.version as version0_0_,
               developer0_.title as title0_0_ 
from developer developer0_ 
where developer0_.id=2

select developer0_.id as id0_0_, developer0_.version as version0_0_,
               developer0_.title as title0_0_ 
from developer developer0_ 
where developer0_.id=1

select developer0_.id as id0_0_, developer0_.version as version0_0_,
               developer0_.title as title0_0_ 
from developer developer0_ 
where developer0_.id=3


Still, large number of queries, for simple result. Analyzing all of above examples implies that mapper table 'projects_developers' is actually what's treated as 'developers' association in Project domain, in the key to achieving the goal is to set join behavior on  OWNED end of an association, that is on Developer domain class:

EXAMPLE 4:

class Project {
 String title 
 static hasMany = [ developers : Developer ]
}

class Developer {

 String title
 
 static hasMany = [projects : Project]
 static belongsTo = Project
 
 static mapping = {
  projects fetch:'join'
 }
}

And following SQL log

select project0_.id as id0_0_, project0_.version as version0_0_,
       project0_.title as title0_0_ 
from project project0_ 
where project0_.id=1

select developers0_.project_id as project1_0_1_, 
       developers0_.developer_id as developer2_1_, 
       developer1_.id as id2_0_, developer1_.version as version2_0_, 
       developer1_.title as title2_0_ 
from project_developers developers0_ 
     inner join developer developer1_ on developers0_.developer_id=developer1_.id 
where developers0_.project_id=1


As you can see only 2 sql queries are executed - one to pull out project, and second to pull out all of the project's developers, and no limit is implied.

It is lengthy post, but hope that this will save someone's time, as I spent quite some time understanding GORM and it's gotchas. For any furhter tweaking like CASCADE constraints on MySQL, pagination parameters, etc, please refer to official GORM documentation.






Saturday, January 4, 2014

Continuous integration problems : Jenkins, JVM Class loader, permgen error and hotdeploy


As I figured out that my previous blog posts are mostly related to front-end technologies, I decided to share some knowledge related to dev operations - project build and deployment automation. I won't go too much in detail about introduction to continuous integration (CI) systems, and assume that reader of this blog is aware of it's role in software development cycle, and rather focus on one particular problem.

Whatever we are using maven, ant or some other build system to build our Java (or any other JVM compatible language like Groovy, Scala etc.), we'll probably want to deploy it to some application server in the spirit of automation, and offloading programmable tasks from humans to machines. Problem arises when doing hot deployment to application server like Tomcat, where after couple of deployments memory reserved in JVM for class definitions and other non-mutable objects (so called PermGen space) becomes full, and we are getting so called 'Permgen error'.

Turns out that there is simple solution to this problem, while in development environment. For production environments I've strongly recommend doing cold deployment. It's simple as restarting Java servlet container (Tomcat, JBoss, Glassfish.) after doing package deployment. I recommend deploying war/jar file using Jenkins Deploy plugin (alternative to this would be to use script to perform this), while restarting tomcat is simple as executing command (on *NIX environment). If you have installed Tomcat as a service:

service tomcat restart

You can execute script on Jenkins using PostBuild script plugin. Note that it's essential for above approach to work to run Jenkins outside of Servlet container that you're restarting, and that jenkins user has permissions to perform restart operation.

With Servlet container restarted, there is overhead in time delay while waiting for  application server to boot, but JVM is essentially restarted, so PermGen memory does not get filled up, and you don't have to login into your development box every-once-in-a-while and reboot it.

Thursday, October 17, 2013

Read current script tag in Javascript

Today I was asked simple question - How to read current JavaScript tag in JS code itself. Sounds easy, and usefull when you want to provide JS snippet that does some sort of DOM modificaiton, ideal for example ad delivery where ad-script tag will be transformed into DIV or SPAN dom element. However, initially there were couple of ideas:

  • Mark script tag with specific ID. This idea fails at the beginning because, it's hard to distinguish source tag if you have multiple snippets with same src url inserted into page
  • Mark script tag with specific code comment, and match in code markup text against specific regex expression. Still fails if you don't put some extra info in code comment, that will make 2 or more scripts distinct. 
Perfect solution is more than simple: since included JavaScript tags are always executed sequentially, in the time of execution of current script all tags included below given tag are not visible to the DOM model, so we could just say 'give me last loaded script tag'. Take a look at code below

//read all script tags, and get last one
var getCurrentScriptTag = function(){
 var scripts = document.getElementsByTagName('script');
 var lastScript = scripts[scripts.length - 1];
 return lastScript
}

//global counter, if not definet set to 0
window.scriptCounter = (window.scriptCounter ) || 0;
//increment
window.scriptCounter++;

//mark last script tag
getCurrentScriptTag().setAttribute('id','script-' + scriptCounter);

Every time script file is loaded, variable called scriptCounter is incremented, or set to 0  if that's the first time script loads, and variable hasn't been initialized before. Now consider including above script several times on a page:


 
 <script type="text/javascript" src="script1.js"> </script>
 <script type="text/javascript" src="script1.js"> </script>
 <script type="text/javascript" src="script1.js"> </script>
 <script type="text/javascript" src="script1.js"> </script>
 

So, what's the resulting DOM after the page is loaded. Note that every time script is loaded, it's execution isn't attached to any document 'load', or 'ready' event, it's executed right away, so above written getCurrentScriptTag() function works as expected:

I hope this small post helps someone solve real-world problem!