Lately I have been working on a custom written web solution using Grails MVC framework, that's stacked on top of Java / Groovy, and is parallel to RoR. Since I've started working with Grails and it's object-relational mapper GORM that's stacked on top of Hibernate, as someone that comes with no in-depth Hibernate knowledge I came up with following problem:
I had to create many-to-many mapping between entities A and B, let's called them Developer and Project, and wanted to lazy load developers when needed but, using sql JOIN statement, rather than reading one-by-one which was database-intensive. Examine following definitions:
class Project { String title static hasMany = [ developers : Developer ] } class Developer { String title static hasMany = [projects : Project] static belongsTo = Project }
Obviously, there is many to many relationship defined between Developer and Project domain definition, and by Grails official documentation, many to many associations need to have owning end of definition, Project in this case. Owning end means that all saves, and updates to Project will cascade to developer.
Database schema is created using following statements (taken directly from mysql log)
create table developer (id bigint not null auto_increment, version bigint not null, title varchar(255) not null, primary key (id)) ENGINE=InnoDB create table project (id bigint not null auto_increment, version bigint not
null, title varchar(255) not null, primary key (id)) ENGINE=InnoDB create table project_developers (project_id bigint not null, developer_id
bigint not null, primary key (project_id, developer_id)) ENGINE=InnoDB alter table project_developers add index FK700662CFD8DF43FA (project_id), add constraint FK700662CFD8DF43FA foreign key (project_id) references project (id) alter table project_developers add index FK700662CF76D8825A (developer_id),
add constraint FK700662CF76D8825A foreign key (developer_id)
references developer (id)
Consider following code and log output when inserting values, if Grails is configured to log sql using dataSource.logSql setting:
def project = new Project(title:"Awesome web solution") project.addToDevelopers(new Developer(title:"John Doe")) project.addToDevelopers(new Developer(title:"Mark Smith")) project.addToDevelopers(new Developer(title:"Joanne Doe")) project.save()
Hibernate: insert into project (version, title) values (?, ?) Hibernate: insert into developer (version, title) values (?, ?) Hibernate: insert into project_developers (project_id, developer_id) values (?, ?)
It simply inserts values into mapped domain object tables, and mapper table. Since repeated rows are ignored in log, there isn't 3 statements for 2nd and 3rd row - there actually present in real mysql log:
73 Query insert into project (version, title)
values (0, 'Awesome web solution') 73 Query insert into developer (version, title) values (0, 'Joanne Doe') 73 Query insert into developer (version, title) values (0, 'Mark Smith') 73 Query insert into developer (version, title) values (0, 'John Doe') 73 Query insert into project_developers
(project_id, developer_id) values (1, 1) 73 Query insert into project_developers
(project_id, developer_id) values (1, 2) 73 Query insert into project_developers
(project_id, developer_id) values (1, 3)
And now, what happens with GORM mapping by default if we try to pull all developers that worked on a project. See piece of code below
EXAMPLE 0
def project = Project.findByTitle("Awesome web solution") project.developers.each { developer -> println "${developer.title} works on project '${project.title}'" }
- and sql log that follows it
// ...
//pulling out developer with given title before //pulls out all developer IDs that work on a project Hibernate: select developers0_.project_id as project1_0_0_, developers0_.developer_id as developer2_0_ from project_developers developers0_ where developers0_.project_id=? //pulls out developer 1 Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ from developer developer0_ where developer0_.id=? //pulls out developer 2 Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ from developer developer0_ where developer0_.id=? //pulls out developer 3 Hibernate: select developer0_.id as id2_0_, developer0_.version as version2_0_, developer0_.title as title2_0_ from developer developer0_ where developer0_.id=?
Obviously, there is n+1 query executed for project with n developers, which is database-intensive even for projects with small number of developers, since main thread will be constantly put in I/O wait, waiting for database reply, and database will process larger number of queries, so MySQL server CPU is unnecessary utilized. Simple idea for anyone with some knowledge of SQL would be to perform a JOIN instead of row-by-row querying.
Techniques for loading more than one row at a time in GORM, by Grails documentation are
- Setting lazy loading to false (pulling all date at once), since GORM mapping is lazy loading associations by default
- Specifying 'join' fetch
Trying to set lazy loading to false on Project domain, simple does not work, and gives us same sql log:
EXAMPLE 1
class Project { String title static hasMany = [ developers : Developer ] static mapping = { developers lazy:false } }
select project0_.id as id0_0_, project0_.version as version0_0_,
project0_.title as title0_0_
from project project0_ where project0_.id=1
select developers0_.project_id as project1_0_0_,
developers0_.developer_id as developer2_0_
from project_developers developers0_
where developers0_.project_id=1
select developer0_.id as id2_0_, developer0_.version as version2_0_,
developer0_.title as title2_0_
from developer developer0_ where developer0_.id=2
select developer0_.id as id2_0_, developer0_.version as version2_0_,
developer0_.title as title2_0_
from developer developer0_ where developer0_.id=1
select developer0_.id as id2_0_, developer0_.version as version2_0_,developer0_.title as title2_0_
from developer developer0_ where developer0_.id=3
If setting fetching associations using JOIN statement, only one developer is pulled out, since reading project statement is merged with reading developers statement.
EXAMPLE 2
class Project { String title static hasMany = [ developers : Developer ] static mapping = { developers fetch:'join' } }
sql log (directly from MySQL log file):
select this_.id as id2_0_,
this_.version as version2_0_,
this_.title as title2_0_,
developers2_.project_id as project2_2_2_,
developers2_.developer_id as developer1_2_
from project this_
left outer join project_developers developers2_
on this_.id=developers2_.project_id
where this_.title='Awesome web solution' limit 1
select developer0_.id as id0_0_,
developer0_.version as version0_0_,
developer0_.title as title0_0_
from developer developer0_
where developer0_.id=1
Obviously, when reading entity by findBy, it's translated to 'limit 1' sql statement, and thus limits reading of project developers to only one. Even if reading statement changed to reading project directly, so there's no merging of two Hibernate statements, this approach won't make things better, since every developer is pulled out of database in it's own query, though it will give correct (read ALL) results
EXAMPLE 3:
def project = Project.get(1) project.developers.each { developer -> println "${developer.title} works on project '${project.title}'" }
MySQL LOG file:
select developer0_.id as id0_0_, developer0_.version as version0_0_, developer0_.title as title0_0_ from developer developer0_ where developer0_.id=1 select project0_.id as id2_0_, project0_.version as version2_0_, project0_.title as title2_0_, developers1_.project_id as project2_2_2_, developers1_.developer_id as developer1_2_ from project project0_ left outer join project_developers developers1_ on project0_.id=developers1_.project_id where project0_.id=1 select developer0_.id as id0_0_, developer0_.version as version0_0_, developer0_.title as title0_0_ from developer developer0_ where developer0_.id=2 select developer0_.id as id0_0_, developer0_.version as version0_0_, developer0_.title as title0_0_ from developer developer0_ where developer0_.id=1 select developer0_.id as id0_0_, developer0_.version as version0_0_, developer0_.title as title0_0_ from developer developer0_ where developer0_.id=3
Still, large number of queries, for simple result. Analyzing all of above examples implies that mapper table 'projects_developers' is actually what's treated as 'developers' association in Project domain, in the key to achieving the goal is to set join behavior on OWNED end of an association, that is on Developer domain class:
EXAMPLE 4:
class Project { String title static hasMany = [ developers : Developer ] } class Developer { String title static hasMany = [projects : Project] static belongsTo = Project static mapping = { projects fetch:'join' } }
And following SQL log
select project0_.id as id0_0_, project0_.version as version0_0_, project0_.title as title0_0_ from project project0_ where project0_.id=1 select developers0_.project_id as project1_0_1_, developers0_.developer_id as developer2_1_, developer1_.id as id2_0_, developer1_.version as version2_0_, developer1_.title as title2_0_ from project_developers developers0_ inner join developer developer1_ on developers0_.developer_id=developer1_.id where developers0_.project_id=1
As you can see only 2 sql queries are executed - one to pull out project, and second to pull out all of the project's developers, and no limit is implied.
It is lengthy post, but hope that this will save someone's time, as I spent quite some time understanding GORM and it's gotchas. For any furhter tweaking like CASCADE constraints on MySQL, pagination parameters, etc, please refer to official GORM documentation.
No comments:
Post a Comment