Thursday, 27 October 2011

Neo4J - 2nd Look - Setting a Primary Key on Nodes

Primary Key

In my last post I identified the lack of primary-key-like IDs as something I need to solve. My use case is:

The application I will be building out will have, after all is said and done, a really simple web interface with REST-style URLs. So, for example, I will be able to request:
http://myservice.co.uk/superwebapp/mySpecialThing/detailedView/55
The 55 there will result in a query to Neo4J to locate the "MySpecialThing" object with an ID of 55 and display it.

I also considered using a UUID across all objects, which would work but is not really what I was after. I want each class of objects to share its own ID sequence, which is useful in a lot of places. To solve the problem I arrived at the following solution.

Solution Outline

  • All domain objects extend from a supertype, AbstractLongDomain, which has getId()/setId() (Long)
  • An aspect wrapped around getId() looks for a null value; if no ID is found, it creates one
  • In the aspect, creating an ID involves asking a singleton IdManager for a "nextId()"
  • nextId() on the manager looks in its HashMap cache to see if it has an IdObject that knows what the next ID is
  • The IdObject persists itself to the repository after each call (** this could be slow.. see how we go)
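The outline above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the implementation from this post: IdStore and InMemoryIdStore are hypothetical stand-ins for the Spring Data Neo4J repository, and the aspect is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Holds the next ID to hand out for one domain class. */
class IdObject {
    private final String className;
    private long nextId = 1L;

    IdObject(String className) { this.className = className; }

    String getClassName() { return className; }
    long getNextId() { return nextId; }
    void setNextId(long nextId) { this.nextId = nextId; }
}

/** Hypothetical stand-in for the repository that persists IdObjects. */
interface IdStore {
    IdObject load(String className);   // returns null when nothing is stored yet
    void save(IdObject idObject);
}

/** Trivial in-memory store so the sketch runs without Neo4J. */
class InMemoryIdStore implements IdStore {
    private final Map<String, Long> saved = new HashMap<>();

    @Override
    public IdObject load(String className) {
        Long next = saved.get(className);
        if (next == null) {
            return null;
        }
        IdObject idObject = new IdObject(className);
        idObject.setNextId(next);
        return idObject;
    }

    @Override
    public void save(IdObject idObject) {
        saved.put(idObject.getClassName(), idObject.getNextId());
    }
}

/** Hands out per-class IDs, caching one IdObject per class name. */
class IdManager {
    private final Map<String, IdObject> cache = new HashMap<>();
    private final IdStore store;

    IdManager(IdStore store) { this.store = store; }

    synchronized long getNextId(Class<?> klass) {
        String key = klass.getName();
        IdObject idObject = cache.get(key);
        if (idObject == null) {
            // Cache miss: fall back to the store, or start a new sequence.
            idObject = store.load(key);
            if (idObject == null) {
                idObject = new IdObject(key);
            }
            cache.put(key, idObject);
        }
        long id = idObject.getNextId();
        idObject.setNextId(id + 1);
        store.save(idObject);   // write-through after every call (the potentially slow part)
        return id;
    }
}
```

Each class gets its own sequence, and a fresh IdManager pointed at the same store picks up where the previous one stopped.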

To the Code

All my objects extend AbstractLongDomain.

The IdObject holds a per-class Long ID, so each time an ID is needed, one of these objects gives one out.

Next we have the IdManager, which is managed as a Spring singleton bean. Its job is to return an ID based on the class that needs one, via getNextId(Class klass). The idRepository you see there is a simple Spring Data Neo4J repository, which has aspect magic dust sprinkled on it to provide the actual implementation.

Because I am not sure whether Neo4J is the best place to store the IDs (though it is the logical one), I created a simple IdGenerator interface, which is all the aspect calls and talks to. One implementation (the only one so far) is the Neo4JBackedIdGenerator, which uses the IdObject and IdManager above.

So first the interface for the IdGenerator
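A minimal sketch of what such an interface could look like. The method name nextId(Class) is an assumption, and InMemoryIdGenerator is a hypothetical second implementation added only to show that the backing store can be swapped without touching callers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** The only thing a caller needs to know about: "give me the next ID for this class". */
interface IdGenerator {
    Long nextId(Class<?> klass);
}

/** Hypothetical in-memory implementation; IDs are lost when the JVM stops. */
class InMemoryIdGenerator implements IdGenerator {
    private final Map<Class<?>, AtomicLong> counters = new ConcurrentHashMap<>();

    @Override
    public Long nextId(Class<?> klass) {
        // One counter per class, created on first use; the first ID handed out is 1.
        return counters.computeIfAbsent(klass, k -> new AtomicLong()).incrementAndGet();
    }
}
```

Keeping the interface this small means that moving the counters out of Neo4J later only requires another implementation.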

And then the actual Neo4JBackedIdGenerator, which is created and managed as a Spring bean.

I will have to play with the transactional semantics on this one. I recall a horrid situation with a similar design, implemented via stored procedures many moons ago, where the sprocs that generated IDs were wrapped in transactions. They needed to be in their own transaction to ensure that threads creating objects en masse would not get stuck on a lock. It is an area I know can be sticky and bite, so I put the @Transactional in there, commented out, to remind me.

So last, we have the AspectJ aspect that wraps our getId(). All the domain objects extend org.soqqo.luap.model.AbstractLongDomain, which means we get ID creation and generation for free each time getId() is called. (Technically, a catch here is that setId() is not checked when an ID is set manually. It could, I guess, look into the repository to see if the ID is already used.)
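The behaviour the aspect adds can be illustrated without AspectJ by putting the null check directly into getId(). This is a plain-Java stand-in, not the aspect itself: the static counter map replaces the IdGenerator bean, and Invoice and Customer are made-up domain classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

abstract class AbstractLongDomain {
    // Stand-in for the injected IdGenerator: one counter per concrete class.
    private static final Map<Class<?>, AtomicLong> COUNTERS = new ConcurrentHashMap<>();

    private Long id;

    public Long getId() {
        // What the aspect does around getId(): assign an ID only if none is set yet.
        if (id == null) {
            id = COUNTERS.computeIfAbsent(getClass(), k -> new AtomicLong()).incrementAndGet();
        }
        return id;
    }

    public void setId(Long id) {
        // Same catch as in the post: a manually set ID is not checked for collisions.
        this.id = id;
    }
}

class Invoice extends AbstractLongDomain { }

class Customer extends AbstractLongDomain { }
```

Calling getId() twice on the same object returns the same value, while a second Invoice gets the next number and a Customer starts its own sequence.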

And finally the Unit Test code to show that it all works

Note the use of @DirtiesContext. Because the Neo4J test is transactional, the contents of the store are dumped after each test. That leaves the HashMap cache in the idManager stale, because the manager is a singleton whose lifecycle is that of the test class, not just the test method. So the fix is either to manually flush() the cache or to use @DirtiesContext, which tells Spring to rebuild the application context. Both work, but manually flushing my HashMap (replacing it with a new one) is 2 seconds faster (0.037s for the test) than having Spring rebuild the context.
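The manual flush can be as small as swapping in a new map. A minimal sketch, assuming a simplified manager that caches only the last ID per class name; CachingIdManager is a hypothetical name and the persistence step is omitted.

```java
import java.util.HashMap;
import java.util.Map;

class CachingIdManager {
    private Map<String, Long> cache = new HashMap<>();

    synchronized long getNextId(Class<?> klass) {
        // merge() stores 1 on first use and otherwise adds 1 to the last ID handed out.
        return cache.merge(klass.getName(), 1L, Long::sum);
    }

    /** Drops all cached counters, for use after the backing store has been emptied. */
    synchronized void flush() {
        cache = new HashMap<>();
    }
}
```

In a JUnit 4 test class, flush() would be called from an @Before method so that every test starts with a cache that matches the emptied store.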

The 2nd last piece to show is my test context file - model-test-context.xml

The very last piece is what my Maven POM looks like, because a lot of people like to see that. Hopefully these are the relevant bits. I am happy to provide all this as a ZIP or push it up to GitHub if people want to see more of it.

Wednesday, 26 October 2011

Neo4J - My First Look with Spring Data Graph

Spring Data Neo4J


I have been working on some proof-of-concept code and decided on a clean route to using NoSQL. Of course there are many choices, and because the application I am working on is highly connected around relationships (not the boyfriend/girlfriend type), I figured I would look at Neo4j. Given that my favourite library of the year is spring-data, I decided to take a look at the recently released spring-data-neo4j library (formerly called Spring Data Graph).

Spring Data provides some funky interface abstraction over your data store, be that an RDBMS or another type of storage such as one of the NoSQL stores.

Specifics to Neo4j and Spring Data Neo4J

A Quick Overview:

  • Neo4J allows you to store POJOs without the need for a schema.
  • POJOs are tied together using relationships.
  • Neo4J understands a node and a relationship (that is it).
  • Spring Data Graph makes node storage and relationship tying really simple with annotations.
Let's look at that last point in detail. spring-data-neo4j uses a few special annotations, not unlike JPA's annotations.

Declaring a Node

A Node is declared with the following annotation
@NodeEntity
public class MySpecialPojo {

    @Indexed
    Long id;

    @Indexed(indexType=IndexType.FULLTEXT, indexName = "search")
    String textField;
    //...
}
Effectively, these annotations make some magic happen. One of the big pieces of magic is a set of special methods you will find on the objects. If you include the right "stuff" in your Maven POM for spring-data-neo4j, some "DAO/repository"-style methods get woven into your domain objects.
        // For free we get .persist(), which wraps up a call to Neo4j and stores my POJO as a node.
        // Assumes MySpecialPojo declares a (Long, String) constructor; note 1L, because id is a Long.
        MySpecialPojo special = new MySpecialPojo(1L, "Some Data").persist();

        MySpecialPojo foundSpecial = this.pojoRepository.findByPropertyValue("textField", "Some Data");
You will also have .remove() and other fun stuff. The best document I have found (as it is very new, 2.0.0.M1) is the following PDF: Spring Data Neo4J - Good Relationships

Missing Identity or Mimicking a Primary Key

The application I will be building out will have, after all is said and done, a really simple web interface with REST-style URLs. So, for example, I will be able to request:
http://myservice.co.uk/superwebapp/mySpecialThing/detailedView/55
The 55 there will result in a query to Neo4J to locate the "MySpecialThing" object with an ID of 55 and display it. The problem I have is twofold:
  • Neo4J just stores objects and does not have "primary keys" other than a "nodeId".
  • The NodeId is collection wide. So Pojo1 shares the incremental nodeIds with Pojo2.

spring-data-neo4j adds (via an aspect ITD) a getNodeId() method to my POJOs, but I don't want to depend on it for my "primary key" (future-proofing my app in case I move from Neo4J to something else).

So I want a class-wide ID, so that when an object is persisted it has an ID from its own class's sequence.
I may be thinking about this wrong, and should just accept a collection-wide (database/store-wide) ID system.
I am heading down this path:
public class Foo { 
   @NodeId(collection=Foo.class)
   private Long id;
}

// or 
@NodeId(collection=Foo.class)
public class Foo extends AbstractLongIdDomainObject { 
    // .... get/setId() is found on super class.
}

This would then store and manage an ID, like we used to in the RDBMS days before databases had primary key AUTO INCREMENT or IDENTITY features. The way you would implement primary IDs was to have a table that stores the next ID for each collection, plus a lookup stored procedure or code that would get the next ID for you (within a transaction, for example).
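The @NodeId annotation used above is not part of spring-data-neo4j; it would have to be written. A minimal sketch of how such a marker could be declared and read back with reflection, assuming a single collection attribute (NodeIdReader is a hypothetical helper, and nothing here generates IDs yet):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker: "IDs for this class or field come from the given collection's sequence". */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface NodeId {
    Class<?> collection();
}

@NodeId(collection = Foo.class)
class Foo { }

class NodeIdReader {
    /** Returns the collection named on the class, or null if the class is not annotated. */
    static Class<?> collectionFor(Class<?> domainClass) {
        NodeId marker = domainClass.getAnnotation(NodeId.class);
        return marker == null ? null : marker.collection();
    }
}
```

An aspect or ID manager could use the returned class as the key under which the next ID is stored.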


I have not perfected this, but I figured I would show what I was thinking. Plus I will do some more reading (maybe Neo4J has some config to allow per-class-type nodeIds). * I always go the hard way first *



@NodeEntity
public class IdManager { 

    /**
     * This map holds an IdObject per class name, one for each class we want a primary-key ID for.
     */
    @Indexed
    private Map<String, IdObject> idCollection;

}
@NodeEntity
public class IdObject {

    // The next ID to hand out for the class this object tracks.
    private Long nextId;

    public Long getNextId() {
        return nextId;
    }

    public void setNextId(Long nextId) {
        this.nextId = nextId;
    }

}

So with this code in place I would then annotate my classes as above with my custom @NodeId, and aspects would weave in the "next" ID when an object is created and about to be stored through spring-data-neo4j.

Can't Default Values

Probably because of how the aspects interact with the get/set methods for your fields, I found that you cannot default a field's value like you can with JPA.
@NodeEntity
public class Foo { 

   private Long someNumber = 1L;
   // .. getter setters
}

If you create this object and set someNumber to 55, for example, then call fooInstance.persist() and retrieve it from the repository, it will not have the value 55, but the value 1!! Annoying. I think the aspects that populate the fields are going in too early or something. I have a test case that shows this happening, but it was in a complex aspect (because of my primary key playing above), so it may be a special case. I'll see.

Summary

Really nice, and I like the simplicity that Neo4j and spring-data give. Go away, SQL.
