Neo Workshop Lab Instructions

From NeoWiki

Assignment 1: Matrix, or constructing a node space

The purpose of this assignment is to try out the Neo API. You will build a simple node space and perform some traversals. To do this you should first get acquainted with the Neo API in org.neo4j.api.*. For more information, see the javadocs at http://api.neo4j.org/1.0-b6/.

Brief API introduction

In order to successfully complete assignment 1, you only need to make use of a few classes from the Neo core API. Here's a brief intro to some of them:

  • There's EmbeddedNeo, which is the "starting point" to the Neo API. From this class you create and retrieve nodes, amongst other things.
  • The Node interface contains methods to get/set/delete properties on the node and get/create relationships. You also instantiate traversers using the node.traverse() methods.
  • Relationship has the same methods as node for property manipulation and methods for getting the start and end node. It also has a method getType() that returns the relationship type.

This should be enough to get you going in this example. For more details, see the javadocs at http://api.neo4j.org/1.0-b6/.

Tasks

  • Open the org.neo4j.tutor.matrix.Matrix class in the src/java folder and look for "TODO" comments. The setupMatrix() method should create the node space described in the image below:

Image:Matrix-social-network-jw-workshop.png

  • Implement the traversers in the findFriends and findHacker methods. While implementing, you can open a console, go to the project directory and run bin/matrix, or use "Run as->Java application" in Eclipse, to see the changes. When all the "TODO"s have been implemented, the output of the bin/matrix script will be something like:
Thomas Andersson's friends:
At depth 1 => Trinity
At depth 1 => Morpheus
At depth 2 => Chyper
At depth 3 => Agent Smith
All hackers:
At depth 4 => The Architect
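
The depth values in this output come from walking the KNOWS relationships outward from the start node, one level at a time. The following is a minimal plain-Java sketch of that breadth-first idea. It does not use the Neo API, the class name FriendDepthSketch is hypothetical, and the KNOWS edges are an assumption chosen to match the friend lines above (the actual structure is defined by the image in the task).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of a breadth-first walk; this is not the Neo API.
class FriendDepthSketch {
    // Outgoing KNOWS edges by person name. These edges are an assumption.
    static final Map<String, List<String>> KNOWS = new LinkedHashMap<>();
    static {
        KNOWS.put("Thomas Andersson", Arrays.asList("Trinity", "Morpheus"));
        KNOWS.put("Morpheus", Arrays.asList("Trinity", "Chyper"));
        KNOWS.put("Chyper", Arrays.asList("Agent Smith"));
    }

    // Returns every person reachable from start, mapped to the depth at which
    // the person was first reached. The start person itself is excluded.
    static Map<String, Integer> friendsByDepth(String start) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String friend : KNOWS.getOrDefault(current, Collections.emptyList())) {
                if (!depth.containsKey(friend)) { // visit each person only once
                    depth.put(friend, depth.get(current) + 1);
                    queue.add(friend);
                }
            }
        }
        depth.remove(start);
        return depth;
    }

    public static void main(String[] args) {
        friendsByDepth("Thomas Andersson").forEach(
            (name, d) -> System.out.println("At depth " + d + " => " + name));
    }
}
```

In the assignment itself the same walk is expressed with a traverser created from the start node, so the queue and the visited bookkeeping shown here are handled for you.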

You may have to run the default target for build.xml (shift+alt+x q) if Eclipse in its infinite wisdom decided to put the classes somewhere other than build/classes.

Now let's look at the node space using the Neo shell. Neo shell is a simple command-line utility that can be used to inspect and modify the "node space".

  • Look at the main method in the org.neo4j.tutor.matrix.MatrixWithShell class. You will see an additional call being made to the embedded Neo instance to enable remote shell connections:
    neo.enableRemoteShell( null );

More information about the Neo shell can be found at Neo_Shell_Guide.

  • Open a console, go to the project directory and type bin/shell-matrix. You should get a print out to the console looking something like Thomas Anderson is Node[1]. This means that the node representing Thomas Anderson has an id equal to 1.
  • Open a second console and type bin/shell. This will start the Neo shell and the output should look like this:
Welcome to NeoShell
Available commands: cd env exit export gsh ls man mkrel mv pwd rm rmrel set quit
Use man <command> for info about each command.
neo-sh [0] $ 

Now remember the node id for Thomas Anderson. In Neo shell, type:

cd -a <node id>

Replace <node id> with the id that was printed. The -a means "jump directly to this node". Normally a cd to another node can only be done if the current node is connected to that node via a relationship. Now type ls and you will get output that looks something like:

neo-sh [1] $ ls
*name =[Thomas Andersson]
*age  =[29]
(me) --[KNOWS]--> (2)
(me) --[KNOWS]--> (3)

This output means that the current node has two properties (name and age) and two outgoing relationships of type KNOWS, connected to the nodes with ids 2 and 3. You can now cd to any of the connected nodes without the -a option. Play around with the shell and traverse the network. You can type help to get a list of available commands and man <command> for more info about each command. The printout below shows how the shell is used to add a new friend to the "Thomas Anderson" node.

neo-sh [0] $ cd -a 1
neo-sh [1] $ ls
*name =[Thomas Andersson]
*age  =[29]
(me) --[KNOWS]--> (2)
(me) --[KNOWS]--> (3)
neo-sh [1] $ mkrel -c -t KNOWS
neo-sh [1] $ ls
*name =[Thomas Andersson]
*age  =[29]
(me) --[KNOWS]--> (2)
(me) --[KNOWS]--> (3)
(me) --[KNOWS]--> (7)
neo-sh [1] $ cd 7
neo-sh [7] $ set name Tank
neo-sh [7] $ ls
*name =[Tank]
(me) <--[KNOWS]-- (1)
neo-sh [7] $ pwd
Current node is (7)
(0)-->(1)-->(me)

Assignment 2: Cross language search

In this assignment we will implement a traverser that can perform cross language keyword search on files and documents. If a document is indexed with a keyword in one language, a user performing a search for that keyword in another language should still get the document. This can be achieved by adding translations between "keywords".

A file/document will be represented by an empty node. Keywords are also nodes (with the keyword stored as a property). Document nodes can have an outgoing relationship of type HAS_KEYWORD to keyword nodes. From a keyword node, relationships of type TRANSLATION exist to other keyword nodes representing the same keyword in a different language. Here is an example:

Image:Cross-language-search.png

In this image we have 3 documents and 4 keywords (actually 2 keywords in English that both have a Swedish translation). A search for documents with the English keyword "house" or the Swedish keyword "hus" should result in document 2 being returned. A search for the English keyword "dog" or the Swedish keyword "hund" should result in documents 1 and 3 being returned. So in Neo, after we have found the keyword node, all we have to do is traverse both the TRANSLATION and the HAS_KEYWORD relationships. If a HAS_KEYWORD relationship has been traversed, we know we have found a document node that should be returned.
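
The search described above can be sketched in plain Java as a walk that follows TRANSLATION links between keywords and collects every document attached through HAS_KEYWORD. This is only an illustration of the idea, not the Neo API and not the SearchEngine solution. The class name, the "lang:word" keys, and the choice of which keyword node each document hangs off are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java illustration of cross language search; this is not the Neo API.
class CrossLanguageSearchSketch {
    // TRANSLATION links between keywords, stored in both directions.
    static final Map<String, Set<String>> translations = new HashMap<>();
    // HAS_KEYWORD links seen from the keyword side: keyword -> documents.
    static final Map<String, Set<String>> documentsByKeyword = new HashMap<>();

    static void addTranslation(String a, String b) {
        translations.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        translations.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    static void addKeyword(String document, String keyword) {
        documentsByKeyword.computeIfAbsent(keyword, k -> new HashSet<>()).add(document);
    }

    static {
        // Assumed sample data, loosely following the example in the text.
        addTranslation("sv:hund", "en:dog");
        addTranslation("en:house", "sv:hus");
        addKeyword("document 1", "sv:hund");
        addKeyword("document 2", "en:house");
        addKeyword("document 3", "en:dog");
    }

    // Starts at one keyword, follows translations, and gathers all documents.
    static Set<String> search(String keyword) {
        Set<String> visited = new HashSet<>();
        Set<String> documents = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(keyword);
        queue.add(keyword);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            documents.addAll(documentsByKeyword.getOrDefault(current, Collections.emptySet()));
            for (String other : translations.getOrDefault(current, Collections.emptySet())) {
                if (visited.add(other)) { // add() is false if already visited
                    queue.add(other);
                }
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        System.out.println("en:dog -> " + search("en:dog"));
        System.out.println("sv:hus -> " + search("sv:hus"));
    }
}
```

With this sample data, searching for either "en:dog" or "sv:hund" yields documents 1 and 3, and either "en:house" or "sv:hus" yields document 2.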

Take a look in the org.neo4j.tutor.search package. MyRelationshipTypes contains the relationship types, of which TRANSLATION and HAS_KEYWORD are of interest for this assignment. Search and AbstractSearchEngine have some code for setting up test data and creating keywords, but what is of interest is the SearchEngine class (extending AbstractSearchEngine). In there you will find a search method that needs to be implemented. The method takes a keyword (parametrized with language and keyword value), looks up the keyword node representing that keyword, and then expects you to perform a traversal (from the keyword node) and return all documents connected to that keyword.

  • Implement the "TODO" in org.neo4j.tutor.search.SearchEngine to get the unit.search.TestSearch to run.

Once the two unit tests pass, open a console, go to the project home directory and type ./bin/search. This should result in the following output:

Usage: Search setup -- create test data
Usage: Search clean -- clean test data
Usage: Search dump -- dump test data
Usage: Search <lang> <keyword> -- search
Usage: Search shell -- enable shell
  • Run ./bin/search setup to create some documents (they will be randomly connected to existing keywords; you can run this many times to create more and more documents)
  • Run ./bin/search en dog and see if you got any results.
  • Run ./bin/search dump to see what "documents" exist and what keywords are connected to them.
  • Run ./bin/search clean to remove all documents.
  • Run ./bin/search shell to enable remote shell connection. You can then run the same script ./bin/shell as we did in the previous assignment to connect with Neo shell.

Take a look at the file etc/translation.properties. In this file you can add your own keywords and translations.

#define keyword translations here
#language_x,keyword_a=language_y,keyword_b
sv,hund=en,dog
en,house=sv,hus

Just add a line to that file to add more keywords. For example, to add the German word "house" and translate it to the Swedish word "hus", add the following line:

de,house=sv,hus
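
Each line in this file follows the pattern language_x,keyword_a=language_y,keyword_b, with # starting a comment. As a rough sketch of how such a line breaks apart, the plain-Java snippet below splits one line into its four parts. The class and method names are hypothetical, and the tutorial's own loader may read the file differently.

```java
// Plain-Java illustration of the line format; names here are hypothetical.
class TranslationLineSketch {
    // Lines that are blank or start with '#' carry no translation.
    static boolean isCommentOrBlank(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() || trimmed.startsWith("#");
    }

    // Splits "lang,keyword=lang,keyword" into
    // { fromLanguage, fromKeyword, toLanguage, toKeyword }.
    static String[] parse(String line) {
        String[] sides = line.trim().split("=", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("Missing '=' in: " + line);
        }
        String[] left = sides[0].split(",", 2);
        String[] right = sides[1].split(",", 2);
        if (left.length != 2 || right.length != 2) {
            throw new IllegalArgumentException("Missing ',' in: " + line);
        }
        return new String[] {
            left[0].trim(), left[1].trim(), right[0].trim(), right[1].trim()
        };
    }

    public static void main(String[] args) {
        String[] parts = parse("de,house=sv,hus");
        System.out.println(parts[0] + " " + parts[1] + " <-> " + parts[2] + " " + parts[3]);
    }
}
```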

Assignment 3: A Tree, building an app using Neo

The purpose of this assignment is to get a feeling for how we can build applications using Neo. In the first two assignments we looked at and worked with the Neo API. That was very straightforward, but moving from that to building real applications, where we have an object model that somehow should be fitted into the Neo model, can be a bit tricky the first time. Fortunately we have found that object-oriented models fit very well in Neo's network-oriented model.

Introducing the application

To illustrate this we use a simple example application that models a file system.

In a normal file system we have files and directories. A directory may have child directories or files. Both files and directories may have a parent directory. A simple way to model this in Java would be to take the Unix approach (where everything is a file) and have a single class, File.

Time for some, well, UML... ish:

File
 +getFileName() : String
 +getParent() : File
 +getChildren() : Iterable<File>
 +addChild( File child )
 +disconnectFromParent()
 +delete()

And for file creation we can have a factory:

FileFactory
 +createFile( String fileName )

Now, how do we make our file system persistent in Neo? Well what we have is a simple tree structure and a tree structure can be fitted into a network structure. The simplest way to model this in Neo would be to have a single relationship type called FILE. Each node is a File entity, an incoming relationship of type FILE comes from the entity's parent. Outgoing relationships of type FILE point to the file entity's children.

Time to pull out the ASCII art! Here's what it might look like in the node space:

(Node)---FILE--->(Node)----FILE--->(Node)
                    |----FILE----->(Node)

Merging our object model with the Neo model can then result in the following implementation. This might be a lot to comprehend at first glance. Worry not, it will become clear as you progress through the tasks:

  • FileFactory.createFile() should create a Node (representing the file entity). Set the name of the file as a property on that node then create a File object, passing in the created node in the constructor.
  • The File constructor takes a Node and stores that node as "underlying" node.
  • File.getFileName() reads and returns the file name property from the underlying node.
  • File.getParent() checks for an incoming relationship of type FILE. If such a relationship exists, it returns a new File, passing the start node of the relationship to the constructor.
  • File.getChildren() uses the underlying node to create a traverser that traverses outgoing FILE relationships with a depth limit of one and a returnable evaluator that returns all nodes except start node. (An inner class in File can be found that converts from Iterable<Node> to Iterable<File>, use it and return.)
  • File.addChild(File child) creates a relationship of type FILE from this underlying node to the child's underlying node.
  • File.disconnectFromParent() deletes the incoming relationship of type FILE.
  • File.delete() disconnects itself from parent, gets all children and invokes delete() on them. Finally it deletes the underlying node.
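
To make the intended behaviour of these methods concrete, here is a minimal in-memory sketch of the same tree operations in plain Java. It deliberately avoids the Neo API, so it is not a solution to the tasks: a parent field stands in for the incoming FILE relationship, a child list stands in for the outgoing ones, and a deleted flag stands in for deleting the underlying node. The class name FileSketch and the isDeleted() helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for the File entity; this is not the Neo API.
class FileSketch {
    private final String name;
    private FileSketch parent; // stands in for the incoming FILE relationship
    private final List<FileSketch> children = new ArrayList<>(); // outgoing FILE
    private boolean deleted;   // stands in for deleting the underlying node

    FileSketch(String name) {
        this.name = name;
    }

    String getFileName() {
        return name;
    }

    FileSketch getParent() {
        return parent;
    }

    // Returns a copy so callers can iterate while the tree is modified.
    List<FileSketch> getChildren() {
        return new ArrayList<>(children);
    }

    void addChild(FileSketch child) {
        // Choice made in this sketch: a file has at most one parent.
        child.disconnectFromParent();
        child.parent = this;
        children.add(child);
    }

    void disconnectFromParent() {
        if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
    }

    // Disconnects from the parent, deletes all children, then deletes itself.
    void delete() {
        disconnectFromParent();
        for (FileSketch child : getChildren()) {
            child.delete();
        }
        deleted = true;
    }

    boolean isDeleted() {
        return deleted;
    }

    public static void main(String[] args) {
        FileSketch root = new FileSketch("/");
        FileSketch etc = new FileSketch("etc");
        root.addChild(etc);
        etc.addChild(new FileSketch("passwd"));
        System.out.println(etc.getParent().getFileName() + " has child " + etc.getFileName());
        etc.delete();
        System.out.println("children of / after delete: " + root.getChildren().size());
    }
}
```

Note that delete() iterates over a copy of the child list, because each child removes itself from its parent while being deleted.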

Tasks

Under org.neo4j.tutor.tree a partial implementation of this "file system" can be found.

  1. Get the unit tests in unit.tree.TestFile to pass by implementing the "TODO's" in File.java and FileFactory.java.
  2. Use the console to invoke the script bin/list-files (from the project root directory). It will build a simple file structure and then list the result to the console. That is, if your implementation is correct!
$ ./bin/list-files
/bin
/boot
/dev
/etc
/etc/passwd
/etc/group
/home
/home/neo
/home/neo/application.jar
/home/neo/readme.txt
/lib
/usr
/sbin
/sbin/reboot
/sbin/fsck

Assignment 4: Security

The purpose of this assignment is to see what happens when an application built using Neo starts to evolve.

To simulate this we use the simple "file system" created in the previous example and evolve it by adding security. For simplicity we will only add a security check in the File.getChildren() method, and the security check will only be of type "access granted/denied". To check for security access we generate an event in File.getChildren() for each child encountered, asking "do we have access?". The event will be caught by the "security manager" in place, which answers the question by returning true or false.

This assignment is split into 4 sub tasks (4a-4d), each task evolving the security model. For educational purposes there exist 4 different security managers under org.neo4j.tutor.security.SecurityMgr1-SecurityMgr4 (normally changes would go to the same security manager).

4a Security using properties

This implementation of security uses properties stored on the file node. Basically we have a property called File.SECURITY with a Boolean value. If no such property exists on the file, access is granted by default. If the property exists, false means access denied and true means access granted.
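
The rule above has three cases (property missing, false, true). A minimal plain-Java sketch of that decision, using a Map in place of the node's properties, could look as follows. The class name and the "security" key string are assumptions; in the tutorial the key is the File.SECURITY constant, whose value is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the 4a rule; a Map stands in for node properties.
class PropertySecuritySketch {
    // Hypothetical key; the tutorial uses the File.SECURITY constant instead.
    static final String SECURITY = "security";

    static boolean hasAccess(Map<String, Object> fileProperties) {
        Object value = fileProperties.get(SECURITY);
        if (value == null) {
            return true; // no property: access granted by default
        }
        return Boolean.TRUE.equals(value); // true grants, false denies
    }

    public static void main(String[] args) {
        Map<String, Object> passwd = new HashMap<>();
        passwd.put(SECURITY, Boolean.FALSE);
        System.out.println("passwd visible: " + hasAccess(passwd));
        System.out.println("unmarked file visible: " + hasAccess(new HashMap<>()));
    }
}
```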

Implement the "TODO" in org.neo4j.tutor.security.SecurityMgr1 and make the unit tests in unit.security.TestSecurity1 pass.

There is a script in bin directory called security1 that will create a file structure then list it.

  • ./bin/security1 off will list the file structure without registering the security manager
$ ./bin/security1 off
/etc
/etc/passwd
/etc/group
/home
/home/user
/home/user/application.jar
/home/user/readme.txt
/usr
/sbin
/sbin/reboot
/sbin/fsck
  • ./bin/security1 will list the file structure with the security manager in place. Now certain files shouldn't be seen.
$ ./bin/security1 
/etc
/etc/group
/home
/home/user
/home/user/application.jar
/home/user/readme.txt
/usr
/sbin

4b Security connected to a user

The implementation in 4a is of little use in a multi-user environment, since the purpose of security is to grant or deny access depending on the user performing the task. We therefore need to add a user concept to our security model.

We make use of an AuthorizationService that the security manager can ask which user the security check should be performed against. We also need to add the user concept to our application. To keep things simple we will not create a User object or the means of creating/managing users. Instead we pretend that these exist and focus on the representation in Neo.

A file node can have an ACCESS relationship to a user node, indicating that access is granted for that user.

(file node)---ACCESS--->(user node)
     |--------ACCESS--->(some other user node)

Implement the "TODO" in SecurityMgr2 and make the unit tests in TestSecurity2 pass.

The script security2 uses SecurityMgr2 when listing a created file structure.

  • ./bin/security2 root will list files as root (will show all files)
$ ./bin/security2 root
/etc
/etc/passwd
/etc/group
/home
/home/user
/home/user/application.jar
/home/user/readme.txt
/usr
/sbin
/sbin/reboot
/sbin/fsck
  • ./bin/security2 user will list files as a user (some files will be hidden)
$ ./bin/security2 user
/etc
/etc/group
/home
/home/user
/home/user/application.jar
/home/user/readme.txt
/usr
/sbin

4c Adding groups

It would be nice if users could be members of groups. That way we can reduce the number of ACCESS relationships needed in our model when we have more users.

As in 4b, access is granted if an ACCESS relationship exists, but now that relationship can also point to a group node. A group node will have incoming relationships of type MEMBER_OF where the start node is either a user node or another group node.

(User1)---MEMBER_OF--->(Group1)---MEMBER_OF--->(Group2)
(User2)---MEMBER_OF-------------------------------^ ^
                                                    |
(File)---ACCESS-------------------------------------|
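
One way to read this diagram: a user has access to a file if the file's ACCESS relationship points at the user directly or at any group the user reaches by following MEMBER_OF relationships. The plain-Java sketch below illustrates that check with simple maps instead of the Neo API. The class name is hypothetical, and User3 is an extra user added here only to show a denied case.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration of group based access; this is not the Neo API.
class GroupSecuritySketch {
    // MEMBER_OF: user or group -> groups it is a member of.
    private final Map<String, Set<String>> memberOf = new HashMap<>();
    // ACCESS: file -> users or groups that are granted access.
    private final Map<String, Set<String>> access = new HashMap<>();

    void addMemberOf(String member, String group) {
        memberOf.computeIfAbsent(member, k -> new HashSet<>()).add(group);
    }

    void grantAccess(String file, String userOrGroup) {
        access.computeIfAbsent(file, k -> new HashSet<>()).add(userOrGroup);
    }

    // Follows MEMBER_OF from the user until a granted user or group is found.
    boolean hasAccess(String user, String file) {
        Set<String> granted = access.getOrDefault(file, Collections.emptySet());
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(user);
        queue.add(user);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (granted.contains(current)) {
                return true;
            }
            for (String group : memberOf.getOrDefault(current, Collections.emptySet())) {
                if (visited.add(group)) { // guards against membership cycles
                    queue.add(group);
                }
            }
        }
        return false;
    }

    // Builds the setup from the diagram, plus one user without any group.
    static GroupSecuritySketch example() {
        GroupSecuritySketch s = new GroupSecuritySketch();
        s.addMemberOf("User1", "Group1");
        s.addMemberOf("Group1", "Group2");
        s.addMemberOf("User2", "Group2");
        s.grantAccess("File", "Group2");
        return s;
    }

    public static void main(String[] args) {
        GroupSecuritySketch s = example();
        System.out.println("User1: " + s.hasAccess("User1", "File"));
        System.out.println("User3: " + s.hasAccess("User3", "File"));
    }
}
```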

Implement the "TODO" in SecurityMgr3 and make the unit tests in TestSecurity3 pass.

The script security3 uses SecurityMgr3 when listing a created file structure.

  • ./bin/security3 root will list all files, since the root user is a member of all groups
  • ./bin/security3 user will hide some files, since this user is not a member of the root group

4d Making security hierarchical

We're still not happy with our security model. The problem now is that if we need to change security settings somewhere, it will involve a lot of iterating down the tree hierarchy to keep things consistent.

[Note: this issue isn't so obvious when we only have one type of security setting (access granted), but if we had, for example, list, read, write and delete flags with a plus and minus modifier in our security model (as properties on the access relationship), it would be more obvious why hierarchical security is a good thing.]

Example setup:


                                    |---ACCESS--->(User1)
                                    |
(/)---FILE--->(home)---FILE--->(user1 home)---... user1 files
 |              |------FILE--->(user2 home)---... user2 files
 |                                  |
 |                                  |---ACCESS--->(User2)
 |---ACCESS--->(root user)

The idea here is to traverse the FILE relationship backwards checking for ACCESS relationships.

  • If we invoke getChildren() on home folder as User1 we will only get his home directory.
  • If we invoke getChildren() on home folder as User2 we will only get his home directory.
  • If we invoke getChildren() on home folder as root we will see both users' home directories.
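
The backwards walk can be illustrated in plain Java: start at the file, check for an ACCESS entry for the user, and if there is none, move to the parent and repeat until the root is passed. The sketch below uses maps in place of Neo relationships and is not the SecurityMgr4 solution. The class and method names are hypothetical, and file names are assumed to be unique so they can serve as map keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration of hierarchical access; this is not the Neo API.
class HierarchicalSecuritySketch {
    // FILE seen from the child side: child -> parent (insertion ordered).
    private final Map<String, String> parentOf = new LinkedHashMap<>();
    // ACCESS: file -> users that are granted access at that file.
    private final Map<String, Set<String>> access = new HashMap<>();

    void addFile(String parent, String child) {
        parentOf.put(child, parent);
    }

    void grantAccess(String file, String user) {
        access.computeIfAbsent(file, k -> new HashSet<>()).add(user);
    }

    // Walks from the file towards the root, looking for an ACCESS entry.
    boolean hasAccess(String user, String file) {
        for (String current = file; current != null; current = parentOf.get(current)) {
            if (access.getOrDefault(current, Collections.emptySet()).contains(user)) {
                return true;
            }
        }
        return false;
    }

    // Children of a directory that the given user is allowed to see.
    List<String> visibleChildren(String user, String directory) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : parentOf.entrySet()) {
            if (entry.getValue().equals(directory) && hasAccess(user, entry.getKey())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Builds the example setup from the diagram.
    static HierarchicalSecuritySketch example() {
        HierarchicalSecuritySketch s = new HierarchicalSecuritySketch();
        s.addFile("/", "home");
        s.addFile("home", "user1 home");
        s.addFile("home", "user2 home");
        s.grantAccess("/", "root user");
        s.grantAccess("user1 home", "User1");
        s.grantAccess("user2 home", "User2");
        return s;
    }

    public static void main(String[] args) {
        HierarchicalSecuritySketch s = example();
        System.out.println("User1 sees: " + s.visibleChildren("User1", "home"));
        System.out.println("root sees:  " + s.visibleChildren("root user", "home"));
    }
}
```

Because the check climbs towards the root, granting access on a directory covers everything beneath it without touching the files below.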

Implement the "TODO" in SecurityMgr4 and make the unit tests in TestSecurity4 pass.

  • ./bin/security4 root will list the home directory and display two users' home directories and their files
$ ./bin/security4 root
Listing home directory:
/user
/user/application.jar
/user/readme.txt
/otheruser
/otheruser/.bashrc
/otheruser/.bash_profile
  • ./bin/security4 user will list the home directory only displaying the user's home directory
$ ./bin/security4 user
Listing home directory:
/user
/user/application.jar
/user/readme.txt