Google Summer of Code 2012 with Xtext, EMF, and MongoDB

The Eclipse Google Summer of Code for 2012 will start coming to life soon, and I have an idea for a project involving Xtext, EMF, and MongoDB.  Ed Merks and I have created a project called MongoEMF that allows you to persist EMF objects to MongoDB.  MongoEMF can query objects based on attribute and reference values, but the query support is lacking some functionality and robustness.  We think a rewrite of the query engine using Xtext and EMF would make for a great Google Summer of Code project.  This project has a very well-defined scope, is easily unit tested, and involves some pretty cool technology.

The project will consist of two major parts.  The first will be to define a query grammar (possibly re-using an existing grammar) using Xtext.  The result will be an EMF query model that can be serialized to and de-serialized from a string.  The string will be used as the query portion of a URI, and must be human readable.  For example:  mongo://host/db/collection/?(tag == 'java' || tag == 'JSON') && (category == 'Eclipse').  Clients will be able to specify the query either as a string or as an EMF model.  The second part of the project will be to create a processing engine that builds a MongoDB query from the EMF query model.  The result will be a DBObject that can be sent to MongoDB as a query.
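As a rough illustration of the second part, the sketch below walks a tiny hand-written query model and builds the nested structure that a MongoDB query would have.  Everything named here is an assumption made for the example: the classes QuerySketch, Expr, Eq, and Logical are hypothetical, a plain Map stands in for the driver's DBObject, and the real query model would be generated by EMF from the Xtext grammar rather than written by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QuerySketch {

    /** Hypothetical query-model node; stands in for an EMF-generated class. */
    public interface Expr {
        /** Builds a Map shaped like the MongoDB query document for this node. */
        Map<String, Object> toMongo();
    }

    /** Models "field == value". */
    public static final class Eq implements Expr {
        final String field;
        final Object value;

        Eq(String field, Object value) {
            this.field = field;
            this.value = value;
        }

        public Map<String, Object> toMongo() {
            // Equality maps to a simple { field : value } document.
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put(field, value);
            return doc;
        }
    }

    /** Models "a && b" or "a || b" using MongoDB's $and / $or operators. */
    public static final class Logical implements Expr {
        final String operator; // "$and" or "$or"
        final List<Expr> operands;

        Logical(String operator, Expr... operands) {
            this.operator = operator;
            this.operands = Arrays.asList(operands);
        }

        public Map<String, Object> toMongo() {
            // Translate each operand recursively, then wrap the list
            // as { $and : [ ... ] } or { $or : [ ... ] }.
            List<Object> clauses = new ArrayList<>();
            for (Expr operand : operands) {
                clauses.add(operand.toMongo());
            }
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put(operator, clauses);
            return doc;
        }
    }

    public static Expr eq(String field, Object value) {
        return new Eq(field, value);
    }

    public static Expr and(Expr... operands) {
        return new Logical("$and", operands);
    }

    public static Expr or(Expr... operands) {
        return new Logical("$or", operands);
    }

    public static void main(String[] args) {
        // (tag == 'java' || tag == 'JSON') && (category == 'Eclipse')
        Expr query = and(
                or(eq("tag", "java"), eq("tag", "JSON")),
                eq("category", "Eclipse"));
        System.out.println(query.toMongo());
    }
}
```

In the actual project the Map would be replaced by a DBObject from the MongoDB Java driver, and the translation would more likely be a switch or visitor over the generated EMF model than methods on the model classes themselves.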

An example use-case is as follows:  A client constructs an EMF query model.  The query model is then serialized to a string and included in a URI.  The URI is sent to a server in an HTTP GET call.  The server extracts the query string from the URI and re-builds the EMF query model.  The query model is passed to the query engine and a DBObject is returned containing a MongoDB-specific query.  The DBObject is sent to MongoDB, which returns some number of DBObjects.  The resulting DBObjects are converted into EMF objects and returned to the client.
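The URI leg of that round trip can be sketched with nothing more than java.net.URI.  This is only an illustration under stated assumptions: the class QueryUriSketch and its two methods are hypothetical, the mongo scheme, host, and path are taken from the example URI above, and the query is treated as an opaque string (parsing it back into the EMF query model would be the job of the Xtext-generated parser).

```java
import java.net.URI;
import java.net.URISyntaxException;

public class QueryUriSketch {

    /** Client side: embed the serialized query in the query part of a URI. */
    public static URI toUri(String host, String path, String query) {
        try {
            // The multi-argument constructor percent-encodes characters that
            // are not legal in a URI, such as the spaces and '|' in the query.
            return new URI("mongo", host, path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI", e);
        }
    }

    /** Server side: recover the human-readable query string from the URI. */
    public static String extractQuery(URI uri) {
        // getQuery() returns the decoded form; getRawQuery() the encoded one.
        return uri.getQuery();
    }

    public static void main(String[] args) {
        String query = "(tag == 'java' || tag == 'JSON') && (category == 'Eclipse')";
        URI uri = toUri("host", "/db/collection/", query);
        System.out.println(uri);                // encoded form sent over the wire
        System.out.println(extractQuery(uri));  // original query text
    }
}
```

The encoded form on the wire is less readable than the query the user typed, but the string recovered on the server is identical to the original, which is what the parser needs.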

The student is expected to develop extensive unit tests for his / her code as well as end-user documentation.  Documentation will be contributed to the UserGuide on the MongoEMF wiki.  We also anticipate that the student will become a committer on MongoEMF, and will continue to provide bug fixes and enhancements after the project is complete.

We consider this to be an advanced Eclipse project.  The student should already be comfortable coding in Java and developing OSGi bundles with Eclipse.  A student with working knowledge of EMF will be preferred, but you are not required to already know Xtext or MongoDB.  If you are interested in this project, I would recommend forking the MongoEMF project on GitHub and experimenting with the latest code in the master branch.  Instructions for setting up your environment can be found on the Development wiki page.  If you have any questions, please comment on this post.

This project idea will be included on the Eclipse Google Summer of Code wiki soon.

4 thoughts on “Google Summer of Code 2012 with Xtext, EMF, and MongoDB”

  1. Great idea. Also think about getting the metadata out of every document stored. Like the class name of every object. Maybe by storing the model itself in a metadata collection.

  2. Yes, an excellent idea! Some things to keep in mind as we build upper layer query services.

    1. The model instance may have been normalised over multiple collections – I have done this in MongoEMF with my models – so the query engine will have to be able to create the correct DBObject or DBObjects (multiple sub queries?) to accommodate a normalised model.

    Perhaps we need an additional model for describing model element collection normalisation?

    2. GeoLocation queries

    3. Information on query performance for index tuning
