Description
The student integrates high-performance, parallel data processing routines into the MOST framework. Building on the MOST NoSQL modules (Cassandra, Neo4j), the available data processing routines (e.g. periodic data calculation) were moved to an independent Java module. On top of that module, the student adds Hadoop support to distribute these calculations.
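To illustrate why a periodic calculation lends itself to distribution, the following minimal sketch splits it into a map step (assign each reading to its period) and a reduce step (aggregate per period). It is plain Java without any Hadoop dependency. The class, record, and method names are hypothetical and not part of MOST, and averaging is only an assumed example of a periodic calculation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch, not MOST code: periodic averaging as map + reduce.
public class PeriodicAverageSketch {

    /** One raw reading; the field names are assumptions for illustration. */
    record Measurement(long timestampSeconds, double value) {}

    /** "Map" step: key a reading by the start of the period it falls into. */
    static long periodStart(long timestampSeconds, long periodSeconds) {
        return Math.floorDiv(timestampSeconds, periodSeconds) * periodSeconds;
    }

    /** "Reduce" step: average all readings that share the same period key. */
    static Map<Long, Double> averagePerPeriod(List<Measurement> readings, long periodSeconds) {
        return readings.stream().collect(Collectors.groupingBy(
                m -> periodStart(m.timestampSeconds(), periodSeconds),
                TreeMap::new, // sorted by period start for readable output
                Collectors.averagingDouble(Measurement::value)));
    }

    public static void main(String[] args) {
        List<Measurement> readings = List.of(
                new Measurement(0, 20.0),
                new Measurement(30, 22.0),
                new Measurement(60, 21.0),
                new Measurement(90, 23.0));
        // Two 60-second periods, each averaged independently.
        System.out.println(averagePerPeriod(readings, 60)); // prints {0=21.0, 60=22.0}
    }
}
```

Because each period key is aggregated independently of the others, the per-key work could in principle be spread across Hadoop worker nodes. How the actual MOST routines map onto Hadoop jobs is left open here.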
Benefit for the student
The student works with state-of-the-art database technologies. He/she gains expertise in the development of highly scalable, distributed data processing frameworks.
Benefit for the project
Handling building data on an urban level requires a scalable architecture. This work makes it possible to distribute the various available data processing routines, with the goal of improving performance.
Requirements
Good Java programming skills; interest in NoSQL datastores and data processing algorithms
Mentors
Harald Hofstätter, Stefan Glawischnig, Rainer Bräuer, Robert Zach
Contact
Mentors are regularly available in our GSoC IRC channel #TU-CSE-SoC at irc.freenode.net. You can also reach us via the mailing list, using the prefix [MOST] (a subscription is required).
More information
Yellow elephant in image by Hadoop.