As you may have noticed, this blog has been quite dead for a long time; nothing has been posted since the last Coding Dojo. No special reason, just lots of work and no pressing need to write. It happens, but I love writing, so I've decided to fix it. And what better starting point than a post-mortem of the last Hadoop Get Together in Berlin?
Berlin is a really cool city (especially in summer) with a lot going on in IT: lots of meetups are organized every month, they draw a huge number of attendees, and the presentations are of a really high level. So if you still don't want to be in Berlin, you are probably not into IT, or you hate start-ups.
Yesterday I attended one of those meetups I had always missed, either because I was too late to get a ticket or because I missed the date. The Hadoop Get Together really is the place to be for data geeks, and I really enjoyed being there. There are always high-quality talks given by people working on the cutting edge of these technologies, but the best part is probably how professionally the meetup is organized, with videos, catering, etc.
Sebastian Schelter on Introducing Apache Giraph for Large Scale Graph Processing.
Sebastian showed us the power of Apache Giraph, the open source initiative to implement Google's Pregel paradigm on top of Hadoop. He walked through some toy examples of how, and why, this approach is really the solution when dealing with huge amounts of data shaped like a graph; and what can we say, everything is (or looks like) a network these days.
With its pros and cons, Giraph is probably the way to go when dealing with this kind of data, especially because it gives you a very easy way to reuse all your existing Hadoop infrastructure. If you are interested, I really recommend reading the Pregel paper, and if you are in Berlin after Buzzwords you should attend the Giraph Workshop.
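To give a flavor of the "think like a vertex" idea behind Pregel (and Giraph), here is a tiny, purely illustrative Python sketch, not Giraph's actual Java API: each vertex runs a compute step per superstep, exchanges messages with its neighbors, and votes to halt when it has nothing new to say. The classic toy example is propagating the maximum value through a graph.

```python
def pregel_max(graph, values):
    """Toy Pregel-style max propagation (illustrative, not the Giraph API).

    graph:  dict mapping each vertex to a list of neighbor vertices
    values: dict mapping each vertex to its initial integer value
    Returns the values once every vertex holds the global maximum.
    """
    values = dict(values)
    # Superstep 0: every vertex announces its value to its neighbors.
    inbox = {v: [] for v in graph}
    for v in graph:
        for n in graph[v]:
            inbox[n].append(values[v])
    # Later supersteps: a vertex only acts if it received messages.
    while any(inbox.values()):
        next_inbox = {v: [] for v in graph}
        for v, msgs in inbox.items():
            if msgs and max(msgs) > values[v]:
                values[v] = max(msgs)       # adopt the larger value
                for n in graph[v]:          # stay active: notify neighbors
                    next_inbox[n].append(values[v])
            # otherwise the vertex votes to halt (sends nothing)
        inbox = next_inbox
    return values
```

The point of the model is exactly this shape: no global view, just local compute plus messaging, which is what lets Giraph scale it out over a Hadoop cluster.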
Dr. Falk-Florian Henrich on Applying Compiler Technology to Event Stream Processing.
Dr. Falk-Florian told us how they use LLVM in order to build proper compiler technology that can handle event stream processing. I'm still not sure whether I like what they are doing or not, but I have to say their first benchmarks are really impressive. I'm looking forward to a more detailed overview so I can form a proper opinion. If you like the idea of using proper multi-core computation to solve real-time analytics, feel free to follow their work at Celera One GmbH.
Dr. Mikio Braun on TWIMPACT: On Real-Time Twitter Analysis.
Last, but not least, Dr. Mikio told us how they perform real-time analysis of Twitter. I have to say they followed a really smart, step-by-step approach; I especially loved the tip: know your data, and ask whether you can really scale in real time. I also liked how they ended up discarding some cool databases for this kind of processing, settling instead on custom specialized data structures. I'm looking forward to being able to post their slides here.
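Their actual data structures weren't something I can reproduce here, but to illustrate the kind of specialized structure that beats a general-purpose database for streaming counts, here is a minimal count-min sketch in Python (a classic technique for approximate frequency counting over a stream, e.g. trending hashtags; purely my own example, not TWIMPACT's implementation):

```python
import hashlib

class CountMinSketch:
    """Compact approximate frequency counter for a data stream.

    Uses a fixed-size table of counters; estimates never undercount,
    and the width/depth trade memory for accuracy.
    """

    def __init__(self, width=1000, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hashed column per row; salting with the row index gives
        # the independent hash functions the sketch needs.
        for row in range(self.depth):
            h = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
            yield row, int(h, 16) % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum across
        # rows is the tightest (over-)estimate.
        return min(self.table[row][col] for row, col in self._cells(item))
```

The design point echoes the talk: when you know your data and your queries (here, "roughly how often did X occur?"), a small purpose-built structure in memory can replace a whole database.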
A side note was given by the Data Science Berlin people, showing what they are promoting in town. A group to keep an eye on, especially because they are also organizing a data hackathon; we will see what they come up with.
Not related to data, but at the March edition of this Get Together there was a nice introduction on how to use Kanban to get more frequent and better releases. I can only say I subscribe to 100% of the ideas presented.
See you more often here!