Tug's Blog: 2016

Tuesday, October 11, 2016

Getting started with Apache Flink and Kafka

Introduction

Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming data flow engine with several APIs for building data-stream-oriented applications.

It is very common for Flink applications to use Apache Kafka for data input and output. This article will guide you through the steps to use Apache Flink with Kafka.

Monday, September 26, 2016

Streaming Analytics in a Digitally Industrialized World

Read this article on my new blog

Get an introduction to streaming analytics, which gives you real-time insight from captured events and big data. It has applications across industries, from finance to winemaking, though two primary challenges must be addressed.

Did you know that a plane flying from Texas to London can generate 30 million data points per flight? As Jim Daily of GE Aviation notes, that equals 10 billion data points in one year. And we’re talking about one plane alone. So you can understand why another top GE executive recently told Ericsson Business Review that "Cloud is the future of IT," with a focus on supporting challenging applications in industries such as aviation and energy.

Thursday, September 1, 2016

Setting up Spark Dynamic Allocation on MapR

Read this article on my new blog

Apache Spark can use various cluster managers to run applications (Standalone, YARN, Apache Mesos). When you install Apache Spark on MapR, you can submit applications in Standalone mode or using YARN.

This article focuses on YARN and Dynamic Allocation, a feature that lets Spark add or remove executors dynamically based on the workload. You can find more information about this feature in this presentation from Databricks:

Dynamic Allocation in Spark

Let’s see how to configure Spark and YARN to use dynamic allocation, which is disabled by default.
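As a rough illustration of the Spark side of that configuration, here is a minimal sketch that assembles the standard dynamic allocation properties and turns them into `spark-submit` arguments. The property names are Spark's documented ones; the helper functions and the executor bounds are assumptions made for this example and are not taken from the full article.

```python
# Illustrative sketch only: the helper names and the default executor
# bounds are assumptions for this example, not values from the article.

def dynamic_allocation_conf(min_executors=1, max_executors=10):
    """Return the Spark properties that turn on dynamic allocation."""
    return {
        "spark.dynamicAllocation.enabled": "true",
        # The external shuffle service keeps shuffle files available
        # after the executor that wrote them has been removed.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_submit_args(conf):
    """Flatten a property dict into spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted after `spark-submit --master yarn`.
    print(" ".join(to_submit_args(dynamic_allocation_conf())))
```

The same key/value pairs can instead be placed in `spark-defaults.conf`. On the YARN side, stock Apache Spark also expects the Spark shuffle service to be registered as a NodeManager auxiliary service; the exact steps on MapR may differ, so treat this as a starting point rather than a complete setup.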

Thursday, March 31, 2016

Save MapR Streams messages into MapR DB JSON

Read this article on my new blog

In this article you will learn how to create a MapR Streams Consumer that saves all the messages into a MapR-DB JSON Table.

Thursday, March 10, 2016

Getting Started with MapR Streams

Read this article on my new blog

You can find a new tutorial that explains how to deploy an Apache Kafka application to MapR Streams. The tutorial is available here:

Getting Started with MapR Streams

MapR Streams is a new distributed messaging system for streaming event data at scale, and it’s integrated into the MapR converged platform. MapR Streams uses the Apache Kafka API, so if you’re already familiar with Kafka, you’ll find it particularly easy to get started with MapR Streams.

Wednesday, February 10, 2016

Getting Started With Sample Programs for Apache Kafka 0.9

Read this article on my new blog

Ted Dunning and I have worked on a tutorial that explains how to write your first Kafka application. In this tutorial you will learn how to:

Install and start Kafka
Create and run a producer and a consumer

You can find the tutorial on the MapR blog:

Getting Started with Sample Programs for Apache Kafka 0.9

