Azure Event Hubs, Stream Analytics and Power BI

Content

  1. What is Azure Event Hubs?
  2. Example Use Cases
  3. Key Concepts and Relationships
  4. Demo: Streaming Bitcoin Price Data in Real-Time [Video]
  5. Reference Material

1. What is Azure Event Hubs?

Azure Event Hubs is a highly scalable event ingestion service, capable of processing millions of events per second with low latency and high reliability. Conceptually, Event Hubs can be thought of as a liaison between “event producers” and “event consumers” as depicted in the diagram below.

azure_event_hub_high_level.png

2. Example Use Cases

  • Real-time stock market analysis (e.g. automated high-frequency trading). 
  • Analyse transactions as they occur to detect fraudulent activity.
  • React immediately to machine sensor data when a threshold is exceeded (e.g. temperature, pressure, fault, etc). 

Note: In each scenario, the data would lose a large share of its potential value if not processed quickly.

3. Key Concepts and Relationships

Event Publisher
An entity that sends data to an Event Hub (ingress).

Event Consumer
An entity that reads data from an Event Hub (egress).

Capture
Capture is a feature that allows streaming data to be automatically stored in either Azure Blob Storage or Azure Data Lake.

Partitions
Partitions is a mechanism in which data can be organised, enabling consumers to only read a specific subset (or partition) of a stream. Paritions also enable downstream parallelism in scenarios where there are multiple consuming applications.

Partition Key
Used by publishers to map event data to a specific partition. If no key is specified, a round-robin assignment is used.

Consumer Groups
A consumer group is a view (state, position or offset) of an entire event hub. Consumer groups allow multiple consumers to have distinct views of an event stream (i.e. read at their own pace). Consumer groups can be thought of as a subscription to an event stream, if another application (i.e. consumer) would like to subscribe to the same stream but process the data differently, this would be an example of where an additional consumer group could be beneficial.

Note: Partitions can only be accessed via a consumer group (there is always a default in an event hub).

Throughput Units (TU)
Capacity is defined by selecting a number of Throughput Units (TU) during the initial creation of an Event Hub namespace. The number of throughput units applies to all event hubs within a namespace. 

Each unit is entitled to:

  • Ingress (events sent into Event Hub): Up to 1 MB per second or 1,000 events per second (whichever comes first)
  • Egress (events sent out of Event Hub): Up to 2 MB per second

Shared Access Signatures (SAS)
Event publishers require a SAS token to identify and authenticate themselves to an event hub.

The following diagram illustrates the relationships between the core elements.

azure_event_hub_relationships.png

While it may be easier to conceptualise the high-level flow with single entities (i.e. Producer, Hub, Consumer), in reality, there can be multiple entities sending events to an event hub and by the same token, multiple consumers reading from an event stream.

azure_event_hub_high_level_multi.png

4. Demo: Streaming Bitcoin Price Data in Real-Time

In this demo, we will stream live Bitcoin price data to Power BI. We will use a timer based Azure Function as our "event producer", Stream Analytics as our "event consumer", and an Azure Event Hub as our ingestion service.

Architecture

bitcoin_streaming_architecture.png

Prerequisites

High Level Steps

  1. Create an Event Hub.
  2. Create a timer based Azure Function that consumes the API and outputs to Event Hub on a regular schedule.
  3. Create a Stream Analytics Job that consumes data from the Event Hub and outputs to Power BI.
  4. Visualise the live stream in Power BI.

Video

5. Reference Material

Azure Function Code

using System.Net.Http;

// HTTP Client
private static HttpClient client = new HttpClient();

public static async Task Run(TimerInfo myTimer, IAsyncCollector<string> outputEventHubMessage, TraceWriter log)
{
    try
    {
        // API
        var url = "https://www.bitstamp.net/api/v2/ticker/btcusd";

        // 1. HTTP Request
        var request = await client.GetAsync(url);

        // 2. HTTP Response
        var response = await request.Content.ReadAsStringAsync();

        // 3. Update Event Hub Parameter
        await outputEventHubMessage.AddAsync(response);

        log.Info("Event sent to Azure Event Hub!");
    }
    catch (Exception ex)
    {
        log.Error("An error occured: ", ex);
    }
}

Stream Analytics Job

SELECT
    DATEADD(ss, timestamp, '1970/01/01 GMT') as event_date,
    CAST(last AS float) AS price
FROM
    streaminput