Quick Installation of Single node Datazen Server in Azure Cloud Service & Sample Dashboards


The demo provides step by step guidance on quick setup of datazen server on single node server & connecting with publisher app to build custom visuals.

Pre-requisites for the demo : 

  1. An active Azure subscription.
  2. Windows 10 store(for installation of Datazen publisher)

Detailed steps are depicted as follows:

Steps Screen Shot
1.     Go to https://manage.windowsazure.com/

2.     Login with your Live ID.

3.     Click +New.

4.     Select Compute -> Virtual Machine -> From Gallery.
(Fig. 2)

VM-provision.JPG
5.     Select the Windows Server 2012 R2 Datacenter image.

6.     Click the next arrow at the bottom right.

 

 

7.     Enter the required information for Virtual machine configuration.

o   Virtual machine name

o   Choose Basic or Standard Tier (recommended)

o   Choose A4 as the machine size.  A4 has 8 cores, which is the minimum number of cores for a single machine setup.

o   Enter the machine admin username/password.

o   Click the next arrow.

 

 
8.     Enter the required information for Virtual machine configuration.

o   Select option Create a new cloud service.

Make sure the Cloud Service DNS Name is available.

o   Choose the subscription it should be billed to, the region it should be deployed.

Choose the one closest to your location

o   Leave the Storage Account and Availability Set as is.

o   Add an HTTP endpoint at a minimum.

You may need to scroll to add the endpoint.

o   Click the next arrow.

 
9.     Select Install VM Agent and leave other unchecked.

10.  Click the checkmark to start the deployment process.

You’ll see it start the provisioning process in the list of the virtual machines you are responsible for in Azure.

 

 
11.   Wait for the status to change from Starting to Running in virtual machines.  
12.   Select your VM then click Connect.  
13.   Save the RDP file to your local machine.  
14.   Open the saved Remote Desktop Connection file, then click Connect.  
15.   Connect to the VM via Remote Desktop and enter the admin username/password. VM.png
16.   Click Yes to connect to Server  
17.   Click Configure this local server in the Server Manager dashboard that appears when you login.  
18.  Then change the IE Enhanced Security Configuration to Off for Administrators.

You can always change it back if you really want to when you’re done.

19.  Close the Server Manager.

Server Manager.png

 

 

Section 2: Install the Datazen Server

1.     Navigate to the following link and download the Datazen server software onto the VM. You may need to turn off IE Enhanced Security on the server to do so.

2.     The Datazen server files download as a zipped file. Extract all the files

3.     Open Datazen Enterprise Server.3.0.2562.

4.     Click run to start the install process.

Datazen Server.JPG
5.     In the Datazen Enterprise Server Setup click Next.  Finish-DataStudio.JPG
6.     Click Next in the Setup wizard, accepting the terms in the License Agreement and moving through each screen.

 

 
7.     Click Next on the Features page

 

 
8.     Click Next on the Core Service Credentials page.  
9.     Once you get to the Admin Password page, type a Password for the Datazen admin user.  (Fig. 20)

This doesn’t have to be the same password as you used for the server.

10.  Click Next. 

ControlPanel-email.JPG
11.  On the Authentication page, leave the Authentication Mode as Default.

12.  Click Next.

 
13.  On the Repository Encryption page, select Copy to Clipboard, then paste the key into a Notepad file.

14.  Save the Notepad file to a safe location.

 

15.  Click Next. 

 
16.  On the Instance Id page, select Copy to Clipboard then paste the Id into a Notepad file.

17.  Save the Notepad file to a safe location.

 

18.  Click Next. 

 
19.  On the Data Acquisition Service Credentials page, leave the credentials as is, then click Next.  
20.  On the Web Applications IIS Settings page leave the default settings, then click Next.   
21.  On the Control Panel Email Settings page, leave the default values since this is a test server.

22.  Click Next. 

 
23.  On the Ready to Install page, click Install and wait until the installation is complete.
This might take a few minutes.
 

Section 3: Configure the Datazen Server

1.   Open your browser in your local machine

2.   Navigate to http://mycloudservicename.cloudapp.net/cp.

Make sure you replace the yourcloudservicename with the name of your cloud service.

3.   If you can successfully connect, you should see the Control Panel Log In screen.

4.   Enter the username admin and the password you entered in the Setup wizard, then select  Log In.

Login.JPG
5.   You will need to create a new user to start creating dashboard hubs, since you need each hub to have an owner.  The owner can NOT be the admin user.  Click Create User to create your first user.

 

 

 

UserCreate.JPG
6.   Enter a value in the top three fields (the email address can be fake if you want) and select Create User.  
7.   You will now see a new option to Create BI Hub.  activate-user.JPG
8.   Enter Hub name whatever you’d like, but make sure you enter the username of the user you just created for owner username.

9.   Enter a maximum number of users that can connect to hub.

10.  Click Create. 

BI Hub Created.JPG
11.   Finish the creation of the hub. It will be displayed in the list of available hubs.  
12.  The new hub will also be shown in the navigation menu at the bottom left of the screen.  
13.  Click the Server Users link on the left-hand side of the screen.

 

Server Users.png
14.  Click Create User.   
15.  Fill in the fields under Required Info.

16.  Click Create User.

 
17.  You will see the user and a Set password link option next to the username.

18.  Click on Set password link and then copy the link to your clipboard.
Note: This step is only required as the email notification is not set up.

 
19.  Logout as the admin

20.  Open a new browser window and paste the URL to reset the password into the address bar.

You can now finish setting up that user by entering the password for the account.

 
21.  In the Control Panel Activate User Account screen, then enter the new password, then re-type password

22.  Click Activate My Account. (Fig. 40)

23.  Logout as this user and log back in as the admin before proceeding.

activate-user.JPG

 

Section 4: Apply a Custom Branding

1.  To add the Wide World Importers brand package to the server, save it locally. The package is provided with this demo.

2.  Click on the Branding link on the left-hand side and upload the brand package to the server.

Branding.png
3.  Make sure you choose the Server to upload it to.

You will see the Server icon has the Wide World Importers branding associated.

 

 
4.  To make sure it was applied properly, open a new browser and navigate to the following URL (make sure you replace the mycloudservicename with whatever you named yours)

http://mycloudservicename.cloudapp.net

Your Server Login screen should look as shown on the right, now having the Wide World Importers brand package applied. (Fig. 43)

Server Login

 

Section 5: Connect to the Datazen Server with Publisher

1.  Open the Datazen Publisher app

If this is the first time using the app, you will have the option of connecting the Datazen demo server.  We recommend doing that, so you will have some nice demo dashboards to show immediately.

2.  To add new server Right-click in the dashboard, then click Connected. (Fig. 44)

 
3.  Click Add New Server Connection (Fig. 45) Demo
4.  Provide the following information to connect to a Datazen server. (Fig. 46)

Server Address: mycloudservicename.cloudapp.net
User name:
user name that created
Password:
provide user password

5.  Uncheck Use Secure Connection.

6.  Click Connect. (Fig. 46)

Datazen Server Login.png
7.  When connected, you should be able to publish dashboards to your Datazen server. (Fig. 47)

8.  You will see a nice dashboard with KPIs for Wide World Importers and Fabrikam Insurance. (Fig. 47)

Datazen Screen

 

 

 

 

 

Azure Stream Analytics & Machine Learning Integration With RealTime Twitter Sentiment Analytics Dashboard on PowerBI


Recently, it has been introduced the integration of ASA & AML available as preview update & it’s possible to add AML web service URL & API key as ‘custom function‘ with ASA input. In this demo, realtime tweets are collected based on keywords like ‘#HappyHolidays2016‘, ‘#MerryChristmas‘, ‘#HappyNewYear2016‘ & those are directly stored on a .csv file saved on OneDrive. Here goes the solution architecture diagram of the POC.

SolutionArc

 

 

Now, add the Service Bus event hub endpoint as input to the ASA job, while deploy the ‘Twitter Predictive Sentiment Analytics Model‘  & click on ‘Open in Studio‘ to start deploy the model. Don’t forget to run the solution before deploying.

AML

 

Once the model is deployed, open the ‘Web Service‘ dashboard page to get the model URL & API key, click on default endpoint -> download the excel 2010 or earlier apps. Collect the URL & API key to apply it to ASA function credentials for AML deployment.

DeployedAML

Next, create an ASA job & add the event hub credentials where the real world tweets are getting pushed & click on ‘Functions‘ tab of ASA job to add the AML credentials. Provide model name, URL & API key of the model & Once, it’s added, click on Save.

ASA-Functions

 

Now, add the following ASA SQL to aggregate the realtime tweets sentiment scores coming out from predictive twitter sentiment model.

Query

 

Provide the output as Azure Blob storage, add a container name & serialization type as CSV & start the ASA job. Also, start importing data into PowerBI desktop from the ASA output Azure blob storage account.

Output

 

 

PowerBI desktop contains in-built power Query to start preparing the ASA output data & processing data types. Choose the AML model sentiment score datatype as decimal type & TweetTexts as Text(String) type.

PBI-AML

 

Start building the ‘Twitter Sentiment Analytics‘ dashboard powered by @AzureStreaming & Azure Machine Learning API with realworld tweet streaming, there’re some cool custom visuals are available on PowerBI.  I’ve used some visuals here like ‘wordcloud‘ chart which depicts some of the highly scored positive sentiment contained tweets with most specific keywords like ‘happynewyear2016‘, ‘MerryChristmas‘,’HappyHolidays‘ etc.

PBI-visuals

 

While, in the donut chart, the top 10 tweets with most positive sentiment counts are portrayed with the specific sentiment scores coming from AML predictive model experiment integrated with ASA jobs.

PBI-dashboard

~Wish you HappyHolidays 2016!

A lap around Microsoft Azure IoT Hub with Azure Stream Analytics & IoT Analytics Suite


Last month on #AzureConf 2015, the Azure IoT Suite has been announced to be available for purchase along with the GA release of Azure IoT Hub. The IoT Hub helps to control, monitor & connect thousands of devices to communicate via cloud & talk to each other using suitable protocols. You can connect to your Azure IoT Hub using the IoT Hub SDKs available in different languages like C, C#, Java, Ruby etc. Also, there’re monitoring devices available like device explorer or iothub-explorer. In this demo, Weather Data Analytics is demonstrated using Azure IoT Hub with Stream Analytics powered by Azure IoT Suite & visualized using Azure SQL database with PowerBI.

You can provision your own device into Azure IoT analytics Suite using device explorer or iothub-explorer tool & start bi-directional communication through device-cloud & cloud-device.

First, create your Azure IoT Hub from Azure Preview Portal  by selecting New-> Internet of Things -> Azure IoT Hub. Provide hub name, select pricing & scale tier[F1 – free(1/subscription, connect 10 devices, 3000 messages /day), [S1 – standard (50,000 messages/day) & S2- standard(1.5 M messages/day)] for device to cloud communication. Select IoT Hub units, device to cloud partitions, resource group, subscription & finally location of deployment(currently it’s available only in three locations- ‘East Asia’, ‘East US’, ‘North Europe’.

 

IoThubcreate

 

Once the hub is created, next switch to device explorer to start creating a device, for details about to create a device & register, refer to this Github page. After registering the device, move back to  ‘Data‘ tab of device explorer tool & click on ‘Monitor‘ button to start receive device-cloud events sent to Azure IoT Hub from device.

DeviceExplorer

 

The schema for the weather dataset looks like the following data & fresh data collected from various sensors & feed into Azure IoT Hub which can be viewed using Device Explorer tool.

DataSchema

 

In order to push data from weather data sensor device to Azure IoT hub, the following code snippet needs to be used. The full code-snipped is going to be available on my Github page.

 

using System;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Data;
using Newtonsoft.Json;
using Microsoft.VisualBasic;
using Microsoft.VisualBasic.FileIO;

namespace Microsoft.Azure.Devices.Client.Samples
{
class Program
{
private const string DeviceConnectionString = “Your device connection-string”;
private static int MESSAGE_COUNT = 5;
static string data = string.Empty;

static void Main(string[] args)
{
try
{
DeviceClient deviceClient = DeviceClient.CreateFromConnectionString(DeviceConnectionString);

if (deviceClient == null)
{
Console.WriteLine(“Failed to create DeviceClient!”);
}
else
{
SendEvent(deviceClient).Wait();
ReceiveCommands(deviceClient).Wait();
}

Console.WriteLine(“Exited!\n”);
}
catch (Exception ex)
{
Console.WriteLine(“Error in sample: {0}”, ex.Message);
}
}

static async Task SendEvent(DeviceClient deviceClient)
{
string[] filePath = Directory.GetFiles(@”\Weblog\”,”*.csv”);
string csv_file_path = string.Empty;
int size = filePath.Length;
for(int i=0; i< size; i++)
{
Console.WriteLine(filePath[i]);
csv_file_path = filePath[i];
}

DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
Console.WriteLine(“Rows count:” + csvData.Rows.Count);
DataTable table = csvData;
foreach(DataRow row in table.Rows)
{
foreach(var item in row.ItemArray)
data = item.ToString();
Console.Write(data);

try
{
foreach(DataRow rows in table.Rows)
{
var info = new WeatherData
{
weatherDate = rows.ItemArray[0].ToString(),
weatherTime = rows.ItemArray[1].ToString(),
apperantTemperature = rows.ItemArray[2].ToString(),
cloudCover = rows.ItemArray[3].ToString(),
dewPoint = rows.ItemArray[4].ToString(),
humidity = rows.ItemArray[5].ToString(),
icon = rows.ItemArray[6].ToString(),
pressure = rows.ItemArray[7].ToString(),
temperature = rows.ItemArray[8].ToString(),
timeInterval = rows.ItemArray[9].ToString(),
visibility = rows.ItemArray[10].ToString(),
windBearing = rows.ItemArray[11].ToString(),
windSpeed = rows.ItemArray[12].ToString(),
latitude = rows.ItemArray[13].ToString(),
longitude = rows.ItemArray[14].ToString()
};

var serializedString = JsonConvert.SerializeObject(info);
var message = data;
Console.WriteLine(“{0}> Sending events: {1}”, DateTime.Now.ToString(), serializedString.ToString());
await deviceClient.SendEventAsync(new Message(Encoding.UTF8.GetBytes(serializedString.ToString())));
}
}

catch(Exception ex)
{
Console.ForegroundColor = ConsoleColor.Red;
Console.WriteLine(“{0} > Exception: {1}”, DateTime.Now.ToString(), ex.Message);
Console.ResetColor();
}
// Task.Delay(200);

}

Console.WriteLine(“Press Ctrl-C to stop the sender process”);
Console.WriteLine(“Press Enter to start now”);
Console.ReadLine();

//string dataBuffer;

//Console.WriteLine(“Device sending {0} messages to IoTHub…\n”, MESSAGE_COUNT);

//for (int count = 0; count < MESSAGE_COUNT; count++)
//{
// dataBuffer = Guid.NewGuid().ToString();
// Message eventMessage = new Message(Encoding.UTF8.GetBytes(dataBuffer));
// Console.WriteLine(“\t{0}> Sending message: {1}, Data: [{2}]”, DateTime.Now.ToLocalTime(), count, dataBuffer);

// await deviceClient.SendEventAsync(eventMessage);
//}
}

private static DataTable GetDataTableFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
string data = string.Empty;
try
{
using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { “,” });
csvReader.HasFieldsEnclosedInQuotes = true;

//read column names
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();

for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == “”)
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);

}
}
}
catch (Exception ex)
{
Console.WriteLine(“Exception” + ex.Message);
}
return csvData;
}

static async Task ReceiveCommands(DeviceClient deviceClient)
{
Console.WriteLine(“\nDevice waiting for commands from IoTHub…\n”);
Message receivedMessage;
string messageData;

while (true)
{
receivedMessage = await deviceClient.ReceiveAsync(TimeSpan.FromSeconds(1));

if (receivedMessage != null)
{
messageData = Encoding.ASCII.GetString(receivedMessage.GetBytes());
Console.WriteLine(“\t{0}> Received message: {1}”, DateTime.Now.ToLocalTime(), messageData);

await deviceClient.CompleteAsync(receivedMessage);
}
}
}
}
}

You could check output to events sending from device to cloud on console.

dataconsole

Next, start pushing the device data into Azure IoT Hub & monitor the events receiving process through device explorer. Now, start provisioning an Azure Stream Analytics Job on Azure portal. Provide ‘Azure IoT Hub‘ as an input to the job like as the followings.

SAJob

 

input

IoTHubinput

Now provide Azure Stream Analytics Query to connect incoming unstructured datasets from device to cloud to pass into Azure SQL database. So, first, provision a SQL database on Azure & connect to as output to Stream Analytics job.

create table input(
weatherDate nvarchar(max),
weatherTime datetime,
apperantTemperature nvarchar(max),
cloudCover nvarchar(max),
dewPoint nvarchar(max),
humidity nvarchar(max),
icon nvarchar(max),
pressure nvarchar(max),
temperature nvarchar(max),
timeInterval nvarchar(max),
visibility nvarchar(max),
windBearing nvarchar(max),
windSpeed nvarchar(max),
latitude nvarchar(max),
longitude nvarchar(max)
)
select input.weatherDate, input.weatherTime,input.apperantTemperature,input.cloudCover,
input.dewPoint, input.humidity,input.icon,input.pressure,count(input.temperature) as avgtemperature, input.timeInterval, input.visibility, input.windBearing,
input.windSpeed,input.latitude,input.longitude

into weathersql
from input
group by input.weatherDate, input.weatherTime, input.apperantTemperature,input.cloudCover,
input.dewPoint, input.humidity,input.icon, input.pressure,input.timeInterval,input.visibility, input.windBearing,
input.windSpeed,input.latitude,input.longitude, TumblingWindow(second,2)

ASA-sql

Specify the output of ‘WeatherIoT’ ASA job as ‘Azure SQL Database‘, alternatively, you can select any of the rest of the connectors like ‘Event Hub’, ‘DocumentDB’ etc.

SAOutput

 

Make sure that , to create the necessary database & table first on SQL before adding as output to ASA job. For this demo, I have created the ‘weatheriot‘ table on Azure SQL database. The t-sql query looks like this.

iotsql

 

Next, start the ASA job & receive the final Azure IoT hub(device to cloud) data processed to IoT hub ->ASA -> Azure SQL database pipeline. Once you receive data on your Azure SQL table. Start building the PowerBI ‘Weather IoT Data Analytics’ dashboard for visualization & to leverage the power of Azure IoT momentum.

SQLoutput

Connect to PowerBI connected through same account of Azure subscription where you provisioned the ASA job & start importing data from Azure SQL database. Create stunning reports using funnel, donut, global map charts with live data refresh.

WeatherData

For this demo, I’ve populated charts on average weather temperature, pressure, humidity, dew point forecasting analysis over specific areas based on latitude & longitude values, plotted & pinned into PowerBI ‘Weather Data Azure IoT Analytics’ dashboard.

WeatherData-analysis

 

What’s new in Azure Data Catalog


The Azure Data Catalog (aka previously PowerBI Data Catalog) has released in public preview on last monday(July 13th) @WPC15, which typically reveals a new world of storing & connecting #Data across on-prem & azure SQL database. Lets hop into a quick jumpstart on it.

Connect through Azure Data Catalog through this url  https://www.azuredatacatalog.com/ by making sure you are logging with your official id & a valid Azure subscription. Currently , it’s free for first 50 users & upto 5000 registered data assets & in standard edition, upto 100 users & available upto 1M registered data assets.

Provision

 

Lets start with the signing of the official id into the portal.

Signin

Once it’s provisioned, you will be redirected to this page to launch a windows app of Azure Data Catalog.

AzureDC

 

It would start downloading the app from clickonce deployed server.

ADCapp

 

After it downloaded & would prompt to select server , at this point it has capacity to select data from SQL Server Analysis service, Reporting Service, on-prem/Azure SQL database & Oracle db.

Servers

For this demo, we used on-prem SQL server database to connect to Azure Data Catalog.

Catalog

We selected here ‘AdventureWorksLT’ database & pushed total 8 tables like ‘Customer’, ‘Product’, ‘ProductCategory’, ‘ProductDescription’,’ProductModel’, ‘SalesOrderDetail’ etc. Also, you can tags to identify the datasets on data catalog portal.

metadata-tag

Next, click on ‘REGISTER’ to register the dataset & optionally, you can include a preview of the data definition as well.

Object-registration

 

Once the object registration is done, it would allow to view on portal. Click on ‘View Portal’ to check the data catalogs.

Portal

Once you click , you would be redirected to data catalog homepage where you can search for your data by object metaname.

Search

 

SearchData

in the data catalog object portal, all of the registered metadata & objects would be visible with property tags.

Properties

You can also open the registered object datasets in excel to start importing into PowerBI.

opendata

Click on ‘Excel’ or ‘Excel(Top 1000)’ to start importing the data into Excel. The resultant data definition would in .odc format.

SaveCustomer

 

Once you open it in Excel, it would be prompted to enable custom extension. Click on ‘Enable’.

Security

From Excel, the dataset is imported to latest Microsoft PowerBI Designer Preview app to build up a custom dashboard.

ADC-PowerBI

Login into https://app.powerbi.com & click to ‘File’ to get data from .pbix file.

PowerBI

Import the .pbix file on ‘AdventureWorks’ customer details & product analytics to powerbi reports & built up a dashboard.Uploading

The PowerBI preview portal dashboard has some updates on tile details filter like extension of custom links.

PowerBI-filter

 

The PowerBI app for Android is available now, which is useful for quick glance of real-time analytics dashboards specially connected with Stream analytics & updating  real time.

WP_20150715_14_07_48_Pro

WP_20150715_14_13_33_Pro

AdventureWorks-ADC

 

 

 

What’s new in Azure SDK 2.5 & Visual Studio 2013 Update 4


Recently, after playing enough with Azure Stream Analytics , it’s time to move on with azure .net development & a new version of Azure sdk is published. Let’s have a quick overview on latest azure sdk.

First of all, lets download the sdk from webpi console, as directed ‘Microsoft Azure SDK 2.5 for .NET(VS 2013)

webpi

In this edition, there are few new components added like as:

i) EnvironmentTools.VS.msi

ii) HiveODBC32.msi

iii)HiveODBC64.msi

iv) Microsoft.Azure.HDInsightTools-x64.msi

v) Microsoft.Azure.HDInsightTools-x86.msi

so on…

Components

Now, after installing sdk 2.5 , lets start with Visual Studio 2013.

Vs2013-sdk2.5

Expand on ‘QuickStart’ under ‘Cloud’ & start exploring options to create AppService , Compute & DataService directly from VS 2013 /2012 itself.

sdk2.5

 

The default ‘DataBlobStorage1’ sample would be created in VS to create blob container, create a block blob/page blob, upload a new blob , delete a blob (all basic CRUD operations on blob using REST)

BlobStorage-VS

Next, the major improvements is done on Azure HDinsight shell integration into Visual Studio onto which you can now run your custom Hive table queries on HDFS of HDInsight clusters. Lets create a sample Hive query file on VS 2013.

Lets move into HDInsight tab on left side of VS installed menu & select HDInsight’ & select ‘HiveApplication’ to start with new Hive-ql. For this demo, I am selecting Hive Sample from VS.

HDI

 

On selecting Hive sample, I would be able to open the sample Hive queries on ‘weblogAnalysis.hql‘  & ‘sensordataAnalysis.hql’ from Azure HDinsight cluster.

Here goes a sample weblogAnalysis.hql:

DROP TABLE IF EXISTS weblogs;
— create table weblogs on space-delimited website log data.
— In this sample we will use the default container. You could also use ‘wasb://[container]@[storage account].blob.core.windows.net/Path/To/Data/’ to access the data in other containers.
CREATE EXTERNAL TABLE IF NOT EXISTS weblogs(s_date date, s_time string, s_sitename string, cs_method string, cs_uristem string,
cs_uriquery string, s_port int, cs_username string, c_ip string, cs_useragent string,
cs_cookie string, cs_referer string, cs_host string, sc_status int, sc_substatus int,
sc_win32status int, sc_bytes int, cs_bytes int, s_timetaken int )
ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘ ‘
STORED AS TEXTFILE LOCATION ‘/HdiSamples/WebsiteLogSampleData/SampleLog/’
TBLPROPERTIES (‘skip.header.line.count’=’2’);

 

Before proceeding with the realtime hive queries, we need to make sure that the Azure HDI cluster is already provisioned & it might be either a simple Hadoop HDI cluster, HBase HDI cluster or Storm HDI cluster to build hive tables on top of it.

sensorhql-vs

There’s a new option came out for Azure HDI cluster to add custom powershell scripts while provisioning a HDI cluster using azure portal. Also, new additions of HDI cluster is exploration of R(official cran packages) & Apache Spark on hdinsight hdfs cluster which will be covered with demo next.

A Quick Walk-through on Azure Storage(SQL, NoSQL, NewSQL)


It’s always imaginable that developers are always flexible to proceed with relational databases while migrating an existing on-premise app to Azure platform while leveraging best possible architectural guidelines on migration to cloud. But, still forth in real-time cases , typical scenarios like suboptimal performance, high expenses, or worse case scenario because, NOSQL db can handle some tasks more efficiently than relational databases can. In few enterprise cases, it’s encountered a critical data storage problem, as because NOSQL solution implementation have been better off before deploying its app to production.

Moreover, there’s no single best data management choice for all data storage tasks, different data management solutions are optimized for different tasks. Let’s have a quick walk-through on various data storage option models supported on Microsoft Azure.

AzureDB

 

 

Let’s start first by four types of NOSQL db supported now Azure.

  • Key/value pair databases: store a single serialized object for each key value. They’re good for storing large volumes of data in situations where you want to get one item for a given key value and you don’t have to query based on other properties of the item.
  • Azure Blob Storage : It’s also a key/value based data storage which is same like as file system in functionality where you could search a file based on it’s folder/file name as key not file content as key. Blob offers read-write storage options (aka Block Blob) for storing large media files as well as for standard streaming purpose facilitates the usage of VHDs as Page Blob(aka Azure Drive).
  • Azure Table Storage : A standard key-value pair based NOSQL storage option prevailed from Azure storage inception phase. Each value is called an entity (similar to a row, identified by a partition key and row key) and contains multiple properties (similar to columns, but not all entities in a table have to share the same columns). Querying on columns other than the key is extremely inefficient and should be avoided.
  • Document Databases : Popular key/value databases in which the values are documents. “Document” here isn’t used in the sense of a Word or an Excel document but means a collection of named fields and values, any of which could be a child document. For example, in an order history table, an order document might have order number, order date, and customer fields, and the customer field might have name and address fields. The database encodes field data in a format such as XML, YAML, JSON, or BSON, or it can use plain text. One feature that sets document databases apart from other key/value databases is the capability they provide to query on nonkey fields and define secondary indexes, which makes querying more efficient. This capability makes a document database more suitable for applications that need to retrieve data on the basis of criteria more complex than the value of the document key.

Example : Mongo DB.

  • Column-family databases : key/value pair based data storage enables to structure data based on collections of columns called ‘Column families‘. For example, a population database consists of one group of column called ‘Persons’ (containing firstname, middlename, lastname) , one group for person’s address & another for profile info. The database can then store each column family in a separate partition while keeping all of the data for one person related to the same key. You can then read all profile information without having to read through all of the name and address information as well.

Example : Cassendra , Apache HBase (in preview supported with HDInsight as NOSQL Blob Storage)

  • Graph databases : Stores data in form of objects & relationships.The purpose of a graph database is to enable an application to efficiently perform queries that traverse the network of objects and the relationships between them. For example, the objects might be employees in a human resources database, and you might want to facilitate queries such as “find all engineers who directly or indirectly work for Product Manager.”

Example : Neo4j Graph Database.

Compared with relational databases, the NoSQL options offer far greater scalability and are more cost effective for storage and analysis of unstructured data. The tradeoff is that they don’t provide the rich querying and robust data integrity capabilities of relational databases. NoSQL options would work well for IIS log data, which involves high volume with no need for join queries. NoSQL options would not work so well for banking transactions, which require absolute data integrity and involve many relationships to other account-related data.

  • A brief about NewSQL : Combines the scalability features of NOSQL along with distributed querying & transactional integrity of OldSQL.
  • The first type of NewSQL systems are completely new database platforms. These are designed to operate in a distributed cluster of shared-nothing nodes, in which each node owns a subset of the data. Though many of the new databases have taken different design approaches, there are two primary categories evolving. The first type of system sends the execution of transactions and queries to the nodes that contain the needed data. SQL queries are split into query fragments and sent to the nodes that own the data. These databases are able to scale linearly as additional nodes are added.
  • General-purpose databases
    These maintain the full functionality of traditional databases, handling all types of queries. These databases are often written from scratch with a distributed architecture in mind, and include components such as distributed concurrency control, flow control, and distributed query processing. This includes Google Spanner, Clustrix, FoundationDB, NuoDB,TransLattice, ActorDB,andTrafodion.
    In-memory databases
    The applications targeted by these NewSQL systems are characterized as having a large number of transactions that (1) are short-lived (i.e., no user stalls), (2) touch a small subset of data using index lookups (i.e., no full table scans or large distributed joins), and (3) are repetitive (i.e. executing the same queries with different inputs).
    These NewSQL systems achieve high performance and scalability by eschewing much of the legacy architecture of the original IBM System R design, such as heavyweight recovery or concurrency control algorithms.
    Example systems in this category are:VoltDB, Pivotal‘s SQLFire and GemFire XD, SAP HANA.
    Example : NuoDB is supported in Azure as NewSQL.
    • Key Points to Consider while choosing the Data Storage Options :

Data semantic


What is the core data storage and data access semantic (are you storing relational or unstructured data)?
Unstructured data such as media files fits best in Blob storage; a collection of related data such as products, inventories, suppliers, customer orders, etc., fits best in a relational database.
Query support


How easy is it to query the data?
What types of questions can be efficiently asked?
Key/value data stores are very good at getting a single row when given a key value, but they are not so good for complex queries. For a user-profile data store in which you are always getting the data for one particular user, a key/value data store could work well. For a product catalog from which you want to get different groupings based on various product attributes, a relational database might work better.
NoSQL databases can store large volumes of data efficiently, but you have to structure the database around how the app queries the data, and this makes ad hoc queries harder to do. With a relational database, you can build almost any kind of query.
Functional projection


Can questions, aggregations, and so on be executed on the server?
If you run SELECT COUNT(*) from a table in SQL, the DBMS will very efficiently do all the work on the server and return the number you’re looking for. If you want the same calculation from a NoSQL data store that doesn’t support aggregation, this operation is an inefficient “unbounded query” and will probably time out. Even if the query succeeds, you have to retrieve all the data from the server and bring it to the client and count the rows on the client.
What languages or types of expressions can be used?
With a relational database, you can use SQL. With some NoSQL databases, such as Azure Table storage.

Ease of scalability


How often and how much will the data need to scale?
Does the platform natively implement scale-out?
How easy is it to add or remove capacity (size and throughput)?
Relational databases and tables aren’t automatically partitioned to make them scalable, so they are difficult to scale beyond certain limitations. NoSQL data stores such as Azure Table storage inherently partition everything, and there is almost no limit to adding partitions. You can readily scale Table storage up to 200 terabytes, but the maximum database size for Azure SQL Database is 500 gigabytes. You can scale relational data by partitioning it into multiple databases, but setting up an application to support that model involves a lot of programming work.
Instrumentation and Manageability


How easy is the platform to instrument, monitor, and manage?
You need to remain informed about the health and performance of your data store, so you need to know up front what metrics a platform gives you for free and what you have to develop yourself.
Operations


How easy is the platform to deploy and run on Azure? PaaS? IaaS? Linux?
Azure Table storage and Azure SQL Database are easy to set up on Azure. Platforms that aren’t built-in Azure PaaS solutions require more effort.
API Support


Is an API available that makes it easy to work with the platform?
The Azure Table Service has an SDK with a .NET API that supports the .NET 4.5 asynchronous programming model. If you’re writing a .NET app, the work to write and test the code will be much easier for the Azure Table Service than for a key/value column data store platform that has no API or a less comprehensive one.
Transactional integrity and data consistency


Is it critical that the platform support transactions to guarantee data consistency?
For keeping track of bulk emails sent, performance and low data-storage cost might be more important than automatic support for transactions or referential integrity in the data platform, making the Azure Table Service a good choice. For tracking bank account balances or purchase orders, a relational database platform that provides strong transactional guarantees would be a better choice.
Business continuity


How easy are backup, restore, and disaster recovery?
Sooner or later production data will become corrupted and you’ll need an undo function. Relational databases often have more fine-grained restore capabilities, such as the ability to restore to a point in time. Understanding what restore features are available in each platform you’re considering is an important factor to consider.
Cost


If more than one platform can support your data workload, how do they compare in cost?
For example, if you use ASP.NET Identity, you can store user profile data in Azure Table Service or Azure SQL Database. If you don’t need the rich querying facilities of SQL Database, you might choose Azure Table storage in part because it costs much less for a given amount of storage.

A Lap around the New Microsoft Azure Management Portal & Preview Programs


Well, a few months back during Build 2014, the new Microsoft Azure Management Portal (Preview) announced for public access. There’s a quite lot interesting features are included to it to make it more appealing & enchanting. One of the unique feature is ‘Parts‘ quite similar like metro ‘Tiles’ to make developer’s life more easier conglomerate all services under one single UI pane.

Portal

 

  • Select , Virtual Machine pane while list of all standard VM images along with available data centers are available.

VM

 

  • Similarly , Web , Mobile & Developer Services , you would be able to see the respective services while in Data, Cache & Storage+backup service consists of latest Azure Redis Cache(Preview), HDInsight, MySQL, Mongo Labs, SQL Server 2014 Standard, Oracle DB 12.1.0.1 standard & 11G R2, WebLogic server, Azure Backup & Hyper-V Recovery Manager.

Storage

  • Now, check for App+ Data Services, a lot of new app services including ‘Notification Hub’ , ‘Pusher’ , ‘SendGrid‘ is available in new Azure Portal.

App

 

  • The ‘Parts’ of new Azure Portal is self -customizable , so when you right-click on any of tiles , you would get the ‘Customize‘ option from when you can select the appropriate size of blocks of ‘Service Health’, ‘Gallery’, ‘What’s New’ etc.

Tile

 

  • There is a self -notification-able ‘Notification’ hub is added in new portal which summarizes any of ‘Service Health’ report /errors related to service for last 24 hours.

Notification

  •  While, from ‘Browse’ option, you should be able to see the list of ‘Resource Group Services’, ‘Virtual Machine’, ‘Websites’, ‘Team Server projects.

Browse

  • Click on ‘What’s New’ tile while a brilliant  grid view of latest updates from Azure blog is available.

Azure Blog

 

  • Now, take a focus on latest ‘Preview Features‘ available which are ‘Windows Azure Automation’, ‘New Service Tiers for SQL Database’, ‘Azure RemoteApp’, ‘Visual Studio Online Account Access’, ‘Windows Azure Files’, ‘Billing Alert Service’.

First, lets check out the ‘Azure Automation API‘ .

  • Windows Azure Automation allows you to automate the creation, monitoring, deployment, and maintenance of resources in your Windows Azure environment using a highly-available workflow execution engine. Orchestrate time-consuming, error-prone, and frequently repeated tasks against Windows Azure and third party systems to decrease time to value for your cloud operations.
  • SQL DB Premium Tiers(P1 & P2) are already in preview mode since 2013 , now there are three tiers available in this mode Basic, Standard & Premium since Web & Business edition of SQL Azure db will be deprecated on April 2015 onwards.

Details of each tiers :

  • Basic (Preview): Designed for applications with a light transactional workload. Performance objectives for Basic provide a predictable hourly transaction rate.
  • Standard (Preview): Standard is the go-to option for getting started with cloud-designed business applications. It offers mid-level performance and business continuity features. Performance objectives for Standard deliver predictable per minute transaction rates.
  • Premium (Preview): Designed for mission-critical databases, Premium offers the highest performance levels for SQL Database and access to advanced business continuity features. Performance objectives for Premium deliver predictable per second transaction rates.

 

  • Azure RemoteApp helps employees stay productive anywhere, and on a variety of devices – Windows, Mac OS X, iOS, or Android. Your company’s applications run on Windows Server in the Azure cloud, where they’re easier to scale and update. Employees install Microsoft Remote Desktop clients on their Internet-connected laptop, tablet, or phone—and can then access applications as if they were running locally. You need to sign up here

 

  • You’ll need to have Azure Active Directory (Azure AD) set up in order to work with Visual Studio Online Access. You’ll also need an organizational account, which is an email address associated with either your directory or an Office 365 subscription. If you don’t use Azure AD, go here to find out how to set it up. If you use Azure AD but can’t access your organization’s directory, work with your system administrator to get set up.

 

  • Windows Azure Files allows VMs in a Windows Azure Data Center to mount a shared file system using the SMB protocol. These VMs will then be able to access the file system using standard Windows file APIs (CreateFile, ReadFile, WriteFile, etc). Many VMs (or PaaS roles) can attach to these file systems concurrently, allowing you to share persistent data easily between various roles and instances. In addition to accessing your files through the Windows file APIs, you can access your data using the file REST API, which is similar to the familiar blob interface.

 

  • If you’re the Account Administrator for a Windows Azure subscription, you can set up email alerts when a subscription reaches a spending threshold you choose. Alerts are currently not available for subscriptions associated with a commitment plan using Billing Services.To view and set up alerts, go to the Account Center click Subscriptions, select the subscription you want to work with, and then click Alerts. You can set up a total of five billing alerts per subscription, with a different threshold and up to two email recipients for each alert.Preview

 

 

%d bloggers like this: