A Lap Around Big Data with Microsoft HDInsight


Big Data is commonly summed up by three Vs: Volume, Velocity and Variety. From traditional e-commerce systems to modern social networks, storing and analysing data at scale now depends on this kind of platform. Let's look at a scenario of modern e-commerce analytics after integration with Big Data.

[Figure: Big Data]

[Figure: e-commerce analytics]

  • A Big Data platform typically works by first storing data in clusters and then processing it through MapReduce workflows. The Map phase splits the input into independent chunks, each processed by the appropriate algorithm; the output of the Map phase then moves to the Shuffle/Sort phase, and the output of the Shuffle phase finally becomes the input to the Reduce phase.
  • Let's look at a typical Big Data MapReduce workflow.

[Figure: storing data]

[Figure: processing data]

[Figure: MapReduce workflow]
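
The three phases described above can be illustrated with a small in-memory word count. This is only a sketch of the idea, not how Hadoop or HDInsight implement it: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are made up for illustration, and everything runs in a single process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    for word in chunk.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle/sort: group all values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: collapse the list of values for each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Each chunk is mapped independently; on a real cluster these calls
    # would run in parallel on different nodes.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical input: two chunks of one text file.
    chunks = ["big data on hadoop", "big clusters store big data"]
    print(word_count(chunks))
```

Because the Map step only looks at its own chunk, the chunks can be processed in any order or in parallel; only the Shuffle step needs to see the output of every mapper.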

  • Microsoft's Big Data platform works in exactly the same way. It is a collaborative solution with Hortonworks, named Microsoft HDInsight, which simplifies running complex batch scripts. Let's take a brief look at the HDInsight/Hadoop ecosystem.

[Figure: HDInsight/Hadoop ecosystem]

  • Microsoft's Big Data platform covers everything from storing data in HDFS and query processing in Hive to Business Intelligence analytics with Excel PowerPivot, SSAS and SSRS.

[Figure: Microsoft Big Data platform]

  • Storing data in HDFS: petabytes to zettabytes of data can be stored in HDFS clusters, managed by a Name Node and its Data Nodes. In Azure HDInsight, each Data Node is integrated with a worker role and the compute cluster. Alternatively, you can use Azure Blob Storage, which is built from three layers: the Front End layer (attaches the OAuth/security layer for authentication), the Partition layer (maps requests to Azure queue, table and blob storage) and the Stream layer (3-layer HA for the scaled-out data stream).

[Figure: HDFS]
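
To make the Name Node / Data Node split more concrete, here is a toy model of how a file could be cut into blocks and each block assigned to several Data Nodes. It is a simplified sketch under stated assumptions: it uses the common HDFS defaults of 128 MB blocks and a replication factor of 3, the node names are hypothetical, and the round-robin placement stands in for the rack-aware policy that real HDFS uses.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed: 128 MB, the usual HDFS default block size
REPLICATION = 3                 # assumed: the usual HDFS default replication factor


def place_blocks(file_size, data_nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [data_node, ...]} for a file of file_size bytes.

    Toy round-robin placement for illustration only; real HDFS placement
    is rack-aware and decided by the Name Node.
    """
    if replication > len(data_nodes):
        raise ValueError("need at least as many Data Nodes as replicas")
    block_count = math.ceil(file_size / block_size)
    placement = {}
    for block in range(block_count):
        # Pick `replication` distinct nodes, starting at a rotating offset.
        placement[block] = [
            data_nodes[(block + r) % len(data_nodes)] for r in range(replication)
        ]
    return placement


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical Data Node names
    for block, replicas in place_blocks(300 * 1024 * 1024, nodes).items():
        print(block, replicas)
```

The returned dictionary plays the role of the Name Node's metadata: it records where every block lives, while the Data Nodes hold the actual bytes.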

  • To program against HDInsight, you can use Java, C#, F#, or the .NET, JavaScript and LINQ to Hive APIs, all of which let you code against Hadoop ecosystem components including Pig, Hive, Mahout, Cascading and Pegasus.

[Figure: HDInsight APIs]

[Figure: Microsoft's Hadoop Vision]


About Anindita
Anindita Basak works as a Big Data Cloud Consultant at Microsoft. She has worked at multiple MNCs as a Developer and Senior Developer on Microsoft Azure, Data Platform, IoT, BI, data visualization, data warehousing, ETL and, of course, the Hadoop platform. She has worked both as an FTE and as a v- employee in Microsoft's Azure platform teams. She is passionate about .NET, Java, Python and Data Science. She is also an active Big Data and Cloud trainer and would love to share her experience in the IT training industry. She is an author, forum contributor, blogger and technical reviewer of various books on Big Data Hadoop, HDInsight, IoT and Data Science, SQL Server PDW and Power BI.
