BizTalk Server Tip #17: Use fast disks for your BizTalk Database subsystem

SQL Server is vital for any BizTalk environment. Ensure maximum performance for the most demanding databases with a fast disk subsystem; consider RAID 1+0 and SSDs for a high-performance environment. In the majority of BizTalk environments this is the first physical bottleneck you will face. When doing load testing you will find […]

The post BizTalk Server Tip #17: Use fast disks for your BizTalk Database subsystem appeared first on BizTalk360 Blog.

Blog Post by: Ricardo Torre

European Tour 2014-Speaking Engagements

At the beginning of March I will be touring Europe, speaking at different user groups and at a huge event, the BizTalk Summit 2014. The latter takes place in London on the 3rd and 4th of March. During this event I will be sharing the stage with eleven other Microsoft Integration MVPs and four Microsoft Product Group members. There are still a few tickets left: http://www.biztalk360.com/BizTalk-Summit-2014/

BizTalk Summit 2014, London – Statistics

We are getting close to the BizTalk Summit 2014, London event now, with only a couple of weekends left. I thought of sharing some statistics about the event, its attendees and speakers. Arguably this is the biggest BizTalk/Integration-focused event ever conducted in Europe. We are aiming for 200 attendees, and around 180 have already registered. Close to 100 companies […]

The post BizTalk Summit 2014, London – Statistics appeared first on BizTalk360 Blog.

Blog Post by: Saravana Kumar

Learn About Windows Azure BizTalk Services and IaaS at the London BizTalk Summit

Time is running out to get your ticket to attend the 2014 BizTalk Summit in London on March 3rd and 4th. With over 10 international speakers plus the BizTalk product group speaking, this is going to be a great event!

I’m really looking forward to learning the latest on Windows Azure BizTalk Services, something I just have not had a lot of time to play around with yet.

I will be presenting at the summit on BizTalk 2013 and Windows Azure Infrastructure as a Service. I have been working on some updated single-server and full-domain setup scripts that I will be showcasing and making available for download.

You can get more information on the summit at: http://www.biztalk360.com/BizTalk-Summit-2014/

Hope to see you there!

BizTalk Server Tip #16: Use Visual Studio for functional and load testing

Use Visual Studio's test capabilities for functional and load testing: you can simulate a realistic load on your environment while monitoring its performance and behavior in real time during the test. When load testing, start with short-duration tests first and then run long-duration reliability tests. Here are […]

The post BizTalk Server Tip #16: Use Visual Studio for functional and load testing appeared first on BizTalk360 Blog.

Blog Post by: Ricardo Torre

How to use HDInsight from Linux

HDInsight is very easy to use from PowerShell, but how would you create and delete a cluster from Linux? How would you submit a job and get the result?

Here is a simple sample along with pointers to further documentation.

1. Create a cluster

You can create a cluster with the Windows Azure Command Line Interface (CLI).

To install the CLI, go to http://windowsazure.com and open the downloads section. At the bottom of the page there are two links: one for the CLI itself and one for its documentation.

Once it is installed, you get an azure command line with many options.

The following bash script will create a cluster:

#!/bin/bash
# create an HDInsight cluster

# more information at http://www.windowsazure.com/en-us/documentation/articles/hdinsight-administer-use-command-line/

defaultStorageAccount='monstockageazure'
storageAccount2='wasbshared'
clusterName='monclusterhadoop'
clusterContainerName='monclusterhadoop2'
clusterVersion='2.1'
clusterAdmin='cornac'
clusterConfigFile='./hdinsightCluster.config'

subscription='demos874F33876Y'

clusterPassword='YHqj6sq#ap9'
defaultStorageAccountKey='9O5uEqY1MsT6LIKifmXL0bQgrQElbslvu4N6mX58mSpPa4sPtYPTL5YjvLvcQAItuw87BdLulZWnGJWZ/VCd6Q=='
storageAccount2Key='7on846mc+5u9AItkVIEYz1OXwJZ86gN7o7ExURXO3qWJy+jNO56EtfUmRur+/qKkFGc4drA4GvBmhYGiBMlj3g=='

azure account set $subscription

azure hdinsight cluster config create $clusterConfigFile
azure hdinsight cluster config set $clusterConfigFile --clusterName $clusterName --nodes 3 --location "North Europe" --storageAccountName "$defaultStorageAccount.blob.core.windows.net" --storageAccountKey "$defaultStorageAccountKey" --storageContainer "$clusterName" --username "$clusterAdmin" --clusterPassword "$clusterPassword"
azure hdinsight cluster config storage add $clusterConfigFile --storageAccountName "$storageAccount2.blob.core.windows.net" --storageAccountKey "$storageAccount2Key"

azure hdinsight cluster create --config $clusterConfigFile

2. Submit a job

HDInsight exposes an Apache REST API called WebHCat (formerly known as Templeton), which allows you to submit jobs. It is documented at https://cwiki.apache.org/confluence/display/Hive/WebHCat.

There are many ways to call a REST API from Linux. The one I chose for this post is Python. For this sample, install the “requests” module:

pip install requests

Then you can run the following script (02_submit_hive_job.py):

import requests #http://pypi.python.org/pypi/requests

clusterName='monclusterhadoop'
clusterAdmin='cornac'
clusterPassword='YHqj6sq#ap9'

#get WebHCat status
webHCatUrl='https://' + clusterName + '.azurehdinsight.net/templeton/v1/status'

r = requests.get(webHCatUrl, auth=(clusterAdmin, clusterPassword))

print r.status_code
print r.json()

#submit a hive job:
# SELECT * FROM hivesampletable limit 10
# http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/ds_HCatalog/hive.html

webHCatUrl='https://' + clusterName + '.azurehdinsight.net/templeton/v1/hive'

hive_params={'user.name':clusterAdmin,
             'execute':'SELECT * FROM hivesampletable limit 10',
             'statusdir': '/wasbwork/hive_from_python'}

r = requests.post(webHCatUrl, auth=(clusterAdmin, clusterPassword), data=hive_params)
print r.status_code
print r.json()

Run it with the following command line:

python 02_submit_hive_job.py

In my case, I got the following result:

benjguin@benjguinu2:~/dev/hdinsight_from_linux$ python 02_submit_hive_job.py
200
{u'status': u'ok', u'version': u'v1'}
200
{u'id': u'job_201402171346_0002'}

You can also get the status of a job, submit Pig jobs, or submit Hive jobs from scripts you have uploaded to Windows Azure Blob storage. Here is a link to the documentation by Hortonworks, which has a table of contents on the left of the page:

http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/ds_HCatalog/hive.html
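
As an illustration of checking job status, here is a minimal sketch (not part of the original scripts) that polls WebHCat for the job submitted earlier, using the job id returned at submission time. It reuses the same “requests” module; the queue/&lt;jobid&gt; endpoint and the completed/percentComplete fields are taken from the WebHCat documentation linked above, so their exact names may vary with the WebHCat version.

import time
import requests #http://pypi.python.org/pypi/requests

clusterName='monclusterhadoop'
clusterAdmin='cornac'
clusterPassword='YHqj6sq#ap9'
jobId='job_201402171346_0002' #the id returned by 02_submit_hive_job.py

#WebHCat job status endpoint (newer WebHCat versions also expose jobs/<jobid>)
webHCatUrl='https://' + clusterName + '.azurehdinsight.net/templeton/v1/queue/' + jobId

#poll every 10 seconds until WebHCat reports the job as completed
while True:
    r = requests.get(webHCatUrl, auth=(clusterAdmin, clusterPassword))
    job = r.json()
    print r.status_code, job.get('completed'), job.get('percentComplete')
    if job.get('completed') == 'done':
        break
    time.sleep(10)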

3. Get the result

In the Python script, we asked for the result to be written to /wasbwork/hive_from_python, so it is stored in Windows Azure Storage Blob, or wasb (in HDInsight, wasb is the default file system; HDFS is also available at hdfs://namenodehost:9000/). Once the job is finished, which a script can detect with the same REST API, the result files (including stdout) are available in that folder.

So you can download the result with the azure CLI and view it with this bash script:

#!/bin/bash

defaultStorageAccount='monstockageazure'
clusterName='monclusterhadoop'
defaultStorageAccountKey='9O5uEqY1MsT6LIKifmXL0bQgrQElbslvu4N6mX58mSpPa4sPtYPTL5YjvLvcQAItuw87BdLulZWnGJWZ/VCd6Q=='

export AZURE_STORAGE_ACCOUNT="$defaultStorageAccount"
export AZURE_STORAGE_ACCESS_KEY="$defaultStorageAccountKey"

azure storage blob download $clusterName wasbwork/hive_from_python/stdout
cat wasbwork/hive_from_python/stdout

In my case, this gave the following result:

benjguin@benjguinu2:~/dev/hdinsight_from_linux$ ./03_get_result.sh
info:    Executing command storage blob download
+ Download blob wasbwork/hive_from_python/stdout in container monclusterhadoop to wasbwork/hive_from_python/stdout
Percentage: 100.0% (809.00B/809.00B) Average Speed: 809.00B/S Elapsed Time: 00:00:00
+ Getting Storage blob information
info:    File saved as wasbwork/hive_from_python/stdout
info:    storage blob download command OK
8       18:54:20        en-US   Android Samsung SCH-i500        California      United States   13.9204007      0       0
23      19:19:44        en-US   Android HTC     Incredible      Pennsylvania    United States   NULL    0       0
23      19:19:46        en-US   Android HTC     Incredible      Pennsylvania    United States   1.4757422       0       1
23      19:19:47        en-US   Android HTC     Incredible      Pennsylvania    United States   0.245968        0       2
28      01:37:50        en-US   Android Motorola        Droid X Colorado        United States   20.3095339      1       1
28      00:53:31        en-US   Android Motorola        Droid X Colorado        United States   16.2981668      0       0
28      00:53:50        en-US   Android Motorola        Droid X Colorado        United States   1.7715228       0       1
28      16:44:21        en-US   Android Motorola        Droid X Utah    United States   11.6755987      2       1
28      16:43:41        en-US   Android Motorola        Droid X Utah    United States   36.9446892      2       0
28      01:37:19        en-US   Android Motorola        Droid X Colorado        United States   28.9811416      1       0
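
The stdout file downloaded above is plain tab-separated text, one line per row of the SELECT (Hive does not include column headers; the columns come from hivesampletable's schema). As a small sketch, assuming the same local path used by the download script, it can be post-processed in Python like this:

#read the downloaded stdout file and split each line on tabs
with open('wasbwork/hive_from_python/stdout') as f:
    rows = [line.rstrip('\n').split('\t') for line in f if line.strip()]

for row in rows:
    print row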

4. Remove the cluster

To remove the cluster, the azure CLI also helps:

#!/bin/bash

clusterName='monclusterhadoop'

azure hdinsight cluster delete $clusterName

This produces the following sample result:

benjguin@benjguinu2:~/dev/hdinsight_from_linux$ ./04_removeCluster.sh
info:    Executing command hdinsight cluster delete
+ Removing HDInsight Cluster
info:    hdinsight cluster delete command OK
benjguin@benjguinu2:~/dev/hdinsight_from_linux$

Conclusion

This post shows only a few simple examples; the goal is to illustrate the principles that can be used. The azure CLI is used to manage the cluster itself, and can also be used to interact with Windows Azure Storage blobs. Submitting jobs is done with WebHCat REST calls.

Benjamin (@benjguin)

Blog Post by: Benjamin GUINEBERTIERE

BizTalk Server Tip #15: Split big messages in the receive pipeline

If you have a big message, consider using envelope schemas and the default pipelines to split the message at the entry point into BizTalk for best performance and resource utilization. This method also doesn't require creating custom pipelines or doing very expensive splitting in orchestrations.

The post BizTalk Server Tip #15: Split big messages in the receive pipeline appeared first on BizTalk360 Blog.

Blog Post by: Ricardo Torre