by community-syndication | Feb 19, 2014 | BizTalk Community Blogs via Syndication
At the beginning of March I will be touring Europe, speaking at several user groups and at one huge event: the BizTalk Summit 2014. The latter takes place in London on the 3rd and 4th of March. During this event I will be sharing the stage with eleven other Microsoft Integration MVPs and four Microsoft Product Group members. There are still a few tickets left: http://www.biztalk360.com/BizTalk-Summit-2014/
by community-syndication | Feb 18, 2014 | BizTalk Community Blogs via Syndication
QuickLearn is committed to the ALM community and has signed up as a Gold Sponsor for the upcoming ALM Forum Seattle 2014. BEFORE THE EVENT – WIN A FREE TICKET You can win a free ticket (valued at $1,895) to the ALM Forum Seattle on 1st-3rd April 2014. We have two free tickets to give […]
Blog Post by: Anthony Borton (TFS Instructor)
by community-syndication | Feb 18, 2014 | BizTalk Community Blogs via Syndication
The BizTalk Summit 2014, London event is now very close, with only a couple more weekends left. I thought I would share some statistics about the event, attendees and speakers. Arguably this is the biggest ever BizTalk/Integration-focused event held in Europe. We are aiming for 200 attendees, and around 180 have already registered. Close to 100 companies […]
The post BizTalk Summit 2014, London – Statistics appeared first on BizTalk360 Blog.
Blog Post by: Saravana Kumar
by community-syndication | Feb 18, 2014 | BizTalk Community Blogs via Syndication
Meetings are a necessary evil. Sometimes, you simply have to get a bunch of people together at one time in order to resolve a problem or share information. While Tier 3 was a very collaborative environment where there were very few meetings and information flowed freely, our new parent company (CenturyLink) has 50,000 people spread […]
Blog Post by: Richard Seroter
by stephen-w-thomas | Feb 18, 2014 | Stephen's BizTalk and Integration Blog
Time is running out to get your ticket to the 2014 BizTalk Summit in London on March 3rd and 4th. With over 10 international speakers plus the BizTalk product group speaking, this is going to be a great event!
I’m really looking forward to learning the latest on Windows Azure BizTalk Services, something I just have not had a lot of time to play around with yet.
I will be presenting at the summit on BizTalk 2013 and Windows Azure Infrastructure as a Service. I have been working on some updated single-server and full-domain setup scripts that I will be showcasing and making available for download.
You can get more information on the summit at: http://www.biztalk360.com/BizTalk-Summit-2014/
Hope to see you there!
by community-syndication | Feb 18, 2014 | BizTalk Community Blogs via Syndication
HDInsight is very easy to use from PowerShell, but how would you create and delete a cluster from Linux? How would you submit a job and get the result?
Here is a simple sample, with pointers to further documentation.
1. Create a cluster
You can create a cluster with the Windows Azure Command Line Interface (CLI).
To install the CLI, go to the downloads page at http://windowsazure.com. At the bottom of that page there are two links: one for the CLI itself and one for its documentation.
Once you have installed it, you get an azure command line with many options.
The following bash script will create a cluster:
#!/bin/bash
# create an HDInsight cluster
# more information at http://www.windowsazure.com/en-us/documentation/articles/hdinsight-administer-use-command-line/
defaultStorageAccount='monstockageazure'
storageAccount2='wasbshared'
clusterName='monclusterhadoop'
clusterContainerName='monclusterhadoop2'
clusterVersion='2.1'
clusterAdmin='cornac'
clusterConfigFile='./hdinsightCluster.config'
subscription='demos874F33876Y'
clusterPassword='YHqj6sq#ap9'
defaultStorageAccountKey='9O5uEqY1MsT6LIKifmXL0bQgrQElbslvu4N6mX58mSpPa4sPtYPTL5YjvLvcQAItuw87BdLulZWnGJWZ/VCd6Q=='
storageAccount2Key='7on846mc+5u9AItkVIEYz1OXwJZ86gN7o7ExURXO3qWJy+jNO56EtfUmRur+/qKkFGc4drA4GvBmhYGiBMlj3g=='
azure account set $subscription
azure hdinsight cluster config create $clusterConfigFile
azure hdinsight cluster config set $clusterConfigFile --clusterName $clusterName --nodes 3 --location "North Europe" --storageAccountName "$defaultStorageAccount.blob.core.windows.net" --storageAccountKey "$defaultStorageAccountKey" --storageContainer "$clusterName" --username "$clusterAdmin" --clusterPassword "$clusterPassword"
azure hdinsight cluster config storage add $clusterConfigFile --storageAccountName "$storageAccount2.blob.core.windows.net" --storageAccountKey "$storageAccount2Key"
azure hdinsight cluster create --config $clusterConfigFile
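If you would rather drive these CLI calls from Python (the language used in the next step), the sketch below builds three of the commands from the bash script as argument lists, so values such as a password containing '#' need no shell quoting. It only reuses the subcommands and flags shown above; the function names and the runner parameter are hypothetical helpers, and the account selection and the second storage account are left out for brevity.

```python
import subprocess


def cluster_create_commands(config_file, cluster_name, storage_account,
                            storage_key, admin, password, nodes=3,
                            location='North Europe'):
    """Return the azure CLI calls from the bash script as argument lists."""
    return [
        ['azure', 'hdinsight', 'cluster', 'config', 'create', config_file],
        ['azure', 'hdinsight', 'cluster', 'config', 'set', config_file,
         '--clusterName', cluster_name,
         '--nodes', str(nodes),
         '--location', location,
         '--storageAccountName', storage_account + '.blob.core.windows.net',
         '--storageAccountKey', storage_key,
         # as in the bash script, the container is named after the cluster
         '--storageContainer', cluster_name,
         '--username', admin,
         '--clusterPassword', password],
        ['azure', 'hdinsight', 'cluster', 'create', '--config', config_file],
    ]


def run_all(commands, runner=subprocess.check_call):
    """Run each command in order; check_call raises on a non-zero exit code."""
    for command in commands:
        runner(command)
```

Passing a different runner (for example a function that just prints each list) gives a dry run without touching the subscription.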
2. Submit a job
HDInsight exposes an Apache REST API called WebHCat (formerly known as Templeton), which lets you submit jobs. It is documented at https://cwiki.apache.org/confluence/display/Hive/WebHCat.
There are tons of ways to call a REST API from Linux. The one I chose for this post is Python. For this sample, install the “requests” module:
pip install requests
Then you can run this script (02_submit_hive_job.py):
import requests  # http://pypi.python.org/pypi/requests

clusterName='monclusterhadoop'
clusterAdmin='cornac'
clusterPassword='YHqj6sq#ap9'

# get WebHCat status
# (print(...) with a single argument behaves the same on Python 2 and 3)
webHCatUrl='https://' + clusterName + '.azurehdinsight.net/templeton/v1/status'
r = requests.get(webHCatUrl, auth=(clusterAdmin, clusterPassword))
print(r.status_code)
print(r.json())

# submit a hive job:
# SELECT * FROM hivesampletable limit 10
# http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/ds_HCatalog/hive.html
webHCatUrl='https://' + clusterName + '.azurehdinsight.net/templeton/v1/hive'
hive_params={'user.name': clusterAdmin,
             'execute': 'SELECT * FROM hivesampletable limit 10',
             'statusdir': '/wasbwork/hive_from_python'}
r = requests.post(webHCatUrl, auth=(clusterAdmin, clusterPassword), data=hive_params)
print(r.status_code)
print(r.json())  # the response contains the id of the submitted job
with the following command line:
python 02_submit_hive_job.py
In my case, I got the following result:
benjguin@benjguinu2:~/dev/hdinsight_from_linux$ python 02_submit_hive_job.py
200
{u'status': u'ok', u'version': u'v1'}
200
{u'id': u'job_201402171346_0002'}
You can also get the status of a job, submit Pig jobs, or submit Hive jobs from scripts you uploaded to Windows Azure Storage Blob. Here is a link to the Hortonworks documentation:
http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/ds_HCatalog/hive.html
That page has a table of contents on the left.
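As an illustration of checking the job status mentioned above, here is a minimal polling sketch. It assumes the WebHCat queue/:jobid resource and assumes that its JSON reply carries a 'completed' field that stays null until the job ends; both should be checked against the WebHCat documentation for your cluster version. The function names are hypothetical, and the HTTP call is passed in as a callable so the loop itself does not depend on any library.

```python
import time


def job_status_url(cluster_name, job_id):
    """Build the status URL (ASSUMPTION: WebHCat exposes queue/<job id>)."""
    return ('https://' + cluster_name +
            '.azurehdinsight.net/templeton/v1/queue/' + job_id)


def is_finished(status):
    """ASSUMPTION: 'completed' is null/absent while the job is running."""
    return status.get('completed') is not None


def wait_for_job(fetch_status, poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Call fetch_status() until the job is finished.

    Returns the last status dict, or None if max_polls is exhausted.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if is_finished(status):
            return status
        sleep(poll_seconds)
    return None


# Possible usage with the variables of 02_submit_hive_job.py:
# url = job_status_url(clusterName, r.json()['id'])
# status = wait_for_job(lambda: requests.get(
#     url, params={'user.name': clusterAdmin},
#     auth=(clusterAdmin, clusterPassword)).json())
```

Bounding the number of polls keeps a script from hanging forever if the job never reports completion.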
3. Get the result
In the Python script, we asked for the result to be written to /wasbwork/hive_from_python, so it is stored in the Windows Azure Storage Blob, or wasb (in HDInsight, wasb is the default file system over HDFS, which is also available at hdfs://namenodehost:9000/()). Once the job is finished (a script can find that out with the same REST API), the output files, including stdout, are available in that folder.
You can then download the result with the azure CLI and display it with this bash script:
#!/bin/bash
defaultStorageAccount='monstockageazure'
clusterName='monclusterhadoop'
defaultStorageAccountKey='9O5uEqY1MsT6LIKifmXL0bQgrQElbslvu4N6mX58mSpPa4sPtYPTL5YjvLvcQAItuw87BdLulZWnGJWZ/VCd6Q=='
export AZURE_STORAGE_ACCOUNT="$defaultStorageAccount"
export AZURE_STORAGE_ACCESS_KEY="$defaultStorageAccountKey"
azure storage blob download $clusterName wasbwork/hive_from_python/stdout
cat wasbwork/hive_from_python/stdout
In my case, this gave the following result:
benjguin@benjguinu2:~/dev/hdinsight_from_linux$ ./03_get_result.sh
info: Executing command storage blob download
+ Download blob wasbwork/hive_from_python/stdout in container monclusterhadoop to wasbwork/hive_from_python/stdout
Percentage: 100.0% (809.00B/809.00B) Average Speed: 809.00B/S Elapsed Time: 00:00:00
+ Getting Storage blob information
info: File saved as wasbwork/hive_from_python/stdout
info: storage blob download command OK
8 18:54:20 en-US Android Samsung SCH-i500 California United States 13.9204007 0 0
23 19:19:44 en-US Android HTC Incredible Pennsylvania United States NULL 0 0
23 19:19:46 en-US Android HTC Incredible Pennsylvania United States 1.4757422 0 1
23 19:19:47 en-US Android HTC Incredible Pennsylvania United States 0.245968 0 2
28 01:37:50 en-US Android Motorola Droid X Colorado United States 20.3095339 1 1
28 00:53:31 en-US Android Motorola Droid X Colorado United States 16.2981668 0 0
28 00:53:50 en-US Android Motorola Droid X Colorado United States 1.7715228 0 1
28 16:44:21 en-US Android Motorola Droid X Utah United States 11.6755987 2 1
28 16:43:41 en-US Android Motorola Droid X Utah United States 36.9446892 2 0
28 01:37:19 en-US Android Motorola Droid X Colorado United States 28.9811416 1 0
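To process these rows in a script rather than just display them, the downloaded file can be split into fields. The sketch below assumes the stdout file uses Hive's default output layout (one row per line, fields separated by tabs, missing values written as NULL); parse_hive_stdout is a hypothetical helper name, and no column names are assumed.

```python
def parse_hive_stdout(text):
    """Split Hive text output into rows of fields.

    ASSUMPTION: fields are tab-separated and missing values appear as
    the literal string NULL, which is mapped to None here.
    """
    rows = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines
        rows.append([None if field == 'NULL' else field
                     for field in line.split('\t')])
    return rows


# Possible usage with the file downloaded by 03_get_result.sh:
# with open('wasbwork/hive_from_python/stdout') as f:
#     rows = parse_hive_stdout(f.read())
```

All values stay as strings; converting numeric columns is left to the caller, since the table schema is not shown here.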
4. Remove the cluster
In order to remove the cluster, the azure CLI will also help:
#!/bin/bash
clusterName='monclusterhadoop'
azure hdinsight cluster delete $clusterName
This produces the following sample result:
benjguin@benjguinu2:~/dev/hdinsight_from_linux$ ./04_removeCluster.sh
info: Executing command hdinsight cluster delete
+ Removing HDInsight Cluster
info: hdinsight cluster delete command OK
benjguin@benjguinu2:~/dev/hdinsight_from_linux$
Conclusion
This post only shows a few simple examples. The goal is to show the principles that can be used. The azure CLI is used to manage the cluster itself, and may also be used to interact with Windows Azure Storage blobs. Submitting jobs can be done with WebHCat REST calls.
Benjamin (@benjguin)
Blog Post by: Benjamin GUINEBERTIERE