How the per-core licensing model will (by my guess) affect how we build BizTalk environments today

SQL Server 2012 comes with a new licensing model. Instead of paying per processor socket, you will pay per core. Licenses are sold in two-core packs, with a minimum of four core licenses needed per physical processor (even if that processor only has two cores). The price per core is about one quarter (1/4) of today's price per processor. This means that anything other than a quad-core processor architecture either costs more, in the case of hexa-core processors, or gives you less value than what you are paying for, in the case of dual-core processors.
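The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration of the rules as described in this post, not an official price calculator. The function names are made up for the example, and rounding an odd core count up to the next two-core pack is my assumption.

```python
import math

MIN_CORE_LICENSES_PER_PROCESSOR = 4  # minimum per physical processor, as stated in the post
PACK_SIZE = 2                        # licenses are sold in two-core packs
CORE_PRICE_FACTOR = 0.25             # one core license costs about 1/4 of an old per-processor license


def core_licenses_needed(cores: int) -> int:
    """Core licenses to buy for one physical processor (hypothetical helper)."""
    if cores < 1:
        raise ValueError("a processor has at least one core")
    licensed = max(cores, MIN_CORE_LICENSES_PER_PROCESSOR)
    # Assumption: an odd number of cores is rounded up to whole two-core packs.
    return math.ceil(licensed / PACK_SIZE) * PACK_SIZE


def relative_cost(cores: int) -> float:
    """Cost per processor, expressed in units of the old per-processor price."""
    return core_licenses_needed(cores) * CORE_PRICE_FACTOR


for cores in (2, 4, 6):
    print(cores, "cores:", core_licenses_needed(cores), "licenses,",
          relative_cost(cores), "x the old per-processor price")
```

Under these assumptions a dual-core and a quad-core processor both cost the same as one old processor license, while a hexa-core processor costs one and a half times as much.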

To keep the value without increasing the cost, design your environments around quad-core processors as far as possible instead of, for example, hexa-core processors, especially if you are not sure you need that extra bit of processing power.

Although this change is explicitly announced for SQL Server and so far does not extend to BizTalk Server, given the way processor development has been moving over the last couple of years, it stands to reason that it may very well be (and by my guess will be) extended to other products in the future. In my view the change is inevitable.

Therefore, today I would opt for quad-core processors over hexa-core ones when using physical servers, for both SQL Server and BizTalk, to get a smooth transition to the new licensing models.

Or at the very least, think twice before you follow old preferences and choose as many cores as possible just to get the most value out of the current licensing model.

Virtualization stays pretty much the same: you pay either for the virtual cores or for the physical cores (under the same rules as stated above).

At this point in time this is not new information. For licensing technicians this is a well-known fact, but for other technicians or architects, this information might not have surfaced yet.

HTH,
/Johan

Blog Post by: Johan Hedberg

Using the bitwise operator for subscriptions (part 2)

In part 1 I defined what the bitwise operator is and how you can use it in your .NET code. Now, let us see how we can use the same functionality in BizTalk to route messages.

How to use the bitwise operator in BizTalk

One of the great things about my job is that I get to move around a lot. Different customers have different uses for BizTalk, and this leads to new ways to use it. This scenario is not unique, but it was new to me anyway.

BizTalk receives a message and needs to route it to different receivers. The receivers will be assigned using some form of “lookup” based on the message type. Message type A must be sent to system 1, 2 and 4. Message type B must be sent to system 3 and 4, and so on.

As usual, the preferred solution makes no use of an orchestration, can be reconfigured without redeployment, and makes adding a new receiver very easy, with only a minor impact on the current deployment.

This is a good time to use the bitwise operator!

Setup in BizTalk

This is just a simple proof of concept and, as such, not complete. Keep that in mind and please provide feedback if anything is missing.

The schema: The bitwise operator will solve the task. To use that, and also avoid using an orchestration, BizTalk has to be able to route messages based on a value in the file. In order to do that, a promoted property is used. In this case a field called “ReceiverField”.

It is important to remember that in this particular case, the promoted property must be an unsignedInt. This makes sense since you cannot use bitwise on a combination of positive and negative numbers. If you use Int, the operator will not show up in the administration console.

Receive port: Now set up the ports. First we need a receive port to pick up the incoming data. In this case a simple file receive location is used, but there is nothing stopping you from using other protocols. The bitwise handling will still work. The important thing to remember is to use the XMLReceive pipeline (as opposed to PassThruReceive), since the property is only promoted when the message is parsed.

Sendport for different systems: Now for the magic part. Add a sendport for each of the systems you need to send the information to. Make sure to configure filters to reflect the particular number you have assigned to that system.

In the picture above, the filter is configured to send files received on the previously configured receive port and then the bitwise operator is used to look for the number two in the receiver field in the previously created schema.

Add another port for another system:

The important thing to notice here is that there really is not any major difference. The only important difference is the number for the receiver field, which is now set to 4.

Start all the ports and create a test file to be dropped in the receive folder. Set the value of the receiver field to “2” and drop the file in the receive folder. Only the send port marked with a “2” will pick it up.

Try again and change the value to “4”; now only the second port will pick it up. Lastly, set the value to “6”; now both ports will pick it up!
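To see why “6” reaches both ports, the filter evaluation can be imitated outside BizTalk. The Python sketch below is only a model of the behaviour observed in the test. The port names are hypothetical, and I am assuming the filter matches when the port's bit is set in the receiver field.

```python
# Hypothetical send ports and the flag each one filters on.
SEND_PORTS = {
    "SendToFirstSystem": 2,
    "SendToSecondSystem": 4,
}


def matches(receiver_field: int, port_flag: int) -> bool:
    """True when every bit of port_flag is set in receiver_field."""
    return (receiver_field & port_flag) == port_flag


def subscribers(receiver_field: int) -> list:
    """Names of the send ports whose filter would pick up the message."""
    return [name for name, flag in SEND_PORTS.items()
            if matches(receiver_field, flag)]


for value in (2, 4, 6):
    print(value, "->", subscribers(value))
```

Since 6 is 2 + 4, both bits are set and both filters match. A value such as 1 matches neither port.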

The upsides

The most important gain is of course the routing but there are some other benefits as well. Firstly, you can put different format mapping for different systems in their respective ports. This is also true for pipelines, so if one of the systems is using custom formatting, that is not a problem.

Next, adding another system is simple as all you have to do is add another send port, and configure it to look for the next number (double the last).

The downsides

This logic cannot be used in orchestrations. You cannot use the bitwise operator as a part of an activation filter in the orchestration, sadly.

Another thing that might be a problem is the code, or handling, by which you assign the number in the “ReceiverField”. If BizTalk is responsible for assigning this value, you have to make sure that adding another system is just as simple as adding another send port. Maybe you have a database that matches message types and assigns the number. BizTalk can then simply use a database functoid in a map to look it up.
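How such a lookup could produce the number is sketched below in Python. The routing table mirrors the earlier example (type A goes to systems 1, 2 and 4, type B to systems 3 and 4). The table itself, the function names, and the convention that system n gets the flag 2^(n-1) are assumptions made for the illustration, not part of the BizTalk setup above.

```python
from functools import reduce
from operator import or_

# Hypothetical lookup table: message type -> systems that should receive it.
ROUTING_TABLE = {
    "A": [1, 2, 4],
    "B": [3, 4],
}


def system_flag(system_number: int) -> int:
    """Assumed convention: system n owns bit n-1, so flags run 1, 2, 4, 8, ..."""
    if system_number < 1:
        raise ValueError("system numbers start at 1")
    return 1 << (system_number - 1)


def receiver_number(message_type: str) -> int:
    """Combine the flags of all receiving systems into one number."""
    flags = (system_flag(n) for n in ROUTING_TABLE[message_type])
    return reduce(or_, flags, 0)


print(receiver_number("A"))  # flags 1, 2 and 8 combined
print(receiver_number("B"))  # flags 4 and 8 combined
```

Adding a system then means adding one row to the table and one send port filtering on the next free flag.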

Blog Post by: Mikael Sand

BizTalk and RabbitMQ

The same article is available on TechNet.

If you are working with queues in BizTalk Server, you are most likely using MSMQ.

MSMQ is an old man of the Microsoft technology stack. It was created when there were no good standards for messaging. MSMQ is now partly wrapped in the .NET System.Messaging namespace, but that is just a small facelift. MSMQ is still a proprietary technology without a well-defined messaging protocol. This means you cannot use the MSMQ messaging protocol without MSMQ itself.

Now we see a vortex of new messaging standards and technologies. One of the top ones is the AMQP standard, and one of the bold implementations of AMQP is RabbitMQ.

It is one more protocol, one more messaging system which, for sure, can be integrated with BizTalk Server.

Here I will implement standard queue messaging: sending and receiving messages between BizTalk and a RabbitMQ queue.

Installation and preparation

I assume that BizTalk Server and Visual Studio 2010 are already installed.

Installing RabbitMQ Service

  1. Go to the RabbitMQ site and download the latest versions of the server and the client for Windows. For me these were rabbitmq-server-2.7.1.exe and rabbitmq-dotnet-client-2.7.1-dotnet-3.0.zip. The client includes the WCF binding, which I will use later. There are several good manuals: rabbitmq-dotnet-client-2.7.1-user-guide.pdf and rabbitmq-dotnet-client-2.7.1-wcf-service-model.pdf. I was quite impressed by the quality of the distribution packages and the documentation.
  2. Start rabbitmq-server-x.x.x.exe. It will ask you to install Erlang. Agree, go to the Erlang site [http://www.erlang.org/] and download it. For me it was Erlang.otp_win32_R15B.exe.
  3. Start the Erlang installer. The installation went smoothly.
  4. Start rabbitmq-server-x.x.x.exe again. The installation went smoothly.

RabbitMQ service is installed. You can see it in the Services window.

Creating RabbitMQ Client Base Assemblies

Now it is time to install the client for .NET and the WCF binding extension. Use this manual.

  1. I downloaded the .msi and installed it. Hmm, no RabbitMQ.Client.dll (when I started with the examples, they all needed this file; it is the main client DLL).
  2. I downloaded the *-dotnet-3.0.zip. Nope, that was wrong too.
  3. We need the latest rabbitmq-dotnet-client-2.7.1.zip file. Extract everything from this zip.
  4. Copy the Local.props.example file to Local.props in the same root folder.
  5. Open RabbitMQDotNetClient.sln in Visual Studio and build it. The ..\rabbitmq-dotnet-client-2.7.1\projects\wcf\RabbitMQ.ServiceModel\obj\Debug\RabbitMQ.ServiceModel.dll was created. This DLL, together with RabbitMQ.Client.dll, makes up the assemblies used by RabbitMQ clients.

Set up the RabbitMQ binding extension

Here I really lost a lot of time. It is easy for C# clients; the RabbitMQ.ServiceModel.Test project gives all the information. But we have to use this binding from the WCF-Custom LOB adapter. To do so, we have to add the assemblies to the GAC, change machine.config, etc. See the WCF LOB Adapter documentation.

  1. Sign the RabbitMQ.ServiceModel and RabbitMQ.Client projects and add them to the GAC.
  2. Insert an <add> node into the <bindingExtensions> section of machine.config:

<extensions>
  <bindingExtensions>
    <add name="RabbitMQBinding"
         type="RabbitMQ.ServiceModel.RabbitMQBindingSection, RabbitMQ.ServiceModel, Version=0.0.0.0, Culture=neutral, PublicKeyToken=1751e286f1ab778d"/>
  </bindingExtensions>
</extensions>

Note:

  • When the binding extension assembly is placed in the GAC, we cannot use the above section in the app.config file! This <add> element must be placed only in the machine.config file. Otherwise it is not visible in the binding extension list of the WCF LOB Adapter.
  • Run
    gacutil.exe /l RabbitMQ.ServiceModel
    to get the real parameters for your assembly.
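If you prefer to script the change rather than edit the file by hand, the idea can be sketched with Python's standard XML library. This is a minimal sketch that works on an <extensions> fragment held in a string; it does not locate or touch a real machine.config. The helper name is hypothetical, and the version and public key token are the values from my machine, so replace them with what gacutil reports for yours.

```python
import xml.etree.ElementTree as ET

# Values copied from the <add> node above; adjust them to your own assembly.
BINDING_NAME = "RabbitMQBinding"
BINDING_TYPE = (
    "RabbitMQ.ServiceModel.RabbitMQBindingSection, RabbitMQ.ServiceModel, "
    "Version=0.0.0.0, Culture=neutral, PublicKeyToken=1751e286f1ab778d"
)


def add_binding_extension(extensions_xml: str, name: str, type_name: str) -> str:
    """Return the <extensions> fragment with the binding extension registered.

    Hypothetical helper: running it twice does not create a duplicate entry.
    """
    root = ET.fromstring(extensions_xml)
    section = root.find("bindingExtensions")
    if section is None:
        section = ET.SubElement(root, "bindingExtensions")
    for existing in section.findall("add"):
        if existing.get("name") == name:
            existing.set("type", type_name)  # refresh an existing entry
            break
    else:
        ET.SubElement(section, "add", {"name": name, "type": type_name})
    return ET.tostring(root, encoding="unicode")


updated = add_binding_extension(
    "<extensions><bindingExtensions /></extensions>", BINDING_NAME, BINDING_TYPE
)
print(updated)
```

Take a backup of machine.config before applying anything like this to the real file.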

I have several machine.config files in folders:

    • C:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG
    • C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config
    • C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG
    • C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config

Which one to use? My VM is 64bit. BizTalk Host of the port is 32bit. BizTalk is the 2010 version. In my case the right file was the C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config\machine.config.
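The rule I ended up with can be written down as a small Python helper. It is a sketch based only on what I observed here: the bitness of the BizTalk host process picks Framework or Framework64, and the CLR version picks the version folder. The function name is hypothetical and the default Windows directory is an assumption.

```python
from pathlib import PureWindowsPath


def machine_config_path(host_is_64bit: bool, clr_version: str,
                        windows_dir: str = r"C:\Windows") -> PureWindowsPath:
    """Guess which machine.config a host process reads (hypothetical helper)."""
    framework = "Framework64" if host_is_64bit else "Framework"
    return PureWindowsPath(windows_dir, "Microsoft.NET", framework,
                           clr_version, "Config", "machine.config")


# My case: a 32-bit BizTalk 2010 host on a 64-bit VM.
print(machine_config_path(host_is_64bit=False, clr_version="v4.0.30319"))
```

Note that it is the bitness of the host instance that matters here, not the bitness of the operating system.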

Finally the RabbitMQ binding is installed, and we can see it in the Binding Type list of the WCF-Custom adapter.

Testing

Creating the BizTalk test

I have created two pairs of ports for testing.

File RP -> WCF-Custom (RabbitMQ) SP ->

[RabbitMQ Server]

-> WCF-Custom (RabbitMQ) RP -> File SP

I’ve changed only two parameters, “hostname” and “oneWay”, and left the others at their defaults.

The URLs for the RabbitMQ send and receive ports are set to “soap.amqp://192.168.11.128/Mytest” for both of them.

Let’s test it.

A small text file was dropped into the In folder, consumed by the receive port, went through the RabbitMQ queue, and appeared in the Out folder. Success.

Note:

  • Currently the RabbitMQ binding is set up to process only well-formed XML documents.
  • “localhost” as “hostname” works fine.

Using the RabbitMQ Binding Element

The RabbitMQ binding is limited to the Text message encoding. If we need to change it or add more binding elements, we have to use the customBinding type and combine the rabbitMQBindingElement with other elements (including another encoding). How to do this?

1. Add the <add> node to the machine.config:

<bindingElementExtensions>
  <add name="rabbitMQBindingElement" type="RabbitMQ.ServiceModel.RabbitMQTransportElement, RabbitMQ.ServiceModel, Version=0.0.0.0, Culture=neutral, PublicKeyToken=1751e286f1ab778d"/>
</bindingElementExtensions>

2. Change the Receive Location transport parameters. The binding type was RabbitMQBinding; change it to customBinding. The Transport Binding Element has to be changed to the rabbitMQBindingElement. I changed only one parameter: hostname, set to “localhost”.

3. Test it as before.

Conclusion

Using RabbitMQ as one more transport for BizTalk and/or WCF applications is possible and straightforward. The learning curve is short. There are not too many issues with setting up the RabbitMQ service and client parts. Actually, no part of RabbitMQ is exposed to BizTalk directly. RabbitMQ works only with the WCF-Custom adapter, and only this adapter communicates with the RabbitMQ service.
Currently RabbitMQ implements limited client functionality for WCF. RabbitMQ implements version 0-9-1 of the AMQP protocol, not the latest version, 1.0.

Next steps

Next steps are to test this solution with different messaging patterns and try to figure out how to use the advantages of RabbitMQ (if there are any) over MSMQ. Potentially the advantages are a simpler implementation of the Request/Response and Duplex patterns, better scalability, better performance, and better manageability and support.

Using the bitwise operator for subscriptions (part 1)

Did you know that there is a “bitwise” operator that you can use when you define subscriptions in BizTalk? I surely did not! I just stumbled upon it, and since I could not find any really good article on it, I thought I would give it a try:

What is the Bitwise operator?

This is certainly best answered by others, like Wikipedia or this guy, or Microsoft of course.

If you want the short, short version: it is a good way to let one number represent a combination of other numbers, so that you can tell whether a particular number is present. Well, that did not explain a lot. Let me try a longer version.

What problem does bitwise solve?

Bitwise is often used with enums in .NET, and if you want to use bitwise on an enum you mark it with the [Flags] attribute. The bitwise AND operator is a single ampersand (&), not to be confused with the logical AND operator (&&).

Imagine you are designing role-based access management in a web application. You decide to have three different kinds of users (or roles): standard users, project managers and administrators.
The standard user has access to information about his/her current projects. The project manager can add users to and remove users from projects, and lastly the administrator can add projects and users.

When a web page is accessed, the user membership is checked and the information (perhaps in tabs) is displayed to the user depending on the user’s role.

How would you solve the fact that there might be users that are both standard users and project managers (depending on project)? Or, how about a user being administrator, project manager and standard user? How do you assign one number that combines different roles and give the user a particular access number?

Of course there are a lot of ways to do this but that does not prove my point, so I will assume that the way to do it is using [Flags] enum and the bitwise operator.

The enum values

To make this work you have to assign values to your enum in a specific way: after the initial value 0 (no role), the first real value is 1, and each following value is double the previous one.

using System;

[Flags]
enum MyRoles : uint
{
    Anon = 0,        // no role assigned
    StdUsr = 1,      // binary 001
    ProjectMgr = 2,  // binary 010
    Admin = 4        // binary 100
}

So, a standard user has an access number of 1 and a project manager has the number 2. A user that is both a standard user and a project manager gets an access number of 3 (1 + 2 = 3). This number is unique in the series, and this holds true for all combinations, as long as the number of roles fits in the underlying type (32 flags for a uint).

Based on this logic, there is an easy way to find out if a user is in a particular role by using the &-operator. If you want to know if the user is a project manager, the code is simple:

if ((MyUser.AccessNumber &
      MyRoles.ProjectMgr) == MyRoles.ProjectMgr)

If the expression returns true, the user is a project manager.

To be clear, there is a much simpler way of writing the same expression in .NET 4.0. Here, the bitwise operator is used under the covers.

if ( MyUser.AccessNumber.HasFlag(
     MyRoles.ProjectMgr))
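For readers who want to try the idea outside .NET, the same check can be sketched in Python with enum.IntFlag. This is only an illustration of the technique, not a translation of any production code; the has_role helper is a name made up for the example.

```python
from enum import IntFlag


class MyRoles(IntFlag):
    ANON = 0         # no role assigned
    STD_USR = 1      # binary 001
    PROJECT_MGR = 2  # binary 010
    ADMIN = 4        # binary 100


def has_role(access_number: int, role: MyRoles) -> bool:
    """Same test as the C# expression: (access & role) == role."""
    return (access_number & role) == role


# A user who is both a standard user and a project manager.
access_number = MyRoles.STD_USR | MyRoles.PROJECT_MGR

print(int(access_number))                            # the combined access number
print(has_role(access_number, MyRoles.PROJECT_MGR))  # project manager bit is set
print(has_role(access_number, MyRoles.ADMIN))        # administrator bit is not set
```

One detail to be aware of: a check against the zero value (ANON here) is always true with this expression, because no bits need to be set for it to match.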

Continued in part 2.

Blog Post by: Mikael Sand

Analyzing 1 TB of IIS logs with Hadoop Map/Reduce on Azure with JavaScript

 

As described in a previous post, Microsoft has ported Apache Hadoop to Windows Azure (this will also be available on Windows Server). This is available as a private community technology preview for now.
This does not use Cygwin. One of the contributions Microsoft will propose in return to the open source community is the possibility to use JavaScript.
One of the goals of Hadoop is to work on large amounts of unstructured data. In this sample, we’ll use JavaScript code to parse IIS logs and get information from authenticated sessions.

 

The Internet Information Services (IIS) logs come from a web farm. It may be a web farm on premises or a Web Role on Windows Azure. The logs are copied and consolidated to Windows Azure blob storage. We get a little more than 1 TB of those. Here is how this looks from Windows Azure Storage Explorer:
and from the interactive JavaScript console:

1191124656300 Bytes = 1,083321564 TB
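The conversion above (shown with a decimal comma) can be checked with a couple of lines of Python. The only assumption is that 1 TB is taken as 1024^4 bytes here, which is what makes the numbers agree.

```python
TOTAL_BYTES = 1191124656300
BYTES_PER_TB = 1024 ** 4  # binary terabyte: 1,099,511,627,776 bytes

size_tb = TOTAL_BYTES / BYTES_PER_TB
print(f"{TOTAL_BYTES} bytes = {size_tb:.9f} TB")
```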

Here is what the log files look like:

#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2012-01-06 09:09:05
#Fields: date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2012-01-06 09:09:05 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-francaise - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 5734 321 3343
2012-01-06 09:09:12 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-francaise/huitres - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) - http://site.supersimple.fr/cuisine-francaise site.supersimple.fr 200 0 0 4922 346 890
2012-01-06 09:09:19 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-japonaise - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/cuisine-francaise/huitres site.supersimple.fr 200 0 0 3491 544 906
2012-01-06 09:09:22 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-japonaise/assortiment-de-makis - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/cuisine-japonaise site.supersimple.fr 200 0 0 3198 557 671
2012-01-06 09:09:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /blog - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/cuisine-japonaise/assortiment-de-makis site.supersimple.fr 200 0 0 3972 544 2406
2012-01-06 09:09:30 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /blog/marmiton - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/blog site.supersimple.fr 200 0 0 5214 519 718
2012-01-06 09:09:49 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /ustensiles - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/blog/marmiton site.supersimple.fr 200 0 0 6897 525 2859
2012-01-06 09:22:13 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Users/Account/LogOn ReturnUrl=%2Fustensiles 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/ustensiles site.supersimple.fr 200 0 0 3818 555 1203
2012-01-06 09:22:26 W3SVC1273337584 RD00155D360166 10.211.146.27 POST /Users/Account/LogOn ReturnUrl=%2Fustensiles 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8= http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles site.supersimple.fr 302 0 0 729 961 703
2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles site.supersimple.fr 200 0 0 7136 849 1249
2012-01-06 09:22:30 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54 http://site.supersimple.fr/ustensiles site.supersimple.fr 200 0 0 3926 788 1031
2012-01-06 09:22:57 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-francaise - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54 http://site.supersimple.fr/ site.supersimple.fr 200 0 0 5973 795 1093
2012-01-06 09:23:00 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-francaise/gateau-au-chocolat-et-aux-framboises - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54 http://site.supersimple.fr/cuisine-francaise site.supersimple.fr 200 0 0 8869 849 749
2012-01-06 09:30:50 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - - site.supersimple.fr 200 0 0 3687 364 1281
2012-01-06 09:30:50 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Modules/Orchard.Localization/Styles/orchard-localization-base.css - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 1148 422 749
2012-01-06 09:30:50 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Themes/Classic/Styles/Site.css - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 15298 387 843
2012-01-06 09:30:51 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Themes/Classic/Styles/moduleOverrides.css - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 557 398 1468
2012-01-06 09:30:51 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Core/Shapes/scripts/html5.js - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 1804 370 1015
2012-01-06 09:30:53 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Themes/Classic/Content/current.png - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 387 376 656
2012-01-06 09:30:57 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /modules/orchard.themes/Content/orchard.ico - 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - - site.supersimple.fr 200 0 0 1399 346 468
2012-01-06 09:31:54 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /Users/Account/LogOn ReturnUrl=%2F 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 4018 435 718
2012-01-06 09:32:14 W3SVC1273337584 RD00155D360166 10.211.146.27 POST /Users/Account/LogOn ReturnUrl=%2F 80 - 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 __RequestVerificationToken_Lw__=BpgGSfFnDr9KB5oclPotYchfIFzjWXjJ5qHrtRcXoZmLRjG8pL9fw5CtMAN3Arckjm0ZfLtUsuBUGDNRztQPPWmlGLb6tfzSmELzdYbEg5RktsGNkxBr9+eyU342Lf8wSw2YFxqiUX7X8WlXwt0DQITMg2o= http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2F site.supersimple.fr 302 0 0 709 1083 812
2012-01-06 09:32:14 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 Test0001 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 __RequestVerificationToken_Lw__=BpgGSfFnDr9KB5oclPotYchfIFzjWXjJ5qHrtRcXoZmLRjG8pL9fw5CtMAN3Arckjm0ZfLtUsuBUGDNRztQPPWmlGLb6tfzSmELzdYbEg5RktsGNkxBr9+eyU342Lf8wSw2YFxqiUX7X8WlXwt0DQITMg2o=;+.ASPXAUTH=94C70A59F9DA0E7294DCAAAEF9A0C52FA585B56A7FC4E01AF24437C84327D3E862548C2C0A5B71DD073443F000CE5767AF9009FFDCDE5F3EE184C3D73CF4BA4C7B8650461A448467FBAB87E311209F4DFB83B19335C9002E5EC5423E145165F64F226AC7F47C19B6035025ABDEDB4A7CAB4FF63A8C22FEED3C6002E6A99920FA8249D3B9 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2F site.supersimple.fr 200 0 0 3926 935 906
2012-01-06 09:33:22 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 Test0001 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 __RequestVerificationToken_Lw__=BpgGSfFnDr9KB5oclPotYchfIFzjWXjJ5qHrtRcXoZmLRjG8pL9fw5CtMAN3Arckjm0ZfLtUsuBUGDNRztQPPWmlGLb6tfzSmELzdYbEg5RktsGNkxBr9+eyU342Lf8wSw2YFxqiUX7X8WlXwt0DQITMg2o=;+.ASPXAUTH=94C70A59F9DA0E7294DCAAAEF9A0C52FA585B56A7FC4E01AF24437C84327D3E862548C2C0A5B71DD073443F000CE5767AF9009FFDCDE5F3EE184C3D73CF4BA4C7B8650461A448467FBAB87E311209F4DFB83B19335C9002E5EC5423E145165F64F226AC7F47C19B6035025ABDEDB4A7CAB4FF63A8C22FEED3C6002E6A99920FA8249D3B9 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2F site.supersimple.fr 200 0 0 3926 935 1156
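To make the structure of these lines easier to see, here is a small Python sketch that splits a log line into named fields using the #Fields directive and then splits the cookie field. It is not the JavaScript Map/Reduce code of this post, just an illustration of the format. The function names are made up, and the sample line is shortened, with cookie values that are hypothetical.

```python
FIELDS_DIRECTIVE = (
    "#Fields: date time s-sitename s-computername s-ip cs-method cs-uri-stem "
    "cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Cookie) "
    "cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes "
    "cs-bytes time-taken"
)

# Shortened sample line; the cookie values are made up for the example.
SAMPLE_LINE = (
    "2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET "
    "/ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 "
    "Mozilla/5.0+(compatible;+MSIE+9.0) "
    "__RequestVerificationToken_Lw__=KLZ1dz1A=;+.ASPXAUTH=D5796612E924B604 "
    "http://site.supersimple.fr/ site.supersimple.fr 200 0 0 7136 849 1249"
)


def parse_fields_directive(line: str) -> list:
    """Return the field names listed after '#Fields:'."""
    prefix = "#Fields:"
    if not line.startswith(prefix):
        raise ValueError("not a #Fields directive")
    return line[len(prefix):].split()


def parse_log_line(line: str, fields: list) -> dict:
    """Map field names to values; values never contain raw spaces in this format."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError("field count mismatch")
    return dict(zip(fields, values))


def parse_cookies(raw: str) -> dict:
    """Split the cs(Cookie) field; cookies are separated by ';+' in the log."""
    if raw == "-":
        return {}
    cookies = {}
    for part in raw.split(";+"):
        name, _, value = part.partition("=")  # the value may itself contain '='
        cookies[name] = value
    return cookies


fields = parse_fields_directive(FIELDS_DIRECTIVE)
entry = parse_log_line(SAMPLE_LINE, fields)
cookies = parse_cookies(entry["cs(Cookie)"])
print(entry["cs-username"], entry["cs-uri-stem"], cookies.get(".ASPXAUTH"))
```

Lines starting with # are directives and have to be skipped (or used to refresh the field list) before calling parse_log_line.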

By loading this sample in Excel, one can see that a session ID can be found from the .ASPXAUTH cookie, which is one of the cookies available in the cookie field of the IIS logs.
At the end of the processing, we want to get the following result in 2 flat file structures.
Session headers give a summary of what happened in the session. Fields are a dummy row ID, sessionid, username, start date/time, end date/time, and number of visited URLs.

 


134211969	19251ab2b91cb3158e21c0c74f597a9872ed257d	test2272g5x467	2012-01-28 20:06:08	2012-01-28 20:32:33	11
134213036	19251cd8a444c6642bbedc1ba5d848f26ad3c789	test1268gAx168	2012-02-02 20:01:47	2012-02-02 20:25:22	13
134213561	19252827f25750af10aaf89a9de3fc35ad15d97e	test1987g4x214	2012-01-27 01:00:46	2012-01-27 01:06:26	5
134214566	19252bb73667cc04e5de2a6eebe5e8ba7cc77c4a	test3333g4x681	2012-01-27 20:00:03	2012-01-27 20:03:23	12
134214866	19252bf03e7d962a41fde46127810339c587b0ae	test1480hFx690	2012-01-27 18:18:51	2012-01-27 18:32:51	3
134215841	19253a4d1496dfea6e264ba7839d07ebd0a9662e	test2467g6x109	2012-01-29 18:02:19	2012-01-29 18:13:10	11
134216451	19253b3c19f8a0f46fd44e6f979f3e8bedda7881	test3119hLx29	2012-02-02 18:04:17	2012-02-02 18:21:31	7
134216974	19253ff8924893dd72f6453568084e53985a8817	test2382g9x8	2012-02-01 01:07:55	2012-02-01 01:26:17	5
134217496	1925418002459ad897ed41b156f0e3eab78caa13	test3854g4x823	2012-01-27 02:06:38	2012-01-27 02:27:54	5
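As a small illustration of this layout, the sketch below parses one header line into an object and computes the session length. The names parseSessionHeader and sessionDurationSeconds are hypothetical and not part of the scripts in this post; the sketch also assumes the fields are tab-separated and that the timestamps can be read as UTC.

```javascript
// Hypothetical helper: parse one tab-separated session header line
function parseSessionHeader(line) {
    var f = line.split("\t");
    return {
        rowId: f[0],
        sessionId: f[1],
        username: f[2],
        start: f[3],
        end: f[4],
        nbUrls: parseInt(f[5], 10)
    };
}

// Hypothetical helper: session length in seconds,
// reading "YYYY-MM-DD HH:MM:SS" as a UTC timestamp (assumption)
function sessionDurationSeconds(header) {
    var toMs = function (s) {
        return Date.parse(s.replace(" ", "T") + "Z");
    };
    return (toMs(header.end) - toMs(header.start)) / 1000;
}

// First header line of the sample above
var sample = "134211969\t19251ab2b91cb3158e21c0c74f597a9872ed257d\t" +
    "test2272g5x467\t2012-01-28 20:06:08\t2012-01-28 20:32:33\t11";
var header = parseSessionHeader(sample);
console.log(header.username + " visited " + header.nbUrls + " URLs in " +
    sessionDurationSeconds(header) + " seconds");
```

Keeping the timestamps as plain strings in the parsed object mirrors what the scripts do; they are only converted to numbers when a duration is needed.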

 

Session details give the list of URLs that were visited in a session. The fields are a dummy row ID, sessionid, hit time, and URL.

 


134216699	19253ff8924893dd72f6453568084e53985a8817	01:07:55	/Core/Shapes/scripts/html5.js
134216781	19253ff8924893dd72f6453568084e53985a8817	01:41:01	/Modules/Orchard.Localization/Styles/orchard-localization-base.css
134216900	19253ff8924893dd72f6453568084e53985a8817	01:25:02	/Users/Account/LogOff
134217072	1925418002459ad897ed41b156f0e3eab78caa13	02:08:01	/Modules/Orchard.Localization/Styles/orchard-localization-base.css
134217191	1925418002459ad897ed41b156f0e3eab78caa13	02:27:54	/Users/Account/LogOff
134217265	1925418002459ad897ed41b156f0e3eab78caa13	02:06:38	/
134217319	1925418002459ad897ed41b156f0e3eab78caa13	02:26:14	/Themes/Classic/Styles/moduleOverrides.css
134217414	1925418002459ad897ed41b156f0e3eab78caa13	02:17:08	/Core/Shapes/scripts/html5.js
134217596	1925420f22e51f948314b2a6fa0c53fe4d002455	19:11:29	/blog
134217654	1925420f22e51f948314b2a6fa0c53fe4d002455	19:00:21	/cuisine-francaise/barbecue

 

Note that the two structures could be joined on the sessionid later on, with HIVE for instance, but this is beyond the scope of this post. Also note that the sessionid is not the exact value of the .ASPXAUTH cookie but a SHA-1 hash of it, so that it is shorter, which reduces network traffic and gives a smaller result.
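To give an idea of the size reduction, here is a minimal sketch that hashes a cookie value with SHA-1. It assumes a Node.js environment and uses Node's built-in crypto module as a stand-in for the Sha1.hash function bundled in the script; shortSessionKey is a hypothetical name, and the cookie value is only a prefix of the sample cookie shown earlier.

```javascript
var crypto = require("crypto");

// Hypothetical helper: turn a long .ASPXAUTH value into a 40-character hex key
function shortSessionKey(aspxAuthValue) {
    return crypto.createHash("sha1").update(aspxAuthValue, "utf8").digest("hex");
}

// Prefix of the sample .ASPXAUTH value, used only for illustration
var cookieValue = "94C70A59F9DA0E7294DCAAAEF9A0C52FA585B56A7FC4E01AF24437C84327D3E8";
var key = shortSessionKey(cookieValue);

// A SHA-1 digest is always 160 bits, i.e. 40 hex characters,
// whatever the length of the input
console.log(cookieValue.length + " chars in, " + key.length + " chars out");
```

Since the real cookie is several hundred characters long and is written once per log line by the map step, a fixed 40-character key noticeably shrinks what gets shuffled between map and reduce.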

 

Here is the code I used to do that. I may write another blog post later on to comment on that code further.
iislogsAnalysis.js:

/*
IIS logs fields
0	date			2012-01-06
1	time 			09:09:05
2	s-sitename 		W3SVC1273337584
3	s-computername 	RD00155D360166
4	s-ip 			10.211.146.27
5	cs-method 		GET
6	cs-uri-stem 	/cuisine-francaise
7	cs-uri-query 	-
8	s-port 			80
9	cs-username 	-
10	c-ip 			94.245.127.11
11	cs-version		HTTP/1.1
12	cs(User-Agent)	Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0)
13	cs(Cookie)		- 
14	cs(Referer)		http://site.supersimple.fr/
15	cs-host			site.supersimple.fr
16	sc-status		200
17	sc-substatus	0
18	sc-win32-status	0
19	sc-bytes		5734
20	cs-bytes		321
21	time-taken		3343

sample lines
2012-01-06 09:09:05 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /cuisine-francaise - 80 - 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) - http://site.supersimple.fr/ site.supersimple.fr 200 0 0 5734 321 3343
2012-01-06 09:32:14 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 Test0001 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 __RequestVerificationToken_Lw__=BpgGSfFnDr9KB5oclPotYchfIFzjWXjJ5qHrtRcXoZmLRjG8pL9fw5CtMAN3Arckjm0ZfLtUsuBUGDNRztQPPWmlGLb6tfzSmELzdYbEg5RktsGNkxBr9+eyU342Lf8wSw2YFxqiUX7X8WlXwt0DQITMg2o=;+.ASPXAUTH=94C70A59F9DA0E7294DCAAAEF9A0C52FA585B56A7FC4E01AF24437C84327D3E862548C2C0A5B71DD073443F000CE5767AF9009FFDCDE5F3EE184C3D73CF4BA4C7B8650461A448467FBAB87E311209F4DFB83B19335C9002E5EC5423E145165F64F226AC7F47C19B6035025ABDEDB4A7CAB4FF63A8C22FEED3C6002E6A99920FA8249D3B9 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2F site.supersimple.fr 200 0 0 3926 935 906
2012-01-06 09:33:22 W3SVC1273337584 RD00155D360166 10.211.146.27 GET / - 80 Test0001 94.245.127.13 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/535.7+(KHTML,+like+Gecko)+Chrome/16.0.912.63+Safari/535.7 __RequestVerificationToken_Lw__=BpgGSfFnDr9KB5oclPotYchfIFzjWXjJ5qHrtRcXoZmLRjG8pL9fw5CtMAN3Arckjm0ZfLtUsuBUGDNRztQPPWmlGLb6tfzSmELzdYbEg5RktsGNkxBr9+eyU342Lf8wSw2YFxqiUX7X8WlXwt0DQITMg2o=;+.ASPXAUTH=94C70A59F9DA0E7294DCAAAEF9A0C52FA585B56A7FC4E01AF24437C84327D3E862548C2C0A5B71DD073443F000CE5767AF9009FFDCDE5F3EE184C3D73CF4BA4C7B8650461A448467FBAB87E311209F4DFB83B19335C9002E5EC5423E145165F64F226AC7F47C19B6035025ABDEDB4A7CAB4FF63A8C22FEED3C6002E6A99920FA8249D3B9 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2F site.supersimple.fr 200 0 0 3926 935 1156
*/

/*
A cookie with authentication looks like this
__RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54
The interesting part is 
ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54
the session ID is 
D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54

 */

 /* the goal is to have this kind of file at the end:

fffffff0a929d9fbbbbb0b4ffa744842f9188e01	D 20:07:53 /blog
fffffff0a929d9fbbbbb0b4ffa744842f9188e01	H test2573g2x403 2012-01-25 20:07:53 2012-01-25 20:33:43 7
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:09:41 /Users/Account/LogO
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:26:12 /blog/marmiton
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:16:58 /cuisine-francaise/barbecue
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:10:00 /blog/marmiton
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:11:24 /
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:27:50 /cuisine-japonaise/assortiment-de-makis
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:29:31 /cuisine-francaise/fondue-au-fromage
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:05:19 /cuisine-japonaise
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:31:32 /cuisine-francaise/dinde
fffffff7e3dbde467fb4a004c31b41e5fdb49116	D 18:04:41 /cuisine-francaise/fondue-au-fromage
fffffff7e3dbde467fb4a004c31b41e5fdb49116	H test3698g4x509 2012-01-27 18:04:41 2012-01-27 18:31:32 10

*/


var map = function (key, value, context) {
    var f; // fields
    var s, sessionID, generated;


    if (value === null || value === "") {
        return;
    }

    if (value.charAt(0) === "#") {
        return;
    }

    f = value.split(" ");
    if (f.length < 14) {
        // malformed or truncated log line: no cookie field to read
        return;
    }
    if (!f[9] || f[9] === "-") {
        //username is anonymous, skip the log line
        return;
    }

    s = extractSessionFromCookies(f[13]);
    if (!s) {
        return;
    }

    sessionID = Sha1.hash(s); // hashing gives a shorter key
    generated = "M " + f[9] + " " + f[0] + " " + f[1] + " " + f[6]; // M marks a line emitted by map
    context.write(sessionID, generated);

    function extractSessionFromCookies(cookies) {
        var i, j, sessionID;

        var cookieParts = cookies.split(";");
        for (i = 0; i < cookieParts.length; i++) {
            j = cookieParts[i].indexOf("ASPXAUTH=");
            if (j >= 0) {
                sessionID = cookieParts[i].substring(j + "ASPXAUTH=".length);
                break;
            }
        }
        return sessionID;
    }
};

var reduce = function (key, values, context) {
    var generated;
    var minDate = null;
    var maxDate = null;
    var username = null;
    var currentDate, currentMinDate, currentMaxDate;
    var nbUrls = 0;
    var f;
    var currentValue;
    var firstChar;

    while (values.hasNext()) {
        currentValue = values.next();
        firstChar = currentValue.substring(0,1);

        if (firstChar == "M") {
            f = currentValue.split(" ");

            if (username === null) {
                username = f[1];
            }

            currentDate = f[2] + " " + f[3];

            if (minDate === null) {
                minDate = currentDate;
                maxDate = currentDate;
            }
            else {
                // values arrive in no particular order, so test both bounds explicitly
                if (currentDate < minDate) {
                    minDate = currentDate;
                }
                if (currentDate > maxDate) {
                    maxDate = currentDate;
                }
            }
            context.write(key, "D " + f[3] + " " + f[4]); // D stands for details
            nbUrls++;
        }
        else if (firstChar == "H") {
            f = currentValue.split(" ");

            if (username === null) {
                username = f[1];
            }

            currentMinDate = f[2] + " " + f[3];
            currentMaxDate = f[4] + " " + f[5];

            if (minDate === null) {
                minDate = currentMinDate;
                maxDate = currentMaxDate;
            }
            else {
                if (currentMinDate < minDate) {
                    minDate = currentMinDate;
                }
                if (currentMaxDate > maxDate) {
                    maxDate = currentMaxDate;
                }
            }
            nbUrls += parseInt(f[6], 10);
        }
        else if (firstChar == "D") {
            context.write(key, currentValue);
        }
        else {
            context.write(key, "X" + firstChar + " " + currentValue);
        }
    }

    generated = "H " + username + " " + minDate + " " + maxDate + " " + nbUrls.toString(); // H stands for Header
    context.write(key, generated);
}

var main = function (factory) {
    var job = factory.createJob("iisLogAnalysis", "map", "reduce");
    job.setCombiner("reduce");
    job.setNumReduceTasks(64);
    job.waitForCompletion(true);
};

//V120120c



/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  */
/*  SHA-1 implementation in JavaScript | (c) Chris Veness 2002-2010 | www.movable-type.co.uk      */
/*   - see http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html                             */
/*         http://csrc.nist.gov/groups/ST/toolkit/examples.html                                   */
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  */

var Sha1 = {};  // Sha1 namespace

/**
* Generates SHA-1 hash of string
*
* @param {String} msg                String to be hashed
* @param {Boolean} [utf8encode=true] Encode msg as UTF-8 before generating hash
* @returns {String}                  Hash of msg as hex character string
*/
Sha1.hash = function (msg, utf8encode) {
    utf8encode = (typeof utf8encode == 'undefined') ? true : utf8encode;

    // convert string to UTF-8, as SHA only deals with byte-streams
    if (utf8encode) msg = Utf8.encode(msg);

    // constants [§4.2.1]
    var K = [0x5a827999, 0x6ed9eba1, 0x8f1bbcdc, 0xca62c1d6];

    // PREPROCESSING 

    msg += String.fromCharCode(0x80);  // add trailing '1' bit (+ 0's padding) to string [§5.1.1]

    // convert string msg into 512-bit/16-integer blocks arrays of ints [§5.2.1]
    var l = msg.length / 4 + 2;  // length (in 32-bit integers) of msg + '1' + appended length
    var N = Math.ceil(l / 16);   // number of 16-integer-blocks required to hold 'l' ints
    var M = new Array(N);

    for (var i = 0; i < N; i++) {
        M[i] = new Array(16);
        for (var j = 0; j < 16; j++) {  // encode 4 chars per integer, big-endian encoding
            M[i][j] = (msg.charCodeAt(i * 64 + j * 4) << 24) | (msg.charCodeAt(i * 64 + j * 4 + 1) << 16) |
        (msg.charCodeAt(i * 64 + j * 4 + 2) << 8) | (msg.charCodeAt(i * 64 + j * 4 + 3));
        } // note running off the end of msg is ok 'cos bitwise ops on NaN return 0
    }
    // add length (in bits) into final pair of 32-bit integers (big-endian) [§5.1.1]
    // note: most significant word would be (len-1)*8 >>> 32, but since JS converts
    // bitwise-op args to 32 bits, we need to simulate this by arithmetic operators
    M[N - 1][14] = ((msg.length - 1) * 8) / Math.pow(2, 32); M[N - 1][14] = Math.floor(M[N - 1][14])
    M[N - 1][15] = ((msg.length - 1) * 8) & 0xffffffff;

    // set initial hash value [§5.3.1]
    var H0 = 0x67452301;
    var H1 = 0xefcdab89;
    var H2 = 0x98badcfe;
    var H3 = 0x10325476;
    var H4 = 0xc3d2e1f0;

    // HASH COMPUTATION [§6.1.2]

    var W = new Array(80); var a, b, c, d, e;
    for (var i = 0; i < N; i++) {

        // 1 - prepare message schedule 'W'
        for (var t = 0; t < 16; t++) W[t] = M[i][t];
        for (var t = 16; t < 80; t++) W[t] = Sha1.ROTL(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1);

        // 2 - initialise five working variables a, b, c, d, e with previous hash value
        a = H0; b = H1; c = H2; d = H3; e = H4;

        // 3 - main loop
        for (var t = 0; t < 80; t++) {
            var s = Math.floor(t / 20); // seq for blocks of 'f' functions and 'K' constants
            var T = (Sha1.ROTL(a, 5) + Sha1.f(s, b, c, d) + e + K[s] + W[t]) & 0xffffffff;
            e = d;
            d = c;
            c = Sha1.ROTL(b, 30);
            b = a;
            a = T;
        }

        // 4 - compute the new intermediate hash value
        H0 = (H0 + a) & 0xffffffff;  // note 'addition modulo 2^32'
        H1 = (H1 + b) & 0xffffffff;
        H2 = (H2 + c) & 0xffffffff;
        H3 = (H3 + d) & 0xffffffff;
        H4 = (H4 + e) & 0xffffffff;
    }

    return Sha1.toHexStr(H0) + Sha1.toHexStr(H1) +
    Sha1.toHexStr(H2) + Sha1.toHexStr(H3) + Sha1.toHexStr(H4);
}

//
// function 'f' [§4.1.1]
//
Sha1.f = function (s, x, y, z) {
    switch (s) {
        case 0: return (x & y) ^ (~x & z);           // Ch()
        case 1: return x ^ y ^ z;                    // Parity()
        case 2: return (x & y) ^ (x & z) ^ (y & z);  // Maj()
        case 3: return x ^ y ^ z;                    // Parity()
    }
}

//
// rotate left (circular left shift) value x by n positions [§3.2.5]
//
Sha1.ROTL = function (x, n) {
    return (x << n) | (x >>> (32 - n));
}

//
// hexadecimal representation of a number 
//   (note toString(16) is implementation-dependent, and
//   in IE returns signed numbers when used on full words)
//
Sha1.toHexStr = function (n) {
    var s = "", v;
    for (var i = 7; i >= 0; i--) { v = (n >>> (i * 4)) & 0xf; s += v.toString(16); }
    return s;
}


/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  */
/*  Utf8 class: encode / decode between multi-byte Unicode characters and UTF-8 multiple          */
/*              single-byte character encoding (c) Chris Veness 2002-2010                         */
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  */

var Utf8 = {};  // Utf8 namespace

/**
* Encode multi-byte Unicode string into utf-8 multiple single-byte characters 
* (BMP / basic multilingual plane only)
*
* Chars in range U+0080 - U+07FF are encoded in 2 chars, U+0800 - U+FFFF in 3 chars
*
* @param {String} strUni Unicode string to be encoded as UTF-8
* @returns {String} encoded string
*/
Utf8.encode = function (strUni) {
    // use regular expressions & String.replace callback function for better efficiency 
    // than procedural approaches
    var strUtf = strUni.replace(
      /[\u0080-\u07ff]/g,  // U+0080 - U+07FF => 2 bytes 110yyyyy, 10zzzzzz
      function (c) {
          var cc = c.charCodeAt(0);
          return String.fromCharCode(0xc0 | cc >> 6, 0x80 | cc & 0x3f);
      }
    );
    strUtf = strUtf.replace(
      /[\u0800-\uffff]/g,  // U+0800 - U+FFFF => 3 bytes 1110xxxx, 10yyyyyy, 10zzzzzz
      function (c) {
          var cc = c.charCodeAt(0);
          return String.fromCharCode(0xe0 | cc >> 12, 0x80 | cc >> 6 & 0x3F, 0x80 | cc & 0x3f);
      }
    );
    return strUtf;
}

/**
* Decode utf-8 encoded string back into multi-byte Unicode characters
*
* @param {String} strUtf UTF-8 string to be decoded back to Unicode
* @returns {String} decoded string
*/
Utf8.decode = function (strUtf) {
    // note: decode 3-byte chars first as decoded 2-byte strings could appear to be 3-byte char!
    var strUni = strUtf.replace(
      /[\u00e0-\u00ef][\u0080-\u00bf][\u0080-\u00bf]/g,  // 3-byte chars
      function (c) {  // (note parentheses for precedence)
          var cc = ((c.charCodeAt(0) & 0x0f) << 12) | ((c.charCodeAt(1) & 0x3f) << 6) | (c.charCodeAt(2) & 0x3f);
          return String.fromCharCode(cc);
      }
    );
    strUni = strUni.replace(
      /[\u00c0-\u00df][\u0080-\u00bf]/g,                 // 2-byte chars
      function (c) {  // (note parentheses for precedence)
          var cc = (c.charCodeAt(0) & 0x1f) << 6 | c.charCodeAt(1) & 0x3f;
          return String.fromCharCode(cc);
      }
    );
    return strUni;
}

/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  */

This code produces an intermediate flat file structure that looks like this (headers come after details):

 


00000e399c3e94f8f919314762998b784d178bd4        D 02:14:32 /Core/Shapes/scripts/html5.js
00000e399c3e94f8f919314762998b784d178bd4        D 02:00:54 /Users/Account/LogOff
00000e399c3e94f8f919314762998b784d178bd4        D 02:09:39 /Modules/Orchard.Localization/Styles/orchard-localization-base.css
00000e399c3e94f8f919314762998b784d178bd4        D 02:13:24 /Themes/Classic/Styles/moduleOverrides.css
00000e399c3e94f8f919314762998b784d178bd4        D 02:12:37 /
00000e399c3e94f8f919314762998b784d178bd4        H test3059g2x50 2012-01-25 02:00:54 2012-01-25 02:12:37 5
00000e7fd498e90cf3f10b5158e1ccf6ff3b8153        D 00:26:22 /Users/Account/LogOff
00000e7fd498e90cf3f10b5158e1ccf6ff3b8153        D 00:24:12 /
00000e7fd498e90cf3f10b5158e1ccf6ff3b8153        H test0118g5x29 2012-01-28 00:24:12 2012-01-28 00:26:22 2
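The header line of a session is derived from its detail hits: earliest timestamp, latest timestamp and hit count. The sketch below shows that idea in isolation; summarizeHits is a hypothetical name, and the sketch relies on the fact that "YYYY-MM-DD HH:MM:SS" strings sort chronologically when compared as plain strings.

```javascript
// Hypothetical helper: derive start, end and hit count from a list of timestamps
function summarizeHits(timestamps) {
    var minDate = null;
    var maxDate = null;
    var i, t;

    for (i = 0; i < timestamps.length; i++) {
        t = timestamps[i];
        // string comparison is enough for this fixed-width date format
        if (minDate === null || t < minDate) {
            minDate = t;
        }
        if (maxDate === null || t > maxDate) {
            maxDate = t;
        }
    }
    return { start: minDate, end: maxDate, nbUrls: timestamps.length };
}

// Hits of the second session in the sample above, in the order they appear
var summary = summarizeHits(["2012-01-28 00:26:22", "2012-01-28 00:24:12"]);
console.log("H " + summary.start + " " + summary.end + " " + summary.nbUrls);
```

Testing both bounds on every value matters because Hadoop gives no guarantee about the order in which the values of a key reach the reducer.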

 

Two further jobs then extract only the headers and only the details. Here they are.
iisLogsAnalysisToH.js:

 


// V120120a

var map = function (key, value, context) {
    var generated;
    var minDate;
    var maxDate;
    var username;
    var nbUrls;
    var l, f;
    var firstChar;
    var sessionID;

    if (!value) {
        return;
    }

    l = value.split("\t");
    if (l.length < 2) {
        return;
    }

    sessionID = l[0];

    firstChar = l[1].charAt(0);
    if (firstChar != "H") {
        return;
    }

    f = l[1].split(" ");

    username = f[1];

    minDate = f[2] + " " + f[3];
    maxDate = f[4] + " " + f[5];

    nbUrls = f[6];

    generated = sessionID + "\t" + username + "\t" + minDate + "\t" + maxDate + "\t" + nbUrls;
    context.write(key, generated);
};

var main = function (factory) {
    var job = factory.createJob("iisLogAnalysisToH", "map", "");
    job.setNumReduceTasks(0);
    job.waitForCompletion(true);
};

 

and iisLogsAnalysisToD.js:

 


// V120120a

var map = function (key, value, context) {
    var generated;
    var hitTime;
    var Url;
    var l, f;
    var firstChar;
    var sessionID;

    if (!value) {
        return;
    }

    l = value.split("\t");
    if (l.length < 2) {
        return;
    }

    sessionID = l[0];

    firstChar = l[1].charAt(0);
    if (firstChar != "D") {
        return;
    }

    f = l[1].split(" ");

    hitTime = f[1];

    Url = f[2];

    generated = sessionID + "\t" + hitTime + "\t" + Url;
    context.write(key, generated);
};

var main = function (factory) {
    var job = factory.createJob("iisLogAnalysisToD", "map", "");
    job.setNumReduceTasks(0);
    job.waitForCompletion(true);
};
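Since these map functions only depend on a context object with a write method, they can be tried on a few lines before submitting a job. Below is a rough sketch of such a local check; createMockContext and mapDetails are hypothetical names, the mock is not the real Hadoop context, and the map body is a condensed copy of the detail-extraction logic above.

```javascript
// Hypothetical stand-in for the context object passed in by the job runner
function createMockContext() {
    var written = [];
    return {
        written: written,
        write: function (key, value) {
            written.push(value);
        }
    };
}

// Condensed version of the detail-extraction map shown above
var mapDetails = function (key, value, context) {
    var l, f;

    if (!value) {
        return;
    }
    l = value.split("\t");
    if (l.length < 2 || l[1].charAt(0) !== "D") {
        return; // keep detail lines only
    }
    f = l[1].split(" ");
    context.write(key, l[0] + "\t" + f[1] + "\t" + f[2]);
};

// Three intermediate lines taken from the sample output: two details, one header
var lines = [
    "00000e7fd498e90cf3f10b5158e1ccf6ff3b8153\tD 00:26:22 /Users/Account/LogOff",
    "00000e7fd498e90cf3f10b5158e1ccf6ff3b8153\tD 00:24:12 /",
    "00000e7fd498e90cf3f10b5158e1ccf6ff3b8153\tH test0118g5x29 2012-01-28 00:24:12 2012-01-28 00:26:22 2"
];
var context = createMockContext();
lines.forEach(function (line, index) {
    mapDetails(index, line, context);
});
console.log(context.written.join("\n"));
```

A check like this only validates the parsing logic; it says nothing about how the job behaves on the cluster.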

 

Before executing the code, one needs to provision a cluster on Windows Azure in order to have processing power.

To copy the data from blob storage to the Hadoop distributed file system (HDFS), one way is to connect through Remote Desktop to the head node and issue a distcp command. Before that, one needs to configure Windows Azure Storage (ASV) in the console.

 

 

 

distcp automatically generates a map-only job that copies data from one location to another in a distributed way. This job can be tracked from the standard Hadoop console.
The JavaScript code must be uploaded to HDFS before being executed.

Then the JavaScript code can be executed.
This code runs within a few hours on a 1x8CPU+32x2CPU cluster.
Once it has finished, the two remaining scripts can be run, in parallel or not.

The result then sits in HDFS folders that can be copied back to Windows Azure blobs through distcp, or exposed as HIVE tables and retrieved through SSIS into SQL Server or SQL Azure thanks to the ODBC driver for HIVE. This may be explained in a future blog post.
Here are just the HIVE commands to view the files as tables:

 


CREATE EXTERNAL TABLE iisLogsHeader (rowID STRING, sessionID STRING, username STRING, startDateTime STRING, endDateTime STRING, nbUrls INT)
ROW FORMAT DELIMITED
	FIELDS TERMINATED BY '\t'
	LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/cornac/iislogsH';

 



CREATE EXTERNAL TABLE iisLogsDetail (rowID STRING, sessionID STRING, HitTime STRING, Url STRING)
ROW FORMAT DELIMITED
	FIELDS TERMINATED BY '\t'
	LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/cornac/iislogsD';

 

Benjamin

Blog Post by: Benjamin GUINEBERTIERE

Azure Service Bus – Two Day Course in Stockholm

As I was working on “The Developers Guide to AppFabric” e-book I also started developing a training course on the Azure Service Bus. The first delivery of this course is scheduled for May 3-4 in Stockholm. I’m really looking forward to delivering this; it will be exciting to work with emerging technologies. The course will be updated on a regular basis to cover the new features as they pass through CTP and into release.

The course details are here.

If you have any questions about the course, or are interested in an on-site delivery, feel free to contact me using the form on my blog.

As for the e-book, I’m working on the February version now and will have it out in a couple of weeks. I’ll have to change the title to reflect the changes in branding.

Biztalk Server 2009 EDI parties, batches and deployment


If you are using batches for EDI in BizTalk Server 2009 or earlier, you might experience problems with deployment if you include your parties in your binding files or have a role link that has enlisted the party. In the latter case you will always run into problems, as your binding always includes parties even if you don’t select them in your deployment.

The normal way to deploy is to go into each party and stop the batch, as this stops the batching orchestration that is otherwise always running. You can see your running orchestrations in the BizTalk hub, which shows which parties have a batch running.

Well, I ran into a case where it wasn’t possible to see the batches in the BizTalk hub. Maybe someone had terminated the orchestrations. I had about 20 parties, and with the slow administration tool in BizTalk Server 2009 I wanted another way to find out which parties were blocking my deployment.

As always there is the SQL way to find the information. What you want to look at is the output from this SQL:

SELECT TOP 1000 [PartyId]
      ,[BatchOrchestrationId]
      ,[NumOccurences]
      ,[BatchId]
  FROM
[BizTalkMgmtDb].[dbo].[PAM_Batching_Log]

This gave me the following output:

And with the following SQL I could find the party that was blocking my deployment:
SELECT TOP 1000 [nID]
      ,[nvcName]
      ,[nvcSignatureCert]
      ,[nvcSignatureCertHash]
      ,[nvcSID]
      ,[nvcCustomData]
      ,[DateModified]
  FROM
[BizTalkMgmtDb].[dbo].[bts_party]

With that information I could find the party in BizTalk and stop the batch. Some day I am going to find a way to start and stop EDI batches automatically during deployment…



Breeze: 2012 ’the year of the BizTalk Developer’

Hi folks, I hope 2012 has been a great start for you as well.

Currently at Breeze I’m after 2 more BizTalk/SharePoint/.NET junior Developers to
join a great team.

If you love technology and want to get your hands dirty then we should chat – ideally
you’ve got sound .NET development experience and exposure to SharePoint and BizTalk.

We’re also a training company, so we will skill you up in required areas – the thing
I’m looking for is a great attitude. The rest can be learnt.

If you want to get into the Software/Systems integrations space and start solving
some great puzzles then let’s hear from you.

Here’s the Job Details.

Blog Post by: Mick Badran

My next ten TechNet Wiki Articles on BizTalk Continued.

Over the last couple of weeks I have written a number of wiki articles for the TechNet Wiki. The TechNet Wiki is a place where content about Microsoft technologies and products is generated by the community and Microsoft employees, for the community. I very much like the concept, and in November 2011 I completed my first ten articles. The next ten were finished by the 15th of January. Now, a few weeks later, I have finished the next ten.

Again, I hope you will find these articles useful. Feel free to edit any of them if you feel something needs to be corrected or added. The complete list of BizTalk wiki articles can be found here. I will be working on the next couple of articles in the near future, until I run out of inspiration.
Cheers!