Event logs record important events in running applications and subsystems, and they play a vital role in troubleshooting problems.
While monitoring multi-server environments, how many times in a day does your administration team log in to multiple servers to check for the root cause of a problem? Have you ever thought of a tool that could help you avoid this time-consuming process? Yes, BizTalk360’s in-built Advanced Event Viewer (AEV) helps you solve this business problem.
Set up AEV to retrieve the event data you want from the BizTalk and SQL servers in your environment and display it all on a single screen, where you can use the rich query capabilities to search and analyze the data.
How to Set Up AEV in BizTalk360
As a first step, in the BizTalk360 settings, configure the event logs and event sources that you want to monitor, and then enable AEV for the environment. The BizTalk360 Monitoring service then collects event log data for all the configured servers in that environment and stores it in the BizTalk360 database.
What's new in v8.6?
BizTalk360 has supported AEV in the operations and monitoring sections for a long time. While demonstrating BizTalk360 to customers, we were asked how to monitor a specific event occurring in BizTalk environments at a specific frequency and get an alert based on threshold conditions. With that in mind, we implemented Event Log Data Monitoring in version 8.6.
Let us take this complex scenario to understand more about Event Log Data Monitoring.
Scenario 1: A user wants to monitor different event logs on multiple servers. For example, an administrator wants to monitor ESB events from the BizTalk server, ensure there is no problem on the SQL servers, and also monitor ENTSSO events from the SSO server.
Start Monitoring Event log Data in 3 Steps:
- Enable AEV for an environment
- Create a Data Monitoring Alarm
- Create a schedule under event log and configure the rich filtering conditions based on your business needs as below.
Server Type: BizTalk, SQL
Server Names: BizTalk Server, SQL Server, SSO Server
Event Type: Error
Event Sources: ESB Itinerary Selector, ENTSSO, MSSQLSERVER
And group (all of the conditions below are true)
Event ID Greater than or equal to 3010
Event ID Less than or equal to 3034
Message Contains 'ESB.ItineraryServices.Generic.WCF/ProcessItinerary.svc'
Event ID Is Between 10500-10550
Message Contains 'SSO Database'
Looked at in more detail, this amounts to running a filtering query against the configured event sources on the servers and alerting the administrators when certain conditions are met.
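To make the filtering idea concrete, here is a minimal Python sketch of how the Scenario 1 conditions could be evaluated against collected event log entries. It is not BizTalk360's implementation: the `EventLogEntry` type and the function names are hypothetical, and it assumes the ESB conditions and the SSO conditions form two separate groups where matching either group is enough.

```python
from dataclasses import dataclass


@dataclass
class EventLogEntry:
    """Hypothetical shape of one collected event log record."""
    server: str
    event_type: str
    source: str
    event_id: int
    message: str


# Event sources taken from the Scenario 1 schedule.
EVENT_SOURCES = {"ESB Itinerary Selector", "ENTSSO", "MSSQLSERVER"}
ESB_SERVICE = "ESB.ItineraryServices.Generic.WCF/ProcessItinerary.svc"


def in_esb_group(entry: EventLogEntry) -> bool:
    # And group: every condition must hold.
    return 3010 <= entry.event_id <= 3034 and ESB_SERVICE in entry.message


def in_sso_group(entry: EventLogEntry) -> bool:
    # Assumed second And group for the ENTSSO conditions.
    return 10500 <= entry.event_id <= 10550 and "SSO Database" in entry.message


def matches(entry: EventLogEntry) -> bool:
    """Apply the event type and source filters, then either condition group."""
    return (
        entry.event_type == "Error"
        and entry.source in EVENT_SOURCES
        and (in_esb_group(entry) or in_sso_group(entry))
    )
```

The number of entries for which `matches` returns true is what a schedule would then compare against its threshold.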
Scenario 2: Detecting the same event occurring on different servers. For example, suppose one instance of an orchestration is executed on server 1 and throws a certain error, and then another instance of the same orchestration is executed on server 2 and throws the same error. This can now be detected easily with event log data monitoring.
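As a rough illustration of Scenario 2, the sketch below groups collected events by source and event ID and reports those seen on more than one server. The tuple layout, the function name, and the sample source names are assumptions made for this example, not BizTalk360's data model.

```python
from collections import defaultdict


def events_on_multiple_servers(entries):
    """Return {(source, event_id): [servers]} for events seen on 2+ servers.

    `entries` is an iterable of (server, source, event_id) tuples.
    """
    servers_by_event = defaultdict(set)
    for server, source, event_id in entries:
        servers_by_event[(source, event_id)].add(server)
    return {
        event: sorted(servers)
        for event, servers in servers_by_event.items()
        if len(servers) > 1
    }
```

Because all event data lands in one store, this kind of cross-server grouping needs no per-server login.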
BizTalk360 brings all this data into a single console and, on top of that, provides a powerful capability to set alerts based on various thresholds.
You can also set how frequently the queries run based on your business requirements, such as frequent daily validations (for example, every 15 minutes or every hour), end of business day, or even monthly events such as month-end processing. The query result is evaluated against these thresholds and, in case of any threshold violation, you are notified via the notification channels or email.
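The threshold evaluation can be pictured with a small Python sketch. The two-level warning/error scheme and the "greater than or equal" comparison are assumptions chosen for illustration; the actual threshold operators offered by BizTalk360 are not described here.

```python
def evaluate_threshold(match_count: int, warning_at: int, error_at: int) -> str:
    """Classify one query run by the number of matching events.

    Assumes error_at >= warning_at and that a count at or above a level
    violates it.
    """
    if match_count >= error_at:
        return "error"
    if match_count >= warning_at:
        return "warning"
    return "healthy"
```

A scheduler would call this once per run (every 15 minutes, at end of business day, and so on) and send a notification for any result other than `"healthy"`.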
Event Log Details in Alerts
Event log details are listed in alerts when you enable the option 'Send Event Log details in Mail' while creating the schedule.
Event Log data in the Data Monitoring Dashboard
The information is also visible on the Data Monitoring dashboard, where you can see it in the day calendar view. If you need to understand what happened during an execution, you can click on one of the entries in the day view of the dashboard and view the details as shown below.
- Maintenance is very simple: once event log data monitoring is scheduled, disabling AEV for the environment stops the collection of event log data.
- You don't need to worry about data growth; the BizTalk360 purge policy takes care of it.
- Apart from monitoring the BizTalk-specific SQL servers, you can also monitor other SQL servers simply by adding their names for monitoring in the settings section.
As the quote below says,
Quality in a service or product is not what you put into it. It is what the client or customer gets out of it – Peter Drucker
We, the product support team at BizTalk360, always do our best to solve customer problems and keep customers satisfied. Our team gets many kinds of problems, some related to functionality, some to performance, and some to data, and we make sure each issue is resolved within proper timelines. Recently, a customer was facing an exception in Data Monitoring. In this blog, I am going to share my experience of how we resolved this problem.
MessageBox Data Monitoring:
How many times in a day does the support person have to watch for suspended instances in a particular application and take appropriate action, or look out for ESB exceptions with a particular fault code? Wouldn't it be nice if there was a way to set up monitoring on a particular data filter, get notified when there is a violation, and take actions automatically depending on the actual situation? Yes, that's exactly what BizTalk360 achieves through the concept of data monitoring, which was the result of customer feedback.
In MessageBox Data Monitoring, we can identify the number of suspended instances and the messages flowing, and take appropriate action to suspend, resume, or terminate the instances. This can be done from BizTalk360 itself, provided the user has the appropriate permissions. In my previous blog, I explained the permissions a user needs to resume, suspend, or terminate service instances. Let's look into the customer's case.
The support ticket was raised because the customer had configured automatic termination for suspended instances, but the messages were neither archived nor terminated, even though the Data Monitoring dashboard showed the details of a successful run.
Our investigation starts:
For a support ticket about the Data Monitoring section, the first things we check are the alarm configuration details and then the logs for an exception. So, we started the investigation by checking the alarm details, and they seemed to be fine. However, when we checked the details of the task action in the Data Monitoring dashboard, we found that an exception had been captured:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
The real cause for the Timeout exception:
The customer also told us that they were not able to delete the existing MessageBox data alert from BizTalk360, even though the BT360 service account was a superuser. The same timeout exception was displayed when deleting the alarm. The suspended instances could be terminated from the BizTalk admin console without any issues. Since this was an issue in the production environment, we immediately went on a call with the customer to probe further.
There were four BizTalk360 servers configured in High Availability mode in the production environment, of which three were passive and one was active. We checked the status of the monitoring sub-services and found that Purging was not getting updated properly.
A timeout exception usually happens when there is a large volume of data. We checked all the permissions and configurations, and everything was fine. The next step was to check the data in the BizTalk360 database. But where does the large volume of data come from?
BizTalk360 communication with other BizTalk databases:
BizTalk360 is a one-stop monitoring solution for BizTalk Server. To monitor the BizTalk artefacts and the messages flowing through the receive and send ports, BizTalk360 polls the BizTalk databases, namely BizTalkDTADb and BizTalkMsgBoxDb, and inserts the required data into the BizTalk360 database according to the alarm configurations. For MessageBox Data Monitoring, the data is fetched from BizTalkMsgBoxDb. If an action (resume/terminate) is configured for the suspended instances in Data Monitoring, the data is inserted into the following tables in the BizTalk360 database.
When we checked the number of records in the b360_st_DataMonitorTaskActionResults table, the select query kept spinning and took a long time to load the results, because there were 8 million records in that table. This was the cause of the timeout exception in BizTalk360.
Purging in BizTalk360:
BizTalk360 comes out of the box with the ability to set purging duration and the background monitoring service has the capability to purge older data automatically after the specified period. The Administrators/Superusers can set up the “Purge duration” under “Settings”. This will control the database growth and hence the performance of BizTalk360 will not get affected. The default purging settings in BizTalk360 can be seen in the below screenshot.
We can see that the purging duration for Data Monitoring is 2 Months. Hence the historical data for 2 months will be present in BizTalk360 database.
Purging removes the historical data and thereby keeps the database healthy. BizTalk360 purges the data by running a stored procedure according to the duration specified in the settings. Customers can alter the purging settings according to their business needs and data flow. If a large volume of data flows through the ports, they can set the purge duration to a minimum value so that data growth is controlled.
We recommended that the customer decrease the purge duration to 1 month and then observe the Data Monitoring. After the purge duration was modified, the MessageBox Data Monitoring started working as expected. The suspended instances were terminated as per the alarm configuration, and no more timeouts occurred.
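The effect of shortening the purge duration can be sketched with a small, self-contained Python example. It uses an in-memory SQLite database rather than SQL Server, and the table name `task_action_results`, the `created_at` column, and both functions are hypothetical stand-ins; BizTalk360 performs its purging through its own stored procedure, which is not reproduced here.

```python
import sqlite3
from datetime import datetime, timedelta


def build_demo_db(now: datetime) -> sqlite3.Connection:
    """Create a throwaway table with rows aged 1, 20, 45 and 70 days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_action_results ("
        "id INTEGER PRIMARY KEY, created_at TEXT)"
    )
    rows = [((now - timedelta(days=age)).isoformat(),) for age in (1, 20, 45, 70)]
    conn.executemany(
        "INSERT INTO task_action_results (created_at) VALUES (?)", rows
    )
    conn.commit()
    return conn


def purge_older_than(conn: sqlite3.Connection, now: datetime, retention_days: int) -> int:
    """Delete rows older than the retention window; return how many were removed."""
    # ISO-8601 timestamps in the same format sort correctly as plain text.
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cursor = conn.execute(
        "DELETE FROM task_action_results WHERE created_at < ?", (cutoff,)
    )
    conn.commit()
    return cursor.rowcount
```

With a 60-day window only the 70-day-old row is removed; moving to a 30-day window also removes the 45-day-old row, which is the same lever the customer used to shrink the oversized table.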
The best practice to follow in case of timeout exception:
Whenever a timeout exception occurs, the first thing to check is the standard database reports for the relevant databases. These show which table occupies the most space. Then we can act on the purging policy and change it according to the business needs and data flow.
If you have any questions, contact us at email@example.com. Also, feel free to leave your feedback in our forum.