Natural Language Message Validation with Logic Apps and ChatGPT

In one of my previous posts, I explained how to configure ChatGPT for use with Azure Logic Apps. At that time, my team and I wrote that blog post just for fun; we didn’t yet have a good idea of how to use it in a real integration scenario or to help build integration projects.

During a break at the INTEGRATE 2023 conference, Mike Stephenson shared with me an idea he had while listening to one of the talks: getting ChatGPT to do transformations using natural language, thereby replacing the maps. Needless to say, it was a fascinating and fun conversation. You can see his blog post here:

This got me thinking about where else we could apply ChatGPT to help develop, implement, or process our integration needs. Despite having a few ideas, one quickly stood out, and it somewhat follows Mike’s idea: message validation! In other words, we have a Logic App that is processing some data, and we use ChatGPT to validate the data for us by describing the schema validation in natural-language text.

So I ended up creating this small sample scenario where we have the following JSON Message:

{ 
    "Company": "SP", 
    "Operation": "Insert", 
    "Request": { 
        "OrderID": 2, 
        "ClientName": "Sandro Pereira", 
        "ShippingAddress": "Pedroso, Portugal", 
        "TaxId": 111222111, 
        "ProductName": "Chocollate Bar", 
        "Quantity": 1 
    } 
} 

We want to check whether the incoming message is valid based on the following rules:

  • The OrderID must exist and be an Integer. 
  • The ClientName cannot be empty or null.
  • The Quantity must exist and be an Integer.
  • The value of the Company field can only be “SP” or “LZ”.
  • The Operation field must exist. 

Of course, this task can be done by creating a JSON schema with all the rules and then validating the message against that schema. However, some kinds of rules, like those that use regular expressions or patterns, are not natively supported by Logic Apps.

But the main idea of this post is to provide a simpler way to do this task, using natural language in place of the schemas.
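For comparison, the same five rules can also be written as ordinary code, which gives a deterministic baseline to check ChatGPT's verdicts against. The following is a minimal Python sketch and not part of the Logic App; the function and constant names are illustrative assumptions, and "empty" is interpreted here as an empty string.

```python
from typing import Any, Dict, List

# Allowed values for the Company field, taken from the rules above.
ALLOWED_COMPANIES = ("SP", "LZ")


def is_integer(value: Any) -> bool:
    """True for real integers; bool is excluded because it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_message(message: Dict[str, Any]) -> List[str]:
    """Return the list of broken rules; an empty list means the body is valid."""
    errors: List[str] = []
    request = message.get("Request")
    if not isinstance(request, dict):
        request = {}  # a missing Request makes every Request rule fail below

    if not is_integer(request.get("OrderID")):
        errors.append("OrderID must exist and be an Integer")
    if request.get("ClientName") in (None, ""):
        errors.append("ClientName cannot be empty or null")
    if not is_integer(request.get("Quantity")):
        errors.append("Quantity must exist and be an Integer")
    if message.get("Company") not in ALLOWED_COMPANIES:
        errors.append('Company can only be "SP" or "LZ"')
    if "Operation" not in message:
        errors.append("Operation must exist")
    return errors


# A shortened version of the sample message from this post.
sample = {
    "Company": "SP",
    "Operation": "Insert",
    "Request": {"OrderID": 2, "ClientName": "Sandro Pereira", "Quantity": 1},
}
print(validate_message(sample) or "body is valid")
```

Running both this check and the ChatGPT call on the same messages is one way to see how often the natural-language validation agrees with the explicit rules.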

You can watch my coworker Luís Rigueira describing the whole process in this video:

So how do we achieve this? 

First, as you already know, you should create a Logic App. It can be Consumption or Standard; we will be using Consumption. Give it a proper name, because the rule of using proper names from day one never gets old!

Then build the Logic App with the following structure:

  • When a HTTP request is received trigger.
  • A Compose action.
  • An HTTP action.
  • And finally, a Response action.

Leave the When a HTTP request is received trigger as is. On the Compose action, add the following instructions, which describe all the validation rules to apply to the message:

Check if the Json Input is valid or invalid by these rules:

The OrderID must exist and be an Integer
The client name can not be empty or null
The Quantity must exist and it is an Integer
Company can only be "SP" or "LZ"
Operation must exist

If it is valid return a message saying "body is valid"
If it is invalid return a message saying the "body is invalid" and explain why.

Json input: @{triggerBody()}

As the JSON input, we dynamically select the Body property from the Trigger, which contains the JSON we send via Postman.

Next, on the HTTP action, we call the ChatGPT API to perform (or try to perform) the JSON message validation. To do that, we need to specify the following body:

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "@{outputs('Compose_-_Instructions_to_be_followed')}"
    }
  ]
}
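If you want to reproduce the same request body outside Logic Apps, for example to experiment with the prompt, it helps to build it with a JSON serializer rather than by string concatenation, because the instructions contain quotes and line breaks. The following is a minimal Python sketch; the function name and the shortened instructions are illustrative assumptions, and no call to the API is made here.

```python
import json


def build_chat_request(instructions: str, model: str = "gpt-3.5-turbo") -> str:
    """Serialize the request body; json.dumps escapes quotes and newlines."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": instructions}],
    }
    return json.dumps(payload)


# Hypothetical, shortened instructions that embed a JSON message.
instructions = 'Operation must exist\n\nJson input: {"Company": "SP"}'
body = build_chat_request(instructions)
print(body)
```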

If you want to know more about how to call ChatGPT from a Logic App, see my previous blog post: Using Logic Apps to interact with ChatGPT.

Finally, we need to configure the Response action to use the following expression in the response:

trim(outputs('HTTP_-_Call_Chat_Gpt_API')?['body']?['choices']?[0]?['message']?['content'])

This way, we only send the message content as the response. Now, if you test your Logic App with Postman, you should get one of two results:

  • If the rules we added to the Compose action are met, the response says that the body is valid.
  • Otherwise, the response says that the body is invalid, and ChatGPT gives the reason why.
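The Response expression relies on the ?[...] safe navigation operator to reach choices[0].message.content without failing when a level is missing. A caller that needs a yes/no answer rather than free text could handle the reply along the following lines. This is a minimal Python sketch; the function names and the sample response are illustrative assumptions, and the verdict depends on ChatGPT actually using the phrases requested in the instructions.

```python
from typing import Any, Dict, Optional


def extract_reply(response_body: Dict[str, Any]) -> Optional[str]:
    """Mirror the ?[...] safe navigation: return None if any level is missing."""
    choices = response_body.get("choices")
    if not isinstance(choices, list) or not choices:
        return None
    first = choices[0]
    message = first.get("message") if isinstance(first, dict) else None
    content = message.get("content") if isinstance(message, dict) else None
    return content.strip() if isinstance(content, str) else None


def verdict(reply: Optional[str]) -> Optional[bool]:
    """True or False when the reply contains the expected phrase, else None."""
    if reply is None:
        return None
    text = reply.lower()
    if "body is invalid" in text:
        return False
    if "body is valid" in text:
        return True
    return None  # the model did not follow the requested wording


# Hypothetical response body in the shape the expression above navigates.
sample_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "\n\nbody is invalid: Company must be SP or LZ."}}
    ]
}
reply = extract_reply(sample_response)
print(reply, verdict(reply))
```

Returning None for an unexpected reply keeps "the model said invalid" separate from "the model said something else", which matters because the wording of the answer is not guaranteed.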

Of course, at this stage I don’t think AI is yet a reliable option for data transformation or data validation, but it shows potential.

Once again, thanks to my team member Luís Rigueira for helping me with these always crazy scenarios.

Hope you find this helpful! So, if you liked the content or found it helpful and want to help me write more content, you can buy (or help buy) my son a Star Wars Lego! 

Author: Sandro Pereira

Sandro Pereira lives in Portugal and works as a consultant at DevScope. In the past years, he has been working on implementing Integration scenarios both on-premises and cloud for various clients, each with different scenarios from a technical point of view, size, and criticality, using Microsoft Azure, Microsoft BizTalk Server and different technologies like AS2, EDI, RosettaNet, SAP, TIBCO etc.

He is a regular blogger, international speaker, and technical reviewer of several BizTalk books all focused on Integration. He is also the author of the book “BizTalk Mapping Patterns & Best Practices”. He has been awarded MVP since 2011 for his contributions to the integration community.

BizTalk Schema Validation: DateTime Restrictions

Today I was involved in importing a BizTalk Schema that includes rarely used restrictions on the formats of Date and Time elements. That gave me the idea and inspiration for this blog post.

When we work with DateTime values in schemas, by default we can choose from the following data types:

  • xs:dateTime: The dateTime data type is used to specify a date and a time in the following form “YYYY-MM-DDThh:mm:ss.fffK” where:
    • YYYY indicates the year
    • MM indicates the month
    • DD indicates the day
    • T indicates the start of the required time section
    • hh indicates the hour
    • mm indicates the minute
    • ss indicates the second
    • fff indicates the milliseconds
    • K represents the time zone information of a date and time value (e.g. +05:00)
  • xs:date: The date data type is used to specify a date in the following form “YYYY-MM-DD” where:
    • YYYY indicates the year
    • MM indicates the month
    • DD indicates the day
  • xs:time: The time data type is used to specify a time in the following form “hh:mm:ss.fffK” where:
    • hh indicates the hour
    • mm indicates the minute
    • ss indicates the second
    • fff indicates the milliseconds
    • K represents the time zone information of a date and time value (e.g. +05:00)

Alternatively, you could use an xs:string, which basically accepts everything. The only problem here is that, by default, we can’t use schema validation to check whether the value is in a valid DateTime format.
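To get a feel for the three default lexical forms, the values can be tried against Python's ISO 8601 parsers. This is only an approximation: Python's fromisoformat methods are not an XSD validator and differ from the XSD rules in some details, so treat it as a quick sketch. The helper name is an illustrative assumption.

```python
from datetime import date, datetime, time


def parses_as(parser, value: str) -> bool:
    """Return True if the given fromisoformat parser accepts the value."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


# Values in the default xs:dateTime, xs:date and xs:time forms.
print(parses_as(datetime.fromisoformat, "2022-12-11T23:59:59.123+05:00"))
print(parses_as(date.fromisoformat, "2022-12-11"))
print(parses_as(time.fromisoformat, "23:59:59.123+05:00"))

# A format that another system might send instead.
print(parses_as(date.fromisoformat, "12/11/2022"))
```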

But not all systems respect the DateTime formats expected by the XSD default types. So, what are my options if a system expects other DateTime, Date, or Time formats, like:

  • MM/DD/YYYY
  • YYYY-DD-MM
  • YYYY-MM-DD HH:mm:ss
  • YYYYMMDD
  • HHmmss
  • HH:mm:ss
  • and so on.

Simple Type Derivation Using the Restriction Mechanism

Luckily for us, the BizTalk Schema Editor, and schemas in general, allow us to derive a simple type, for example from xs:string, by using the restriction mechanism. That is, we restrict the values allowed in a message for that attribute or element to a subset of the values allowed by the base simple type. A good and common example of this kind of restriction is limiting a string type to one of several enumerated strings.

Luckily for us, again, we can also apply a pattern (a regular expression) to validate the element or attribute value.

To derive a simple type by using restriction:

  • Select the relevant Field Element node or Field Attribute node in the schema tree.
  • Then, in the Properties window, set the Derived By property to Restriction.
    • This will add the Restriction properties to the Properties window.
  • In the Restriction properties, click the ellipsis (three dots) button on the Pattern property to define the RegEx.
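In the schema file, a restriction with a pattern ends up as an xs:restriction element containing an xs:pattern facet. The following Python sketch embeds a small hand-written schema fragment, reads the pattern back, and applies it to sample values. The element name OrderDate, the helper names, and the fragment itself are illustrative assumptions, not output copied from the BizTalk Schema Editor, and Python's re module is only a stand-in for the schema validator. The sketch uses re.fullmatch because an XSD pattern has to match the whole value.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: a string element restricted by a pattern.
SCHEMA = r"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="OrderDate">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="\d{4}\d{2}\d{2}" />
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""


def patterns_for(schema_xml: str, element_name: str) -> List[str]:
    """Collect the pattern facets declared under the named element."""
    root = ET.fromstring(schema_xml)
    for element in root.iter(XS + "element"):
        if element.get("name") == element_name:
            return [p.get("value") for p in element.iter(XS + "pattern")]
    return []


def is_allowed(value: str, patterns: List[str]) -> bool:
    """A value is allowed when it fully matches at least one pattern."""
    return any(re.fullmatch(p, value) for p in patterns)


patterns = patterns_for(SCHEMA, "OrderDate")
print(patterns, is_allowed("20221211", patterns), is_allowed("2022-12-11", patterns))
```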

Regular expression samples to validate date formats

Here is where the fun starts. There are many ways to achieve this goal:

  • Some are simpler but probably less reliable, since they may not validate all cases (leap years, and so on).
  • Others are more complex and require more knowledge, but are more accurate.

As a general overview, a regex that validates a date format can be built from rules such as these:

  • Rule to validate the year:
    • \d{4} – accepts any 4 digits, like 2022.
    • (19|20)[0-9][0-9] – accepts years starting with 19 or 20, i.e., from 1900 to 2099.
  • Rule to validate the month:
    • \d{2} – accepts any 2 digits, like 12, but the problem is that it also accepts invalid months like 24 or 99.
    • 0?[1-9]|1[012] – accepts 01-09 (leading zero), 1-9 (single digit), and 10, 11, 12.
  • Rule to validate the day:
    • \d{2} – accepts any 2 digits, like 12, but the problem is that it also accepts invalid days like 32 or 99. It also doesn’t take the month into account to decide whether 28, 29, 30, or 31 is acceptable.
    • 0?[1-9]|[12][0-9]|3[01] – accepts 01-09 (leading zero), 1-9 (single digit), 10-19, 20-29, and 30-31. It doesn’t check whether it is a leap year or not.
  • To handle leap years, several rules need to be concatenated, as in this sample for DD/MM/YYYY dates:
    • ^(?:(?:31(/)(?:0[13578]|1[02]))\1|(?:(?:29|30)(/)(?:0[13-9]|1[0-2])\2))(?:(?:18|19|20)\d{2})$|^(?:29(/)02\3(?:(?:(?:(?:18|19|20))(?:0[48]|[2468][048]|[13579][26]))))$|^(?:0?[1-9]|1\d|2[0-8])(/)(?:(?:0[1-9])|(?:1[0-2]))\4(?:(?:18|19|20)\d{2})$

Here is a different approach that also validates leap years, this time for the YYYY-MM-DD format:

  • (19|20)((([02468][048]|[13579][26])-0?2-29)|\d\d-((0?[469]|11)-([012]?\d|30)|(0?[13578]|1[02])-([012]?\d|3[01])|(0?2-([01]?\d|2[0-8]))))

But we can go further and accept different separators (/, - or .) and month names, like:

  • ^(?:(?:31(/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$

Your imagination and skills are the limit.

Note: the RegEx diagrams and debugging were done with https://www.debuggex.com/.
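The building-block rules for year, month, and day from this section are easy to try out before putting them into a schema. The following is a minimal Python sketch, written with the usual \d escape for a digit; the constant and function names are illustrative assumptions, and Python's re module stands in for the schema validator. It uses re.fullmatch so that the whole value has to match.

```python
import re

YEAR = r"(19|20)[0-9][0-9]"
MONTH = r"(0?[1-9]|1[012])"
DAY = r"(0?[1-9]|[12][0-9]|3[01])"
# The three rules concatenated into a YYYY-MM-DD pattern.
DATE = YEAR + "-" + MONTH + "-" + DAY


def accepts(pattern: str, value: str) -> bool:
    """True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


checks = [
    (r"\d{2}", "99"),      # two digits: accepts an impossible month
    (MONTH, "99"),         # range-checked month
    (MONTH, "9"),          # single-digit month
    (DAY, "32"),           # range-checked day
    (DATE, "2022-12-11"),  # a real date
    (DATE, "2022-02-31"),  # day range is fine, but February has no 31st
]
for pattern, value in checks:
    print(value, accepts(pattern, value))
```

The last check shows why the simple rules are not enough on their own: each part is in range, but the combination is not a real date, which is what the longer expressions address.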

Using multiple patterns to simplify complexity

As you saw above, things can get out of control and become quite complex. Fortunately, the BizTalk Schema Editor, and schemas in general, allow us to apply multiple patterns to simplify the overall expression.

So, for example, if I want the date format YYYYMMDD with leap years validated, I can use a combination of these 4 expressions (a value is accepted if it matches any one of them):

  • (19|20)\d\d(0[1-9]|1[0-2])(0[1-9]|1[0-9]|2[0-8])
  • (19|20)([02468][048]|[13579][26])0229
  • (19|20)\d\d(0[13-9]|1[0-2])(29|30)
  • (19|20)\d\d(0[13578]|1[02])31
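The combination of the four YYYYMMDD expressions can be checked against a calendar-aware parser. The following is a minimal Python sketch, assuming that a value is valid when it fully matches any one of the patterns (which is how multiple pattern facets on the same restriction are combined) and writing the digit class with the usual \d escape. The function names are illustrative assumptions.

```python
import re
from datetime import datetime

PATTERNS = [
    r"(19|20)\d\d(0[1-9]|1[0-2])(0[1-9]|1[0-9]|2[0-8])",  # days 01-28, any month
    r"(19|20)([02468][048]|[13579][26])0229",             # 29 February, leap years
    r"(19|20)\d\d(0[13-9]|1[0-2])(29|30)",                # days 29-30, not February
    r"(19|20)\d\d(0[13578]|1[02])31",                     # day 31, long months only
]


def matches_any(value: str) -> bool:
    """True when the whole value matches at least one of the patterns."""
    return any(re.fullmatch(p, value) for p in PATTERNS)


def is_real_date(value: str) -> bool:
    """Calendar check used as a reference for the patterns."""
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False


for value in ["20240229", "20230229", "20220431", "20221231"]:
    print(value, matches_any(value), is_real_date(value))
```

One limitation worth knowing about: the second pattern treats every year ending in 00 as a leap year, so 19000229 is accepted even though 1900 was not a leap year. For dates from 1901 to 2099, the patterns and the calendar check agree.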

Some samples

Date in the following format: YYYY-MM-DD like 2022-12-11

Simple formats

  • \d{4}-\d{2}-\d{2} – simple format without validating month or day
  • (19|20)\d{2}-\d{2}-\d{2} – restricting the year

Complex formats

  • (19|20)((([02468][048]|[13579][26])-0?2-29)|\d\d-((0?[469]|11)-([012]?\d|30)|(0?[13578]|1[02])-([012]?\d|3[01])|(0?2-([01]?\d|2[0-8]))))

Date in the following format: YYYYMMDD like 20221211

Simple formats

  • \d{4}\d{2}\d{2} – simple format without validating month or day
  • (19|20)\d{2}\d{2}\d{2} – restricting the year

Complex formats

  • (19|20)\d\d(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01]) – validates the month and day ranges, but does not check the number of days in each month or leap years

Date in the following format: MM/DD/YYYY like 12/11/1999

Simple formats

  • \d{2}/\d{2}/\d{4} – simple format without validating month or day
  • \d{2}/\d{2}/(19|20)\d{2} – restricting the year

Complex formats

  • ^(?:(?:31(/|-|\.)(?:0?[13578]|1[02]))\1|(?:(?:29|30)(/|-|\.)(?:0?[1,3-9]|1[0-2])\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(/|-|\.)0?2\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(/|-|\.)(?:(?:0?[1-9])|(?:1[0-2]))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$ – validates leap years, but note that this expression expects the day first (DD/MM/YYYY), with /, - or . as the separator
  • ^([0]\d|[1][0-2])/([0-2]\d|[3][0-1])/([2][01]|[1][6-9])\d{2}(\s([0-1]\d|[2][0-3])(:[0-5]\d){1,2})?$ – month first, with an optional time part after the date

Date in the following format: YYYY-MM-DDZ like 2019-06-12Z

Simple formats

  • \d{4}-\d{2}-\d{2}Z – simple format without validating month or day
  • (19|20)\d{2}-\d{2}-\d{2}Z – restricting the year

Complex formats

  • ^\d\d\d\d-(0?[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])Z?(-+:([0-5][0-9]))?$

Time in the following format: HH:mm:ss like 23:59:59

Simple formats

  • \d{2}:\d{2}:\d{2} – simple format without validating hours, minutes, or seconds

Complex formats

  • (([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9])

Time in the following format: HHmm like 2359

Simple formats

  • \d{2}\d{2} – simple format without validating hours or minutes

Complex formats

  • (([01][0-9]|2[0-3])[0-5][0-9])
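The time formats can be checked in the same way. The following is a minimal Python sketch, assuming that minutes and seconds are each two digits in the range 00-59 and writing the digit class with the usual \d escape; the constant and function names are illustrative assumptions, and Python's re module stands in for the schema validator.

```python
import re

SIMPLE_TIME = r"\d{2}:\d{2}:\d{2}"
HH_MM_SS = r"([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"
HHMM = r"([01][0-9]|2[0-3])[0-5][0-9]"


def accepts(pattern: str, value: str) -> bool:
    """True when the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


checks = [
    (SIMPLE_TIME, "99:99:99"),  # the simple format lets impossible times through
    (HH_MM_SS, "99:99:99"),
    (HH_MM_SS, "23:59:59"),
    (HH_MM_SS, "24:00:00"),
    (HHMM, "2359"),
    (HHMM, "2360"),
]
for pattern, value in checks:
    print(value, accepts(pattern, value))
```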

Download

THIS SAMPLE CODE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND.

You can download the POC: BizTalk Schemas Handle Restrictions on Date from GitHub here: