by Sandro Pereira | Feb 22, 2022 | BizTalk Community Blogs via Syndication
Yesterday I spoke about how you can apply custom pattern restrictions to properly validate DateTime, Date, and Time formats inside elements or attributes. You can see more about it here:
And I mention that these Regular Expressions can be simple and relatively easier to read if you have some basic knowledge like this one below:
- d{4}d{2}d{2} – that is a simple format that expects 4 digits followed by 2 additional digits and another 2 digits that is the date in the YYYYMMDD format without validating the accuracy of months or days and where:
- YYYY is the for digits year, like 2022.
- MM is the 2 digits month, like 01.
- DD is the 2 digits day, like 21.
But it can get highly complex that even people with good knowledge have difficulty translating the expression to the expected pattern, like the one below:
- ^(?:(?:31(/|-|.)(?:0?[13578]|1[02]))1|(?:(?:29|30)(/|-|.)(?:0?[1,3-9]|1[0-2])2))(?:(?:1[6-9]|[2-9]d)?d{2})$|^(?:29(/|-|.)0?23(?:(?:(?:1[6-9]|[2-9]d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1d|2[0-8])(/|-|.)(?:(?:0?[1-9])|(?:1[0-2]))4(?:(?:1[6-9]|[2-9]d)?d{2})$
It is not simple to look to this RegEx and say: “yep, we are expecting this Date format: MM/DD/YYYY like 12/11/1999, and by the way, it validates the Leap Year!”
Imagine users that don’t have strong know-how about RegEx!
So, what can we do to improve this experience? What are the best practices in these cases?
Best practices
A good best practice to improve readability while documenting your schemas is to add notes to these elements or attributes.
We can and should use the Notes property to enter notes, such as comments related to the business process, that you would like to make about the selected Record, Field Element, or Field Attribute node. In these DateTime, Date, and Time cases, we can simply add the expected format in the Notes property, like:
- Format: YYYY-MM-DD
- Format: HH:mm:ss
- Format: YYYYMMDD
- Format: HHmm
- Format: YYYY-MM-DD – It validates Leap Year
To accomplish this, we need to
- Right-click on the Element or Attribute fields in the schema tree view and select the Properties option.
- On the Properties window, click on the … (three dots) on the Notes property. This action will open a Notes window where you can add all your relevant notes.
Do that to all your elements and fields that are using pattern restrictions.
Even non-technical guys can understand the Schema specification you provide or are consuming. This best practice implementation will also help you improve productivity since you will not spend too much time decompiling Regular Expressions.
Download
THIS SAMPLE CODE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND.
You can download the POC: BizTalk Schemas Handle Restrictions on Date from GitHub here:
by Sandro Pereira | Feb 21, 2022 | BizTalk Community Blogs via Syndication
Today I was involved in a BizTalk Schema importation that includes not-so-used restrictions on Date and Time elements formats. And that gave me the idea and inspiration to create this blog post.
When we work with DateTime on Schemas by default we can choose from the following data types:
- xs:dateTime: The dateTime data type is used to specify a date and a time in the following form “YYYY-MM-DDThh:mm:ss.fffK” where:
- YYYY indicates the year
- MM indicates the month
- DD indicates the day
- T indicates the start of the required time section
- hh indicates the hour
- mm indicates the minute
- ss indicates the second
- fff indicates the milliseconds
- K represents the time zone information of a date and time value (e.g. +05:00)
- xs:date: The date data type is used to specify a date in the following form “YYYY-MM-DD” where:
- YYYY indicates the year
- MM indicates the month
- DD indicates the day
- xs:time: The time data type is used to specify a time in the following form “hh:mm:ss.fffK” where:
- hh indicates the hour
- mm indicates the minute
- ss indicates the second
- fff indicates the milliseconds
- K represents the time zone information of a date and time value (e.g. +05:00)
or you could use an xs:string that b.asically accepts everything. The only problem here is that by default we can’t do a schema validation to see if it is a valid DateTime format.
But not all systems respect de DateTime formats expected by the XSD default values. So, what are my options if a system expects other types of DateTime, Date, or Time formats? Like:
- MM/DD/YYYY
- YYYY-DD-MM
- YYYY-MM-DD HH:mm:ss
- YYYYMMDD
- HHmmss
- HH:mm:ss
- and so on.
Simple Type Derivation Using the Restriction Mechanism
Luckily for us BizTalk Schema Editor and schemas, in general, allow us to derive a simple type, for example, xs:string, by using the restriction mechanism, i.e., we are typically restricting the values allowed in a message for that attribute or element value to a subset of those values allowed by the base simple type. A good and common sample of these types of restrictions is to restrict a string type to be one of several enumerated strings.
Luckily for us, again, we can also apply a pattern (that uses Regex expression) to validate the element or attribute value.
To derive a simple type by using restriction:
- Select the relevant Field Element node or Field Attribute node in the schema tree
- And then, in the Properties window, on the Derived By property set as Restriction.
- This will add/present the Restriction properties on the Properties window.
- On the Restriction properties, click on the … (3 dots) on the Pattern property to define the RegEx.
Regular expression samples to validate date formats
Here is where the fun starts. There are many ways to archive this goal:
- One’s more simple but probably not that efficient since they may not validate all cases (Leap year, and so on)
- Others more complex that requires more knowladge but more accurated.
In a general overview, the use of regex to validate the date format supports a variety of situations and possibilities like:
- Rule to validate the year:
- d{4} – it says that accepts 4 digits like: 2022
- (19|20)[0-9][0-9] -accepts years starting with 19 or 20, i.e., from 1900 to 2099
- Rule to validate the month:
- d{2} – it says that accepts 2 digits like: 12, but the problem here is that also accepts invalid months like 24 or 99.
- 0?[1-9]|1[012] – accepts 01-09 (leading zero), 1-9 (single digit) and 10,11,12
- Rule to validate the day:
- d{2} – it says that accepts 2 digits like: 12, but the problem here is that also accepts invalid days like 32 or 99. It also don’t validate what is the month we define to validate if accepts 28, 29, 30 or 31
- 0?[1-9]|[12][0-9]|3[01] – accepts 01-09 (leading zero), 1-9 (single digit), 10-19, 20-29 and 30-31. It doesn.t check if it is a Leap year or not.
- To implement the lead year that needs to be with a concatenation of several rules like this sample:
- ^(?:(?:31(/)(?:0[13578]|1[02]))1|(?:(?:29|30)(/)(?:0[13-9]|1[0-2])2))(?:(?:18|19|20)d{2})$|^(?:29(/)023(?:(?:(?:(?:18|19|20))(?:0[48]|[2468][048]|[13579][26]))))$|^(?:0?[1-9]|1d|2[0-8])(/)(?:(?:0[1-9])|(?:1[0-2]))4(?:(?:18|19|20)d{2})$
This is a different approach to do the same as above:
- (19|20)((([02468][48]|[13579][26])-0?2-29)|dd-((0?[469]|11)-([012]?d|30)|(0?[13578]|1[02])-([012]?d|3[01])|(0?2-([01]?d|2[0-8]))))
But we can go further and allow different types of format like:
- ^(?:(?:31(/|-|.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))1|(?:(?:29|30)(/|-|.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))2))(?:(?:1[6-9]|[2-9]d)?d{2})$|^(?:29(/|-|.)(?:0?2|(?:Feb))3(?:(?:(?:1[6-9]|[2-9]d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1d|2[0-8])(/|-|.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))4(?:(?:1[6-9]|[2-9]d)?d{2})$
Your imagination and skills are the limit.
Note: images and debug RegEx at https://www.debuggex.com/.
Using multiple patterns to simplify complexity
As you saw above, things can go out of control and become quite complex. Fortunately, the BizTalk Schema Editor and the schemas, in general, allow us to apply multiple patterns to simplify the overall expression.
So, for example, if I want to have the following Date format: YYYYMMDD with Leap year validated I can use the combination of these 4 expressions:
- (19|20)dd(0[1-9]|1[0-2])(0[1-9]|1[0-9]|2[0-8])
- (19|20)([02468][048]|[13579][26])0229
- (19|20)dd(0[13-9]|1[0-2])(29|30)
- (19|20)dd(0[13578]|1[02])31
Some samples
Date in the following format: YYYY-MM-DD like 2022-12-11
Simple formats
- d{4}-d{2}-d{2} – simple format without validating month or day
- (19|20)d{2}-d{2}-d{2} – restricting the year
Complex formats
- (19|20)((([02468][48]|[13579][26])-0?2-29)|dd-((0?[469]|11)-([012]?d|30)|(0?[13578]|1[02])-([012]?d|3[01])|(0?2-([01]?d|2[0-8]))))
Date in the following format: YYYYMMDD like 20221211
Simple formats
- d{4}d{2}d{2} – simple format without validating month or day
- (19|20)d{2}d{2}d{2} – restricting the year
Complex formats
- (19|20)dd(0[1-9]|1[0-2])(0[1-9]|1[0-9]|2[0-9]) – do not control Leap year
Date in the following format: MM/DD/YYYY like 12/11/1999
Simple formats
- d{2}/d{2}/d{4} – simple format without validating month or day
- d{2}/d{2}/(19|20)d{2} – restricting the year
Complex formats
- ^(?:(?:31(/|-|.)(?:0?[13578]|1[02]))1|(?:(?:29|30)(/|-|.)(?:0?[1,3-9]|1[0-2])2))(?:(?:1[6-9]|[2-9]d)?d{2})$|^(?:29(/|-|.)0?23(?:(?:(?:1[6-9]|[2-9]d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1d|2[0-8])(/|-|.)(?:(?:0?[1-9])|(?:1[0-2]))4(?:(?:1[6-9]|[2-9]d)?d{2})$
- ^([0]d|[1][0-2])/([0-2]d|[3][0-1])/([2][01]|[1][6-9])d{2}(s([0-1]d|[2][0-3])(:[0-5]d){1,2})?$
Date in the following format: YYYY-MM-DDZ like 2019-06-12Z
Simple formats
- d{4}-d{2}-d{2}Z – simple format without validating month or day
- (19|20)d{2}-d{2}-d{2}Z – restricting the year
Complex formats
- ^dddd-(0?[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])Z?(-+:([0-5][0-9]))?$
Time in the following format: HH:mm:ss like 23:59:59
Simple formats
- d{2}:d{2}:d{2} – simple format without validating valid hours, minutes or seconds
Complex formats
- (([01][0-9]|2[0-3]):[0-5]:[0-9])
Time in the following format: HHmm like 2359
Simple formats
- d{2}d{2} – simple format without validating month or day
Complex formats:
- (([01][0-9]|2[0-3])[0-5])
Download
THIS SAMPLE CODE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND.
You can download the POC: BizTalk Schemas Handle Restrictions on Date from GitHub here: