Logic Apps: How to process CSV files

One of the most ancient and common standards for message representation is the use of text files (Flat Files), like CSV (Comma Separated Values) or TXT files, many of which are custom-made for the systems that produce them. Do not be fooled into thinking these messages are outdated and rarely used: still today, many existing and new integrations are built on Flat Files.

No matter if you are working with Logic Apps Consumption or Logic Apps Standard, there are only two ways to generate a Flat File Schema:

  • Using a BizTalk Server Visual Studio Project (Visual Studio 2019 or older, depending on the BizTalk Server version you are using), this is a good option if you have a BizTalk Server environment and if you are migrating BizTalk Server solutions.
  • Or using the Azure Logic Apps Enterprise Integration Tools. This is an add-in for the Integration Account project to support XML Schema creation, XML transform designer, and Flat File Schema generation for Visual Studio 2019.

In this blog post, we will address how to create a flat file schema for a CSV file using the Azure Logic Apps Enterprise Integration Tools, and how to process it using Logic Apps Consumption. The biggest difference between Logic Apps Consumption and Standard is that Consumption requires an Integration Account to process a flat file, which is not needed in Standard.

How to create a Flat File Schema for a CSV file

Normally, CSV files contain structured data so that a comma or semicolon separates individual items in the file, and each record is on a new line. In the image below, the CSV file contains five columns:

  • First name
  • Last name
  • Birth year
  • City
  • And Zip Code

and on each row, there is a person’s data.

We are going to use this sample as our example.
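The content of such a file looks like the following; the rows shown here reuse the sample data that appears later in this post, so the values in the original image may differ:

   Sandro;Pereira;1978;Crestuma;4415
   José;Silva;1972;Crestuma;4415
   Rui;Barbosa;1975;Lever;4415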

To create a flat file schema for this CSV file, we need to:

  • Open Visual Studio 2019 and create a new project by selecting Create a new project option.
  • On the Create a new project window, search for the Integration Account template, select the Integration Account template, and click Next.
  • On the Configure your new project window, give a Project name and a Solution name and set the Location. Then click Create.
  • Once the Visual Studio project is created, right-click the project name and select the option Add > New Item…
  • On the Add New Item window, on the left tree, select Logic Apps and then select the Flat File Schema, give it a name, and click Add.
  • This will start the Logic Apps Flat File Schema Wizard. On the Welcome screen, click Next >.
  • On the Flat File Schema Information screen, set the following configurations and then click Next >:
    • On the Instance file property, click the Browse button to locate the flat file from which the schema will be generated.
    • On the Record name property, give a name to define the root node name of your XML message—for example, People.
    • On the Target namespace property, set your desired namespace or leave the default.
    • On the Code page property, select UTF-8 (65001) from the drop-down selection list.
  • Because this is a small message, on the Select Document Data screen, leave the default selected data and click Next >.
  • On this first Select Record Format screen, we define what separates each person, which in this case is the new line. So, select the option By delimiter symbol and click Next >.
  • On the Delimiter Record screen, the child delimiter property is already defined to be the new line ({CR}{LF}), so leave the default configuration and click Next >.
  • On the Child Elements screen, on the first line, set the following configuration:
    • Element Name: Person
    • Element Type:  Repeating record
  • On the rest of the lines, set the Element Type to Ignore, since every line has the same structure as the Person record defined on the first line. Then click Next >.
  • On the Schema View screen, you notice our Schema representation is being created. Click Next >.
  • Now, back to the Select Document Data screen, you notice that only the first line is automatically selected, excluding the new line character, and this is what we want. We will be defining the structure of the person object. Click Next >.
  • A semicolon separates this document, so in the Select Record Format screen, select the option By delimiter symbol and click Next >.
  • On the Delimiter Record screen, change the child delimiter property to a semicolon (;), and click Next >.
  • Now, on the Child Elements screen, you will notice that all the fields are separated. Set the following configuration and click Next >:
    • FirstName (string)
    • LastName (string)
    • BirthYear (integer)
    • City (string)
    • ZipCode (string)
  • On the Schema View screen, you notice our Schema representation is finished. Click Finish.
  • And the Schema will open inside Visual Studio.
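For reference, the wizard generates an XSD annotated with flat-file processing information. A heavily simplified sketch of what it can look like is shown below; the exact attributes and the target namespace (here a made-up http://Integration.Persons) will differ in your project, so treat this as illustrative only:

   <?xml version="1.0" encoding="utf-16"?>
   <xs:schema xmlns="http://Integration.Persons" targetNamespace="http://Integration.Persons"
              xmlns:b="http://schemas.microsoft.com/BizTalk/2003"
              xmlns:xs="http://www.w3.org/2001/XMLSchema">
     <xs:annotation>
       <xs:appinfo>
         <b:schemaInfo standard="Flat File" codepage="65001" root_reference="People" />
         <!-- other schemaInfo attributes generated by the wizard are omitted here -->
       </xs:appinfo>
     </xs:annotation>
     <xs:element name="People">
       <xs:annotation>
         <xs:appinfo>
           <!-- Person records are split on the new line -->
           <b:recordInfo structure="delimited" child_delimiter_type="hex"
                         child_delimiter="0xD 0xA" child_order="postfix" />
         </xs:appinfo>
       </xs:annotation>
       <xs:complexType>
         <xs:sequence>
           <xs:element name="Person" maxOccurs="unbounded">
             <xs:annotation>
               <xs:appinfo>
                 <!-- fields inside each Person are split on the semicolon -->
                 <b:recordInfo structure="delimited" child_delimiter_type="char"
                               child_delimiter=";" child_order="infix" />
               </xs:appinfo>
             </xs:annotation>
             <xs:complexType>
               <xs:sequence>
                 <xs:element name="FirstName" type="xs:string" />
                 <xs:element name="LastName" type="xs:string" />
                 <xs:element name="BirthYear" type="xs:integer" />
                 <xs:element name="City" type="xs:string" />
                 <xs:element name="ZipCode" type="xs:string" />
               </xs:sequence>
             </xs:complexType>
           </xs:element>
         </xs:sequence>
       </xs:complexType>
     </xs:element>
   </xs:schema>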

Now that we have our schema created, we need to upload it to our Integration Account – if you don’t have it, you need to create an Integration Account inside the Azure Portal. To do that:

  • In the Azure Portal, access your Integration Account and select the Schemas option under Settings.
  • On the Schemas page, click + Add and, on the Add Schema panel, browse for the schema we just created in the Schema property and click OK.
  • This will add the schema to your Integration Account, which can be used inside your Logic Apps.

Now we need to create a Logic App to process this CSV. To do that, we need to:

  • Create a Logic App and use the HTTP Request-Response template – of course, this can be adapted to your requirements.
  • Once the Logic App design is loaded, click Save and return to your Logic App page. From there, click Workflow settings under Settings.
  • On the Workflow settings page, under the Integration account section, use the Select an Integration account property to choose the integration account to which we added our flat file schema. Then click Save.
  • Now back to our Logic App designer. After the When a HTTP request is received trigger, select Add an Action.
  • On the Choose an operation panel, search for Flat File and then choose the Flat File > Flat File Decoding action.
  • On the Flat File Decoding action, set the following configurations:
    • Content: Body from the When a HTTP request is received trigger.
    • Schema Name: Persons – the flat file schema we created earlier.
  • Now, select the Response action and set the following configurations:
    • Status Code: 200
    • Headers: Content-Type: text/xml
    • Body: Body from the Flat File Decoding action.
  • And finally, save the Logic App.
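If you later switch to the code view of the workflow, the Flat File Decoding action configured above ends up looking roughly like this (a sketch, assuming the schema was uploaded with the name Persons):

   "Flat_File_Decoding": {
       "type": "FlatFileDecoding",
       "runAfter": {},
       "inputs": {
           "content": "@triggerBody()",
           "integrationAccount": {
               "schema": {
                   "name": "Persons"
               }
           }
       }
   }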

Now if we open Postman and send the following request, the expected response will be the same payload translated to XML, as you see in the picture below:
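As an illustration, a request/response pair would look roughly like the following, assuming the sample data shown earlier and the People/Person structure defined in the schema (the namespace is a made-up one; yours will differ):

   Request body (Content-Type: text/plain):

   Sandro;Pereira;1978;Crestuma;4415
   José;Silva;1972;Crestuma;4415

   Expected response body (Content-Type: text/xml):

   <People xmlns="http://Integration.Persons">
      <Person>
         <FirstName>Sandro</FirstName>
         <LastName>Pereira</LastName>
         <BirthYear>1978</BirthYear>
         <City>Crestuma</City>
         <ZipCode>4415</ZipCode>
      </Person>
      <Person>
         <FirstName>José</FirstName>
         <LastName>Silva</LastName>
         <BirthYear>1972</BirthYear>
         <City>Crestuma</City>
         <ZipCode>4415</ZipCode>
      </Person>
   </People>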

Hope you find this useful! So, if you liked the content or found it useful and want to help me write more content, you can buy (or help buy) my son a Star Wars Lego! 

Author: Sandro Pereira

Sandro Pereira lives in Portugal and works as a consultant at DevScope. In the past years, he has been working on implementing Integration scenarios both on-premises and cloud for various clients, each with different scenarios from a technical point of view, size, and criticality, using Microsoft Azure, Microsoft BizTalk Server and different technologies like AS2, EDI, RosettaNet, SAP, TIBCO etc.

He is a regular blogger, international speaker, and technical reviewer of several BizTalk books all focused on Integration. He is also the author of the book “BizTalk Mapping Patterns & Best Practices”. He has been awarded MVP since 2011 for his contributions to the integration community.

Flat File Support inside Logic Apps

We live in a REST era where JSON and XML are our default choices of text-based formats, in widespread use by major corporations. However, one of the most ancient and common standards for message representation is the use of text files (Flat Files), like CSV (Comma Separated Values) or TXT files, many of which are custom-made for the systems that produce them. Do not be fooled into thinking these messages are outdated and rarely used. A good example is EDI messages, which are used extensively by large companies, so it is often necessary to transform text files into XML and vice versa.

Still today, many existing integrations and new integrations are made based on Flat Files.

What are Flat Files?

The standard definition states that a flat file is a collection of data stored in a two-dimensional database in which similar yet discrete strings of information are stored as records in a table. The table’s columns represent one database dimension, while each row is a separate record. Or in other words, it is a type of data storage file in which data is stored as plain text, often in a table-like structure with rows and columns. Each row represents a single record, while columns represent fields or attributes of the data. The information stored in a flat file is generally alphanumeric with little or no additional formatting. The structure of a flat file is based on a uniform format as defined by the type and character lengths described by the columns.

Flat files serve various purposes in software development, primarily for data storage, exchange, and processing tasks. They are widely used due to their simple structure, human readability, and ease of manipulation across different platforms and applications. For instance, flat files are commonly employed in data import and export operations, where applications or systems with diverse data storage mechanisms must communicate or transfer data.

A flat-file instance message is a text file that can contain three logical parts:

  • A header.
  • A body.
  • And a trailer.

In that order. Of course, both the header and the trailer are optional. The following example shows a flat-file instance message consisting of all three parts: the first two lines are the header, the PO line is the body, and the last line is the trailer:

Sandro Pereira
Porto, Portugal
PO,1,BOOK,4415
TRANS-1

If you come from a Microsoft BizTalk Server background or if you are migrating a BizTalk Server project using these types of Flat-Files, you may know that the Flat file disassembler’s parsing allows you to specify:

  • The Header Schema in the Header schema design-time property of the flat file disassembler or the XMLNORM.HeaderSpecName message context property.
  • The Body Schema in the Document schema design-time property of the flat file disassembler or the XMLNORM.DocumentSpecName message context property.
  • And the Trailer Schema in the Trailer schema design-time property of the flat file disassembler or the XMLNORM.TrailerSpecName message context property.

However, Logic Apps only supports body schemas. You cannot have different schemas for headers, bodies, and trailers. You can still process these types of flat-file messages, but with a different approach: a single schema that divides the header, body, and trailer into different records (structures).
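As a rough sketch, a single schema for the example above could be organized along these lines (the record names and whether the header and trailer are optional are just one possible choice):

   Root (delimited record, child delimiter {CR}{LF})
      Header  – record for the name and address lines
      Body    – repeating record for the PO lines
      Trailer – record for the TRANS-1 line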

Flat-File Schema Types

Within a particular part of a flat-file instance message, different data items are grouped into records, which themselves can contain sub-records and, ultimately, individual data items known as fields. These records and fields are distinguished from each other using one of two different basic methodologies.

  • The first methodology, known as positional, defines each data item as a pre-established length, with pad characters being used to bring a shorter item of data up to its expected length.
  • The second methodology, known as delimited, uses one or more special characters to separate items of data from each other. This methodology avoids the need for otherwise superfluous pad characters. Still, it introduces some special considerations when the data itself contains the character or sequence of characters being used as a delimiter.

Positional Flat Files

Positional records within a flat-file instance message contain individual fields (items of data) that are each of a predefined length. The fields are parsed according to these lengths. For example, consider the following positional record from a flat-file instance message containing an id, country code, client name, and Country name:

01 PT Sandro Pereira                       Portugal

A reasonable definition for this record in a flat-file schema can be described as follows:

  • A positional record named Client contains the following fields:
    • An attribute named id that is left-aligned, three characters in length, with a zero character offset.
    • An element named countryCode that is left-aligned, three characters in length, with a zero character offset.
    • An element named name that is left-aligned, 37 characters in length, with a zero character offset.
    • An element named country that is left-aligned, and the length is until the end of the line.

Given these record and field definitions, the Flat file disassembler will produce the following XML equivalent of this record:


<Client id="01">
   <countryCode>PT </countryCode>
   <name>Sandro Pereira                       </name>
   <country>Portugal</country>
</Client>
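
Inside the schema itself, these positional settings are captured as flat-file annotations on each field. A simplified sketch for the name field, assuming the lengths described above, could look like this:

   <xs:element name="name" type="xs:string">
      <xs:annotation>
         <xs:appinfo>
            <b:fieldInfo justification="left" pos_offset="0" pos_length="37" />
         </xs:appinfo>
      </xs:annotation>
   </xs:element>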

There are several considerations related to positional records that will affect how the record is parsed when received and constructed when sent, including:

  • The character used to fill the unused portion of each field, known as the pad character.
  • An optional tag within the record can be used to distinguish the record from other similar records. Tags usually occur at the beginning of the record but are allowable anywhere within it. Positional records can be defined to have a tag or not have a tag, but once defined, the tag must be present or not, based on the definition.
  • How data is justified within a fixed length field relative to the accompanying pad characters.
  • Positional records nested within other positional or delimited records.
  • Positional records with field lengths specified as a specific number of bytes rather than a specific number of characters.

Notes:

  • If your flat file contains both delimited and positional records, you must set the Structure property of the root node to Delimited and the Structure property of subordinate record nodes to either Delimited or Positional, as appropriate.
  • Fields in positional records have a limit of 50000000 characters.

Delimited Flat Files

Delimited records within a flat-file instance message contain nested records and/or individual fields (items of data) that are separated by a predefined character or set of characters. The fields are parsed according to these separating delimiters. For example, consider the following delimited records from a flat-file instance message, which contain three client lines to add to our internal system hypothetically:

Sandro;Pereira;1978;Crestuma;4415
José;Silva;1972;Crestuma;4415
Rui;Barbosa;1975;Lever;4415

A reasonable definition for this record in a flat-file schema can be described as follows:

  • A delimited repeating record named Client with child delimiter {CR}{LF}
  • And  delimited elements with child delimiter ;
    • firstName
    • lastName
    • birthYear
    • city
    • zipCode

Given these record and field definitions, the Flat file disassembler produces the following XML equivalent of these records.


<Client>
   <firstName>Sandro</firstName>
   <lastName>Pereira</lastName>
   <birthYear>1978</birthYear>
   <city>Crestuma</city>
   <zipCode>4415</zipCode>
</Client>
<Client>
   ...
</Client>
<Client>
   ...
</Client>

There are several considerations related to delimited records that will affect how the record is parsed when received and constructed when sent, including:

  • The character or characters used to override the interpretation of delimiters so that they are treated as part of the data.
  • An optional tag at the beginning of the record can be used to distinguish the record from other similar records.
  • How data is justified within fields with minimum lengths relative to the accompanying pad characters.
  • Positional records nested within other delimited records.
  • How data is justified within a fixed length field relative to its accompanying pad characters.
  • Preservation and suppression of delimiters when flat-file messages are received and sent.

Notes:

  • If your flat file contains both delimited and positional records, you must set the Structure property of the root node to Delimited and the Structure property of subordinate record nodes to either Delimited or Positional, as appropriate.
  • Delimited fields in flat files have a limit of 50000000 characters.

How do Logic Apps process text files (Flat Files)?

When building a logic app workflow in Azure Logic Apps, you can encode and decode flat files using the Flat File built-in connector actions and a flat file schema for encoding and decoding. You can use Flat File actions in multi-tenant Consumption logic app workflows and single-tenant Standard logic app workflows.

  • Inside Logic Apps Consumption, an Integration Account is required to store the flat-file schemas and use the Flat File built-in connector actions.
  • Inside Logic Apps Standard, there is no need for an Integration Account, since schemas can be stored directly in the Logic App resource. However, if you prefer, you can still use an Integration Account.

While no Flat File triggers are available, you can use any trigger or action to feed the flat-file content into your workflow. For example, you can use a built-in connector trigger, a managed or Azure-hosted connector trigger available for Azure Logic Apps, or even another app, like the Request built-in trigger or File System trigger.

Flat File Encoding action

The Flat File Encoding action allows you to convert an XML message into a flat file. Note that this action does not validate the incoming XML message. For that, you need to use the XML Validation action.
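As a quick illustration of what this action does, encoding an XML message that follows the delimited Client structure shown earlier in this post produces the semicolon-delimited text again:

   XML input:

   <Client>
      <firstName>Sandro</firstName>
      <lastName>Pereira</lastName>
      <birthYear>1978</birthYear>
      <city>Crestuma</city>
      <zipCode>4415</zipCode>
   </Client>

   Flat-file output:

   Sandro;Pereira;1978;Crestuma;4415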

In Logic Apps Consumption, this action allows the following inputs:

  • The Content property, where you specify the XML message you want to encode to flat-file.
  • The Schema Name property is where you choose the flat-file body Schema or the Document schema.
    • If the schema list is empty, either your logic app resource isn't linked to your integration account, or your integration account doesn't contain any schema files.
  • And then we have the following optional properties:
    • Mode of empty node generation, where we specify the mode to use for empty node generation with flat file encoding. Possible values are ForcedDisabled, HonorSchemaNodeProperty, or ForcedEnabled.
    • And XML Normalization allows you to enable or disable XML normalization in flat file encoding. Possible values are Yes or No.

In Logic Apps Standard, this action allows the following inputs:

  • The Content property, where you specify the XML message you want to encode to flat-file.
  • In the Source property, we select either LogicApp or IntegrationAccount as your schema source.
  • The Name property is where you choose the flat-file body Schema or the Document schema.
    • If the schema list is empty, either your logic app resource isn’t linked to your integration account, your integration account doesn’t contain any schema files, or your logic app resource doesn’t contain any schema files.
  • And then we have the following optional properties:
    • Mode of empty node generation, where we specify the mode to use for empty node generation with flat file encoding. Possible values are ForcedDisabled, HonorSchemaNodeProperty, or ForcedEnabled.
    • And XML Normalization allows you to enable or disable XML normalization in flat file encoding. Possible values are Yes or No.

Flat File Decoding action

The Flat File Decoding action allows you to convert a flat-file message into an XML message. Note that this action does not validate the resulting XML message. For that, you need to use the XML Validation action.

In Logic Apps Consumption, this action allows the following inputs:

  • The Content property, where you specify the flat-file message you want to decode to XML.
  • The Schema Name property is where you choose the flat-file body Schema or the Document schema.
    • If the schema list is empty, either your logic app resource isn't linked to your integration account, or your integration account doesn't contain any schema files.

In Logic Apps Standard, this action allows the following inputs:

  • The Content property, where you specify the flat-file message you want to decode to XML.
  • In the Source property, we select either LogicApp or IntegrationAccount as your schema source.
  • The Name property is where you choose the flat-file body Schema or the Document schema.
    • If the schema list is empty, either your logic app resource isn’t linked to your integration account, your integration account doesn’t contain any schema files, or your logic app resource doesn’t contain any schema files.

Unlike BizTalk Server, where these syntax transformations (data translations) typically happen inside receive or send pipelines, in Logic Apps they happen inside our business process, that is, inside the Logic App workflow itself, using the Flat File connector actions.

Hope you find this useful! So, if you liked the content or found it useful and want to help me write more content, you can buy (or help buy) my son a Star Wars Lego! 


BizTalk Server Teach me something new about Flat Files (or not) Part II video and slides are available at Integration Monday

Last year I presented several sessions in the Integration Monday community, but, for several different reasons, I never had the chance to properly highlight them on my personal blog. The second session that I delivered last year was the second part of the topic: BizTalk Server: Teach me something new about Flat Files (or not) Part II.

In that session I tried to address and solve the following questions:

  • How to suppress Empty Lines everywhere (end or middle)?
  • My positional flat-file contains data that does not match the expected lengths (and the fields are not padded with spaces). How can I handle that?
  • What if we don’t want to remove Headers but… in fact we want to deal with Headers and Trailers?
  • Do I always need to create a custom pipeline for dealing with Flat Files? Or is it possible to create a generic pipeline?

That was the sixth session that I delivered for that community.

About my session

Session Name: BizTalk Server: Teach me something new about Flat Files (or not) Part II


Session Overview: This is the second part of Sandro Pereira’s earlier presentation on flat files on Integration Monday. You can watch the video recording of the previous session here.

Despite the new protocols, formats, and patterns that have emerged over the years, like Web Services, WCF RESTful services, XML, and JSON, among others, the use of text files (Flat Files) such as CSV (Comma Separated Values) or TXT, one of the oldest and most common patterns for exchanging messages, still remains today one of the most used standards in systems integration and/or communication with business partners.

While tools like Excel can help us interpret such files, this type of process is always iterative and requires a few hints from the user so that the software can determine where the fields/columns need to be separated, as well as the data type of each field. But for an enterprise application integration platform like BizTalk Server, you must reduce any ambiguity so that these kinds of operations can be performed thousands of times with confidence and without recourse to a manual operator.

You can see the recording session here: BizTalk Server: Teach me something new about Flat Files (or not) Part II.

About Integration Monday

Integration Monday is full of great sessions that you can watch and I will also take this opportunity to invite you all to join us next Monday.


BizTalk Server: Teach me something new about Flat Files (or not) video and slides are available at Integration Monday

Last Monday I presented, once again, a session in the Integration Monday series. This time the topic was BizTalk Server: Teach me something new about Flat Files (or not). This was the fifth session that I delivered.

And I think it will not be the last! However, this time it was different in many aspects, and in a certain way it was a crazy session… Despite having some posts about BizTalk Server: Teach me something new about Flat Files on my blog, I didn't have time to prepare this session (I was sent on a crazy mission for a client, and I also had to organize the integration track at the TUGA IT event), I had a small problem with my BizTalk Server 2016 machine that forced me to switch to my BizTalk Server 2013 R2 VM, and I was interrupted by the kids in the middle of the session because the girls wanted me to have dinner with them (worthy of being in this series)… but it all ended well, and I think it was a very nice session with two great real-case samples:

  • Removing headers from a flat file (CSV) using only the schema (without any custom pipeline component)
  • And removing empty lines from a delimited flat file, again, using only the schema (without any custom pipeline component)

For those who were online, I hope you have enjoyed it and sorry for all the confusion. And for those who did not have the chance to be there, you can now view it because the session is recorded and available on the Integration Monday website. I hope you like it!

Session Name: BizTalk Server: Teach me something new about Flat Files (or not)


Session Overview: Despite the new protocols, formats, and patterns that have emerged over the years, like Web Services, WCF RESTful services, XML, and JSON, among others, the use of text files (Flat Files) such as CSV (Comma Separated Values) or TXT, one of the oldest and most common patterns for exchanging messages, still remains today one of the most used standards in systems integration and/or communication with business partners.

While tools like Excel can help us interpret such files, this type of process is always iterative and requires a few hints from the user so that the software can determine where the fields/columns need to be separated, as well as the data type of each field. But for an enterprise application integration platform like BizTalk Server, you must reduce any ambiguity so that these kinds of operations can be performed thousands of times with confidence and without recourse to a manual operator.

In this session we will first address how we can easily implement a robust file transfer integration in BizTalk Server (using Content-Based Routing in BizTalk with retries, a backup channel, and so on). And second, how to process Flat File documents (TXT, CSV, …) in BizTalk Server, addressing: What types of flat files are supported? How does the process of transforming text files (also called Flat Files) into XML documents (syntax transformations) work, where does it happen, and which components are needed? How can I perform flat file validation?

Integration Monday is full of great sessions that you can watch and I will also take this opportunity to invite you all to join us next Monday.
