not the shortest post, but very interested in your opinion


Viewing 1 reply thread
  • Author
    Posts
    • #17176

      This is a general BizTalk 2006 best-practices question.  We brought BizTalk into our organization recently and have several plans for integrating data between our systems.  For that, it is working great.  However, there is one specific process we are struggling to accomplish, and we are starting to wonder whether BizTalk is even the right solution for it at all, or whether there is a better BizTalk technique for doing this.

       

      Here’s the scenario.  We have several files that come in from outside clients.  In this example we’ll use a 400 MB .tar file that is PGP-encrypted.  The tar file contains several 15 MB .tif images.

       

      We need to bring the file in, decrypt it, untar it, run a stored procedure based on a value from the .tif file name, and then write the .tif to a file server location.  The “untarring” and “decrypting” methods come from the GNU libraries, as shown in the Apress “Pro BizTalk 2006” book, and we wrote custom pipeline components implementing them.
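      For reference, the streaming shape of the untar-and-route step looks roughly like the sketch below.  (The actual components here are .NET pipeline components built on the GNU libraries; this is Python purely to illustrate the pattern, and `run_sproc` is a hypothetical callback standing in for the stored-procedure call.)

```python
import tarfile

def process_archive(tar_path, out_dir, run_sproc):
    """Stream .tif members out of an (already decrypted) .tar one at a
    time, rather than buffering the whole archive in memory."""
    # mode='r|' opens the tar for sequential streaming: no seeking,
    # and never more than one chunk of one member held in memory
    with tarfile.open(tar_path, mode="r|") as tar:
        for member in tar:
            if not member.name.lower().endswith(".tif"):
                continue
            run_sproc(member.name)  # DB update keyed on the file name
            src = tar.extractfile(member)
            with open(f"{out_dir}/{member.name}", "wb") as dst:
                while chunk := src.read(64 * 1024):  # 64 KB chunks
                    dst.write(chunk)
```

      The point of the streaming mode is that memory use stays bounded no matter how large the archive is, which is exactly the property the buffered approach loses.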

       

      We have the entire process working correctly.  However, when we began load testing the BizTalk application, we started running into several headaches, from out-of-memory errors in the custom pipeline components to message processing time within the orchestration itself slowing to a snail’s pace.

       

      The common scenario we expect is a few 300 MB .pgp files a day, though this number will grow fairly rapidly.  Our load test consists of six separate files (1.14 GB combined) dropped on the receive location at the same time.

       

      We have a dedicated BizTalk server with 4 GB RAM, four 3.2 GHz processors, and a 400 GB hard drive.  The SQL Server has 8 GB RAM, four 3.2 GHz processors, and a 250+ GB hard drive.

       

      I have no doubt that if we continue to tune, throw hardware at it, and try different techniques, we can get it to work (although I don’t know how scalable it will be).

       

      So my question is not how to fix a particular issue, but whether we are heading in the right direction in the first place.  Is BizTalk the right solution for handling this type of process, or are we simply trying to force a solution into BizTalk that it was not designed for?

       

      Any thoughts on this matter would be greatly appreciated.  Thanks.

    • #17180

      Can you say what your Orchestration is doing?  I'm wondering what you're putting through it.  It may be that you can break this process up into component processes, but this is coming from standard development best practices.

      Can you find a way to break the .tif files out and process them separately?  You can receive the .tar file, break it out and send the "parts" to another port.  I guess it depends on what data you need from the original file(s) and at what time.

      Another simple suggestion is to see how you would implement it "without" BizTalk and verify if either way makes more sense, as well as making it easier to see how to implement a solution with BizTalk.

      • #17185

        Thank you very much for your reply.  The actual orchestration is minimal.  It calls a sproc to update a table based upon the file's name, and it calls a simple custom assembly to kick off another small process.  We've verified that once the .tif files are in the orchestration, everything runs beautifully.  The bottleneck is definitely loading multiple larger (200+ MB) .pgp/.tar files at the same time and processing them through the custom pipeline components.  I can't give you specifics, but my gut instinct tells me there's a memory leak in our untar component.

         If we decrypt and untar manually and just throw a ton of .tifs at it, there is no problem.  I think the problem comes from reading several 200 MB+ files into the message box at the same time.  We've considered breaking this process out; however, that is what we're currently doing with a Windows service, so why migrate the process to BizTalk at all?

         My main question is:  is anyone else using BizTalk for the purpose of really just waiting for large files to be FTPed in and then moving them from one place to another?  We've found great uses for BizTalk in more system-integration tasks, and it almost seems to me that in this one case we're trying to make BizTalk a file mover, which it really wasn't designed for.

         I will tell you that today we found a way to "work around" the issue.  I will post it later, but it is really just a workaround, and I'd like to get some other people's thoughts first, without "tainting the jury" 😉

         

        • #17190

          Why use BizTalk?  You're not mapping anything.  Could a DTS package do the job as well?

          -wa 

           

          • #17191

            As a follow-up: BizTalk is forcing you to push these 15 MB blobs through the database, when really all you want to do is move them on the file system.  That is why I wonder why not write the thing in VB.NET and skip the BizTalk message box convention.

            -wa 
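            Stripped of everything else, the plain-code alternative being suggested here is just a watch-and-move loop.  A minimal sketch of one polling pass (in Python rather than VB.NET, purely for illustration; folder names are hypothetical):

```python
import shutil
from pathlib import Path

def sweep_inbox(inbox: Path, outbox: Path) -> list[str]:
    """One polling pass of a plain file mover: move every .pgp file
    from the inbox to the outbox, touching no database at all."""
    moved = []
    for f in sorted(inbox.glob("*.pgp")):
        shutil.move(str(f), str(outbox / f.name))  # pure filesystem move
        moved.append(f.name)
    return moved
```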

            • #17194

              Again, thx for everybodies replay.

               The current app we have is a Windows service that does just that: waits for the file to come in, decrypts/untars it, calls a sproc, and moves the file.  As I mentioned above, we implemented BizTalk about a month ago and picked three "different styles" of projects that we currently had running in production to rewrite in BizTalk, just so we could get our hands around the new product.  The other projects we completed (more data integration between systems) are working great.  This one has just felt like we've been forcing BizTalk the entire way.

               This process does have a bit of business logic in it, in the sense that it performs certain actions in our DB based on the file names of the .tifs.  Other than that, it's a file mover.

               We considered using a "pointer" system as described in the Apress Pro BizTalk 2006 book, where the actual files are not moved through the message box; only the file locations are, and BizTalk is used for process-flow logic only.
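               The idea behind the pointer system is that the message box carries a few hundred bytes of metadata while the 400 MB payload never leaves disk.  A hypothetical sketch of such a message (Python/JSON for illustration only; a real BizTalk pointer would be a small XML message):

```python
import json
import os

def make_pointer_message(file_path: str) -> str:
    """Build a small routable 'pointer' message describing a large
    file, instead of publishing the file's contents themselves."""
    return json.dumps({
        "path": os.path.abspath(file_path),   # where the payload lives
        "name": os.path.basename(file_path),  # key for the sproc lookup
        "size": os.path.getsize(file_path),
    })

def payload_of(pointer: str) -> bytes:
    """Consumer side: dereference the pointer to reach the payload."""
    with open(json.loads(pointer)["path"], "rb") as f:
        return f.read()
```

               The orchestration then routes and acts on the tiny pointer, and only the final send step ever touches the file itself.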

               What we ended up doing is pulling all files received on our port and forcing them into a single-file line before they enter our orchestration through our custom pipeline components.  In other words, the first file must be completely through the receive pipeline/orchestration before the second one is allowed to enter.  This seems to work fine, and we've successfully pushed 4+ GB of .pgp files through the process.  However, it's a bit of a "quirky" way to do it.
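               That "single-file line" is effectively a mutex around the whole receive path.  A sketch of the gate (Python for illustration; names are hypothetical and `pipeline` stands in for decrypt + untar + orchestration):

```python
import threading

class SingleFileGate:
    """Admit one file into the pipeline at a time: callers for the
    2nd..Nth files block until the current file has fully finished."""
    def __init__(self):
        self._gate = threading.Lock()
        self.completed = []

    def process(self, name, pipeline):
        with self._gate:      # only one file holds the gate at a time
            pipeline(name)    # the entire per-file pipeline runs here
            self.completed.append(name)
```

               The obvious trade-off is that throughput is capped at one file at a time, which is why it feels like a workaround rather than a fix.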

               Again, thanks for the responses.  I would still appreciate any more feedback, if someone feels like giving their two cents.

              • #17195

                rather….."thx for everybody's reply"  😉

                • #17197

                  I'm going to recommend the "pointer" method, since we had that scenario presented to us in a Deep Dive / Ranger training last year.  It makes sense: otherwise it is a complete waste of bandwidth and resources to push those files into and out of SQL Server, not to mention moving the data over the wire twice more than would otherwise be required.  Even if SQL is on the same dedicated BizTalk server, it's now a resource and performance cost you shouldn't need.

                  That would be a great project to work on and then test performance against the original app and current BizTalk application.

  • The forum ‘BizTalk 2004 – BizTalk 2010’ is closed to new topics and replies.