Describing workflows in a multitude of textual formats

The novelty has never worn off for me. Being able to create something that a computer can quasi-understand, execute, and turn into meaningful work simply remains magical, and encouraging people to experience the thrill of programming is, to me, a noble cause.

Simply put, it is liberating. In a deeper sense, as Douglas Rushkoff argues in Program or Be Programmed, it is critical if you want to be one of the few who will ride the wave of innovation of the age we are living in.

In a sense, Warewolf reaches out to those unfamiliar with traditional methods of programming. With its flow-based style it greatly simplifies programming. It makes simple things very easy and difficult things possible, albeit complex; some computational tasks are simply complex no matter how you look at them.

Although Warewolf utilizes aspects of Windows Workflow Foundation (WF), it definitely is not WF.
To a large extent it compensates for the limitations and potential misuse of WF that have resulted in the apparent distrust of flow-based programming solutions in the Microsoft ecosystem.

Undisputedly, starting out with a graphical flow-based representation of logic jump-starts development. In the long run, however, having only such a graphical view is, in my opinion, limiting, and it polarizes those who need to use the system into avid supporters and staunch opponents. Where at all possible, alternative ways of describing logic should be provided.

This article is about the plan to enhance Warewolf with a generalized, pluggable architecture for producing and consuming radically diverse representations of workflows.

Along the way we will hopefully get to dabble with concepts from computer science such as parsing, compiler construction, abstract syntax trees, OO patterns and the like.

If these topics interest you, check back here regularly, get the code, and comment. Your suggestions are most welcome!

Note: help on getting the code abounds in our Knowledge Base under "I want to contribute".

Text representation beats binary mostly, but don’t overdo it

In The Art of Unix Programming, Eric S. Raymond extols the virtues of human readable, text based data representation over binary formats:

“When you feel the urge to design a complex binary file format, or a complex binary application protocol, it is generally wise to lie down until the feeling passes.

If performance is what you’re worried about, implementing compression on the text protocol stream either at some level below or above the application protocol will give you a cleaner and perhaps better-performing design than a binary protocol (text compresses well, and quickly).”

Since the book was published in 2003, we have surely seen this in a big way with HTML on the web and, in particular, the massive invasion of XML on the .NET platform: from XML config files to XAML for UI layout and specification (WPF), and from WSDL for contract specification in WCF to XSD for the specification and validation of user-created XML. Text representations, and XML in particular, are everywhere. It has almost become too much!

The larger picture, however, seems to suggest that verbose XML has been overkill and that the leaner JSON will continue to push XML out of the picture. In the Warewolf realm this seems to hold as well.
While plain XML served us well when we started out and relied solely on Windows Workflow Foundation (WF) for workflow execution, we have since moved on by introducing our very own execution engine, which allows us to run workflows using either WF or our own optimized engine.

Keeping the option of WF execution safeguards us in case WF receives a radical facelift in the future, while for the moment we gain 10x speedups with our own execution engine. Our own engine also opens up the possibility of moving away from the cumbersome XAML that the WF Design Surface produces.

XML Inside Warewolf

Currently, there are two important places inside Warewolf where XML makes its appearance:

  1. For configuration files, and
  2. the XAML that forms the native representation of a workflow.

Internally, XAML is produced by the WF Design Surface inside Warewolf Studio as the artifact of the workflows you craft. Inside the Warewolf Server, the XAML corresponding to your workflows is persisted, parsed and executed.

While this is all good and well for reliable transmission, parsing, and visual presentation by the Studio on the design surface, the sheer amount of information that needs to be persisted coupled with the auto-generated nature of the code makes human inspection or alteration of the XAML virtually impossible.

This is not a problem as such, since the Studio’s design surface presents a clear and intuitive graphical representation of a workflow. However, when you require an alternative output such as a structured textual document describing the actions of a workflow in pseudo-code you are stuck.

What about representing your Warewolf workflows as JSON-formatted data? Or what if you would like to highly optimize a particular workflow by implementing it straight in C#? Or even in a non-CLR language, running the entire process as a call to a DLL method from inside Warewolf? How about producing a literate-programming version of your workflow that allows iterative refinement and is extensively documented?

In essence, the Workflow Way embodies the concept of literate programming in that the graphical representation of a workflow is first and foremost intended for human comprehension, with the added ability that the execution semantics can be understood by a computer.

Enter the Warewolf Documenter: a drive towards creating an architecture that will allow the implementation of a broad range of output formats.

Planning the documenter

Since the precise representation of a workflow resides in the XAML generated by the Studio, the first step towards a documenter requires the parsing of this XML.

Intuitively, there seem to be two broad ways of accomplishing this:

  1. building a standalone parser that generates a kind of intermediate representation of the workflow XAML on its own, or
  2. incorporating the generation of this intermediate representation into the existing parsing process in a visitor-like pattern.
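
To make the first option a little more concrete, here is a minimal sketch in Python of what a standalone parser could look like. Everything in it is an assumption made for illustration: the sample markup is a heavily simplified, hypothetical stand-in for workflow XAML (the real thing is far more verbose), and the names `Node`, `to_node` and `parse_workflow` are invented here and are not part of Warewolf.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical, heavily simplified stand-in for workflow XAML.
# Real WF XAML is far more verbose and uses XML namespaces.
SAMPLE_XAML = """
<Flowchart DisplayName="Greet customer">
  <FlowStep>
    <Assign DisplayName="Set name" To="[[name]]" Value="World" />
  </FlowStep>
  <FlowStep>
    <WriteLine DisplayName="Say hello" Text="Hello [[name]]" />
  </FlowStep>
</Flowchart>
"""


@dataclass
class Node:
    """One node of the (hypothetical) intermediate representation."""
    kind: str                                        # element name, e.g. "Assign"
    name: str                                        # human-friendly display name
    attributes: dict = field(default_factory=dict)   # remaining XML attributes
    children: list = field(default_factory=list)     # nested nodes, in document order


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def to_node(element):
    """Recursively convert an XML element into a Node."""
    attrs = dict(element.attrib)
    display_name = attrs.pop("DisplayName", "")
    return Node(
        kind=local_name(element.tag),
        name=display_name,
        attributes=attrs,
        children=[to_node(child) for child in element],
    )


def parse_workflow(xaml_text):
    """Parse the XML text and return the root Node."""
    return to_node(ET.fromstring(xaml_text.strip()))


workflow = parse_workflow(SAMPLE_XAML)
```

The resulting `workflow` tree keeps only element names, display names and attributes, which is roughly the kind of compact structure a publisher could walk without ever touching the XML again.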

It turns out that on the Warewolf Server side, the parsing of a workflow's XAML produces a sort of abstract syntax tree, or AST, that completely describes the intended execution that is to be performed. By definition, the generated AST no longer contains information present in the XAML that might be very important in creating a useful alternate representation, as the documenter aims to do.

To capture this additional information contained in the XAML, again two options exist:

  1. Augment the AST during parsing or thereafter, or
  2. alternatively build a concrete syntax tree (CST) while parsing and/or thereafter.

The best approach still needs to be determined, but this step will produce a data structure that is far more compact than the original XAML and open to interrogation by various publishers, discussed next.
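
As a rough illustration of the first option, the sketch below (in Python) augments each AST node with an `extras` dictionary for information that has no bearing on execution, and lets a visitor interrogate the tree. All names here (`AstNode`, `Visitor`, `AnnotationCollector`, the `annotation` key) are hypothetical and chosen for this example only; the actual Warewolf AST may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class AstNode:
    kind: str                                      # what to execute, e.g. "Assign"
    args: dict = field(default_factory=dict)       # execution semantics
    children: list = field(default_factory=list)   # nested nodes
    extras: dict = field(default_factory=dict)     # augmentation: non-executable info


class Visitor:
    """Calls visit_<Kind> when a subclass defines it, else visit_default."""

    def visit(self, node, depth=0):
        handler = getattr(self, "visit_" + node.kind, self.visit_default)
        handler(node, depth)
        for child in node.children:
            self.visit(child, depth + 1)

    def visit_default(self, node, depth):
        pass


class AnnotationCollector(Visitor):
    """Gathers the annotations an execution-only AST would have dropped."""

    def __init__(self):
        self.found = []

    def visit_default(self, node, depth):
        note = node.extras.get("annotation")
        if note:
            self.found.append((depth, node.kind, note))


# A hand-built tree standing in for the output of the parsing step.
tree = AstNode(
    "Sequence",
    extras={"annotation": "Greets a customer"},
    children=[
        AstNode("Assign", args={"to": "[[name]]", "value": "World"},
                extras={"annotation": "Default name"}),
        AstNode("WriteLine", args={"text": "Hello [[name]]"}),
    ],
)

collector = AnnotationCollector()
collector.visit(tree)
```

The execution engine would only ever look at `kind`, `args` and `children`, while a publisher is free to read `extras` as well. A purpose-built CST would instead keep everything from the source in one structure, at the cost of more bulk.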

A more compact representation might very well be more portable to future versions of the execution engine, and it will travel across the wire faster. Additionally, having the Studio transfer only the compact representation (the Server really needs nothing more) should also increase the responsiveness of the design surface.

Publishers: a celebration of diversity

Having obtained our intermediate representation in the form of an augmented AST or a purpose-built CST, we can now move on to the exciting part of producing some interesting output.

As a start, we can create some Markdown, templated to produce a level of verbosity suited to the target audience of the document. For example, a business user might only be interested in the broad outline of a workflow: the steps, business-friendly annotations on activities, and so on.

A Business Analyst might be interested in a little bit more information, such as input and output sources, the names of external services, a diagram and bulleted list representation of the whole workflow.

Drilling down even deeper, a software developer might very well want to see as much information as possible, including the ability to perform alterations in specific areas of the text which then can potentially be validated by the inverse of a publisher, call it a reader.
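
The three audiences above could be served by a single publisher that takes a verbosity level. The following Python sketch shows the idea under stated assumptions: the dictionary layout of `WORKFLOW`, the step kinds, the service name and the `Audience` levels are all made up for this example and do not reflect Warewolf's actual data model.

```python
from enum import IntEnum


class Audience(IntEnum):
    BUSINESS = 1    # broad outline only
    ANALYST = 2     # adds sources and external services
    DEVELOPER = 3   # adds the low-level detail of every step


# Hypothetical intermediate representation of a small workflow.
WORKFLOW = {
    "name": "Greet customer",
    "steps": [
        {"kind": "Assign", "note": "Pick a default name",
         "source": None, "detail": "[[name]] = World"},
        {"kind": "WebGet", "note": "Look up the customer",
         "source": "CustomerService", "detail": "GET /customers/[[id]]"},
    ],
}


def publish_markdown(workflow, audience):
    """Render the workflow as Markdown, with more detail for higher levels."""
    lines = ["# " + workflow["name"], ""]
    for number, step in enumerate(workflow["steps"], start=1):
        line = f"{number}. {step['note']}"
        if audience >= Audience.ANALYST and step["source"]:
            line += f" (source: {step['source']})"
        lines.append(line)
        if audience >= Audience.DEVELOPER:
            # Indented sub-bullet with the activity type and its raw detail.
            lines.append(f"   - `{step['kind']}`: `{step['detail']}`")
    return "\n".join(lines)


business_view = publish_markdown(WORKFLOW, Audience.BUSINESS)
developer_view = publish_markdown(WORKFLOW, Audience.DEVELOPER)
```

Here `business_view` contains only the title and the two annotated steps, while `developer_view` adds the service name and a sub-bullet per step. In a real publisher the levels would more likely be driven by templates than by `if` statements, but the principle is the same.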

Similarly, a publisher for an augmented version of Markdown, such as MultiMarkdown, could be derived from the plain Markdown publisher. Such a MultiMarkdown publisher would enable the direct generation of a number of formats, and many more formats indirectly via additional applications.

The possibilities are endless: given a generic, easy to interrogate interface to the essence of what constitutes a workflow, multiple publishers can be developed, each presenting the information in a unique perspective ideally suited for a specific target audience.
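
One way to picture such a pluggable set of publishers is a small common interface with one implementation per output format. The Python sketch below is only that, a picture: the `Publisher` base class, the two concrete publishers, the `PUBLISHERS` registry and the dictionary-shaped workflow are assumptions made for illustration, not a description of how Warewolf will do it.

```python
import json
from abc import ABC, abstractmethod


class Publisher(ABC):
    """Common interface: turn the intermediate representation into text."""

    @abstractmethod
    def publish(self, workflow: dict) -> str:
        ...


class JsonPublisher(Publisher):
    def publish(self, workflow):
        return json.dumps(workflow, indent=2, sort_keys=True)


class OutlinePublisher(Publisher):
    def publish(self, workflow):
        lines = [workflow["name"]]
        lines += ["  - " + step["kind"] for step in workflow["steps"]]
        return "\n".join(lines)


# Registry that makes publishers pluggable by format name.
PUBLISHERS = {"json": JsonPublisher(), "outline": OutlinePublisher()}


def publish(workflow, fmt):
    """Look up the publisher for the requested format and run it."""
    return PUBLISHERS[fmt].publish(workflow)


# Hypothetical workflow in intermediate form.
example = {
    "name": "Greet customer",
    "steps": [{"kind": "Assign"}, {"kind": "WriteLine"}],
}
outline = publish(example, "outline")
as_json = publish(example, "json")
```

Adding a new output format then comes down to writing one more `Publisher` subclass and registering it, without touching the parsing side at all.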

Conclusion

We have briefly looked at how and where Warewolf makes use of human-readable, albeit prohibitively verbose, XML representations. The case for alternative representations, aimed at human comprehension and possibly even alteration, is clear and strong. The task is complex, but the theory for achieving it is sound and well established. As a bonus, we are already parsing the XAML, or at least walking the AST that the WF XAML parser provides us with. We have some clear, albeit high-level, ideas of how to implement all of this.

Feel free to offer your comments and suggestions here, or by contacting us on any of the available channels. In a further post I will delve into the details of the process, such as how the XAML actually looks, what OO patterns we’ll be using and so on.

Remember that you can get the entire code base of Warewolf from GitHub.
