Sunday, March 11, 2007

Command Prompt Here

Every time I have a fresh install of Windows XP, one of the million tasks on my todo list is to add in the "Command Prompt Here" context menu. This time I did some searching around, and came across a few interesting results. First, I found this. Now, this page is very informative, and explains several different techniques for installing the shortcut. It also mentions that in Windows Vista, this will finally be a built-in feature. But the one thing it doesn't have is a simple download for an inf or reg script to actually just install it!

Then I found this page, which has the necessary stuff for a simple reg file - just copy and paste.

Still searching, I found this page on Scott Hanselman's Computer Zen blog. He really went the extra mile, and included install scripts for Visual Studio 2003 "Command Prompt Here" and also for Visual Studio 2005 "Command Prompt Here". He also has some other useful tools linked from his blog. Thanks a million, Scott. I won't be wasting time searching for these again.
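For future reference, the basic XP version of this tweak boils down to a tiny .reg file along these lines (this is my recollection of the standard recipe rather than the exact contents of those downloads, and the "cmd" key name is arbitrary):

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\Directory\shell\cmd]
    @="Command Prompt Here"

    [HKEY_CLASSES_ROOT\Directory\shell\cmd\command]
    @="cmd.exe /k cd /d \"%1\""

Save it as something like cmdhere.reg and double-click it to merge it into the registry.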

Wednesday, February 21, 2007

RE: Not Another C# Versus VB Article

I was just digging through some old emails, and I came across this gem. This was a follow-up discussion I had with Nigel Shaw, author of Not Another C# Versus VB Article posted on CodeProject.


Date: May 26, 2005 1:05 PM
Subject: RE: Not Another C# Versus VB Article

Hi Nigel,

I just finished reading your article, "Not Another C# Versus VB
Article", and first of all let me say thank you for this work - I'm sure it will be very useful for consultants like myself, as a well-thought-out argument for building development teams on C# and for switching existing teams to it.

As a developer with about 12 years of professional experience, my history breaks down into something like 1 year of BASIC, Pascal, Fortran, etc., 5 years of VB (vb3 through vb6), 2 years of Java, 1 year of VB.Net and C#, and then 3 years of C#, MC++, and C++. Needless to say, there have been periods in my career where I have considered myself a VB Guru, a Java Guru, and more recently, a .NET Guru. Thus, I feel particularly qualified to comment on your article, and also maybe offer some insights.

I absolutely agree with your central premise - that the fundamental difference between VB and C# is one of culture, rather than technology. Prior to the emergence of .NET, after I had a good amount of exposure to Java, I began to develop very similar conclusions about the differences between VB and Java - which were, after all, far more different technically than VB.Net and C# are - but even then, I believed the central premise applied - as different as they were technically, the really fundamental difference had a lot more to do with what kind of culture gravitated towards each technology. That said, at the time I always had the feeling that there was something fundamentally flawed with each of those 2 technologies, and the general culture that they seemed to promote. On the VB side, of course, you have the opposite 20/80 rule that you've commented on well in your article, in terms of developer capability. On the Java side, my experience at the time was that there was a strong culture in terms of technical competence, but that there was also a lack of focus on solving problems efficiently, often opting instead for the elegant or over-architected solution. This often led me to believe that perhaps the VB culture actually was the best we could do, because it surely would not be susceptible to "too much" elegance!

Then when .NET emerged, my initial impressions were that the platform itself would provide a great step forward in terms of combining the benefits of these two cultures, while eliminating many of the negatives. After some experience dealing with cross-VB-C# teams and seeing how the two sub-teams interacted, I would absolutely say that I can confirm many of your assumptions about the culture breakdown being carried forward to these two technologies. I have been amazed at times to see the same exact piece of logic produced in two entirely different ways by VB.NET and C# teams - and with those two different ways demonstrating precisely the difference in cultures we are discussing. When I initially moved back from the Java world into the new .NET world, I was very gung-ho on VB.NET, first because I was sick of much of the flawed ideology of Java and saw much of that carried on to C#, and second because I was similarly enthusiastic to get back into what seemed to be the most MS-oriented technology - VB.NET. But after time, I began to pick up on these culture differences, and as I found myself gravitating towards the C# culture more and more, I found that the technology conventions supported by C#, and more importantly, those more "encouraged" by C#, were much more in-line with the type of development that I wanted to do.

Ok, so a few notes for your article:

1. Under "The Culture of C#", you write the following: "Marc Andreessen, CEO of Netscape and therefore the 'God of the Internet'" - I believe that you are referring to "god" in the generic sense of the word, i.e. "the god of love", "the god of war", etc. - thus it would be not only incorrect English, but also potentially offensive to those who believe the word "God" should only be capitalized when referring to the one God.

2. The statement "Java applications...were more powerful and feature-rich than the vast majority of VB apps..." struck me as a bit off-base when I first read it. While you may be technically correct in a certain sense, I think this statement is a bit loaded and really depends on the context and point of view of the reader. For example, if I am thinking from the viewpoint of a Windows GUI developer, this statement would seem absolutely false - even today it's not necessarily true, because even the most modern Java Windows toolkits cannot match the power and look-and-feel of the equivalent toolset in VB - thus it is simply not possible to build Java Windows apps that are as "feature-rich" in that sense as VB apps. However, if I am developing server-side multi-threaded distributed messaging clusters, then yes, of course Java offers a lot more capability than VB has offered in the past, or even maybe today. But the point is, the "vast majority" of VB apps up until .NET were most certainly Windows GUI apps, while the vast majority of Java apps have always been server-side. So the statement as you have it is very "apples-and-oranges" to me. I think the point you are trying to get across is about how Java was technically superior on many levels at that time - I would just re-write this in a less ambiguous and debatable way than what you have.

3. I really like the section "Propagation of the Culture in .NET" - but I think you would do better to split up your list into 2 lists or something. About half of the points are what I would consider absolutely undebatable points about what is wrong with VB.NET (1, 2, 3?, 4, 7). However other points (5 and 6, and to a smaller extent maybe 3) are very debatable in my opinion, and seem to have a lot more to do with personal preference. I personally believe VB is the superior one on points 5 and 6 for example, and I don't agree with your conclusions that they detract absolutely from the potential of VB to promote a strong "culture". I believe the opposite could be true in fact - if these features were part of C#, they would be a further benefit to the environment, and would help provide distinguishing factors that set C# ahead of Java (like properties, events, delegates, etc.). So my point here is that at least a couple
of your arguments here seem to be very much based on personal preference, and much less like "absolute truth" that should be logically accepted by anyone reading the argument - and that detracts from the rest of your points, which are much more like absolute truths.

4. Speaking of those absolute truths, I would make some further comment to expand your point #4 - I believe this is truly fundamental to the continuation of the negative aspects of the VB culture. For example, when I first moved to VB.NET, I obviously started looking for parallels and language equivalents between VB and VB.NET. When I saw two options for doing the same thing, for example String.Length and Len(String), I noticed that the first one looked a lot like the Java way of doing things (which I was trying to escape from), while the second looked like good old familiar Len() in VB6. I continued to use this function until it came time to port a section of code from a VB.NET assembly to a C# assembly a colleague was working on - and then it hit me - this non-standard, non-object-oriented VB extensions stuff is not directly portable! If I had written it with the new .NET convention String.Length instead, it would have ported directly, basically with the addition of semicolons here and there. Instead, it became a non-trivial exercise to port code between the two languages, because it was actually more like porting code between two cultures!

So, once I realized this, I had to go back and re-examine a lot of what I had been doing in VB.NET and try to understand the "right way" for true .NET compatibility. This exercise really helped to clarify my understanding of the distinctions between the object-oriented aspects of VB.NET and the non-object-oriented aspects. Even more interesting, when I finally realized that all of these things simply exist in that VB assembly, I thought I should be able to simply reference that assembly from C# and give myself all the same "powers" from within the C# environment. This would have been especially useful, for example, with the famous IsNumeric() function in VB, which doesn't seem to have a .NET equivalent, but which was always very useful in VB. Unfortunately, I never got this to work from C# - referencing the VB dll and calling the VB functions directly just never worked for me in practice. So this ultimately led to rewriting a lot of code to remove usage of Instr(), Mid(), Trim(), Len(), IsNumeric(), and countless other non-standard-.NET functions that seem only natural to a VB programmer.

As a final note, I would recommend a book to you, if you haven't already heard of it - I just finished reading it, after having it recommended by a friend of mine, and I'm not sure if it is because the content is still so fresh in my mind or not, but while reading your article, I found myself going back to the ideas in the book many times and drawing parallels. The book is called "The Tipping Point", and discusses the ways that ideas turn into trends and then into epidemics, and how specific kinds of individuals are attracted to those ideas.

Thanks,
Geoff

I received a gracious reply from Nigel, and wrote a follow-up:

I'd be happy to be involved in a rewrite. Let me know what you have in mind, and perhaps we could start brainstorming a new outline, new ideas, etc.

Not sure what you think about this, or if there is already significant work out there - but I would consider expanding the scope to cover the "evolution of culture" as it has gone on through the various iterations, including a bit about culture as it applied to both Delphi and Java as well, and maybe even new contrasts between "J2EE" and ".NET" culture. My predictions for the future would be an eventual "survival of the fittest" race towards a re-unified culture in which the best of .NET and the best of J2EE co-exist happily in a single culture.

However, while this topic of general evolution of cultures is kind of a pet subject area for me, it may not be the best idea to expand your overall scope, as the current main premise you have (stated loosely, "C# is better than VB") is very pointed and controversial, and no doubt has a lot to do with what has drawn such a great response for your article throughout the community.

Perhaps along these lines I would consider a separate article as a subsequent chapter in the "series", following on from the basic premises in this one...

Hmm, I'll have to put some thought into it myself...

Geoff


Thursday, February 08, 2007

Snow!

For most people in London, snow is just another regular yearly occurrence. But where I come from, it's quite a novelty!



Wednesday, February 07, 2007

USB Rechargeable Batteries

Here is the cool gadget of the day:



These are apparently available now at this site:

http://www.usbcell.com

The price is also cheaper than I would have expected - £12.50 for a 2-pack of AA batteries.

They also have AAA and 9V batteries, and are apparently developing cell phone batteries.

I've seen some discussion about these. Apparently the USB design means they are really only half as efficient as normal NiMH batteries, which already have a fairly disappointing efficiency. An alternative was suggested, for longer life and lower cost, here. Of course, this one requires you to carry around a charger - but at least it's still a USB charger!

Another guy has actually done some detailed experimentation to analyze actual discharge results for different types of rechargeable batteries. His results are posted here.

Wednesday, January 31, 2007

Gadgets

I use a few gadgets on a daily basis. Here are a few of my favorites:

1. Sandisk Sansa e280 Mp3 Player

I get a lot of mileage out of this mp3 player. It has 8 gigs of internal memory, plus a miniSD slot for extra swappable memory. Most of my friends have some kind of iPod or iPod Nano, but for my purposes I think the Sandisk player suits me better. I'm not so concerned with how flashy it looks, and honestly I prefer the look of my player to any iPod - I think it looks less like a product made by a computer manufacturer, and more like an electronics gadget. From what I understand, the iPod products require iTunes to really do anything useful. For me that would be a deal-breaker on its own - I love the ability to use Windows Explorer to just drag and drop stuff, and it's also good that it will recognize pretty much any file format.

Before this player, I used the MPIO FL100 player. This served its purpose for a reasonably long lifetime - over 3 years, I think. Towards the end it got to the point that the LCD screen was bleeding so badly that I could only read the 2 leftmost characters. I finally gave in and decided to get something new. I should mention that with this player, I was forced to use an extra program to transfer files back and forth - but there was more than one option, and the protocol appears to be open enough for users to write their own software, such as this one.


2. Sony Headphones


For most of my daily usage while I'm mobile, I insist on these:

I should admit that they have broken many, many times - it usually comes down to the quality of the wire inside the cable, which is hair-thin. I've tried many times to repair them, and a couple of times I've had success - but it's never really because the soldering has just come loose - it's usually because the wire just under the earpiece has completely stripped down to a hair. So I end up just cutting the wire down to that point, and re-soldering to end up with a much shorter cable length. This technique has extended the lifetime an extra 3 months or so.

This hair-thin wire design makes them very light (and explains why they are cheap), but also makes them particularly delicate, and probably way too fragile for my rough exercise and transportation habits. But so far they have always proven easy enough to find replacements for, and they usually cost around 10 pounds (18 dollars), so it's not too bad. I try to keep the warranty stuff around, but never quite get around to trying to use it. I've been using these for about 4 years now, and have probably had about 8 or 9 sets in that time.

But it's worth it, because while they keep working, they are great. Well, at least for my taste - I don't particularly want something penetrating deep in my ear, and I don't even really care about noise blocking or cancellation. In fact, when I am at the gym or out for a run, I kind of prefer to have the ambient noise - running a path where I need to cross streets with busy traffic, it's especially useful, and would probably be reckless to use noise-cancelling headphones. Since they don't really weigh anything, I definitely prefer the over-the-ear design, even though it probably looks a bit more tacky than a simpler in-the-ear headphone.

I also want to rant briefly about one of my biggest pet peeves - the standard iPod headphones, and people who use them in public. I'm not sure who in the "world's greatest design firm" designed this masterpiece, but from a usability standpoint they are ridiculous. Perhaps it's not really something you notice when you are wearing them, and it's also probably not something you would notice if you never take public transportation. But for those of us who sit on the subway every day and invariably end up sitting between two iPod users, it is like torture. The design of these headphones almost seems to have an "open ambient" technique, which makes them louder externally than internally. Regardless of what type of audio or music the person sitting next to you is listening to, you'll be able to make out very clearly everything they are listening to - even if you have your own headphones on.

For some of the work I do in my studio at home, I prefer the MDR-V700 fully enclosed headphones. These are the same headphones used by many DJs, and give a great combination of excellent quality, encapsulation, and flexible/portable design.


3. Motorola MPx220

After using the MPx200 phone for about a year until its total extinction, I was more than happy to upgrade to the MPx220 phone. I had a ton of complaints about the reliability of the older MPx200, and I think most of those issues were to do with the older Windows Mobile 2003 operating system, but also probably a few device issues.

This new phone has been going solid for a couple of years now, even though I managed to break the clamshell hinge a good while back. Since then I've had to be extra-careful opening and closing the thing, but surprisingly, it still works perfectly and doesn't even seem to know that its wires could snap at any moment. This is a big improvement over older clamshell phones I've used, which were rendered useless very shortly after breaking the hinge.

I've already purchased a brand new empty casing for the phone off eBay, for about 10 dollars. But it's a bit too complex for me to switch it out without putting some serious study into it, or just taking it down to the electronics center and finding someone more capable to do it for me. But so far it keeps working fine, so I haven't had the motivation to get down there and get it done.

In short, I love the phone and I love the OS - I think they've fixed the majority of stupid issues I had with the old one. I actually used to have to reboot that thing as often as I rebooted my Win2k desktop. But this new one hardly ever gets a reboot - probably only once a month or so, or whenever I have to turn it off to get on an airplane.

One thing I haven't really had an opportunity to try with the new phone is the GSM internet access. I used this quite frequently with my old MPx200, but always had problems with it. For some reason it would hardly ever work with the cable, and I could only really get it working with infrared. I would guess that this is going to work much better with this phone, like everything else does, if I ever get an opportunity to try it out.

I found a very detailed review of this phone here. They even go and open the phone up and show all of its inner guts.

4. Motorola HS850

I really can't live without my bluetooth headset. I've seen various complaints on websites about this unit, but honestly I haven't had major problems. And considering how much the price has dropped, it's now cheap enough that if one breaks I can just grab a new one without making a big deal about it. So far, I've had one break, after about 6+ months of very frequent usage. I then went out and bought a pair of new ones, in different colors, to go with the two telephones I use (one for UK, one for Italy).

I'm biding my time until this one breaks, so I can upgrade to the next gen model, the Motorola H700 headset, which is lighter and even more streamlined:

My roommate just got one, and it's really surprisingly small. But on a minor negative note, they have once again pulled the famous old Motorola trick - they've switched the power adapter again! So it seems that with every new iteration of products, Motorola loves to switch between the proprietary "fork" plug interface and the more standard mini-USB format. Here's the rundown I've seen so far, based on my experience (in order of purchase date):

  • Motorola MPx200 : mini-USB connector
  • Motorola MPx220 : proprietary Motorola connector
  • Motorola HS850 : proprietary Motorola connector
  • Motorola SLVR : mini-USB connector
  • Motorola H700 : mini-USB connector

Most feedback I've seen on the web is in favor of the mini-USB connection, and I am too - if for no other reason than that standard mini-USB cables are much easier to find replacements for. They also don't feel as flimsy, and tend to stay connected better.

Oh, by the way, as for the H700 - they even make a flashier gold Dolce & Gabbana version!

Tuesday, January 30, 2007

Fast Excel Reporting

A recurring topic where I work is how to improve performance of code that exports data to Excel for reporting. Lately I've seen a lot of interesting advice on different options, so I thought I'd post some of it here.

Approach #1: C# with interop for Excel COM API
The fundamental problem with this approach is that the COM API is inherently slow, and all of the marshalling (especially with variant types!) has a lot of overhead. The net effect is that a lot of people who started out with this approach have seen performance degrade. Now, I should mention that in our core reporting engine we still primarily do things this way, and we produce reports with literally hundreds of thousands of rows, ending up with several-hundred-megabyte files - and yes, some of them take a while, but nothing that seems unreasonably bad, given what they are doing. From what I recall, the worst is something like a half hour, for almost 1M rows by 50+ columns. And obviously any user who really wants us to put that much data in one spreadsheet is not really going to be able to use it very effectively.

The first approach most people try is cell-by-cell, which is pretty slow, because in addition to marshalling you are dealing with lots of IDispatch-type calls over the interop barrier. An improvement on this, to eliminate the "COM-chattiness" is to pass a large chunk of data to Excel at once, as an array, and then to apply post formatting afterwards. The formatting can be done within a VBA macro, or it can be done via COM - in this case it's best to try to optimize formatting by grabbing ranges of similar cells and formatting them together at once.
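To make the contrast concrete, here's a rough sketch of the array technique in C# (this assumes a reference to the Excel interop assemblies, and the data array is just a placeholder for whatever the report engine produces):

    using System;
    using Excel = Microsoft.Office.Interop.Excel;

    class ExcelArrayExport
    {
        // Sketch: push a whole block of values to Excel in one call instead of cell-by-cell.
        static void Export(object[,] data)
        {
            int rows = data.GetLength(0);
            int cols = data.GetLength(1);

            Excel.Application app = new Excel.Application();
            Excel.Workbook book = app.Workbooks.Add(Type.Missing);
            Excel.Worksheet sheet = (Excel.Worksheet)book.Worksheets.get_Item(1);

            // Resize a range to cover the whole block, then assign the array once.
            Excel.Range block = sheet.get_Range("A1", Type.Missing).get_Resize(rows, cols);
            block.Value2 = data;              // one marshalled call for all the cells

            // Apply formatting to whole ranges afterwards, not to individual cells.
            block.NumberFormat = "General";

            app.Visible = true;
        }
    }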

Approach #2: Directly calling via COM vtable interfaces
This is a technique that can be used to eliminate a lot of the overhead of calling through a generic COM interop layer. The problem with the typical interop layer is that you are essentially going through the same IDispatch interface that was used back in the VB days. By encoding the vtable interface and linking it in, typically in C++ code, the maximum performance with a COM call can be achieved. This is also theoretically possible in C#, but you must be a master of P/Invoke to set it up properly, and since the Excel API is fairly extensive, to be able to use most of it you would spend a ridiculous amount of time trying to write all of the call signatures, and then debug all of the calls to get marshalling and types right. It sounds like a nightmare to me, but I've heard of people having done it.

Approach #3: Use clipboard as a protocol
Since Excel is integrated with both the plain-text clipboard and also much more extensive clipboard formats, including DDE/OLE formats, you can pretty much pass anything to Excel via the clipboard. This would be particularly useful if you need to put random things like graphics into a spreadsheet. A common approach to sending a large set of data to Excel is to format your data as one big delimited text string (CSV or tab-separated), copy that to the clipboard, and then paste into Excel, or even use some of the Paste Special options to get formatting and types to work properly. This technique can provide big gains over API calls in some scenarios. One of the problems with this approach is that it uses the shared clipboard, so it's hard to run something like this in the background if there are user apps running as well. Additionally, it presents serious challenges for any kind of multi-threaded application.
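Here's a rough sketch of that technique (using tab-delimited text, which Excel splits into columns when pasted; note the Clipboard class lives in System.Windows.Forms and needs an STA thread):

    using System;
    using System.Text;
    using System.Windows.Forms;
    using Excel = Microsoft.Office.Interop.Excel;

    class ClipboardExport
    {
        [STAThread]
        static void Main()
        {
            // Build one big tab-delimited string - placeholder data for the sketch.
            StringBuilder sb = new StringBuilder();
            for (int r = 0; r < 1000; r++)
                sb.AppendLine(r + "\t" + "row " + r + "\t" + (r * 1.5));

            Clipboard.SetDataObject(sb.ToString(), true);

            // Paste the whole block into Excel at the current selection.
            Excel.Application app = new Excel.Application();
            Excel.Worksheet sheet =
                (Excel.Worksheet)app.Workbooks.Add(Type.Missing).Worksheets.get_Item(1);
            sheet.get_Range("A1", Type.Missing).Select();
            sheet.Paste(Type.Missing, Type.Missing);

            app.Visible = true;
        }
    }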

Approach #4: Generate Excel XML document format
More recent versions of Excel, and particularly the most recent 2007 version, have extensive support for XML. By reverse-engineering the schema from an Excel-generated XML document, you can understand how to control pretty much everything. Then the problem becomes a much simpler one of how to generate XML in a quick way - something that has numerous solutions readily available on the web. I recently came across this article, where someone has taken the time to document this approach step-by-step.
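A minimal sketch of that generation step, using XmlTextWriter (the namespace and the mso-application processing instruction are what Excel emits when you save a sheet in the "XML Spreadsheet" format; everything else here is placeholder data, and a real workbook would carry more attributes than this):

    using System.Text;
    using System.Xml;

    class XmlSpreadsheetWriter
    {
        const string NS = "urn:schemas-microsoft-com:office:spreadsheet";

        // Sketch: stream a one-sheet XML Spreadsheet 2003 document straight to disk.
        static void Write(string path, string[][] rows)
        {
            using (XmlTextWriter w = new XmlTextWriter(path, Encoding.UTF8))
            {
                w.Formatting = Formatting.Indented;
                w.WriteStartDocument();
                w.WriteProcessingInstruction("mso-application", "progid=\"Excel.Sheet\"");

                w.WriteStartElement("Workbook", NS);
                w.WriteAttributeString("xmlns", "ss", null, NS);
                w.WriteStartElement("Worksheet", NS);
                w.WriteAttributeString("Name", NS, "Report");
                w.WriteStartElement("Table", NS);

                foreach (string[] row in rows)
                {
                    w.WriteStartElement("Row", NS);
                    foreach (string cell in row)
                    {
                        w.WriteStartElement("Cell", NS);
                        w.WriteStartElement("Data", NS);
                        w.WriteAttributeString("Type", NS, "String");
                        w.WriteString(cell);
                        w.WriteEndElement();   // Data
                        w.WriteEndElement();   // Cell
                    }
                    w.WriteEndElement();       // Row
                }

                w.WriteEndElement();           // Table
                w.WriteEndElement();           // Worksheet
                w.WriteEndElement();           // Workbook
                w.WriteEndDocument();
            }
        }
    }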

Approach #5: Use a pre-packaged solution
The last time this topic came up, someone recommended Spreadsheet Gear. This is a (not free, not open) component that can be used to generate Excel documents without even having Excel available on the server. It looks like they have basically reverse-engineered the entire document format (the legacy proprietary one, not the new open XML one). Then they've put a much more friendly and performant .NET API on top of that, and added more value around it. I haven't spent a lot of time looking at this yet, but anecdotal evidence on the site claims impressive performance. I'm looking forward to getting a license through our company (it's pretty expensive for what it does!), and then I'll be able to say something more concrete about this - but it definitely looks promising.

Approach #6: XML and Scalability
Assuming the future direction of these document formats is XML, that should open up a new range of possibilities. One idea that comes to mind is to leverage the open format to create a scalable reporting platform. Imagine you really do need to generate a ridiculously large XML-based Excel report. And imagine that the time involved in generating this report is not only document generation, but also back-end number crunching and data processing. With the normal current approaches, you would probably break these two aspects apart, so that the processing bit could be scaled and optimized - while the document generation could be isolated in a non-scalable way.

But, if you can use an XML format, perhaps you could divide the work up in such a way that you could have a scalable system, with many "workers" producing individual pieces of the report, perhaps across many virtual servers. You would end up with many individual XML fragments, and then you could have a final component that pieces everything back together and streams out the final report. This approach would have the benefit of simplifying the architecture of the workers - you wouldn't need to break up the processing bit from the document generation bit. And the scalability should only be limited by the power of your XML parser.
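Purely as a sketch of that stitching idea (header.xml and footer.xml would hold the opening and closing workbook tags, and the fragment files are whatever the hypothetical workers produce - none of this maps to an existing system):

    using System.IO;

    class ReportAssembler
    {
        // Sketch: stream worker-produced row fragments between a common header and footer.
        static void Combine(string[] fragmentFiles, string outputPath)
        {
            using (StreamWriter output = new StreamWriter(outputPath))
            {
                output.Write(File.ReadAllText("header.xml"));     // opening Workbook/Worksheet/Table tags

                foreach (string fragment in fragmentFiles)
                {
                    using (StreamReader reader = new StreamReader(fragment))
                    {
                        char[] buffer = new char[64 * 1024];
                        int read;
                        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
                            output.Write(buffer, 0, read);         // stream, never load a whole fragment
                    }
                }

                output.Write(File.ReadAllText("footer.xml"));      // closing tags
            }
        }
    }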

Monday, January 29, 2007

Big Chief Golf

One of my longest-running side projects is a website called BigChiefGolf.com. This website provides a free service to golfers, helping them to maintain their golf scores and view a trend-line of handicap improvement over time.

The most amazing thing about this service is that it has been up and running for almost 15 years in various forms, with amazingly little maintenance or revision. There have been a couple of major phases of revision, first from QuickBasic to Visual Basic, and then to ASP.Net.

I recently got it up and running on ASP.Net 2.0, which was an interesting experience. While porting the majority of libraries and even Windows apps appears to be fairly straightforward, porting the ASP.Net pages was particularly challenging. Firstly, while everything else seems to work fine in some sort of compatibility mode between 1.1 and 2.0 - i.e. I'm able to just use the old 1.1 binaries in many cases - the web stuff absolutely would not work. Then even after recompiling everything else to 2.0, the web stuff still did not work. I think the first reason for this was that they have changed the name of an important page-level attribute from "CodeBehind" to "CodeFile". Here's an article on MSDN that discusses this. Without this the page can't even compile under 2.0, and so an IIS error occurs right away if you try to access the page.

The second, and more substantial, porting task is to change all of the code-behind files to partial classes - the new 2.0 model uses partial classes instead of inheritance to tie the code-behind to an ASP.Net page. This involves, obviously, putting the "partial" keyword in the definition of each code-behind class. Then you need to remove and clean up certain things - in particular, the declarations of all controls should just be removed. With the partial classes approach, those controls are already declared in the portion of the class generated from the ASP code. This is much nicer for new development, as it leads to simpler code and less of it, but it certainly made porting a chore.
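To make that concrete, here's roughly what a ported code-behind ends up looking like (the class and control names are made-up examples, not the actual site code):

    // In the .aspx itself, the old 1.1 "CodeBehind" attribute becomes "CodeFile" in 2.0:
    // <%@ Page Language="C#" CodeFile="ScoreEntry.aspx.cs" Inherits="ScoreEntry" %>

    using System;

    // 2.0-style code-behind: a partial class. The control fields that used to be declared
    // here by hand in 1.1 (e.g. "protected TextBox ScoreBox;") are simply deleted, because
    // the compiler-generated half of the partial class now declares them from the markup.
    public partial class ScoreEntry : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            // ScoreBox and HandicapLabel come from the generated half of the class.
            HandicapLabel.Text = "Latest score: " + ScoreBox.Text;
        }
    }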

One final thing I had to spend a good amount of effort to fix was to reformat a lot of my HTML as conformant XHTML - i.e. closing my <img src="" /> tags, <br> tags, etc. When I compiled the site initially, after everything else was compiling OK, I was left with hundreds of these schema validation errors. I believe there might be some way to turn off this particular level of validation and remove these errors without fixing anything. But I figured I might as well go ahead and get it done, to ensure my code would compile on any computer with VS.Net, even with the default settings.

Saturday, January 27, 2007

WebServices + AJAX

One thing I've never understood is how there can be so much hype about AJAX, and so much hype about Web Services, and yet no one seems to realize these two are naturally suited for each other. Web Services provides an open XML way to send data over the HTTP protocol, and AJAX provides a client-side mechanism for making HTTP calls to ask for data. What more does it take to make it obvious that these two are made for each other?

Well, judging from all of the amazing open-source libraries out there supporting AJAX, it seems clear that no one on the AJAX side is thinking about this approach, because those libraries are not built with XML-formatted WS compatibility in mind. It seems like AJAX is all about using raw text data formats, or just receiving and replacing HTML content directly - anything other than a well-structured format made for transferring structured data properly.

So I've decided to take some of the best of what I've found out there, and extend it to support the WS/XML approach in a more developer-friendly way. In a nutshell, I want to be able to put an include on my page, and then have some simple javascript functions that do display logic, by essentially accessing Web Services and handling the results as objects, rather than doing my own XML generation and parsing. Then an added bonus would be to have some binding capability, to automatically bind requests and responses to certain controls on the page, without having to actually write a lot of javascript to do the binding and processing logic myself.
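On the server side, what I have in mind is nothing more exotic than a plain ASMX web service, something along these lines (the names and logic here are purely illustrative):

    using System.Web.Services;

    // A plain ASMX web service that page-side javascript could call over HTTP,
    // getting back structured XML instead of pre-rendered HTML fragments.
    [WebService(Namespace = "http://example.com/demo/")]
    public class HandicapService : WebService
    {
        [WebMethod]
        public double GetHandicap(string playerId)
        {
            // Placeholder logic - a real implementation would hit the database.
            return 12.4;
        }
    }

The client-side piece then just has to POST to the .asmx endpoint and turn the returned XML into javascript objects - which is exactly the part I want to wrap up in a reusable include.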

There are a lot of AJAX libraries and approaches out there, so it would be impossible to really highlight them all. I'm a big fan of the Anthem framework. This framework seems to replicate the "developer feel" of traditional ASP.NET server-side controls, and so it feels more natural from a developer's perspective than, for example, the new .NET framework's AJAX controls. But even the Anthem framework seems to be focused on raw html replacement, rather than a more structured approach to passing data back and forth.

One article I really liked was this one on CodeProject, about integrating the jQuery javascript library for AJAX on an ASP.NET page. It's very simple, and allows me to focus on the WS aspects for now, without having to think about all of the added layers of ASP.NET control libraries. Those libraries add a great nice-to-have set of developer features, but I'm inclined to go back and start from the ground up, focusing first on a more solid communication infrastructure. Once we've got the passing back and forth of data working in a structured WS/XML approach, then we can try to adapt the control library facade to make development easier.

I've already got some code working with WS as an internal communications protocol for a very raw AJAX approach. Stay tuned for a follow-up article where I'll present more progress on this idea.

Friday, January 26, 2007

Against Web Services

I wrote this up last year, as we were just beginning a phase to re-architect a core piece of the system. New resources had just been brought into the team, and so lots of ideas were flying around, some very good, and some not so good.

One big idea being pushed was to change everything to run as Web Services, in an attempt to create a sort of SOA architecture. Now, in principle I certainly wouldn't be against trying out the next big architectural paradigm. But in this particular case, we were dealing with a 10-year-old legacy system, with much more fundamental issues to work out. So I was convinced that throwing a new architectural paradigm into the mix - especially one that was, at the time, a very cutting-edge, not yet industry-proven idea - would be a real deviation from the right path.


Why we shouldn't use Web Services in this phase of Re-Architecture:

I've been trying to avoid going into a long, drawn-out discussion about these points, because, as you'll see when you get to the bottom of this, the answer should be pretty obvious to anyone who fully understands the problems we've faced in the past and are trying to address now. In fact, when an answer is this obvious, it is very hard to come up with a comprehensive argument for it, because most of the assumptions are just taken for granted. But I've tried to really sit down and enumerate all of the important points to consider, and to draw a reasoned conclusion from them.

I'm not really looking for a point-by-point debate on each of these points. For me, the real point is that there are clearly a number of problems with the WebServices approach for our team - each of which could be debated individually - and the fact that there are so many enumerable problems here, and probably many more I haven't considered, should lead us to conclude that there's no reason to introduce so much risk into the project without some clear advantage.

I'll try to break my discussion up into two main categories: Support and Technical.

Support Reasons:

Because WebServices run in the IIS ASP.NET context, they are managed as part of IIS. This means that everyone on the development/testing team would have to become an IIS expert overnight. Given that I don't think we have a great breadth of experience with IIS, and especially with ASP.NET, this is a major learning curve for the team, and would require significant time for everyone to get trained up.

The need to ramp up on IIS with ASP.NET would extend to support people as well. We would have to convert our entire application support team into a web management team. This would mean significant restructuring of and turnover in the support team, probably having to replace or augment a good chunk of it. Currently our support people are experts primarily in AutoSys, and secondly in MQ - but from the user side only, not the system management side - because:

MQ is supported (and centrally hosted/managed) by the Middleware team. AutoSys is supported, hosted, and managed by the Scheduling Management team. The WebServices, on the other hand, would need to be hosted on our own servers, and thus supported (at a system management level) by our own support team. There is no centrally managed hosting of WebServices, because with WebServices we are essentially combining both our communication and application management technologies with our actual application components, so that it is all running in one process on a single machine.

The only support offered to help us with WebServices would come from the Web Support team - but clearly this is not our first-line support to guarantee reliability of the system. This would be a sort of second-line support that we could call on specifically if we are diagnosing an IIS problem - similar to the kind of support we get when we are diagnosing a DB problem. But the DBAs certainly do not replace the role of our first-line support people in supporting our application - they perform tasks very specific to the DB. And even the DB is something hosted and managed centrally for us. Similarly, the Web Support team would not give us any kind of comprehensive management of IIS - they are just there to help troubleshoot when we have issues managing our own infrastructure. They are certainly not going to perform the level of monitoring and performance tuning on our own servers that the Middleware team provides on the MQ servers.

In this case, we are talking purely about taking ownership of all aspects of the system, and not taking advantage of any centralized management or hosting for any of it, other than managing the physical boxes. Thus, we would become responsible for (1) guaranteeing the reliability of our communication infrastructure and (2) the application management infrastructure. With MQ and AutoSys, both of these are services provided to us by IT, by groups who are much more focused on the performance of those infrastructural pieces. We would essentially be giving up those benefits.

There is no out-of-the-box centralized process management solution for WebServices, in a way that would be analogous to AutoSys. Our support people clearly need a mechanism for centrally viewing and managing all services across all boxes (as do our testing and development people, really). It would also not be possible to use AutoSys in any meaningful way with WebServices, as a process management tool. So we would have to build something to meet this need for WebServices. Given the complexities introduced by being integrated with the IIS ASP.NET process (see more detail in my "Technical Reasons" comments), it would be very challenging to put together a framework/tool to manage the individual services being hosted in a WebServices environment.

There are too many risks and unknowns in the WebServices approach for our team right now. The focus of our current project should be making the application better, by taking the next evolutionary step in re-architecting the way we manage our systems, leveraging all of the collective experience we've gained and problems we've found with our previous approaches. The focus should certainly not be embarking on a crusade into unknown territory to try out a new technology for the fun of it. Certainly, that could be a future phase. But do we really want to be spending so much of our time right now on learning this entirely new technology, dealing with it, supporting it, and preparing ourselves for the risk involved with all the unknowns? We now have extensive experience with MQ, and know exactly what is involved in running our current applications on MQ and porting further applications to MQ. However, we have absolutely no experience with WebServices, and we have not spent any time thinking about all of the unknowns that may come up in the development, testing, and support phases. Of course, there will be new ground that the application will cover for the first time, and there will be new support responsibilities - but shouldn't those be well justified as providing something necessary? Here we are discussing introducing what would surely be the most risky and radically new change to our infrastructure, with no clear benefit for the problems we are specifically trying to address that could not also be achieved by something much less risky and much better known, such as MQ.

Technical Reasons:

There is no inherent guaranteed communication protocol when using Web Services - it is essentially like TCP in this regard - a method gets called directly, and if your server is not available (due to reboot, restart, crash, or even a temporary 10-millisecond network glitch), an error occurs, or the transaction may be lost in the middle if the server crashes, etc. It is, of course, possible to implement a strategy to overcome this limitation, with things like retries, re-requests, re-synchronization, custom heartbeats, etc. - however, this starts to look very similar to a lot of the inherent database problems we are trying to address now - think about all of the issues that have to be addressed in the DataLoader in order to provide a level of guaranteed processing to the database - there are similar concepts there, like retries. So, the point is, you can overcome these limitations with a lot of work, but why would you want to if you have another solution that avoids the problem altogether?

Everything runs in the IIS ASP.NET context. This introduces quite a few issues for the kind of services we run:
  • In order to restart one service, you typically have to restart all services on that server. For example, that means it would not be possible to restart ServerA without also restarting ServerB and ServerC, assuming you run all those services on the same server. This would cause a major halt in the workflow if you are trying to restart something while you have processing underway. This is a common scenario for us in current systems (restarting a particular service for some reason), and I think we would be naive to imagine this is going to magically go away.
  • A crash/corruption in one of our services could potentially corrupt the state of all other running services. Again, this would result in the restarting of one service potentially necessitating a restart of all services.
  • To get around this problem, you would have to manage multiple instances for different services. This means an even more complex infrastructure to manage on each server.

Running in the ASP.NET context introduces a number of technical uncertainties in this company's environment. I personally have a small amount of experience with this, having built our existing web applications - and I would wager the rest of the team has absolutely no experience with this and thus no idea what may come up. To provide a few specific examples of where we have spent extensive time with our existing web apps working around these limitations:
  • Accessing databases requires a specific security context, meaning you either have to tweak the ASP.NET user context, or you have to implement custom impersonation (a sketch of that impersonation pattern follows below this list). Our web app has extensive impersonation logic to deal with this issue, and I would not wish to have to deal with any of that in a component of this application, if it can be avoided.
  • Accessing MQ from a WebService would similarly have security issues, and similarly require impersonation, and potentially conflicting credentials with those needed for the database access.
  • Accessing other services such as remoting, windows management instrumentation, or remote processes requires further work involving complex machine configurations and low-level windows local policy settings manipulation.
  • Simple things like accessing certain files in certain locations (i.e. D: drive shares, dynamic loading of dlls in different locations) can become problematic depending on permissions settings. In setting up our existing web applications on a new machine, a fair amount of time is usually dedicated just to resolving some of the dynamic plugin DLLs and dependencies, as loading them in the ASP.NET context is more problematic than loading them in a normal .exe process context.
  • Differences between Win2K and Win2003 and between workstation and server (in terms of ASP.NET permissioning models - the two systems don't even use the same users!) mean that things developed on our local machines typically have numerous problems when we port them to servers. As an example, WebBatch still does not fully work on any Win2003 server because we haven't been able to address some of the policies and permissioning issues for certain pieces of functionality - and we've spent several days on that problem alone.
So I don't mean to say I anticipate having all of these exact same issues again, but just to point out the kind of uncertainties that pop up when you are dealing with the complexity of a managed permissioning context model like IIS and ASP.NET. It's not like normal .NET applications where you just xcopy to deploy and then double click to run, and there are hardly ever issues. It's back to a model like VB where a significant number of tweaks are necessary just to get components installed correctly and up and running on a new system, and the things you have to do are different between our win2k dev workstations and the win2003 servers. Who knows if these problems will creep up when we are trying to integrate with our databases, or any other random integration point we haven't considered? We can't possibly anticipate where we will have the big problematic areas when we try to deploy application components, but why would we possibly want to add this many unknowns to the project if it's not necessary?
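For reference, the impersonation workaround mentioned in the first bullet above is roughly this shape (a generic sketch with constants and error handling stripped to the bare minimum - not our actual code):

    using System;
    using System.Runtime.InteropServices;
    using System.Security.Principal;

    static class ImpersonationHelper
    {
        [DllImport("advapi32.dll", SetLastError = true)]
        static extern bool LogonUser(string user, string domain, string password,
                                     int logonType, int logonProvider, out IntPtr token);

        const int LOGON32_LOGON_INTERACTIVE = 2;
        const int LOGON32_PROVIDER_DEFAULT = 0;

        // Switch the current thread to a specific Windows account, because the default
        // ASP.NET worker account can't reach the database or MQ. Caller must call Undo().
        public static WindowsImpersonationContext Impersonate(
            string domain, string user, string password)
        {
            IntPtr token;
            if (!LogonUser(user, domain, password,
                           LOGON32_LOGON_INTERACTIVE, LOGON32_PROVIDER_DEFAULT, out token))
                throw new InvalidOperationException("LogonUser failed");

            return new WindowsIdentity(token).Impersonate();
        }
    }

Every database or queue access inside the WebService would then have to be wrapped in an Impersonate()/Undo() pair - which is exactly the kind of plumbing I'd rather not have to carry into this application.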

WebServices in ASP.NET also introduce a much more complicated set of security issues. They normally run on the HTTP protocol, which is the most notoriously insecure, open, exploitable, and broadly-permissioned communications protocol. They also run on IIS, which is the most notoriously insecure, open and exploited web server. This is exactly the opposite of what we should be thinking of for an internal system with tightly coupled services, which should be communicating in a simple and secure manner. Clearly, this company has numerous levels of firewalls and security layers, but the fundamental point is still there - this is introducing an exploitable hole in the architecture where there was none before. Managing the security of each WebService on each individual server is going to become a very complex task, and will equate to a full-time job for another resource on the team. On the other hand, MQ is fully and simply secured, and the security is managed by the middleware team. I'm not that worried that we are at major risk of putting an exploitable system out there - but this would certainly be another on the long list of things we have to think about and plan for, and it's still not clear what the advantage would be.

There are significant barriers to taking a "generic" load balancing solution with WebServices and converting it to an "intelligent" load balancing solution, once we arrive at the need to do so. WebServices are very good at working with hardware balancers to balance load purely based on generic load decisions like cpu and memory utilization, etc. However, they are very bad at working with a more intelligent software load balancing scheme where you might want to have configurable control over exactly which request goes to a particular instance of a service on a particular machine. WebServices may be one standard out there for generically load balanced solutions, but they are certainly not a standard out there for intelligently balanced solutions, and I've never heard of anyone trying to implement any kind of intelligent balancing with a WebService. This is an area where MQ excels - a solution can be built initially for a generic scenario, and then adapted to an intelligent scenario later, and MQ works well in both.

There is no inherent transactional nature in WebServices, in the sense that we do transactions with MQ. We are able to use MQ transactions to synchronize a potentially complex (but quick) operation like receiving a request into the data loader via a message, beginning a transaction to read that request off the queue, writing the data to the database (in a database transaction), and then completing the transaction on the queue in a way that is synchronized with the success of the database transaction - i.e. if the database transaction fails, roll back the transaction on the queue, or even roll back to the end of the queue or to a different queue to redirect the work somewhere else. This is exactly what we already do, for example, with the XML data loader in stress testing. Transactional mechanisms could certainly be implemented in a custom way on top of a raw WebService protocol, but they are not inherent in the technology, and so this is just one more thing we would have to worry about implementing (and implementing correctly, since it is such a fundamental piece of our component interoperation).
* We subsequently uncovered that there was some work being done on an approach for Web Service enterprise transactions - but this was apparently still at a very experimental stage at the time.
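For contrast, here's a rough sketch of the MQ pattern described above, written against the IBM MQ classes for .NET (the queue objects and SQL are placeholders, and error handling is stripped down):

    using System.Data.SqlClient;
    using IBM.WMQ;

    class DataLoaderSketch
    {
        // Read a request under syncpoint, write it to the database, and only then commit
        // the get from the queue; if the database write fails, back the message out.
        static void ProcessOne(MQQueueManager qmgr, MQQueue requestQueue, SqlConnection conn)
        {
            MQMessage message = new MQMessage();
            MQGetMessageOptions gmo = new MQGetMessageOptions();
            gmo.Options = MQC.MQGMO_SYNCPOINT;

            requestQueue.Get(message, gmo);          // message is now held under syncpoint
            string payload = message.ReadString(message.MessageLength);

            SqlTransaction dbTran = conn.BeginTransaction();
            try
            {
                SqlCommand cmd = new SqlCommand(
                    "INSERT INTO Requests (Body) VALUES (@body)", conn, dbTran);
                cmd.Parameters.AddWithValue("@body", payload);
                cmd.ExecuteNonQuery();

                dbTran.Commit();
                qmgr.Commit();                       // message permanently removed from the queue
            }
            catch
            {
                dbTran.Rollback();
                qmgr.Backout();                      // message goes back on the queue for a retry
                throw;
            }
        }
    }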

WebServices do not offer us anything in terms of performance improvement over any other technology - we are using MQ successfully in other systems, and while we do have perf issues in those systems related to the DB, dataloader, etc., I certainly don't think we've ever noticed any significant performance issues resulting from MQ. We know that other teams have, but this is more due to improper usage, rather than any inherent MQ problem. Thus, the whole idea of using WebServices to address a performance problem we don't actually have should not really come into consideration - we know where the perf issues are with this application, and they are certainly not in any kind of communications layer. We shouldn't be trying to address problems that don't exist. The whole issue of SOAP performance vs. MQ performance is debatable in any case, but this should not be our guiding factor - it's like comparing flat files vs. a database based purely on performance aspects. These are two entirely different ways to address communication, and are designed to meet different needs. Assertions that WebServices should be applied to "anything you can think of" are incorrect - there is a place to use them, and there is a place not to use them.

WebServices is not the only "standard". The idea that WebServices is a "standard" or that it is built on "standards" is being thrown around as a single justification to use it. MQ is also a standard in this company, and also throughout the industry - actually it is a much more well-established standard than web services, as it has been around much longer, and I'm sure that if you did a comprehensive survey in the company, you would find many more projects using MQ than using WebServices. In any case, the fact that something is a standard does not make it right for solving every problem. Again, each standard or technology has its place, and there are also areas that are not good for some standards. The only right way to use the argument of something being a "standard" is to argue about it being the best standard to solve a *specific problem* - here we have not identified any specific problem that WebServices might be better than other standards at solving, and in fact, given all the evidence I've presented above, I think it's safe to say the opposite - WebServices, while being a nice standard, is not the right standard to solve the problems we are trying to solve.

WebServices is not the only way to implement load balancing, nor is it the best way. There are many ways to implement load balancing for MQ, Remoting, SmartSockets, or any other communication protocol. One of the simplest possible ways is exactly what our other applications are already doing with MQ - just sharing a single request queue across N instances - this results in an easy, no-hassle, and optimally efficient load balancing mechanism. Each server is guaranteed to pick up a new message when and only when it is freed up - so optimal load balancing is achieved by simply letting the individual clients decide on their own when to pull a message. This is not to say that the model we have used for our other applications is the model that we must necessarily use for our own load balancing, but just to show that if our goal is either simplicity or optimal balancing, there are even better ways to achieve this than what would be offered with a WebServices / hardware load balancer solution - because that is already more complex. Additionally, if there are additional goals to accomplish with load balancing that might be better solved by a hardware balancer, then as we have already discussed, a hardware balancer can also be used with MQ, for example. There is nothing inherent about WebServices that should make it a special case for load balancing that cannot be achieved as effectively with some other communication mechanism.

Summary: They don't offer anything special to us! So why the extra hassle? If we have identified a number of support and technical issues that may or may not be addressable, but there is no clear advantage over any other communication protocol, why are we even considering all of the additional risk to the project? There is most likely an answer to each point that has been brought up above - whether that is to extend the standard WebServices model with something additional for support, or to write code to implement a custom guaranteed protocol on top of the infrastructure, or to apply further tweaks to the design - but the point is, why would we want to spend one iota of our time, or take on any of the risk introduced by this new and unfamiliar territory, when there is no clear benefit other than using something new and cool?

As an additional point:

Other teams in this area of the company do not think this would be a good idea. I've been around to chat with the tech leads of a number of the major projects in this area, and while these projects do have diverse opinions about technologies and approaches, the consensus regarding WebServices is very consistent - it would not be appropriate for any kind of internal component-to-component communication, and should only be considered for external connection points between disparate systems. Even in the case of external connection points, it should not be considered as the only possibility, but rather as one of a number of possible solutions, depending on the nature of the connection. For example, the connection point between two of our internal sub-systems is probably more tightly coupled than would be appropriate for WebServices. But all teams agree - appropriate technologies for internal tightly-coupled connection points (i.e. component to component) inside a system could be MQ, SmartSockets, Remoting, or raw TCP - but certainly not WebServices. Here's a summary of teams I've spoken with, and the approaches they are using:
  • 3 projects using SmartSockets for internal communications
  • 2 projects using MQ for internal communications, and 5 projects using MQ for external communications
  • 1 project using DCOM for internal communications
  • 5 projects using a file-feed approach to external communications
  • 3 projects connecting to external systems directly through the DB, but with intentions to move to MQ
  • 2 projects using WebServices internally, and 1 using them externally
The interesting thing to note is that all tech leads were very consistent in their views on when it would be appropriate to use WebServices vs. when to use MQ, and all agreed that for the discussion of WebServices as a framework for components in our application, it really would not make any sense. This opinion was even shared among those teams that are using WebServices extensively. It is also interesting to note that those are the two projects that also have a heavy focus on an ASP.NET website GUI, and the WebServices they are building are relatively simple components that are not really meant to be distributed, but rather to provide the business logic behind their web site. Obviously, because they are already fully dependent on ASP.NET for the rest of their application, they don't see any disadvantage from a support or development perspective with respect to continuing to use ASP.NET for underlying business components - and in fact this is a benefit for them. However, for an application that is not already very tied to ASP.NET via a web user interface as the primary piece of functionality, they all agree that there would need to be a very compelling reason to move to WebServices.

Interestingly, even for WebServices as an external communications mechanism, most teams agree that the decision still needs to be weighed against the currently accepted company standard for this type of interaction, which is still MQ. In fact, the teams that are currently using Feed and DB mechanisms for connecting to other systems are all considering MQ, rather than WebServices, as the best alternative. One other project in particular is fairly set on that approach, and they are receiving a lot of guidance directly from a key individual in the CTO group. They are confident that if we go to the CTO group with this WebServices idea, we would probably not have an easy time convincing them.

References:

Finally, I would offer a few external references (just from the first few results that pop up in Google) to show that these are not just ideas I'm inventing, but well-known issues in choosing between a Web Services approach and an MQ approach. Some of these articles are also several years old, which demonstrates that this is not some new school of thought on the frontier of innovation - these concepts are well established and have been around for quite some time now:

http://expertanswercenter.techtarget.com/eac/knowledgebaseAnswer/0,295199,sid63_gci984269,00.html
When wouldn't I recommend Web services in this scenario? Web Services are currently lacking two key capabilities that can be a strong point of a message queue server. First, they have no inherent support for transactions. Most message queue systems have good transaction support and even support transactions across multiple data sources. If this type of cross-application transaction support is key to your implementation, then choose a message queue service.

Second, message queues allow guaranteed delivery. In their current incarnation, Web Services do not. Since messages can be sent asynchronously (fire and forget) with a message queue service, it is vital that the message be guaranteed to eventually arrive at its destination. Hence the idea of message "queues".

In the end, you need to carefully delineate your business needs and prioritize them. Include not only the technical details, but also keep in mind maintenance, support and Total Cost of Ownership.
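
To make the guaranteed-delivery and transaction points above concrete, here is a minimal sketch of a transactional, recoverable send to a local MSMQ queue using System.Messaging. This is my own illustration rather than code from the referenced article, and the queue path is hypothetical.

using System.Messaging;

class QueueSendExample
{
    static void Main()
    {
        // Hypothetical local private queue, created as transactional.
        const string path = @".\private$\orders";
        if (!MessageQueue.Exists(path))
            MessageQueue.Create(path, true); // true = transactional queue

        using (MessageQueue queue = new MessageQueue(path))
        using (MessageQueueTransaction tx = new MessageQueueTransaction())
        {
            tx.Begin();

            Message msg = new Message("new order payload");
            msg.Recoverable = true; // persist to disk so the message survives restarts

            queue.Send(msg, tx);    // fire-and-forget: the sender does not wait for the receiver
            tx.Commit();            // either the message is durably queued, or nothing happens
        }
    }
}

A plain WebService call over HTTP, by contrast, is synchronous: if the receiver is down the call simply fails, and any retry or durability logic is ours to write.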


http://weblogs.asp.net/ahoffman/archive/2004/03/10/87051.aspx

Remember that orthogonal component plumbing provided by either the platform or yourself, is an implementation issue - one not relevant to a specific architectural technology. It applies equally to traditional business logic within a process, as to business logic that implements an "asmx" type web service.

The choice of whether you "scale up" or "scale out" will depend on your particular circumstance and requirements.
If your business logic is simple, stateless and largely based on interacting with the data tier, you could scale through an application farm. But if your requirements are processing intensive and complex, the most performing solution is likely to be a long living logic block that maintains state - one to which you might apply enterprise services - one that might itself be the implementation for a service.

Microsoft does not propose that business logic call other logic in the same process through XML based messaging. How efficient would that be? Rather, a service oriented architecture remedies the failure of distributed object technologies (like COM, Corba or RMI) to provide seamless program integration outside of a process (or application domain).

http://weblogs.cs.cornell.edu/AllThingsDistributed/archives/000120.html

Web services are frequently described as the new incarnation of distributed object technology. This is a serious misconception, made by people from industry and academia alike, and this misconception seriously limits a broader acceptance of the true web services architecture. Even though the architects of distributed systems and internet systems alike have been vocal about the fact that these technologies hardly have any relationship, it appears to be difficult to dispel the myth that they are tied together.
....
A first thing to realize however is that the current state of web services technology is very limited compared to distributed object systems. The latter is a well-established technology with very broad support, strong reliability guarantees, and many, many support tools and technologies. For example, web services toolkit vendors have only just started to look at the reliability and transactional guarantees that distributed object systems have supported for years.



http://www.hanselman.com/blog/ClassicWebServicesVersusPOXXMLOverMQAreYouReallyUsingXML.aspx
For the transport layer, it ultimately will come down to what you are most comfortable with in your enterprise. The advantage of sticking to SOAP over HTTP is that the tool support (both development and management) will be much stronger. While web services are in theory supposed to be transport-neutral, HTTP is certainly the "first among equals" choice. MQ is a good choice when you need to support guaranteed messaging or if your own testing has demonstrated that you have higher scalability and/or throughput for the volumes you are seeing.

http://www.financetech.com/featured/showArticle.jhtml?articleID=14702692

That's because issues like security, non-repudiation (both parties are sufficiently authenticated so the transaction cannot be disavowed), redundancy (a message will continually be sent until its receipt is verified), transport (http vs. MQ Series, etc), transaction semantics (do both functions or neither), and workflow (gather information X from this application and information Y from the next) have yet to be adequately solved by Web-services architects, says Carey.

Thursday, January 25, 2007

Waterfall 2006

The latest fad for techies seems to be this new Waterfall 2006 site, appropriately scheduled for April 1, 2006.

I first heard about this last week via some of my company's internal chat rooms, and since then it seems to be growing at an impressive rate, with more and more contributors presenting new material. Some of my favorite headlines include:
  • Eliminate Collaboration: Get More Done Alone
  • User Stories and Other Lies Users Tell Us
  • Pair Managing: Two Managers per Programmer
  • Unfactoring from Patterns: Job Security through Unreadability
Probably the best article so far is Major SDLC phases: Development Driven Development and Test Driven Testing.

Now, maybe I'm just slow, but at first I was honestly a bit confused about whether this was a joke or not! Let me explain: I did go through extensive "waterfall education" back in my days in the software engineering master's program, but I'd like to think I've moved forward with the rest of the industry since then.

But then, reading through the headlines, it's amazing how many of them still have a familiar ring! Yes, titles like Refuctoring pretty much give it away, but others, such as Testing: Saving the Best for Last, could almost be our team's current motto. Eerily familiar phrases such as "Why write tests for code you're not likely to get wrong?" can make it hard to distinguish reality from fiction.

Wednesday, January 24, 2007

Revisiting Personas

I opened this blog back in 2001, when the blogger.com service had just started to become well known. At the time I had no idea what blogs were, but they were a big buzzword, so I figured this was the next big thing.

Since then I've never really had a chance to come back and get my blog started. Well, it's about time. And what better topic to start with than Personas, probably my best-known piece of work to date. Ironically, this wasn't my master's thesis, or even part of my master's program - it was a small essay I wrote during my undergraduate CS373 Software Engineering course, back in 1998! Since then, I've continued to receive feedback from new students letting me know that Personas is still alive and kicking.

I think it's time to revisit the Personas concept, and to look at how it fits with modern software design methodologies and development lifecycles.

First, let’s recap the original ideas. I borrowed the Personas concept from a book by Alan Cooper, The Inmates are Running the Asylum. The idea is that rather than following a generalized approach to analyzing requirements, a more targeted approach should be used. This involves defining a set of Personas - fictional users, each with specific, well-defined characteristics - on which design decisions can be based.

The motivation behind this idea is that we often try to meet the needs of the "average" user, which results in "making everyone happy some of the time" - but with software, that's not good enough. With the Personas approach, we focus on "making some of the users happy all of the time", and combined with good decisions about the targeted characteristics of the Personas, this can result in making the majority of users happy with most or all of the software.

It's amazing to think that, many years later, the most modern and popular software products still suffer from this problem of generalization - how many applications still seem to blindly target the generalized needs of some average user, while not actually meeting the needs of anyone? One case that springs to mind was highlighted recently in an insightful post on the Joel on Software blog, where a usability problem was pointed out in a major feature of Windows Vista. It makes me wonder whether a Personas-style approach would have eliminated this problem.

Back in 1998, RUP and UML were very new concepts, and we were just starting to learn about them and put them into practice. Today UML has lost that new-acronym scent, but the use-case technique still pairs naturally with Personas, and the idea of creating a Persona storyboard offers a more targeted technique that can improve a standard use-case approach.

Has the more recent Agile Development approach rendered Personas obsolete? The Agile approach is in direct contrast to a planning-driven methodology, but there are aspects of it, including Extreme Programming, that have similarities to the Personas approach. XP encourages the use of note cards to engage in “story telling” – clearly this can benefit from a Personas approach that targets each story at specific user needs. As noted on the IBM website, one goal in XP is to “motivate the team to sit together in a room and talk about what the proposed system needs to do and how it needs to do it” – and this is exactly the idea behind the Personas approach.

In a way it seems that the newer Agile approaches are the next evolution of the old Personas idea – in fact, while that idea was almost revolutionary in 1998, today it can seem almost obvious. But new Agile approaches would benefit from looking back at the real motivations behind the Personas approach and attempting to incorporate those ideas. While an Agile approach can drive a more user-focused and rapid delivery process, the end result can still be wrong if it is based on user requirements that are too generalized. Moving from an “average user” analysis of requirements to a more targeted one ensures not only that results are delivered to meet some interpretation of the requirements in an “agile” manner, but also that those results are more likely to meet a targeted, and more correct, analysis of those requirements.

By the way, Apel Mjausson has a blog entry linking to my essay, and also to a number of other interesting papers on the topic. Also, there is a Polish site referencing my essay.