Pearls of Wisdom

Steve Harman had a post back at the beginning of the month about stuff you'd tell a young developer. It's a reaction to a similar piece by Jeremy Allison. It's an interesting topic, so I thought I'd waste a few pixels on it myself.

If it's not what you love, don't do it

I wouldn't generalize this to other fields, but for software development, I think this is a good thing to keep in mind. A lot has been made in the past by career counselors and other gurus about "finding your bliss" or similar nonsense. I think that's mostly a crock. You can be a good doctor, mechanic, or professor without liking what you do. Sure, those who love what they do will tend to excel at it and rise to the top of their field, but you have to ask yourself how much better off people at the top of a field are vs. those who are merely good enough.

That said, I think software development is somewhat unique as a career. Software development advances incredibly fast and keeping up with new technologies and ways of doing things takes a non-trivial amount of effort. You have to paddle hard just to stay even and that means experimenting and learning on your own. You could probably do fine if you have Arnold Schwarzenegger level drive, but short of that, loving what you do is about the only way to get you where you need to be.

Reputation is important

People who have been around the block know that the difference between developers can be extreme. Top developers are 5 to 20 times more productive than their peers. Given some of the complete incompetents I've known, I suspect that there's really no upper limit on that number.

Unfortunately, both Jeremy and Steve use this point to jump into the virtues of Open Source software. Open source contributions can enhance your reputation, but that has nothing to do with people being able to look at your code—it's more about working with other developers who can then form their own opinions of you as a developer. That's helpful, but limited.

A good reputation is someone talking about how well you did a job for them. It's your name coming up when someone asks "do you know any good developers?" A good reputation is more important than mere code sample availability and the wider your reputation can penetrate the better off you are.

How important is reputation? Let me put it this way: I haven't gotten a job from anyone who didn't know somebody I'd worked with since 1997. Your reputation and your skills are your only true security in this field.

Never stop asking why

The vast majority of people in any field are content to learn what they have to do to get the job done. Top-tier people are the ones who are asking pesky, even impertinent, questions all the time. Jeremy puts this under the heading of "Learn the architecture of the machine" but that's too limiting. Learn as much as you possibly can about why and how things come together to get the job done.

There's two parts to this, really. The first is that you don't stop just because you got something working. Find out why iterating backwards through an array fixed your bug. Look for the best event to use for control loading rather than just the first one you found that happens to work. Learning how to parse technical documents, navigate help files, and put a concise Google query together are key skills you'll need to master.

The second part of this is harder: if you don't know something people are talking about, ask. This becomes harder to do over time because you start believing your own hype. It's easy to believe that inner voice that's saying you can bluff your way through this conversation and look up anything important later. The thing is, the really bright people, the ones you most want to impress, aren't going to be fooled. I've also found that top technologists enjoy getting intelligent questions almost as much as they love answering them. Note the qualifier there and spend the mental effort to make your questions intelligent, though.

Your employer is more important than the community

This goes directly counter to both Jeremy and Steve and I'm going to tank any chance I have with digg or reddit, but it needs to be said.

Jeremy looks like he's coming from that staunch anti-corporation Slashdot tradition that talks in moral terms about the beauty of open source software and the depravity of Microsoft. Reality check: it is not a common goal of open source developers to create "the greatest, most beautiful work of art that all of you can create together." That's certainly not why I contribute to open source projects.

Open source developers are no more virtuous than corporate developers. While you may program for fun and the satisfaction of a job well done, the choice of project you work on is inherently a selfish one. Always. Whether you program for a paycheck, community recognition, or because you need working blog software that you can tweak when you want to your choice is based on internal needs, not virtues.

If they're both selfish, shouldn't I be saying that they're equally important? I don't think so and it's not just because a company will help support your family. The reason your company is more important is that your company has to compete in the market place. It has no choice. Because a company has to charge money for its products or services, the company is forced to provide things that other people want badly enough to part with their hard-earned cash. In practical terms, it means that a company has a reality check that can't be ignored because sooner or later they're going to have to go out there and prove themselves in the marketplace. While the tech bubble bursting was tremendously painful when it happened, imagine the crap pile we would have if it didn't burst. Ever.

In addition, companies force you, the developer, to do work that you don't want to do. Developers hate that, which is perfectly understandable. Being forced to work with unreasonable managers, clueless users, and hopelessly disconnected marketing weenies is a drag. If you're good, though, it's also a rich source of opportunities to tackle things that you might never have thought needed tackling. And it forces you to justify your dogma of choice to a group of supremely skeptical, self-confident people who have the welfare of their families on the line. Yeah, that's not fun, but it means that you'll have to learn to understand and articulate things you just know are true. With the added bonus that you'll find that they sometimes aren't so true after all.

Finally, be wary of advice from old timers

Not me, of course. My advice is nigh infallible. Just ask me, I'll tell you. If you go awry following my advice, it's probably because of errors in implementation.

Okay, maybe not. I've blogged about recent errors I've made, so I have to assume there will be others. Be careful about who you listen to. Be aware that everybody has their own hobby horses and pet peeves and it's a human trait to cram those prejudices into everything (particularly advice). Also be aware that internet communities trend towards consensus as much as any other human community does. Consensus is dangerous and will tend to mask assumptions and pitfalls. Develop a well-honed BS filter and use it liberally.

While you're at it, the sooner you learn to pass your own ideas and theories through a well-developed skepticism the better off you'll be.

25. April 2007 01:26 by Jacob | Comments (4) | Permalink

Winning Arguments

Have you ever found yourself in a situation where you know what's the best thing to do, but are unable to convince anyone else that you are right? Developers know that even simple problems have more than one solution. Developers who have worked on a team of more than one have probably been in a situation where they just knew that the team was heading in the wrong direction and that they had a solution that was more elegant, easier to program, and better to maintain.

Higher profile developers often find themselves trying to explain their solutions to non-technical people as well; sometimes before development has begun, but sometimes after it is truly too late to do anything different.

So what do you do when you have to butt heads with buttheads? Here is my secret to winning in these situations

Geek Whisperer

As a developer and/or manager of developers, it is astonishingly easy to find yourself trying to explain concepts to people who, however smart, just aren't equipped to understand what the heck you are talking about. This is an uncomfortable position for all involved.

It is tempting in these situations to pull out a call to authority and tell people that you are the subject matter expert and they simply have to trust your expertise. This works fine for the little stuff, but as the scope increases, so too does the reluctance to put up with this answer.

And with good reason. Technologists can find themselves literally with the fate of the company in their hands. Business leaders find that hard to take and are going to be extremely wary of any solution that involves the fate of their company.

How to Win With Business Folk

There is really only one way to handle this situation: concentrate on what is best for the company.

Everything you say and do in this situation should start and end with what is best for the company. The magic of this orientation is that you can defeat every argument and diffuse even personal attacks by leading back to what's best for the company. No matter how irrational the objection, how confusing the question, or how deliberate the ignorance, you'll lower tension and create space for a solution if you can turn back to what is best for the company. Even asking the question, "What course of action would be best for the company?" can be sufficient to redirect an out-of-control or confrontational situation.

Talk Amongst Yourselves

Developers tend to believe that they live in a rational universe—or at least that their work is based on the application of structured reason to provide solutions to defined problems. This belief is shared, more or less, among all developers—even those who believe themselves to be the only developer who is actually rational.

This makes it both easier and harder to solve disagreements between developers. The fundamental belief in reason tends to keep disagreements from being personal, so you'll find most developers are willing to discuss alternatives and the ramifications of design decisions.

Every now and then, however, it is possible to find yourself at an impasse with another developer (or group of developers). In an impasse, developers can become more intractable than business folk if only because you lose the ability to appeal to consensus. Rational is rational just as right is right. The best rational choice remains the best rational choice no matter how many delusional fools find themselves unable to comprehend it.

How to Win With Technical Folk

There is really only one way to deal with this situation: concentrate on what is best for the company.

When two (or more) sides have presented what they feel is the most rational solution to a problem and remain at an impasse, the source of contention almost inevitably lies in unstated assumptions. In this situation, asking the question, "Why is this solution better for the company?" helps to ferret out those assumptions. Some disagreements will need to apply this question repeatedly as it is not uncommon for assumptions to be stacked on other assumptions. Ferreting out the source of your assumptions can be tiring, but in the end well worth the effort.

Open Sores

But what if you aren't developing as a part of a company? The principle remains the same as does the effect. Every open source project has a problem that it is trying to solve and people working on that project will tend to be there because they believe they can help solve that problem. Leading disagreements back to the reason of a project's existence can help straighten out priorities and direct solutions back to the original problem domain (and away from the individuals involved in the discussion).

Winning Everywhere!

Is there application for this question outside of geekdom? Darn skippy there is. In fact, asking "what is best for the company?" is a technique I learned at a leadership conference when I was more manager than developer. It turns out that any time people group up, they'll tend to have a reason for doing so. In fact, "tend to" == "always" in this case. The reason can be explicit or implicit, but people simply don't group up for no reason. Learning to find that reason and drive disagreements back to it helps keep you focused and gives you the opportunity to pick your battles.

A Warning

Asking what's best for your company/group/whatever will often show you the flaws in your own arguments, particularly as you first start applying the question. If you want to win for personal egotistical reasons then this technique isn't going to do you much good.

In fact, ego can destroy the benefits of this technique more generally. Depending on how much power the person with the ego has in the organization, their presence can significantly alter the ability of this technique to work. An example: I once worked for a company where the owners truly did not care what was best for the company. As such, arguing from the standpoint of what was best for the company had no effect whatsoever.

Fortunately, reality has a way of asserting itself in these situations. Companies that can't act for their own best interest tend to disappear, members who don't care about what's best for the group tend to leave the group, and employees who don't care about the best interest of their company tend to find themselves parted from the company.

Still, there can be significant discomfort in the meantime and heaven knows that "tends to" != "will" in this case. You'll have to decide how you approach a situation where ego is going to be a problem. It's almost always best to separate yourself from the ego getting in the way—particularly if it is your own.

24. April 2007 22:02 by Jacob | Comments (1) | Permalink

Head Games

Good friend and Indie game developer Jay Barnson has just taken game development in a new direction: Developing in Public. He sounds a little nervous about it, which makes sense. Unlike those who have previously attempted this feat, though, I think Jay stands a good chance of pulling it off, and with good style.

There are two things that are likely to make this interesting. First, Jay's about ready to release his second indie game, Apocalypse Cow—So we're likely to see this through all the way to a completed game. Second, he's an honest and engaging writer.

It's that second that makes me eager to eavesdrop on the development of his new game. I know he'll present it warts and all and that it'll be entertaining along the way.

Technorati tags: , , , ,
20. April 2007 15:31 by Jacob | Comments (0) | Permalink

Are We There Yet?

"So when will you be done with this development project?"

I don't know about you, but I hate this question. There simply is no good answer for it. It seems like such a simple question with a simple DateTime valued answer. One of these days I swear I'll answer with, "Oh, I'll be done next Tuesday at 2:34pm." just to see what happens.

And seriously, businesses hate that we have such difficulty answering the question. It seems perfectly reasonable for them to want to know when they can plan to have the new processes that they know they desperately need. Developers demand high salaries and are ostensibly professionals, they should be able to give a professional answer, right?

The Road is Well Paved

The thing is, software development is a lot harder than people expect it to be--and this includes software professionals. Even simple software projects can run afoul of hidden complexities that can destroy well meaning estimates and make everyone unhappy. And no matter how you hedge your answers, people simply don't remember all your caveats, maybes, and what ifs that you use to indicate uncertainty.

The end result is that developers seldom make their ship-by dates and companies become disillusioned and impatient with all software development. That's not helpful for anybody, but it's pretty much the rule anymore.

And the fact of the matter is that the vast majority of developers (and development managers) never learn how to answer the estimate question. They'll move from company to company, repeating the cycle of hope, suspicion, and disappointment over and over again. Which works well enough for the developers in the boom times when the demand for development is so high that mildly talented house plants can get hired as developers.

So a lot of people are making the same mistakes over and over. Businesses can be excused for assuming that this is simply the way things are and feel confident in their distrust of software professionals. They've been there, done that, bought the t-shirt.

Paying the Toll

This environment causes developers who care about these kinds of things a lot of heartburn. Everyone pays for the ongoing cycle of disillusionment. I believe that this is what really prompts posts like the recent ones from Ted Neward talking about professional ethics. And I've been known to throw my own hat into the ring as well.

We get tired of paying for the sins of those who have gone before. And I'm not referring to the messed up legacy code we stumble into, either. Frankly, messed up code is the least of your problems coming into a situation with a client who has been burned by previous developer promises. Companies that have had deadline after deadline missed have a degree of mistrust that is very hard to overcome.

We pay for this distrust in a hundred different ways. The thing is, trust is a paying commodity in business. Working with partners you trust means a whole lot of overhead you can simply skip. An analogy: if I trust a plumber to fix my sink quickly and professionally, I can go get a burger and leave him to it. It's only when I don't have that trust that I have to pay the additional overhead of having someone I do trust watching to make sure he's not napping under the sink.

Want to see a business manager go into a dreamy fantasy? Ask them what it'd be like to be able to trust their software developers (in house or not). The more experience they've had with developers the more intense the fantasy.

The Rubber Meets the Road

We have a couple of areas of friction in businesses that exacerbate this situation. The main disconnect with business managers is that we have borrowed terminology and tools from other disciplines without understanding that our processes are fundamentally different. It's tricky because the temptation to use manufacturing terminology is immense. After all, we are creating a product of sorts. This makes so much sense on an intuitive level that it's hard to realize that the comparison is misleading and potentially dangerous.

I wish we could retrain everyone to make analogies to other business specialties. Scientific research or law come to mind as potentially useful analogies because both are similarly plagued by the impact of unique situations, changing ground rules, and unforeseen complexities. It would be interesting to investigate how managing software development like a patent application or drug research would change how we look at the problems involved. We might have stumbled onto iterative cycles and responding to altered requirements a whole lot sooner, for example.

Paying Attention

The real problem, though, is that most developers (and even most development managers) don't take the time to learn about common friction points. Nor do they take the time to build relations with their business counterparts so that you have some political capital (aka trust) to use when it is needed. It's easy to forget that much of the progress in software development practices are pretty recent in terms of business processes. After all, business managers don't move at the speed of light and changes tend to take time to penetrate those layers.

Which means that a whole lot of industry advances aren't even theory yet in the board room.

And the fact of the matter is that you cannot expect a business manager to understand what makes Agile practices work. Or the reason that strong unit testing saves time over the long run even though it takes more time up front. Learning to communicate at a level that is sufficiently detailed for smart business decisions without getting bogged down into the jargon inherent in any specialty is an invaluable skill, and one best learned earlier than later. That means thoroughly understanding those theories yourself--not just on the surface or in buzzword compliance. It also means learning to communicate that understanding from orbit, 30,000 ft, 5,000 ft, and right on the ground. This is hard to do. It takes practice. It also takes exposure to business manager types. I'm not sure which is harder...

Something to think about, though: not learning this skill leaves you at the mercy of those who do learn it.

My point, though, is that it takes both. You have to learn your profession so thoroughly that you can deconstruct its "best practices" ("design patterns", whatever) and rebuild them from basic principles on the fly. AND you have to learn to communicate that understanding comfortably to people of varying familiarity with software development in a business environment.

That's what it takes to be a true professional. It's easy to let those two skills fall out of balance. Individuals who understand both are invaluable to a company. Also rare. Companies who discover someone capable of both are often surprised at how much smoother things run with that person placed where they can do the most good--a point Jeff Atwood's latest on becoming a better programmer drives home.

So I don't have a formula for quick and accurate estimates. Just a lot of hard work. Still, here's a tip for free: anyone asking for a firm delivery date is inherently assuming BDUF. Once you know that, you know where to start your answer.

29. January 2007 18:19 by Jacob | Comments (4) | Permalink

Creating a Domain Publisher Cert for a Small Internal Software Shop

The trend towards increasing security introduces a number of intricacies for medium-sized business software shops using Active Directory Domains. An internal domain with more than a dozen workstations can introduce issues that are old hat for larger shops, but way beyond anything a small business will have to deal with. I ran into one such issue recently when I decided it'd be a cool thing for one of my apps to actually run from the network.

The Problem

The first sign I had a problem was when a module that worked fine locally threw a "System.Security.SecurityException" when run from a network share. It told me that I had a problem at "System.Security.CodeAccessPermission.Demand()" when requesting "System.Security.Permissions.EnvironmentPermission". Since it worked fine while local, I figured I had a code trust problem and that I could probably get around it in the .Net Framework Configuration settings and push a domain policy that would update everyone.

I knew this because I had run into something similar once before (deploying a VSTO solution on the network).

Here's where it pays to be a real (i.e. lazy) developer: since I've run into this before, wouldn't it be nice to come up with a solution that will make it easier when I run into this in future? There are four ways to do this, I figure (well, that I could think of, there are probably more).

  1. Create some kind of scripting solution for deploying future projects that automatically creates policies (and propagates them) for each new assembly.
  2. Create a standard directory on the network that can be marked as "trusted" and deploy any trusted code into that directory.
  3. Use a "developer" certificate as your trusted publisher.
  4. Figure out how to get a publisher cert to use to sign your code and then propagate a rule certifying that publisher as trusted.

Some developers would go with number 1. Which makes me shudder. Anyone using the first option isn't someone I want to code with or after (barring some quirky deployment requirement that makes it more attractive, of course). Number 2 would probably be the most common solution because it's pretty simple and most medium-sized businesses are used to security compromises that use "special knowledge" and a lack of being an attractive target for security trade-offs. Number 3 would be a little more "upper-crust", mainly from people who had tried 4 and run into difficulties. And frankly, for most cases Number 3 is likely adequate. The problem is that using number 4 has a couple of not insignificant hurdles.

The Issues with Certificates

There are a couple of obstacles in your way if you want to produce a valid publisher certificate for use in signing code.

  • For a smaller internal shop, going the "official" route of contacting one of the major certificate stores (Thawte, Verisign, et. al.) is overkill with a price tag.
  • Setting up a private Certificate Authority isn't that hard, but unless you're running Windows 2003, Enterprise Edition, you cannot customize certificate templates.
  • The settings on the Code Signing template marks the private key as "unexportable".

That last is the most significant problem. You see, if you cannot export your private key, you cannot export to a "pfx" file (aka "PKCS #12"). You could export a .cer file (public key only) and then convert that to an spc using cert2spc.exe but that leaves you with a file that pretty much anyone can use to sign code. There's a reason Visual Studio Help warns that cert2spc.exe is for testing only.

If I lost you in all the security acronyms, don't worry about it. The important thing to note is that a) non-pfx files don't need a password to use in signing assemblies and b) there's no easy way to create a non-developer created pfx file signed by your organization's CA.

How to Get Your CA to Issue an Exportable Certificate

There is, however, a loophole you can exploit to con your CA into giving you a Code Signing certificate that you can export into a valid .pfx file. I'll skip the setup stuff on the CA. It is important to make sure that your CA makes the Code Signing template available (it isn't by default). Making it available is pretty straightforward, so I won't go into that here.

The first thing you'll need to do is use makecert.exe to create both a private and public key. A basic commandline to do so would be:

makecert -n "CN=Text" -pe -b 12/01/2006 -e 12/01/2012 -sv c:\test.pvk c:\test.cer

You can hit the help file for other fields you might want to set (or use the -? and -! switches to get two lists of available options). This command will pop up a GUI prompt for your private key password. Note that I typoed "CN=Text". While I meant to make that "Test", it turns out to be a good way to illustrate what that value is so I decided to keep it in the following examples. Also note that "-pe" is what makes the private key exportable. After running this command, you'll have two files in your root directory. The pvk is the private key file and the .cer is the public key.

Next you use a Platform SDK tool called pvk2pfx.exe. This wasn't in my regular path so I had to do a search to find it. I'm guessing that most development machines will have it already. If not, it's available from Microsoft. Here's the command I needed:

"C:\Program Files\Microsoft SDKs\Windows\v6.0\Bin\pvk2pfx.exe" -pvk c:\test.pvk -spc c:\test.cer -pfx c:\test.pfx

Like makecert, this command will give you a password dialog for the private key. Note that even though the command switch is "spc", it'll accept a .cer file just fine. Now, you might think that we're done because we have a valid pfx file. The problem is that this pfx file is derived from a CA of "Root Agency". In order to get this into your internal CA, you're going to need to use your certificate manager. You'll likely need to Run certmgr.msc to get to it. Once there, head to the Personal|Certificates node. This will let you play with certificates on your current workstation.

Right-clicking on "Personal" gives you an "Import" option. Follow the prompts to pull your certificate in. It'll prompt you for the private key password. Once you do this, you'll see your new private key and probably an auto-imported "Root Agency".

Here's where we find the handy loophole. While the default value for allowing private key exporting on the Code Signing template is false, you can use your handy new certificate to request a duplicate. Right-click that key and select "Request Certificate with Same Key". You can also use "Renew Certificate with Same Key". The functional difference seems to me to be that Renew keeps your password while Request provides an empty one (which is nervous-making, but rectifiable using a number of different tools including Visual Studio once the certificate is exported).

In the Wizard that follows, make sure you select the Code Signing template. What you'll receive back is a certificate from your CA for code signing that includes a private key that is marked exportable. At this point, I delete both the "Root Agency" and "Text" certs in order to avoid future confusion.

Use the Right-click|Export command to export this certificate to a pfx file. The pfx file has everything you need to be able to create a .Net Framework code policy using "publisher" as the distinguishing characteristic to mark your code trusted. Once that policy is propagated to all the domain workstations, you're good to go. You'll need to use the resulting pfx file to sign the assemblies (once they're ready for release), but you knew that already :).

A Final Note

After I had a valid certificate for signing, I actually ended up using .Net's ClickOnce technology to deploy the project. I still needed a certificate to create a strong-named assembly, but a weaker or temp certificate would have been adequate for internal deployment. The more robust certificate will let me eliminate a security prompt the first time a user runs the application, though. Since that prompt has a big red exclamation point in it, I'm just as happy to eliminate it.

4. December 2006 22:33 by Jacob | Comments (1) | Permalink

DataSets Suck

First off, a correction. In my recent post on OLTP using DataSets, I gave four methods that would allow you to handle non-conflicting updates of a row using the same initial data state. In reviewing a tangent later I realized that method 2 wouldn't work. Here's why:

The auto-generated Update for a datatable does a "SET" operation on all the fields of the row and depends on the WHERE clause to make sure that it isn't going to change something that wasn't meant to be changed. Which means that option 2 would not only not be a good OLTP solution, it'd overwrite prior updates without any notice. Much better to simply throw a DbConcurrencyException and let the application handle the discrepancy (or not).

Which also answers Udi's question of why it doesn't do that out of the box. It'd be nice if the defaults were implemented with a more robust OLTP scenario in mind, though. It'd be pretty complex, but that's because OLTP has inherent complexities. You would either have to generate the Update statement on the fly (thus breaking the new ADO.NET 2.0 batch option on the adapters) or put the logic at the field level (using an SQL "CASE" statement). I'm not sure how efficient CASE is on the server, but that could potentially fix my 2nd option.

But this brings me to my second and broader point again: the disdain that "real" programmers have for datasets. This was refreshed for me recently on a blog post by Karl Sequin at Code Better. I liked that post a lot (about using a coding test when evaluating potential hires) until I got to the bit about tell-tell signs he would look for. Right at the top?

Datasets and SqlDataSource are very bad

He has since amended that so:

Datasets and SqlDataSource are very bad (update: the dataset thing didn't go over too well in the comments ;) )

and added in the comments:

Sorry everyone...I've always had a thing against datasets...

He's not alone here. It's a common feature of highly technical programmers to hold datasets in contempt. Which would be fair enough if they were willing to give reasons or support for the position. If I felt that such statements came from an informed foundation, there wouldn't be much to quibble about. Unfortunately, too often this is simply not the case.

On those rare occasions when I can get one of these gurus to expound a bit, this attitude generally devolves back to a couple of bad experiences where datasets were used poorly or shoved into a situation where they didn't belong. Indeed, Karl goes on to give the kind of thing he doesn't want to see and I have to agree that he has a point. But while his example uses a dataset, it isn't the source of the problem. The problem is actually in his second point after datasets:

Data access shouldn't be in the aspx or codebehind

Since he's looking for strong enterprise-level coding habits, he's right that it'd be better encapsulated in its own class, and better still in its own library.

Again, it isn't the dataset he actually has a quibble with. He's just perpetuating a prejudice when he reflexively includes them as a first strike. To his credit, he's willing to own up to the prejudice. Unfortunately, he does so in a way that indicates that it is a prejudice he has no plans to explore or evaluate. That's what I hate about the whole anti-dataset vibe in the guru set. Particularly since these tend to be people who are proud of their rationality and expect others to listen to them when they expound on technical topics.


Technorati tags: , , ,
23. November 2006 16:47 by Jacob | Comments (0) | Permalink

DataSets and Business Logic

Whoa, that was fast. Udi Dahan responded to my post on DataSets and DbConcurrencyException. Cool. Also cool: he has a good point. Two good points, really.

Doing OLTP Better Out of the Box

I'll take his last point first because it's pure conjecture. Why don't DataSets handle OLTP-type functions better? My first two suggestions would, indeed, be better if they were included in the original code generated by the ADO.NET dataset designer. I wish that they were. Frankly, the statements already generated by the "optimistic" updates option are quite complex as-is and adding an additional "OR" condition per field wouldn't really be adding that much in either complexity or readability (which are both beyond repair anyway) and would add to reliability and reduce error conditions.

My guess is that it has to do with my favorite gripe about datasets in general: nobody knows quite what they are for. I suspect that this applies as much to the folks in Redmond as anywhere else. Datasets are obviously a stab at an abstraction layer from the server data and make it easier to do asynchronous database transactions as a regular (i.e. non-database, non-enterprise guru) developer. But that doesn't really answer the question of what they are useful for and when you should use them.

DataSets are, essentially, the red-headed step child of the .NET framework. They get enough care and feeding to survive, but hardly the loving care they'd need to thrive. And really, I think that LINQ pretty much guarantees their eventual demise. Particularly with some of the coolness that is DLINQ.

Datasets Alone Make Lousy Business Objects

As much as I am a fan of DataSets in general, you have to admit that they aren't a great answer in the whole business layer architecture domain.

I mean, you can (if you are sufficiently clever) implement some rudimentary data validation by setting facets on your table fields (not that most people do this--or even know you can). You can encode things like min/max, field length, and other relatively straight-forward data purity limitations. Anything beyond this, however, (like, say, when orders in Japan have to have an accompanying telephone number to be valid) would involve either some nasty derived class structures (if you even can--are strongly-typed DataTables inheritable? I've never tried. It'd be a mess to do so, I think), or wrapping the poor things in real classes.

One solution to this is to use web services as your business layer and toss DataSets back and forth as the "state" of a broader, mostly-conceptual object. This is something of a natural fit because DataSet objects serialize easily as XML (and do so much better--i.e. less buggy--in .NET 2.0). This de-couples methods from data, so isn't terribly OO. It can work in an environment where complex rules must work in widely disparate environments (like a call center application and a self-serve web sales application) when development speed is a concern (as in, say, a high-growth environment).

I think this leads to the kind of complexity Udi says he has seen with datasets. The main faultline is that what methods to call (and where to find them) are in design documents or a developer's head. This can easily lead to a nasty duplication of methods and chaos--problems that functionally don't exist in a stronger object paradigm.

That Said...

Here is where I stick my neck out and reveal my personal preferences (and let all the "real" developers write me off as obviously deluded): although DataSets make admittedly lousy business objects, most non-enterprise level projects just don't need the overhead that a true object data layer represents. For me, it's a case of serious YAGNI.

Take any number of .NET open source software projects I've played with: not one uses DataSets, yet not one needs all the complexity of their custom created classes, either. They aren't doing complex data validation and their CRUD operations are less robust than those produced automatically from the dataset designer. All at a higher expense of resources to produce.

Or take my current place of gainful employ. We have five ASP.NET applications that all have an extremely complex n-tier architecture--all implemented separately in each web application (and nowhere else--they're not even in a separate library). Each of the business objects has a bunch of properties implemented that are straight get/set from an internal field. And that is all they are. Oh, there's a couple of "get" routines that populate the object for different contexts using a separate Data Access Layer object. And an update routine that does the same. And a create... you get the point. It's three layers of abstraction that don't do anything. I shudder to think how much longer all that complexity took to create when a strongly-typed DataSet would have done a much better job and taken a fraction of the time. It makes me want to call the development police to report ORM abuse.

Which is to Say

Don't let all that detract from Udi's point, though. He's right that for seriously complex enterprise-level operations, you can't really get around the fact that you need good architecture for which datasets will likely be inadequate. Relying wholly on DataSets in that case will get you into trouble.

I personally think that you could get away with datasets being the communication objects between web services in most cases even so, but I also realize that there are serious weaknesses in this approach. It works best if the application is confined to a single enterprise domain (like order processing or warehouse inventory management). Once you cross domains with your objects, you incur some serious side-effects, not least of which is that the meaning of your objects (and the operations you want to perform on them) can change with context (sometimes without you knowing it--want an exercise in what I mean? Ask your head of marketing and your head of finance what the definition of a "sale" is--then go ask your board of directors).

So yeah, DataSets aren't always the answer. I'd just prefer if more developers would make that judgement from a standpoint of knowing what DataSets are and what they can do. Too often, their detractors are operating more from faith than from knowledge.*

*Not that this is the case for Udi. For all he has admitted that he isn't personally terribly familiar with datasets, his examples are pretty good at delineating their pressure points and that tends to indicate that he's speaking from some experience with their use in the wild.


21. November 2006 19:23 by Jacob | Comments (0) | Permalink

4 Solutions to DbConcurrencyException in DataSets

Following links the other day, I ran across this analysis of DataSets vs. OLTP from Udi Dahan. His clincher in favor of coding OLTP over using datasets is this:

The example that clinched OLTP was this. Two users perform a change to the same entity at the same time – one updates the customer’s marital status, the other changes their address. At the business level, there is no concurrency problem here. Both changes should go through.When using datasets, and those changes are bundled up with a bunch of other changes, and the whole snapshot is sent together from each user, you get a DbConcurrencyException. Like I said, I’m sure there’s a solution to it, I just haven’t heard it yet.

I thought about this for a minute and came up with four solutions for DbConcurrencyException in this scenario using DataSets (though the first two are essentially the same and differ only by who actually implements it). I'm sure there are others, but this should do for starters.

  1. Use stored procedures created by a competent DBA that utilizes parameters for the original and new column state. This means that you check each field with a "OR (<ds.originalValue> = <ds.updateValue>)". This solution passes the same two parameters per field as an "optimistic" pre-generated update statement but it makes the update statement larger by adding this new "OR" condition for each field.
  2. You can do the same by altering a raw update generated from the DataSet designer. This means sending a longer select to the database each update though this can be offset by setting your batch size higher if you have lots of updates you're sending (uh, you'd need ADO.NET 2.0 for that). I'd hesitate to use this method but that's mainly a personal taste issue than anything else (because I'd prefer using stored procedures and recognize that internal network traffic generally isn't the bottleneck in these kinds of transactions, though on-the-fly statement execution plan creation could be).
  3. Override the OnUpdating for the adapter to alter the command sent based on which fields have actually changed. This is probably the closest in effect to the OLTP solution envisioned by Udi. This solution is problematic for me simply because I've never actually tried to do it and I'm not sure you can hook into the base adapter updates each execution. If you can't, an alternative (in ADO.NET 2.0) would be to create a base class for the table adapters and create an alternative Update function in derived partial classes. In this case, you'd have "AcceptFineGrainedChanges" or some such function that you'd call. Once the alternative base class was created, custom programming per table adapter would be a matter of a couple moments. I've done something similar for using the designer for SyBase table adapters and it worked out pretty well. I'd have to actually try this to make sure it'd work though. Call this two half-solutions if you're feeling stern about it.
  4. This last would be useful if I have a relatively well-defined use case that isn't going to morph much or require stringent concurrency resolution. In this one, you deliberately break the one-for-one relationship from your dataset and database (i.e. one database table can be represented by multiple dataset tables). In Udi's concurrency example, the dataset would have a CustomerAddress table and a CustomerStatus table. Creating the dataset with custom selects would generate the tables pretty painlessly with appropriate paranoia. Now, this only really pushes his concern down a little, making it less likely to be an issue. It doesn't eliminate it. It'd probably handle most of the concurrency problems people are likely to run into. Or at least, push them out beyond where most people will ever experience it (not quite the same thing). It could be taken to a rediculous extreme where each field was it's own datatable (which is just silly, but I've seen sillier things happen) so a little balance and logical separation would be needed.

OLTP may seem more natural as a solution for many, but that's likely an issue of preference and sunken costs (because they've done it before and are comfortable with that solution space). It certainly isn't the only solution, though, nor is it a stumper for datasets.

Finally, I’ll add a caveat that I'm not saying that datasets are necessarily to be preferred over stronger object models. I just know that they get pretty short shrift from "real" developers in these kinds of discussions and want to make sure that the waters remain appropriately muddied. There may be a universal stumper for datasets I don't know about. There are certainly environments where a formal OLTP or ORM tool would be a legitimately preferred solution.


Technorati tags: , , , , ,
21. November 2006 05:35 by Jacob | Comments (0) | Permalink

Two Things I Regret

Have you ever been in an interview and gotten some variation on the question "What do you regret most about your last position?" Everyone hates questions like that. They're a huge risk with little upside for you. You're caught between the Scylla of honesty and the Charybdis of revealing unflattering things about yourself.

Still, such questions can be very valuable if used personally for analysis and improvement. In that light, I'll share with you two things I regret about my stay at XanGo. Since I've ripped on the environment there in the past, it's only fair if I elaborate on things that were under my control at the time--things I could have done better.

Neglecting the User

Tim was the Senior IT Manager (the position that became IT Director once XanGo had grown up a bit). He was the best boss I ever had. His tech skills were top-notch (if somewhat "old school"). In addition, he knew his executives and how to communicate with them on a level they understood. It was a refreshing experience to have someone good at both technology and management (and since he's no longer my boss, you can take that to the bank :)).

After a little break-in time as the new Software Development Manager, Tim and I discussed what we needed to do for the organization. Tim's advice was to establish a pattern of delivering one new "toy" for our Distributors each month. He said that the executive board are very attached to the Distributors and that keeping things fresh and delivering new functionality and tools to them each month would make sure that we had enough of the right kind of visibility. Goodwill in the bank, so to speak.

This sounded like a great idea, and frankly, "toy" was loosely defined enough that it shouldn't have been a hard thing to do. It turned out to be a lot harder than expected, however. In my defense, I'll point out that we were experiencing between 15% and 20% growth per month and that we had done so since the company had started a year and a half before. That growth continued my entire tenure there. Now, if you've never experienced that kind of growth, let me point out some of what that means.

First off, using the Rule of 72 (the coolest numeric rule I know) will tell you that we were doubling every 4 to 5 months (in every significant measure--sales, revenue, Distributors, traffic, shipping, everything).

In case you've never experienced that kind of growth, it feels like ice-skating with a jet engine fired up on your back. Even good architecture will strain with that kind of relentless growth. When this happens, you become hyper-vigilant for signs of strain. This vigilance has sufficient reality to be important to maintain. Unfortunately, it also makes it easy to forget your users.

Developers like to live in a pristine world of logic and procedure. Unfortunately, life, and users, aren't like that. If they were, there'd be less need for developers. Users don't see all the bullets you dodged. They take for granted the fact that a system originally designed for a small start up is now pulling off enterprise-level stunts. They don't see it, so it doesn't exist. It is very easy to get caught up in the technology and forget that often it is the little touches that make your product meaningful. Sometimes the new report you spent an hour hacking together means more than the three weeks of sweating out communication with a new bank transaction processor. And by means more, I mean "is more valuable than".

Not that you can afford to neglect your architecture or needed improvements to sustain the needs of the company and prepare for foreseeable events. If you ignore that little glitch in the payments processing this month, you have really no excuse when it decides to spew chunks spectacularly next month.

What I'm saying here is that you have to balance functionality with perceived value. You have to know your users and their expectations because if you aren't meeting those expectations, no amount of technical expertise or developer-fu is going to help you when things get rough. In the case of XanGo, I could have afforded to ease up on the architecture enough to kick out monthly toys for the users. Yeah, some things would have been a touch rockier, but looking back there was room for a better balance.

Premature Deprecation

When I arrived at XanGo, our original product (a customized vertical market app written in VB6 on MS SQL Server) was serving way beyond its original specifications. We'd made some customizations, many of them penetrating deep into the core of the product. Our primary concern, however, was the Internet application used by our Distributors in managing their sales. We spent a month or two moving it from ASP to ASP.NET and ironing out bugs brought on by the number of concurrent users we had to maintain. We also removed the dependence on a couple of VB6 modules that were spitting out raw HTML (yeah, I know. All I can say is that I didn't design the monster).

Anyway, after that was well enough in hand, we gave a serious look to that VB6 vertical market app. Since VB6 wasn't all that hot at concurrent data access and couldn't handle some of the functionality we were delivering over the new web app, we decided that it should be phased out. Adding to this decision was the fact that we had lost control of the customizations to that app and what we had wouldn't compile in its present state.

Now developers (and for any management-fu I may have acquired, I remain a developer at heart) tend to be optimistic souls, so we figured "no big", we'll be replacing this app anyway. And we set to work. Bad choice. In a high growth environment, the inability to fix bugs now takes on a magnified importance. Replacing an application always takes longer than you expect if only because it's so easy to take the current functionality for granted. Any replacement has to be at least as good as the current application, and should preferably provide significant, visible improvements.

The result of this decision was that we limped along for quite some time before we finally came to the conclusion that we absolutely must have the ability to fix the current app. We paid a lot of political capital for that lack. In the end, it took a top developer out of circulation for a while but once it was done, it was astonishing how much pressure was lifted from Development.

It's the Users, Stupid

No, I did not mean to say "It's the stupid users." When it comes right down to it, software exists to serve users, not the other way around. As developers, it is easy to acquire a casual (or even vehement) dislike of our users. They are never satisfied, they do crazy stuff that makes no sense, and they're always asking for more. It's tempting to think that things would be so much better without them.*

I got into computers because I like making computers do cool stuff. Whatever gets developers into computers, though, it's a good idea to poke your head up periodically and see what your users are doing. Get to know who they are. Find out what they think about what you've provided for them. Losing that focus can cost you. Sometimes dearly.

*I think one of the draws of Open Source is that the developer is the user. It's also the primary drawback. But that's a post for another day.


17. November 2006 20:10 by Jacob | Comments (0) | Permalink

More Validation

It's always nice to have your opinions confirmed by someone you respect. Joel Spolsky's latest series on recruiting developers has a final section that includes essentially the same point I made a couple months ago: that specific technologies aren't as important in hiring developers as general savvy is. He even issues the same caveat that you do need domain experts for leavening. Nice. As is usual with Joel, he includes a thorough analysis of the issue he explores so there's a lot of good rumination there.


8. September 2006 09:59 by Jacob | Comments (0) | Permalink


<<  October 2014  >>

View posts in large calendar