Friday, February 27, 2015

What happened to “when the application is fast enough to meet users’ requirements?”

On January 5, I received an email called “Video” from my friend and former employee Guðmundur Jósepsson from Iceland. His friends call him Gummi (rhymes with “who-me”). Gummi is the guy whose name is set in the ridiculous monospace font on page xxiv of Optimizing Oracle Performance, apparently because O’Reilly’s Linotype Birka font didn’t have the letter eth (ð) in it. Gummi once modestly teased me that this is what he is best known for. But I digress...

His email looked like this:


It’s a screen shot of frame 3:12 from my November 2014 video called “Why you need a profiler for Oracle.” At frame 3:12, I am answering the question of how you can know when you’re finished optimizing a given application function. Gummi’s question is, «Oi! What happened to “when the application is fast enough to meet users’ requirements?”»

Gummi noticed (the good ones will do that) that the video says something different from what he had heard me say for years. It’s a fair question. Why, in the video, have I said this new thing? It was not an accident.

When are you finished optimizing?

The question in focus is, “When are you finished optimizing?” Over the years, I have actually used three different answers:
When are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
    Source: Optimizing Oracle Performance (2003) pages 302–304.
  B. When the application is fast enough to meet your users’ requirements.
    Source: I have taught this in various courses, conferences, and consulting calls since 1999 or so.
  C. When there are no unnecessary calls, and the calls that remain run at hardware speed.
    Source: “Why you need a profiler for Oracle” (2014) frames 2:51–3:20.
My motive behind answers A and B was the idea that optimizing beyond what your business needs can be wasteful. I created these answers to deter people from misdirecting time and money toward perfecting something when those resources might be better invested improving something else. This idea was important, and it still is.

So, then, where did C come from? I’ll begin with a picture. The following figure allows you to plot the response time for a single application function, whatever “given function” you’re looking at. You could draw a similar figure for every application function on your system (although I wouldn’t suggest it).


Somewhere on this response time axis for your given function is the function’s actual response time. I haven’t marked that response time’s location specifically, but I know it’s in the blue zone, because at the bottom of the blue zone is the special response time RT. This value RT is the function’s top speed on the hardware you own today. Your function can’t go faster than this without upgrading something.

It so happens that this top speed is the speed at which your function will run if and only if (i) it contains no unnecessary calls and (ii) the calls that remain run at hardware speed. ...Which, of course, is the idea behind this new answer C.
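
If it helps to see the arithmetic, here is a toy sketch of the idea in Python. Everything in it is invented for illustration: the call names, the timings, and the judgments about which calls are necessary. In real life those numbers come from profiling trace data, not from a hard-coded list.

  # Toy model: a profile of one function's execution. Each entry is
  # (call name, seconds consumed, is the call necessary?, seconds the
  # call would take at hardware speed). All numbers are invented.
  profile = [
      ("db file sequential read", 1.20, True,  1.20),  # already at disk speed
      ("SQL*Net round-trips",     1.70, False, 0.00),  # avoidable: array fetch
      ("parse calls",             0.60, False, 0.00),  # avoidable: hold cursors
      ("CPU",                     0.50, True,  0.50),
  ]

  actual = sum(seconds for _, seconds, _, _ in profile)
  # Top speed: only the necessary calls remain, each at hardware speed.
  top_speed = sum(hw for _, _, necessary, hw in profile if necessary)

  print(f"actual response time: {actual:.2f} s")  # 4.00 s
  print(f"top speed (RT):       {top_speed:.2f} s")  # 1.70 s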

Where, exactly, is your “requirement”?

Answer B (“When the application is fast enough to meet your users’ requirements”) requires that you know the users’ response time requirement for your function, so, next, let’s locate that value on our response time axis.

This is where the trouble begins. Most DBAs don’t know what their users’ response time requirements really are. Don’t despair, though; most users don’t either.

At banks, airlines, hospitals, telcos, and nuclear plants, you need strict service level agreements, so those businesses invest in quantifying them. But realize: quantifying all your functions’ response time requirements isn’t about a bunch of users sitting in a room arguing over which subjective speed limits sound the best. It’s about knowing your technological speed limits and understanding how close to those values your business needs to pay to be. It’s an expensive process. At some companies, it’s worth the effort; at most companies, it’s just not.

How about using, “well, nobody complains about it,” as all the evidence you need that a given function is meeting your users’ requirement? It’s how a lot of people do it. You might get away with doing it this way if your systems weren’t growing. But systems do grow. More data, more users, more application functions: these are all forms of growth, and you can probably measure every one of them happening where you’re sitting right now. All these forms of growth put you on a collision course with failing to meet your users’ response time requirements, whether you and your users know exactly what they are, or not.

In any event, if you don’t know exactly what your users’ response time requirements are, then you won’t be able to use “meets your users’ requirement” as your finish line that tells you when to stop optimizing. This very practical problem is the demise of answer B for most people.

Knowing your top speed

Even if you do know exactly what your users’ requirements are, it’s not enough. You need to know something more.

Imagine for a minute that you do know your users’ response time requirement for a given function, and let’s say that it’s this: “95% of executions of this function must complete within 5 seconds.” Now imagine that this morning when you started looking at the function, it would typically run for 10 seconds in your Oracle SQL Developer worksheet, but now after spending an hour or so with it, you have it down to where it runs pretty much every time in just 4 seconds. So, you’ve eliminated 60% of the function’s response time. That’s a pretty good day’s work, right? The question is, are you done? Or do you keep going?
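
Checking a requirement stated in that form is simple arithmetic, by the way. Here is a little Python sketch, with invented timings, just to make the shape of the test concrete:

  # Requirement: 95% of executions must complete within 5 seconds.
  # The execution timings below are invented for illustration.
  response_times = [3.9, 4.0, 4.1, 3.8, 4.2, 4.0, 9.7, 4.1, 3.9, 4.0]

  within = sum(1 for t in response_times if t <= 5.0)
  fraction = within / len(response_times)  # 0.90

  print(f"{fraction:.0%} of executions completed within 5 s; requirement "
        f"{'met' if fraction >= 0.95 else 'not met'}")

Notice that the single 9.7-second outlier is enough to miss the requirement, even though the typical execution looks fine. Requirements stated as percentiles are about the tail, not the typical case.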

Here is the reason that answer C is so important. You cannot responsibly answer whether you’re done without knowing that function’s top speed. Even if you know how fast people want it to run, you can’t know whether you’re finished without knowing how fast it can run.

Why? Imagine that 85% of those 4 seconds are consumed by Oracle enqueue, or latch, or log file sync calls, or by hundreds of parse calls, or 3,214 network round-trips to return 3,214 rows. If any of these things is the case, then no, you’re absolutely not done yet. If you were to allow some ridiculous code path like that to survive on a production system, you’d be diminishing the whole system’s effectiveness for everybody (even people who are running functions other than the one you’re fixing).
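
To pick on just one of those examples: 3,214 round-trips to return 3,214 rows usually means the application is fetching rows one at a time. Here is a sketch of the repair using the python-oracledb driver; the credentials, DSN, and table name are all placeholders:

  # Sketch: eliminating unnecessary SQL*Net round-trips by array fetching.
  # Connection details and the table name are placeholders.
  import oracledb

  conn = oracledb.connect(user="app", password="secret", dsn="dbhost/pdb1")
  cur = conn.cursor()

  # If an application fetches with arraysize = 1, returning 3,214 rows
  # costs roughly 3,214 round-trips. At arraysize = 1000, it costs about 4.
  cur.arraysize = 1000
  cur.execute("SELECT order_id, status FROM orders")  # hypothetical table
  rows = cur.fetchall()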

Now, sure, if there’s something else on the system that has a higher priority than finishing the fix on this function, then you should jump to it. But you should at least leave this function on your to-do list. Your analysis of the higher-priority function might even reveal that this function’s inefficiencies are causing the higher-priority function’s problems. Such can be the nature of inefficient code under conditions of high load.

On the other hand, if your function is running in 4 seconds and (i) its profile shows no unnecessary calls, and (ii) the calls that remain are running at hardware speeds, then you’ve reached a milestone:
  1. if your code meets your users’ requirement, then you’re done;
  2. otherwise, either you’ll have to reimagine how to implement the function, or you’ll have to upgrade your hardware (or both).
There’s that “users’ requirement” thing again. You see why it has to be there, right?

Well, here’s what most people do. They get their functions’ response times reasonably close to their top speeds (which, with good people, isn’t usually as expensive as it sounds), and then they worry about requirements only if those requirements are so important that it’s worth a project to quantify them. A requirement usually turns out to be that important only if it’s close to your top speed or if violating it is really expensive.

This strategy works reasonably well.

It is interesting to note here that knowing a function’s top speed is actually more important than knowing your users’ requirements for that function. A lot of companies can work just fine not knowing their users’ requirements, but without knowing your top speeds, you really are in the dark. A second observation that I find particularly amusing is this: not only is your top speed more important to know, your top speed is actually easier to compute than your users’ requirement (…if you have a profiler, which was my point in the video).

Better and easier is a good combination.

Tomorrow is important, too

When are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
  B. When the application is fast enough to meet your users’ requirements.
  C. When there are no unnecessary calls, and the calls that remain run at hardware speed.
Answer A is still a pretty strong answer. Notice that it actually maps closely to answer C. Answer C’s prescription for “no unnecessary calls” yields answer A’s goal of call reduction, and answer C’s prescription for “calls that remain run at hardware speed” yields answer A’s goal of latency reduction. So, in a way, C is a more action-oriented version of A, but A goes further to combat the perfectionism trap with its emphasis on the cost of action versus the cost of inaction.

One thing I’ve grown to dislike about answer A, though, is its emphasis on today in “…exceeds the cost of the performance you’re getting today.” After years of experience with the question of when optimization is complete, I think that answer A under-emphasizes the importance of tomorrow. Unplanned tomorrows can quickly become ugly todays, and as important as tomorrow is to businesses and the people who run them, it’s even more important to another community: database application developers.

Subjective goals are treacherous for developers

Many developers have no way to test, today, the true production response time behavior of their code, which they won’t learn until tomorrow. ...And perhaps not until some remote, distant tomorrow.

Imagine you’re a developer using 100-row tables on your desktop to test code that will access 100,000,000,000-row tables on your production server. Or maybe you’re testing your code’s performance only in isolation from other workload. Both of these are problems; they’re procedural mistakes, but they are everyday real-life for many developers. When this is how you develop, telling you that “your users’ response time requirement is n seconds” accidentally implies that you are finished optimizing when your query finishes in less than n seconds on your no-load system of 100-row test tables.

If you are a developer writing high-risk code—and any code that will touch huge database segments in production is high-risk code—then of course you must aim for the “no unnecessary calls” part of the top speed target. And you must aim for the “and the calls that remain run at hardware speed” part, too, but you won’t be able to measure your progress against that goal until you have access to full data volumes and full user workloads.

Notice that to do both of these things, you must have access to full data volumes and full user workloads in your development environment. To build high-performance applications, you must do full data volume testing and full user workload testing in each of your functional development iterations.

This is where agile development methods yield a huge advantage: agile methods provide a project structure that encourages full performance testing for each new product function as it is developed. Contrast this with the terrible project planning approach of putting all your performance testing at the end of your project, when it’s too late to actually fix anything (if there’s even enough budget left over by then to do any testing at all). If you want a high-performance application with great performance diagnostics, then performance instrumentation should be an important part of your feedback for each development iteration of each new function you create.

My answer

So, when are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
  B. When the application is fast enough to meet your users’ requirements.
  C. When there are no unnecessary calls and the calls that remain run at hardware speed.
There is some merit in all three answers, but as Dave Ensor taught me inside Oracle many years ago, the correct answer is C. Answer A specifically restricts your scope of concern to today, which is especially dangerous for developers. Answer B permits you to promote horrifically bad code, unhindered, into production, where it can hurt the performance of every function on the system. Answers A and B both presume that you know information that you probably don’t know and that you may not need to know. Answer C is my favorite answer because it tells you exactly when you’re done, using units you can measure and that you should be measuring.

Answer C is usually a tougher standard than answer A or B, and when it’s not, it is the best possible standard you can meet without upgrading or redesigning something. In light of this “tougher standard” kind of talk, it is still important to understand that what is optimal from a software engineering perspective is not always optimal from a business perspective. The term optimized must ultimately be judged within the constraints of what the business chooses to pay for. In the spirit of answer A, you can still make the decision not to optimize all your code to the last picosecond of its potential. How perfect you make your code should be a business decision. That decision should be informed by facts, and these facts should include knowledge of your code’s top speed.

Thank you, Guðmundur Jósepsson, of Iceland, for your question. Thank you for waiting patiently for several weeks while I struggled to put these thoughts into words.

Friday, February 28, 2014

“How did you learn so much stuff about Oracle?”

On LinkedIn, a new connection asked me a very nice question. He asked, “I know this might sound stupid, but how did you learn so much stuff about Oracle. :)”

Good one. I like the presumption that I know a lot of stuff about Oracle. I suppose that I do, at least about some aspects of it, although I often feel like I don’t know enough. It occurred to me that answering publicly might also be helpful to anyone trying to figure out how to prepare for a career. Here’s my answer.

I took a job with the young consulting division of Oracle Corporation in September 1989, about two weeks after the very first time I had heard the word “Oracle” used as the name of a company. My background had been mathematics and computer science in school. I had two post-graduate degrees: a Master of Science in Computer Science with a focus on language design and compilers, and a Master of Business Administration with a focus in finance.

My first “career job” was as a software engineer, which I started before the MBA. I designed languages and wrote compilers to implement those languages. Yes, people actually pay good money for that, and it’s possibly still the most fun I’ve ever had at work. I wrote software in C, lex, and yacc, and I taught my colleagues how to do it, too. In particular, I spent a lot of time teaching my colleagues how to make their C code faster and more portable (so it would run on more computers than just the one on which you wrote it).

Even though I loved my job, I didn’t see a lot of future in it. At least not in Colorado Springs in the late 1980s. So I took a year off to get the MBA at SMU in Dallas. I went for the MBA because I thought I needed to learn more about money and business. It was the most difficult academic year of my life, because I was not particularly connected to or even interested in most of the subject matter. I hated a lot of my classes, which made it difficult to do as well as I was accustomed to. But I kept grinding away, and finished my degree in the year it was supposed to take. Of course I learned many, many things that year that have been vital to my career.

A couple of weeks after I got my MBA, I went to work for Oracle in Dallas, with a salary that was 168% of what it had been as a compiler designer. My job was to visit Oracle customers and help them with their problems.

It took a while for me to get into a good rhythm at Oracle. My boss was sending me to these local customers that were having problems with the Oracle Financial Applications (the “Finapps,” as we usually called them, which would many years later become the E-Business Suite) on version 6.0.26 of the ORACLE database (it was all caps back then). At first, I couldn’t help them nearly as much as I wanted to. It was frustrating.

That actually became my rhythm: week after week, I visited these people who were having horrific problems with ORACLE and the Finapps. The database in 1990, although it had some pretty big bugs, was still pretty good. It was the applications that caused most of the problems I saw. There were a lot of problems, both with the software and with how it was sold. My job was to fix the problems. Some of those problems were technical. Many were not.

A lot of the problems were performance problems: the software running “too slowly.” I found those problems particularly interesting. For those, I had some experience and tools at my disposal. I knew a good bit about operating systems and compilers and profilers and linkers and debuggers and all that, and so learning about Oracle indexes and rollback segments (two good examples, continual sources of customer frustration) wasn’t that scary of a step for me.

I hadn’t learned anything about Oracle or relational databases in school. I learned how the database worked by reading the documentation, beginning with the excellent Oracle® Database Concepts guide. Oracle sped me along a bit with a couple of the standard DBA courses.

My real learning came from being in the field. The problems my customers had were immediately interesting by virtue of being important. The resources available to me for solving such problems back in the early 1990s were really just books, email, and the telephone. The Internet didn’t exist yet. (Can you imagine?) The Oracle books available back then, for the most part, were absolutely horrible. Just garbage. Just about the only thing they were good for was creating problems that you could bill lots of consulting hours to fix. The only thing that was left was email and the telephone.

The problem with email and telephones, however, is that there has to be someone on the other end. Fortunately, I had that. The people on the other end of my email and phone calls were my saviors and heroes. In my early Oracle years, those saviors and heroes included people like Darryl Presley, Laurel Jamtgaard, Tom Kemp, Charlene Feldkamp, David Ensor, Willis Ranney, Lyn Pratt, Lawrence To, Roderick Mañalac, Greg Doherty, Juan Loaiza, Bill Bridge, Brom Mahbod, Alex Ho, Jonathan Klein, Graham Wood, Mark Farnham (who didn’t even work for Oracle, but who could cheerfully introduce me to anyone I needed), Anjo Kolk, and Mogens Nørgaard. I could never repay these people, and many more, for what they did for me. ...In some cases, at all hours of the night.

So, how did I learn so much stuff about Oracle? It started by immersing myself into a universe where every working day I had to solve somebody’s real Oracle problems. Uncomfortable, but effective. I survived because I was persistent and because I had a great company behind me, filled with spectacularly intelligent people who loved helping each other. Could I have done that on my own, today, with the advent of the Internet and lots and lots of great and reliable books out there to draw upon? I doubt it. I sincerely do. But maybe if I were young again...

I tell my children, there’s only one place where money comes from: other people. Money comes only from other people. So many things in life are that way.

I’m a natural introvert. I naturally withdraw from group interactions whenever I don’t feel like I’m helping other people. Thankfully, my work and my family draw me out into the world. If you put me into a situation where I need to solve a technical problem that I can’t solve by myself, then I’ll seek help from the wonderful friends I’ve made.

I can never pay it back, but I can try to pay it forward.

(Oddly, as I’m writing this, I realize that I don’t take the same healthy approach to solving business problems. Perhaps it’s because I naturally assume that my friends would have fun helping solve a technical problem, but that solving a business problem would not be fun and therefore I would be imposing upon them if I were to ask for help solving one. I need to work on that.)

So, to my new LinkedIn friend, here’s my advice. Here’s what worked for me:
  • Educate yourself. Read, study, experiment. Educate yourself especially well in the fundamentals. So many people don’t. Being fantastic at the fundamentals is a competitive advantage, no matter what you do. If it’s Oracle you’re interested in learning about, that’s software, so learn about software: about operating systems, and C, and linkers, and profilers, and debuggers, .... Read the Oracle Database Concepts guide and all the other free Oracle documentation. Read every book there is by Tom Kyte and Christian Antognini and Jonathan Lewis and Tanel Põder and Kerry Osborne and Karen Morton and James Morle, and all the other great authors out there today. And read their blogs.
  • Find a way to hook yourself into a network of people that are willing and able to help you. You can do that online these days. You can earn your way into a community by doing things like asking thoughtful questions, treating people respectfully (even the ones who don’t treat you respectfully), and finding ways to teach others what you’ve learned. Write. Write what you know, for other people to use and improve. And for God’s sake, if you don’t know something, don’t act like you do. That just makes everyone think you’re an asshole, which isn’t helpful.
  • Immerse yourself into some real problems. Read Scuttle Your Ships Before Advancing if you don’t understand why. You can solve real problems online these days, too (e.g., StackExchange and even Oracle.com), although I think that it’s better to work on real live problems at real live customer sites. Stick with it. Fix things. Help people.
Help people.

That’s my advice.

Friday, April 5, 2013

NoSQL and Oracle, Sex and Marriage

At last week’s Dallas Oracle Users Group meeting, an Oracle DBA asked me, “With all the new database alternatives out there today, like all these open source NoSQL databases, would you recommend for us to learn some of those?”

I told him I had a theory about how these got so popular and that I wanted to share that before I answered his question.

My theory is this. Developers perceive Oracle as being too costly, time-consuming, and complex:
  • An Oracle Database costs a lot. If you don’t already have an Oracle license, it’s going to take time and money to get one. On the other hand, you can just install Mongo DB today.
  • Even if you have an Oracle site-wide license, the Oracle software is probably centrally controlled. To get an installation done, you’re probably going to have to negotiate, justify, write a proposal, fill out forms, ...you know, supplicate yourself to—er, I mean negotiate with—your internal IT guys to get an Oracle Database installed. It’s a lot easier to just install MySQL yourself.
  • Oracle is too complicated. Even if you have a site license and someone who’s happy to install it for you, it’s so big and complicated and proprietary... The only way to run an Oracle Database is with SQL (a declarative language that is alien to many developers) executed through a thick, proprietary, possibly even difficult-to-install layer of software like Oracle Enterprise Manager, Oracle SQL Developer, or sqlplus. Isn’t there an open source database out there that you could just manage from your operating system command line?
When a developer is thinking about installing a database today because he needs one to write his next feature, he wants something cheap, quick, and lightweight. None of those constraints really sounds like Oracle, does it?

So your Java developers install this NoSQL thing, because it’s easy, and then they write a bunch of application code on top of it. Maybe so much code that there’s no affordable way to turn back. Eventually, though, someone will accidentally crash a machine in the middle of something, and there’ll be a whole bunch of partway finished jobs that die. Out of all the rows that are supposed to be in the database, some will be there and some won’t, and so now your company will have to figure out how to delete the parts of those jobs that aren’t supposed to be there.

Because now everyone understands that this kind of thing will probably happen again, too, the exercise may well turn into a feature specification for various “eraser” functions for the application, which (I hope, anyway) will eventually lead to the team discovering the technical term transaction. A transaction is a unit of work that must be atomic, consistent, isolated, and durable (that’s where the acronym ACID comes from). If your database doesn’t guarantee that every arbitrarily complex unit of work (every transaction) makes it either 100% into the database or not at all, then your developers have to write that feature themselves. That’s a big, tremendously complex feature. On an Oracle Database, the transaction is a fundamental right given automatically to every user on the system.
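
Here is a sketch of what that fundamental right looks like from application code, using the python-oracledb driver. The schema and connection details are hypothetical; the point is that the commit-or-rollback guarantee comes with the database, so nobody on your team has to write the “eraser”:

  # Sketch: an atomic unit of work. Either both inserts become permanent,
  # or neither does. Schema and connection details are hypothetical.
  import oracledb

  conn = oracledb.connect(user="app", password="secret", dsn="dbhost/pdb1")
  cur = conn.cursor()

  try:
      cur.execute(
          "INSERT INTO orders (order_id, customer_id) VALUES (:1, :2)",
          [1001, 42])
      cur.execute(
          "INSERT INTO order_lines (order_id, item, qty) VALUES (:1, :2, :3)",
          [1001, "widget", 3])
      conn.commit()    # both rows become permanent, atomically
  except Exception:
      conn.rollback()  # neither row survives; no partway-finished job to erase
      raise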

Let’s look at just that ‘I’ in ACID for a moment: isolation. How big a deal is transaction isolation? Imagine that your system has a query that runs from 1 pm to 2 pm. Imagine that it prints results to paper as it runs. Now suppose that at 1:30 pm, some user on the system updates two rows in your query’s base table: the table’s first row and its last row. At 1:30, the pre-update version of that first row has already been printed to paper (that happened shortly after 1 pm). The question is, what’s supposed to happen at 2 pm when it comes time to print the information for the final row? You should hope for the old value of that final row—the value as of 1 pm—to print out; otherwise, your report details won’t add up to your report totals. However, if your database doesn’t handle that transaction isolation feature for you automatically, then either you’ll have to lock the table when you run the report (creating a 30-minute-long performance problem for the person wanting to update the table at 1:30), or your query will have to make a snapshot of the table at 1 pm, which is going to require both a lot of extra code and that same lock I just described. On an Oracle Database, high-performance, non-locking read consistency is a standard feature.
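
Here is the same scenario compressed into a sketch, with two python-oracledb sessions standing in for the report and the 1:30 pm updater (again, every name and connection detail is hypothetical):

  # Sketch: Oracle statement-level read consistency. Session A's query sees
  # the table as of the moment the query began, even though session B
  # updates and commits while A is still fetching.
  import oracledb

  a = oracledb.connect(user="app", password="secret", dsn="dbhost/pdb1")
  b = oracledb.connect(user="app", password="secret", dsn="dbhost/pdb1")

  cur_a = a.cursor()
  cur_a.execute("SELECT row_id, amount FROM report_base ORDER BY row_id")
  first_row = cur_a.fetchone()  # "1 pm": the first row goes to paper

  cur_b = b.cursor()
  # "1:30 pm": another user updates every row and commits mid-report.
  cur_b.execute("UPDATE report_base SET amount = amount + 100")
  b.commit()

  rest = cur_a.fetchall()  # "2 pm": still the 1 pm values; no lock,
                           # no hand-rolled snapshot code required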

And what about backups? Backups are intimately related to the read consistency problem, because backups are just really long queries that get persisted to some secondary storage device. Are you going to quiesce your whole database—freeze the whole system—for whatever duration is required to take a cold backup? That’s the simplest sounding approach, but if you’re going to try to run an actual business with this system, then shutting it down every day—taking down time—to back it up is a real operational problem. Anything fancier (for example, rolling downtime, quiescing parts of your database but not the whole thing) will add cost, time, and complexity. On an Oracle Database, high-performance online “hot” backups are a standard feature.

The thing is, your developers could write code to do transactions (read consistency and all) and incremental (“hot”) backups. Of course they could. Oh, and constraints, and triggers (don’t forget to remind them to handle the mutating table problem), and automatic query optimization, and more, ...but to write those features Really Really Well™, it would take them 30 years and a hundred of their smartest friends to help write it, test it, and fund it. Maybe that’s an exaggeration. Maybe it would take them only a couple years. But Oracle has already done all that for you, and they offer it at a cost that doesn’t seem as high once you understand what all is in there. (And of course, if you buy it on May 31, they’ll cut you a break.)

So I looked at the guy who asked me the question, and I told him, it’s kind of like getting married. When you think about getting married, you’re probably focused mostly on the sex. You’re probably not spending too much time thinking, “Oh, baby, this is the woman I want to be doing family budgets with in fifteen years.” But you need to be. You need to be thinking about the boring stuff like transactions and read consistency and backups and constraints and triggers and automatic query optimization when you select the database you’re going to marry.

Of course, my 15-year-old son was in the room when I said this. I think he probably took it the right way.

So my answer to the original question—“Should I learn some of these other technologies?”—is “Yes, absolutely,” for at least three reasons:
  • Maybe some development group down the hall is thinking of installing Mongo DB this week so they can get their next set of features implemented. If you know something about both Mongo DB and Oracle, you can help that development group and your managers make better informed decisions about that choice. Maybe Mongo DB is all they need. Maybe it’s not. You can help.
  • You’re going to learn a lot more than you expect when you learn another database technology, just like learning another natural language (like English, Spanish, etc.) teaches you things you didn’t expect to learn about your native language.
  • Finally, I encourage you to diversify your knowledge, if for no other reason than your own self-confidence. What if market factors conspire in such a manner that you find yourself competing for an Oracle-unrelated job? A track record of having learned at least two database technologies is proof to yourself that you’re not going to have that much of a problem learning your third.

Wednesday, April 3, 2013

And now, ...the video

First, I want to thank everyone who responded to my prior blog post and its accompanying survey, where I asked when video is better than a paper. As I mentioned already in the comment section for that blog post, the results were loud and clear: 53.9% of respondents indicated that they’d prefer reading a paper, and 46.1% indicated that they’d prefer watching a video. Basically a clean 50/50 split.

The comments suggested that people have a lower threshold for “polish” with a video than with a paper, so one way I’ve needed to adjust my thinking is to just create decent videos and publish them without expending a lot of effort on editing.

But how?

Well, the idea came to me in the process of agreeing to perform a 1-hour session for my good friends and customers at The Pythian Group. Their education department asked me if I’d mind if they recorded my session, and I told them I would love it if they would. They encouraged me to open it up to the public, which of course I thought (cue light bulb over head) was the best idea ever. So, really, it’s not a video as much as it’s a recorded performance.

I haven’t edited the session at all, so it’ll have rough spots and goofs and imperfect audio... But I’ve been eager ever since the day of the event (2013-02-15) to post it for you.

So here you go. For the “other 50%” who prefer watching a video, I present to you: “The Method R Profiling Ecosystem.” If you would like to follow along in the accompanying paper, you can get it here: “A First Look at Using Method R Workbench Software.”

Thursday, November 15, 2012

When is Video Better?

Ok, I’m stuck, and I need your help.

At my company, we sell software tools that help Oracle application developers and database administrators see exactly where their code spends their users’ time. I want to publish better information at our web page that will allow people who are interested to learn more about our software, and that will allow people who don’t even realize we exist to discover what we have. My theory is that the more people who understand exactly what we have, the more customers we’ll get, and we have some evidence that bears that out.

I’ve gotten so much help from YouTube in various of my endeavors that I’ve formed the grand idea in my head:
We need more videos showing our products.
The first one I made is the 1:13 video on YouTube called “Method R Tools v3.0: Getting Started.” I’m interested to see how effective it is. I think this content is perfect for the video format because the whole point is to show you how easy it is to get going. Saying it’s easy just isn’t nearly as fun or convincing as showing it’s easy.

The next thing I need to share with people is a great demonstration that Jeff Holt has helped me pull together, which shows off all the tools in our suite and how they interact to solve an interesting and important common problem. But this one can’t be a 1-minute video; it’s a story that will take a lot longer to tell than that. It’s probably a 10- to 15-minute story, if I had to guess.

Here’s where I’m stuck. Should I make a video? Or write up a blog post for it? The reason this is a difficult question is that making a video costs me about 4 hours per minute of output I can create. That will get better over time, as I practice and accumulate experience. But right now, videos cost me a lot of time. On the other hand, I can whip together a blog post with plenty of detail in a fraction of the time.

Where I need your help is to figure out how much benefit there is, really, to creating a video instead of just a write-up. Some things I have to consider:
  • Would people who glaze over (TL;DR) at a longish printed case study gladly watch about 15 minutes of video covering the same material?
  • Or do people just pass on the prospect of watching a 15-minute case study about a problem they might not even realize they have yet? I ask this, because I find myself not clicking on any video that I know will take longer than a minute or two, unless I believe it’s going to help me solve a difficult problem that I know I have right now.
So, if you don’t mind giving me a hand, I’ve created a survey at SurveyMonkey that asks just three simple questions that will help me determine which direction to go next. I’m eager to see what you say.

Thank you for your time.

Tuesday, August 28, 2012

Is “Can’t” Really the Word You Want?

My friend Chester (Chet Justice in real life; oraclenerd when he puts his cape on) tweeted this yesterday:
can’t lives on won’t street - i’m sure my son will hate me when he’s older for saying that all the time.

I like it. It reminds me of an article that I drafted a few months ago but hadn’t posted yet. Here it is...

When I ask for help sometimes, I find myself writing a sentence that ends with, “…but I can’t figure it out.” I try to catch myself when I do that, because can’t is not the correct word.

Here’s what can’t means. Imagine a line representing time, with the middle marking where you are at some “right now” instant in time. The leftward direction represents the past, and the rightward direction represents the future.


Can’t means that I am incapable of doing something at every point along this timeline: past, present, and future.


Now, of course, can’t is different from mustn’t—“must not”—which means that you’re not supposed to try to do something, presumably because it’s bad for you. So I’m not talking about the can/may distinction that grammarians bring to your attention when you say, “Can I have a candy bar?” and then they say, “I don’t know, can you?” And then you have to say, “Ok, may I have a candy bar” to actually have a candy bar. I digress.

Back to the timeline. There are other words you can use to describe specific parts of that timeline, and here is where it becomes more apparent that can’t is often just not the right word:
  • Didn’t, a contraction of “did not.” It means that you did not do something in the past. It doesn’t necessarily mean you couldn’t have, or that you were incapable of doing it; it’s just a simple statement of fact that you, in fact, did not.
  • Aren’t, a contraction of “are not.” It means that you are in fact not doing something right now. This is different from “don’t,” which often is used to state an intention about the future, too, as in “I don’t smoke,” or “I do not like them, Sam-I-am.”
  • Won’t, a contraction of “will not.” It means that you will not do something in the future. It doesn’t necessarily mean you couldn’t, or that you are going to be incapable of doing it; it’s just a simple statement of intention that you, right now, don’t plan to. That’s the funny thing about your future. Sometimes you decide to change your plans. That’s ok, but it means that sometimes when you say won’t, you’re going to be wrong.
Here’s how it all looks on the timeline:


So, when I ask for help and I almost say, “I can’t figure it out,” the truth is really only that “I didn’t figure it out.”

“...Yet.”

...Because, you see, I do not have complete knowledge about the future, so it is not correct for me to say that I will never figure it out. Maybe I will. However, I do have complete knowledge of the past (this one aspect of the past, anyway), and so it is correct to speak about that by saying that I didn’t figure it out, or that I haven’t figured it out.

Does it seem like I’m going through a lot of bother making such detailed distinctions about such simple words? It matters, though. The people around you are affected by what you say and write. Furthermore, you are affected by the stuff you say and write. Not only do our thoughts affect our words, our words affect our thoughts. The way you say and write stuff changes how you think about stuff. So if you aspire to tell the truth (is that still a thing?), or if you just want to know more about the truth, then it’s important to get your words right.

Now, back to the timeline. Just because you haven’t done something doesn’t mean you can’t, which would mean that you never will. Can’t is an as-if-factual statement about your future. Careless use of the word can’t can hurt people when you say it about them. It can hurt you when you think it about yourself.

Listen for it; you’ll hear it all the time:
  • “Why can’t you sit still?!” If you ask the question more accurately, it kind of answers itself: “Why haven’t you sat still for the past hour and a half?” Well, dad, maybe it’s because I’m bored and, oh, maybe it’s because I’m five! Asking why you haven’t been sitting still reminds me that perhaps there’s something I can do to make it easier for both of us to work through the next five minutes, and that maybe asking you to sit still for the next whole hour is asking too much.
  • “You can’t run that fast.” Prove it. Just because you haven’t doesn’t mean you never will. Before 1954, people used to say you can’t run a four-minute mile, that nobody can. Today, over a thousand people have done it. Can you become the most commercially successful and critically acclaimed act in the history of popular music if you can’t read music? Well, apparently you can: that’s what the Beatles did. I’m probably not going to, but understanding that it’s not impossible is more life-enriching than believing I can’t.
  • “You can’t be trusted.” Rough waters here, mate. If you’ve behaved in such a manner that someone would say this about you, then you’ve got a lot of work in front of you. But no. No matter what, so far, all you’ve demonstrated is that you have been untrustworthy in the past. A true statement about your past is not necessarily a true statement about your future. It’s all about the choices you make from this moment onward.
Lots of parents and teachers don’t like can’t for its de-motivational qualities. I agree: when you think you can’t, you most likely won’t, because you won’t even try. It’s Chet’s “won’t street.”

When you think clearly about its technical meaning, you can see that can’t is also a word that’s often just not true. I hate being wrong. So I try not to use the word can’t very often.

Thursday, June 7, 2012

An Organizational Constraint that Diminishes Software Quality

One of the biggest problems in software performance today occurs when the people who write software are different from the people who are required to solve the performance problems that their software causes. It works like this:
  1. Architects design a system and pass the specification off to the developers.
  2. The developers implement the specs the architects gave them, while the architects move on to design another system.
  3. When the developers are “done” with their phase, they pass the code off to the production operations team. The operators run the system the developers gave them, while the developers move on to write another system.
The process is an assembly line for software: architects specialize in architecture, developers specialize in development, and operators specialize in operating. It sounds like the principle of industrial efficiency taken to its logical conclusion in the software world.


In this waterfall project plan,
architects design systems they never see written,
and developers write systems they never see run.

Sound good? It sounds like how Henry Ford made a lot of money building cars... Isn’t that how they build roads and bridges? So why not?

With software, there’s a horrible problem with this approach. If you’ve ever had to manage a system that was built like this, you know exactly what it is.

The problem is the absence of a feedback loop between actually using the software and building it. It’s a feedback loop that people who design and build software need for their own professional development. Developers who never see their software run don’t learn enough about how to make their software run better. Likewise, architects who never see their systems run have the same problem, only it’s worse, because (1) their involvement is even more abstract, and (2) their feedback loops are even longer.

Who are the performance experts in most Oracle shops these days? Unfortunately, it’s most often the database administrators, not the database developers. It’s the people who operate a system who learn the most about the system’s design and implementation mistakes. That’s unfortunate, because the people who design and write a system have so much more influence over how a system performs than do the people who just operate it.

If you’re an architect or a developer who has never had to support your own software in production, then you’re probably making some of the same mistakes now that you were making five years ago, without even realizing they’re mistakes. On the other hand, if you’re a developer who has to maintain your own software while it’s being operated in production, you’re probably thinking about new ways to make your next software system easier to support.

So, why is software any different than automotive assembly, or roads and bridges? It’s because software design is a process of invention. Almost every time. When is the last time you ever built exactly the same software you built before? No matter how many libraries you’re able to reuse from previous projects, every system you design is different from any system you’ve ever built before. You don’t just stamp out the same stuff you did before.

Software is funny that way, because the cost of copying and distributing it is vanishingly small. When you make great software that everyone in the world needs, you write it once and ship it at practically zero cost to everyone who needs it. Cars and bridges don’t work that way. Mass production and distribution of cars and bridges requires significantly more resources. The thousands of people involved in copying and distributing cars and bridges don’t have to know how to invent or refine cars or bridges to do great work. But with software, since copying and distributing it is so cheap, almost all that’s left is the invention process. And that requires feedback, just like inventing cars and bridges did.

Don’t organize your software project teams so that they’re denied access to this vital feedback loop.