Archive for June, 2011

Thoughts on Agile and Scrum

2011/06/26

The 2001 Agile Manifesto (http://agilemanifesto.org/) reads:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions – over processes and tools
Working software – over comprehensive documentation
Customer collaboration – over contract negotiation
Responding to change – over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

There are 12 Principles behind the Agile Manifesto (http://agilemanifesto.org/principles.html):

  1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
  2. Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
  3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
  4. Business people and developers must work together daily throughout the project.
  5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
  6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
  7. Working software is the primary measure of progress.
  8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
  9. Continuous attention to technical excellence and good design enhances agility.
  10. Simplicity–the art of maximizing the amount of work not done–is essential.
  11. The best architectures, requirements, and designs emerge from self-organizing teams.
  12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

My particular view of software development history is that the “waterfall” method (requirements → design → implementation → test → delivery → maintenance) failed because it simply took too long. In fact, each step took too long and still never got things 100% complete and correct. Thus, the reality was that the waterfall method really looked like (imperfect requirements → imperfect design → imperfect implementation → imperfect test → delivery of an imperfect product → extraordinarily expensive maintenance with less-than-satisfied customers). I’ve seen quotes that over 70% of waterfall projects failed before any kind of customer delivery. I’m not sure of the number, but it feels right. In addition, over the long duration of the project, of course the real requirements changed!

The industry reacted to this reality with many efforts prior to Agile: throw one (version) away, rapid prototyping, spiral development, structured programming, etc. Agile also has some competing methodologies. All of these approaches have a philosophy. All have requirements, designs, implementations, testing, delivery, and maintenance. Pundits retell the tiresome joke that the nice thing about software development philosophies is that there are so many to choose from! It is perhaps a bit cynical to say that the agile philosophy became popular because the word “agile” itself captured such a meaningful reaction to the real problems that needed to be addressed.

The big problems today are:

  1. Requirements frequently change. Technology, competition, markets, and internal funding changes are among the many forces that drive changing requirements. The fundamental reality is twofold: after initial requirements and designs are in place and have been reviewed by management and customers, the development process must embrace such changes, and there must be very frequent (weeks, not months) deliveries of working code (and hardware if required) that react to requirement changes as the development proceeds. Such deliveries must be reviewed, face-to-face (at least electronically for geographically dispersed teams), with the development team, the business team, and a significant number of customer representatives. In this open-source world, where the product may depend on independent third-party software, that development needs to be represented as well. Clever companies often “volunteer” an employee to work for “free” on such external projects, and these volunteers can then participate in the reviews. These reviews will most likely change not only the requirements but also the priorities for the next deliverable. The good and the bad news is that people get a clearer understanding of the product once they see some functionality in action. They will see what they like and also what they either don’t like or feel needs improvement. Of course, requirements changes imply new tasks to update requirements and design documents as well as the current implementation. These need to be reviewed again. The hope is that early functionality and early reviews will cause all this changing work to converge. (If it doesn’t converge, the program will run out of money and will fail.)
  2. Not only management but also customers need to be intimately involved with the development team reviews. (I disagree with the word “daily” in Principle 4 and prefer the word “regularly” instead.) In the waterfall model, end customers may or may not have been involved in the requirements phase. Often this was where marketers and product managers intervened to represent the “voice of the customer.” This was too little, too late, and too prone to misinterpretation. After that, the customer was usually not involved until just before delivery, with alpha and beta tests at the customer site. Of course, alpha and beta testing was too late to significantly change the requirements, the design, or most aspects of the implementation. The delivered product was therefore often only a partial step toward meeting a typical customer’s needs. It takes a little selling (or “pre-selling”) to convince potential customers to invest the time that it takes to participate in the development of a product. The primary selling technique is a commitment that the product will directly address the customer’s needs, rather than having the customer pay a fee to customize and/or integrate the product into the customer’s systems. Note that this is not a promise to provide the product for “free.” The product should be good enough that the customer is still willing to pay for it. Other techniques, especially for start-ups, are to put key potential customers on the board and/or to get them to invest in the company by purchasing stock. To protect their investment, they will often get involved with the development process.
  3. Processes become too burdensome. As waterfall-based programs tended to fail, some reacted by tightening up the waterfall development processes. While there was clear value to this (specification reviews, design reviews, code walk-throughs, quality standards, etc.), the main downside was that the formalization added time, effort, and expense. It made requirements changes even more likely! The tone of the 2001 Agile Manifesto might be viewed as “down on process,” but in reality any development methodology needs process to be effective. There are a couple of fundamental problems here. The first is that large programs, say over a couple dozen developers, often need large processes to coordinate all the subprograms and their inter-dependencies. The second is that most process mavens preach customization of processes, and continuous customization is usually required. One insane waterfall type wanted the program manager to formally customize the corporation’s processes (which of course required a corporate approval process) prior to the start of the program, before the program manager even understood what process features would be needed. In reality, without process one has chaos, and even if a rare gem arises out of chaos, management and customers hate chaos. Also, process tends to balance resources and keeps the development team efficient.

    You’d think that the Agile proponents would invent “Agile processes.” Some did, but some did something much more clever. They adopted Scrum, which came out of the automotive and consumer products space. (I’m told that “scrum” refers to how a rugby team restarts the game after an infraction. I don’t follow rugby. In any case, “scrum” is not an acronym.) Scrum allows work tasks to be spun off in an ad hoc fashion into “sprints” of fixed short duration. This allows the program to respond to changing requirements and priorities. Sprints are managed by daily ~15-minute meetings where the developers describe progress since the last meeting, work planned for today, and impediments. Higher-level scrums can be formed (a “scrum of scrums”) to discuss overlaps, integration issues, and global impediments; a representative of each sprint team attends this meeting. A “ScrumMaster” is responsible for resolving impediments (outside the meetings). General reviews and planning sessions are also defined. (I recommend/require that minutes of these meetings be kept. Minutes allow people who can’t attend a meeting to catch up.) The point is to lay out and manage the work needed to achieve the next working release. Various tools and techniques are used to keep track of requirements not yet implemented (the “product backlog”), work needed for the next release but not scheduled as a sprint (the “sprint backlog”), backlog priorities, and progress towards the next release.

    A “product owner” makes final decisions on priorities and represents the stakeholders. The product owner is responsible for gathering “stories” which describe how requirements are to be used. The story format is based on a user type wanting to do something to achieve some result. The scrum philosophy is that stories give more meaning to requirements. The stories are similar to “Use Cases” in Unified Process (UP) systems. The team’s engineering manager usually serves as ScrumMaster, and the program manager (or product manager) usually serves as the product owner.

    The main added value of Scrum is the flexibility of spinning off work tasks or sprints that weren’t part of the original project plan. In fact, a project plan is still useful to define releases and release dates, but it needs to stay at a high level so that it can be easily modified as requirements change. The degree of granularity in the project plan provides the same degree of granularity for progress reports. Priorities, schedule estimates, and work estimates can be added, as well as task dependencies. This allows for multiple earned value reports at this level of granularity. (A small illustrative earned value calculation appears after this list.) The reports are valid until requirements change and the development is rescheduled. With Agile and Scrum, this should occur rather frequently and usually after an intermediate release review.

  4. Security is essential. It needs to be clearly defined as part of the requirements, and the working releases need to address security requirements very early in the program. At the very least, a security model of which users can access what data and how this access is authenticated must be clearly defined in the requirements. Good security comes in defined layers, so that if an outer layer is compromised, there is still protection from the inner layers. Usually security needs to be explained to the prospective customers. There are plenty of examples in the recent literature as well as in this blog! Requirements for misuse of all interfaces also need to be defined, especially what the system does (rather than crash) when a violation is detected; as a matter of security, all misuse must be detected. If security is part of the requirements, it should be no surprise that security needs to be an explicit part of the design and of course the implementation. (A small sketch of such layered checks appears after this list.)
  5. High Availability is essential. Among my friends, Leslie Lamport gets credit for the definition of a “distributed system” as one which can fail for a user due to a failure in a computer that the user never heard of. Of course, everyone wants to make their product as bug-free as possible, but life isn’t always kind. Power outages, network failures, server failures, etc. often conspire to cause users much grief. Users want their systems not to lose or corrupt their data, even in the presence of such failures. They also want such failures to be nearly invisible to them. By “nearly” I mean that the failure is noticed for at most a second or so. Data transfers should restart, and the amount of re-typing the user must do should be trivial. The average time it takes to recover from such a failure is called the Mean Time To Repair (MTTR). The average length of time between such failures is called the Mean Time To Failure (MTTF). Availability is the ratio MTTF/(MTTF+MTTR). It is the percent of time that the system is fully functional. Good availability is something like 0.999 (“three nines”). High availability is usually something better than 0.9999 (four nines), and very high availability is better than 0.99999 (five nines). Air traffic control systems want an availability better than 0.999999 (six nines). If data is corrupted, then the repair time must include the time to fix the corrupted data. Usually, a system won’t meet any reasonable availability requirements if data can get corrupted. [Cf. the other post on high availability in this blog.] (A small sketch of the availability arithmetic appears after this list.)

    Much like security, availability needs to be put into the requirements and have that requirement flow to the design and implementation. High availability is difficult to achieve, and Agile implementation cycles need to estimate availability at every iteration. Vendors need to be involved, including network and computing server vendors, and even power companies. Usually availability commitments need to be made by these vendors that include inventory of spares, 24-hour service personnel, and various failover features of their systems.
  6. Verification/testing is as important as development. (“Verification” encompasses testing; it usually also includes requirements reviews, design reviews, code reviews, and test reviews. It definitely includes a review of all test results.) At the very least, spend as much money and expertise on testing as on development. Some would say more of each, and some estimate that testing should be as much as 75% of the development budget. (How much verification/testing does an air traffic control system require?) Include hiring a hacker to try to break into the system at every stage of development. Agile emphasizes the quality of the engineers on the team (a stronger form of Principle 5); don’t skimp on the quality of the test engineers. In fact, mixing up assignments on test sprints and development sprints is a good idea (Agile Principle 11). Make the design of testing part of the design of the system, and make sure tests can be automated. Every Agile working delivery must include a working test suite to verify it! The worst part of the waterfall method is that testing is explicitly the last step before delivery. Keep in mind that testing must include both “white box” testing, where the design of the software is known to the test developers, and “black box” testing, where the tests try to break the system with knowledge only of what it does and not of how it works. Black box testing usually includes “stress testing,” which consists of using a wide variety of both legitimate and illegitimate input data. Organize the legitimate input data so that stories are tested. Be sure that enough (story) testing is done to get the required test coverage. Testing a more complex story is essentially a simulation. Some products might require hundreds of hours of simulation, which makes it clear that the automated test suites need to consider how the output of the suite of tests can be analyzed and how reports for management and customers can be automatically generated.

    Bugs will be found in the product through normal usage that are not found by the suite of tests. Fix these bugs only after fixing or augmenting the test suites so that the tests can find the bug. This is the most fundamental aspect of good regression testing: you need to be sure you can detect such a bug if it re-occurs after some other software change. (A small sketch of this regression-first habit appears after this list.)
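
To make the earned value remark in item 3 concrete, here is a minimal sketch of the standard earned value arithmetic (planned value, earned value, actual cost, and the derived SPI/CPI indices). The task names and dollar figures are hypothetical, chosen only to illustrate the calculation, not to prescribe any particular tool.

```python
# Minimal earned value sketch (hypothetical tasks and numbers).
tasks = [
    # budgeted cost, fraction complete, actual cost spent so far
    {"name": "login sprint",   "budget": 40.0, "done": 1.00, "actual": 45.0},
    {"name": "reports sprint", "budget": 60.0, "done": 0.50, "actual": 20.0},
    {"name": "search sprint",  "budget": 30.0, "done": 0.00, "actual": 0.0},
]

planned_value = sum(t["budget"] for t in tasks)              # PV: budgeted cost of work scheduled by this review (here, all three tasks)
earned_value  = sum(t["budget"] * t["done"] for t in tasks)  # EV: budgeted cost of work actually completed
actual_cost   = sum(t["actual"] for t in tasks)              # AC: money actually spent so far

spi = earned_value / planned_value   # schedule performance index (<1 means behind schedule)
cpi = earned_value / actual_cost     # cost performance index (<1 means over budget)
print(f"EV={earned_value:.0f}  SPI={spi:.2f}  CPI={cpi:.2f}")
```

As item 3 notes, such a report is only valid until the next replanning; with Agile and Scrum that happens often.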
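
For the layered security model in item 4, here is a minimal sketch of the idea that every request passes through independent checks (authentication, authorization against an access model, and input validation), and that detected misuse produces a defined refusal rather than a crash. The user names, tables, and rules are hypothetical.

```python
# Hypothetical layered check: each layer can refuse a request independently.
ACCESS_MODEL = {"alice": {"orders"}, "bob": {"orders", "payroll"}}  # who may read what

class Refused(Exception):
    """A defined, reportable refusal -- the system never simply crashes on misuse."""

def handle_read(session_user, authenticated, table, row_id):
    if not authenticated:                                      # layer 1: authentication
        raise Refused("not authenticated")
    if table not in ACCESS_MODEL.get(session_user, set()):     # layer 2: authorization
        raise Refused(f"{session_user} may not read {table}")
    if not isinstance(row_id, int) or row_id < 0:              # layer 3: input validation
        raise Refused("malformed row id")                      # misuse is detected, not ignored
    return ("ok", table, row_id)                               # inner layers (e.g. encryption) would follow

print(handle_read("alice", True, "orders", 7))   # allowed
try:
    handle_read("alice", True, "payroll", 7)     # blocked by the access model
except Refused as err:
    print("refused:", err)
```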
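
The availability ratio in item 5 is easy to turn into a worked example. The MTTF and MTTR figures below are made up purely to show the arithmetic and how quickly the “nines” improve as repair time shrinks.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    # Availability = MTTF / (MTTF + MTTR): the fraction of time the system is up.
    return mttf_hours / (mttf_hours + mttr_hours)

HOURS_PER_YEAR = 24 * 365

for mttf, mttr in [(1000, 1.0), (1000, 0.1), (10000, 0.1)]:   # hypothetical figures
    a = availability(mttf, mttr)
    downtime_min = (1 - a) * HOURS_PER_YEAR * 60
    print(f"MTTF={mttf}h MTTR={mttr}h -> availability={a:.5f}, "
          f"~{downtime_min:.0f} minutes of downtime per year")
```

With these assumed numbers, a 1000-hour MTTF and a 1-hour MTTR give roughly three nines; cutting the repair time to six minutes gives four nines, which is why data corruption (a long repair) is so damaging to availability.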
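
Finally, item 6’s rule of “capture the bug in the test suite before fixing it” can be shown with a tiny pytest-style regression test. The parse_quantity function and its off-by-one bug are hypothetical; the point is only the order of operations: the test is added to the automated suite first, so the bug stays detectable forever.

```python
# Hypothetical module under test: the original version rejected the legal value 0.
def parse_quantity(text: str) -> int:
    value = int(text)
    if value < 0:              # the fix: the original (buggy) check was "value <= 0"
        raise ValueError("quantity must be non-negative")
    return value

# Regression tests added *before* the one-character fix above, so this bug can
# never silently reappear after some later software change.
def test_zero_quantity_is_legal():
    assert parse_quantity("0") == 0

def test_negative_quantity_is_rejected():
    import pytest
    with pytest.raises(ValueError):
        parse_quantity("-3")
```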

Agile and Scrum, despite the detractors who hate the words “sprint” and “ScrumMaster”, can be used to build modern applications. Changing requirements will force changes to requirement and design documents as well as changes to existing implementation code. This allows these documents and the code to be initially incomplete, and the development process should be designed around reviews, including business management and customers, that complete and perfect them. Scrum provides flexible, customizable processes that allow for such iterative refinement. Deeper requirements such as security and availability need to be included at the beginning, and each iterative deliverable needs to address them. Verification/testing must be covered by explicit requirements, including requirements for black box and white box testing, for test coverage, and for robust simulations. All tests should be automated, with output appropriate for the required analysis of results and for generating reports suitable for management and customers. Verification and testing need to be given at least the same quality and quantity of resources as development. Projects can be managed with the usual tools, including earned value; however, expect projects to be replanned frequently per the Agile philosophy.


RSA Hacked; Tokens Stolen

2011/06/12

If this is your first visit to this blog, please start here.

Apparently the RSA computers were hacked in March 2011, and what was most likely stolen were the generators and algorithms for the 6-digit tokens that those cute SecurID frobs produce every minute or so. The tokens are used as part of an additional layer of security when accessing a remote site.

Later, L-3 Communications claimed that hackers attacked the L-3 site using these stolen tokens. This particular attack was apparently thwarted by noticing the attack and turning off all remote access. Northrop Grumman recently shut down its remote access, but didn’t say why. Lockheed confirmed in early June that hackers had compromised a single account using stolen SecurID data, but Lockheed claimed its quick action stopped anything significant from being stolen.

Not much information has been released as to what was stolen from RSA or how the attackers use the stolen information. My guess is that they are somehow able to generate the same codes that the RSA SecurID token generator does, without the physical SecurID frob. Knowing how the SecurID token is used allows the attacker to defeat one layer of security. The next layer is typically a username/password dialog.
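
RSA has never published SecurID’s exact algorithm, so the following is only a generic sketch of how a time-based token scheme of this kind works (in the spirit of the open HOTP/TOTP standards, RFC 4226/6238, not RSA’s proprietary design). It illustrates why the stolen seeds matter: anyone who holds a token’s secret seed and knows the algorithm can compute the same 6-digit codes without the physical frob.

```python
import hmac, hashlib, struct, time

def time_token(secret_seed: bytes, interval_s: int = 60, digits: int = 6) -> str:
    """Generic time-based one-time code (TOTP-like); NOT RSA's actual algorithm."""
    counter = int(time.time()) // interval_s             # changes every ~minute
    mac = hmac.new(secret_seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                               # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Server and frob share the seed, so both compute the same code each interval.
# If that seed leaks, an attacker computes it too -- hence the replacement program.
print(time_token(b"example-seed-not-real"))
```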

The moral here is that one should never have a weak password, and you should never be tricked into revealing it, e.g. via a phishing attack. Your password is your last line of defense!

RSA (owned by EMC) is replacing the SecurID frobs with new ones with a different generating algorithm.  An open letter from RSA is posted here.

-gayn

Another Security Breach at Sony

2011/06/07

If this is your first visit to this blog, please start here.  This particular discussion starts here.

Just as Sony was trying to reassure Congress at a data security hearing, the hacker group LulzSec posted another million names, addresses, dates of birth, and passwords. LulzSec stated, “Why do you put such faith in a company that allows itself to become open to these simple attacks?” This can’t look good for the new security officer at Sony!

LulzSec stated that a single SQL “injection” allowed this access. OK, so there’s a security hole, but why wasn’t the data encrypted? This lack of encryption boggles my mind!
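
For readers wondering what “a single injection” and the missing encryption mean in practice, here is a minimal sketch of the two standard defenses: parameterized SQL queries (which make classic injection strings inert) and storing salted password hashes instead of plaintext. The table and column names are hypothetical.

```python
import hashlib, os, sqlite3

def store_user(conn: sqlite3.Connection, name: str, password: str) -> None:
    # Never store the plaintext password: keep a per-user salt and a slow hash.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    conn.execute("INSERT INTO users (name, salt, pw_hash) VALUES (?, ?, ?)",
                 (name, salt, digest))

def find_user(conn: sqlite3.Connection, name: str):
    # Parameterized query: the input is bound as data, never spliced into the SQL,
    # so a payload like "' OR '1'='1" cannot change the statement's meaning.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, salt BLOB, pw_hash BLOB)")
store_user(conn, "alice", "correct horse battery staple")
print(find_user(conn, "alice"))
print(find_user(conn, "' OR '1'='1"))   # finds nothing; the injection string is inert
```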

Another Amazon Outage

2011/06/07

Amazon claims that their cloud computing expertise came from running their successful web sales site amazon.com. Enter, from stage far left, Lady Gaga. Her new CD gets priced on Amazon for only 99 cents, for one day only. Well, this turned out to be so popular that the high volume effectively crashed amazon.com and left it inoperative. In other words, Lady Gaga’s CD sales generated a volume that exceeded the expansion capacity of amazon.com.

So what is the moral of this story?  The moral is NOT that high volume can exceed the capacity of a site.  We’ve known that for years.  The moral is also NOT that Amazon should have had more capacity.

The moral is that Cloud vendors need to be able to wall off the expansion capacity of one virtual system in such a way that other virtual users of that Cloud  can still function when the first overloads.  Your Service Level Agreement (SLA) should address this.
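
As a minimal sketch of what “walling off” one tenant’s load could look like, here is a hypothetical per-tenant token bucket: when one tenant’s burst exhausts its own allowance, other tenants’ requests are still admitted. Real cloud schedulers and SLAs are of course far more involved; this only illustrates the isolation idea.

```python
import time
from collections import defaultdict

class TenantThrottle:
    """Hypothetical per-tenant token bucket; one tenant's burst cannot starve the rest."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate, self.burst = rate_per_sec, burst
        self.tokens = defaultdict(lambda: burst)    # per-tenant remaining allowance
        self.last = defaultdict(time.monotonic)     # per-tenant last refill time

    def allow(self, tenant: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last[tenant]
        self.last[tenant] = now
        self.tokens[tenant] = min(self.burst, self.tokens[tenant] + elapsed * self.rate)
        if self.tokens[tenant] >= 1.0:
            self.tokens[tenant] -= 1.0
            return True
        return False    # this tenant is over its own quota; other tenants are unaffected

throttle = TenantThrottle(rate_per_sec=10.0, burst=20.0)
print(throttle.allow("gaga-store"))    # True until that tenant's own bucket runs dry
print(throttle.allow("quiet-tenant"))  # still True: a separate bucket
```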

-gayn

Solar Panels: PV vs. CPV vs. ???

2011/06/01

[I promise to post – sometime in the near future – my old notes on how PV and CPV solar panels work.  Hopefully this post won’t require much in the way of prerequisites.  -gayn]

While the history lesson of the Betamax has not escaped me, I continue to believe that Concentrating PVs (CPVs) will win out over PVs, where, of course, PV = photovoltaic solar cell.

The basic issue is that PVs are simpler, cost less (per square foot), and yet have lower efficiency and hence, for a given panel size, produce less electricity. In contrast, CPVs are far more complex, cost more (per square foot of panel), and have almost triple the efficiency, producing far more electricity for a given panel size. The additional complexity also means more maintenance for a CPV panel. Finally, the PV team shouts, “We’re cheaper now!”
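
To make that comparison explicit, here is the basic cost-per-watt arithmetic. The dollar and efficiency figures are assumptions chosen only to illustrate the calculation, not measured data; the argument below is about how the CPV numbers are likely to move, not where they stand today.

```python
SUN_W_PER_M2 = 1000.0   # standard test-condition insolation

def cost_per_watt(panel_cost_per_m2: float, efficiency: float) -> float:
    # $/W = ($/m^2) / (W/m^2); lower is better.
    return panel_cost_per_m2 / (SUN_W_PER_M2 * efficiency)

# Hypothetical numbers purely for illustration:
pv  = cost_per_watt(panel_cost_per_m2=150.0, efficiency=0.15)   # simple flat PV panel
cpv = cost_per_watt(panel_cost_per_m2=450.0, efficiency=0.40)   # complex CPV module
print(f"PV  ~${pv:.2f}/W    CPV ~${cpv:.2f}/W")
```

With these made-up numbers the PV team’s “we’re cheaper now!” still holds; the rest of the post argues that the CPV numerator will fall faster than the PV one.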

OK, so why am I so optimistic about CPV? Let’s consider the three parts of a CPV panel. First, the optics – either parabolic mirrors or Fresnel focusing lenses – are made of cheap glass. With volume, this piece will become extremely cheap.

Second, the high-tech semiconductor alchemy that produces the stack of three solar diodes, which capture three different wavelength ranges of light from the (concentrated) sunlight, is going to plummet in cost. While there is still a race to find the right semiconductor materials and the right doping combinations for maximum efficiency, there is a theoretical limit on efficiency for such cells. Soon enough the competitors will converge, if not on one solution, then on solutions similar enough that the manufacturing folks can optimize the process of creating them and benefit from the increasing volumes.

Semiconductor process technology will also make these chips smaller, which is greatly to the advantage of the folks with the concentrating optics. Currently the ratio of focused sunlight cross-sectional area to solar cell area is around 600 to one, and this will surely go to 1000 to one, with the obvious economic benefit.
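
A quick back-of-the-envelope on that ratio: at a geometric concentration of C suns, roughly 1/C of the panel’s aperture must be covered by the expensive multi-junction cells, so going from 600:1 to 1000:1 cuts the cell material per panel by about 40% (ignoring optical losses and packing details).

```python
for ratio in (600, 1000):
    # Cell area needed per square meter of aperture, ignoring optical losses.
    print(f"{ratio}:1 concentration -> {1.0 / ratio * 1e4:.1f} cm^2 of cell per m^2 of panel")
```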

Third, the originally clunky dual-axis solar trackers necessary to keep the CPV panels orthogonal to the sun’s rays are already getting lighter, cheaper, and more reliable. Another economic benefit.

The CPV vendors, due to the cost and complexity of their initial products, seem to have been going after the big utility installations. As the above economic benefits materialize, smaller installations will become more reasonable, e.g. commercial and residential rooftops. This will drive the volumes up and the prices down. It may also permit the CPV vendors to do what the PV vendors are doing now by financing installations.

So what could make all this crystal ball gazing go wrong? Well, there are signs of unbelievably cheap PV materials not based on silicon wafers. I could imagine material so cheap that every building would just drape it over every surface that faces the sun. (I just don’t see this happening very soon, if at all.)

Another possibility would be that another form of Green Energy comes along that is an order of magnitude more cost effective than solar.  There are lots of candidates, from making energy from ocean waves, to making it from trash.  Again, I don’t see any of these hitting the shores like a tsunami.

Thus, my money is on CPV technology – at least for the foreseeable future.

 

-gayn