Perils of Copy-Paste Programming

Double oops

Interestingly, my change “solved” the perceived bias in the shuffling code, but it actually introduced real bias in an otherwise correct implementation of the Knuth-Fisher-Yates shuffle algorithm.

This was pointed out to me yesterday (11 March 2017) in the comments on the answer I copy-pasted by Steve Marx. His comment included a link to a post by Jeff Attwood on his Coding Horror blog, aptly titled The Danger of Naïveté.

Sooooo…. In this case the perils of copy-paste programming didn’t materialize, and succumbing to the “your shuffle is biased” peer-pressure actually led to introducing a change that introduced an actual bias that was only perceived bias before.

The morale of this story still stands: don’t copy-paste code you don’t understand. I am now adding to it: don’t change your code just because someone tells you there is something wrong with your code. Let them show it to be wrong.


Picard and Riker double facepalm
Couple of weeks ago, you were looking for a quick way to shuffle the names of your teammates so you could make the order in which everybody got a turn in the standup a bit more unpredictable and could also make appointing the “designated driver” for the standup a bit of a lottery.

Rush

You were in a bit of a rush. So you did a quick search and one of the first Google results looked to provide exactly what you needed: How to randomize (shuffle) a JavaScript array?

/**
 * Randomize array element order in-place.
 * Using Durstenfeld shuffle algorithm.
 */
function shuffleArray(array) {
    for (var i = array.length - 1; i > 0; i--) {
        var j = Math.floor(Math.random() * (i + 1));
        var temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }
    return array;
}

That should do the trick.

A little voice inside your head spoke up: “You know what you always say: don’t copy-paste what you don’t understand.”

You told it to shut up.

Admittedly, you don’t really understand what’s going on in the Math.floor() statement, but what could be wrong with this tiny bit of code. Besides, it’s coming from StackOverflow. Not exactly the worst site in the world. And from an answer with 301 upvotes! What can be wrong with something like that?

So you copy paste the function provided in the answer in your page, call it from your own shuffleAndShow() function that you already coded to show an unsorted list of names.

Job done.

You and your teammates now enjoy a little unpredictability in the order in which you speak during your standup and in determining who is going to be the “designated driver” for the next standup.

Yay!

Favoritism

Couple of weeks later, some of your teammates are starting to complain. The shuffle page seems to have a distinct preference for the lower part of the alphabet.

Doesn’t ruffle the feathers with regard to the speaking order, but being the designated driver (chairing the meeting, presenting the board, opening issues, etc so everyone in the call is on the same page) is seen as a bit of a duty that most would rather avoid.

Hmm.

You decide to look into it. There is really only one line in that piece of code that could be the culprit.

var j = Math.floor(Math.random() * (i + 1));

What could be wrong with it?

Google leads you to Generating random whole numbers in JavaScript in a specific range?.

The answer by Ionuț G. Stan has a really comprehensive answer. Reading makes you understand what is going on and why the function you copy-pasted is favoring the lower end of the array passed into it.

To get a really random number between a min and a max, you need

/**
 * Returns a random number between min (inclusive) and max (exclusive)
 */
function getRandomArbitrary(min, max) {
    return Math.random() * (max - min) + min;
}

And to get a random integer value between min and max, you need:

/**
 * Returns a random integer between min (inclusive) and max (inclusive)
 * Using Math.round() will give you a non-uniform distribution!
 */
function getRandomInt(min, max) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

Hmm. Ok. So the code you copied did get the use of floor() correct.

But the i + 1 it uses to multiply the result of the math.random() call doesn’t seem to make any sense in the light of Ionut’s explanation.

It should be:

var j = Math.floor(Math.random() * (max - min + 1)) + min;

Where max the index of the last item in the array passed to the shuffle function and min is the index of the first item in any array.

The index of the last item in any Javascript array is array.length() - 1.
The index of the first item in any Javascript array is 0.

Which leads to:

var j = Math.floor(Math.random() * ((array.length() - 1) - 0 + 1)) + 0;
//                                  ......max...........  min       min

And:

var j = Math.floor(Math.random() * array.length());

Morale of the story

This of course was a very insignificant piece of code and not much (if any) harm was done. But it does illustrate an important point.

It’s ironic because, it happened to me and I am known to speak out against copy-paste programming. That is, I don’t mind copy pasting, but at least understand what you are copying. The first time I succumb to the allure of copy-paste programming without understanding what the code is doing, I get bitten in the behind. Ouch.

On the positive side: serves me right and reinforces my own message to me.

Don’t copy-paste anything you don’t understand.


What have you copy-pasted that has come back to haunt you?

Posted in Development Practices
Tags: , , , ,
3 comments on “Perils of Copy-Paste Programming
  1. Hugo Logmans says:

    Aangezien de kans groot is dat je zelf achteraan het alfabetisch gesorteerde lijstje staat, geloofde zeker niemand dat het niet met opzet was…

    Wel grappig dat ik direct zag dat met het teruglopen van de iterator alleen met een lager element gewisseld werd, maar mijn eerste reactie was: oh ja, daar was een reden voor. Ga je net zo hard de fout in.

  2. Steve Marx says:

    The code you found on Stack Overflow is actually correct, and your proposed fix will introduce bias. See https://blog.codinghorror.com/the-danger-of-naivete/ for a nice writeup of this mistake.

    • Marjan Venema says:

      Thanks Steve. Edited the post to point this out more pointedly 😉 then buried in the comments.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

Show Buttons
Hide Buttons