I wrote a post on the lab blog about what I see as the Documentation Dilemma. How much time should you spend making sure every piece of your work is individually useful vs. pushing on to a final product?
In so many parts of our lives, our focus becomes a tunnel towards just ‘making it work’ at whatever cost. For now, I’m going to avoid the snake-filled bog that is the exploration of that idea in the realm of morality. In engineering and science, though, dogged focus on a solution can be incredibly valuable and also very dangerous. Of course, it can lead to rousing success. However, the solution can end up built on hidden assumptions, forgotten in the dash to the finish. If they are undocumented, they are a giant trap for anyone who tries to use your result. There are many examples, but I’ll go with a personal one:
The danger was particularly salient for me today. I’ve been trying to get QuIRK-E (my eddy-current model) to match the measurements of a paper that experimentally investigated eddy-current forces.
It’s so easy to go down the rabbit hole: ‘well, it should work if I just add this term and tune the constants. Oh, that didn’t work, but got me closer, let’s add this other term.’ If I hadn’t been forcing myself to write down every step I took and why I took it, I could easily have gotten lost in the warren. I had to say to myself ‘stop. You’ve completely removed the model from physical reality, just to exactly replicate this ONE SITUATION.’
It’s so easy for this to happen:
The model (in green) looks like it really captures real life (blue), doesn’t it? Let’s zoom out a bit.
Now, if you’re only going to use the model in the smaller first region and are explicit about that fact (aha! The value of documenting assumptions raises its head) then it’s a fine model. But claiming that the model ‘captures the system’ is just untrue.
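To make the zoom-out effect concrete, here is a minimal sketch in Python. It does not use QuIRK-E or the paper's data: `real_system` and `tuned_model` are hypothetical stand-ins (a sine curve and a cubic that happens to track it near zero), and `VALID_RANGE` is an assumed name for the documented region where the model is meant to be used.

```python
import math

def real_system(x):
    """Stand-in for 'real life' (an assumed ground truth, not measured data)."""
    return math.sin(x)

def tuned_model(x):
    """Stand-in for a model tuned on a narrow range: a cubic that tracks sin(x) near zero."""
    return x - x**3 / 6.0

def max_error(lo, hi, steps=200):
    """Largest absolute gap between model and reality on the interval [lo, hi]."""
    worst = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        worst = max(worst, abs(tuned_model(x) - real_system(x)))
    return worst

# The documented assumption: the model is only claimed to work here.
VALID_RANGE = (0.0, 1.0)

narrow = max_error(*VALID_RANGE)
wide = max_error(0.0, 6.0)
print(f"max error inside documented range: {narrow:.4f}")
print(f"max error after zooming out:       {wide:.4f}")
```

Inside the documented range the two curves are nearly indistinguishable; over the wider range the gap is enormous. Writing `VALID_RANGE` down next to the model is the whole point: the model is fine as long as that line travels with it.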
I worry that this happens all the time in many areas in science, engineering, and the rest of life as well. Someone develops a model or heuristic that works very well for a certain area or set of data. But the specialized purpose of the model is lost, either because they don’t record it or because they never acknowledge it in the first place.
It’s time for some tinfoil-hat-line-skirting ideas! I’ve been mulling the idea of automatic documentation.
Getting into the ‘flow state’ is awesome. You feel like you’re rocketing through a problem – one task leading seamlessly to the next until you skid to a stop and say ‘whoa.’ The flow state can also be dangerous for long-term progress. After the fact, it can be very hard to figure out exactly what you were doing and why. This makes it hard to build upon your past work.
Imagine a process that runs in the background and pays attention to your work. The documenter would make note whenever you make a large change (creating a new function while programming, generating new constants when modeling, composing three new pages when you’re writing, you name it.) At that point it could pop up and ask you to make a note about what’s going on or quietly log the event and associated actions for future analysis (not interrupting your flow.)
There are plenty of ways to vaguely measure what you’re doing on your computer. The problem is that, like a lot of what’s going on in sensing, measuring, and big data, they’re not very good at distilling what’s happening, especially in real time. The key to making this documenter successful would be the ability both to record what you’re doing in a useful way and to recognize important changes.
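As a rough illustration of what ‘recognizing important changes’ might look like for the programming case, here is a small Python sketch. It is only one possible approach, not a design for the documenter: it compares two snapshots of a source file and flags newly created functions and large edits. The names `function_names` and `notable_events` and the `big_change_lines` threshold are all made up for this example.

```python
import ast
import difflib

def function_names(source):
    """Names of all functions defined in a piece of Python source."""
    tree = ast.parse(source)
    return {node.name for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))}

def notable_events(before, after, big_change_lines=20):
    """Compare two snapshots of a file and list the changes worth a note."""
    events = []
    # A brand-new function is treated as a 'large change'.
    for name in sorted(function_names(after) - function_names(before)):
        events.append(f"new function: {name}")
    # Count added lines; ndiff marks each one with a leading '+ '.
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    added = sum(1 for line in diff if line.startswith("+ "))
    if added >= big_change_lines:  # threshold is an arbitrary assumption
        events.append(f"large edit: {added} lines added")
    return events

# Two hypothetical snapshots of the same file, taken a few minutes apart.
before = "def load(path):\n    return open(path).read()\n"
after = before + "\ndef tune(gain):\n    return gain * 1.07\n"
print(notable_events(before, after))
```

A real documenter would then decide whether to pop up and ask for a note or to log the event quietly. The hard part, deciding which events actually matter, is exactly what a crude threshold like this does not solve.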
Ideally, this would eventually be valuable to anybody who does work on a computer. The group that I think could benefit the most might be novice programmers, who haven’t gotten the habit of documentation hammered into their heads. (I’m still working on that hammering.)
There would of course be privacy issues – a company could use it to learn exactly what you’re doing on Facebook, the government could confiscate the logs and use them to indict you – but like any technology it would be a tool that could be used for good OR bad.
I’ve been working on the QuIRK-E project, in particular, tuning parameters so that it produces numerical results that resemble reality. [link to blog] I plan to share it on GitHub soon, but until then, I wanted to bring up some important points about models. I may have made these points before, but I see so few people who actually think about them that I’m going to risk repeating myself.
The rise of cheap computing has allowed numerical models to explode into basically every domain. Some wonderful discoveries have come of this. It’s important to remember two things, though: they are only models, not reality, and a great deal of human discretion goes into modeling.
Every model is influenced by the discretion of the modeler, from something as simple as fitting a line to data points to running a genetic algorithm. Even though the genetic algorithm nominally ‘figures it out for itself,’ the human still sets a number of parameters that heavily influence the outcome. And it’s so tempting to tweak parameters without justification except that they give you the answer you want.
I’m trying to make sure I don’t fall into this trap by meticulously noting whenever I change a number and why I did it. Even ignoring any malicious intent, it’s so easy to get into a flow state of ‘if I just tweak this one more number it will work!’ The proliferation of models and the inevitable tweaking temptation increases the need for something like automatic documentation.
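The habit of noting every changed number and the reason for it can be partly enforced in code. Below is a minimal Python sketch under stated assumptions: `LoggedParameters` is a hypothetical helper (not part of QuIRK-E), and the parameter name, values, and justification are invented purely for illustration.

```python
from datetime import datetime, timezone

class LoggedParameters:
    """Parameter store that refuses to change a number without a written reason."""

    def __init__(self, **initial):
        self.values = dict(initial)
        self.history = []  # one entry per change, oldest first

    def change(self, name, new_value, why):
        if not why.strip():
            raise ValueError("every change needs a written justification")
        self.history.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "name": name,
            "old": self.values.get(name),
            "new": new_value,
            "why": why,
        })
        self.values[name] = new_value

# Hypothetical parameter and values, for illustration only.
params = LoggedParameters(conductivity=3.5e7)
params.change("conductivity", 3.8e7, "matches the value listed for the plate material")
for entry in params.history:
    print(f'{entry["name"]}: {entry["old"]} -> {entry["new"]} ({entry["why"]})')
```

This does not stop you from writing a lazy reason, but it breaks the ‘just one more tweak’ rhythm long enough to ask whether the change has any physical justification.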
What you should take away is that any time ‘a computer told us so,’ a human directed it in how to do that. It’s often forgotten that scientific models are just that – models. Nobody thinks that a model castle captures all of reality; we realize instead that it’s a useful representation, whether for discovery or illustration. We should remember to treat scientific models the same way.
Computer models are like the yin to human intuition’s yang. Models bolster our intuition’s weaknesses and can teach us a lot of things. But it’s important to remember that, like yin, a model can’t stand on its own, and all models have a little dot of human choice in them.
Whenever I spend a lot of time in lab, I inevitably run into some bizarre equipment-related challenge, and can’t help but think of how Hayek’s knowledge problem manifests itself in engineering land.
The knowledge problem isn’t unique to engineering – there is distributed, circumstantial knowledge that can’t be easily transferred everywhere. However, I find it especially frustrating in the engineering domain because of engineering’s normally excellent breadth, spanning the spectrum from theoretical to practical.
The problem is that theory is like a zoomed-out map, smoothing out the details and giving you a broad overview of how to get somewhere. It doesn’t tell you anything about potholes or deer crossing the street. So while a lot of my colleagues ‘know’ how to do a task, that knowledge is like knowing which turns to make from looking at a map, when I actually need advice on when to swerve to avoid deer and potholes.
Programming is an interesting example where I think a technical field has managed to combat the knowledge problem. Many problems you encounter while programming give you an error message that you can literally copy and paste into Google. You can also post your exact code. This lets others recognize that they have the same problem, and it gives anyone answering a very specifically defined problem to work with.
Other aspects of engineering don’t have that luxury. How do you describe the problem with your sensor where it gives the correct signal on average, but randomly shoots low or high every couple of microseconds? Or that you need to figure out how to get a specifically shaped piece of plastic out of a tight hole (and you think the material is Delrin, but aren’t certain, and the material properties actually matter in terms of not breaking anything)? These are questions that really need a conversation to transmit, let alone to get a good answer.
In many situations, I know there’s someone who would know how to deal with the problem immediately but I have no idea how to find that person. It seems like this doesn’t have to be the case.
As engineering, like everything, gets more complicated, I see two possible solutions to the knowledge problem: improved networking (connecting specialists with others more efficiently) and changes in documentation paradigms.
I wanted to delve a little deeper into practical habits I find useful when you’re tinkering like a madman.
Documentation. One oft-cited reason for studying history is that ‘we can’t move forward unless we understand where we’ve been.’ It’s also true in tinkering – a surprisingly hard habit to internalize. If you think planning where you are going is boring, writing down where you’ve been is even worse. If it was a success, huzzah! It works and you just want to make the next thing. If it was a failure, you just want to scratch that off the list and try the next option. But if you don’t write down what you tried, why you tried it, and why you think it worked or failed, chances are you will forget at least some details that may be incredibly relevant later – you just don’t know.
Before I forced this habit on myself, I would sometimes circle back on already-asked questions and failed attempts in design space. Remember that Edison, the tinkerer to end all tinkerers, kept meticulous notes. Do you think without them he would have been able to home in on a specific type of bamboo from Japan as the correct material for light bulb filaments? I doubt it.
Different tiers of difficulty – Try to mentally categorize the small steps into rough ‘tiers’ of difficulty and have an idea of how much activation energy/planning, programming, machining, and testing (the four big time-eating categories) it will take to jump from one to the next. Here, I can’t help but evoke the image of evolving Pokémon – apologies to anybody who didn’t grow up in the ’90s. It’s an important exercise because unknowingly skipping tiers can lead to disaster and disappointment. Remember, though, this categorization should be a quick mental exercise with maybe a note or two: a five-minute session, not the day’s project.
For example: Our lab is trying to test the effects of spinning permanent magnets for eddy-current actuation. Tier one is sticking the magnet on the end of a drill bit and going to town: an easy proof of concept – no machining or programming necessary. Tier two is building a set-up with the magnet at the end of a powered motor – eliminating precariously held drills: medium – there is machining and unavoidable planning: you need to make sure the motor is powerful enough and can actually be mounted usefully, otherwise the machining will have gone to waste. And finally, tier 3 is controlling the motor with an Arduino and an H-bridge, which will require some amount of basic electronics and programming.
I said it before, but I really want to stress that it’s essential to keep a realistic assessment of your strengths and weaknesses in mind. For example, it’s relatively easy for me to get the activation energy to jump up and do a test, but a task that requires coding in something besides MATLAB is like tar as I slog through documentation. Thus, I try to substitute a bunch of small tests for something that is nominally more efficient, but involves coding in C. This may be the complete opposite for someone with a different skill set.
So ask yourself: what really trips me up? It’s great to shore up those weaknesses, but unless that’s one of the goals of the project, the madman stage of prototyping is not the time.