Preparing for the ACCU conference and making a few tweaks to my upcoming book in line with some feedback, I haven’t had a chance to write an editorial. I still dream of creating an automatic editorial generator, but need to learn a lot more about natural language processing to make any progress with that. So what has been eating my time?
At work, I am trying to rejuvenate and repurpose some old FORTRAN code. Originally, it was used to model software reliability. In fact, there are several models with subtle differences in approach, but at a high level, they each take a list of event times and use these to predict when future events might happen. In terms of software reliability, an event means finding a bug, or other fault, in a code base and fixing it. Realists may say you can’t find and fix a bug instantaneously. In modelling terms, you should never let reality get in the way, so you could use the time when the bug fix is committed to the code base instead. If the event times are getting further apart, things are improving. In theory, you can fit these to some statistical models and decide how likely it is that all the bugs have been removed from a code base. Overviews are available; for example, Reliability Growth: Enhancing Defense System Reliability has a chapter surveying the models [NRC]. Our team is trying to find ways to apply these models to cybersecurity. The theory is that a security problem is an event with a timestamp, so feeding the event times into the models allows you to predict when the next events might happen.
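To make the idea concrete, here is a minimal sketch of the kind of calculation these models do. It is not one of the models in the FORTRAN code; it just fits a simple Duane-style power law, N(t) ≈ a·t^b, to the cumulative event count and extrapolates when the next event might arrive. The event times are invented for illustration.

```python
import math

def fit_power_law(event_times):
    """Least-squares fit of ln(N) = ln(a) + b*ln(t), where N(t_i) = i for the
    i-th event. event_times must be sorted, strictly positive times."""
    xs = [math.log(t) for t in event_times]
    ys = [math.log(i + 1) for i in range(len(event_times))]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = math.exp(mean_y - b * mean_x)
    return a, b

def predict_next_event_time(event_times):
    """Solve a * t**b = n + 1 for t: roughly when the next event is 'due'."""
    a, b = fit_power_law(event_times)
    return ((len(event_times) + 1) / a) ** (1.0 / b)

# Hours at which each bug fix was committed - invented numbers.
times = [5.0, 12.0, 30.0, 49.0, 80.0, 130.0, 200.0]
print(predict_next_event_time(times))
```

If the fitted exponent b comes out well below one, the events are spreading out over time, which is the reliability-growth story: things are, apparently, improving.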
I have eight models, with over fifteen data sets along with previously generated output files for some of these inputs. There are no unit tests, which is no surprise; however, the expected outputs for given inputs provide a way to test the code. I wrote a script to build the models and run them against all the inputs. When I ran this, several of the models exploded in various exciting ways. My instinct to automate this first left me drowning in error messages. A smaller script to check how many input files have corresponding outputs would have been revealing. Several inputs did not have matching outputs. It turns out those without outputs were crashing and causing my well-intentioned script to fail. Once I limited the inputs to those that didn’t explode, I ended up with different formats in the output, so a simple diff wouldn’t work. Cue another script to compare files of numbers. The generated files have a row of numbers for each event. The original files have blocks of up to four numbers on a row, so you need a bit of guesswork to figure out when you are actually on a new event. The numbers don’t match exactly either, which will need a meeting to discuss. I haven’t yet checked that running the same code on the same input gives the same output. That’s another story. The highlight here is that writing scripts to automate this seemed like a good idea. If I had tried one or two things by hand first, I may have noticed the missing outputs for some of the inputs and actually made progress more quickly by slowing down. I may have found my superhero name by doing this. I recently found out that ‘kilogirl’ was an early unit of computing power, equivalent to a thousand hours of manual computing labour. This seems to come from a book called Broad Band: The Untold Story of the Women Who Made the Internet by Claire L Evans [Evans18]. Manually comparing the columns in the output files would probably take a thousand hours. Kilogirl it is.
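With hindsight, the two small checks I mention would have been easy to script once I knew what to look for. Something along these lines, with hypothetical directory names and file extensions, would have shown the missing outputs up front, and flattening each file into a single list of numbers sidesteps both the ‘up to four numbers per row’ layout and the need for an exact diff:

```python
from pathlib import Path
import math

# Hypothetical layout: inputs/<name>.dat with expected results in outputs/<name>.out
inputs = {p.stem for p in Path("inputs").glob("*.dat")}
outputs = {p.stem for p in Path("outputs").glob("*.out")}
print("inputs with no expected output:", sorted(inputs - outputs))

def numbers_in(path):
    """Flatten a file into one list of floats, ignoring how many sit on each
    row, so differently laid-out files can be compared directly."""
    return [float(token) for token in Path(path).read_text().split()]

def roughly_equal(old_file, new_file, rel_tol=1e-6):
    """Exact diffs fail, so compare number by number with a relative tolerance."""
    old, new = numbers_in(old_file), numbers_in(new_file)
    return len(old) == len(new) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(old, new))
```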
At the recent ACCU conference, Daniele Procida advocated a Zen approach to problem solving, saying “Don’t just do something. Sit there.” [Procida18]. Charging straight in without thinking first is often asking for trouble. In Overload 128, I considered what you should automate [Buontempo15]. The origins of computers, and the fear of machines taking over the world, or at least taking people’s jobs, are old. And yet, history repeats itself. And indeed, my errors of signalling NaNs also repeated themselves when I ran my automatic script. If I had listened to my own advice, I would have tried one input at a time for a while to get a feel for the unfamiliar code base and file format before writing a script. Automate all the things, but not yet.
I am not the only person wondering if I have too much automation. A recent article claims Elon Musk regards too much automation at Tesla Inc as a mistake [Hull18]. He says, “We had this crazy, complex network of conveyor belts, and it wasn’t working, so we got rid of that whole thing.” My unfamiliar FORTRAN code base feels like that, and I am tempted to ditch it and rewrite it in another language. I might do that as a learning exercise; however, my mission is to make this work, so I can’t get rid of the whole thing. Musk also said, “Humans are underrated” [Musk18] in response to the discussion. Automation is supposed to save humans from error-prone, boring, repetitive tasks that machines are more suited to; however, as the article points out, automation can be:
Expensive and is statistically inversely correlated to quality. One tenet of lean production is ‘stabilize the process, and only then automate.’ If you automate first, you get automated errors.
You always need to think about what you are trying to achieve, and to measure whether it is actually happening. A brief discussion around this was nailed on the head [Malone18]:
I think they’re saying too much automation too soon is an expensive mistake that the Germans and Japanese have learned. But they still automate when products mature.
Some of us know more about manufacturing processes than others, but most of us have encountered interviews. Can that process be automated? Have you ever been sent a HackerRank or Codility test as a pre-screen? Or a coding exercise that you suspect gets sent through a test suite? These take time; you are often given 90 minutes to complete the online exercises. Do you have 90 minutes spare where you can be certain no one will interrupt? Maybe, maybe not. What happens if you fail the automatic tests? Furthermore, why don’t these websites allow you to use a test suite? Or version control? Whatever the reason, this means writing code in a strange way for many of us. Some companies will still look at your submissions even if you don’t get 100%, and use your solutions as a basis for discussion. There’s a tension between companies avoiding wasting their employees’ time when recruiting and them wasting interviewees’ time. There’s also a need to make sure the process is ‘fair’ by ensuring everyone has been asked the same questions. On paper that sounds sensible, but different people have different skill sets. Some people are quicker than others too. I was quicker when I was younger, and am beginning to wonder how much of the lack of diversity in some organisations is down to their recruitment process. Do you want a quickly coded, hand-rolled algorithm, or easy-to-follow code in version control complete with unit tests? Or someone who is good at bug hunting? Or team building and mentoring? How would you test these skills automatically?
Before you get as far as being tested, you usually submit a CV. A script can search for skills in this, which leads to suggestions of hiding buzzwords in white text on a white background to get picked out by the bots. Automatic CV screening can exclude people who have not followed a conventional route. I do not have a Comp. Sci. degree, so some automatic processes will instantly exclude me. Some may insist on straight As at school. I got half As and half Cs. Again, #fail. I do have a PhD, but an automated process may not value this. Some people do not have a degree but can still code.
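For what it’s worth, the sort of screening script I have in mind is no cleverer than this: a naive keyword filter, with an invented list of required phrases, which happily rejects people who simply describe their experience in different words.

```python
REQUIRED_PHRASES = {"computer science degree", "c++", "agile"}

def passes_screen(cv_text):
    """Reject unless every required phrase appears verbatim in the CV."""
    text = cv_text.lower()
    return all(phrase in text for phrase in REQUIRED_PHRASES)

# A PhD who writes C++ every day, but never says 'computer science degree':
print(passes_screen("PhD in computational maths, ten years of C++ on agile teams"))
# -> False: filtered out before a human ever reads it
```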
I understand that some British universities are dropping the requirement for ‘A’ level physics as a prerequisite to study engineering, in order to encourage diversity. The theory is that some girls avoid physics classes because they would be the only female in the room. That doesn’t mean they aren’t interested or capable, just that they don’t want to be the odd one out. One gentleman, prior to a diversity and inclusion in STEM talk I attended, informed me that women aren’t interested in science. He was partially right; many men aren’t in the slightest bit interested either. If women in the UK get excluded straight away because they can’t face being the odd one out at school, claiming they are not interested enough in the subject is missing the point. Our biases and assumptions do filter into our processes and can make the situation worse if the process is automated. As Musk said, humans are underrated. Sometimes. Ms Teedy Deigh mentioned ‘BIBO’ (bias in, bias out) in her article for our last issue [Deigh18]. I used this phrase in my keynote to the ACCU conference last year, so am pleased she was there listening. I am always amused that feedforward neural networks usually have a bias neuron. Its purpose is to allow all-zero inputs to map to non-zero outputs, or more generally to shift the inputs up or down [Stackoverflow]. Yet there it is, right in the middle of a common AI technique: bias, built in.
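A toy example shows what that bias neuron does. Without the bias term, a sigmoid neuron maps an all-zero input to 0.5 no matter what the weights are; the bias shifts the activation up or down. The weights here are arbitrary.

```python
import math

def neuron(inputs, weights, bias):
    """A single sigmoid neuron: weighted sum of the inputs, shifted by the bias."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))

print(neuron([0.0, 0.0], [0.3, -0.7], bias=0.0))  # stuck at 0.5 for all-zero input
print(neuron([0.0, 0.0], [0.3, -0.7], bias=2.0))  # the bias shifts the output up
```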
A recent post on Medium by Michael Jordan considers AI [Jordan18]. He tells a concerning tale of his pregnant wife’s ultrasound test revealing markers for Down syndrome. They were told the risk of the baby having the syndrome had increased to 1 in 20, but the risk of an amniocentesis test killing their baby was 1 in 300. Being a statistician, he dug into the numbers and concluded that the increased number of pixels on current machines meant the healthcare officials might be modelling white noise, since the original numbers were based on white spots showing up on older displays. Using old data and models as the world changes, especially if you bring automation into the mix, is asking for mistakes. He notes that before civil engineering existed, people still built bridges and some collapsed. He sees the need for a new engineering discipline around AI, requiring human perspectives. He notes some early neural networks were used to optimise the thrust of the Apollo spaceships and are now used to power decisions Amazon, Facebook and other large tech companies make. Amazon didn’t realise the London Marathon runs along the end of my road, so their clever logistics algorithm didn’t pick a different day to make a delivery for me. Sometimes, humans need to call out that new information has changed the situation. Furthermore, a narrow set of voices might be causing bias in the topics researched and the outcomes. He says, “There is a need for a diverse set of voices from all walks of life, not merely a dialog among the technologically attuned. Focusing narrowly on human-imitative AI prevents an appropriately wide range of voices from being heard.” He ends by saying, “Let’s broaden our scope, tone down the hype and recognize the serious challenges ahead.”
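The ultrasound part of his tale hinges on how much a ‘positive’ marker is actually worth. A quick Bayes’ rule calculation, with illustrative numbers rather than the figures from his article, shows how the risk quoted after a positive marker shrinks as the false positive rate rises, which is exactly what happens if sharper displays are surfacing white noise as markers:

```python
def risk_given_positive(base_rate, sensitivity, false_positive_rate):
    """Bayes' rule: P(condition | positive marker)."""
    true_pos = base_rate * sensitivity
    false_pos = (1.0 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# Illustrative numbers only, not the figures from Jordan's article.
print(risk_given_positive(1 / 700, 0.8, 0.02))  # about 1 in 18
print(risk_given_positive(1 / 700, 0.8, 0.10))  # noisier markers: about 1 in 88
```

The quoted risk depends entirely on the false positive rate, and that number was calibrated on older hardware.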
AI seems like the ultimate way to automate all the things, but the various scripts and tooling we write ourselves take us part-way there. My scripts attempting to verify the modelling code are one small, slightly failed part. Online coding tests to filter candidates are arguably another. Automating a build is more sensible. Having scripts to deploy releases is wise. Encouraging a QA department to automate some of the process is a good idea. Automating all the things can slow you down and introduce bias. Inclusivity and diversity matter. This means different things to different people. Different countries have differing problems. One keynote at this year’s conference was about diversity and inclusivity. It became apparent, at least to me, that attendees from some countries don’t believe there is a problem with few women studying or involved with STEM. A lightning talk by Robert Smallshire showed the variation between countries, in particular mentioning 49% of STEM students being women, and shared a graph of the global gender gap index against the percentage of women among STEM graduates [Sossamon18]. What happens after graduation is another matter [UNESCO]. We need to be careful about context and assumptions when we talk to each other. There is a problem in the UK. Particularly with guys, and I mean guys, turning up to inclusivity talks telling me women don’t like tech subjects.
Automating everything is asking for trouble. Being aware of context and doing some fact finding is important. The ACCU conference also mentioned the Include CPP group [IncludeCpp]. I’ve joined their Discord chat channel, which is another excuse for failing to write an editorial. However, if you want people to chat to, and some support or encouragement, do get involved.
References
[Buontempo15] Frances Buontempo (2015) ‘Semi-automatic Weapons’, Overload 128, Aug 2015 https://accu.org/index.php/journals/2133
[Deigh18] Teedy Deigh (2018) ‘Ex Hackina’, Overload 144, Apr 2018, https://accu.org/index.php/journals/2484
[Evans18] Claire L. Evans (2018) Broad Band: The Untold Story of the Women Who Made the Internet, Portfolio, ISBN 9780735211759
[Hull18] Dana Hull (2018) ‘Musk Says “Excessive Automation Was My Mistake”’, Bloomberg, 13 April 2018, https://www.bloomberg.com/news/articles/2018-04-13/musk-tips-his-tesla-cap-to-humans-after-robots-undercut-model-3
[IncludeCpp] http://www.includecpp.org/
[Jordan18] Michael Jordan (2018) ‘Artificial Intelligence – The Revolution Hasn’t Happened Yet’, https://medium.com/@mijordan3/artificial-intelligence-the-revolution-hasnt-happened-yet-5e1d5812e1e7
[Malone18] Dylan Malone (2018) https://twitter.com/dylanmalone/status/986321420761235456, posted 17 April 2018
[Musk18] Elon Musk (2018) https://twitter.com/elonmusk/statuses/984882630947753984, posted 13 April 2018
[NRC] National Research Council (2015) Reliability Growth: Enhancing Defense System Reliability. Panel on Reliability Growth Methods for Defense Systems, Committee on National Statistics, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press. https://www.nap.edu/read/18987/chapter/11#122
[Procida18] Daniele Procida (2018) ‘Fighting the controls: tragedy and madness for pilots and programmers’, presentation at the ACCU Conference 2018: https://conference.accu.org/2018/sessions.html#XFightingthecontrolstragedyandmadnessforpilotsandprogrammers
[Stackoverflow] ‘Role of Bias in Neural Networks’ https://stackoverflow.com/questions/2480650/role-of-bias-in-neural-networks
[Sossamon18] Jeff Sossamon (2018) ‘In countries with higher gender equality, women are less likely to get STEM degrees’, World Economic Forum, https://www.weforum.org/agenda/2018/02/does-gender-equality-result-in-fewer-female-stem-grads
[UNESCO] UNESCO (2018) ‘Improving access to engineering careers for women in Africa and in the Arab States’, http://www.unesco.org/new/en/natural-sciences/science-technology/engineering/infocus-engineering/women-and-engineering-in-africa-and-in-the-arab-states/