The one thing guaranteed to make me laugh out loud is Boston Dynamics’ video of their robot-dog, SpotMini, slipping on a pile of banana peels. Another video of SpotMini’s “robustness test” shows an equally funny fail-point as a man armed with a hockey stick attempts to dissuade SpotMini from completing its task. He is flat out defeated.

Even though these are rudimentary fail-points of neural networks and artificial intelligence (AI), the pratfalls and indignities may be harbingers of a larger problem on the horizon. The deep learning aspects of advanced generative technologies that we can’t see, control or explain – the black box – have become a blind spot of trust, ethics and accountability.

Several years ago, when AI gained ground with its ability to process and analyze vast amounts of data into predictive results, the benefits appeared to be tremendous for many, if not all, sectors. But the inner workings of those earlier generations of AI were easy to verify and validate, and their constructs and conclusions were understood.

Unlike today, the algorithms that replicated human thought processes were developed and controlled by development teams. Engineers and testers ensured that objectives, requirements, functions and success criteria, defined by humans, were met throughout a traditional development lifecycle. In production, the engineers knew what the inputs were, what decisions were made, and what activities formed each output.

The logic model was already elaborate; things became more complicated when deep learning and machine learning launched off the simpler (if there is such a thing) generative AI. The traditional development lifecycle morphed into an infinite development continuum that carried on outside the labs and without human control.

That brings us to the current-day problem and theoretical conundrum. Advanced generative technologies are using extensible algorithmic logic and compounded learning processes that evolve inside a black box shielding their complex analysis and conclusions – a black box that no one, not even its engineers, can efficiently peer into.

Not being able to isolate this cause and effect has created an uneasy stir in regulated industries that are driven by accountability and the preponderance of evidence. Such evidence is not only a necessity; it is often a legal right. The very part of AI that was a technological Holy Grail has become a tenuous issue and, for some, a potential barrier to adoption.

When Technology Runs Rogue

Nvidia’s experimental autonomous vehicle, BB8, is a perfect example of the technological veil that shields the inner workings of these technologies. Rather than following sets of instructions programmed by engineers, BB8 relies on an algorithm that learns how to drive by observing a human. Because it evolves through heuristic learning, BB8’s reasoning and decision-making are obfuscated and so complex that its engineers have struggled to deconstruct them.

Built on human-like memory and learning structures, similar to Neural Turing Machines (NTM), BB8’s behaviours are not copied but developed and learned autonomously. Incrementally learning and continuously refining, its memory cells retrieve complete vectors using heuristic-based patterns and assign priority, much as human memory does.
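To make the learning-by-observation idea more concrete, here is a minimal sketch of behavioural cloning – the general technique of training a driving policy from human demonstrations – written in PyTorch. Everything in it (the feature dimensions, the small network, the synthetic “demonstrations”) is an illustrative assumption, not Nvidia’s actual system.

import torch
import torch.nn as nn

# A minimal, hypothetical behavioural-cloning loop: the "demonstrations" are synthetic
# stand-ins for camera/sensor features paired with a human driver's steering commands.
features = torch.randn(1024, 32)                      # hypothetical observation features
steering = torch.tanh(features @ torch.randn(32, 1))  # stand-in for the human's steering angles

policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    predicted = policy(features)          # the network imitates the demonstrated behaviour
    loss = loss_fn(predicted, steering)   # penalize deviation from the human's choices
    loss.backward()
    optimizer.step()

# The trained weights *are* the driving policy: there are no hand-written rules to audit,
# which is exactly why the reasoning behind any single decision is hard to reconstruct.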

That’s why, on the surface, BB8 operates as if a human were driving it, seemingly making all of the right decisions. But does it? Always? Once out “in the wild,” disentangling the decision processes behind the behaviours of deep neural networks is a forensic nightmare. With no way to verify intent and causality, if the vehicle crashed into a tree, it may be impossible to determine why (to their credit, Nvidia engineers have made some progress).

Some Experts Are Sounding the Alarm

Recently, at an Atlantic Council conference on AI, Frederick Chang, former director of research at the National Security Agency, stated: “There has not been a lot of work at the intersection of AI and cyber,” and “Governments are only beginning to understand some of the vulnerability of these systems,” leaving an ever-larger attack surface.

At the same conference, Omar Al Olama, the minister of state for artificial intelligence for the United Arab Emirates, was more direct in his warnings, charging that the “ignorance in government leadership” is leading to the adoption of AI without impartial scrutiny and that “sometimes AI can be stupid.” So, there you go. Suddenly the foreboding predictions on AI by the late Stephen Hawking and Elon Musk don’t seem so far-fetched.

As always, there are two sides to a debate. A faction of academic and industry researchers doesn’t see what the big deal is – black boxes are not new and have been studied in other sciences for decades.

Nick Obradovich, a researcher at the MIT Media Lab, observed that “We’ve developed scientific methods to study black boxes for hundreds of years now and can leverage many of the same tools to study the new black box AI systems.” Obradovich’s paper proposes studying AI systems through empirical observation and experimentation, as science has done with animal and human studies.
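As a rough illustration of that empirical approach, the sketch below treats a trained model as an opaque function and probes it the way an experimenter might: nudge one input at a time and measure how the output shifts. The “model” and its inputs are hypothetical stand-ins; only the probing idea is drawn from the proposals above.

import numpy as np

def probe_black_box(model, baseline, delta=0.1, trials=200, seed=0):
    """Estimate how sensitive an opaque model is to each input feature by
    perturbing one feature at a time and measuring the shift in its output."""
    rng = np.random.default_rng(seed)
    base_out = model(baseline)
    sensitivity = np.zeros(baseline.shape[0])
    for i in range(baseline.shape[0]):
        shifts = []
        for _ in range(trials):
            probe = baseline.copy()
            probe[i] += rng.normal(scale=delta)        # nudge a single input
            shifts.append(abs(model(probe) - base_out))
        sensitivity[i] = np.mean(shifts)               # average effect of that input
    return sensitivity

# A stand-in "black box" whose internals we pretend not to know:
opaque_model = lambda x: np.tanh(2.0 * x[0] - x[1] ** 2 + 0.5 * x[2])
print(probe_black_box(opaque_model, baseline=np.array([0.2, -0.4, 1.0])))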

Not entirely convincing examples, in my view, since it has not been uncommon for human and animal studies to be heavily flawed and sometimes fully retracted as the sciences advance (e.g., David Mech’s alpha wolf study, which piggy-backed on Schenkel’s flawed work, was retracted by Mech years later, yet it has persisted for decades and is still cited today).

The point put forth by the first group – the experts sounding the alarm – is that not knowing the problem to be solved, or adopting AI without extensive risk analysis, raises the stakes substantially. With advanced generative technologies already tasked with solving critical problems using image captioning, voice recognition, language translation and video intelligence (think ‘deep fakes’), larger questions arise that require expansive foresight.

How would AI decisions that affect societies and individuals be overturned? What would be accepted as an evidentiary challenge – other AI systems’ analysis? Who decides what is ethical or moral, and where do ideological and cultural values fit in? More importantly, as the cyber civil space increases, will decision-makers be capable of providing technological governance and garnering societal trust?

Peace, War and Public Safety

Both public safety and national security already rely on specialized technologies to manage critical data with extraordinary integrity and assurance. Military and policing stand to benefit greatly from AI, especially for intelligence analysis, adaptive offensive and defensive responses, deployment of resources and correlation of integrated information – but not without calculated risks.

In the United States, the Department of Defense (DOD) has begun to roll out new strategies, partnerships and budgets to develop and adopt advanced generative technology for military use. Laying the groundwork for trust, DOD officials have been putting AI’s use in context, communicating and assuring that humans will remain the decision-makers for any lethal actions. A good start – but inevitably more challenges will percolate to the top, whether between humans and machines or between government, the public, the private sector or other governments.

Policing will see similar concerns, but on a more dynamic level. For police services wrestling with longstanding problems – from best practices at the psycho-social level to managing and converting analogue and electronic data into usable intelligence – these advanced generative technologies are attractive.

Since data is preeminent in policing, integrated data – whether informing at the individual or statistical level – has enormous value. Similar to the military, the sheer volume of data, storage limits and uncoordinated data schemas make these arduous, budget-draining tasks. This is where AI’s generative processes become a game-changer: by correlating these large amounts of data and applying ‘human-like’ thinking through heuristics, predictive and adaptive outputs can be produced.

A possible gateway for AI may be evidence-based policing (EBP) practices, which scientifically gather qualitative and quantitative evidence from operational practices and analyze it in a controlled framework to improve policies and procedures. Once data is catalogued and collated, AI could enhance its usability with other data, including temporal crime data, victim/offender characteristics, spatial and GPS data, body-worn camera video, biometrics, intelligence or field data, evidence and forensics.
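As a toy illustration of that kind of correlation, the sketch below joins two small, entirely invented datasets – incident records and body-worn-camera metadata – and aggregates them into a simple ‘hotspot’ summary with pandas. The files, columns and figures are assumptions for the example; real EBP pipelines would be far larger and more carefully governed.

import pandas as pd

# Entirely invented records standing in for catalogued EBP data.
incidents = pd.DataFrame({
    "incident_id": [101, 102, 103, 104],
    "district":    ["North", "North", "East", "East"],
    "offence":     ["theft", "assault", "theft", "theft"],
    "hour":        [23, 1, 14, 15],
})
camera_metadata = pd.DataFrame({
    "incident_id": [101, 103, 104],
    "footage_minutes": [12, 5, 9],
})

# Correlate the two sources on a shared key, then aggregate into a simple hotspot view.
merged = incidents.merge(camera_metadata, on="incident_id", how="left")
hotspots = (merged.groupby(["district", "offence"])
                  .agg(incident_count=("incident_id", "size"),
                       footage_minutes=("footage_minutes", "sum"))
                  .reset_index())
print(hotspots.sort_values("incident_count", ascending=False))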

Opening the Black Box

The more ‘neural’ the technology becomes, the more ethics and privacy issues will arise from AI’s ambiguous processes. For all its potential, the lack of root-cause insight and the inability to examine the guts of advanced generative technologies leave a gaping hole in the credibility of autonomy and expose us to the risks of unknown vulnerabilities.

With much hope pinned on other critical areas – the diagnosis and pathology of diseases, the prediction and regulation of economies and markets, and societal problems – black box oversight will be a balance between functional need and the acceptance of some ambiguity and error, but only where that trade-off makes sense. Keep in mind that, just as we do not fully understand human memory, advanced generative technologies will further leverage cognitive psychology and neuroscience to improve our understanding of these subjective, experiential and unique processes.

As our physical, biological and social systems collide and these technologies become more like us, transparency will not serve all layers of trust. All stakeholders will need to be at the table – technologists and scientists, government, industry and the public – in order to form a deeper, critical perspective on how we use advanced generative technologies and which black boxes must be opened.