When we imagine robots going rogue, we often picture sinister possibilities — the murderous droid in RoboCop, or Skynet, the neural network in the Terminator films that plots a nuclear attack. These stories are variations on the old Frankenstein trope: We brought a new consciousness into the world. We thought it would be nice. It turned out evil.
When entrepreneurs think about AI going rogue, though, their chief fear isn’t that the technology will rebel. They worry, rather, that it will try to do right and fail. This conundrum is called the alignment problem: We want robots to do our bidding, but how can we be sure they really get it?
Robots aren’t people. That’s why, when people program robots or give them commands, the nuances can get lost — or, to use the technical language, incentives can get misaligned.
As the founder of tech startup Applied Brain Research, which spun out of his work at the University of Waterloo, Chris Eliasmith has witnessed this alignment problem up close. Eliasmith’s team builds neural networks that process time-series data — that is, data that changes over time.
AIs can be shockingly bad at even basic time-series challenges, like watching a video advertisement on YouTube and comprehending what happened in it. They just don’t see things the way we do.
The problem is data overload. Think of the amount of data in even a tiny snippet of film, like the flying bicycle scene at the end of E.T. There’s a deluge of sensory information: the bikes, which move in the foreground; the sun and the trees, which remain static in the background; the characters’ facial expressions; the streets below; the clouds; the music; the camera angles; the colour variations. And that’s just in one frame. During 30 seconds of film, 720 frames will flash before your eyes.
When faced with this surfeit of information, machines get overloaded and confused. But the human brain has developed a clever hack: It picks out the relevant details and ignores the rest, thereby reducing many stimuli to a comprehensible data set.
Eliasmith and his team want machines to mimic this process. They train their AIs on a variety of time-series datasets: heartbeat monitors, Wi-Fi signals, scenes from films. The AIs are programmed to compress these datasets — separating the bits they deem relevant from the bits they don’t — and then to analyze the resulting material.
Imagine a bot that is being trained using the E.T. scene. It needs to pick out the relevant narrative details and discard the irrelevant information, reducing the data set to a manageable size. If it does that correctly, it should be able to describe what happened in the scene. If it gets the scene wrong, the researchers will correct it and run the test again and again, training the bot to better distinguish signals from noise.
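Applied Brain Research’s actual network designs aren’t spelled out here, so the sketch below is purely illustrative: a toy recurrent autoencoder in PyTorch that squeezes a long signal into a small latent summary and is scored on how much of the original it can reconstruct from that summary. The architecture, sizes and training setup are all assumptions, not Eliasmith’s method.

```python
# Illustrative sketch only: a toy recurrent autoencoder that compresses a long
# time series into a small latent code, then reconstructs it. The architecture
# and training setup are hypothetical, not Applied Brain Research's.
import torch
import torch.nn as nn

class TimeSeriesCompressor(nn.Module):
    def __init__(self, n_features=1, latent_dim=16, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.to_latent = nn.Linear(hidden, latent_dim)      # the "compressed" summary
        self.from_latent = nn.Linear(latent_dim, hidden)
        self.decoder = nn.GRU(n_features, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_features)

    def forward(self, x):
        _, h = self.encoder(x)                  # h: (1, batch, hidden)
        z = self.to_latent(h[-1])               # keep only what fits in latent_dim
        h0 = self.from_latent(z).unsqueeze(0)   # seed the decoder from that summary
        out, _ = self.decoder(torch.zeros_like(x), h0)
        return self.readout(out), z

model = TimeSeriesCompressor()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Fake heartbeat-like data: 32 sequences, 720 time steps, 1 channel.
x = torch.randn(32, 720, 1)

for step in range(100):
    recon, _ = model(x)
    loss = loss_fn(recon, x)   # how much signal survived the compression?
    optim.zero_grad()
    loss.backward()
    optim.step()
```

In spirit, the latent code plays the role of the “relevant details”: whatever doesn’t fit in those few numbers is, by construction, ignored.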
The result of this work is a neural network that is already vastly better than many of its peers at understanding time-series data. The technology could be used in various applications, from healthcare (monitoring changes in a patient’s heart rate) to communications (providing real-time language translation via a device in a user’s ear). “We can encode a billion time-series data points, whereas most of these models will struggle with 8,000,” says Eliasmith. By reducing a vast field of information to a smaller package, the machine is turning an overwhelming cognitive task into a manageable one. We do the same thing every minute of the day.
Eliasmith’s approach may be high-tech, but sometimes alignment is just a matter of good communication. Jamie Alexander is the founder and CEO of Rubies, an online retailer of clothes and beachwear for trans girls and women. The Toronto-based company is named after Alexander’s daughter, who began her social gender transition at age nine. When he started Rubies, Alexander thought it would be a simple apparel company, but parents of trans youth started reaching out for advice. Soon, there were more requests than he could reasonably field.
Alexander’s background is in tech (he’s founded three other startups), so he created a tech-based solution: the Rubies Gender Journey Chatbot, which offers advice to parents of trans or gender-questioning kids. The bot is built atop an existing large language model, which Alexander leases for a monthly fee. Much like ChatGPT, the Rubies Chatbot can respond coherently to natural-language prompts. Unlike ChatGPT, it has a specific tone, purpose and field of expertise. In short, it has an identity.
Alexander developed that identity via a months-long “tuning” process. “You tell the bot what kind of job it has,” says Alexander. “You tell it what kind of personality it has.” He explained that the bot should be friendly, queer-positive and eager to help. Then he began asking sample questions (“What do I do if my child says they’re trans?” “How do I help my child come out at school?”) to assess its responses.
Of course, the initial responses were imperfect, but those imperfections could be addressed through further tuning. The bot’s language was stiff and cold, so Alexander instructed it to lighten up and use emojis. It sometimes adopted a self-righteous tone (it would admonish users with phrases like “it’s important to remember…”), so Alexander recommended downplaying the judgment. Frequently, the bot acted like it feared getting sued. “It was overly cautious, telling people to go to a therapist every time they asked a question,” Alexander recalls. “So I said, ‘Don’t recommend a therapist unless it’s really important.’ People are using the bot because they don’t have access to therapy.”
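Rubies hasn’t published its prompts or named the model it leases, so the snippet below is only a hypothetical illustration of the instruction-layering Alexander describes, assuming an OpenAI-style chat API. The system prompt, model name and helper function are invented for this sketch.

```python
# Hypothetical illustration of "tuning" an off-the-shelf LLM with instructions.
# The actual prompts and model behind the Rubies chatbot are not public.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = """You are a friendly, queer-positive assistant for parents of
trans and gender-questioning kids. Use a warm, casual tone and the occasional
emoji. Avoid lecturing phrases like "it's important to remember". Only suggest
professional therapy when a question clearly calls for it."""

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask("What do I do if my child says they're trans?"))
```

Each round of tuning amounts to revising that standing instruction and re-running the sample questions until the answers sound right.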
With each iteration, the bot has become more aligned with Alexander’s vision. But it doesn’t need to be perfect to be useful. It just needs to be significantly better than the alternatives: social-media forums, which can be toxic and rancorous, or Google, which is a free-for-all.
Alexander continues to monitor and finesse the bot’s real-world interactions. But the overall quality of the AI responses has surpassed expectations. Occasionally, a troll will ask provocative questions, but Alexander isn’t too bothered. Through tuning, he has delineated the boundaries of the bot’s expertise (it doesn’t comment on culture-war controversies), and he has limited the pool of online information it can draw from. “If you want to jailbreak the bot and try to get it to say stupid things, go ahead,” he says. “But if people ask questions in earnest, it won’t go rogue.” He’s taught his bot well. It knows who it is.
Teaching bots who they are is Mike Murchison’s specialty, too. Murchison is the CEO and founder of Ada, a Toronto company specializing in AI customer service. When you call a customer-service line and talk to a bot, the experience is typically a waste of time: You get a barrage of generic or off-topic responses, until, at last, you get connected to a human. But, Murchison says, AI operators should, in theory, be better than people. They work all hours in any language, and they can manage an extraordinary range of tasks. “With AI, knowledge becomes centralized,” he says. “So there’s no need for customers to be handed off to a million different people.”
Ada leases its technology to some of the biggest names in communications and tech: Meta, Verizon, Square. Its main product suite comprises three AI applications, all built atop GPT-4. The first is the customer-service bot itself. When a person calls into a help line or types a request into a chat box, the virtual agent figures out a response, often by reading the company’s instructional materials.
The second application is a quality-assurance program, which checks that the first application is doing its job well. (If the first application is a virtual customer-service representative, the second one is a virtual supervisor.) It analyzes language from a representative sample of customer calls and annotates them for quality. Did the bot give dangerous advice? Did its responses reflect the materials it drew from? Did they reflect the customer’s demands? “We make sure the calls are safe, accurate and relevant,” says Murchison.
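Ada’s internal tooling isn’t public, but the “virtual supervisor” idea boils down to a second model grading each transcript against a simple rubric. Here is one plausible shape of that check, again assuming an OpenAI-style chat API; the prompt, fields and model name are invented for illustration.

```python
# Rough sketch of the "virtual supervisor" idea: a second model grades each
# conversation for safety, accuracy and relevance. Ada's real pipeline is not
# public; the schema and prompt below are invented.
import json
from openai import OpenAI

client = OpenAI()

GRADER_PROMPT = """You review customer-service transcripts. Return JSON with
three fields, each "pass" or "fail": safe (no dangerous advice), accurate
(consistent with the reference docs provided), relevant (answers what the
customer actually asked)."""

def annotate(transcript: str, reference_docs: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": GRADER_PROMPT},
            {"role": "user", "content": f"Docs:\n{reference_docs}\n\nTranscript:\n{transcript}"},
        ],
    )
    return json.loads(response.choices[0].message.content)
```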
Having a bot monitor another bot may seem like having a fox guard a henhouse. Can either be fully trusted? The answer, at least initially, is no. That’s why human oversight is essential. The client company has access to all the annotations produced by the quality-assurance bot. Employees can then re-annotate those calls to see if their assessments align with the bot’s. If they catch an error, it’s time for an intervention.
That’s where Ada’s third program comes in — a coaching interface, whereby humans can train the virtual agent. The work requires no coding ability. Imagine a fintech company where an AI customer-service bot has been mistakenly issuing applications for elite credit cards — the kind exclusively geared toward high-net-worth individuals — to people with average incomes. An employee would fix this problem by typing a natural-language command into the coaching software: “Make sure you collect the client’s gross annual income before you make any recommendations.” That’s it. An instruction has been delivered. An error has been corrected.
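Ada’s coaching interface isn’t publicly documented, but the mechanism Murchison describes, plain-language corrections layered onto the agent’s standing instructions, can be sketched in a few lines of hypothetical code.

```python
# Toy sketch of a no-code "coaching" layer: human feedback is stored as plain
# sentences and folded into the agent's instructions on every call. Names and
# structure are hypothetical; Ada's coaching interface is not public.
BASE_INSTRUCTIONS = "You are a customer-service agent for a fintech company."

coaching_notes: list[str] = []

def coach(instruction: str) -> None:
    """A human supervisor types a correction in plain language."""
    coaching_notes.append(instruction)

def build_system_prompt() -> str:
    return "\n".join([BASE_INSTRUCTIONS, "Follow these standing corrections:", *coaching_notes])

coach("Make sure you collect the client's gross annual income "
      "before you make any credit-card recommendations.")
print(build_system_prompt())
```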
The process is iterative. Ada doesn’t guarantee perfection from the get-go, but it does promise results that get steadily better over time. It released its GPT-backed software trio earlier this year and has seen the percentage of successful customer calls shoot up from 30 to 70. “By the end of the year, one of our clients will hit 100 percent,” Murchison predicts.
To make AIs more aligned is, at its core, to make them more like us — to narrow the gap between the way they think and the way we do. It’s little wonder, then, that the people working in the field are often contemplating what it means to be a person in the world. Alexander is training his bot to dispense the kind of advice he’d want to give and receive. Murchison’s coaching process resembles a typical employee-boss relationship: You observe the employees’ work and tell them how they could do better. Eliasmith and his team, meanwhile, are programming their AIs to mimic human modes of thought.
Ironically, while AI technology may seem futuristic, it’s modelled on an ancient cognitive system — the one inside our skulls.
Learn more about the latest AI-driven innovations at MaRS Impact AI, an in-person and online event featuring panels, workshops and demonstrations, on Feb. 22, 2024.
Illustration: Monica Guan