Microsoft AI CEO Mustafa Suleyman: Our AI Will Differentiate Via Personality
Suleyman touts new memory, assistive, and research capabilities as Microsoft pushes forward in the crowded personalized AI assistant space.
Microsoft announced a slew of product upgrades to its Copilot AI bot on Friday, including longer memory and deeper personalization features intended to make the bot feel more like a companion and less like a Wikipedia-retrieval machine.
Mustafa Suleyman, Microsoft’s AI CEO, believes this is the primary way AI companies will differentiate their products. It all comes down to personality, he told me on Big Technology Podcast Friday afternoon.
With Amazon, Anthropic, Google, Microsoft, and OpenAI all working to build the same style of AI — one that is contextually aware, assistive, and high-EQ — the differentiator, Suleyman said, will be how it relates to users. So, with its new updates, Microsoft is working to establish its Copilot as the most personable AI companion, and it wants to push forward in this regard quickly.
“We're going to be different by leaning into the personality and the tone very, very fast,” Suleyman said. “We want it to feel like you're talking to someone who you know really well.”
Suleyman also addressed Microsoft’s relationship with OpenAI, headlines about it pulling back from data center leases, and the timeframe in which he expects to see AGI, in an in-depth, wide-ranging conversation.
You can read the Q&A below, edited for length and clarity, and listen to the full episode on Apple Podcasts, Spotify, or your podcast app of choice.
Alex Kantrowitz: You’re introducing a bunch of new AI product upgrades, including better memory, the capability to take action like booking a flight, and potentially some sort of avatar in the future. All of this makes your Copilot assistant more personable. Is that where AI goes next?
Mustafa Suleyman: We're transitioning from the end of the first phase of this new era of intelligence into the very beginning of the next phase. Over the last couple of years, we've all been blown away by the basic, factual, succinct Q&A style responses that these chatbots give us. That's awesome and has been incredible.
You can think of that as its IQ, its basic smarts. That's magical, but the majority of consumers really care about its tone. They care if it’s polite and respectful. Is it occasionally funny in the right moments? Does it remember not just my name, but how to pronounce my name? And when I correct it, does it remember that correction? That's a really hard problem. These subtle details make up its emotional intelligence. That's what we're taking small steps towards today as we launch a bunch of new features around memory, personalization, and actions.
So how long will the memory go back? Will I have to tell it like every couple of months who I am? I feel like I'm living in The Notebook every time I'm trying to talk to one of these chatbots.
Unfortunately, it's not going to be perfect, but it is a big step forward. It's going to remember all the big facts about your life. You may have told it that you're married; you have kids; that you grew up in a certain place; you went to school at a certain place; and, over time, it's going to start to build a richer understanding of who you are and what you care about. What kind of style you like; what sort of answers you like: longer or shorter bullets? Conversational or more humorous?
Although it won't be absolutely perfect, it will be a different experience and it's the number one feature that is going to unlock a different type of use. Because every time you go to it, you'll know that the investment that you made in the last session isn't wasted and you're building on it, time after time.
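As a rough illustration of what “remembering a correction” means mechanically, here is a minimal sketch of a persistent memory store in Python. The file name and helper functions are invented for illustration; this is not Microsoft’s implementation, just the general pattern of facts that survive across sessions, with corrections overwriting old values.

```python
import json
import pathlib

MEMORY_FILE = pathlib.Path("copilot_memory.json")  # hypothetical local store

def load_memory() -> dict:
    """Read back everything remembered in past sessions."""
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}

def remember(key: str, value: str) -> None:
    """Store a fact; a later correction simply overwrites the old value."""
    memory = load_memory()
    memory[key] = value
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

if __name__ == "__main__":
    remember("name_pronunciation", "moo-STAH-fah")
    remember("answer_style", "short bullets")
    print(load_memory())
```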
Along with memory, you're releasing “actions.” These are things like booking flights and restaurant reservations. Do you think that if the AI bot knows me well, then maybe I’ll feel better about having it take my credit card and go book that flight? Is that the idea?
Exactly. Getting access to knowledge in a succinct way is all well and good. Doing it with a tone and a style that is friendly, fun and interactive is also cool. But what we want these things to be able to do is to, like you said, buy things, book things, plan ahead, take care of the administrative burden of life. That's always been the dream and why I've been motivated to build these personal AIs, going back to 2010 when I first started DeepMind. That's what we're going after: taking time and energy off your plate and giving you back moments where you can do exactly what you want, with more efficient action.
Copilot will now be able to take control of your mouse on Windows and navigate around.
It can show you, for example, where to turn on a particular setting or how to fill out a form. You may not know how to edit a photo, and it'll point out where you need to adjust the slider or where to click on a drop-down menu. It's going to make things have a little bit less friction and make it a little bit easier to get through your digital life.
And you're also going to release avatars at some point?
This is definitely going to be one of those things that, as we would say in the UK, is like Marmite. Marmite…
You like it or you don't.
You like it or you don't. And for some people, they absolutely love it. In testing, it completely transforms the experience. Some people love a text-based experience. They like the facts. They like to get in and out. They want to know what's what and they're done. Some people like an image-based experience or a video-based experience.
Other people really resonate when their Copilot shows up with its own name, its own visual appearance, its own expressions and style, and it feels much more like talking to you or me.
Its eyebrows adjust, its eyes open or close, its smile changes. We're just experimenting. We're not launching anything today, but we are showing a little bit of a hint of where we're headed. It's super exciting and this is going to be the next platform of computing — just as we had desktops, laptops, smartphones and wearables. Over time, we're going to have deep and meaningful lasting relationships with our personal AI companions.
But Mustafa, everybody listening [and reading] will ask the same question. Amazon, Google, and OpenAI are also building personal assistants, so how will Microsoft differentiate? On the basis of personality?
We're going to be different by leaning into the personality and the tone very, very fast. We want it to feel like you're talking to someone you know really well, someone who is really friendly, kind, and supportive, but also reflects your values. If you have a certain type of expression that you prefer or a certain value system, it should reflect that over time. It’ll feel familiar to you and friendly.
At the same time, we also want it to be boundaried and safe. We care a lot about it being a straight up, simple individual. We don't want to engage in any of the chaos. The way to do that, we found, is that it stays reasonably polite and respectful, super even-handed. It helps you see both sides of an argument. It's not afraid to get into a disagreement. We’re starting to experiment at the edges of that side of it.
So is it really just making it more personable than the others? That's the way to differentiate?
We are at the very beginning of a new era where there are going to be as many Copilots or AI companions as there are people. There are going to be agents in the workplace that are doing work on our behalf. Everyone is going to be trying to build these things. What is going to differentiate is real attention to detail and true attention to personality design.
I've been saying for many years now, we are personality engineers. We're no longer engineering pixels — we're engineering tokens that create feelings, that create lasting meaningful relationships. That’s why we've been obsessed with memory, personal adaptation, style and declaring that it is an AI companion, not a tool.
A tool is something that does exactly what you intend and direct it to. Whereas, an AI companion is going to have a much richer, more emergent, dynamic and interactive style. It will change every time you interact with it. It will give a slightly different response. It's going to feel quite different to past waves of technology.
It's kind of wild to think that you might just go shopping for your flavor of AI companion. Is that the right way to look at it?
Yeah, you are. You're going to pick one that has its own values and style, one that suits your needs, and one that adapts to you over time. As it gets used to you, it'll start to feel like a great companion, the way a dog often feels like part of the family. Over time, it's going to feel like a real connection. I can already see that in hearing from users.
We do a lot of user research, and I do a user interview every week with someone who uses the product, one of our power users. I listen to them tell stories about how it makes them feel more confident, less anxious, more supported, more able to go out and do stuff.
I was chatting to a user last week who is 67. She was out fixing her front door: the hinge had broken and it needed repainting, but every time she repainted it, it was coming up with bubbles. So she phoned Copilot and had a long conversation about how to sand it down in the right way.
She ended up going to Home Depot, forgetting what paint to get, called Copilot again, and had a chat about it.
It sounds mundane, but it's quite profound. It's incredible that people are relying on Copilot every day to help them feel “unblocked,” in her words. I thought that was an amazing story. It gives an insight into how this is already happening. It's already transforming people's lives every day.
It doesn't sound mundane to me at all. Most people don’t have many friends they can call to talk about Home Depot problems, so the bot is in your inner circle right away.
To me, it seems like if you're building this product, you have to be ready for the fact that people are going to literally fall in love with your product. Not just “I love my iPhone”; they will literally love Microsoft Copilot. Are you prepared for that?
That's a question of how we design it. It's about how you design the AI to draw boundaries around certain types of conversations. If you don't draw those boundaries, then you essentially enable the user of the technology to let those feelings grow and go down that rabbit hole.
That's not something that we do and it's not something we're going to do. In fact, we have classifiers that detect that kind of interaction in real time and will very respectfully and very clearly and firmly push back, before anything like that develops. We have a very, very low incidence rate of that. You can try it yourself when you chat to Copilot.
If you try to flirt or say “I love you,” you'll see it tries to pivot the conversation in a really polite way without making you feel judged or anything. To your earlier question, “What is going to differentiate the different chatbots?” Some companies are going to choose to go down different rabbit holes and others won't. The craft that I'm engaged with now is to design personalities that are genuinely useful and super supportive, but are really disciplined and boundaried.
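To make the gating pattern Suleyman describes concrete, here is a toy sketch in Python. The cue list and function names are invented for illustration; Copilot’s real classifiers are learned models running in real time, not keyword matchers, but the shape is the same: classify the message, and pivot politely before generating a reply.

```python
# Toy stand-in for a learned safety classifier, illustrating the gate only.
ROMANTIC_CUES = ("i love you", "be my girlfriend", "be my boyfriend", "marry me")

def classify_romantic(message: str) -> bool:
    """Flag messages that read as romantic overtures."""
    text = message.lower()
    return any(cue in text for cue in ROMANTIC_CUES)

def respond(message: str, generate) -> str:
    """Gate the model call: pivot politely, and firmly, if the cue fires."""
    if classify_romantic(message):
        return ("That's kind of you to say! I'm a companion, not a partner, "
                "so let's keep it friendly. What else is on your mind today?")
    return generate(message)

if __name__ == "__main__":
    echo_model = lambda m: f"[model reply to: {m}]"
    print(respond("I love you", generate=echo_model))
    print(respond("How do I sand a door?", generate=echo_model))
```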
What if it comes to the point where people want to build that deeper relationship? Maybe it's not a person-to-person relationship; maybe it's a third type of relationship. Are you willing to lose because you wouldn't go that route?
I like your empathy there. It's important to keep an open mind and be respectful of how people want to live their life. All I can tell you is that here at Microsoft AI, we're not going to build that and we'll be quite strict about the boundaries that we do impose there.
You can still get the majority of the value out of these experiences by being a supportive hype man, being there for the mundane questions of life, being there to talk to you about that lame boring day that you had or that frustration you had at work. That is a detoxification of yourself. It's an outlet, a way to vent and then show up better in the real world as a result. I see that a lot in the user conversations that I have as well. People feel like they've got what they needed to get out and they can show up as their best self with their friends and their family in the real world.
Right now, a theme I'm hearing in the AI world is that the bots have been refusing too much. You saw it with OpenAI's recent image generation release: it refuses less. It allows you to do an image in the style of Studio Ghibli and make images of celebrities and public figures. How do you find the middle ground between wanting something to be robust and personable and holding true to your values?
It's something I think about a lot. It's not a bad thing that there are refusals in the beginning and that over time we can look at those refusals and decide: are we being too excessive? Are we going overboard? Or, have we got it in the right spot? Going the other way around too, early on, has its own challenges and I like that we've taken a pretty balanced approach because the next question that we're going to be asking is, how much autonomy should we give it in terms of the actions that it can take in your browser? As we're showing today, it is unbelievable to see Copilot actions operate inside of a virtual machine, browse the web independently with a few key check-ins where it gets your permission to go a step further.
But the interesting question is: how many of those degrees of freedom should it be granted? How long could it go off and work for you independently? It's healthy to be a little bit cautious here and take sensible steps rather than be too gung-ho about it. At the same time, the technology is really magical. This is working and in that environment, we should be trying to get it out there to as many people as possible as fast as possible. That's the balancing act that we've got to strike.
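A minimal sketch of that “few key check-ins” pattern, assuming a simple step plan. The names, the step structure, and the y/N prompt are all hypothetical, not the Copilot Actions API; the point is that the agent runs freely through low-stakes steps and pauses for explicit permission before consequential ones.

```python
from dataclasses import dataclass

@dataclass
class Step:
    description: str
    consequential: bool  # e.g., spends money or submits something

def run_plan(steps: list[Step], ask_user) -> None:
    """Execute steps, pausing for explicit permission at consequential ones."""
    for step in steps:
        if step.consequential and not ask_user(step.description):
            print(f"Stopped before: {step.description}")
            return
        print(f"Done: {step.description}")

if __name__ == "__main__":
    plan = [
        Step("Search for flights to Seattle", consequential=False),
        Step("Compare prices across three sites", consequential=False),
        Step("Book the 9am flight with the saved card", consequential=True),
    ]
    run_plan(plan, ask_user=lambda d: input(f"Allow '{d}'? [y/N] ").lower() == "y")
```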
Let me read one more bit of product news and get your quick reaction. You're allowing people to check their memories and interact with their memories once the bot has built this memory database. You're also doing AI podcasts. You're launching your own version of Deep Research. You're introducing a Pages product to organize your notes.
Are these disparate updates or is it again all about building that AI personality?
All of those things that you mentioned enable you to get stuff done. The IQ and the EQ are about its intelligence and kindness. What people care about is: Can it edit my documents? Can it rewrite my paragraphs when I want it to? Can it generate a personalized podcast so that the first thing it plays in the morning is exactly what I want? Can I ask a question about my search result and interact in a conversational way?
All of those things sum up to bringing your computer and your digital experience to life so that you can interact with it and it can interact proactively. That's the big shift that's about to happen.
So far, your computer only ever does stuff when you click a button or type something on your keyboard.
Now, it's going to be proactive. It'll proactively publish podcasts for you. It'll generate new personalized user interfaces that no one else has, entirely unique to you. It'll show you a memory of what it knows. All those things are about it switching from reactive mode to proactive mode. And to me, that's companion mode. A companion is thoughtful. It tries to pave the way for you ahead of time to smooth things over. It knows that you're taking the kids out on Saturday afternoon. That you've been too busy at work, you haven't booked anything—it suggests that you could go to the science museum, but then it second-guesses itself because it knows that the science museum is going to be jam-packed. It's this constant ongoing interaction that's trying to help you out. And that's why I always say it's on your side, in your corner, it's got your back, looking out for you.
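In miniature, the science museum example boils down to: check known context, suggest, then second-guess against live data. The sketch below is purely illustrative; the inputs and threshold are invented, not anything Microsoft has described shipping.

```python
def suggest_outing(has_plans: bool, crowd_level: float) -> str | None:
    """Proactive mode in miniature: suggest, then second-guess with live data."""
    if has_plans:
        return None  # the user is already booked; stay quiet
    if crowd_level > 0.8:
        return "The science museum looks jam-packed Saturday; how about the park?"
    return "You're free Saturday afternoon; the science museum could be fun."

if __name__ == "__main__":
    print(suggest_outing(has_plans=False, crowd_level=0.9))
    print(suggest_outing(has_plans=False, crowd_level=0.2))
```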
This is a vision we've heard, again, from Microsoft, from Amazon, from Apple, from Google. No one's fully delivered it. What makes it so difficult to build?
The world is full of open-ended edge cases as people have found for the last 15 years in self-driving cars. We're really at the very first stages. That's why I said we haven't nailed memory. It's not perfect. We certainly haven't nailed Actions. But you can start to see the first glimmers of the magic.
Do you remember back in the day when OpenAI first launched GPT-3 and when I was at Google and we had LaMDA? Most of the time it was kind of garbage, and it was crazy, but occasionally it produced something that was really magical. That's what great product creation is all about: locking in the moments when it works and really focusing on increasing those moments, addressing all the errors.
Having been through this cycle a few times, I can see that we're nearly there with memory, personalization, and Actions. It's at the GPT-3 stage. It's really buggy, but when it works, it's breathtaking. It reaches out at just the right time. It shows that it's already taking care of a bunch of things in the background. That is a very, very exciting step forward.
Is this the point where the models are saturated and now you're building the products? The conventional wisdom is that they're at the point of diminishing returns, at least.
No way, Jose.
We have got so much further to go. What happens is, people get so excited, they jump onto the next thing and they gloss over all of the hard fought gains that happen when you're trying to optimize something which already exists. Let's take, for example, hallucinations and citations. Clearly that's gotten a lot better over the last two or three years.
But it's not a solved problem. It's got a long way to go. With each new model iteration, there are all the tricks that we're finding to improve the index of the web, the corpus that it is retrieving from; the quality of the citations; the quality of the websites we're using; the length of the documents that we're sourcing from. There are so many details that go into increasing the accuracy from 95% to 98% to 99% to 99.9%.
It's just a long march. People forget that last mile is a real battle. Often, a lot of the mass adoption comes when you move the needle from 99% accuracy to 99.9%. That's happened in the background in the last two or three years with dictation and voice. I've really noticed that across all the platforms, voice dictation has got so good, and yet that technology has been around for 15 years.
Some of us used it when it was at like 80% accuracy; I certainly did. But now, my mom was using it the other day, and I'm like, “How did you learn how to do that?” And she was like, “You can just press this button.” That's kind of incredible.
That's just on the dictation side. On the voice conversation side, we see much, much longer, much more interesting, much deeper conversations taking place when somebody phones Copilot. It's super fast. It feels like you're having a real-world conversation. You can interrupt it almost perfectly. It's got real-time information in the voice as well. It's aware of the latest sports results or the traffic in the area or the weather and stuff like that. I know people use it in their car on the way home or on the way to work, or when they're washing up and they're in a hands-free moment and they just have a question.
It's a weird thing because it sort of lowers the barrier to entry to getting an idea out of your head. Weird things occur to us during the day, we're all like, “oh, I wonder about this. I wonder about that,” and then you go look it up on your phone. Whereas now, there is a modality that I'm increasingly seeing where people turn to their AI and be like, “hey, what was the answer to that thing? Or how does that work?” And, it might be a shorter interaction that could turn into a long conversation because the modality is enabling a different type of conversation, a different type of thought to be expressed. It's a super interesting time. We're figuring it out as we go along.
Let me ask the previous question a little bit differently. Are there diminishing returns on AI pre-training right now?
Maybe in pre-training, it's been a little slower than it was in the previous four orders of magnitude. But the same computation — the same FLOPs, the units of calculation that go into turning data and compute into insight in the model — is just being applied differently.
We're using compute at a different stage. We're either using it at post-training, or we're using it at inference time, where we generate lots of synthetic data to sample from. Net-net, we're still spending as much on computation. But we're using it in a different part of the process. As far as everyone else should be concerned, aside from the technical details, we're still seeing massive improvements in capabilities, and that's for sure going to continue.
Okay, then can you help me understand some headlines I've been seeing about Microsoft?
Probably not. I doubt it [laughs].
Well, I'm going to ask anyway and you tell me what you think.
Reuters says that over the last six months, Microsoft has pulled back from data center leases in the US and Europe that would have consumed two gigawatts of electricity, due to an oversupply relative to its current demand. How does that make sense in the context of what you just said?
It's funny, I did ask our finance guy who's responsible for all these contracts on Friday morning.