
January 28, 2025 21 mins

On Monday, the stock market tanked, seemingly in reaction to the emergence of DeepSeek, an open source AI model developed in China. Nvidia, the semiconductor giant that has been the largest winner of the AI boom, erased $589 billion in market cap, the biggest one-day wipeout in US stock-market history. Other chipmakers and big tech giants also swooned. So how did DeepSeek do it? Is it a big threat to American AI giants like OpenAI and Anthropic? What does this say about export restrictions on US chips? On this special emergency session of the podcast, we spoke with Zvi Mowshowitz, an AI expert who authors the excellent Substack, Don’t Worry About the Vase. He answered all our questions and more to help us understand what it means.

Read more: 
AI-Fueled Stock Rally Dealt $1 Trillion Blow by Chinese Upstart
World’s Richest People Lose $108 Billion After DeepSeek Selloff

Only Bloomberg.com subscribers can get the Odd Lots newsletter in their inbox — now delivered every weekday — plus unlimited access to the site and app. Subscribe at bloomberg.com/subscriptions/oddlots


      Episode Transcript

      Available transcripts are automatically generated. Complete accuracy is not guaranteed.
      Speaker 1 (00:02):
      Bloomberg Audio Studios, Podcasts, Radio News.

      Speaker 2 (00:18):
      Hello and welcome to another episode of the Odd Lots podcast.

      Speaker 3 (00:22):
      I'm Joe Weisenthal and I'm Tracy Alloway.

      Speaker 2 (00:24):
      Tracy, the DeepSeek selloff.

      Speaker 3 (00:27):
      That's right, it's pretty deep. Has anyone made that joke yet?

      Speaker 1 (00:30):
      We're in DeepSeek?

      Speaker 2 (00:31):
      Yeah, I don't think anyone has made that joke.

      Speaker 3 (00:33):
      I will say, like, you know, it's bad in markets
      when all the headlines are about standard deviation. Yes, right.
      And then you know it's really bad when you see
      people start to say it's not a crash, it's a
      healthy correction. Yes, that's the real cope.

      Speaker 2 (00:49):
      But just for, like, real scene setting: you know, we've
      done some very timely interviews about tech concentration in the
      market lately, and how so much of the market is
      this big concentrated bet on AI, et cetera. Anyway, on Monday,
      I think people will be listening to this on Tuesday,
      markets got clobbered. Nvidia, one of the big winners,
      as of the time I'm talking about this, three thirty

      (01:10):
      pm on Monday, down seventeen percent. We're talking major losses
      really across the tech complex. Basically, it seems to
      be catalyzed by the introduction of this high performance, open
      source Chinese AI model called DeepSeek. It was born,
      from what we know, out of a hedge fund. Apparently
      it was very cheap to train, very cheap to build.

      (01:31):
      You know, the tech constraints at this point didn't seem
      to be much of a problem. They may be a
      problem going forward. But yes, here is something the entire
      market is betting on, a lot of companies making AI, and
      there are now concerns about, of course, a cheap Chinese competitor.

      Speaker 3 (01:45):
      I just realized, Joe, this is actually your fault, isn't it?
      Just last week you wrote that you were a DeepSeek
      AI bro, and look what you've done. You've wiped five
      hundred and sixty billion dollars off of Nvidia's market cap.

      Speaker 2 (01:58):
      Yeah, might be, that's true. Anyway, one of the interesting
      questions, though, is that this was sort of announced in
      a white paper in December. Why did it take
      until January twenty seventh for it to freak people out?
      Big questions. Anyway, let's jump right into it. We really
      do have the perfect guest, someone who was here for
      our Election Eve Special, a guy who knows all about

      (02:20):
      numbers and AI and quant stuff, and he writes a
      Substack that has become, for me, an absolute daily must
      read, where he writes an extraordinary amount. I don't even
      know how he writes so much on a given day.
      We're going to be speaking with Zvi Mowshowitz. He is
      the author of the Don't Worry About the Vase blog,
      or Substack. Zvi, you're also a DeepSeek AI bro. You've

      (02:41):
      switched to using that.

      Speaker 1 (02:43):
      So I use a wide variety of different AIs.
      I will use Claude from Anthropic, I will use ChatGPT
      from OpenAI. I'll use Gemini sometimes, and
      I'll use Perplexity for web searches. But yeah, I'll use
      R1, the new DeepSeek model, for certain types of
      queries where I want to see how it thinks and,
      like, see the logic laid out, and then I can judge,

      (03:06):
      like did that make sense? Do I agree with that?

      Speaker 3 (03:08):
      So one of the things that seems to be freaking
      people out, as well as the market, is that purportedly
      this was trained at a very low cost, something
      like five point five million dollars for DeepSeek V3,
      although I've seen people erroneously say that the five point
      five million was for all of it, the R1 model included,

      (03:30):
      and that's not what it says in the technical paper.
      It was just for V3. But anyway, oh, I
      should mention, it also seems like a big chunk of
      it was built on Llama, so they're sort of piggybacking
      off of others' investment. But anyway, five point five million
      dollars to train: A, is that realistic? And then B,

      (03:50):
      do we have any sense of how they were able
      to do that?

      Speaker 1 (03:53):
      So we have a very good sense of exactly what
      they did, because they're unusually open and they gave us
      technical papers; they tell us what they did. They still
      hid some parts of the process, especially with getting from
      V3, which was trained for the five point five
      million, to R1, which is the reasoning model, for
      additional millions of dollars, where they tried to make it
      a little bit harder for us to duplicate it by
      not sharing their reinforcement learning techniques. But we shouldn't get

      (04:16):
      over-anchored or carried away with the five point five
      million dollar number. It's not that it's not real; it's
      very real. But in order to get that ability to
      spend five point five million dollars and get the model
      to pop out, they had to acquire the data, they
      had to hire the engineers, they had to build their
      own cluster, they had to optimize their cluster to the bone,
      because they're having problems with chip access thanks

      (04:36):
      to our export controls. And they were training on H800s.
      And the way they did this was they did all
      these sorts of little optimizations, including, like, just
      exactly integrating the hardware, the software, everything they were doing,
      in order to train as cheaply as possible on fifteen
      trillion tokens and get the same level of performance, or

      (04:58):
      you know, close to the same level of performance, as other
      companies have gotten with much, much more compute. But it
      doesn't mean that you can get your own model for
      five point five million dollars, even though they told you
      a lot of the information. In total, they're spending hundreds
      of millions of dollars to get this result.
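
      For context on where that headline number comes from: the
      DeepSeek V3 technical paper reports roughly 2.788 million H800
      GPU-hours for the full training run, priced at an assumed
      rental rate of two dollars per GPU-hour. A minimal sketch of
      that arithmetic follows; the rental rate is the paper's
      assumption, not a measured cost, and everything Zvi lists
      above (data, engineers, the cluster, prior experiments) sits
      outside it.

      ```python
      # Back-of-the-envelope reproduction of the headline DeepSeek V3
      # training cost, using the figures reported in the V3 technical paper.
      gpu_hours = 2_788_000    # total H800 GPU-hours reported for the training run
      usd_per_gpu_hour = 2.0   # the paper's assumed rental price, not a measured cost

      cost = gpu_hours * usd_per_gpu_hour
      print(f"${cost:,.0f}")   # -> $5,576,000, i.e. the ~$5.5 million number
      ```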

      Speaker 2 (05:11):
      Wait, explain that further. Why does it still take hundreds
      of millions? And does this mean, if it takes hundreds
      of millions of dollars, that the gap between what they're
      able to do versus, say, the American labs is perhaps
      not as wide as maybe people think?

      Speaker 1 (05:24):
      Well, what DeepSeek is doing is they have less access
      to chips. They can't just buy Nvidia chips the same
      way that, you know, OpenAI or Microsoft or Anthropic
      can buy Nvidia chips. So instead they had
      to make good use, very, very efficient, killer use, of
      the chips that they did have. So they focused on

      (05:44):
      all these optimizations and all of these ways that they
      could save on compute. But in order to get there,
      they had to spend a lot of money to figure
      out how to do that and to build the infrastructure
      to do that. And, you know, once they knew what
      to do, it cost them five point five million dollars
      to do it. They've shared a lot of that information,
      and this has dramatically reduced the cost for somebody who
      wants to follow in their footsteps and train a new

      (06:06):
      model, because they've shown the way on many of their
      optimizations that people didn't realize they could do, or didn't
      realize how to do, and that can now very easily
      be copied. But it does not mean that you are
      five point five million dollars away from your own V3.

      Speaker 3 (06:19):
      So the other thing that is freaking people out is
      the fact that this is open source, right? We all
      remember the days when OpenAI was more open, and now
      it's moved to closed source. Why do you think they
      did that? And, like, how big a deal is that?

      Speaker 1 (06:35):
      So this is one of those things where they have
      a story, and you can believe their story or not
      believe their story. But their story is that they are
      essentially ideologically in favor of the idea that everyone should
      have access to the same AI, that AI should be
      shared with the world, especially that China should help pump
      up its own ecosystem, and they should help grow all
      of the AI for the betterment of humanity. And they're

      (06:57):
      going to get artificial general intelligence, and they are going
      to open source that as well, and this is
      the main point of DeepSeek. This is why DeepSeek
      exists. They disclaim even having a business model, really,
      and, you know, they're an outgrowth of a hedge fund,
      and the hedge fund makes money, and maybe they can just
      do this if they choose to do that, or maybe

      (07:17):
      they will end up with a different business model. But
      it is obviously very concerning from a lot of angles
      if you open source increasingly capable models, because, you know,
      artificial general intelligence means something that's, you know, as smart
      and capable as you and I, as a human, and
      perhaps more so. And if you just hand that over

      (07:37):
      in open form to anybody in the world who wants
      to do anything with it, then we don't know how
      dangerous that is, but it's existentially risky at some limit
      to unleash things that are smarter and more capable, more
      competitive than us, that are then going to be free
      and loose to you know, engage in whatever any human
      directs them to do.

      Speaker 3 (07:58):
      I have a really dumb question, but I hear people
      say artificial general intelligence all the time. AGI, what does
      that actually mean?

      Speaker 1 (08:07):
      There is a lot of dispute over exactly what that means.
      The words are not used consistently, but it stands for
      artificial general intelligence. Generally, it is understood to mean it
      can do any task that can be done on a
      computer, that can be done cognitively, at least as well as
      a human.

      Speaker 2 (08:26):
      I mean, most of these things do things
      much better than me. I don't know how to code.
      But I get that there are still some things.
      Maybe they wouldn't be as good at passing some of
      the are-you-human tests. Everyone wants to talk about Jevons
      paradox, and so we see Nvidia and Broadcom shares,
      these chip companies, they're getting crushed today. And one of
      the theories is like, oh no, with all these optimizations and
      so forth, researchers will just use those and they'll

      (08:50):
      still have max demand for compute, and so it won't
      actually change the ultimate demand for compute. How are you
      thinking about this question?

      Speaker 1 (08:58):
      So I'm definitely a Jevons bro right now, from the
      perspective of this.

      Speaker 2 (09:03):
      Don't think it'll have a negative impact and just the
      amount of compute demanded.

      Speaker 1 (09:08):
      The tweet I sent this morning was: Nvidia down eleven
      percent premarket on news that its chips are highly useful.
      And I believe that what we've shown is that, yes,
      you can get a lot more, in some sense, out
      of each Nvidia chip than you expected. You can get
      more AI. And if there were a limited amount of
      stuff to do with AI, and once you did that stuff,

      (09:29):
      you were done, then that would be a different story.
      But that's very much not the case. As we get
      further along towards AGI, as these AIs get more capable,
      we're going to want to use them for more and
      more things, more and more often. And most importantly, the
      entire revolution of R1, and also OpenAI's o1,
      is inference-time compute. What that means is every

      (09:49):
      time you ask a question, it's going to use more compute,
      more cycles of GPUs, to think for longer, to basically
      use more tokens, or words, to figure out what the
      best possible answer is. And this scales, not necessarily without
      limit, but it scales very, very far. So OpenAI's
      new o3 is capable of thinking for, you know,
      many minutes. It's capable of potentially spending, you know, hundreds

      (10:11):
      or even, in theory, thousands of dollars or more on an
      individual query. And if you knock that down by an
      order of magnitude, that almost certainly gets you to use
      it more for a given result, not use it less,
      because that price was in effect starting to get prohibitive. And over time,
      you know, if you have the ability to spend a
      remarkably little amount of money and get things like virtual

      (10:33):
      employees and the ability to answer any question under the sun, yeah,
      there's basically unlimited demand to do that, or to scale
      up the quality of the answers as the price drops.
      So I basically expect that, as fast as Nvidia
      can manufacture chips and we can put them into data
      centers and give them electrical power, people will be happy
      to buy those chips.
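
      To make the Jevons point concrete, here is a minimal toy
      sketch. All numbers in it are hypothetical assumptions for
      illustration, not figures from the episode: it only shows
      that if a tenfold efficiency gain induces more than a
      tenfold increase in usage, total compute spending rises.

      ```python
      # Toy illustration of Jevons paradox for compute: cheaper inference
      # per query can increase, not decrease, total spending if demand is elastic.
      price_per_query = 1.00   # hypothetical cost of one long reasoning query, USD
      queries = 1_000          # hypothetical demand at that price

      new_price = price_per_query / 10   # an order-of-magnitude efficiency gain
      new_queries = queries * 25         # assumed elastic response: usage grows 25x

      print("compute spend before:", price_per_query * queries)   # 1000.0
      print("compute spend after: ", new_price * new_queries)     # 2500.0
      ```

      Whether demand is actually that elastic is exactly what the
      market is debating; the sketch only shows why cheaper per
      query does not automatically mean less compute overall.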

      Speaker 3 (10:54):
      At the risk of angering the Jevons paradox bros, just
      to push on the point a little bit more: so
      my understanding of DeepSeek is that one of the reasons
      it's special is because it doesn't rely on, like, specialized components,
      custom operators, and so it can work on a variety
      of GPUs. Is there a scenario where, you know, AI

      (11:17):
      becomes so free and plentiful, which could in theory be
      good for Nvidia, but at the same time, because it's
      easy to run on a bunch of other GPUs, people
      start using, you know, more like ASIC chips, like customized
      chips for a specific purpose?

      Speaker 1 (11:35):
      I mean, in the long run, we will almost certainly
      see specialized inference chips, whether from Nvidia or
      from someone else, and we will almost certainly see various
      different advancements, such that today's chips are going to be obsolete
      in a few years. That's how AI works, right? There are
      all these rapid advancements. But, you know, I think
      Nvidia is in a very, very good position to take advantage
      of all of this. I certainly don't think that, like,

      (11:57):
      you'll just use your laptop to run the best AGIs,
      and therefore we don't have to worry about buying GPUs,
      as a proposition. It's certainly possible that rivals will come
      up with superior chips. That's always possible. Nvidia does
      not have a monopoly, but Nvidia certainly seems to
      be in a dominant position right now.

      Speaker 2 (12:29):
      It seems to me, I mean, I know there are others,
      but it seems to me in the US there are, like,
      three main AI producers of models that people know about.
      There's OpenAI, there's Claude, and then there's Meta with Llama.
      And it's worth noting that Meta is green today, that
      the stock is actually up, as of the time I'm
      talking about this, one point one percent. Just go through

      (12:51):
      each one real quickly: how does the sort of DeepSeek
      shock affect them and their viability, and where do they stand today?

      Speaker 1 (12:59):
      I think the most amazing thing about your question is
      that you forgot about Google.

      Speaker 2 (13:02):
      Oh yeah, right, yeah, that's very telling.

      Speaker 1 (13:05):
      But everyone else has forgotten about them too, yeah. Surprisingly, Gemini Flash
      Thinking, their version of o1 and R1, got updated
      a few days ago, and there are many reports that
      it's actually very good now and potentially competitive. And effectively,
      it's free to use for a lot of people on
      AI Studio, but nobody I know has taken the time

      (13:26):
      to check and find out how good it is, because
      we've all been too obsessed with being DeepSeek bros.
      Google's had its, like, rhetorical lunch eaten over and
      over and over again since December. Like, OpenAI would come
      up with advance after advance after advance, then Google would
      follow with advance after advance after advance, and Google's would
      seemingly, actually, if anything, be more impressive. And yet everyone would
      always just talk about OpenAI's. So this is
      not even new. Something is going on there. So in

      (13:46):
      terms of OpenAI: OpenAI should be very nervous
      in some sense, of course, because they have the reasoning models,
      and now the reasoning model has been copied much more
      effectively than previously, and the competition is a hell of
      a lot cheaper than OpenAI is charging. So it's a
      direct threat to their business model, for obvious reasons, and
      it looks like their lead in reasoning models is smaller

      (14:07):
      and faster to undo than you would expect. Because if
      DeepSeek can do it, of course Anthropic and Google,
      you know, can do it, and everyone else can do
      it as well. Anthropic, which produces Claude, has not
      yet produced their own reasoning model. They clearly are operating
      under a shortage of compute in some sense, so it's
      entirely possible that they have chosen not to launch a

      (14:27):
      reasoning model even though they could, or not focused on
      training one as quickly as possible, until they've addressed this problem.
      They're continuously taking investment. We should expect them to solve
      their problems over time, but they seem like they should
      be less directly concerned, because they're less of a directly
      competitive product in some sense. But also, they tend to
      market to effectively much more aware people, so their people

      (14:49):
      will also know about DeepSeek, and they will have
      a choice to make. If I were Meta, I would
      be far more worried, especially if I were on their
      GenAI team and wanted to keep my job, because Meta's
      lunch has been eaten massively here, right? Meta with Llama
      had the best open models, and all the best open
      models were effectively fine-tunes of Llama. And now DeepSeek

      (15:12):
      comes out, and this is absolutely not in any
      way a fine-tune of Llama. This is their own product,
      and V3 was already blowing everything that Meta had
      out of the water. With R1, there are reports that
      it's better than their new version that they're training now;
      it's better than Llama 4, which I would expect to
      be true. And so there's no point in releasing an

      (15:33):
      inferior open model if everyone in the open model community
      is just going to be like, why don't I just use DeepSeek?

      Speaker 2 (15:38):
      Tracy, it's interesting that, as Zvi said, the people who should
      be nervous are the employees of Meta, not Meta itself,
      because Meta is up. And so you gotta wonder, it's like, well,
      maybe they don't, I don't know, maybe they don't need
      to invest as much in their own open source AI
      if there's a better one out there now. The stock
      is up anyway.

      Speaker 1 (15:56):
      The market has been very strange, from my perspective,
      in how it reacts to different things that Meta does.
      For a while, Meta would announce, we're spending more on AI,
      we're investing in all these data centers, we're training all
      of these models, and the market would go, what are
      you doing? This is another metaverse or something, and we're
      gonna hammer your stock and we're gonna drag you down.
      And then with the most recent sixty five billion dollar

      (16:16):
      announced spend, then Meta was up. Presumably, they're gonna
      use it mostly for inference, effectively, in a lot of
      scenarios, because they have these massive inference costs if they want
      to put AI all over Facebook and Instagram. So, you know,
      if anything, like, you know, I think the market might
      be speculating that this means that they will know how
      to train better Llamas that are cheaper to operate, and

      (16:38):
      their costs will go down, and then they'll be in
      a better position. And that theory isn't crazy.

      Speaker 3 (16:42):
      Since we all just collectively remembered Google, I have
      a question that's sort of been in the back of
      my mind. I think Joe has brought this up before
      as well. But, like, when Google debuted, it took years
      and years and years for people to sort of catch
      up to the search function, and actually

      (17:04):
      no one ever really caught up, right? So Google has,
      like, dominated for years. Why is it that, when it comes
      to these chatbots, there aren't, like, higher, wider moats around
      these businesses?

      Speaker 1 (17:18):
      So one reason is that everyone's training on roughly the
      same data, meaning the entire Internet and all of human knowledge,
      so it's very hard to get that much of a
      permanent data edge there, unless you're creating synthetic data off
      of your own models, which is what OpenAI is
      plausibly doing now. Another reason is that everybody is scaling
      as fast as possible and adding zeros to everything on

      (17:39):
      a periodic basis. In calendar time, it doesn't take that
      long before your rival is going to have access to
      more compute than you had, and they're copying your techniques
      more aggressively. There's just a lot less secret sauce; there's
      only so many algorithms. Fundamentally, everyone is relying on the
      scaling laws. It's called the bitter lesson, the idea
      that, you know, you just scale more: you just use
      more compute, you just use more data, you just use

      (18:00):
      more parameters. And DeepSeek is saying, maybe you don't:
      you can do more optimizations, you can get around this
      problem and still get a superior model. But mostly, yeah,
      there's been a lot of just, I can catch up
      to you by copying what you did. Also, I
      can see the outputs, right? I can query your model,
      and I can use your model's outputs to actively train

      (18:22):
      my model. And you see this in things like, most
      models that get trained, you ask them who trained you,
      and they will often say, oh, I'm from OpenAI.
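
      A minimal sketch of the copying mechanism described here,
      often called distillation: query a stronger model, keep its
      answers, and use them as supervised training data. Both
      functions below are hypothetical stand-ins for illustration,
      not any lab's real API.

      ```python
      # Sketch of distillation: a rival model's outputs become your training labels.
      # query_teacher and fine_tune are hypothetical placeholders, not a real API.

      def query_teacher(prompt: str) -> str:
          """Stand-in for calling the stronger model and keeping its answer."""
          return f"(teacher model's answer to: {prompt})"

      def fine_tune(pairs: list[tuple[str, str]]) -> None:
          """Stand-in for ordinary supervised fine-tuning on (prompt, answer) pairs."""
          print(f"fine-tuning student on {len(pairs)} distilled examples")

      prompts = ["Explain the Jevons paradox.", "What is inference-time compute?"]
      distilled = [(p, query_teacher(p)) for p in prompts]  # teacher answers become labels
      fine_tune(distilled)
      ```

      Training on a teacher's outputs this way is also why a
      model, asked who trained it, will sometimes answer with the
      teacher lab's name.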

      Speaker 2 (18:33):
      And the internet has gotten so weird. I just, the internet
      is so weird. Zvi Mowshowitz, thank you so much
      for running over to the Odd Lots studio and helping us
      record this emergency pod on the DeepSeek selloff.
      It was fantastic.

      Speaker 1 (18:45):
      All right, thank you, Tracy.

      Speaker 2 (18:58):
      I love talking to Zvi. We've got to just sort of
      make him our AI guy.

      Speaker 3 (19:04):
      I mean, to be honest, we could probably have him
      back on again because there's gonna be stuff happening.

      Speaker 2 (19:09):
      Maybe we will. And obviously we could go a
      lot longer. This is a really exciting story,
      and things are just getting really weird these days.

      Speaker 3 (19:19):
      It is kind of crazy how fast all of this is. Yep.
      And then the other thing I would say is just:
      the bitter lesson, great name for a band.

      Speaker 2 (19:29):
      Oh, totally, totally great. Maybe when we do our
      AI-themed prog rock band. True. Yes, that could be our name.

      Speaker 3 (19:36):
      Yes, let's do that. Okay, shall we leave it there?

      Speaker 2 (19:38):
      Let's leave it there.

      Speaker 3 (19:39):
      This has been another episode of the Odd Lots podcast.
      I'm Tracy Alloway. You can follow me at Tracy Alloway.

      Speaker 2 (19:45):
      And I'm Joe Weisenthal. You can follow me at The Stalwart.
      Follow our guest Zvi Mowshowitz, he's at TheZvi. Also definitely
      check out his free Substack. It's a must read
      for me, Don't Worry About the Vase, really great stuff
      every single day. Follow our producers: Carmen Rodriguez, Dashiell
      Bennett at Dashbot, and Kail Brooks at Kail Brooks. For
      more Odd Lots content, go to Bloomberg dot

      (20:07):
      com slash oddlots. We have transcripts, a blog, and a newsletter,
      and you can chat about all of these topics twenty
      four seven in our Discord, discord dot gg slash oddlots.

      Speaker 3 (20:15):
      Maybe we'll get Zvi to do a Q&A
      in there. Oh yeah, that'd be great. And if
      you enjoy Odd Lots, if you like it when we roll
      out these emergency episodes, then please leave us a positive
      review on your favorite platform. Thanks for listening.
