Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Isar Meitis (00:00):
Hello, and welcome
to a weekend news episode of the
(00:03):
Leveraging AI podcast, a podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and we have another jam-packed week of AI news. It's really becoming ridiculous, because every week I think this is the craziest week we ever had, and then the following week kind of proves me wrong, but we
(00:24):
are going to talk about three main topics today, and then we have a lot of rapid-fire things to talk about, because there have been a lot of new releases of new models and features, but these will not make the top three items. So for the top three items, we're going to first talk about the potential impact of AI on the global workforce, which is my personal biggest concern about AI's negative impacts in the short term.
(00:45):
And when I say short term, I mean probably three to five years. The second topic is going to be the aftermath of the DeepSeek release that we spent a lot of time talking about last week. So if you missed last week's episode, you can go and check that out. And the third big topic is going to be safety and security concerns and measures that have popped up in this past week. Some really big announcements happened, both good and
(01:06):
bad, that impact the safety and security of AI systems. And then, like I said, a lot of new releases and a lot of other good stuff to talk about. So let's dive right in. I'm going to start with a release of new models, even though I said those would wait for later. But that specific release has direct implications for the
(01:28):
future of jobs.
So OpenAI has released two new functions. One is O3 mini, which is a new reasoning model that they shared with us and showed us in the 12 Days of OpenAI back around Christmas time, but we finally got access to it. It's the next variation of O1, so it is the most advanced reasoning model that OpenAI
(01:48):
gives us right now.
Now, together with it, they also released Deep Research, which is a deep research agent that uses the O3 model as the underlying architecture, but has agentic capabilities to go research the web and write really long and detailed reports based on huge research that could span hundreds of websites, similar to the same kind of functionality from their competitor, Google. So for the first time, OpenAI actually comes up with the name
(02:11):
of something that actually makes sense: they literally copied the name of that feature one-to-one from Google, which I find funny and surprising. But putting that aside, this new capability, the combination of Deep Research together with a really capable reasoning model, provides extremely powerful capabilities in the hands of, well, every one of us. So the O3 mini model is now available for free for all
(02:33):
ChatGPT users, which probably was not the plan, but with DeepSeek around and Qwen 2.5 around, I don't think they had much of a choice. And this model is really good at coding, math, and scientific tasks. And as I mentioned, the research version of this also knows how to do serious research and provide results across public data as well as PDFs and other visual information that you can upload to it. So what does all that have to do with the future of work, other
(02:55):
than the fact that we got another really capable model? Well, Sam tweeted a lot of things, but two things caught my attention that are relevant to the future of work and to how powerful these models are. So if you remember, last week I shared with you that there's this new effort that put together Humanity's Last Exam, which was crowdsourced from multiple scientists and really smart people around the world to come up with really difficult
(03:17):
questions across any topic that you can imagine. So related to that test: after OpenAI released O3 mini, and shortly after they released Deep Research, Sam tweeted the following. Way back on Friday, the high score on Humanity's Last Exam was O3 mini at 13 percent; now on Sunday, Deep Research gets 26.6 percent. To put things in perspective, the previous models before that
(03:38):
were in the lower single digits. So we went from lower single digits to 26.6 percent on the hardest exam humans can come up with, in a matter of days. That kind of shows you how powerful this model is, but it still doesn't explain why I'm saying it will have an impact on jobs. Well, the other thing that Sam tweeted: he started by congratulating the team who developed it, but then he said, my very approximate vibe is that it can do a single-digit
(04:01):
percentage of all economically valuable tasks in the world, which is a wild milestone. Now, I want to dissect that for a minute. Sam is obviously a very smart and knowledgeable person. He's saying that right now, O3 with Deep Research can do a single-digit percentage of all economically valuable tasks. Now, if we think about how many people are employed in the
(04:22):
world, I did a little bit of research for you. The recent numbers for the beginning of 2025 are that there are about 3.6 billion people employed in the world. Let's look at the gamut of what he's saying: single digits of tasks. So let's put these tasks into jobs. Obviously it's not going to map exactly like that, but just to put things in perspective: if it's 1 percent of jobs that are going to be lost because of the current model, not future
(04:44):
development, that's 36 million jobs around the world. If it's 9 percent, that's over 300 million jobs that the current AI model can probably do, based on Sam's best assessment. Now again, these are not going to be whole jobs. He's talking specifically about tasks, so it will be spread across jobs. But if across jobs we can save this amount of tasks, then we need fewer people to do the remaining
(05:06):
work, because each person has time to cover more tasks. So I think the outcome would be the same.
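Just to make that back-of-the-envelope math concrete, here is a tiny Python sketch using the episode's numbers; this is purely illustrative, with the 3.6 billion figure and the 1 to 9 percent range taken from the discussion above:

```python
# Back-of-the-envelope math from the episode; purely illustrative.
employed = 3.6e9  # roughly 3.6 billion people employed worldwide in early 2025

for share in (0.01, 0.09):  # Sam Altman's "single digit percentage" of tasks
    millions = employed * share / 1e6
    print(f"{share:.0%} of tasks ~= {millions:,.0f} million job-equivalents")

# Output:
# 1% of tasks ~= 36 million job-equivalents
# 9% of tasks ~= 324 million job-equivalents
```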
That kind of gives you an idea of where this goes, but let's move on. Let's take another point of reference that can help us in this process. Y Combinator, which is probably the most famous startup accelerator, has opened their Spring 2025 Requests for Startups. So for those of you who don't know Y Combinator: many of the successful companies that you know today came out of Y
(05:28):
Combinator, companies like Airbnb, Cruise, DoorDash, Instacart, Dropbox, and many, many others. So when they have their call for startups, they're looking for startups in specific areas of development, to make sure that they're pushing the boundaries and that they can become successful. So they just opened registration for their Spring 2025 cohort of companies.
(05:50):
If you look at the types of startups they're looking for, you can see the directions they're going. One of them is the personal staff revolution. So they are highlighting how AI will democratize access to personal professional services, things like accountants, lawyers, money managers, personal trainers, private tutors. So, people who provide services to us. And they are looking for startups who can use AI to do these things, which means these
(06:13):
jobs will become either less attractive, less needed, or obsolete, depending on how far you're willing to push the boundaries with your thoughts on this process. The next one is software engineers. So in their statement, Pete Koomen, who spoke specifically about this topic, said that language models can already write code better than most humans. This is going to bring the cost of building software down to
(06:34):
zero. That's not me saying that; that's somebody from Y Combinator who is in charge of these types of startups. So they're looking for ways to drive AI code writing even further than it is today, when today it's already better than most humans. Basically, what they're saying is that one developer, though it's probably not going to be called a developer but a software engineer, will run hundreds or maybe millions of these code writers that are going to work for them,
(06:56):
executing stuff on their behalf. More about code execution capabilities in this week's new releases and features; we're going to talk about this later in the episode. Next is compliance and audit automation. So they're looking at any type of compliance and audit work, which, by the way, about 4 million people just in the US are doing. They're claiming that AI systems will replace manual document
(07:18):
reviewing, testing, and auditing, which makes perfect sense. These systems are very good at that. So they're looking for startups in that part of our economy. The other thing that they're looking at is what they call vertical AI agents. So vertical AI agents are agents that can work in a specific vertical of the industry: tax accountants, medical billers, phone support agents, compliance agents.
(07:39):
So, people who do something very specific in the economy. And they're claiming that just this category can generate another 100 unicorns, meaning companies worth a billion dollars or more. So just in that part of the market, they're looking for companies that will build AI that will replace the people who do these jobs. The flip side, which is the only positive impact on potential future jobs, is that they are saying there's
(08:00):
a huge development and a huge demand in infrastructure for AI, meaning they need AI-optimized hardware and software solutions. And they need companies to build these solutions, which will then employ a lot of people to drive that part of the economy. So the only positive side right now is bringing more minds and more engineers to develop infrastructure that is related
(08:22):
to AI deployment.
One of the interesting things that they focused on is inference-optimized hardware. So because of all of these thinking models, there's a bigger and bigger demand for inference, which is the time the models actually run, versus the time these models need to be trained. And that is still behind compared to the training world, even though there are a few really successful startups in that field. We've talked about Groq several times on this podcast, as an
(08:43):
example.
So this gives you an idea of where the world is going, right? On one hand, we have models that are getting better and better and cheaper and cheaper, which, based on Sam Altman, can already do 1 percent of all tasks in the world. And then you see where Y Combinator is going, but that's just on white collar jobs. On the flip side, we see huge advancements and big announcements being made by the humanoid robot companies and
(09:04):
that entire industry.
So as an example, Figure AI, which is one of the leading robotics companies in the world, just announced that they're planning to produce 100,000 humanoid robots in the next four years. Now, they already have strategic partnerships with BMW and Microsoft, and they already have robots being tested and working at actual BMW facilities. Now, they are obviously not the only company in this field.
(09:26):
You have companies like Tesla, Agility Robotics, Boston Dynamics, and Unitree, all developing very capable robots. China is being a very strong competitor in this field as well, with eight of the world's top 16 humanoid robot companies coming from China. And so they are not alone. If you remember, we talked about Tesla planning to create millions of these in the next X number of years. Again, that's Elon Musk.
(09:47):
And you need to put things in perspective, but there are still going to be hundreds of thousands of these robots, for sure, in the next four to five years, being deployed and taking over blue collar jobs, initially in factories and then in other parts of the economy, including supporting us in our homes, in our yards, in our neighborhoods, and so on. The other interesting announcement from Figure this week is that they're going to abandon their partnership with
(10:07):
OpenAI, which provided the software to run these robots so far, and they're developing their own in-house capabilities that they believe will be better tailored to the needs of their robots. That might be driven by the fact that OpenAI is planning to develop its own robotics capabilities, which we talked about in previous episodes, and we're going to talk a little more about that today. So before we continue with the news: with all that risk to the
(10:29):
future of work that will impact people individually as well as organizations, the question is, what can you do? And the very first thing you can do, and you have to focus on this, whether for your own career or for the success of the organization that you're running, whether it's a department, an entire company, or a team, the most important aspect is training. And that's why I've been focused on AI education and training since I established Multiplai, which is the company that I'm running.
(10:49):
We have trained hundreds of companies so far on AI implementation, and the next cohort of our highly successful and sought-after AI Business Transformation course is starting on February 17th. So this is just over a week from the time this episode goes live, and you still have time to sign up and join us. We already have multiple people who have joined this cohort. This is probably the last time before May that we're running a
(11:12):
public course, because most of the courses that we're running are private to specific organizations that invite us to teach their people. And so if you want to learn how to implement AI, and to learn across the board about tools, systems, processes, strategies, mindset, everything you need to know, in just eight hours, come join us. It's four weeks, every Monday, noon Eastern time. There's going to be a link in the show notes. So you don't forget: if you're interested, open it right
(11:33):
now, click on the link, and come join us to really prep yourself much better for the future with AI in the workforce. And now, back to the news. Another company that is making big waves in this world is Boston Dynamics, which has been around for a very long time. I'm sure you've seen videos of their robots dancing, doing Christmas celebrations, climbing things, and so on. They just announced a partnership with the Robotics
(11:56):
and AI Institute, known as RAI, and the goal is to enhance reinforcement learning for their Atlas humanoid robots. Now, both of these organizations were founded by the same guy, Marc Raibert. He's a former MIT professor, he was at Boston Dynamics for 30 years, and now he's running the other company. And the goal is to create simulation-based environments and other solutions to allow training robots faster, better,
(12:19):
and more efficiently. Very similar to what NVIDIA is doing with their robotics infrastructure. So a lot of progress is being made in that field, which will accelerate the capabilities of robots to do basically any task that we do, and beyond, in the next few years. Now, in a recent USPTO trademark application, OpenAI is revealing some of their plans for the future. So the one that is related to this particular topic is robots.
(12:43):
They are planning, and I shared that with you in the past: they've already hired robotics engineers, and they are developing their plan to build their own humanoid robots. If you remember, I mentioned that in previous episodes; they actually had a humanoid robots department back in the past. They deserted those plans, I think around 2019 or 2020. And now it's coming back, because there's obviously a huge
(13:03):
economic potential for the future. And so they're going down that path as well. So we're getting more and more companies developing robots that are becoming more and more capable, and that can take over blue collar jobs as well. I'm personally not optimistic about the outcome of this. I know people are saying that AI, like any other revolution, will generate more jobs than it takes, in ways we can't yet anticipate.
(13:24):
My personal opinion is that it's very, very different from previous revolutions, on two different measures. One, it's the first time that the revolution creates intelligence. It creates systems that think, which is the only thing that kept us creating new jobs beyond the jobs that were replaced in previous revolutions, right? So if you think about the industrial revolution: okay, so now instead of walking behind an ox or a horse plowing my
(13:47):
field, I can have a tractor, or I can have a factory and do the assembly faster. But for the thinking part of things, we were still the humans. So we started doing more and more stuff that has to do with operating our brains rather than operating our bodies. This is going to change. The thing that allowed us to make up new jobs that we could do and the machines couldn't is going away in this particular revolution. The other thing is speed.
(14:08):
The industrial revolution took 200 years. The internet revolution took 10 to 20 years, depending on exactly how you measure it. That gave us time to figure out what's next and to come up with new capabilities. This is happening in days and weeks and months, which does not give us time to adapt. And while it might generate more jobs than it takes away, it will take a lot longer to generate the new jobs than to lose the existing ones.
(14:30):
And then in between, we have a very long period of serious uncertainty when it comes to the livelihood of people, societies, and the economy, because at 30 percent unemployment, most of it white collar jobs of people making a lot of money, the economy comes to a halt. So the only solution for this is some global group that involves governments, regulators, international bodies, and
(14:51):
industry, to potentially come up with solutions for this problem. So the only good news I have about this is that next week, there's another major global AI summit. It is going to happen in Paris between February 10th and 11th, and it's bringing a hundred countries who are participating, plus over a thousand private sector and civil society representatives. And it is co-chaired by France and India, but there are a lot of
(15:14):
really high-profile individuals who are going to participate, including the French president, the prime minister of India, and the US vice president, so JD Vance is going to be there, as well as China's vice president, OpenAI's Sam Altman, Google's CEO Sundar Pichai, and many other important people. Interestingly, the UK Prime Minister is not going to be there. And another person who's not going to be there is Elon Musk. I don't know if it's because he's doing other things, or
(15:35):
because he wasn't invited, or because Sam Altman is there. Whatever the reason is, Musk is not going to be there, and I'm sure we will hear his opinion about what he thinks about this group. They're going to discuss five different topics: public interest in AI; future of work, which is the topic we just talked about; innovation and culture; trust in AI; and global AI governance. All are really important topics. And I think the more we see these kinds of collaborations, the
(15:58):
safer we are, and the higher the chances that we're actually going to harness and benefit from AI, and hopefully minimize or maybe eliminate (I'm not that optimistic, though) the risks of AI usage. And we're going to talk about safety and security later, and you'll see why I'm less optimistic about this topic. But I think the fact that they're meeting, and it's something like the third time they're meeting in a year, is very important. I really hope there's going to be an ongoing body that will
(16:19):
include members of all these different organizations, one that will meet regularly, and not just once every six months, to discuss these issues and try to come up with solutions or prevention concepts as soon as possible. Our second topic, as I mentioned, is going to be the aftermath of the DeepSeek release that we talked about last week. So, quick recap: DeepSeek, a Chinese company, released two models.
(16:40):
One is called DeepSeek V3, and the other one is called DeepSeek R1, which is a thinking, reasoning version of their model. And that model came out of nowhere into the top 10 of the Chatbot Arena, surpassing the best reasoning model at that moment, which was OpenAI's O1. The other crazy thing was that they claimed that they trained this model for just
(17:00):
5.6 million dollars, versus the billions that are being invested in the US. They now released a new aspect of their model that is called Janus Pro, which replaces their previous Janus capability, which is the vision aspect of their model. This new model adds to the multimodality of their DeepSeek capability; it is very good at generating images, and it's even better at evaluating and understanding images.
(17:20):
And it outperforms OpenAI's models in both aspects, both the generation and the understanding of visual content. So in addition to all the capabilities they introduced last week, which are very powerful, it now addresses multimodal aspects as well, which will make their offering even more attractive. That being said, there were big concerns when the model came out that this model comes from China, and about how safe it is to use, and so
(17:42):
on. So there has been a big backlash on this, and there are two aspects to it. One is that several different companies who do vulnerability testing on these models have tested DeepSeek's model, and it failed 100 percent of vulnerability penetration tests. So one of the companies that does this regularly to these models tried 50 different types of malicious prompting, and all 50 went through and were not blocked.
(18:03):
So I don't know if you're familiar with the concept of jailbreaking a model, but the idea is that all these models come with different safety features, and jailbreaking basically means I will find a way to trick your model and go beyond its limitations, to do stuff that the model should not allow you to do. And as I mentioned, DeepSeek failed every single jailbreaking attempt that was thrown at it, across six different categories, including general harm,
(18:24):
cybercrime, misinformation, and illegal activities. You can do all of that with the DeepSeek model, which is obviously raising very, very serious concerns, especially since this model is open source, meaning you can take this model and run it yourself and make whatever other manipulations you want to make it even less restrictive. And then you can do really bad things with it. Now, the testers are claiming that beyond the fact that it gave these
(18:45):
answers, it gave what they're calling, and I'm quoting, unusually detailed responses to restricted topics that it's not supposed to respond to. Now, that just adds to the fact that this model is based in China and that it's running on Chinese servers, and hence the Chinese government may or may not have access to the data. So major governments and government bodies around the world have banned the use of DeepSeek and have blocked it on
(19:08):
their servers, their IPs, their app stores, and so on. That includes the Pentagon, NASA, the US Navy, Congress, Italy's data protection authority, Asian governments like Taiwan, which makes sense with their really great relationship with China, and Texas as a state; they all banned it. And there's a list of thousands of companies and corporations who also banned DeepSeek for all these reasons.
(19:28):
The EU, in addition, has shared their concerns over GDPR compliance. And so this model, which was developed very quickly, shows really amazing capabilities, and really took the AI world by storm, has serious issues when it comes to data safety. As I mentioned, the solution is obviously learning everything you can from it, or using the model while hosting it on your own.
(19:49):
And then a lot of these concerns go away. So if you just want to benefit from the capabilities of this model and enjoy the fact that it's significantly cheaper and that it's open source, you can literally host it yourself. Or, since it's already available on AWS and Azure, if you're running on those platforms, you can have your own variation of the model running in your hosted environment on these platforms.
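To give you an idea of what that looks like in practice, here is a minimal sketch of querying a self-hosted DeepSeek model through an OpenAI-compatible endpoint. It assumes a local Ollama server, which exposes such an endpoint on port 11434 by default; the model name and prompt are just examples, not a recommendation of any specific setup:

```python
# A minimal sketch of querying a self-hosted DeepSeek model, assuming a local
# Ollama server (it exposes an OpenAI-compatible API at /v1 by default).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # local endpoint; data never leaves your machine
    api_key="unused",                      # the local server ignores the key
)

response = client.chat.completions.create(
    model="deepseek-r1",  # whichever DeepSeek variant you pulled locally
    messages=[{"role": "user", "content": "Summarize the trade-offs of self-hosting AI models."}],
)
print(response.choices[0].message.content)
```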
Now, still staying on the topic of the implications of DeepSeek,
(20:10):
one of the questions that was asked is: how the hell did they do this at 5.6 million dollars? I shared with you last week that Dario Amodei actually said that's not a big deal, and that current models will cost that amount of money, or at least the same ballpark. And he said that actually distilling a model, meaning training a model on another model's outputs, which is what everybody believes they have done using OpenAI's O1, and moving it to a reasoning model, is not that hard.
(20:32):
Stanford University, together with researchers from the University of Washington, was able to prove that. So they just released a new model called S1 that can compete with OpenAI's O1 model and DeepSeek's R1 on math and coding tests. And they've used 50 dollars, yes, fifty dollars, in computing time to train this
(20:53):
model. So they have used Google's Gemini 2.0 Flash Thinking model, combined with Qwen, which is the model that Alibaba released last week, to train this new model, and they've done it in 30 minutes using 16 H100 GPUs, which ends up being about 30 dollars of compute to train this model. Now, will it be really as capable across the board?
(21:13):
Probably not, but their goal was to prove a point. They also released everything that they did, including the code and the training data and the training process, as open source, so now other researchers can do the same thing. So this has good news and bad news. The good news is this may help us save huge amounts of money and compute and power and pollution by using these methodologies. The bad news is it's accelerating the development of
(21:35):
new models, even beyond the crazy rate that we see today. But it definitely proves the point that it's possible to use an existing model, an existing powerful model, to train a new thinking variation of that model for almost free.
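For those curious, here is a minimal sketch of the general distillation pattern being described: collect a teacher model's reasoning traces and store them as supervised fine-tuning data for a smaller student model. The teacher name, prompts, and file layout here are illustrative assumptions, not the actual S1 recipe (S1 fine-tuned a Qwen base model on reasoning traces from Gemini 2.0 Flash Thinking):

```python
# A minimal sketch of distillation-style data collection: ask a teacher model
# for step-by-step reasoning and save each answer as a fine-tuning example.
# The teacher model and prompts are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any capable teacher works

questions = [
    "If 3x + 5 = 20, what is x?",
    "How many primes are there below 20?",
]  # in practice: a small, carefully curated set of hard math and coding problems

with open("distill_data.jsonl", "w") as f:
    for q in questions:
        reply = client.chat.completions.create(
            model="gpt-4o",  # stand-in teacher for illustration only
            messages=[{"role": "user", "content": f"Think step by step, then answer: {q}"}],
        )
        trace = reply.choices[0].message.content
        # Each line becomes one supervised fine-tuning example for the student.
        f.write(json.dumps({"prompt": q, "completion": trace}) + "\n")
```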
But that being said, the impact on the large companies in the US that are developing AI, so Google, Anthropic, Meta, et cetera, has been exactly the
(21:57):
opposite. You would assume for a minute that they would say, ooh, we can now save billions by using these methodologies, let's rethink our spending. Well, exactly the opposite is happening. So Alphabet, Google's holding company, has announced that they're planning a 75 billion dollar CapEx investment in AI infrastructure in 2025, a 42 percent increase over the crazy AI spending they had in 2024.
(22:18):
So each and every one of these companies is going to spend tens of billions of dollars in 2025 to maintain their lead in the race. Pichai, Google's CEO, said: part of the reason we are excited about the AI opportunity is we know we can drive extraordinary use cases because the cost of actually using it is going to keep coming down, which
(22:39):
will make more use cases feasible. So what he's saying, and I've heard that from multiple people across the industry and beyond, is that the fact that it's becoming cheaper is not going to reduce AI usage. It's actually going to make it explode, which means they will need more compute, more models, and more of everything, just because it will become relevant to a lot more use cases and available to
(23:00):
practically everybody, because the individual use case will be practically free, or very, very close to that. Staying on the DeepSeek topic: DeepSeek's researchers actually shared their thoughts about Huawei's Ascend 910C, which is Huawei's latest AI processor, and which is supposed to compete with NVIDIA's offering. And they're claiming that, based on their research, it achieves 60 percent of NVIDIA's H100 inference performance.
(23:23):
And they're saying that by manipulating how it works, they can actually achieve even better results. Now, 60 percent may not sound like a lot, but the gap was significantly bigger a year ago. So Huawei, which is China's largest chip manufacturer, is closing the gap on NVIDIA. Again, they're still behind. And moreover, those researchers also said that while it's not bad at inference, it's still far behind when it comes to training reliability.
(23:46):
So at least on the training side, NVIDIA GPUs still reign supreme, but the gap is closing on inference. And as I mentioned before, inference is becoming a big part of the game, because these reasoning models actually use a lot of compute at inference time, when the model is actually being used and generating tokens, versus training. And this development of hardware, combined with the innovations in software that we see from DeepSeek and Qwen, is a direct
(24:08):
outcome of the US ban on NVIDIA chips being used in China. So first of all, they find ways to use them anyway. As I mentioned, there are rumors that DeepSeek has used 50,000 GPUs from NVIDIA, but even if they only have 10,000, like they're claiming, that's still a lot. But putting that aside, it is forcing China to come up with their own innovation and to develop both hardware and software innovations that they probably wouldn't develop otherwise,
(24:29):
and for which they would have been dependent on the US. So to summarize the DeepSeek topic, I will mention something from a conversation I had this week. I'm delivering an executive education session in partnership with a university in a couple of weeks, and the person who is in charge of this effort asked me in an email this week: do I think that DeepSeek changes everything, and do we need to change the training materials? And what I told him is: I think that DeepSeek, in the big picture,
(24:50):
changes very little. If anything, it solidifies the same point of view I had before: we will have access to incredibly powerful intelligence for free, or very, very close to that. And the implications of that are obviously profound, but the DeepSeek event doesn't change that trajectory; if anything, it accelerates us toward it even faster. As I mentioned, the third big topic that we're going to talk
(25:11):
about today is security and safety. So OpenAI has done research on how persuasive their models are, and they've done this on the Change My View forum on Reddit, which has 3.8 million members who participate in debates trying to convince one another. They conducted 3,000 different tests, and what they've learned is that their models are becoming more and more persuasive to
(25:33):
humans. So GPT-3.5 was in the 38th percentile for persuasiveness, O1 mini in the 77th percentile compared to other humans, O1 in the 80th percentile, and O3 mini in the 82nd percentile. So O3 mini, not even the full O3, is better at persuasion than 82 percent of the
(25:55):
people who participate in these debates, meaning people who by definition are better than the average, because they want to participate in this process. And so these models are becoming very, very persuasive, and it's just going to keep going, because of the thinking capability and analyzing capability and reasoning capability that they did not have before. That obviously raises significant concerns. OpenAI shared why they're doing this research, as well as their
(26:18):
mitigation strategies and how they're trying to monitor this and make sure that they can prevent and restrict political persuasion and other specifically targeted campaigns. But can they really monitor a hundred percent of it? I'm pretty sure they can't, and that obviously raises a lot of risks. So think about foreign governments or terrorist organizations using this to convince Americans or people in
(26:39):
the free world of new opinions, because it will be impossible to know what's coming from an AI and what's not. It will be more persuasive than what humans can write to contradict what the AI is writing. So that's where we are right now, and it's only going to accelerate. And I'm going to go back and forth between good news and bad news on the safety and security front. So Meta just unveiled a new risk assessment framework for their
(26:59):
AI development. They just announced it on February 3rd, and they call it the Frontier AI Framework. And the goal is to establish clear boundaries for AI system development and deployment based on potential risk, and they've identified two levels of risk. One is called high-risk systems, which could enable, but not guarantee, successful cyber, chemical, and biological attacks. And then the second tier is critical-risk systems, which could
(27:22):
lead to, and I'm quoting, catastrophic outcomes, which they didn't define. But based on the fact that they're talking about cyber, chemical, and biological attacks, you can kind of guess where this is going. And the biggest difference between the two tiers is whether you can mitigate the risks once the model has been deployed. So basically, what they're saying is that with critical-risk systems, once you deploy them, there is no way for you to mitigate what's going to happen, versus the other tier, where you might be able to prevent it once it starts happening.
(27:44):
Their policies go beyond just chemical attacks and so on. So they're talking about things like automated compromise of corporate-scale environments, creation and deployment of high-impact biological weapons, and other scenarios that are deemed most urgent by the company. So the policy defines both internal and external researcher risk assessment for every one of these models, with senior-level decision makers having the final
(28:06):
review authority over everything that is said and done, and it relies on multiple types of tests across the different risk levels instead of relying on one specific test. And the outcome is that high-risk systems will be limited to internal access and deployed only after risk mitigation, while critical-risk systems will have enhanced security protocols internally and could potentially lead to development suspension.
(28:26):
So they're basically saying that they're going to stop the development of AI systems if they deem that they could lead to catastrophic, critical risk. I'm very happy to hear that. But that being said, that's just one company that decides on its own what the risks are, while in huge competition with other companies who are doing the same things. And as I mentioned, I'm going to go back and forth between good
(28:48):
and bad security announcements. So Google just decided to abandon their 2018 AI weapons ban. So since 2018, they had a restriction on allowing weapons and surveillance development based on their AI models, and as of February 4th, 2025, they are removing some key components of that. So their previous policy prohibited the usage of AI for
(29:10):
technologies causing overall harm, weapons or injury-causing implementations, surveillance violating international norms, and technologies that could go against international law and human rights. And the justification they're citing for why they're removing this is the need for democracies to lead AI development, evolving industry standards, and geopolitical AI competition, and they're saying that they're going to
(29:31):
focus on appropriate human oversight rather than just banning the whole thing. Now, multiple employees at Google are obviously opposing this and being very loud against it, but this is the current state. And I shared with you a few weeks ago that OpenAI is starting to get into these fields as well. I think it's inevitable, and it's very scary. But I do agree with Google that if China and Russia and other
(29:55):
adversaries of the US and the free world start employing AI capabilities in their military and surveillance operations, and we don't, we will fall behind. Now, that again can lead to catastrophic outcomes, but this seems to be the next cold war, or nuclear race, or whatever you want to call it, for world dominance, and hopefully it will stay balanced and keep everybody from doing stupid things.
(30:16):
But these AI systems will be deployed and will control the deployment of military forces and military capabilities around the world. That could obviously lead to a doomsday scenario, but as I mentioned, the alternative, letting the other side have it when you don't, is not necessarily better. And now, ending on a positive note on safety and security before we dive into a lot of rapid-fire items:
(30:37):
So Anthropic just announced that the company has launched a new security system designed to protect Claude from universal jailbreak attempts. Now, they achieved incredibly impressive results, again, very different from what we've heard about DeepSeek. So they reduced the jailbreak success rate from 86 percent to 4.4 percent using these measures. And they did that while only increasing false positives,
(30:59):
basically legitimate requests that the system mistakes for jailbreak attempts and blocks, by only 0.38 percent. So it basically hasn't prevented any action that is a legit action, and it was able to prevent 82 percent more jailbreak attempts compared to the previous capabilities that they had. Now, that is costing them 23.7 percent in additional compute cost, but they're planning to
(31:20):
continue optimizing it. But Anthropic has always led in that direction of safe AI usage, and that's another way for them to prove that they're investing a lot of compute time in making their systems safe. And they're doing this using the constitutional AI framework that they've built their systems around, plus a lot of additional new capabilities that they're adding in order to do this. And I don't know if you know this, but they have had a bounty
(31:41):
of 15,000 dollars on HackerOne that is going to be given to a hacker who is able to develop a universal jailbreak against Claude. That bounty has been around since August, and nobody has claimed it yet. So they're definitely doing the right things in order to prevent hacking and jailbreaking of their model. And I really hope that every single company that develops these frontier models will follow in the footsteps of
(32:02):
Anthropic in this particular case. Going back to the international development and partnership that I'm dreaming will happen: I think these kinds of capabilities and research must be shared with everybody, every company and every organization that is developing these kinds of models, to achieve the same results that Anthropic is achieving. And now let's dive into lots of rapid-fire items. We're going to start with a lot of news from OpenAI.
(32:22):
So we already talked about their big releases with O3 mini, O3 mini high for the professional version, and Deep Research. But they also made a lot of other interesting announcements this week. One of them is that Sam Altman has acknowledged that he's been on the wrong side of history when it comes to open source. Basically, and I don't know if that comes as, again, the aftermath of DeepSeek or anything else, but in a Reddit
(32:43):
AMA, an Ask Me Anything, Sam basically admitted that the fact that they went completely closed source might have been the wrong decision, and he basically said that they need to figure out a different open source strategy. Along the same lines, Kevin Weil, who is their chief product officer, has revealed that the company is considering open sourcing older, non-state-of-the-art models, which they haven't done so far.
(33:05):
Now, is that going to be impactful? I'm not really sure, because now we have open source models that are competing with their state-of-the-art models. So if they're going to open source only their models that are not competitive, those won't be able to compete in the open source world either. So I'm not really sure I understand the logic behind it. But I think they understand that the open source world is accelerating and closing the gap on the closed source world. So the whole point of what they're doing may not be that
(33:25):
impactful and effective, and so they're obviously rethinking it. What will that lead to? I'm not a hundred percent sure, but I will update you as there's stuff to update. Now, in the same AMA, Sam touched on a few other things. He did not give a specific timeline for GPT-5. He's saying that they're working on it, but it's unclear when it will become available. There are going to be new features and new capabilities for the O3
(33:45):
model coming in more than a few weeks, but less than a few months. He also mentioned that they're developing a successor to DALL-E, which is their image generation tool that is way behind the competition right now. And he's saying, and I'm quoting, it's going to be worth the wait. I assume we're going to get another extremely powerful AI image generation tool. We already have a bunch of those; right now there are probably five or six models that are very, very good, that can generate completely realistic
(34:06):
images of anything you want, and we're just going to get one integrated into the OpenAI environment. To connect this to one of the previous points, OpenAI also defended its collaboration with the US National Laboratories on nuclear defense research. They were expressing full confidence that the scientists will make responsible use of the technology. Whether that's true or not, time will tell, but that's their
(34:27):
statement, and you wouldn't expect them to make any other kind of statement. A very interesting and positive development from OpenAI is that they announced the rollout of an education-specific ChatGPT version for the California State University system that is going to reach 500,000 students and faculty across 23 campuses. Now, as you probably know, OpenAI created a ChatGPT Edu version back in May of 2024, and it was deployed to some
(34:51):
prestigious institutions such as Wharton, the University of Texas at Austin, and Oxford University. But this is a much broader deployment of a higher-education-specific model, one that is going to be deployed across the largest public university system in the US. Now, the main features include personalized tutoring for students, study guide generation, as well as administrative task automation for faculty.
(35:11):
So it's going to help both professors and students. I've said it many times before: I think AI represents the biggest opportunity education has ever seen. We still deliver education from kindergarten all the way to PhD the same way it was done for the last hundred years, with a professor or a teacher in front of a classroom, teaching a large group of students, assuming they're all the same.
(35:33):
It's a horrible way to train people, because people have different needs. They progress in different ways across different aspects of the learning. They learn better by leveraging different tools: some learn better by listening, some by watching, some by doing, some by using different games, some by watching videos, et cetera, et cetera. And AI will allow us, for the first time in history, to provide any person, any kid on the planet, personalized tutoring aligned with their
(35:55):
needs, aligned with the things they need or want to learn, aligned with their capabilities across different aspects, and aligned with how they learn best. And I really hope that this is where this is going to go. First, it will provide education for all, and second, it will make the existing education, in places where it's already available, significantly more efficient, hopefully making teachers more mentors than the people who actually teach
(36:16):
the material, guiding people or kids through the process in the most effective way. Now, I mentioned before that OpenAI has filed some trademark applications. Well, one of the things that Sam Altman confirmed this week, in an interview with The Elec, which is a news outlet in Korea, is that they are indeed developing a device and that they're working in collaboration with the legendary Apple designer
(36:37):
Jony Ive. I've shared these rumors with you in the past; well, now they've confirmed these rumors. He emphasized the fact that voice will become the user interface medium of the future. I can tell you that right now, I personally use the advanced voice mode on ChatGPT, as well as the live streaming on Gemini, regularly. And it has literally changed the way I work with AI models, but I
(36:57):
can't wait to basically activate everything on computers and literally just have conversations with them to do everything that I need. And I think it will, over time, become the user interface, instead of a keyboard, mouse, or touchpad, on every device that we know, including microwaves. Why do I need to figure out how much time I need to put in? I can literally tell it, or it will see on its own, with a camera, what I'm putting inside of it, and it will just run the microwave based on what's the
(37:18):
best outcome that can happen, while asking me what my preference is for the level of cooking or the temperature I want the output to be. But there will be no need for physical user interfaces on most of the things we interact with right now. Another thing that Sam said about the device is that their plan is not to replace the smartphone, but actually to make an extension of the smartphone. So think about something similar to a smartwatch: it will connect to the phone,
(37:39):
but it will be a device that will allow you to talk to it and use AI to then operate the phone, or operate through the phone, the things that you need to do. So now let's switch gears and talk about rapid-fire items as far as new releases of products and features. We already mentioned OpenAI with their O3 model and Deep Research. Well, Google wasn't standing still either, and they just announced on February 5th that the company
(37:59):
has made the entire Gemini 2.0 family generally available. And there are going to be three different models: 2.0 Flash, the smallest model, with a 1 million token context window, which is way beyond everybody else; then Gemini 2.0 Pro, which is an advanced model that is available to paid members, with a 2 million token context window, twice what you get in Flash.
(38:20):
And then there's Flash-Lite, which is built for cost efficiency. 2.0 Flash will become the mid-tier model, think Claude 3.5 Sonnet in the Claude family; then you'll have the Pro; and then you have the lighter version, which can run faster and probably much, much cheaper. And these models have enhanced coding capabilities, improved complex prompt handling, advanced reasoning capabilities, built-in Google Search integration, and code execution
(38:40):
capabilities, a lot of stuff that I shared with you in the past, before they made it widely available. One of the interesting things that they mentioned about the Flash-Lite model is that it's very good at image understanding, and that it can process approximately 40,000 image captions for less than 1 dollar of compute. Going back to what we already talked about before: we're going to have very capable intelligence across everything we need, for
(39:00):
practically free. Now, going back to our benchmark, which is the Chatbot Arena: Gemini 2.0 Flash Thinking is now number one, Gemini 2.0 Pro is number two, then ChatGPT-4o latest, which is a model that they released in late 2024, then DeepSeek R1. So these are among the top five models right now. And then number six is another Gemini model, which is Gemini 2.
(39:22):
0 Flash. And so what you can see, as I mentioned last week, is that Gemini basically took three out of the top six model spots in the world right now, as far as results based on votes from actual people, which is how the Chatbot Arena works. And I anticipate, like I did a year ago when they were still far behind, that Google will keep on leading this race, because they have access to more of everything you need in order to
(39:44):
make this work: more compute, more engineers, more money, more data between Google Search and YouTube and other platforms that they control, and huge distribution across the entire Google ecosystem. Another company that made a big release this week is Mistral. We've talked about Mistral many times in the past on this podcast. Mistral is an open source AI company from France, and they just released a new multimodal version of their Le Chat assistant.
(40:05):
They just released it on February 6th, and it has faster answers than probably any other open source model right now; they're claiming a thousand words a second. It offers advanced document processing with superior OCR capabilities, a code interpreter to create and run code on the platform, image generation powered by Flux, so not their own model, but integrated into the platform, and a new memory capability similar to what we now have in several
(40:27):
different platforms. There's going to be a free tier and a paid tier for 15 bucks a month. And, very interestingly, there are more and more enterprise-relevant tools, which is not surprising: secure environment deployment, custom tool integration, tailored model customization. They're also coming up with new connectors to existing work environments within enterprises. So like all of these companies, they're pushing very, very hard
(40:48):
into the enterprise world. And being an open source company, it actually makes a lot of sense, because it obviously allows you to host it in your own environment while keeping your data secure. Another company that made a big announcement this week is GitHub, one of the most commonly used coding environments. They just shared that they're releasing three different AI capabilities. One is called Agent Mode, for autonomous coding. The other is Copilot Edits.
(41:09):
And the third is a preview of Project Padawan. The most interesting one is obviously Agent Mode. Agent Mode allows self-iterating code generation, automatic error recognition and fixing, terminal command suggestions, and runtime error analysis with self-healing, and it is going to be available to everyone who's using Visual Studio Code, which is the platform it's integrated into. The biggest difference with these capabilities is that they
(41:31):
will replace not just the writing of short code snippets, but a much bigger portion of the coding ecosystem and process. And as I mentioned at the beginning of this episode, when we were talking about Y Combinator, this is where it's going: coders will, sometime in the near future, write very little code, and they will depend on these platforms to write code on their behalf, which will allow them to create significantly more code,
(41:51):
faster and much, much cheaper, which will allow us to do future developments better than we can do them right now. Another big release this week that is very interesting and very scary: ByteDance, the company behind TikTok, has released what they call OmniHuman, which is a tool that generates videos of people based on a static image of them and a voice sample. So you can take an image of any person, and they've shown
(42:12):
multiple examples with celebrities as well as historical characters; then they took a voice sample of the person, and it generates a very realistic motion of the person speaking, as if it's the real person. That's just another really amazing deepfake capability, which on one hand is amazing, because you can bring historical characters to life for educational purposes. So one of the examples they showed is Albert Einstein
(42:33):
speaking about whatever you want. And so you can use it for educational reasons, which is really cool. But on the other hand, it's deepfake technology, and it will allow replicating any person on the planet in a very highly realistic way, in seconds. Now, right now it's still research, but I would not be surprised if this becomes a part of the platform sometime in the very near future. Another interesting announcement about new types of models, though not the models themselves yet, is that a new European AI alliance has
(42:54):
emerged. It's called OpenEuroLLM, and they have launched with 52 million euros in funding. And their goal is to allow independent, open source development of models by Europe, in order not to rely on the dominance of the US and China, and to allow a consortium of European organizations to develop European-based language models
(43:17):
and other AI capabilities. And as I mentioned, the goal is to make it fully open source, including everything: models, software, data, and evaluation; true open source capabilities. And then the last topic I want to talk about is the new announcement from the US Copyright Office. They just came out with a second report, actually with two different parts to it, that is highlighting a significant change from their initial view of AI, but they're still
(43:39):
claiming that the law doesn't need to be changed; they're just elaborating on it. The original ruling from a year ago said that basically anything that is AI generated cannot be copyrighted. And now they've changed their opinion a little bit. They're basically saying that if the process was initiated by human ingenuity, or if a significant part of the work was human, then it's still copyrightable and can be protected under copyright law. One of the things that they made very clear is that prompts alone
(44:02):
do not provide sufficient human control to make users of the AI system authors of the output. Meaning, even if you wrote a three-page prompt to generate an output, the output is not protectable under US copyright law. But they give other examples where work that is co-created or initiated by the human is protectable. I have a very personal example of this. A lot of the content that I create is an output of this
(44:24):
podcast. I take the podcast and run it through AI systems, and that generates blog posts, social media posts, ideas for content, and other stuff. And that is now protected, because it's derived from the content that I'm creating on the podcast. Another example that they gave is that if you're a painter and you start with a scribble that you create, and then you use AI to improve it, it's still yours and can be copyrighted. I think that will continue to evolve.
(44:46):
I think serious prompting versus short prompting, and I don't really know how to define that in a legal way, will also eventually be considered, because it's still something that I'm inputting into the system in order to create an output that another person would not be able to create, because they did not have my ability to do it, which is the whole point of copyright. So I think that will continue to change, but the fact that they finally moved from anything with AI is not copyrightable to,
(45:08):
okay, some things are, says that they understand where the world is going and that they're open to change. That's it for this week. We'll be back on Tuesday with another detailed how-to episode that you are going to absolutely love. If you are enjoying this podcast, please rate us on your favorite podcasting platform. It will take you five seconds. So pull up your phone right now, click the five-star review,
(45:29):
and write whatever you think about the podcast, or reach out to me on LinkedIn and give me any feedback. I love getting your feedback on LinkedIn, and I get a lot of it almost every single day. So good, bad, ugly, whatever you want to say, I really want to hear it. And while you're at it and you have your phone open, please share the podcast with other people who can benefit from it. I'm sure you know a few people; if you think about it for a second, you'll say, oh my God, so-and-so has to listen to this podcast as well. And so please share the podcast with those people, and until next
(45:53):
time, have an amazing weekend.