Why AI Makes It Harder to Ship Good Products

Andrew and Sean dig into Claude Gate, the growing developer complaint that Anthropic has been quietly throttling Claude's default thinking power after locking in enterprise deals. They also get into Sean's Salvo orchestration system, why waterfall specs don't work even with AI, and why scope creep is the silent killer hiding inside every AI coding workflow.

Links:
For more information about the podcast, check out https://www.smalleffortspod.com/.

Transcript:

00:00.76
Sean
What is up? My phone is gigantic.

00:02.84
Andrew
I want to throw Claude out of the window.

00:06.04
Sean
Why? What's going on? How dare you speak about my boy, Claude?

00:13.07
Andrew
Have you seen all the chatter online? I don't know if it's just my feed or what, but I'm seeing a bunch of people complaining about Claude getting worse.

00:15.63
Sean
Yeah.

00:23.62
Andrew
And the hypothesis I've seen...

00:23.98
Sean
Totally.

00:30.65
Andrew
is that Anthropic has been throttling down Claude's default thinking power, basically how much time and resources it spends thinking, to save money.

00:45.99
Andrew
The theory is that they used the masses to get people excited and build hype for Claude so they could sell enterprise deals. And now that they've got the enterprise deals, they're throttling down how good Claude is for the masses by default.

01:03.27
Andrew
But I think you can still adjust the settings to get it back to where it was. I haven't really tried to tinker with it.

01:10.37
Sean
You gotta make it ultra think, man. You gotta turn on ultra think.

01:15.65
Andrew
Wait, is that a thing? Is that a setting, UltraThink? Or is this...

01:21.16
Sean
If you type in ultra think, something will happen. But I think it's just called slash effort max now, not ultra think.

01:26.29
Andrew
Yeah, yeah, yeah.

01:27.67
Andrew
Yeah.

01:28.95
Sean
Maybe ultra think is still a thing, though. I'm not super sure. I know that all of Facebook uses ultra think. Did you see that Facebook internally has a competition? They're all token maxing, because there's a leaderboard of how many tokens you're burning, and one guy's burnt, like...

01:44.88
Andrew
Is this why people are making fun of... I've seen a bunch of people talking about how stupid it is to reward people for maximizing tokens.

01:52.34
Sean
Yeah.

01:53.92
Andrew
Is this where this is coming from?

01:55.41
Sean
Yeah, yeah, yeah. 100%. 100% it is. Yeah. But have you turned on your Claude buddy? Do you have your Claude buddy yet?

02:05.24
Andrew
So no, is this like a default Claude thing, or is it something you have to install?

02:08.36
Sean
Yeah, yeah. In Claude Code, just type in slash buddy. You get a random buddy. I got a common turtle. His name is Flukish, but I like to call him Fuckish.

02:23.08
Sean
He's kind of an asshole. It tells you a personality rating, and everything else is not helpful. Ability to debug: zero. Ability to whatever: zero. Snarky: 95. Yeah.

02:36.12
Andrew
So it sounds like they cloned you.

02:37.69
Sean
Yeah, I know.

02:39.68
Andrew
They somehow just interpreted you.

02:42.32
Sean
I'm just a common turtle, man.

02:43.62
Sean
Hmm.

02:44.22
Andrew
So I exclusively use Claude inside of Conductor, so I've never actually looked directly at the Claude metal, like the Claude Code interface.

02:55.56
Andrew
I'm always seeing it inside of the Conductor wrapping.

02:58.95
Sean
Yeah. Yeah.

03:02.35
Andrew
Austin was just showing me a version, like yet another...

03:07.92
Sean
Yeah.

03:08.88
Andrew
One of these things, it's called Superset, which is apparently like Conductor, but you're getting more of the direct Claude Code experience, which I don't know if I need or want.

03:13.18
Sean
Oh, I saw that. Yeah, yeah.

03:23.47
Sean
Yeah.

03:25.69
Andrew
But yeah, I have also not tinkered with my default Claude settings at all.

03:25.81
Sean
Yeah.

03:35.42
Andrew
I haven't done the effort max thing yet. I haven't messed around with my CLAUDE.md file. I've just been rolling with what I get out of the box. Austin also raves about a set of skills called Superpowers that I heard Ian Lanceman talking about recently too.

03:56.24
Sean
Have you seen people talk about Impeccable?

03:59.16
Andrew
Oh, is this kind of like the UI...

03:59.36
Sean
U-I-S-H.

04:02.80
Andrew
Well, I still don't... yeah, so no one knows what UISH is going to actually be, but it seems like similar vibes.

04:03.00
Sean
Yeah, yeah.

04:10.84
Sean
Yeah, I saw a post yesterday of someone talking about, like, a before and after once they got to use UISH.

04:19.21
Andrew
Oh, sick. Yeah, I guess maybe they've started rolling out some invites, but it's still pretty small.

04:20.21
Sean
Yeah, yeah.

04:24.92
Sean
Yeah, it just looks like it knows how to use Tailwind.

04:29.41
Andrew
Yeah. I'm curious how they're going to charge for UISH, because if it is just a set of skills, that seems like a hard thing to charge an ongoing subscription for, because once people have the skills and...

04:30.51
Sean
Yeah.

04:33.07
Sean
Yeah.

04:46.68
Andrew
Unless you never actually get access to the skill files, and it's all remote.

04:53.42
Sean
That's terrible. I would not use that. I would just, yeah.

04:57.49
Andrew
Yeah, it seems like it would be really slow.

05:00.10
Sean
So I don't know how you would stop it from, like, just me going "rebuild yourself," you know?

05:02.79
Andrew
Downloading the files.

05:08.24
Andrew
Yeah.

05:08.53
Sean
Yeah, yeah.

05:10.96
Andrew
It seems like there's going to be a piece beyond that, because I've also heard them talking about it kind of like a tool, so I'm wondering if there's more to it than just skill files.

05:21.94
Sean
Maybe it's like, you know how a lot of those styled component libraries have theme creators? Maybe you get access to that. You get to theme Tailwind, and you download it as a specialized skill just for you.

05:36.66
Andrew
Maybe.

05:36.76
Sean
And then that's your client. I don't know. I got no clue.

05:39.92
Andrew
Yeah, but Impeccable is trying to do the same thing, to solve the same problem of teaching these things how to have design taste, right? I don't need it to have design taste necessarily, because I feel like Austin and I can provide that.

05:47.25
Sean
Yeah, yeah, yeah.

05:56.24
Andrew
I just need it to stop being so frickin' inconsistent, and I need it to be able to make a button fucking functional.

06:00.88
Sean
Yeah.

06:07.24
Sean
Yeah.

06:08.08
Andrew
Like it just seems so bad at front end stuff still.

06:11.00
Sean
Yeah.

06:11.70
Andrew
Like functional front-end stuff, not even design front-end stuff, just making front ends that work.

06:12.88
Sean
It's not going to...

06:19.50
Sean
Yeah, it's not good at back end either, for what it's worth. The amount of times I got slash v1 slash api and then another service gets built as slash api... there are so many mismatches. Yeah.

06:35.57
Sean
Yeah, I have seen the sharp-decline situation. Claude Gate. You can call it Claude Gate.

06:43.04
Andrew
Claude Gate, I like that.

06:43.76
Sean
Claude Gate.

06:46.52
Sean
So did you watch the Glasswing?

06:50.11
Andrew
No, I figured you would be paying attention to that. This is Glasswing. There are two names for it: there's a name for the model, and then there's Glasswing, which is the name for the project or something like that.

07:02.95
Sean
Yeah, yeah. There's Claude Mythos, which is the...

07:06.24
Andrew
Also say hi to Vanta for me.

07:08.09
Sean
Oh, yeah, Vanta says hi. Vanta's my cat, for anyone who can't see what's happening. Yeah, there's Mythos, which I think is the model, because people are finding 0-days with it. Going back, I have question marks about the whole thing.

07:21.52
Andrew
Yeah.

07:27.49
Sean
The most annoying thing about the Glasswing promotion video was Dario saying something along the lines of: we set out to build the best coding agent, and in turn, because of that, we built the best security agent. I was like, mm, mm, I have feelings about what you just said there.

07:57.27
Sean
Those are two very different things.

08:00.02
Andrew
Also, finding zero-days is one very, very, very small piece of cybersecurity, of the actual work it takes to do security.

08:06.73
Sean
Right, right, right. That being said, someone did find six-plus Chromium bugs by turning Claude into a fuzzer. So I don't know, man. Chromium bugs, they pay out a lot of money for that. Much more than your token spend. But...

08:24.08
Andrew
Yeah.

08:32.15
Sean
Yeah, so what are you doing now? Are you still on Claude? Are you on the whole Claude-plus-Codex train?

08:38.04
Andrew
No, I'm just cheap and don't want to pay for Codex on top of paying for Claude Max.

08:41.14
Sean
Yeah.

08:47.12
Andrew
I shouldn't be cheap, but I am, and I just don't want to pay for all these things. So I'm still trying to figure out how to make Claude more functional, how to prompt it better.

08:59.66
Andrew
I'm probably gonna play around with Superpowers, play around with tweaking my settings a little bit.

09:09.45
Sean
Yeah, I do wonder. I'm sure there is a sharp decline, but I think the other thing that's been subsidized, besides token count, is your... not you, but someone's ability to intuit what they need to do with Claude. Because I guess the point I'm getting at is,

09:09.70
Andrew
Yeah.

09:34.91
Sean
Let's say Opus is now definitely a shittier model, right? But because Opus was so good, we could be a lot more vague and just trust it to do everything.

09:47.26
Andrew
Yes.

09:49.51
Sean
But what we should have been doing, and what I feel like a good portion of us were saying, is: hey, you have to think about harnesses and context engineering and all this sort of stuff, constantly experiment for what makes the most sense, all that sort of blah blah blah. So now we've got, I guess, a double whammy of a shittier model plus bad workflows that weren't token optimized.

10:01.08
Andrew
Mm-hmm.

10:16.57
Andrew
Yeah, over the past couple months, people got too reliant on Opus and built bad habits. And now you can either make Opus better or you can improve your habits.

10:30.61
Sean
Right. That being said, I also feel like a lot of the people who are doing really cool stuff and are on the leading edge were also saying that Opus is shittier. So who knows, who knows.

10:30.57
Andrew
Yeah.

10:44.21
Andrew
Yeah, man. I do find the dialogue exhausting, because someone will be complaining about Opus being shitty, and in the next tweet someone will be talking about how we're all fucked because Opus is so good and it's just running their entire life for them. And I'm just like, good God, I'm so tired of it.

11:05.83
Sean
Yeah.

11:06.89
Sean
Yeah. Me too, me too. Every day is a new thing. On the internet.

11:18.29
Sean
The worst thing is, I know that if I go outside right now in New York and I go, hey, do you know what Claude Opus is?

11:30.52
Sean
Nine out of ten times, I think someone's going to be like, what are you talking about?

11:40.13
Andrew
Yeah.

11:40.89
Sean
I'm going to look like I have AI psychosis, which I definitely do. But yeah.

11:44.25
Andrew
You do, yeah. You for sure do. I want to spend more time with those people and less time with people who are just in the weeds all the time. Although at the same time, I do sometimes feel the urge... like, my friends who still haven't used anything but free ChatGPT, I'm like,

11:45.90
Sean
Yeah. Yeah.

11:52.16
Sean
Yeah.

12:08.89
Andrew
you need to learn. This is going to be a problem. You need to get ahead of this.

12:14.49
Sean
Yeah.

12:15.41
Andrew
But maybe they don't. I don't know, man.

12:18.10
Sean
Do you... Have you pilled anyone?

12:25.82
Andrew
I don't think so. Yeah, that's kind of what I'm talking about. I sometimes feel, in equal measure, like I have a responsibility to AI-pill the people I love, and also like AI-pilling the people I love is the worst thing I could do for them.

12:37.78
Sean
Yeah.

12:42.88
Andrew
it's like

12:49.03
Sean
Yeah, same. But for what it's worth, I did AI-pill Ben, and it's kind of sweet. It's kind of dope. We now have a Friday morning call between him, me, and Yarek called knowledge sharing, because we're all out here just burning tokens.

13:13.57
Sean
Just yeah.

13:13.53
Andrew
Yeah. Lighting them on fire.

13:16.70
Sean
Yeah.

13:17.50
Andrew
Tell me about the system you were tinkering with. I got this long-ass text from you at midnight the other night.

13:24.25
Sean
Yeah, yeah.

13:25.02
Andrew
And I read it and had that... what's the meme? It's like, good for you, and I'm not reading that, or whatever.

13:34.75
Sean
Yeah.

13:35.45
Andrew
Do you know which one I'm talking about?

13:37.85
Sean
No, but yes. The concept, not the... yeah.

13:41.85
Andrew
What is it? I can't remember. There's a meme that's like, oh, good for you, or I'm sorry that happened.

13:49.12
Sean
Yeah, yeah, yeah.

13:52.28
Sean
OK. That's how I felt about your voice message. I was on a client call.

13:56.35
Andrew
Yeah.

13:56.79
Sean
I clicked the transcript. I was like, god damn. Sucks.

14:00.32
Andrew
Yeah, yeah, yeah.

14:01.14
Sean
Yeah.

14:04.83
Andrew
Yeah, I got to spend all last week trying to fix my shitty bookkeeping from the last year and a half, which was miserable. I did the dumbest thing in the entire world. I tried to do the whole Profit First method with my Mercury accounts, where I had multiple accounts.

14:20.41
Andrew
And so that made everything complicated because there's like lots of transactions flowing around.

14:24.35
Sean
Yeah, yeah.

14:24.87
Andrew
Okay,

14:25.33
Sean
That's called structuring in the money laundering world. I mean, not that you're laundering.

14:29.37
Andrew
Great.

14:32.89
Andrew
No.

14:33.05
Sean
You're already clean money. Yeah.

14:36.34
Andrew
Although there was a day where I was trying to transfer money between accounts and I forgot that I had auto transfer rules set up.

14:41.46
Sean
Yeah.

14:42.58
Andrew
And so I created this thing that would 100% look like money laundering, because it's just money flowing around in ways that make no sense.

14:51.60
Sean
Yeah.

14:53.34
Sean
I mean, it's a great way to, you know, do those "show your bank balance," "show your revenue" things.

14:53.66
Andrew
But the

14:59.66
Sean
It's a great way to spike the revenue number out real high.

15:02.40
Andrew
No, it spiked my costs, because it made it look like I had all these crazy expenses, because I kept withdrawing the same money over and over again.

15:09.82
Sean
Right, right, right.

15:12.67
Andrew
But the really dumb thing I did is I created a new account, gave it the same name as an old account, and changed the name of the old account, instead of just giving the new account the new name.

15:11.33
Sean
Okay.

15:21.75
Sean
Oh no.

15:24.08
Andrew
And so then I had to dig through all these transactions to figure out which ones belonged to which. It took me so long to untangle it. It is all orderly and organized now, but I was like, dear God, why did I do this to myself?

15:40.24
Sean
It gives me the same feeling as when I heard about Robinhood for the first time, way back. I was like, maybe money shouldn't be that easy. Maybe you shouldn't make the UX that easy. You know, getting rid of fees, maybe not that good of an idea. Yeah.

15:58.83
Andrew
There should be some friction to creating massive bets that could bankrupt you overnight.

16:00.94
Sean
Yeah.

16:05.11
Sean
Yeah.

16:07.09
Andrew
Yeah, yeah, fair. All right, tell me about the system.

16:08.95
Sean
Right. Okay. Okay. Okay. So. Okay. Okay.

16:12.69
Andrew
You're going to have to catch me up on the whole Y Combinator dude, the G stack or whatever, because I saw that and was like, no, I'm not reading this shit.

16:21.52
Sean
Yeah, same, same. I saw Gary Tan's... so Gary Tan has a thing called G stack. G stack is like a bunch of different skills, and it's also instrumented with OpenClaw, and it can autonomously build out things at the same time. People are kind of clowning him because he's out here talking about lines of code, and people are like, measuring your software engineering skill by lines of code per day is a crazy claim.

16:47.51
Sean
It was like a crazy way.

16:48.01
Andrew
Almost as bad as the Facebook guys measuring token counts.

16:50.31
Sean
Yeah. Token maxing, yeah. It was a weird thing to brag about. But anyway, it felt too complicated. I was thinking about what you said last time, where Conductor doesn't remember... or sorry, the agents don't even know that Conductor really exists, so it's hard to have that conversation. Did you know that Claude has its own... have you heard about Claude agent teams?

17:16.15
Sean
So there's a setting that you can turn on, it's in experimental mode, where Claude now does the whole Conductor thing by itself.

17:18.10
Andrew
No.

17:24.12
Sean
You just tell it to do agent teams and then it spins up a bunch of sub agents.

17:23.86
Andrew
Of course.

17:27.88
Sean
The master agent has a task list, a shared task list.

17:31.09
Andrew
Yeah.

17:31.62
Sean
All the sub-agents are tasked to do all these sorts of things. I was like, okay, cool, I don't need the Conductor thing anymore. And I was thinking about how I think about product creation, which is... typically I'm not a very good MVP guy, as you know.

17:50.41
Andrew
You don't say, Sean.

17:51.56
Sean
Yeah, I'm not a very good MVP guy. My process for thinking through a product and writing the spec is that I work with just Claude on desktop or whatever, and I design out a lot of different screens, mainly because that part was easy and I wasn't doing Claude Code stuff a lot yet. So I would just spec these things out, create these PRDs, create sample screens and things, and just save it all, and think that at some point I'm gonna go back and actually build this out.

18:29.77
Sean
Because with Claude Code and everything, it still feels very hands-on, right? Even with Replit's autonomous max mode, it runs for 20 minutes and then you come back and check, and it still takes the entire day to run.

18:45.59
Sean
So then I was like, what if I just build a system where it's not spec-driven by feature, it's a full spec of a product?

18:57.29
Andrew
Okay.

18:57.68
Sean
All the things that I want. It has everything, whether it's the API or the design system or whatever, figured out, and also the ontology completely figured out. I dump it into this thing, and, as I was sort of workshopping it, the idea is that it spins up a Codex agent.

19:25.95
Sean
And that's the main orchestrator. Salvo is the name of, I guess, this full process or thing. It's just a script that keeps sending it to the next step.

19:38.84
Sean
The other thing is that I pre-make all the decisions: it's always going to be this stack, this auth provider, this type of database. Yes, there are going to be additional things depending on what I want to build, but the lowest common denominators are all agreed on already, and I'm working within those constraints.

20:01.97
Sean
Okay, so it spins it up. Basically, there's an agent that reads the spec, and it also has Claude read the spec. It's like a kind of council of agents. They both give feedback on it.

20:15.57
Sean
The spec gets rewritten based on any assumptions I've left over, whatever. It makes a spec that they can both completely approve.

20:26.37
Sean
That spec then gets decomposed into a bunch of different sprints, each sprint into a bunch of different issues, and each issue into even more specific sub-issues. The sub-issues are almost pseudocode-esque. It all lives in Linear through a Linear MCP, and that acts as the canonical backlog.

20:49.39
Sean
So then we spin up Claude Code agent teams, and it goes sprint by sprint. It hits the sprint, it finishes all the tasks.

21:02.04
Sean
It creates a sprint review task, which Codex then goes and reviews. And once that's done, it makes a PR in GitHub.

21:12.40
Sean
Once it makes a PR in GitHub, it kicks off a GitHub Action. I haven't done this part yet, but the idea is to also use Semgrep, like, Community Edition, to do the static analysis part.

21:24.85
Sean
And then it keeps going through these sprint loops until, about halfway through, there's probably some version of at least the web app. So maybe sprint six.

21:35.09
Sean
Claude is also told to write all these Playwright scripts, so now we're doing end-to-end Playwright testing every single time. The Playwright agent also creates a new project in Linear and logs all its findings, so basically every single sprint gets fully reviewed until Codex is happy with it and allows it to move forward. And then at the very end, once all those sprints are done, Salvo also spins up a Shannon or a PentAGI, some sort of open-source AI pen testing tool, and just hammers it.

22:12.84
Sean
And then it goes through the remediations and findings. Basically, the idea is that if I have a really thought-out spec and this opinionated stack

22:24.05
Sean
that I'm always gonna use, I should be able to throw that all in, hit run, go to sleep, and wake up in the morning with a dev instance of a nearly done production app.

22:36.86
Sean
And then the last part is me doing an actual user review, calling out any issues, putting them into Linear, and hitting run again. It's one final sprint before it's fully ready to deploy, with a full Docker Compose and all that sort of stuff. And this is all stuff decided originally, right? It's always going to be a Next.js web app. It's going to run on Coolify, which is going to be put on Hetzner, that sort of stuff.
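The control flow Sean is describing (council spec review, decomposition into sprints and issues, agent execution, a reviewer gate, repeat) sketches out roughly like this. Salvo itself isn't public, so every function name here is a hypothetical stand-in, and the sprint and issue counts are canned stubs for illustration:

```python
# Rough sketch of a Salvo-style orchestration loop. All names and
# behaviors are hypothetical stand-ins, not Salvo's actual code.

def council_review(spec: str) -> str:
    """Both reviewer models critique the spec until both sign off (stubbed)."""
    for reviewer in ("codex", "claude"):
        spec += f"\n[approved-by: {reviewer}]"
    return spec

def decompose(spec: str) -> list[list[str]]:
    """Break the approved spec into sprints of pseudocode-ish issues.
    Stubbed as 2 sprints x 3 issues; the real thing would file these
    in an external backlog instead of returning them in memory."""
    return [[f"sprint-{s}-issue-{i}" for i in range(3)] for s in range(2)]

def run_sprint(issues: list[str]) -> list[str]:
    """Pretend an agent team works through every issue in the sprint."""
    return [f"done: {issue}" for issue in issues]

def sprint_review_passes(results: list[str]) -> bool:
    """Stand-in for the reviewer gate that must approve each sprint."""
    return all(r.startswith("done: ") for r in results)

def orchestrate(spec: str) -> list[str]:
    """Run the loop: review the spec, then execute sprint by sprint."""
    approved = council_review(spec)
    completed: list[str] = []
    for sprint in decompose(approved):
        results = run_sprint(sprint)
        if not sprint_review_passes(results):
            break  # reviewer rejected; stop and surface for human input
        completed.extend(results)  # real version: open a PR, kick off CI
    return completed

print(orchestrate("my product spec")[0])  # -> done: sprint-0-issue-0
```

In the pipeline as described, the decomposition step would write to Linear via MCP and the sprint step would drive Claude Code agent teams; the stubs above only preserve the shape of the loop.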

23:05.97
Sean
Anyway, it doesn't work, by the way. I couldn't get it there. I keep hitting things. At this point, it is Wednesday midday, and I've run out of my weekly limits on both Codex and Claude.

23:19.78
Sean
I've burned through 100% of my weekly limits. So, yeah.

23:23.48
Andrew
Sean, congratulations, you have invented waterfall development.

23:29.17
Sean
Thanks. Thanks.

23:31.48
Andrew
Well done.

23:32.18
Sean
Thank you. Thank you.

23:33.97
Andrew
Okay, so this perfectly feeds into what I've been thinking about and feeling frustrated about with Claude and agentic development workflows, which is basically that I think AI heavily incentivizes us to make bad product and time management decisions.

23:59.33
Andrew
So what you're describing is being able to magically build an end-to-end plan, and then, instead of building things in small chunks and testing them as you go, building everything at once.

24:13.14
Andrew
And in

24:13.95
Sean
Wait, wait, aren't I testing them as I go, though?

24:14.06
Andrew
Well, you're relying on agents to test things as you go.

24:19.21
Sean
Oh, sure, sure, sure. Sure.

24:20.62
Andrew
And so you're trying to give the agent a big, massive plan and then be like: no human involvement, the agent can handle everything, and I'm just going to get a magical working prototype or working MVP out of the other end.

24:35.07
Sean
Maximum desirable product, if you will.

24:36.69
Andrew
Which is the same thing that people did when they hired agencies.

24:38.10
Sean
Uh-huh.

24:42.47
Andrew
They were like, I want this big, massive plan with everything I could possibly want, and then I just want you to build it, and then I just want it all to work.

24:46.55
Sean
Mm-hmm. Mm-hmm.

24:49.87
Andrew
And that's just not really how... We invented agile development for a reason, because that's not how things work.

25:01.61
Andrew
And if agents were AGI-level, maybe this would work. And I am really curious to try experimenting with Playwright, to see if I can get Claude to do better end-to-end testing.

25:14.77
Andrew
I've seen some people talking about having some success with that. But the thing that I've been feeling as I've been building MetaMonster is just how many rough edges there are and how much polish it takes to make something really good.

25:30.64
Andrew
And that polish, I believe, still takes human input. I just don't think agents are far enough along to make the good critical decisions that lead to a polished product yet.

25:45.36
Andrew
And so when you try to outsource all of that thinking, you end up with this massive backlog of human input needed, where it's like: I have this big thing, and now I need to test it and go find all of the little things that are wrong.

26:00.23
Andrew
And polish them. Versus if you polish as you go, it's this classic skateboard approach: you're building something good and slowly making it better. Versus you end up with this mess, and you're trying to edit and prune and polish and tweak, and it just becomes overwhelming, and it becomes very easy to get mired in crap you don't need. And I think because AI makes it easier to build things, it makes scope creep even more attractive. And there are people who go, oh yeah, but scope creep doesn't matter anymore because it's free.

26:44.71
Andrew
But it's not free. AI makes us think that it's free. Until we have actual AGI that has really good taste, and the perfect memory systems everyone claims to be building, where it really has full context of your project and your customers and what you tried last week and all of this stuff, all you're creating is a bigger human backlog.

27:03.97
Sean
Right.

27:13.45
Andrew
And the blocker is still the human input. So you're incentivized to expand the surface area faster than the humans can keep up.

27:19.54
Sean
Mm-hmm.

27:25.86
Andrew
And so you're just incentivized to build crap. And then there's this whole other incentive around systems. Everyone's building these crazy systems, or claiming to, and it becomes very tempting to spend two weeks on the system. Then you look up and you're like, wow, two weeks went by and I haven't made any meaningful progress on my actual product.

27:51.93
Sean
Mm-hmm.

27:51.93
Andrew
And so... I'm not an AI doomer who would go so far as to say we aren't able to move faster. But AI is this very useful tool that's tempting in a way that can

28:12.05
Andrew
slow you down as much as it can increase your speed. And it takes real product discipline to figure out the balance. I'm not saying we've figured this out.

28:22.81
Andrew
It's something that I'm struggling with.

28:23.90
Sean
Sure.

28:26.46
Andrew
So for the last two weeks, I've been doing SEO work with MetaMonster, and I'm running into all the little things that we would miss if I wasn't doing SEO work with MetaMonster.

28:38.40
Andrew
And the agents just haven't found this stuff. The agent is like, yeah, this thing works. Then I go test it, and I'm like, no, it doesn't. In theory, this magical system of Playwright and everything might be able to catch some of it, but some of it isn't even a matter of whether it works or doesn't work. It's that how it works is painful, or doesn't fit a human workflow, or something. And so, yeah, it feels like...

29:08.50
Andrew
You know, AI feels kind of like the Ring in Lord of the Rings that's just constantly whispering its power to us. And yes, it can do crazy shit, but it can also fuck you over.

29:16.74
Sean
Yeah.

29:22.40
Sean
I agree with you for the most part. I think the...

29:37.37
Sean
Yes, because there's a cat in front of me. No one else can see it if you're not watching.

29:42.73
Andrew
If you're watching on YouTube, you can see it's the cutest thing ever.

29:43.11
Sean
Yeah. Yeah.

29:45.96
Andrew
Also, I love that Vanta's name is so perfect right now, because Vanta's just sucking in all the light. It's just like a shadow moving across the screen.

29:56.18
Sean
I agree with you, but what if all of the work was done before development started? What if all of the planning... think about it in terms of building a house, right? And, for the record, I'm not saying this works. In fact, I got like a day out of it and decided the time it takes to do this is not worth it versus me just literally manually doing some of this: getting the app up and then testing the main flow. So I'm with you. For the record, I think that's me not knowing how to build this orchestrator well, because the workers kept dying. Like, I haven't...

30:45.90
Sean
But I agree with you, it's also very slot-machine-esque. I'm effectively building a very complicated agent slot machine that spits something out at the end. I think my thought is, maybe that just means the spec wasn't good enough, though, you know?

31:04.61
Andrew
But this is the classic problem. Again, this is the fallacy of waterfall development, and why agile development got invented in the first place. Everyone has been saying this for fucking 30 years: oh, if I create a good enough roadmap from day one, if I just create the right spec from day one,

31:14.80
Sean
Yeah. Mm-hmm.

31:21.80
Andrew
then I don't need to build things in little chunks. I can just build the whole thing at once. But the reality is you don't know what you don't know at the very start of a project. You learn by doing.

31:35.81
Andrew
You learn by building things and testing them and going, oh, I thought it would work this way, but once I actually have this in my hands, no, it doesn't work that well. I need to do it a different way.

31:47.17
Andrew
And like people have been using the house metaphor for ages, but like houses and products are fundamentally different things.

31:50.44
Sean
For sure.

31:52.96
Andrew
And I would be willing to bet that architects would test as they go if they could do it easily. But a house is a fundamentally different thing: you need the frame before you can put the door in, and whatnot.

32:12.71
Andrew
And products are that way a little bit too. I've never been an agile purist who says you can't plan anything into the future at all, that you have to just look one week ahead.

32:25.78
Sean
Mm-hmm.

32:26.92
Andrew
But I think what I am experiencing with Metamonster is that AI makes it very tempting for us to increase our surface area. And the more we increase our surface area as a small team, the harder it is for us to spend enough time with all of the facets of the product to really get them dialed in to where they feel really smooth and really, really good.

32:51.02
Andrew
And just sitting down and thinking about all this stuff ahead of time would not fundamentally work, because I don't know what feels good and what feels bad until I'm using it.

33:03.34
Sean
Oh, wait, wait. But, but, okay, hold on. But what if you're also designing like a prototype of the front end before you do any of this stuff?

33:12.56
Andrew
I still think there's a difference. Austin and I have done this a little bit this round, and it was helpful. We built some prototypes with AI, and it was really cool to use AI to prototype facets of the product to figure out

33:27.95
Sean
Sure.

33:31.22
Andrew
what worked and what didn't. It was better than mockups because we could use it and test it.

33:36.76
Sean
Yeah, yeah.

33:37.92
Andrew
And so again, I'm not an AI doomer; I'm using AI to speed up. I think it's a local max versus global max kind of thing.

33:48.52
Andrew
Like, I think AI can absolutely speed up your time to get to a local maximum.

33:48.69
Sean
yeah yeah

33:54.20
Andrew
But if you try to skip the local max to get straight to the global max, you end up with a mess, and you still need to take things a chunk at a time.

34:05.70
Sean
Right.

34:05.85
Andrew
So we built these prototypes, but we still didn't really know how things were going to work until we had the full end-to-end process. And a prototype is still a prototype; it's still limited in what it can do. So, figuring out how to get the data out of the tool and into the CMS is something I've been doing lately, and I don't know that there was a way to magically know how to do that well, other than to sit down and do it.

34:44.19
Sean
Sorry. I don't think I disagree with that. I mean, what happens is it builds a thing, and then you realize that your spec was not good enough in the first place.

35:04.96
Sean
So you need to do all that sort of stuff. I think the way...

35:10.63
Sean
I

35:14.66
Sean
think one of the reasons why waterfall doesn't work is because the waterfall process for a product is like six months of time, you know? Whereas in this case it's less than $500 and a night of time. I think where I'm

35:36.86
Andrew
This is exactly the fallacy that everyone is telling right now, and i disagree.

35:40.24
Sean
Huh, okay.

35:41.23
Andrew
Like, yeah.

35:42.47
Sean
But you don't, okay, so like, like

35:42.95
Andrew
But keep going, keep going.

35:46.10
Sean
the problem is that if you keep adding stuff onto it and making these changes and whatever... okay, I think there's one big difference. And my requirement is I really don't want to write code.

35:59.39
Sean
Like, I would like to write as little code as humanly possible.

36:04.00
Andrew
Check.

36:05.61
Sean
So how do you get there? Because the agents are amnesiac, and they lose context and all this sort of stuff. What I found is, as I started building things from like a v0.01 and kept building up, there's a breaking point where it,

36:32.49
Sean
like, is unable to keep moving further, because it has made a lot of weird, shitty slop-code decisions along the way. Because I didn't know any better about this stuff.

36:44.51
Sean
So I guess what I'm saying is, the way I've been seeing the spec that I input into Salvo is: after I've shittily vibe-coded a lot of different prototypes, tried a lot of different things, and made specific decisions on what something is, then it runs out the entire intended product.

37:06.52
Sean
And I'm not sure that's gonna be the most perfect thing yet, right? That's why there's that final human smoke test sort of thing. If it's completely wrong, that means something else was very wrong in the planning. Otherwise, it's likely I need to add on a couple of other features and whatnot. That can just go back into the spec: you write a new spec and you run it. So it's still this waterfall process. But I mean, it's still cycles.

37:31.40
Sean
It's just larger cycles, where instead of agile sprints where you tack things on, you refactor the entire code base from the beginning and shit out the final product again.

37:46.56
Sean
Because you know all the flows that you want at that point, see. But, I mean...

37:50.43
Andrew
So this is fundamentally different from what it sounded like you were initially describing, which was that you were going to sit down at day zero of a project, write this magical perfect spec, just pop it out of your brain, and then build the product, and it was just going to work.

38:09.86
Andrew
which I think is the myth that so many people are selling. What you're now describing is more of an agile process: getting to a point where you need to do a big refactor, and trying to build a system where AI can do that refactor in one go instead of in smaller chunks.

38:29.28
Andrew
Is that right?

38:30.93
Sean
I think it's the combination of both. The refactor requires a magic spec, right? But the magic spec is not a day-zero sort of thing so much as: I have been fucking around with it enough times to now know more and more of what I want and what this needs to shape up like, so that I can run it out again. Okay, this is what I'm trying to prevent.

38:56.30
Sean
Sorry, one second. That AI notetaker idea that I mentioned was really cool. That MVP got people excited. And I can't take that shit live, because a lot of weird decisions were made along the way. I've refactored a lot of things.

39:24.34
Sean
But because of that prototype, I know all the things that I wanted to do. And I can turn that into a spec that has all those elements, and keep refining it, and keep asking AI to red-team this idea and tell me what I'm missing and where the assumptions are, until I feel good about it. And now I'm giving it a full design system and application reference of what the front-end shell needs to look like.

39:54.40
Sean
And the Andrej Karpathy wiki version of this thing. And then I throw it into Salvo and get something out, and like...

40:09.42
Andrew
Yeah, we're talking about fundamentally different things. You're talking about, again, prototyping something, getting to the point where you have a good idea of what you want, and then, when you need to do a bigger refactor to bring things up to speed,

40:28.62
Andrew
building a system that can handle larger chunks of work instead of small chunks of work.

40:34.59
Sean
Yeah. Yeah, yeah, yeah.

40:35.30
Andrew
I'm talking about the allure of AI encouraging people to build more than they can actually manage.

40:35.39
Sean
Totally.

40:38.74
Sean
yeah yeah

40:42.47
Andrew
Because I fundamentally believe that the limiting factor is still human input: our time, our capacity to think.

40:55.68
Andrew
And it is very easy to just build, build, build and skip the hard work of testing and refining and figuring out what works and what doesn't. And then you end up with a pile of slop.

41:10.63
Andrew
This is the same reason I'm against people AI-generating a shit ton of content for their website and reviewing it with AI agents instead of reviewing it with humans.

41:10.93
Sean
Totally.

41:19.26
Sean
Totally.

41:19.36
Andrew
I just don't fundamentally believe that agents are capable yet of being totally independent, from my personal experience using agents a shit ton.

41:27.32
Sean
Totally.

41:33.11
Andrew
And I think there's this very corrupting nature of AI, where it's just so fucking easy to scope creep, scope creep, scope creep.

41:42.82
Andrew
And it takes discipline.

41:43.13
Sean
Totally.

41:45.22
Andrew
And I think the downside to the approach that you're talking about is still the same downside the big refactor always had, before agents existed, which is: if you

42:00.24
Andrew
do this big refactor all at once and then hand it over to a human to test, there's just a lot to test, and it's kind of overwhelming to test it all. Versus, if you break it up into little chunks and test as you go, it's much easier to manage as a person.

42:19.84
Sean
For sure. For sure.

42:21.51
Andrew
So I think there's still a little bit of the same point that I'm trying to make in the process that you're trying to build, but yeah, we are talking about slightly different things.

42:32.39
Sean
Yeah, yeah, yeah. For what it's worth, I agree with you. I think that's why Gary's blog loads both the desktop version and the mobile version at the same time, no matter which device you're on.

42:46.03
Sean
Like, you you load two different websites, right?

42:48.96
Andrew
Yeah.

42:49.80
Sean
Like, uh...

42:51.52
Andrew
And I think the AI maximalists would say it doesn't matter: it's worth the overhead cost, it's worth the rough edges. And if you just lean into using AI and building these AI systems that review themselves, eventually the models are going to get good enough that you don't need humans, and you'll already have the systems in place.

43:18.86
Andrew
But I'm like,

43:19.00
Sean
Yeah. Yeah.

43:20.42
Andrew
if the AI gets that good, the AI will build the system itself. And in the meantime, it's just not there.

43:23.65
Sean
yeah

43:27.39
Sean
Yeah. For what it's worth, I hate when people say it doesn't matter because the models will get good enough. Like, we keep saying AGI is here, or is coming, or is going to be here next year; we said 2026. This Opus situation is not any more convincing that the models will get better. So yeah, I think that's fair. I've seen similar things as well, where it's like, oh yeah, the one-shot concept, these companies can just one-shot it. And I don't know. I think you are missing

44:07.70
Andrew
Mm-hmm.

44:18.76
Sean
a lot of what happens past just the web GUI that you see, which is also why I find it hard. That's why I roll my eyes at the whole Dario thing of, we built a really good coding agent.

44:32.43
Sean
Therefore it's a great security agent. It's those

44:35.01
Andrew
Yeah, and it's why I rolled my eyes at people saying that SaaS is dead. I'm like, have you tried to build a SaaS recently? Like, a good one? It's still fucking hard. There are just so many edge cases and rough edges to polish.

44:47.32
Sean
yeah.

44:47.61
Andrew
Like, so many people right now are saying SEO is dead. And, you know, I've got this Claude Code skill that can do everything

44:59.08
Andrew
that any SEO tool can do. And I'm like, dog, I'm trying to build an SEO tool right now. And there's no fucking way Claude Code can actually sync all this shit to a CMS.

45:09.32
Andrew
And there's no way that Claude Code has a good interface for going back and forth with the client on changes.

45:10.44
Sean
Yeah.

45:16.08
Sean
Yeah.

45:16.55
Andrew
Or can one-shot an interface for going back and forth with the client on changes. Like, Claude Code still can't upload stuff to Google Drive very well.

45:21.14
Sean
Yeah.

45:27.00
Andrew
Like it's a pain in the ass.

45:29.34
Sean
Yeah, I mean, by the way, something for you to know: I think your integration with Webflow may be difficult.

45:41.28
Sean
Same thing with the whole margins thing. I think at some point recently their Designer API or CMS Data API changed. I can't do it, because their rich text element is literally HTML tags, so any custom code that you drop in just gets escaped and translated into a string now.

46:04.15
Sean
Yeah, yeah. Imagine us finding out in the middle of a content migration for a large client. It was like, oh, we have to manually go and add these things.

46:13.39
Andrew
Did we talk last week about how I tried to have Claude build a Wix integration, because one of my clients right now is on Wix, and it just made up APIs that didn't exist?

46:14.32
Sean
Uh-huh.

46:19.46
Sean
Yeah. Yeah, yeah, yeah

46:23.52
Andrew
And then when I asked it to dig in more, it was like, oh yeah, this is impossible.

46:23.86
Sean
yeah.

46:26.76
Andrew
The APIs just aren't there. And I'm like, fucking great, awesome.

46:28.52
Sean
Yeah. Yeah.

46:32.01
Andrew
So that's why I've spent the last week copying and pasting alt text into Wix.

46:31.36
Sean
Yeah.

46:36.46
Sean
Hell yeah.

46:37.00
Andrew
I'm...

46:37.82
Sean
You gotta get your 10,000 hours in, even if it's Ctrl-C, Ctrl-V, you know?

46:40.66
Andrew
I'm getting them in, dog. I'm getting them in.

46:45.78
Sean
yeah, I think, uh, uh, what am I trying to say?

46:51.69
Sean
I think it is still best to treat the coding agents as junior software engineers who know the language well enough to pass syntax checks, and nothing else besides that, you know?

47:11.30
Andrew
Yeah. And I mean, I will say I'm not reviewing code. I'm reviewing decisions a little bit, and then I'm leaning heavily on Austin for the actual data-model stuff, to be like, hey, are we building ourselves into a corner here, or is this good?

47:26.36
Sean
Right, right, right. Yeah. What did come out of it that was actually useful was two things. One was the fact that I can use Linear as some sort of backlog that the agents can read off of. That was cool. And the other is I actually ended up implementing this new LLM wiki concept: my librarian agent, every time there's a codebase change, as well as decision logs and whatnot, logs it all into a wiki.

47:54.13
Andrew
Mm-hmm.

48:04.85
Sean
And that has helped with getting an agent up to speed without blowing up its context. When I spin up a new Claude Code session, instead of telling it to read the entire code base, it just looks at what's relevant to it in the wiki. That has helped, at least.

48:24.79
Andrew
Surely Claude is going to build a better memory system at some point soon, right?

48:29.35
Sean
For sure.

48:29.42
Andrew
Like, Anthropic's got to have a massive team working on memory and how to do memory well.

48:30.13
Sean
For sure.

48:32.85
Sean
Yeah.

48:35.13
Sean
Yeah. Yeah. I mean, I hope, I hope.

48:38.84
Andrew
Yeah. And so like I imagine all these markdown memory systems that everyone's building that are all like slightly different are eventually just going to get absorbed by Anthropic, right?

48:46.75
Sean
Yeah.

48:50.33
Sean
Yeah, yeah, yeah. At some point, they're going to build some sort of virtual file system or something.

48:57.11
Andrew
Did you see AWS just like added like file system operations to S3?

48:58.81
Sean
Yeah.

49:02.22
Sean
Yeah. Yeah.

49:03.61
Andrew
Pretty interesting.

49:03.98
Sean
I love, by the way, how much the best practices completely 180 every week. I love and hate it.

49:14.17
Sean
You know, it's like smash all your context into the agent.

49:14.55
Andrew
Yeah, yeah, it's exhausting.

49:18.25
Sean
It'll figure it out. Until it's like, no, don't give it that much context, it's drowning in context. Or, agents are great with files, versus, agents can write SQL better.

49:29.21
Sean
And anyway... yeah.

49:32.25
Andrew
Yeah, yeah, it changes constantly.

49:35.03
Sean
Yeah.

49:36.89
Andrew
Cool man.

49:38.84
Sean
Oh, yeah.

49:40.44
Andrew
Probably got a wrap there, yeah.

49:41.28
Sean
Yeah. Yeah, we got to run. All right.

49:43.19
Andrew
Sweet.

49:44.31
Sean
Cool. Good talk. Hey, it's the AI Psychosis show. We're changing the name of the podcast. AI Psychosis.

49:48.82
Andrew
AI psychosis.

49:51.58
Sean
Cool.