Andrew Shindyapin: AI’s Impact on Software Development

Andrew Shindyapin, an experienced software engineer, shares how taking part in the Gauntlet AI bootcamp gave him a new perspective on what is now possible using AI for software development.

Lessons from Gauntlet AI Bootcamp

Andrew Shindyapin is an experienced software engineer who recently completed a 10-week intensive training course on using AI for software development. The experience changed his perspective on what an individual programmer can accomplish with today’s tools, which now make it possible to deliver weeks of work in a matter of days. This interactive briefing covers his lessons learned from completing 14 applications in ten weeks.

Tools and methods covered: Claude Code, Cursor (and cursor-rules), Amp Code, RepoPrompt, the BMAD Method, OpenAI Codex, and a recap of the leading LLMs (Claude, ChatGPT, Grok, and Gemini).

Andrew Shindyapin graduated from PSU with a BS in Materials Science in 2004. He worked as a software developer from 2006 to 2018 and as a product/project manager since 2018. He is also an entrepreneur who founded AutoMicroFarm and acquired InstaTints.

Edited Transcript

Sean Murphy: We’re fortunate today to have a briefing by Andrew Shindyapin on the impact of AI on software development. Andrew is an experienced software engineer who recently completed a 10-week intensive training course on using AI for software development. It changed his perspective on what an individual programmer can accomplish with today’s tools: you can deliver weeks of work in a matter of days. This interactive briefing will address his lessons learned from completing 14 applications in 10 weeks. Andrew graduated from Penn State with a BS in materials science in 2004. He worked as a software developer from 2006 to 2018 and as a product/project manager since 2018. He is also an entrepreneur who founded AutoMicroFarm and acquired InstaTints.

Andrew Shindyapin: Thank you for that intro. I’m going to talk about AI’s impact on software development based on my personal experience in the last few months in the Gauntlet AI program, and what the most promising tools and methods are currently. Those could change tomorrow, next week, or next month; things continue to move fast in AI.

One thing that I don’t think will change is that the fundamental cycle will remain Plan-Code-Test. This cycle is not unique to AI: Fred Brooks wrote about it in “The Mythical Man-Month” in 1975, when he suggested the following rule of thumb:

  • 1/3 time planning,
  • 1/6 time coding,
  • 1/4 time doing component tests,
  • 1/4 time doing system tests.

Bottom line up front, we are currently in a learn-by-doing phase, where the most important thing is to start experimenting. Start playing with the newest AI tools; things are changing every week, if not every day. In fact, 20 minutes before this talk, I got an email about Cursor 2.0 being out. AI is shrinking the coding time and some of the testing, but planning is becoming more important. Today’s tooling is scaffolding for the next three to five years; things will keep changing rapidly, but the fundamentals will matter more and more as they do.

Let me talk about Gauntlet AI: it’s an intense 10-week training program to learn how to develop with AI. When I say ‘intense,’ I tracked my hours and logged more than 80 hours a week. In the first two weeks, we were explicitly forbidden to write any code by hand.

It reminded me of teaching my kids how to do the dishes. The first few times, the results were slower and messier than when I did the washing. But I kept at it, and they learned well enough that I could say, “Please go do the dishes,” and reasonably expect them to be done well, with nothing broken.

I actually did 15 projects, 14 apps and one website, in 10 weeks. Because we were working at the leading edge, I learned as much from my classmates as from the instructors. There was a lot of work, a lot of discussion, a lot of hashing out ideas, and figuring out what works best.

I was able to build in hours what would have taken weeks

My personal experience was that I was able to build in hours what would have taken me weeks before. As far as the tools we used: we worked from base or foundation LLMs, built on them with an IDE, then CLI tools, then methodologies, and finally extensions.

  1. Leading LLMs: Claude, ChatGPT, Grok, Gemini
  2. IDE: Cursor (my favorite)
    1. Alternatives: VS Code + Copilot, Amp Code, RepoPrompt
  3. CLI tools: Claude Code, OpenAI Codex, Amp Code
  4. Methodologies
    1. Cursor + cursor-rules
    2. BMAD Method (currently experimenting with)
  5. Extensions / APIs: MCPs, Skills

Planning should remain the primary responsibility of the human in the loop. The fundamentals remain the same: plan, code, test. What we found is that the better the plan, the less testing and rework are needed, so you spend less time on those.

In the planning phase, I would estimate how long a project would take and ask the AI for its estimate of “How long would this take to implement?”

What I found was that there’s a 40x factor: however many weeks the AI estimated the work would take, that’s roughly how many hours it actually took. So if the estimate was that the code would take six weeks to write, it took about six hours. At 40 working hours per week, six weeks is 240 hours, so that is roughly a 40x speedup.

Now the caveat is that this only covers the coding, not the time spent planning or testing afterwards. Again, that time hasn’t changed much, though it may as those phases are handled better in the future. But it still gives you an idea of what’s possible: a 40x coding speedup. Whenever you can speed things up that much, it takes AI from smart autocomplete, where it writes some of the code for you, to agentic development, where you tell the AI to write the code, give it all the context it needs, and it does it for you. Eventually, we could have something like an AI agent team. BMAD gets into this, and we experimented with it. We actually used the Claude swarm feature, but didn’t find it much faster than working with one agent at a time. It’s definitely something that might be possible and advantageous in the future.

BMAD: Breakthrough Method for Agile AI-Driven Development

I have been experimenting with the BMAD method for the last couple of weeks. BMAD stands for Breakthrough Method for Agile AI-Driven Development. We did not use it at Gauntlet because it had only recently been developed.

I highly recommend that you watch the BMAD masterclass video linked at the end; it is an hour-plus long and walks through, step by step, how to actually implement this. (See https://www.youtube.com/watch?v=LorEJPrALcg)

The idea is that there are predefined roles, or prompts, for the AI agents, and a series of questions that each AI agent asks you in order to move you through the development process in a guided way. The BMAD method seems more focused on startup projects: it takes you from the very beginning of having an idea and has you think through all the research you might need to do before you even start to implement a software project.

You start with a project idea and do analyst research. The next several steps are optional, so you can skip any of them: the brainstorming phase, the general market research phase, and the competitor analysis phase. At the end, the process helps create a project brief, which is a precursor to the product requirements document.

  1. Start: Project Idea
  2. Analyst Research
  3. Brainstorming
  4. Market Research
  5. Competitor Analysis
  6. Create Project Brief
  7. Assess Project Brief
  8. Build PRD
  9. If UX & QA Required, Refine with UX/QA experts
  10. Refine with Architect
  11. Prepare for Dev Cycle
  12. Draft Story
  13. Optional QA Risk Analysis
  14. Implement & Test
  15. Optional QA Review
  16. Commit & Mark Done
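
To make the skip-or-run structure concrete, here is a rough sketch (illustrative only, in Python; it is not part of the BMAD tooling itself) of the planning portion of the workflow above, where the optional steps can be skipped:

    # Illustrative only: the BMAD planning steps as data, with optional steps skippable.
    # Step names mirror the list above; the wiring to actual agent roles is left out.
    from dataclasses import dataclass

    @dataclass
    class Step:
        name: str
        optional: bool = False

    PLANNING_STEPS = [
        Step("Analyst research"),
        Step("Brainstorming", optional=True),
        Step("Market research", optional=True),
        Step("Competitor analysis", optional=True),
        Step("Create project brief"),
        Step("Assess project brief"),
        Step("Build PRD"),
        Step("Refine with UX/QA experts", optional=True),
        Step("Refine with architect"),
        Step("Prepare for dev cycle"),
    ]

    def plan(skip_optional: bool = True) -> None:
        for step in PLANNING_STEPS:
            if step.optional and skip_optional:
                continue  # e.g., skip market research for an internal tool
            print(f"run: {step.name}")  # in practice, a guided Q&A with the matching agent role

    plan()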

At this point, you are working with the product manager role, which helps you build the product requirements document, the PRD, from the project brief. To compare the BMAD method with what we learned at Gauntlet AI: this is where we really began every project. We didn’t take time to do competitor analysis or market analysis, because we were focused on taking a project brief—really a homework or project assignment—that was very well laid out. At first the briefs were thorough; progressively we got to partner projects, which could be of varying quality, either very well defined or not so well defined, but that’s what we were given.

We used a variety of tools instead of the BMAD method, and I’m partial to Cursor plus cursor-rules, but this is where we’d begin: we would take the project brief and build a PRD. The BMAD method is more formal. It asks whether UX or QA is required, and if so, it refines the PRD with UX/QA expert roles. (UX stands for user experience; I use it to mean UI/UX, user interface and user experience. QA stands for quality assurance.) Whether or not you do that optional step, you eventually take your PRD, refine it with the software architect role, and prepare a series of documents called stories for the development cycle.

This is what the BMAD method does. In our case, we took the PRD—it was a little less formal—and broke it into a series of large features or large tasks. Then we would take each task and break it into smaller subtasks, so you would have between five and ten large features and between five and ten subtasks per feature. In total, you would get between 25 and 100 subtasks. Now you are ready for the actual coding.

The subtasks, or larger tasks, would be called stories in the traditional agile approach. This is the cycle where the AI agent (a different one) would optionally analyze the QA risk, then the code would be written, optionally QA reviewed, committed, and marked done; then you go back to the next story. In contrast, the cursor-rules approach is more informal: take each subtask and just implement it. Optionally, the agent stops after each subtask and lets you test it.
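
Here is a rough illustration of that loop (my own sketch in Python, not code from BMAD or Gauntlet), with flags for the optional QA passes and for pausing after each subtask, the way the informal cursor-rules approach allows:

    # Illustrative only: the per-story development cycle described above.
    # The print calls stand in for prompts to whatever coding agent you use.
    from dataclasses import dataclass, field

    @dataclass
    class Story:
        title: str
        subtasks: list[str] = field(default_factory=list)
        done: bool = False

    def run_dev_cycle(stories: list[Story], qa: bool = False, pause_per_subtask: bool = False) -> None:
        for story in stories:
            if qa:
                print(f"QA risk analysis: {story.title}")   # optional, separate agent role
            for subtask in story.subtasks:
                print(f"implement + test: {subtask}")        # agent writes the code
                if pause_per_subtask:
                    input("press Enter after you have tested this subtask...")
            if qa:
                print(f"QA review: {story.title}")           # optional review pass
            story.done = True
            print(f"commit & mark done: {story.title}")

    run_dev_cycle([Story("User login", ["data model", "API endpoint", "UI form"])], qa=True)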

I usually, depending on how much time we had left, would just tell it to go ahead and implement the whole feature, and then I would test the whole feature. You can be as slow or as fast as you like. Looking at this from a more abstract view, this goes back to the plan-code-test cycle. Whether you use the BMAD method, which I feel is a little too formal for my taste—but that’s only because I’ve spent the summer doing things in a more informal way—or use cursor-rules, or something else, you really are doing the plan-code-test cycle.

This really helps you keep the AI agents you’re using on task. It helps you keep things prioritized and organized, and it makes sure you’re getting what you actually want from the AI. If we’re able to refine this cycle a little more—and by “we” I mean the broader community of AI-first engineers—we will be able to use effective agent teams, where you have multiple AI agents working on the same project in parallel, or partially in parallel and partially in sequence. More refinement of this cycle is what will get us to effective agent teams.

Team Productivity Does Not Stack Individual Results, Yet

Sean Murphy: I’m inferring that even though you’re seeing something like a 40x bonus as an experienced developer with some appreciation for architecture, when you had a team working on a project, the net speedup was much less: you were looking at something like a 3x bonus.

Andrew Shindyapin: Yeah, exactly. I didn’t really talk about the team aspect here, but we found that in one of our 10-week group projects, to our dismay, our team of four actually got less done than one of us would have alone. Part of it was context—we were operating under wildly different assumptions. But part of it is that there’s just no best-practice way yet for when every team member has a team of AI agents behind them. How do you multiply that 40x speedup even further to 100x or 200x? We’re not sure how to do that, and we’re still on the bleeding edge.

So what’s the impact? The software development and product manager roles are converging. I see that firsthand as someone who did software development for over 10 years, then switched to product/project management, and now does a bit of both. For startups, you can obviously move faster with fewer people. For enterprises, there are productivity gains, but to what extent? If you’re Amazon or Walmart with millions of users, you’re not going to fix what ain’t broke—you move extremely cautiously, and that makes sense. Unless you start a Skunk Works project; then you’re more in the startup-within-an-organization category where you can move faster. It’s hard to quantify the extent and speed of the productivity gains.

Key takeaways:

  • We’re in a learn-by-doing phase, so definitely start experimenting now.
  • Fundamentals matter more and more: getting them solid and continuing to improve them will only serve you, whether as a developer or someone adjacent to the role.
  • Stay adaptable, because today’s tooling is just scaffolding for the next three to five years.

Sean Murphy: In terms of what you’ve personally witnessed, a tiny startup with one to three developers who have some coding experience and an appreciation for architecture and design trade-offs can get a substantial speed-up in getting their MVP or first application to market. You’ve got a lot of evidence for that. The BMAD method and some of the other techniques you have shared force a developer or small team to write down their ideas so they can be reviewed, become plans and requirements, and provide context for whatever AI tools they are using.

On the other hand, if I have an idea for an application but have not written much code and don’t have a feel for architecture and trade-offs, these tools won’t enable me to ship. Current AI tools do not empower novices as much as they enable experienced developers. Other bottlenecks are not addressed, notably testing, multi-developer coordination, and shared situational awareness—at least this week.

Andrew Shindyapin: I tend to agree, but I think someone who is highly resourceful or high agency can figure out a way to do that. So a Starbucks barista with a killer idea might still be able to find a few hundred, or even a thousand, users, and parlay that into recruiting a more professional team. That’s been the case without AI; with AI, it can happen faster. It still requires clarity of thought and the ability to leverage architectural knowledge to do the right thing.

Most software developers don’t start from scratch

Question from Audience:  Unfortunately, most software developers don’t have the luxury of starting from scratch. They have to support a code base with a million or more lines of code. We deal with customers who have code that is 30 or 40 years old, still running in their development environment or production infrastructure.

Do you know what percentage of software engineering work is new applications versus maintenance and improvements to existing platforms and architectures? Because it seems to me that if you are sitting on a large code base, you’re going to be very limited in how much AI you can use, because these LLMs are trained on certain code bases and data sets. Are most enterprises really going to upload millions of lines of code to Claude and say, “Train it for me”?

So, how does AI play into the overall economy, where I believe most software engineers don’t write new code from scratch? Most are working on a codebase with a million or more lines of code that’s driving revenue. That’s what legacy code means: it means that’s how we make our money.  I don’t see how software engineers can use AI agents to maintain, improve, and integrate with that code unless they have the computing infrastructure to host their own LLMs.

Andrew Shindyapin: I’m glad you brought that up. We had a “legacy code week” during Gauntlet. One of our assignments was to update a codebase with a million lines of code. Gemini currently supports the highest token limit among mainstream LLMs at 2.5 million tokens. Reading this codebase would have taken five to ten million tokens; it would not fit into any LLM we had access to. So we had to find ways to summarize and organize the information. We used DeepWiki to load the entire codebase, and in 20 minutes, we had well-defined architectural diagrams and other useful documentation that LLMs could leverage to improve the code.
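
To give a rough sense of the arithmetic, here is a small sketch (illustrative only, using the common approximation of about four characters per token; real tokenizers vary) for estimating whether a codebase even fits in a model’s context window before reaching for a summarization layer like DeepWiki:

    # Rough sketch: estimate whether a codebase fits in an LLM context window.
    # Assumes ~4 characters per token, which is only a rule of thumb.
    from pathlib import Path

    CHARS_PER_TOKEN = 4
    SOURCE_SUFFIXES = {".py", ".js", ".ts", ".java", ".php", ".f", ".f90"}

    def estimate_tokens(root: str) -> int:
        total_chars = 0
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix in SOURCE_SUFFIXES:
                total_chars += len(path.read_text(errors="ignore"))
        return total_chars // CHARS_PER_TOKEN

    tokens = estimate_tokens(".")
    context_window = 2_000_000  # use whatever the largest model you have access to supports
    print(f"~{tokens:,} tokens; fits in one context window: {tokens < context_window}")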

Gauntlet tried to get us a COBOL example, but all that code predates the open-source era, so the oldest code we worked on was in Fortran 77, which we were able to update to Fortran 90 and work with. There are techniques for working with legacy code that help you maintain a very high-quality, high-velocity development cycle. Perhaps not 40x, but still much faster. I personally worked with PHP 7 as part of a project using SugarCRM, an open-source CRM.

But you are correct: the vast majority of the time you are working with legacy code. It makes things harder in some ways, but it also removes the paradox of choice. Instead of 17 million different combinations to explore, you know the framework, the language, and the tech stack you need to work with. You avoid the “geeks playing with toys” phase.

How will junior developers acquire experience and expertise?

Question from Audience:  Got it. Another concern I have with AI is in the longer term. I have used AI tools extensively in the last few years and have a simple model: they rely on the data you train them on and the prompts you supply. As far as I can see, and from what I have read and heard, writing a good prompt requires experience and knowing what you are doing.

For example, a junior engineer who’s never written a piece of multi-threaded code is not going to know how to write the correct prompts to implement a multi-threaded piece of code. If they have never written database code, they would not be able to write a prompt for that. So today senior developers and architects can make the best use of AI tools because they can direct them effectively and analyze what they implement.

The long-term issue is this: if young people are not writing threaded code or database code, and these young engineers are not able to find employment, do the work, and get their hands dirty, then where do the future architects come from? Who will architect these systems 10 years from now?

Andrew Shindyapin: That’s a great point. We have moved from prompt engineering to context engineering, and now to knowledge engineering.

  1. Prompt engineering is using the right prompt.
  2. Context engineering: what Cursor and other tools enable you to do is give the model the right context before you give the prompt. If you have the right context and everything is set up correctly, your prompt can be very vague. It could literally be, “Fix this.” Because the correct context has been established, the LLM knows enough to fix it.
  3. Knowledge engineering: the project builds a knowledge base or knowledge map that includes documentation, code, and everything else related to the project. It’s almost a monorepo approach, if you look at it from a software engineering perspective. Now the AI understands that this project needs to take a multi-threaded approach.

None of this takes away from the need for junior engineers to learn the basic concepts and the right terminology for the areas they are working in. In the database area, for example, when should you switch from a flat file or set of flat files to tables in a database? Which tables should be indexed? If you understand the concepts, you can prompt: “This query seems slow, please investigate and determine if the right indices have been set up.”
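
As a toy illustration of that database concept (my sketch; the table and column names are invented for the example), here is how adding an index changes a query plan in SQLite, which is the kind of thing you would be asking the AI to investigate:

    # Toy sketch: check whether a slow query uses an index, before and after adding one.
    # Table and column names are illustrative only.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    con.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                    [(i % 100, i * 1.5) for i in range(10_000)])

    query = "SELECT * FROM orders WHERE customer_id = 42"
    print(con.execute("EXPLAIN QUERY PLAN " + query).fetchall())   # SCAN: full table scan

    con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print(con.execute("EXPLAIN QUERY PLAN " + query).fetchall())   # SEARCH ... USING INDEX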

Question from Audience: That’s fair, but you cannot learn how to swim from reading a book. The bottom line is that there’ll be a lot of knowledge lost unless people apply it in context. Even if you have read three books on a subject, that’s not adequate to understand a certain industry. For example, you might have learned something about image processing, but it may not apply to graphics processing, or to image inspection of automated systems, or to animation. There is a lot of domain-specific knowledge you have to master to be effective in an industry. If you have not worked in that industry and don’t know its nuances, it imposes many limitations on what you can do. I think this applies to young people if they only read about things.

I may come across as a skeptic, but it’s because we have used AI to solve a ton of problems, and I am concerned about the long term.

The flip side is that we’ve asked AI to generate code for us and had some frustrating experiences. We struggled for eight hours to get a very simple thing to work right, something like “give me an algorithm to come up with a good range for a chart.” It seems simple until you start testing it with log data and other common cases, and it fails. So after eight hours of testing, we dumped it and wrote our own. It’s been cool to save three weeks of work using AI, but frustrating to see it unable to do something that seems simple.
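
For context, the standard textbook answer to that particular problem is a “nice numbers” routine along these lines (a sketch of the classic idea, not the code the questioner describes); as noted, it gets fiddly as soon as log scales or degenerate ranges show up:

    # Sketch of the classic "nice numbers" axis-range routine (Heckbert-style).
    # Works for plain linear data; log-scale data and zero-width ranges are exactly
    # the cases where routines like this start to fail.
    import math

    def nice_num(x: float, round_to_nearest: bool) -> float:
        exp = math.floor(math.log10(x))
        frac = x / 10 ** exp
        if round_to_nearest:
            nice = 1 if frac < 1.5 else 2 if frac < 3 else 5 if frac < 7 else 10
        else:
            nice = 1 if frac <= 1 else 2 if frac <= 2 else 5 if frac <= 5 else 10
        return nice * 10 ** exp

    def axis_range(lo: float, hi: float, max_ticks: int = 5):
        span = nice_num(hi - lo, round_to_nearest=False)
        step = nice_num(span / (max_ticks - 1), round_to_nearest=True)
        return math.floor(lo / step) * step, math.ceil(hi / step) * step, step

    print(axis_range(0.0, 9.7))     # (0, 10, 2)
    print(axis_range(13.0, 478.0))  # (0, 500, 100)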

Question from Audience: That’s a really good point. It’s probably a skill that young people will pick up faster, because they will understand that there’s a boundary between what the AI can do well and what it can’t do well. Yesterday, I spent most of the day banging my head against something. But there’s probably another skill that lets you detect when something isn’t going well, and you cut it off quickly. I suspect you can gain a good understanding of what you should do versus what to delegate, and when to change your approach.

Timebox your explorations, revisit your assumptions

Andrew Shindyapin: I think in that case, one technique I’ve used when I worked in startups and was the most senior engineer in the room is to time-box my efforts to two, four, or eight hours. If I was not able to come up with a solution within the time box, I would throw it away and try a completely different approach.

One example: I was working on a project where I had everything working locally in two days and was trying to deploy. I wanted to keep everything in the browser, and I fell into a rabbit hole. Finally, after another day of banging my head against the wall, I asked each of the four leading LLMs in turn. I had cataloged everything they suggested and everything I tried, and I said, “We have more than 50 entries in the log of things we have tried that didn’t work. What’s the best solution to this?” One of the LLMs said, “Well, you’ve tried all of this. Why don’t we switch away from trying to get this library to work in the sandboxed web front end, use a third-party service, and rely on API calls?”

I said to myself, “I think that’s cheating.” But then I went back to the original assignment brief, and it specifically said that we could use a third-party vendor. An hour later, I had fixed the problem by keeping my front end and having it call out to a third-party API. Everything worked beautifully, and I was able to get the bonus assignments done. That’s an example of how, in the real world, I would probably have done that right away: why roll your own solution when you can just use a third-party API and be done with it?

Sean Murphy: Russell Ackoff has a definition of creativity I like: “the ability to remove self-imposed constraints that aren’t in the original problem definition.” I have found this talk very helpful. It seems to me that mapping dead ends will become more important, as will time-boxing, which is really about resource management. Building on an earlier question, I think 95-98% of “real world software development” is about improving an existing codebase. There has been a lot of focus on rapid MVP development, but I think there are significant opportunities in managing the evolution of large code bases.

Andrew, I want to thank you for a great briefing and leading a thought-provoking conversation with the audience. I’ve certainly learned a lot. Thanks again.

Andrew Shindyapin: Thank you, Sean. I appreciate the opportunity to make this presentation.

Resources from the discussion

  • BMAD masterclass video: https://www.youtube.com/watch?v=LorEJPrALcg