5 interesting things (12/02/2026)

The Quarterback Paradox – I’m not sure the core claim – that recruiting for a critical position is hard even when you have a lot of data – is really a paradox, but I love and strongly agree with the post’s closing line – “As in the NFL, in organizations the hardest part is often not finding talent, but creating the conditions in which real potential does not break before it has a chance to become reality.”

https://noamwakrat.medium.com/the-quarterback-paradox-e93e4325bac1

What LEGO Can Teach Us about Autonomy and Engagement – Who doesn’t like LEGO? We all played with it as children, and some of us still build today. In this post, Pawel Brodzinski describes a neat experiment he runs in training sessions – teams first build a LEGO set under a manager’s direction, then self-organize for a second build, and consistently report higher engagement when given more autonomy. While it shows a clear effect, the experiment has some drawbacks – most notably an order effect: the self-organized build always comes second, so the engagement boost could partly stem from participants being warmed up and more comfortable rather than from autonomy alone. Always nice to read about LEGO as an adult.

https://brodzinski.com/2026/01/lego-autonomy-engagement.html

Skyll – Skills are markdown instruction files that teach AI coding agents how to perform specific tasks. Today, skills must be manually installed before a session, meaning developers need to know upfront which skills they’ll need. Skyll is an open-source search engine and API that lets any AI agent discover and retrieve skills on demand at runtime, ranked by relevance, without pre-installation. You can think of it as a package manager for agent capabilities, enabling agents to be truly self-extending and autonomous.

https://github.com/assafelovic/skyll

Skyhook.io radar – Existing K8s dashboards tend to be heavyweight, cloud-dependent, or require cluster-side components. Radar’s zero-install, single-binary approach, with real-time topology and traffic visualization, answers the need of developers and platform teams who want quick, frictionless cluster observability that can even run on a laptop. It is especially useful for DevEx-focused teams looking to reduce the friction of Kubernetes debugging and operations.

https://github.com/skyhook-io/radar

Babysitter – If you have worked with coding agents, you have probably experienced this pain: the lack of structured process control and the non-determinism of workflows. Babysitter lets you define iterative workflows (research → spec → TDD loop → quality gate → deploy) that are deterministic, resumable across sessions, and auditable – critical for moving AI-assisted development from ad-hoc experimentation toward reliable, production-grade engineering workflows and complex features.

https://github.com/a5c-ai/babysitter

The State of Coding Agents Using Local LLMs — February 2026

Last update: February 1st, 2026

Coding agents are no longer a novelty – they’re everywhere. Over the past year, we’ve seen massive adoption across startups and enterprises, alongside real improvements in autonomy, reasoning depth, and multi-step code execution. Tools like Claude Code, Codex, Copilot, and Kiro are shipping updates at a relentless pace, and teams are increasingly comfortable letting agents refactor modules, write tests, and manage pull requests.

But there’s a catch: these tools are token eaters. Autonomous agents don’t just answer a prompt – they plan, reflect, re-read the codebase, call tools, retry, and iterate. At scale, that translates into serious API bills.

That’s why we’re seeing growing interest in a different deployment pattern: running coding agents against local or self-hosted models. Ollama recently announced ollama launch, a command that sets up and runs coding tools such as Claude Code, OpenCode, and Codex with local or cloud models. vLLM, LiteLLM, and OpenRouter also provide similar integrations. That signals that this is no longer fringe experimentation. For many teams, local LLMs are emerging as a viable path to reduce cost, improve stability, and gain tighter control over privacy.


Deployment models for coding agents

When teams talk about “running models locally,” they often mean different things. In practice, there are three distinct deployment patterns – and they differ meaningfully in cost structure, performance profile, and governance posture.

  1. Local (Developer Machine) – the model runs directly on a developer’s laptop or workstation (e.g., via Ollama).
  2. Hosted (Org-Managed Infrastructure / VPC) – the organization runs the model on its own infrastructure, either on-premises GPU servers or in a private cloud/VPC (e.g., via vLLM, Kubernetes, or managed GPU clusters).
  3. Managed LLM API (e.g., Anthropic, OpenAI, etc.) – the model runs fully managed by a provider; the organization interacts via API.
| Dimension | Local (Dev Machine) | Hosted (Org VPC / On-Prem) | Managed LLM API |
| --- | --- | --- | --- |
| Cost Structure | No per-token fees. Hardware cost borne by the developer. Cheap at small scale; uneven across the team. | No per-token fees. Significant infra + ops cost. Economical at scale if usage is high. | Usage-based (per token / per request). Predictable but can become very expensive with agent loops. |
| Cost at Scale (Agents) | Hard to standardize; limited by laptop GPU/CPU. | Strong cost efficiency at high volume. | Token costs compound quickly. Expensive in large org rollouts. |
| Performance (Latency) | Very low latency locally, but limited by hardware. Large models may be slow or impossible. | Good latency with a well-provisioned GPU cluster. Can optimize with batching. | Typically excellent latency and throughput; globally distributed infra. |
| Model Size / Capability | Limited to smaller models (7B–34B typically; maybe 70B with strong GPUs). | Can run large open models (70B+), depending on infra budget. | Access to frontier SOTA models (often strongest reasoning & coding quality). |
| Quality (Coding Tasks) | Improving. “Good enough” for many workflows, especially with fine-tuned coding models. | Strong – can choose best open models and fine-tune internally. | Often highest raw reasoning quality and reliability on complex multi-file tasks. |
| Security / Privacy | Code never leaves the device. Strong for IP protection. Risk: inconsistent security posture across developers. | Code stays inside the org boundary. Strong centralized control. | Code leaves the org boundary (even with enterprise contracts). Vendor trust required. |
| Compliance (GDPR, HIPAA, etc.) | Hard to audit across distributed machines. | Strong compliance posture if infra is controlled and logged centrally. | Enterprise compliance available via contract, but still external processing. |
| Governance & Observability | Weak – hard to monitor usage or enforce policies. | Strong – full logging, auditing, access controls, IAM integration. | Strong observability dashboards from the vendor, but limited transparency into internals. |
| Stability / Availability | Works offline. Dependent on developer hardware reliability. | Controlled SLAs internally. Requires DevOps maturity. | Vendor-managed SLAs. Risk of outages outside your control. |
| Standardization Across Team | Low – “works on my machine” problem possible. | High – central model versions and infra. | Very high – single API endpoint for the entire org. |

Tools overview

Coding Agents and Model support

| Coding Agent | Local LLM Support | Hosted Support | Notes |
| --- | --- | --- | --- |
| Claude Code | ✅ via Ollama/vLLM integration | Native Anthropic | Run Claude Code with Local LLMs Using Ollama; LLM gateway configuration; LiteLLM Claude Code Quickstart; OpenRouter integration with Claude Code |
| GitHub Copilot (Agent mode) | ✅ via Ollama/vLLM integration | Cloud models (GPT-4o, Claude 3.5, Gemini, etc.) | Ollama in VS Code; GitHub Copilot with OpenRouter; GitHub Copilot LLM Gateway |
| Codex (OpenAI) | ✅ via Ollama integration | Cloud via OpenAI | Ollama Codex integration |
| Cursor AI | ✅ via Ollama integration | Cloud multi-model | Use Local LLM with Cursor and Ollama; OpenRouter with Cursor |
| AWS Kiro | ❌ local | AWS hosted | |

Local LLM Frameworks

| Framework | Primary Role | Notes |
| --- | --- | --- |
| Ollama | Local LLM hosting & runtime | Lightweight CLI + API that serves models locally; integrates with multiple agents (Claude Code, Codex, Droid, OpenCode) and supports on-prem inferencing with moderate hardware. |
| vLLM (Serving) | High-performance LLM server | Optimized for scalable reasoning and long-context LLM inference; integrates with agents (e.g., Claude Code) via Anthropic-Messages API compatibility. |
| OpenRouter | Unified LLM API broker | Central API layer for 400+ LLMs, including local and cloud endpoints; can route agents to preferred backends with cost/redundancy optimization. |
| LiteLLM | Unified LLM API | Enables developers to use many LLM APIs, such as OpenAI, Anthropic, Gemini, and Ollama, in a single, OpenAI-compatible format. |
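
Because the frameworks above converge on OpenAI-compatible endpoints, switching between the three deployment patterns is mostly a matter of changing a base URL and a model name. A minimal sketch of that idea – Ollama’s default local endpoint is real, but the hosted URL and model ids are illustrative assumptions:

```python
# Each deployment pattern is just a (base_url, model) pair for an
# OpenAI-compatible client. Only the local Ollama endpoint is Ollama's
# documented default; the other entries are made-up placeholders.
DEPLOYMENTS = {
    "local":   {"base_url": "http://localhost:11434/v1",            "model": "qwen3-coder"},
    "hosted":  {"base_url": "https://llm.internal.example.com/v1",  "model": "qwen3-coder-480b"},
    "managed": {"base_url": "https://api.openai.com/v1",            "model": "gpt-4o"},
}

def client_config(deployment: str) -> dict:
    """Return the kwargs you would hand to an OpenAI-compatible client."""
    cfg = DEPLOYMENTS[deployment]
    return {"base_url": cfg["base_url"], "model": cfg["model"]}

# e.g. openai.OpenAI(base_url=client_config("local")["base_url"], api_key="ollama")
```

The point is that the agent code never changes; only the config does, which is what makes the hybrid strategies discussed below operationally cheap.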

Notable models

| Model | Description | Latest Release |
| --- | --- | --- |
| Qwen3-Coder | Alibaba’s 480B-parameter MoE coding model; SOTA results among open models on agentic coding tasks. | July 2025 |
| DeepSeek Coder | DeepSeek’s open-source code model series (1B–33B params), achieving top performance among open-source code models across major benchmarks. | June 2024 |
| Code Llama (7B/34B) | Meta’s open-source code-specialized LLMs, fine-tuned from Llama 2 in multiple sizes. | January 2024 |
| gpt-oss | OpenAI’s open-weight LLMs, available in 20B and 120B sizes under Apache 2.0; the 120B variant matches o4-mini on reasoning benchmarks. | August 2025 |
| kimi-k2.5 | Moonshot AI’s open-source, native multimodal agentic model. | January 2026 |

📈 Predictions Through 2026

1. Hybrid Routing Will Become the Standard

Cost is the most immediate driver. Autonomous coding agents are token-intensive by design. At enterprise scale, those token costs compound quickly.

Local inference eliminates per-token fees, which makes it attractive for high-volume, repetitive tasks. But frontier proprietary models still maintain an edge on complex, cross-repository reasoning and edge cases. The likely outcome is not full replacement, but intelligent routing:

  • Simpler or repetitive tasks → local or hosted open models
  • High-stakes, complex reasoning → managed frontier APIs

Tools like OpenRouter and LiteLLM are already enabling this pattern, and by the end of 2026, hybrid routing is likely to be the default deployment strategy for medium- to large-sized engineering organizations.
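
The routing pattern itself fits in a few lines. The model ids below follow LiteLLM’s provider/model naming, but the specific names and the complexity heuristic are illustrative assumptions, not a recommended policy:

```python
# Hybrid routing sketch: cheap/repetitive tasks go to a local model,
# high-stakes tasks go to a managed frontier API. Both ids are examples.
LOCAL_MODEL = "ollama/qwen3-coder"            # served locally, no per-token fee
FRONTIER_MODEL = "anthropic/claude-sonnet-4"  # hypothetical managed-API id

def route_model(task: str, files_touched: int) -> str:
    """Crude heuristic: escalate multi-file or refactoring work to the frontier model."""
    high_stakes = files_touched > 3 or "refactor" in task.lower()
    return FRONTIER_MODEL if high_stakes else LOCAL_MODEL

# With LiteLLM, the chosen id would go straight into the call:
#   litellm.completion(model=route_model(task, n_files), messages=[...])
```

A real router would use richer signals (token budget, repo size, failure history), but the shape – a policy function in front of an interchangeable model id – is the whole trick.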

2. Standardization Will Lower the Switching Cost

Hybrid only works if switching models is frictionless.

As coding agents like Claude Code, Codex, Copilot, and others converge around shared inference interfaces (Ollama, vLLM, OpenAI-compatible endpoints), swapping models in and out becomes operationally simple. This reduces lock-in and makes experimentation safer.
As interoperability improves, the barrier to trying local models drops dramatically – and adoption follows.

3. Open-Source Coding Models Will Close the Gap

Tool-use fine-tuning is maturing. Code reasoning benchmarks are becoming more rigorous.

By late 2026, open-weight coding models are likely to be “production-grade” for a substantial share of workflows – especially where cost control and data sovereignty matter more than absolute frontier performance.

4. Resilience Will Matter as Much as Cost

There’s also a structural pressure building: agent-driven workloads amplify the impact of API outages. When a coding agent is embedded into CI pipelines or developer workflows, downtime is no longer an inconvenience – it’s a blocker.

As usage scales, reliance on a single managed API becomes a risk vector. This will accelerate investment in redundancy:

  • Secondary API providers
  • Local fallback models
  • On-prem capacity for critical workflows

Summary

In 2026, hybrid won’t just be about cost optimization – it will be about operational resilience.

The future is not “local vs cloud.” It’s a composable, policy-driven model infrastructure.

Organizations that treat model routing, hosting strategy, and redundancy as part of their core engineering architecture – rather than as an afterthought – will have structural advantages in cost control, privacy, and reliability.

2026 won’t be the year enterprises abandon managed APIs. It will be the year they stop depending on them exclusively.

Tokens as Currency

Half-baked thought: Tokens will become currency.

Right now, the direction is obvious – more money buys more tokens.

But what if, in the near future, tokens themselves become a medium of exchange?

Consider this:

  • Microsoft is allocating tokens to support the maintenance of open-source projects.
  • Companies granting tokens in exchange for using their tools or infrastructure.
  • Open-source maintainers receiving donations in tokens instead of (or alongside) cash.
  • Platforms enabling distributed token usage across multiple accounts, almost like a modern SETI@home.
  • Gift cards for Anthropic.

In other words, tokens are not just consumption units but tradable, transferable assets within an ecosystem.

Of course, this is far from trivial. Privacy, security, incentive alignment, and implementation complexity are all major hurdles.

But if I had to place one slightly outrageous bet for 2026, it would be movement in this direction.

Are we ready to start thinking of tokens as something you can “round up and donate”?

AI, Paradigm Shifts, and the Future of Building Companies

Over the past few months, I have been constantly reading conversations about how Generative AI will reshape software engineering. On LinkedIn, Twitter, or in closed professional groups, engineers and product leaders debate how tools like Cursor, GitHub Copilot, or automated testing frameworks will impact the way software is built and teams are organized.

But the conversation goes beyond just engineering practices. If we zoom out, AI will not only transform the workflows of software teams but also the structure of companies and even the financial models on which they are built. This kind of change feels familiar – it echoes a deeper historical pattern in how science and technology evolve.

Kuhn’s Cycle of Scientific Revolutions

During my bachelor’s, I read Thomas Kuhn’s The Structure of Scientific Revolutions. Kuhn argued that science does not progress in a linear, step-by-step manner. Instead, it moves through cycles of stability and disruption. The Kuhn Cycle1, as reframed by later scholars, breaks this process into several stages:

  1. Pre-science – A field without consensus; multiple competing ideas.
  2. Normal Science – A dominant paradigm sets the rules of the game, guiding how problems are solved.
  3. Model Drift – Anomalies accumulate, and cracks in the model appear.
  4. Model Crisis – The old framework fails; confidence collapses.
  5. Model Revolution – New models emerge, challenging the old order.
  6. Paradigm Change – A new model wins acceptance and becomes the new normal.

The Kuhn Cycle Applied to Software Development

Normal Science

For decades, software engineering has operated under a shared set of practices and beliefs:

  • Clean Code & Best Practices – DRY, SOLID, Unit Testing, Peer Reviews.
  • Agile & Scrum – Iterative sprints and ceremonies as the “right” way to build products.
  • DevOps & CI/CD – Automation of builds, deployments, and testing.
  • Organizational Structure – Specialized roles (frontend, backend, QA, DevOps, PM) and a belief that more engineers equals more output.

The underlying assumption is hire more engineers + refine practices → better and quicker software.


Model Drift

Over time, cracks began to show.

  • The talent gap – demand for software far outstrips available developers.
  • Velocity mismatch – Agile rituals can’t keep pace with market demands.
  • Complexity overload – Microservices and massive codebases create systems that are too complex for a single person to comprehend fully.
  • Knowledge silos – onboarding takes months, and institutional knowledge remains fragile.

These anomalies signaled that “hire more engineers and improve processes” was no longer a sustainable model.


Model Crisis

The strain became obvious:

  • Even tech giants with thousands of engineers struggle with code sprawl and coordination overhead.
  • Brooks’ Law bites – adding more people to a project often makes it slower.
  • Business pressure grows – leaders demand faster iteration, lower costs, and higher adaptability than human-only teams can deliver.
  • Early AI tools, such as GitHub Copilot and ChatGPT, reveal something provocative – machines can generate boilerplate, tests, and documentation in seconds – tasks once thought to be unavoidably human.

This is where many organizations sit today – patching the old paradigm with AI, but without a coherent new model.


Model Revolution

A new way of working begins to take shape. Some patterns are already visible in the experiments happening all around us –

  • AI-first engineering – using AI agents for scaffolding code, generating tests, or refactoring large systems. Humans act as curators, reviewers, and high-level designers.
  • Smaller, AI-augmented teams
  • New roles and workflows – QA shifts toward system-level validation; PMs focus less on ticket grooming and more on problem framing and prompting.
  • Org structures evolve – less siloing by specialization, more “AI-augmented full-stack builders.”
  • Economics shift – productivity is no longer headcount-driven but iteration-driven. Cost models change when iteration is nearly free.

Paradigm Change

In the coming years, some of the ideas above – and probably additional ones – could stabilize as the “normal science” of software development and organizational building. We are not there yet. Once we are, today’s experiments will feel as obvious as Agile sprints or pull requests do now.


We are in the midst of model drift tipping into crisis, with glimpses of revolution already underway. Kuhn’s lesson is that revolutions are not just about better tools – they’re about shifts in worldview. For AI, the shift might be that companies will no longer be limited by headcount and manual processes but by their ability to ask the right questions, frame the correct problems, and adapt their models of value creation.

We are moving toward a future where the shape of companies, not just their software stacks, will look radically different, and that’s an exciting era to be a part of.

  1. https://www.thwink.org/sustain/glossary/KuhnCycle.htm ↩︎

LLM Debt: The Double-Edged Sword of AI Integration

Have you noticed how half the posts on LinkedIn these days feel like they were written by an LLM – too many words for too little substance? Or how product roadmaps suddenly include “AI features” that nobody asked for, just because it sounds good in a pitch deck? Or those meetings where someone suggests “let’s use GPT for this,” when a simple SQL query, an if-statement, or a much simpler ML model would do the job?

Laurence Tratt recently coined the term LLM inflation to describe how humans use LLMs to expand simple ideas into verbose prose, only for others to shrink them back down. That concept got me thinking about a related phenomenon: LLM debt.

LLM debt is the growing cost of misusing LLMs — by adding them where they don’t belong and neglecting them where they could help.

We’re all familiar with technical debt, product debt, and design debt1 — the shortcuts or missed opportunities that slow us down over time. Similarly, organizations are quietly accumulating LLM debt.

So, what does LLM debt look like in practice? It’s a double-edged liability:

  • Overuse: Integrating LLMs where they’re unnecessary adds latency, complexity, cost, and stochasticity to systems that could be simpler, faster, and more reliable without them. For example, sending every API request through a multimillion-parameter model when a simple regex or deterministic logic would suffice.
  • Underuse: Failing to adopt LLM-based tools where they could genuinely help results in wasted effort and missed opportunities. Think of teams manually triaging support tickets, writing repetitive documentation, or analyzing text data by hand when an LLM could automate much of the work.
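
The overuse case has a simple antidote: try the deterministic path first and treat the model as a last resort. A minimal sketch – the task (pulling an order id out of a support message), the regex, and the `call_llm` stub are all illustrative:

```python
import re
from typing import Callable, Optional

# Deterministic-first design: a regex handles the common case for free;
# the (stubbed) LLM call only runs when the cheap path fails.
ORDER_ID = re.compile(r"\border\s+#?(\d{6})\b", re.IGNORECASE)

def extract_order_id(text: str,
                     call_llm: Optional[Callable[[str], str]] = None) -> Optional[str]:
    match = ORDER_ID.search(text)
    if match:                 # cheap, deterministic, zero latency
        return match.group(1)
    if call_llm is not None:  # expensive and stochastic - last resort
        return call_llm(text)
    return None
```

Inverting this order – sending every message to the model – is exactly the latency, cost, and stochasticity the bullet above describes.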

Like product or technical debt, a small amount of LLM debt can be strategic: it allows experimentation, faster prototyping, or proof-of-concept development. However, left unmanaged, it compounds, creating systems that are over-engineered in some areas and under-leveraged in others, which slows product evolution and innovation. Like other types of debt, it should be owned and managed.

LLMs are powerful, but they come with costs. Just as we track and manage technical debt, we need to recognize, measure, and pay down our LLM debt. That means asking tough questions before adding LLMs to the stack, and also being bold enough to leverage them where they could provide real value.

If LLM inflation showed us how words can expand and collapse in unhelpful cycles, LLM debt shows us how our systems can quietly accumulate inefficiencies that slow us down. Recognizing it early is the key to keeping our products lean, intelligent, and future-ready.

  1. I previously wrote about some of those topics here and talked about them in Hebrew here ↩︎

AWS has entered the building

AWS has released several notable announcements within the LLM ecosystem over the last few days.

Introducing Amazon S3 Vectors (preview) – Amazon S3 Vectors is a durable, cost-efficient vector storage solution that natively supports large-scale AI-ready data with subsecond query performance, reducing storage and query costs by up to 90%.

Why I find it interesting –

  1. Balancing cost and performance – storing vectors in a database is more expensive but yields better query performance. If you know which vectors are “hot,” you can keep them in the database and store the rest in S3.
  2. Designated buckets – it started with table buckets and has now evolved to vector buckets. Interesting direction.
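
The hot/cold split in point 1 can be sketched as a toy tiering function – here plain sets stand in for the vector database and the S3 bucket, and the query-frequency threshold is an arbitrary assumption:

```python
# Toy tiering: vectors queried at least `hot_threshold` times stay in the
# fast (expensive) store; everything else is archived to cheap object storage.
def tier_vectors(query_counts: dict, hot_threshold: int = 100):
    """Split vector ids into a hot set (vector DB) and a cold set (S3)."""
    hot = {vid for vid, n in query_counts.items() if n >= hot_threshold}
    cold = set(query_counts) - hot
    return hot, cold
```

A production version would re-tier periodically as access patterns drift, but the economics are driven by exactly this partition.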

Launch of Kiro – the IDE market is on fire, with OpenAI’s acquisition falling apart, the competition between Claude Code and Cursor, and now Amazon revealing Kiro with the promise – “helps you do your best work by bringing structure to AI coding with spec-driven development”.

Why I find it interesting –

  1. At first, I wondered why AWS entered this field, but I assume it is a must-have these days, and might lead to higher adoption of their models or Amazon Q.
  2. The different IDEs and CLI tools are influenced by each other, so it will be interesting to see how a new player influences this space.

Strand Agents is now at v1.0.0 – Strand Agents is an AWS open-source SDK that enables building and running AI agents across multiple environments and models, with many easy-to-use pre-built tools.

Why I find it interesting –

  1. The Bedrock Agents interface was limiting for a production-grade agent, specifically in terms of deployment modes, model support, and observability. Strand Agents opens many more doors.
  2. There are many agent frameworks out there (probably two more were released while you read this post). Many of them experience different issues when working with AWS Bedrock. If you are using AWS as your primary cloud provider, it should be a leading candidate.

5 interesting things (11/05/2025)

Agents app design pattern – this is a back-to-basics adaptation of the twelve-factor idea to agents. How will we read this 14 years from now? Will the ideas mentioned there have become a standard?

https://github.com/humanlayer/12-factor-agents

The original “12 factor app” document is also worth reading (note that it was first published around 2011) –

https://12factor.net

When the Agents Go Marching In: Five Design Paradigms Reshaping Our Digital Future

This post complements the previous one, covering the same topics. If you are in a hurry, jump to the “The Reinvention of UX: Five Emerging Paradigms” section. I deal with all those aspects, e.g., building trust, transparency, cognitive load distribution, etc., on a daily basis.

https://medium.com/cyberark-engineering/when-the-agents-go-marching-in-five-design-paradigms-reshaping-our-digital-future-a219009db198

Using Alembic to create and seed a database
Seeding a database is essential for testing, development, and ensuring consistent application behavior across different environments. Alembic is a lightweight database migration tool for Python, designed to work seamlessly with SQLAlchemy.

We use Alembic to manage our database migrations, and I recently needed to seed our database for consistency across environments. I looked for several solutions and eventually used the solution in this post to create a migration that seeds the database –

https://medium.com/@fimd/using-alembic-to-create-and-seed-a-database-8f498638c406
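
In an actual Alembic migration the seed rows go inside `upgrade()` via `op.bulk_insert()`, which needs a live migration context. To keep the pattern runnable here, this sketch uses the stdlib’s sqlite3 instead; the table and rows are made up:

```python
import sqlite3

# The idempotent-seed pattern from the post, with sqlite3 standing in for
# Alembic. In a migration, the equivalent would be:
#   op.bulk_insert(roles_table, [{"name": "admin", ...}, ...])
SEED_ROLES = [("admin", "Full access"), ("viewer", "Read-only access")]

def seed(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS roles (name TEXT PRIMARY KEY, description TEXT)"
    )
    # INSERT OR IGNORE keeps the seed idempotent if the migration is rerun
    conn.executemany("INSERT OR IGNORE INTO roles VALUES (?, ?)", SEED_ROLES)
    conn.commit()
```

The idempotency matters more than the syntax: a seed migration that can be safely re-applied is what gives you the cross-environment consistency the post is after.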

A Field Guide to Rapidly Improving AI Products – while this post focuses on AI products, specifically LLM-based ones, multiple lessons can also be adapted to non-LLM-based AI products and general products. Conducting an error analysis, generating synthetic data (preferably with domain experts), and using a data viewer are good starting points.

https://hamel.dev/blog/posts/field-guide/

I Tried Running an MCP Server on AWS Lambda… Here’s What Happened – this post involves two topics I think a lot about these days – MCP and serverless computing. I think it is clear why I think a lot about MCPs. But why do I think about serverless computing? I think of it as a low-cost solution for early-stage startups. Early-stage startups usually have low traffic, which does not justify the cost of having servers up 24/7. On the other hand, the serverless development experience still needs some refinement, and there are services that one would like to host that do not support running in a serverless manner.

https://www.ranthebuilder.cloud/post/mcp-server-on-aws-lambda

5 interesting things (28/03/2025)

PgAI – LLMs have been part of everyday life for a while now. One aspect I think has not been explored well so far is using them as part of ETL. The implementations I have seen so far don’t take advantage of batch APIs and are not standardized enough to enable easy replacement of one model with another. Having said that, I believe those hurdles will be overcome soon.

https://github.com/timescale/pgai
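
The two gaps mentioned above – batching and easy model replacement – mostly come down to pipeline shape: the enrichment step should take batches and accept any model as a pluggable callable. A minimal sketch of that shape (the classification task and batch size are illustrative):

```python
from typing import Callable, Iterable, List, Tuple

# Model-agnostic ETL enrichment: `classify_batch` is any callable mapping a
# batch of texts to labels, so swapping providers (or using a batch API)
# never touches the pipeline itself.
def enrich(rows: Iterable[str], classify_batch: Callable,
           batch_size: int = 2) -> List[Tuple[str, str]]:
    out, rows = [], list(rows)
    for i in range(0, len(rows), batch_size):  # batching amortizes API cost
        batch = rows[i:i + batch_size]
        out.extend(zip(batch, classify_batch(batch)))
    return out
```

With this shape, pointing the step at a different model is a one-argument change, which is exactly the standardization the paragraph argues is missing today.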

Related links

Life Altering PostgreSql Patterns – a back-to-basics post. I agree with most of the points mentioned there, specifically around adding created_at, updated_at, and deleted_at attributes to all tables and saving state data as logs rather than saving only the latest state. I found the section about enum tables interesting. It was the first time I was exposed to this idea, and the ability to add a description or metadata is excellent.

https://mccue.dev/pages/3-11-25-life-altering-postgresql-patterns

Via this post, I learned about the on update cascade option; you can read more about it here – https://medium.com/geoblinktech/postgresql-foreign-keys-with-condition-on-update-cascade-330e1b25b6e5
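
A runnable sqlite3 sketch of the patterns mentioned above – timestamp columns on every table, soft deletes, and an “enum table” whose rows carry a description – with ON UPDATE CASCADE on the foreign key. Table and column names are illustrative, and sqlite stands in for PostgreSQL:

```python
import sqlite3

DDL = """
CREATE TABLE order_status (
    value TEXT PRIMARY KEY,
    description TEXT NOT NULL          -- the metadata a bare enum can't hold
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL REFERENCES order_status(value) ON UPDATE CASCADE,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
    deleted_at TEXT                    -- soft delete: NULL means "live"
);
INSERT INTO order_status VALUES
    ('pending', 'Created but not paid'),
    ('shipped', 'Handed to the carrier');
"""

def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # sqlite needs FK enforcement opted in
    conn.executescript(DDL)
    return conn
```

The enum table buys both referential integrity (invalid statuses are rejected) and a place for descriptions, which a CHECK constraint or a code-level enum cannot offer.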

AI interfaces of the future – I usually don’t share videos, but I think this talk is thought-provoking for several reasons –

  • Gen UI patterns – an emerging field; the talk reviews several products and highlights good and harmful patterns. Some of the patterns, like suggestions or auto-complete, are transparent to us but present in many products we know, and that’s important to notice when you build such a product.
  • Product review: Knowing what is out there is good for inspiration, ideas, and understanding the competitive landscape. However, new products are coming out every day, and it is hard to track all of them.

Simplify Your Tech Stack: Use PostgreSQL for Everything – Two widespread tensions, especially in startups, are build vs. buy conflicts and using specialized products or technologies (e.g., different databases) that are top of the breed but not many people can use and maintain vs. more common technology that more people can maintain but can have performance drawbacks or other limitations. Mainly working in startups, I usually prefer to use standard technology to run faster, knowing that the product, focus, and priorities often change. With that being said, I acknowledge that early adoption of new technologies can be life-changing for a startup, but figuring out what to bet on is hard.

https://medium.com/timescale/simplify-your-tech-stack-use-postgresql-for-everything-f77c96026595

CDK Monitoring Constructs – if you are using AWS CDK as your IaC tool, CDK Monitoring Constructs enables you to create CloudWatch alarms and dashboards almost out of the box. I wish they would release new features and options at a faster pace.

https://pypi.org/project/cdk-monitoring-constructs/

5 interesting things (31/05/2024)

How we built Text-to-SQL at Pinterest – Text-to-SQL (and vice versa) has become one of the canonical examples of LLM usage, and every product needs one. The post describes very interesting work that can be implemented relatively easily. I relate most to the closing paragraph, which emphasizes the gap between demos, tutorials, benchmarks, and real-world use cases – “It would be helpful for applied researchers to produce more realistic benchmarks which include a larger amount of denormalized tables and treat table search as a core part of the problem.”

https://medium.com/pinterest-engineering/how-we-built-text-to-sql-at-pinterest-30bad30dabff

(P.S. I mentioned this post in a recent LinkedIn post – LLMs in the enterprise – looking beyond the hype on what’s possible today)

How an empty S3 bucket can make your AWS bill explode – this story completely blew my mind (and gladly not my account). I was happy to see that AWS is looking into this issue, and I wondered whether, in bigger accounts, such anomalies could go unnoticed.

https://medium.com/@maciej.pocwierz/how-an-empty-s3-bucket-can-make-your-aws-bill-explode-934a383cb8b1

The Design Philosophy of Great Tables – great_tables is a Python package for creating wonderful-looking tables. This post shares its visual design philosophy and is worth reading if you create tables, even if you will not use this package.

https://posit-dev.github.io/great-tables/blog/design-philosophy/

1-measure-3-1 – a variation of the 1-3-1 problem-solving method for making proposals. I found it specifically effective for engineers as it is structured and focused.

https://www.annashipman.co.uk/jfdi/1-measure-3-1.html

On Making Mistakes — I love it when people combine experience or knowledge in one field or domain with another. For example, someone brings her experience as a soccer player to managing a team, or someone uses lessons he learned as a supermarket cashier to software architecture. This post discusses making mistakes and working through them and refers to several domains, including improv, chess, and F1 team management.

https://read.perspectiveship.com/p/on-making-mistakes

5 interesting things (08/03/2024)

(Almost) Every infrastructure decision I endorse or regret after 4 years running infrastructure at a startup – in my current role as a CTO of an early-stage startup, I make many choices about tools, programming languages, architecture, vendors, etc. This retrospective view was fascinating not only for the tools themselves but also for the arguments.

https://cep.dev/posts/every-infrastructure-decision-i-endorse-or-regret-after-4-years-running-infrastructure-at-a-startup/

Everything You Can Do with Python’s textwrap Module – I have used Python for more than 10 years and never heard of the textwrap module. Maybe you, too, haven’t heard of it.

https://towardsdatascience.com/everything-you-can-do-with-pythons-textwrap-module-0d82c377a4c8
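
The module is all stdlib, so a taste of it fits in three lines. These are real textwrap helpers with their actual outputs:

```python
import textwrap

wrapped = textwrap.fill("a b c d", width=3)   # wrap at word boundaries -> "a b\nc d"
cleaned = textwrap.dedent("    x\n    y\n")   # strip the common indent -> "x\ny\n"
quoted  = textwrap.indent("x\ny", "> ")       # prefix every line       -> "> x\n> y"
```

`fill` alone replaces a surprising amount of hand-rolled string-slicing code in CLI output and log formatting.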

It was never about LLM performance – I couldn’t agree more. The performance gaps between different LLMs are becoming negligible. Now, it is about the experience you build using those models and the guardrails you put in place to ensure that experience.

https://read.technically.dev/p/it-was-never-about-llm-performance

How to build an enterprise LLM application: Lessons from GitHub Copilot – the post ends with a summary of 3 key takeaways – 

  • Identify a focused problem and thoughtfully discern an AI’s use cases.
  • Integrate experimentation and tight feedback loops into the design process.
  • As you scale, continue to leverage user feedback and prioritize user needs.

Those takeaways are general and hold for almost every product launch I can think of. The post provides more concrete tips for LLM applications. It is interesting to read about a product of such scale that I use on a daily basis.

https://github.blog/2023-09-06-how-to-build-an-enterprise-llm-application-lessons-from-github-copilot/

Speaking for Hackers – public speaking is hard. From choosing a topic and submitting a CFP to preparing your talk and slides and wrapping it all up – every step can be tricky, and each of us finds different steps harder. This site provides excellent materials for all the parts before, during, and after the talk, making it easier to step out of our shells and share the knowledge.

https://sfhbook.netlify.app/