Claude subscription or API key? Cloud vs. self-hosting for AI

Why does this matter?

Every AI solution raises the same question: does this run in the cloud, or do we have to host it ourselves? And do we pay for it with a subscription or an API key? People often lump these together, while they are three different things. Keeping them apart helps you pick the right setup faster, and tells you straight away what it costs and who manages what.

The three things to keep separate

How you access the model: with a subscription (for people) or with an API key (for software).
Where the application around it runs: in the vendor's cloud, or on your own infrastructure or the client's.
Where the model itself runs: that is always at the vendor. So that question is effectively already answered.

Question 1: subscription or API key?

This is about how you get access and how you pay.

A subscription (such as a paid plan on claude.ai) is meant for people working interactively: chatting, writing, analysing, plus the light automations that person sets up themselves. You pay a fixed amount per user and fall under usage limits. Ideal for personal and team productivity.
An API key is meant for software: your own code or application calls the model. You pay per use and it scales. You need this as soon as something sits inside a product, runs unattended or serves many end users. Such a key is itself a secret you have to store and manage safely.

Rule of thumb: subscription = people, API key = software. You never build a product for customers on a personal subscription. That always needs an API key.

Question 2: cloud or your own infrastructure?

This isn't about the model, but about the application around it: the code, the storage, the connections to your systems, the user interface.

In the vendor's cloud: the vendor runs everything, you build little or no infrastructure. Think of the ready-made apps and the scheduled agents that run in the cloud.
On your own infrastructure or the client's: you run the code yourself, on your own server, your own cloud or the client's cloud. Needed for custom work, integration into existing systems, or strict control over the data. You then also manage the keys and access yourself.

The model always runs at the vendor

You never host the large language models (Claude, GPT) yourself, the model weights aren't available. "In our own cloud" therefore doesn't mean running the model itself, but accessing it through the client's cloud (AWS, Google Cloud, Azure) with the application in their infrastructure. That's the route when data may not leave the company's own cloud.

If you really want a model that runs entirely on your own servers, you need an open-weights model. Also mind the difference between where the model sits and where your data is processed: see EU hosting and zero retention.

Side by side

Two recognisable options, each a typical combination of the two questions above: the Claude app with Routines versus a custom-built application.

The Claude app on your computer or phone, with connectors and Routines (scheduled tasks that run automatically on your account) is the ready-made ecosystem on a subscription. Note: the app on your device is just a window, the model and the Routines run in the vendor's cloud, not on your device itself.

A custom application is bespoke work on an API key, hosted on your own servers or the client's, connected to existing systems. We built one ourselves to run our team: it reads emails, meetings and tickets and drives the work, fully unattended.

The difference per aspect:

How you access it: a subscription per user, versus an API key billed per use.
What you have to build and manage: nothing, the vendor handles the infrastructure and access, versus a real application that is built, hosted and maintained, including safely storing keys and passwords.
For whom: one person or a team working interactively, plus light automations via Routines, versus many users and processes running unattended and embedded in your business systems.
What you can do with it: chat with your own documents through connectors and scheduled tasks on your account, versus your own data model, your own screens, your own logic and connections to your existing software.
Customisation and integration: limited to what the app and connectors offer, versus fully tailored to your processes.
How fast you start: right away, no project needed, versus a development track you invest in.

In short: the Claude ecosystem is ideal to work faster as a person and automate light things without building anything. You choose a custom application when AI becomes a fixed part of how your business runs: unattended, for several people and woven into your own systems.

When do you choose what?

A few people who want to work interactively ("help me work faster") → a subscription, no infrastructure needed.
Light scheduled internal automation on someone's account → a cloud agent on a subscription. Runs by itself, nothing to host.
Unattended, embedded in the client's systems, for many users or as a product feature → an API key, plus hosting. That hosting can be at the vendor or on your own or client infrastructure.
"Our data may not leave our cloud" → access the model through the client's cloud, with the application in their own infrastructure.

Subscription = people. API key = software. Hosting = where the application around it lives. The model always sits at the vendor.

How Sevendays handles this: we pick the simplest setup that works for each situation. For personal and team productivity we stay in the subscription ecosystem; for an application that runs unattended or sits inside your systems, we build on an API key with the right hosting. That's how our AI platform works securely on your own data, and our examples show what that looks like in practice. Mainly want to know what our layer adds on top of your existing chat tool (Claude, ChatGPT, Gemini)? Then the two paths side by side spell it out. Not sure what fits you? Our approach to AI always starts from your concrete situation.

Related terms

EU hosting and zero retention: where your data sits and whether it's used for training, separate from where the model runs.
Open-weights model: the only way to truly run a model on your own infrastructure.
Which Claude do you use when?: chat, Cowork or Claude Code, and the difference between web and local.