You can think of tokens as pieces of words, where 1,000 tokens is about 750 words.
The limit depends on the model used. For example, text-davinci-003 requests can use up to 4,097 tokens, shared between the prompt and the completion. If your prompt is 4,000 tokens, your completion can be at most 97 tokens.
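The arithmetic above can be sketched with the rule of thumb from the first paragraph (1,000 tokens is about 750 words). This is only a rough estimate, not a real tokenizer; the helper names here are hypothetical, and the 4,097 figure is the text-davinci-003 limit mentioned above.

```python
# Rough token budgeting using the ~750 words per 1,000 tokens rule of thumb.
# This approximates a tokenizer; actual counts vary by text and model.

MODEL_TOKEN_LIMIT = 4097  # shared between prompt and completion (text-davinci-003)

def estimate_tokens(text: str) -> int:
    """Approximate token count: 1,000 tokens is about 750 words."""
    words = len(text.split())
    return round(words * 1000 / 750)

def max_completion_tokens(prompt: str) -> int:
    """Tokens left over for the completion after the prompt is counted."""
    return MODEL_TOKEN_LIMIT - estimate_tokens(prompt)

prompt = "Summarize the following article in three bullet points."
print(estimate_tokens(prompt))        # rough prompt size in tokens
print(max_completion_tokens(prompt))  # rough room left for the completion
```

For precise counts you would use the model's actual tokenizer rather than a word-count heuristic.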
The limit is currently a technical limitation.
Solution: there are often creative ways to work within the limit, such as condensing your prompt or breaking the text into smaller pieces.
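The "breaking the text into smaller pieces" workaround can be sketched as a simple word-based chunker. This is an illustrative helper, not part of any API; the word budget of 3,000 is an assumed value chosen to leave headroom under the 4,097-token limit for the completion.

```python
def split_into_chunks(text: str, max_words: int = 3000) -> list[str]:
    """Split text on word boundaries so each chunk stays within a word budget.

    ~3,000 words is roughly 4,000 tokens by the 750-words-per-1,000-tokens
    rule, leaving some room in a 4,097-token request for the completion.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Each chunk can then be sent as its own request, and the partial
# results (e.g. per-chunk summaries) combined afterwards.
chunks = split_into_chunks("some very long document " * 2000)
print(len(chunks))
```

Splitting on word boundaries keeps each piece readable; a more careful version would split on paragraph or sentence boundaries so no chunk ends mid-thought.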