
As mentioned in this section, OpenAI provides several models that we can use to achieve specific tasks.

In this section, we'll dive deeper into:

  • The available models
  • API limitations
  • How to test the API and get an API key
  • Which model is best to use, and its pricing
  • How many tokens to use in a prompt to avoid errors


GPT-4: Generative Pre-trained Transformer

  • OpenAI's latest model.
  • With broad general knowledge and domain expertise, GPT-4 can follow complex instructions in natural language and solve difficult problems with greater accuracy.
  • GPT-4 is more creative and collaborative than ever before. It can generate, edit, and iterate with users on creative and technical writing tasks, such as composing songs, writing screenplays, or learning a user’s writing style.

  • Following the research path from GPT, GPT-2, and GPT-3, the deep learning approach leverages more data and more computation to create increasingly sophisticated and capable language models.
  • OpenAI spent six months making GPT-4 safer and more aligned.

  • GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5, according to OpenAI's internal evaluations.

  • Price list for GPT-4 (multiple models, each with different capabilities and price points; prices are per 1,000 tokens, and you can think of tokens as pieces of words, where 1,000 tokens is about 750 words):

    Model          Prompt              Completion
    8K context     $0.03 / 1K tokens   $0.06 / 1K tokens
    32K context    $0.06 / 1K tokens   $0.12 / 1K tokens
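To make the price list concrete, here is a small sketch (plain Python; the function and dict names are ours, the figures are the per-1K-token prices quoted above) that estimates the cost of a single request:

```python
# A small cost estimator based on the per-1K-token GPT-4 prices above.

PRICES = {  # USD per 1,000 tokens: (prompt, completion)
    "gpt-4":     (0.03, 0.06),   # 8K context
    "gpt-4-32k": (0.06, 0.12),   # 32K context
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of a single request."""
    prompt_price, completion_price = PRICES[model]
    return (prompt_tokens / 1000) * prompt_price + (completion_tokens / 1000) * completion_price

# A 1,000-token prompt (~750 words) with a 500-token completion on gpt-4:
print(round(estimate_cost("gpt-4", 1000, 500), 4))  # 0.06
```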
  • GPT-4 models

    gpt-4: More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with the latest model iteration. (8,192 tokens; training data up to Sep 2021)
    gpt-4-0314: Snapshot of gpt-4 from March 14th 2023. Unlike gpt-4, this model will not receive updates, and will only be supported for a three-month period ending on June 14th 2023. (8,192 tokens; training data up to Sep 2021)
    gpt-4-32k: Same capabilities as the base gpt-4 model but with 4x the context length. Will be updated with the latest model iteration. (32,768 tokens; training data up to Sep 2021)
    gpt-4-32k-0314: Snapshot of gpt-4-32k from March 14th 2023. Unlike gpt-4-32k, this model will not receive updates, and will only be supported for a three-month period ending on June 14th 2023. (32,768 tokens; training data up to Sep 2021)


    For many basic tasks, the difference between GPT-4 and GPT-3.5 models is not significant. However, in more complex reasoning situations, GPT-4 is much more capable than any of our previous models.

  • Limitation:

          GPT-4 is currently in a limited beta and only accessible to those who have been granted access. In order to use this API we need to join the waitlist to get access when capacity is available.

GPT-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in the GPT-3.5 family is gpt-3.5-turbo, which has been optimized for chat but works well for traditional completion tasks as well.

gpt-3.5-turbo: Most capable GPT-3.5 model, optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with the latest model iteration. (4,096 tokens; training data up to Sep 2021)
gpt-3.5-turbo-0301: Snapshot of gpt-3.5-turbo from March 1st 2023. Unlike gpt-3.5-turbo, this model will not receive updates, and will only be supported for a three-month period ending on June 1st 2023. (4,096 tokens; training data up to Sep 2021)
text-davinci-003: Can do any language task with better quality, longer output, and more consistent instruction-following than the curie, babbage, or ada models. Also supports inserting completions within text. (4,097 tokens; training data up to Jun 2021)
text-davinci-002: Similar capabilities to text-davinci-003 but trained with supervised fine-tuning instead of reinforcement learning. (4,097 tokens; training data up to Jun 2021)
code-davinci-002: Optimized for code-completion tasks. Now deprecated. (8,001 tokens; training data up to Jun 2021)


Models referred to as "GPT 3.5"

The GPT-3.5 series is a set of models trained on a blend of text and code from before Q4 2021. The following models are in the GPT-3.5 series:

    1. code-davinci-002 is a base model, so good for pure code-completion tasks
    2. text-davinci-002 is an InstructGPT model based on code-davinci-002
    3. text-davinci-003 is an improvement on text-davinci-002
    4. gpt-3.5-turbo-0301 is an improvement on text-davinci-003, optimized for chat


We recommend using gpt-3.5-turbo over the other GPT-3.5 models because of its lower cost.

Experimenting with gpt-3.5-turbo is a great way to find out what the API is capable of doing. After you have an idea of what you want to accomplish, you can stay with gpt-3.5-turbo or another model and try to optimize around its capabilities.

Note: OpenAI models are non-deterministic, meaning that identical inputs can yield different outputs. Setting temperature to 0 will make the outputs mostly deterministic, but a small amount of variability may remain.
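Since temperature is the main knob for this variability, below is a minimal sketch of the request settings for (mostly) deterministic output. The dict mirrors the keyword arguments of the 2023-era Python SDK's `openai.ChatCompletion.create`; we only build the dict here so the example runs without an API key:

```python
# Sketch: request settings for (mostly) deterministic output.
# The keys match what openai.ChatCompletion.create(...) expects in the
# 2023-era Python SDK; building the dict alone needs no API key.

def deterministic_request(user_prompt: str) -> dict:
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": user_prompt}],
        "temperature": 0,  # mostly deterministic; a little variability may remain
    }

req = deterministic_request("List three prime numbers.")
print(req["temperature"])  # 0
```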

  • GPT-3 models can understand and generate natural language.
  • These models were superseded by the more powerful GPT-3.5 generation models.
  • However, the original GPT-3 base models (davinci, curie, ada, and babbage) are currently the only models available to fine-tune.
  • Fine-tuning means further training one of the GPT-3 base models on our own data.
  • Prices: (pricing image not reproduced in this export)

 Comparative Table: (image not reproduced in this export)

As a language model based on the GPT-3 architecture, the GPT-3 API has some limitations, including:

  • Rate limits: check this link
  • Limited control over outputs: While the GPT-3 API can generate high-quality text, it may not always produce the output you want. It can be difficult to control the direction of the text or ensure that it stays on topic.
  • Expensive pricing: The GPT-3 API can be quite expensive to use, especially for large-scale projects. This can be a limiting factor for individuals or small businesses who want to use the API but don't have the resources to do so.
  • Limited access: Currently, the GPT-3 API is only available to selected partners and developers. This means that not everyone has access to the API, which can limit its usefulness in certain contexts.
  • Bias and fairness issues: As with any AI system, the GPT-3 API is not immune to bias and fairness issues. These issues can affect the accuracy and usefulness of the model, particularly in sensitive areas such as healthcare or criminal justice.
  • Lack of transparency: The GPT-3 API is a proprietary system, and its inner workings are not transparent to the public. This can make it difficult to assess the model's accuracy and potential biases.
  • We can use the GPT comparison tool that lets us run different models side-by-side to compare outputs, settings, and response times and then download the data into an Excel spreadsheet.
  • Under the Examples section, we can browse ready-made examples to pick what we want.
  • For example, we can use text-davinci-003 for SQL translation.


  • Another example we can benefit from: reducing the email content before sending it in the prompt, using text-davinci-003.

          Note that we still need to test how many tokens this example prompt itself consumes.

          (two alternative example screenshots not reproduced in this export)

  • For retrieving data from the email, we can try one of these examples, using text-davinci-003:

          (two alternative example screenshots not reproduced in this export)

  • Classification
  • Translation
  • Summarization
  • Copywriting
  • Parsing unstructured text
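Because the original prompt screenshots did not survive this export, here is a hypothetical sketch of what such prompt templates might look like for the use cases above (SQL translation, email summarization, extraction). The wording is illustrative, not the page's exact prompts:

```python
# Hypothetical prompt templates for the use cases discussed above.
# The {…} placeholders are filled in at request time; the exact wording
# is our own illustration, not taken from the original screenshots.

TEMPLATES = {
    "sql_translation": (
        "Translate the following natural language request into an SQL query.\n"
        "Request: {request}\nSQL:"
    ),
    "email_summary": (
        "Summarize the email below in 2-3 sentences, keeping names and dates.\n"
        'Email: """\n{email}\n"""'
    ),
    "email_extraction": (
        "Extract the sender, requested action, and deadline from the email below.\n"
        'Email: """\n{email}\n"""\n'
        "Sender:\nRequested action:\nDeadline:"
    ),
}

prompt = TEMPLATES["sql_translation"].format(request="list all users who signed up in 2023")
```

Each filled-in template becomes the `prompt` string sent to the completion endpoint.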

In order to test the API we need to get an API Key:

  • OpenAI provides a free tier for the GPT-3 API that allows developers to experiment with the API and build small-scale applications without incurring any costs.

  • The free tier provides access to a limited number of API requests per month, after which you will need to upgrade to a paid plan to continue using the API.
  • It's important to note that the availability and terms of these resources may change over time, so it's best to check with OpenAI directly for the most up-to-date information on their programs and offerings.
  • Supported languages: OpenAI provides several SDKs and libraries for using the API in different programming languages, including Python, Node.js, Java, Ruby, and C#.
  • The OpenAI API uses API keys for authentication. Visit your API keys page to retrieve the API key you'll use in your requests.

  • Avoid exposing API keys in your code or in public repositories; instead, store them in a secure location. Expose keys to your application through environment variables or a secret-management service, so that you don't need to hard-code them in your codebase. Read more in OpenAI's Best practices for API key safety.
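As a minimal sketch of the environment-variable approach, assuming the conventional `OPENAI_API_KEY` variable name:

```python
# Minimal sketch: read the key from the environment rather than the codebase.
# OPENAI_API_KEY is the conventional variable name.
import os

def load_api_key() -> str:
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or use a secret manager.")
    return key
```

Set the variable once in your shell (`export OPENAI_API_KEY=...`) or in your CI secret store; the key then never appears in source control.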

  • There are two ways to get an API key:
    • Steps: 

      Note that in the new documentation, under the Products section, we can no longer find GPT-3; we find GPT-4, so the section below refers to GPT-4 instead of GPT-3.

      • Go to the OpenAI website: Visit the OpenAI website at https://openai.com/.
      • Click on "Products": From the OpenAI homepage, click on the "Products" tab located in the top navigation menu.
      • Click on "GPT-3": From the Products page, click on the "GPT-3" option to learn more about the API.
      • Click on "Apply for Access": Once you have reviewed the information about the GPT-3 API, click on the Join API waitlist button located on the GPT-3 page.
      • Fill out the application form: Fill out the application form with your personal and project information. You will need to provide information about your intended use of the API, as well as your technical expertise and experience.
      • Agree to the terms: Read and agree to the terms of the GPT-3 API access agreement.
      • Submit your application: Once you have completed the application form and agreed to the terms, submit your application for review.
      • Wait for approval: The review process can take several weeks, and not all applications are approved. If your application is approved, you will receive an email with instructions on how to set up your account and obtain an API key.
      • Set up your account: Follow the instructions in the email to set up your OpenAI account. You will need to create a password and verify your email address.
      • Obtain your API key: Once your account is set up, log in to the OpenAI developer dashboard (https://beta.openai.com/login/) and navigate to the API keys section. Here you will find your API key, which you can copy and use in your application.
      The second way:

      1. Go to the OpenAI website at https://openai.com/
      2. Log into your OpenAI dashboard
      3. Click your profile icon at the top right
      4. Go to View API Keys and click Create new secret key to generate your API secret key.

      Below is a way suggested by ChatGPT to generate an API key for a specific model (e.g., text-davinci-003); however, this flow could not be found in the actual dashboard:

      1. Go to the OpenAI website at https://openai.com/
      2. Click on the "Get Started for Free" button in the top right corner of the page.
      3. Sign up for an OpenAI account by providing your email address and a password.
      4. Once you've signed up, log in to your OpenAI account.
      5. Navigate to the API Keys section of your account dashboard.
      6. Create a new API key by clicking the "New API Key" button.
      7. Select the text-davinci-003 model from the dropdown menu of available models.
      8. Give your API key a name and description, if desired.
      9. Click the "Create API Key" button to generate your new API key.
  • Create a request body
  • The OpenAI API usage is simple and follows the conventional API consumption pattern.

    1. Install the openai package using pip: pip install openai. If using Node instead, you can do so using npm: npm install openai.
    2. Grab your API keys: Log into your OpenAI dashboard and click your profile icon at the top right. Go to View API Keys and click Create new secret key to generate your API secret key.
    3. Make API calls to your chosen model endpoints via a server-side language like Python or JavaScript (Node). Feed these to your custom APIs and test your endpoints.
    4. Then fetch custom APIs via JavaScript frameworks like React, Vue, or Angular.
    5. Present data (user requests and model responses) in a visually appealing UI, and your app is ready for real-world use.
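Steps 1–3 above can be sketched as follows. To keep the sketch dependency-free, this version calls the REST chat-completions endpoint directly with the standard library; the `openai` package (`pip install openai`) wraps the same endpoint. The prompt text is illustrative:

```python
# Sketch of steps 1-3: build an authenticated request to the chat
# completions endpoint and read the model's reply. Only the standard
# library is used; the `openai` package wraps this same endpoint.
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(user_text: str, api_key: str) -> urllib.request.Request:
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # step 2: your secret key
        },
    )

def main():
    req = build_request("Say hello in French.", os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(req) as resp:  # step 3: make the API call
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])

if __name__ == "__main__":
    main()
```

The JSON response's `choices[0].message.content` holds the model's reply; a server-side wrapper around `main()` is what your front-end (React, Vue, Angular) would then fetch.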
  • What are tokens and how to count them?
  • You can think of tokens as pieces of words, where 1,000 tokens is about 750 words.
  • It depends on the model used, for example for text-davinci-003 requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most.
  • The limit is currently a technical limitation
  • Solution: there are often creative ways to solve problems within the limit, e.g. condensing the prompt, breaking the text into smaller pieces, etc.
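A rough sketch of token budgeting, using the ~4-characters-per-token rule of thumb implied by "1,000 tokens is about 750 words" (for exact counts, OpenAI's tiktoken library tokenizes precisely; the heuristic keeps this dependency-free):

```python
# Rough token budgeting, assuming the ~4 characters-per-token heuristic.
# For exact counts use OpenAI's tiktoken library.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def split_for_budget(text: str, max_prompt_tokens: int) -> list:
    """Break text into pieces that each fit the prompt budget."""
    max_chars = max_prompt_tokens * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

doc = "word " * 4000                     # ~20,000 characters
print(approx_tokens(doc))                # 5000 -- over a 4,097-token limit
print(len(split_for_budget(doc, 3500)))  # 2 chunks that each fit
```

This is the "breaking the text into smaller pieces" strategy mentioned above: each chunk is sent in its own request, leaving headroom for the completion.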
  • Techniques for improving reliability around prompts

Even with careful planning, it's important to be prepared for unexpected issues when using GPT-3 in your application. In some cases, the model may fail on a task, so it's helpful to consider what you can do to improve the reliability of your application.

If your task involves logical reasoning or complexity, you may need to take additional steps to build more reliable prompts. For some helpful suggestions, consult our Techniques to improve reliability guide. Overall the recommendations revolve around:

    • Decomposing unreliable operations into smaller, more reliable operations (e.g., selection-inference prompting)
    • Using multiple steps or multiple relationships to make the system's reliability greater than any individual component (e.g., maieutic prompting)
  • Data usage policies can be found here
  • API usage policies can be found here

Follow this link to get more info

How prompt engineering works

Due to the way the instruction-following models are trained or the data they are trained on, there are specific prompt formats that work particularly well and align better with the tasks at hand. Below we present a number of prompt formats we find work reliably well, but feel free to explore different formats, which may fit your task best.

Rules of Thumb and Examples:

Note: the "{text input here}" is a placeholder for actual text/context

1. Use the latest model

For best results, we generally recommend using the latest, most capable models. As of November 2022, the best options are the “text-davinci-003” model for text generation, and the “code-davinci-002” model for code generation.

2. Put instructions at the beginning of the prompt and use ### or """ to separate the instruction and context

Less effective ❌:

Summarize the text below as a bullet point list of the most important points.
{text input here}

Better ✅:

Summarize the text below as a bullet point list of the most important points.
Text: """
{text input here}
"""

3. Be specific, descriptive and as detailed as possible about the desired context, outcome, length, format, style, etc


Less effective ❌:

Write a poem about OpenAI.

Better ✅:

Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text to image ML model) in the style of a {famous poet}

4. Articulate the desired output format through examples (example 1, example 2).

Less effective ❌:

Extract the entities mentioned in the text below. Extract the following 4 entity types: company names, people names, specific topics and themes.
Text: {text}

Show, and tell - the models respond better when shown specific format requirements. This also makes it easier to programmatically parse out multiple outputs reliably.

Better ✅:

Extract the important entities mentioned in the text below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes
Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: {text}

5. Start with zero-shot, then few-shot (examples); if neither works, then fine-tune

✅ Zero-shot

Extract keywords from the below text.
Text: {text}

Keywords:

✅ Few-shot - provide a couple of examples

Extract keywords from the corresponding texts below.
Text 1: Stripe provides APIs that web developers can use to integrate payment processing into their websites and mobile applications.
Keywords 1: Stripe, payment processing, APIs, web developers, websites, mobile applications
##
Text 2: OpenAI has trained cutting-edge language models that are very good at understanding and generating text. Our API provides access to these models and can be used to solve virtually any task that involves processing language.
Keywords 2: OpenAI, language models, text processing, API.
##
Text 3: {text}
Keywords 3:

✅Fine-tune: see fine-tune best practices here.

6. Reduce “fluffy” and imprecise descriptions

Less effective ❌:

The description for this product should be fairly short, a few sentences only, and not too much more.

Better ✅:

Use a 3 to 5 sentence paragraph to describe this product.

7. Instead of just saying what not to do, say what to do instead

Less effective ❌:

The following is a conversation between an Agent and a Customer. DO NOT ASK USERNAME OR PASSWORD. DO NOT REPEAT.
Customer: I can’t log in to my account.
Agent:

Better ✅:

The following is a conversation between an Agent and a Customer. The agent will attempt to diagnose the problem and suggest a solution, whilst refraining from asking any questions related to PII. Instead of asking for PII, such as username or password, refer the user to the help article 
Customer: I can’t log in to my account.
Agent:

8. Code Generation Specific - Use “leading words” to nudge the model toward a particular pattern

Less effective ❌:

# Write a simple python function that

# 1. Ask me for a number in mile
# 2. It converts miles to kilometers

In this code example below, adding “import” hints to the model that it should start writing in Python. (Similarly “SELECT” is a good hint for the start of a SQL statement.)

Better ✅:

# Write a simple python function that

# 1. Ask me for a number in mile
# 2. It converts miles to kilometers
 
import

Parameters

Generally, we find that model and temperature are the most commonly used parameters to alter the model output.

    • model: Higher-performance models are more expensive and have higher latency.
    • temperature: A measure of how often the model outputs a less likely token. The higher the temperature, the more random (and usually more creative) the output. This, however, is not the same as “truthfulness”. For most factual use cases, such as data extraction and truthful Q&A, a temperature of 0 is best.
    • max_tokens (maximum length): Does not control the length of the output, but sets a hard cutoff limit for token generation. Ideally you won’t hit this limit often; the model will stop either when it thinks it’s finished or when it hits a stop sequence you defined.
    • stop (stop sequences): A set of characters (tokens) that, when generated, will cause the text generation to stop.
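As a sketch, here are the four parameters above as they would appear in a request body (the values are illustrative, not recommendations):

```python
# Illustrative values for the four commonly tuned parameters described above.
request_params = {
    "model": "gpt-3.5-turbo",  # higher-performance models cost more and are slower
    "temperature": 0.7,        # 0 = mostly deterministic; higher = more random
    "max_tokens": 256,         # hard cutoff on generated tokens, not a target length
    "stop": ["\n\n", "END"],   # generation halts if either sequence is produced
}
```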

For other parameter descriptions see the API reference.

