
OpenAI
Swift community-driven package for the OpenAI public API
Stars: 2705

OpenAI is a community-maintained Swift implementation of the OpenAI public API. OpenAI itself is an artificial intelligence research organization founded in San Francisco, California in 2015, whose stated mission is to ensure that artificial general intelligence benefits all of humanity. The repository provides functionality for text completions, chats, image generation, audio processing, edits, embeddings, models, moderations, utilities, and Combine extensions.
README:
This repository contains a community-maintained Swift implementation of the OpenAI public API.
- Installation
- Usage
- Text and prompting
- Function calling
- Tools
- Images
- Audio
- Structured Outputs
- Specialized models
- Assistants (Beta)
- Other APIs
- Support for other providers: Gemini, DeepSeek, Perplexity, OpenRouter, etc.
- Example Project
- Contribution Guidelines
- Links
- License
This library implements its types and methods in close accordance with the REST API documentation, which can be found on platform.openai.com.
To integrate OpenAI into your Xcode project using Swift Package Manager:
- In Xcode, go to File > Add Package Dependencies...
- Enter the repository URL:
https://github.com/MacPaw/OpenAI.git
- Choose your desired dependency rule (e.g., "Up to Next Major Version").
Alternatively, you can add it directly to your Package.swift
file:
dependencies: [
.package(url: "https://github.com/MacPaw/OpenAI.git", branch: "main")
]
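When adding the package via Package.swift, also list the product in the dependencies of the target that uses it, for example (the target name here is a placeholder):
.target(
    name: "MyApp", // your target name
    dependencies: [.product(name: "OpenAI", package: "OpenAI")]
)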
To initialize an API instance, you need to obtain an API token from your OpenAI organization.
Remember that your API key is a secret! Do not share it with others or expose it in any client-side code (browsers, apps). Production requests must be routed through your own backend server where your API key can be securely loaded from an environment variable or key management service.
Once you have a token, you can initialize the OpenAI class, which is an entry point to the API.
⚠️ OpenAI strongly recommends developers of client-side applications proxy requests through a separate backend service to keep their API key safe. API keys can access and manipulate customer billing, usage, and organizational data, so it's a significant risk to expose them.
let openAI = OpenAI(apiToken: "YOUR_TOKEN_HERE")
Optionally, you can initialize OpenAI with a token, an organization identifier, and a timeoutInterval.
let configuration = OpenAI.Configuration(token: "YOUR_TOKEN_HERE", organizationIdentifier: "YOUR_ORGANIZATION_ID_HERE", timeoutInterval: 60.0)
let openAI = OpenAI(configuration: configuration)
See OpenAI.Configuration for more values that can be passed on init for customization, like host, basePath, port, scheme, and customHeaders.
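For example, a minimal sketch pointing the client at a self-hosted, OpenAI-compatible gateway; the host and port values here are hypothetical, and parameter labels follow the list above (check OpenAI.Configuration for the exact initializer):
// Sketch: route requests to a custom OpenAI-compatible endpoint.
let customConfiguration = OpenAI.Configuration(
    token: "YOUR_TOKEN_HERE",
    host: "llm-gateway.example.com", // hypothetical host
    port: 443,
    scheme: "https"
)
let customClient = OpenAI(configuration: customConfiguration)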
Once you possess the token and the instance is initialized, you are ready to make requests.
This SDK is primarily focused on the OpenAI Platform, but it also works with other providers that support an OpenAI-compatible API. Use the .relaxed parsing option on Configuration, or see the "Support for other providers" section below for details.
Cancelling requests
For Swift Concurrency calls, you can simply cancel the calling task, and the corresponding underlying URLSessionDataTask will be cancelled automatically.
let task = Task {
do {
let chatResult = try await openAIClient.chats(query: .init(messages: [], model: "asd"))
} catch {
// Handle cancellation or error
}
}
task.cancel()
Cancelling closure-based API calls
When you call any of the closure-based API methods, it returns discardable CancellableRequest
. Hold a reference to it to be able to cancel the request later.
let cancellableRequest = object.chats(query: query, completion: { _ in })
cancellableRequest.cancelRequest()
Cancelling Combine subscriptions
In Combine, use the default cancellation mechanism: just discard the reference to a subscription, or call `cancel()` on it.
let subscription = openAIClient
.images(query: query)
.sink(receiveCompletion: { completion in }, receiveValue: { imagesResult in })
subscription.cancel()
Use the responses variable on OpenAIProtocol to call Responses API methods.
public protocol OpenAIProtocol {
// ...
var responses: ResponsesEndpointProtocol { get }
// ...
}
Specify parameters by passing a CreateModelResponseQuery to a method. Get a ResponseObject or a stream of ResponseStreamEvent events in response.
Example: Generate text from a simple prompt
let client: OpenAIProtocol = /* client initialization code */
let query = CreateModelResponseQuery(
input: .textInput("Write a one-sentence bedtime story about a unicorn."),
model: .gpt4_1
)
let response: ResponseObject = try await client.responses.createResponse(query: query)
// ...
print(response)
ResponseObject(
createdAt: 1752146109,
error: nil,
id: "resp_686fa0bd8f588198affbbf5a8089e2d208a5f6e2111e31f5",
incompleteDetails: nil,
instructions: nil,
maxOutputTokens: nil,
metadata: [:],
model: "gpt-4.1-2025-04-14",
object: "response",
output: [
OpenAI.OutputItem.outputMessage(
OpenAI.Components.Schemas.OutputMessage(
id: "msg_686fa0bee24881988a4d1588d7f65c0408a5f6e2111e31f5",
_type: OpenAI.Components.Schemas.OutputMessage._TypePayload.message,
role: OpenAI.Components.Schemas.OutputMessage.RolePayload.assistant,
content: [
OpenAI.Components.Schemas.OutputContent.OutputTextContent(
OpenAI.Components.Schemas.OutputTextContent(
_type: OpenAI.Components.Schemas.OutputTextContent._TypePayload.outputText,
text: "Under a sky full of twinkling stars, a gentle unicorn named Luna danced through fields of stardust, spreading sweet dreams to every sleeping child.",
annotations: [],
logprobs: Optional([])
)
)
],
status: OpenAI.Components.Schemas.OutputMessage.StatusPayload.completed
)
)
],
parallelToolCalls: true,
previousResponseId: nil,
reasoning: Optional(
OpenAI.Components.Schemas.Reasoning(
effort: nil,
summary: nil,
generateSummary: nil
)
),
status: "completed",
temperature: Optional(1.0),
text: OpenAI.Components.Schemas.ResponseProperties.TextPayload(
format: Optional(
OpenAI.Components.Schemas.TextResponseFormatConfiguration.ResponseFormatText(
OpenAI.Components.Schemas.ResponseFormatText(
_type: OpenAI.Components.Schemas.ResponseFormatText._TypePayload.text
)
)
),
toolChoice: OpenAI.Components.Schemas.ResponseProperties.ToolChoicePayload.ToolChoiceOptions(
OpenAI.Components.Schemas.ToolChoiceOptions.auto
),
tools: [],
topP: Optional(1.0),
truncation: Optional("disabled"),
usage: Optional(
OpenAI.Components.Schemas.ResponseUsage(
inputTokens: 18,
inputTokensDetails: OpenAI.Components.Schemas.ResponseUsage.InputTokensDetailsPayload(
cachedTokens: 0
),
outputTokens: 32,
outputTokensDetails: OpenAI.Components.Schemas.ResponseUsage.OutputTokensDetailsPayload(
reasoningTokens: 0
),
totalTokens: 50
)
),
user: nil
)
)
An array of content generated by the model is in the output property of the response.
[!NOTE] The output array often has more than one item in it! It can contain tool calls, data about reasoning tokens generated by reasoning models, and other items. It is not safe to assume that the model's text output is present at output[0].content[0].text.
Because of the note above, to safely and fully read the response, we need to switch over both the output items and their contents, like this:
// ...
for output in response.output {
switch output {
case .outputMessage(let outputMessage):
for content in outputMessage.content {
switch content {
case .OutputTextContent(let textContent):
print(textContent.text)
case .RefusalContent(let refusalContent):
print(refusalContent.refusal)
}
}
default:
    // Unhandled output items. Handle or throw an error.
    break
}
}
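You could wrap that traversal in a small convenience; a sketch (allOutputText is a hypothetical helper, not part of the SDK):
// Hypothetical helper: collect the text of every output message into one string.
extension ResponseObject {
    var allOutputText: String {
        var pieces: [String] = []
        for item in output {
            guard case .outputMessage(let message) = item else { continue }
            for content in message.content {
                if case .OutputTextContent(let textContent) = content {
                    pieces.append(textContent.text)
                }
            }
        }
        return pieces.joined(separator: "\n")
    }
}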
Use ChatQuery with the chats(query:) and chatsStream(query:) methods on OpenAIProtocol to generate text using the Chat Completions API. Get a ChatResult or ChatStreamResult in response.
Example: Generate text from a simple prompt
let query = ChatQuery(
messages: [
.user(.init(content: .string("Who are you?")))
],
model: .gpt4_o
)
let result = try await openAI.chats(query: query)
print(result.choices.first?.message.content ?? "")
// printed to console:
// I'm an AI language model created by OpenAI, designed to assist with a wide range of questions and tasks. How can I help you today?
(lldb) po result
▿ ChatResult
- id : "chatcmpl-BgWJTzbVczdJDusTqVpnR6AQ2w6Fd"
- created : 1749473687
- model : "gpt-4o-2024-08-06"
- object : "chat.completion"
▿ serviceTier : Optional<ServiceTier>
- some : OpenAI.ServiceTier.defaultTier
▿ systemFingerprint : Optional<String>
- some : "fp_07871e2ad8"
▿ choices : 1 element
▿ 0 : Choice
- index : 0
- logprobs : nil
▿ message : Message
▿ content : Optional<String>
- some : "I am an AI language model created by OpenAI, known as ChatGPT. I\'m here to assist with answering questions, providing explanations, and engaging in conversation on a wide range of topics. If you have any questions or need assistance, feel free to ask!"
- refusal : nil
- role : "assistant"
▿ annotations : Optional<Array<Annotation>>
- some : 0 elements
- audio : nil
- toolCalls : nil
- _reasoning : nil
- _reasoningContent : nil
- finishReason : "stop"
▿ usage : Optional<CompletionUsage>
▿ some : CompletionUsage
- completionTokens : 52
- promptTokens : 11
- totalTokens : 63
▿ promptTokensDetails : Optional<PromptTokensDetails>
▿ some : PromptTokensDetails
- audioTokens : 0
- cachedTokens : 0
- citations : nil
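The chatsStream(query:) method mentioned above delivers partial results as they are generated; a minimal sketch, assuming ChatStreamResult exposes newly generated text via choices[].delta.content as an optional String:
let streamQuery = ChatQuery(
    messages: [.user(.init(content: .string("Tell me a short story.")))],
    model: .gpt4_o
)
for try await partial in openAI.chatsStream(query: streamQuery) {
    // Each chunk carries only the newly generated delta, if any.
    if let delta = partial.choices.first?.delta.content {
        print(delta, terminator: "")
    }
}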
See OpenAI Platform Guide: Function calling for more details.
Chat Completions API Examples
let openAI = OpenAI(apiToken: "...")
// Declare functions which model might decide to call.
let functions = [
ChatQuery.ChatCompletionToolParam.FunctionDefinition(
name: "get_weather",
description: "Get current temperature for a given location.",
parameters: .init(fields: [
.type(.object),
.properties([
"location": .init(fields: [
.type(.string),
.description("City and country e.g. Bogotá, Colombia")
])
]),
.required(["location"]),
.additionalProperties(.boolean(false))
])
)
]
let query = ChatQuery(
messages: [
.user(.init(content: .string("What is the weather like in Paris today?"
],
model: .gpt4_1,
tools: functions.map { .init(function: $0) }
)
let result = try await openAI.chats(query: query)
print(result.choices[0].message.toolCalls)
Result will be (serialized as JSON here for readability):
{
"id": "chatcmpl-1234",
"object": "chat.completion",
"created": 1686000000,
"model": "gpt-3.5-turbo-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"tool_calls": [
{
"id": "call-0",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{\n \"location\": \"Boston, MA\"\n}"
}
}
]
},
"finish_reason": "function_call"
}
],
"usage": { "total_tokens": 100, "completion_tokens": 18, "prompt_tokens": 82 }
}
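The arguments field is a JSON-encoded string, so the usual next step is to decode it and invoke your own implementation of the function; a sketch with a hypothetical WeatherArgs type matching the get_weather parameters, assuming toolCalls exposes function.arguments as a JSON string (as in the serialized result above):
// Sketch: decode the tool call's JSON-string arguments into a typed value.
struct WeatherArgs: Decodable { let location: String }

if let toolCall = result.choices[0].message.toolCalls?.first {
    let args = try JSONDecoder().decode(
        WeatherArgs.self,
        from: Data(toolCall.function.arguments.utf8)
    )
    print("Model requested weather for:", args.location)
}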
Given a prompt and/or an input image, the model will generate a new image.
As artificial intelligence continues to develop, so too does the intriguing concept of DALL·E. Developed by OpenAI, DALL·E is an AI system that can generate images based on descriptions provided by humans. With potential applications spanning animation, illustration, design, and engineering, it's easy to see why there is such excitement over this technology.
Request
struct ImagesQuery: Codable {
/// A text description of the desired image(s). The maximum length is 1000 characters.
public let prompt: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?
}
Response
struct ImagesResult: Codable, Equatable {
public struct URLResult: Codable, Equatable {
public let url: String
}
public let created: TimeInterval
public let data: [URLResult]
}
Example
let query = ImagesQuery(prompt: "White cat with heterochromia sitting on the kitchen table", n: 1, size: "1024x1024")
openAI.images(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.images(query: query)
(lldb) po result
▿ ImagesResult
- created : 1671453505.0
▿ data : 1 element
▿ 0 : URLResult
- url : "https://oaidalleapiprodscus.blob.core.windows.net/private/org-CWjU5cDIzgCcVjq10pp5yX5Q/user-GoBXgChvLBqLHdBiMJBUbPqF/img-WZVUK2dOD4HKbKwW1NeMJHBd.png?st=2022-12-19T11%3A38%3A25Z&se=2022-12-19T13%3A38%3A25Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2022-12-19T09%3A35%3A16Z&ske=2022-12-20T09%3A35%3A16Z&sks=b&skv=2021-08-06&sig=mh52rmtbQ8CXArv5bMaU6lhgZHFBZz/ePr4y%2BJwLKOc%3D"
Generated image
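The result contains image URLs rather than bytes; a minimal sketch of downloading the first image using standard Foundation APIs:
// Sketch: download the generated image from the returned URL.
if let urlString = result.data.first?.url, let url = URL(string: urlString) {
    let (imageData, _) = try await URLSession.shared.data(from: url)
    print("Downloaded \(imageData.count) bytes") // e.g. feed into UIImage(data:)
}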
Creates an edited or extended image given an original image and a prompt.
Request
public struct ImageEditsQuery: Codable {
/// The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.
public let image: Data
public let fileName: String
/// An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as image.
public let mask: Data?
public let maskFileName: String?
/// A text description of the desired image(s). The maximum length is 1000 characters.
public let prompt: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?
}
Response
Uses the ImagesResult response similarly to ImagesQuery.
Example
let data = image.pngData()!
let query = ImageEditsQuery(image: data, fileName: "whitecat.png", prompt: "White cat with heterochromia sitting on the kitchen table with a bowl of food", n: 1, size: "1024x1024")
openAI.imageEdits(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.imageEdits(query: query)
Creates a variation of a given image.
Request
public struct ImageVariationsQuery: Codable {
/// The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.
public let image: Data
public let fileName: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?
}
Response
Uses the ImagesResult response similarly to ImagesQuery.
Example
let data = image.pngData()!
let query = ImageVariationsQuery(image: data, fileName: "whitecat.png", n: 1, size: "1024x1024")
openAI.imageVariations(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.imageVariations(query: query)
Review Images Documentation for more info.
The speech-to-text API provides two endpoints, transcriptions and translations, based on OpenAI's state-of-the-art open-source large-v2 Whisper model. They can be used to:
- Transcribe audio into whatever language the audio is in.
- Translate and transcribe the audio into English.
File uploads are currently limited to 25 MB, and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
This function sends an AudioSpeechQuery to the OpenAI API to create audio speech from text using a specific voice and format.
Learn more about voices.
Learn more about models.
Request:
public struct AudioSpeechQuery: Codable, Equatable {
//...
public let model: Model // tts-1 or tts-1-hd
public let input: String
public let voice: AudioSpeechVoice
public let responseFormat: AudioSpeechResponseFormat
public let speed: String? // Initializes with Double?
//...
}
Response:
/// Audio data for one of the following formats :`mp3`, `opus`, `aac`, `flac`, `pcm`
public let audioData: Data?
Example:
let query = AudioSpeechQuery(model: .tts_1, input: "Hello, world!", voice: .alloy, responseFormat: .mp3, speed: 1.0)
openAI.audioCreateSpeech(query: query) { result in
// Handle response here
}
//or
let result = try await openAI.audioCreateSpeech(query: query)
OpenAI Create Speech – Documentation
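To persist the generated speech, write the returned bytes to disk; a sketch (note the response declaration above names the field audioData while the streaming example below uses audio, so check your SDK version for the exact property):
// Sketch: save generated speech to a temporary .mp3 file.
let speechResult = try await openAI.audioCreateSpeech(query: query)
let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent("speech.mp3")
try speechResult.audio.write(to: fileURL) // or speechResult.audioData, depending on version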
Streaming speech is available via the audioCreateSpeechStream function. Audio chunks will be sent one by one.
Closures
openAI.audioCreateSpeechStream(query: query) { partialResult in
switch partialResult {
case .success(let result):
print(result.audio)
case .failure(let error):
//Handle chunk error here
}
} completion: { error in
//Handle streaming error here
}
Combine
openAI
.audioCreateSpeechStream(query: query)
.sink { completion in
//Handle completion result here
} receiveValue: { result in
//Handle chunk here
}.store(in: &cancellables)
Structured concurrency
for try await result in openAI.audioCreateSpeechStream(query: query) {
//Handle result here
}
Transcribes audio into the input language.
Request
public struct AudioTranscriptionQuery: Codable, Equatable {
public let file: Data
public let fileName: String
public let model: Model
public let prompt: String?
public let temperature: Double?
public let language: String?
}
Response
public struct AudioTranscriptionResult: Codable, Equatable {
public let text: String
}
Example
let data = try Data(contentsOf: ...)
let query = AudioTranscriptionQuery(file: data, fileName: "audio.m4a", model: .whisper_1)
openAI.audioTranscriptions(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.audioTranscriptions(query: query)
Translates audio into English.
Request
public struct AudioTranslationQuery: Codable, Equatable {
public let file: Data
public let fileName: String
public let model: Model
public let prompt: String?
public let temperature: Double?
}
Response
public struct AudioTranslationResult: Codable, Equatable {
public let text: String
}
Example
let data = try Data(contentsOf: ...)
let query = AudioTranslationQuery(file: data, fileName: "audio.m4a", model: .whisper_1)
openAI.audioTranslations(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.audioTranslations(query: query)
Review Audio Documentation for more info.
[!NOTE] This section focuses on non-function-calling use cases in the Responses and Chat Completions APIs. To learn more about how to use Structured Outputs with function calling, check out the Function Calling section.
To configure Structured Outputs, define a JSON Schema and pass it to a query.
This SDK supports multiple ways to define a schema; choose the one you prefer.
JSONSchemaDefinition.jsonSchema
This definition accepts a JSONSchema, which is either a boolean or an object JSON document.
Instead of providing a schema yourself, you can build one in a type-safe manner using initializers that accept [JSONSchemaField], as shown in the example below.
While this method of defining a schema is direct, it can be verbose. For alternative ways to define a schema, see the options below.
let query = CreateModelResponseQuery(
input: .textInput("Return structured output"),
model: .gpt4_o,
text: .jsonSchema(.init(
name: "research_paper_extraction",
schema: .jsonSchema(.init(
.type(.object),
.properties([
"title": Schema.buildBlock(
.type(.string)
),
"authors": .init(
.type(.array),
.items(.init(
.type(.string)
))
),
"abstract": .init(
.type(.string)
),
"keywords": .init(
.type(.array),
.items(.init(
.type(.string))
)
)
]),
.required(["title, authors, abstract, keywords"]),
.additionalProperties(.boolean(false))
)),
description: "desc",
strict: false
))
)
let response = try await openAIClient.responses.createResponse(query: query)
for output in response.output {
switch output {
case .outputMessage(let message):
for content in message.content {
switch content {
case .OutputTextContent(let textContent):
print("json output structured by the schema: ", textContent.text)
case .RefusalContent(let refusal):
// Handle refusal
break
}
}
default:
// Handle other OutputItems
break
}
}
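When strict is enabled, the returned text is guaranteed to match the schema, so it can be decoded directly inside the .OutputTextContent case above; a sketch with a hypothetical ResearchPaper type mirroring the schema:
// Hypothetical type mirroring the research_paper_extraction schema above.
struct ResearchPaper: Decodable {
    let title: String
    let authors: [String]
    let abstract: String
    let keywords: [String]
}

// Inside the .OutputTextContent case above:
let paper = try JSONDecoder().decode(ResearchPaper.self, from: Data(textContent.text.utf8))
print(paper.title, paper.authors)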
JSONSchemaDefinition.derivedJsonSchema
Define schemas in a Pydantic or Zod fashion:
- Use the derivedJsonSchema(_ type:) response format when creating a ChatQuery or CreateModelResponseQuery
- Provide a type that conforms to JSONSchemaConvertible and generates an instance as an example
- Make sure all enum types within the provided type conform to JSONSchemaEnumConvertible and generate an array of names for all cases
struct MovieInfo: JSONSchemaConvertible {
let title: String
let director: String
let release: Date
let genres: [MovieGenre]
let cast: [String]
static let example: Self = {
.init(
title: "Earth",
director: "Alexander Dovzhenko",
release: Calendar.current.date(from: DateComponents(year: 1930, month: 4, day: 1))!,
genres: [.drama],
cast: ["Stepan Shkurat", "Semyon Svashenko", "Yuliya Solntseva"]
)
}()
}
enum MovieGenre: String, Codable, JSONSchemaEnumConvertible {
case action, drama, comedy, scifi
var caseNames: [String] { Self.allCases.map { $0.rawValue } }
}
let query = ChatQuery(
messages: [
.system(
.init(content: .textContent("Best Picture winner at the 2011 Oscars"))
)
],
model: .gpt4_o,
responseFormat: .jsonSchema(
.init(
name: "movie-info",
description: nil,
schema: .derivedJsonSchema(MovieInfo.self),
strict: true
)
)
)
let result = try await openAI.chats(query: query)
JSONSchemaDefinition.dynamicJsonSchema
Define your JSON schema using plain dictionaries, or build it with a library like https://github.com/kevinhermawan/swift-json-schema.
struct AnyEncodable: Encodable {
private let _encode: (Encoder) throws -> Void
public init<T: Encodable>(_ wrapped: T) {
_encode = wrapped.encode
}
func encode(to encoder: Encoder) throws {
try _encode(encoder)
}
}
let schema = [
"type": AnyEncodable("object"),
"properties": AnyEncodable([
"title": AnyEncodable([
"type": "string"
]),
"director": AnyEncodable([
"type": "string"
]),
"release": AnyEncodable([
"type": "string"
]),
"genres": AnyEncodable([
"type": AnyEncodable("array"),
"items": AnyEncodable([
"type": AnyEncodable("string"),
"enum": AnyEncodable(["action", "drama", "comedy", "scifi"])
])
]),
"cast": AnyEncodable([
"type": AnyEncodable("array"),
"items": AnyEncodable([
"type": "string"
])
])
]),
"additionalProperties": AnyEncodable(false)
]
let query = ChatQuery(
messages: [.system(.init(content: .textContent("Return a structured response.")))],
model: .gpt4_o,
responseFormat: .jsonSchema(.init(name: "movie-info", schema: .dynamicJsonSchema(schema)))
)
let result = try await openAI.chats(query: query)
Review Structured Output Documentation for more info.
The Model Context Protocol (MCP) enables AI models to securely connect to external data sources and tools through standardized server connections. This OpenAI Swift library supports MCP integration, allowing you to extend model capabilities with remote tools and services.
You can use the MCP Swift library to connect to MCP servers and discover available tools, then integrate those tools with OpenAI's chat completions.
Request
// Create an MCP tool for connecting to a remote server
let mcpTool = Tool.mcpTool(
.init(
_type: .mcp,
serverLabel: "GitHub_MCP_Server",
serverUrl: "https://api.githubcopilot.com/mcp/",
headers: .init(additionalProperties: [
"Authorization": "Bearer YOUR_TOKEN_HERE"
]),
allowedTools: .case1(["search_repositories", "get_file_contents"]),
requireApproval: .case2(.always)
)
)
let query = ChatQuery(
messages: [
.user(.init(content: .string("Search for Swift repositories on GitHub")))
],
model: .gpt4_o,
tools: [mcpTool]
)
MCP Tool Properties
- serverLabel: A unique identifier for the MCP server
- serverUrl: The URL endpoint of the MCP server
- headers: Authentication headers and other HTTP headers required by the server
- allowedTools: Specific tools to enable from the server (optional; if not specified, all tools are available)
- requireApproval: Whether tool calls require user approval (.always, .never, or conditional)
Example with MCP Swift Library
import MCP
import OpenAI
// Connect to MCP server using the MCP Swift library
let mcpClient = MCP.Client(name: "MyApp", version: "1.0.0")
let transport = HTTPClientTransport(
endpoint: URL(string: "https://api.githubcopilot.com/mcp/")!,
configuration: URLSessionConfiguration.default
)
let result = try await mcpClient.connect(transport: transport)
let toolsResponse = try await mcpClient.listTools()
// Create OpenAI MCP tool with discovered tools
let enabledToolNames = toolsResponse.tools.map { $0.name }
let mcpTool = Tool.mcpTool(
.init(
_type: .mcp,
serverLabel: "GitHub_MCP_Server",
serverUrl: "https://api.githubcopilot.com/mcp/",
headers: .init(additionalProperties: authHeaders),
allowedTools: .case1(enabledToolNames),
requireApproval: .case2(.always)
)
)
// Use in chat completion
let query = ChatQuery(
messages: [.user(.init(content: .string("Help me search GitHub repositories")))],
model: .gpt4_o,
tools: [mcpTool]
)
let chatResult = try await openAI.chats(query: query)
MCP Tool Call Handling
When using MCP tools, the model may generate tool calls that are executed on the remote MCP server. Handle MCP-specific output items in your response processing:
// Handle MCP tool calls in streaming responses
for try await result in openAI.chatsStream(query: query) {
for choice in result.choices {
if let outputItem = choice.delta.content {
switch outputItem {
case .mcpToolCall(let mcpCall):
print("MCP tool call: \(mcpCall.name)")
if let output = mcpCall.output {
print("Result: \(output)")
}
case .mcpApprovalRequest(let approvalRequest):
// Handle approval request if requireApproval is enabled
print("MCP tool requires approval: \(approvalRequest)")
default:
// Handle other output types
break
}
}
}
}
Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.
Request
struct EmbeddingsQuery: Codable {
/// ID of the model to use.
public let model: Model
/// Input text to get embeddings for
public let input: String
}
Response
struct EmbeddingsResult: Codable, Equatable {
public struct Embedding: Codable, Equatable {
public let object: String
public let embedding: [Double]
public let index: Int
}
public let data: [Embedding]
public let usage: Usage
}
Example
let query = EmbeddingsQuery(model: .textSearchBabbageDoc, input: "The food was delicious and the waiter...")
openAI.embeddings(query: query) { result in
//Handle response here
}
//or
let result = try await openAI.embeddings(query: query)
(lldb) po result
▿ EmbeddingsResult
▿ data : 1 element
▿ 0 : Embedding
- object : "embedding"
▿ embedding : 2048 elements
- 0 : 0.0010535449
- 1 : 0.024234328
- 2 : -0.0084999
- 3 : 0.008647452
.......
- 2044 : 0.017536353
- 2045 : -0.005897616
- 2046 : -0.026559394
- 2047 : -0.016633155
- index : 0
(lldb)
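Two embeddings can be compared with the Vector utility described below; a sketch using the parameter order from the EmbeddingsQuery declaration above (actual initializer labels may differ by version):
// Sketch: embed two texts and measure their cosine similarity.
let first = try await openAI.embeddings(
    query: EmbeddingsQuery(model: .textEmbedding3, input: "The food was delicious")
)
let second = try await openAI.embeddings(
    query: EmbeddingsQuery(model: .textEmbedding3, input: "The meal tasted great")
)
print(Vector.cosineSimilarity(a: first.data[0].embedding, b: second.data[0].embedding))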
Review Embeddings Documentation for more info.
Given an input text, outputs whether the model classifies it as violating OpenAI's content policy.
Request
public struct ModerationsQuery: Codable {
public let input: String
public let model: Model?
}
Response
public struct ModerationsResult: Codable, Equatable {
public let id: String
public let model: Model
public let results: [CategoryResult]
}
Example
let query = ModerationsQuery(input: "I want to kill them.")
openAI.moderations(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.moderations(query: query)
Review Moderations Documentation for more info.
Models are represented as a typealias: typealias Model = String.
public extension Model {
static let gpt5 = "gpt-5"
static let gpt5_mini = "gpt-5-mini"
static let gpt5_nano = "gpt-5-nano"
static let gpt5_chat = "gpt-5-chat"
static let gpt4_1 = "gpt-4.1"
static let gpt4_1_mini = "gpt-4.1-mini"
static let gpt4_1_nano = "gpt-4.1-nano"
static let gpt4_turbo_preview = "gpt-4-turbo-preview"
static let gpt4_vision_preview = "gpt-4-vision-preview"
static let gpt4_0125_preview = "gpt-4-0125-preview"
static let gpt4_1106_preview = "gpt-4-1106-preview"
static let gpt4 = "gpt-4"
static let gpt4_0613 = "gpt-4-0613"
static let gpt4_0314 = "gpt-4-0314"
static let gpt4_32k = "gpt-4-32k"
static let gpt4_32k_0613 = "gpt-4-32k-0613"
static let gpt4_32k_0314 = "gpt-4-32k-0314"
static let gpt3_5Turbo = "gpt-3.5-turbo"
static let gpt3_5Turbo_0125 = "gpt-3.5-turbo-0125"
static let gpt3_5Turbo_1106 = "gpt-3.5-turbo-1106"
static let gpt3_5Turbo_0613 = "gpt-3.5-turbo-0613"
static let gpt3_5Turbo_0301 = "gpt-3.5-turbo-0301"
static let gpt3_5Turbo_16k = "gpt-3.5-turbo-16k"
static let gpt3_5Turbo_16k_0613 = "gpt-3.5-turbo-16k-0613"
static let textDavinci_003 = "text-davinci-003"
static let textDavinci_002 = "text-davinci-002"
static let textCurie = "text-curie-001"
static let textBabbage = "text-babbage-001"
static let textAda = "text-ada-001"
static let textDavinci_001 = "text-davinci-001"
static let codeDavinciEdit_001 = "code-davinci-edit-001"
static let tts_1 = "tts-1"
static let tts_1_hd = "tts-1-hd"
static let whisper_1 = "whisper-1"
static let dall_e_2 = "dall-e-2"
static let dall_e_3 = "dall-e-3"
static let davinci = "davinci"
static let curie = "curie"
static let babbage = "babbage"
static let ada = "ada"
static let textEmbeddingAda = "text-embedding-ada-002"
static let textSearchAda = "text-search-ada-doc-001"
static let textSearchBabbageDoc = "text-search-babbage-doc-001"
static let textSearchBabbageQuery001 = "text-search-babbage-query-001"
static let textEmbedding3 = "text-embedding-3-small"
static let textEmbedding3Large = "text-embedding-3-large"
static let textModerationStable = "text-moderation-stable"
static let textModerationLatest = "text-moderation-latest"
static let moderation = "text-moderation-007"
}
GPT-4 models are supported.
As an example: to use the gpt-4-turbo-preview model, pass .gpt4_turbo_preview as the model parameter to the ChatQuery init.
let query = ChatQuery(
    messages: [
        .system(.init(content: .textContent("You are Librarian-GPT. You know everything about the books."))),
        .user(.init(content: .string("Who wrote Harry Potter?")))
    ],
    model: .gpt4_turbo_preview
)
let result = try await openAI.chats(query: query)
XCTAssertFalse(result.choices.isEmpty)
You can also pass a custom string if you need to use a model that is not listed above.
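Since Model is just a String typealias, any raw identifier can be passed where a Model is expected; for example:
// Sketch: use a provider-specific model identifier directly.
let customModelQuery = ChatQuery(
    messages: [.user(.init(content: .string("Hello!")))],
    model: "provider-specific-model-id" // any raw model string
)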
Lists the currently available models.
Response
public struct ModelsResult: Codable, Equatable {
public let data: [ModelResult]
public let object: String
}
Example
openAI.models() { result in
//Handle result here
}
//or
let result = try await openAI.models()
Retrieves a model instance, providing ownership information.
Request
public struct ModelQuery: Codable, Equatable {
public let model: Model
}
Response
public struct ModelResult: Codable, Equatable {
public let id: Model
public let object: String
public let ownedBy: String
}
Example
let query = ModelQuery(model: .gpt4)
openAI.model(query: query) { result in
//Handle result here
}
//or
let result = try await openAI.model(query: query)
Review Models Documentation for more info.
The component comes with several handy utility functions for working with vectors.
public struct Vector {
/// Returns the similarity between two vectors
///
/// - Parameters:
/// - a: The first vector
/// - b: The second vector
public static func cosineSimilarity(a: [Double], b: [Double]) -> Double {
return dot(a, b) / (mag(a) * mag(b))
}
/// Returns the difference between two vectors. Cosine distance is defined as `1 - cosineSimilarity(a, b)`
///
/// - Parameters:
/// - a: The first vector
/// - b: The second vector
public static func cosineDifference(a: [Double], b: [Double]) -> Double {
return 1 - Self.cosineSimilarity(a: a, b: b)
}
}
Example
let vector1 = [0.213123, 0.3214124, 0.421412, 0.3214521251, 0.412412, 0.3214124, 0.1414124, 0.3214521251, 0.213123, 0.3214124, 0.1414124, 0.4214214, 0.213123, 0.3214124, 0.1414124, 0.3214521251, 0.213123, 0.3214124, 0.1414124, 0.3214521251]
let vector2 = [0.213123, 0.3214124, 0.1414124, 0.3214521251, 0.213123, 0.3214124, 0.1414124, 0.3214521251, 0.213123, 0.511515, 0.1414124, 0.3214521251, 0.213123, 0.3214124, 0.1414124, 0.3214521251, 0.213123, 0.3214124, 0.1414124, 0.3213213]
let similarity = Vector.cosineSimilarity(a: vector1, b: vector2)
print(similarity) //0.9510201910206734
In data analysis, cosine similarity is a measure of similarity between two sequences of numbers.
Read more about Cosine Similarity here.
Review Assistants Documentation for more info.
Example: Create Assistant
let query = AssistantsQuery(model: Model.gpt4_o_mini, name: name, description: description, instructions: instructions, tools: tools, toolResources: toolResources)
openAI.assistantCreate(query: query) { result in
//Handle response here
}
Example: Modify Assistant
let query = AssistantsQuery(model: Model.gpt4_o_mini, name: name, description: description, instructions: instructions, tools: tools, toolResources: toolResources)
openAI.assistantModify(query: query, assistantId: "asst_1234") { result in
//Handle response here
}
Example: List Assistants
openAI.assistants() { result in
//Handle response here
}
Review Threads Documentation for more info.
Example: Create Thread
let threadsQuery = ThreadsQuery(messages: [Chat(role: message.role, content: message.content)])
openAI.threads(query: threadsQuery) { result in
//Handle response here
}
Example: Create and Run Thread
let threadsQuery = ThreadsQuery(messages: [Chat(role: message.role, content: message.content)])
let threadRunQuery = ThreadRunQuery(assistantId: "asst_1234", thread: threadsQuery)
openAI.threadRun(query: threadRunQuery) { result in
//Handle response here
}
Review Messages Documentation for more info.
Example: Get Threads Messages
openAI.threadsMessages(threadId: currentThreadId) { result in
//Handle response here
}
Example: Add Message to Thread
let query = MessageQuery(role: message.role.rawValue, content: message.content)
openAI.threadsAddMessage(threadId: currentThreadId, query: query) { result in
//Handle response here
}
Review Runs Documentation for more info.
Example: Create Run
let runsQuery = RunsQuery(assistantId: currentAssistantId)
openAI.runs(threadId: threadsResult.id, query: runsQuery) { result in
//Handle response here
}
Example: Retrieve Run
openAI.runRetrieve(threadId: currentThreadId, runId: currentRunId) { result in
//Handle response here
}
Example: Retrieve Run Steps
openAI.runRetrieveSteps(threadId: currentThreadId, runId: currentRunId) { result in
//Handle response here
}
Example: Submit Tool Outputs for Run
let output = RunToolOutputsQuery.ToolOutput(toolCallId: "call123", output: "Success")
let query = RunToolOutputsQuery(toolOutputs: [output])
openAI.runSubmitToolOutputs(threadId: currentThreadId, runId: currentRunId, query: query) { result in
//Handle response here
}
Review Files Documentation for more info.
Example: Upload file
let query = FilesQuery(purpose: "assistants", file: fileData, fileName: url.lastPathComponent, contentType: "application/pdf")
openAI.files(query: query) { result in
//Handle response here
}
TL;DR: Use the .relaxed parsing option on Configuration.
This SDK has limited support for other providers like Gemini, Perplexity, etc.
The top priority of this SDK is OpenAI, and the main rule is for all the main types to be fully compatible with OpenAI's API Reference. If the reference says a field should be optional, it must be optional in the main set of Query/Result types of this SDK. The same goes for other details declared in the reference, like default values.
That said, we still want to support other providers.
The .relaxed parsing option handles both missing and additional keys/values in responses. It should be sufficient for most use cases. Let us know if it doesn't cover a case you need.
Some providers return responses that don't completely satisfy OpenAI's scheme. For example, the Gemini chat completion response omits the id field, which is required by OpenAI's API Reference.
In such cases, use the fillRequiredFieldIfKeyNotFound parsing option, like this:
let configuration = OpenAI.Configuration(token: "", parsingOptions: .fillRequiredFieldIfKeyNotFound)
Some fields are required to be present (non-optional) by OpenAI, but other providers may return null for them. Use .fillRequiredFieldIfValueNotFound to handle missing values.
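A sketch combining both options, assuming ParsingOptions is an OptionSet (which .relaxed covering both behaviors suggests):
// Sketch: tolerate both missing keys and null values from a non-OpenAI provider.
let configuration = OpenAI.Configuration(
    token: "YOUR_TOKEN_HERE",
    parsingOptions: [.fillRequiredFieldIfKeyNotFound, .fillRequiredFieldIfValueNotFound]
)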
Currently we handle such cases by simply adding additional fields to the main model set. This is possible because optional fields don't break or conflict with OpenAI's scheme. At the moment, the following additional fields are added:
ChatResult
- citations (Perplexity)
ChatResult.Choice.Message
- reasoningContent (Grok, DeepSeek)
- reasoning (OpenRouter)
You can find example iOS application in Demo folder.
Make your Pull Requests clear and obvious to anyone viewing them.
Set main as your target branch.
Use Conventional Commits principles in naming PRs and branches:
- Feat: ... for new features and new functionality implementations.
- Bug: ... for bug fixes.
- Fix: ... for minor issue fixes, like typos or inaccuracies in code.
- Chore: ... for boring stuff like code polishing, refactoring, deprecation fixing, etc.
PR naming example: Feat: Add Threads API handling or Bug: Fix message result duplication
Branch naming example: feat/add-threads-API-handling or bug/fix-message-result-duplication
Describe the changes in your PR description using the template:
- What: ...
- Why: ...
- Affected Areas: ...
- More Info: ...
We'd appreciate you including tests with your code where needed and possible. ❤️
MIT License
Copyright (c) 2023 MacPaw Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.