Introduction
Due to the cost of LLM APIs, it is critical to enforce rate limits and granular access control on them. Further, concerns over sensitive data leakage are leading more and more companies to restrict or entirely ban access to LLMs.
In this blog post we are going to show how we can use SlashID to:
- Block or monitor requests to OpenAI APIs that contain sensitive data
- Enforce RBAC on OpenAI APIs and safely store the OpenAI API Key
In a future article, we’ll show how to enforce rate limiting and throttling.
OpenAI APIs
The OpenAI APIs allow applications to access models and other utilities that OpenAI exposes.
For instance, you can easily generate a chat completion with a call like the following:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'
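The same call can also be made from code. The sketch below is not part of the official example; it uses Python with the requests library and assumes the OPENAI_API_KEY environment variable is set:

import os

import requests

# Minimal Python equivalent of the curl command above.
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say this is a test!"}],
        "temperature": 0.7,
    },
)
print(resp.json())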
Introducing Gate
At SlashID, we believe that security begins with Identity. Gate is our identity-aware edge authorizer to protect APIs and workloads.
Gate can be used to monitor or enforce authentication, authorization and identity-based rate limiting on APIs and workloads, as well as to detect, anonymize, or block personally identifiable information (PII) and sensitive data exposed through your APIs or workloads.
Deploying Gate
In this blog post we’ll assume you deploy Gate as a standalone service and proxy OpenAI traffic through it. We are using docker compose for this exercise:
version: '3.7'
services:
  gate:
    image: slashid/gate
    ports:
      - 8080:8080
    command: [--yaml, /gate.yaml]
    volumes:
      - ./gate.yaml:/gate.yaml
In the most basic form, let's simply proxy a request to the OpenAI API and log it. This is what the gate.yaml configuration file looks like:
gate:
  port: '8080'
  log:
    format: text
    level: debug
  urls:
    - pattern: '*/v1/chat/completions*'
      target: https://api.openai.com/
If we now send a request through Gate without an OpenAI API key, it is proxied to OpenAI, which returns its usual authentication error:
curl https://gate:8080/v1/chat/completions
{
  "error": {
    "message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}
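Conversely, a request that does carry an OpenAI API key goes through Gate and comes back with a normal completion, confirming that the proxy is wired up. A minimal sketch, assuming OPENAI_API_KEY is set and Gate is reachable at gate:8080 as in this post's examples:

import os

import requests

# Same chat completion request, but sent through Gate instead of api.openai.com.
# In this basic configuration Gate simply forwards the Authorization header upstream.
# Depending on how TLS is terminated in your setup, you may need to adjust certificate verification.
resp = requests.post(
    "https://gate:8080/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say this is a test!"}],
        "temperature": 0.7,
    },
)
print(resp.status_code, resp.json())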
Blocking requests containing PII data
We can use the Gate PII detection plugin together with the OPA plugin to block requests to OpenAI APIs containing PII data:
Note: In the examples below, we embed the OPA policies directly in the Gate configuration, but they can also be served through a bundle.
Please check out our documentation to learn more about the plugin.
gate:
  plugins:
    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
    - id: authz_deny_pii
      type: opa
      enabled: false
      intercept: request
      parameters:
        policy_decision_path: /authz/allow
        policy: |
          package authz
          import future.keywords.if
          default allow := false
          key_found(obj, key) if { obj[key] }
          allow if not key_found(input.request.http.headers, "X-Gate-Anonymize-1")
---
  urls:
    - pattern: '*/v1/chat/completions*'
      target: https://api.openai.com/
      plugins:
        pii_anonymizer:
          enabled: true
        authz_deny_pii:
          enabled: true
The PII Detection plugin is able to modify, hash, strip or simply detect PII data in requests/responses. In this example it is configured to only monitor for the presence of PII in requests:
anonymizers: |
  DEFAULT:
    type: keep
If any PII is detected, the PII plugin adds a series of headers to the request/response, X-Gate-Anonymize-N, where N is the number of PII elements detected.
The authz_deny_pii instance of the OPA plugin enforces an OPA policy that blocks a request if it contains an X-Gate-Anonymize-1 header.
In other words, it blocks the request if the PII detection plugin found at least one PII element in the request.
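To make the policy logic concrete, here is a plain-Python sketch of the same decision (purely illustrative, not how Gate evaluates the Rego policy): allow the request unless the header flagging the first PII finding is present.

# Illustrative restatement of the Rego policy above in plain Python.
# The header value is a placeholder; only the header's presence matters for the decision.
def allow(request_headers: dict) -> bool:
    return "X-Gate-Anonymize-1" not in request_headers

print(allow({"Content-Type": "application/json"}))    # True: no PII detected, request allowed
print(allow({"X-Gate-Anonymize-1": "detected-pii"}))  # False: at least one PII element, request blocked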
Let’s see an example in action:
curl -vL https://gate:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "[email protected]"}],
    "temperature": 0.7
  }'
...
POST /v1/chat/completions HTTP/1.1
> Host: gate:8080
> User-Agent: curl/7.87.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 128
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< Via: 1.0 gate
< Date: Thu, 14 Sep 2023 01:57:40 GMT
< Content-Length: 0
Blocking other sensitive data
The PII detection plugin supports a number of PII selectors out of the box, but it is not limited to them. It is also possible to define custom categories of data for the plugin to detect.
In the example below we specify a pattern for data that either contains the word confidential or mentions Acme Corp:
analyzer_ad_hoc_recognizers: |
  - name: Acme Corp regex
    supported_language: en
    supported_entity: ACME_SENSITIVE_DATA
    patterns:
      - name: acme_data
        regex: "*Acme Corp*"
      - name: deny_list
        deny_list: "confidential"
Once we have specified the pattern we can see it in action like we did in the previous section:
curl -vL https://gate:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "This data is confidential"}],
    "temperature": 0.7
  }'
...
POST /v1/chat/completions HTTP/1.1
> Host: gate:8080
> User-Agent: curl/7.87.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 128
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< Via: 1.0 gate
< Date: Thu, 14 Sep 2023 01:57:40 GMT
< Content-Length: 0
Running in monitoring mode
In large, complex environments, it is not always possible or advisable to block requests. Gate allows you to run a number of plugins in monitoring mode, where an alert is triggered but the requests are not blocked or modified.
The OPA plugin also supports monitoring mode by adding monitoring_mode: true in its parameters, as shown below:
- id: authz_allow_if_authed_pii
  type: opa
  enabled: false
  intercept: request
  parameters:
    <<: *slashid_config
    monitoring_mode: true
    policy_decision_path: /authz/allow
    policy: |
      package authz
      import future.keywords.if
      default allow := false
      key_found(obj, key) if { obj[key] }
      allow if not key_found(input.request.http.headers, "X-Gate-Anonymize-1")
Let’s send a request with PII data:
curl -vL https://gate:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "[email protected]"}],
    "temperature": 0.7
  }'
{
  "error": {
    "message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}
The request passes but Gate logs the policy violation:
gate-llm-demo-gate-1 | time=2023-09-14T01:57:40Z level=info msg=OPA decision: false decision_id=9f248489-4876-40ed-824e-17fb35306c3b decision_provenance={0.55.0 19fc439d01c8d667b128606390ad2cb9ded04b16-dirty 2023-09-02T15:18:29Z map[gate:{}]} plugin=opa req_path=/v1/chat/completions request_host=gate:8080 request_url=/v1/chat/completions
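In monitoring mode, these decision logs are the signal to alert on. A hypothetical helper for spotting denied OPA decisions in Gate's text logs (the matching strings are assumed from the sample line above) could look like:

# Hypothetical helper: flag Gate log lines where the OPA plugin evaluated the policy to false.
# Adjust the matching strings if your log format differs from the sample above.
def is_policy_violation(log_line: str) -> bool:
    return "plugin=opa" in log_line and "msg=OPA decision: false" in log_line

sample = "time=2023-09-14T01:57:40Z level=info msg=OPA decision: false plugin=opa req_path=/v1/chat/completions"
print(is_policy_violation(sample))  # True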
Blocking requests from users with unauthorized roles
With Gate we can also block requests to the OpenAI APIs that are unauthenticated or come from users who don't belong to a specific group. In this example we'll perform two actions:
- Use the JWT token validator plugin to validate an authorization header minted by SlashID. We further check that the user belongs to the openai_users group.
- Use the token reminter plugin, coupled with the SlashID Data Vault storage module, to forward the request to the OpenAI APIs, swapping the SlashID Authorization header for the OpenAI API key.
We first define a Python webhook to remint the token and swap the Authorization header:
# The snippet below assumes a FastAPI service; app, logger and the Handle model are
# defined elsewhere in the webhook code and are not shown here.
from typing import Dict, List, Optional

import jwt
import requests
from fastapi import HTTPException, status
from pydantic import BaseModel, Field


class MintTokenRequest(BaseModel):
    slashid_token: str = Field(
        title="SlashID Token", example="eyJhbGciOiJIUzI1NiIsInR5cCI6Ik..."
    )
    person_id: str = Field(
        title="SlashID Person ID", example="cc006198-ef43-42ac-9b5a-b52713569d0f"
    )
    handles: Optional[List[Handle]] = Field(title="Person's login handles")


class MintTokenResponse(BaseModel):
    headers_to_set: Optional[Dict[str, str]]
    cookies_to_add: Optional[Dict[str, str]]


def retrieve_openai_key(person_id):
    # Construct the URL
    url = f"https://api.slashid.com/persons/{person_id}/attributes/end_user_no_access"
    # Perform the GET request
    response = requests.get(url)
    # Check if the request was successful
    if response.status_code == 200:
        data = response.json()
        return data['result']['openai_apikey']
    else:
        print(f"Request failed with status code {response.status_code}.")
        return None


@app.post(
    "/remint_token",
    tags=["gate"],
    status_code=status.HTTP_201_CREATED,
    summary="Endpoint called by Gate to translate SlashID token in a custom token",
)
def mint_token(request: MintTokenRequest) -> MintTokenResponse:
    logger.info(f"/mint_token: request={request}")
    sid_token = request.slashid_token
    try:
        token = jwt.decode(sid_token, options={"verify_signature": False})
    except Exception as e:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail=f"Token {sid_token} is invalid: {e}",
        )
    openai_token = retrieve_openai_key(token.get("person_id"))
    if openai_token is None:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Unable to get openai token",
        )
    return MintTokenResponse(
        headers_to_set={"Authorization": "Bearer " + openai_token}, cookies_to_add=None
    )
The mint_token function decodes the JWT token of the SlashID user, retrieves one of their Data Vault buckets where the OpenAI API key is stored, and rewrites the Authorization header with the OpenAI key.
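For Gate to reach this webhook at http://backend:8000/remint_token (the endpoint configured below), the FastAPI app has to be served somewhere. One minimal way to do that, as an assumption rather than part of the original setup, is to run it with uvicorn:

import uvicorn

# Serve the FastAPI app defined above on port 8000 so Gate's token-reminting plugin
# can call it at http://backend:8000/remint_token (hostname as used in this example).
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)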
Note: The reminter works without SlashID. You could, for instance, look up the OpenAI API key from your instance of HashiCorp Vault or from another location.
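As a sketch of that alternative, a Vault-backed version of retrieve_openai_key could use the hvac client library. The secret path, field name, and environment variables below are assumptions for illustration, and the key is assumed to live in a KV v2 secrets engine:

import os

import hvac

# Hypothetical Vault-backed lookup: one KV v2 secret per person, holding the OpenAI API key.
def retrieve_openai_key_from_vault(person_id):
    client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])
    secret = client.secrets.kv.v2.read_secret_version(path=f"openai/{person_id}")
    return secret["data"]["data"].get("openai_apikey")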
Let’s now take a look at the Gate configuration:
gate:
  plugins:
    - id: validator
      type: validate-jwt
      enable_http_caching: true
      enabled: false
      parameters:
        required_groups: ['openai_users']
        jwks_url: https://api.slashid.com/.well-known/jwks.json
        jwks_refresh_interval: 15m
        jwt_expected_issuer: https://api.slashid.com
    - id: pii_anonymizer
      type: anonymizer
      enabled: true
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
    - id: authz_deny_pii
      type: opa
      enabled: false
      intercept: request
      parameters:
        policy_decision_path: /authz/allow
        policy: |
          package authz
          import future.keywords.if
          default allow := false
          key_found(obj, key) if { obj[key] }
          allow if not key_found(input.request.http.headers, "X-Gate-Anonymize-1")
    - id: reminter
      type: token-reminting
      enabled: false
      parameters:
        remint_token_endpoint: http://backend:8000/remint_token
  port: '8080'
  log:
    format: text
    level: debug
  urls:
    - pattern: '*/v1/chat/completions*'
      target: https://api.openai.com/
      plugins:
        validator:
          enabled: true
        pii_anonymizer:
          enabled: true
        authz_deny_pii:
          enabled: true
        reminter:
          enabled: true
A lot of things are happening on the */v1/chat/completions* endpoint:
- We first validate the Authorization header, including checking that the user belongs to the correct group, through the validator instance of the JWT Validator plugin.
- We then proceed to check whether the request contains any PII data with pii_anonymizer.
- Next, we block the request through the OPA plugin authz_deny_pii if any PII was detected.
- Finally, if the request made it this far, we use the reminter to swap the SlashID authorization token for the OpenAI API key and we let the request through.
In particular, the JWT validator checks the validity of the JWT token as well as whether the user belongs to the openai_users group, as shown below:
- id: validator
  type: validate-jwt
  enable_http_caching: true
  enabled: false
  parameters:
    required_groups: ['openai_users']
    jwks_url: https://api.slashid.com/.well-known/jwks.json
    jwks_refresh_interval: 15m
    jwt_expected_issuer: https://api.slashid.com
Let’s see it in action:
curl -vL https://gate:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer abc" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say this is a test with SlashID!"}],
    "temperature": 0.7
  }'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-3.5-turbo-0613",
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nThis is a test with SlashID!"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
Conclusion
In this brief blog post we've demonstrated a few of the ways you can access LLMs without compromising on privacy and security. Stay tuned for the next blog post on rate limiting and OpenAI APIs.
We’d love to hear any feedback you may have! Try out Gate with a free account. If you’d like to use the PII detection plugin, please contact us at [email protected]!