
Conversation

@cirospaciari
Member

Summary

Add support for S3 custom metadata headers (x-amz-meta-*) that allow users to attach key-value pairs to S3 objects.

Write path

  • Accept metadata: Record<string, string> in S3Options
  • Convert { sku: "12345" } to x-amz-meta-sku: 12345 headers
  • Works with write(), writer(), and presigned PUT URLs
  • Keys are automatically normalized to lowercase per AWS requirements

Read path

  • Return metadata in S3Stats via stat()
  • Extract x-amz-meta-* headers from HEAD response
  • Strip prefix and return as metadata: Record<string, string>

Presigned URLs

  • Include metadata as signed headers in PUT URLs
  • Client uploading to a presigned URL must include matching headers (see the client-side sketch after the usage examples below)

Usage

// Write with metadata
await file.write("content", {
  metadata: { sku: "12345", category: "test" }
});

// Read metadata via stat()
const stat = await file.stat();
console.log(stat.metadata); // { sku: "12345", category: "test" }

// Presigned PUT URL with metadata
const url = file.presign({
  method: "PUT",
  metadata: { "custom-key": "value" }
});
// Client must include: x-amz-meta-custom-key: value header

// Streaming upload with metadata
const writer = file.writer({ metadata: { key: "value" } });
writer.write("data");
await writer.end();
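
For the client side of that presigned flow, a plain HTTP PUT must send the same x-amz-meta-* header that was signed into the URL. A minimal sketch, using the url returned by presign() above (the body is illustrative):

// Upload to the presigned URL; headers must match the signed ones
const res = await fetch(url, {
  method: "PUT",
  headers: { "x-amz-meta-custom-key": "value" },
  body: "content",
});
if (!res.ok) throw new Error(`upload failed: ${res.status}`);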

Test plan

  • Test basic upload and retrieve via S3File.write() and stat()
  • Test S3Client.write() static method with metadata
  • Test key normalization to lowercase
  • Test empty metadata object when no metadata set
  • Test presigned PUT URL with metadata in signed headers
  • Test streaming upload with writer() and metadata

All 6 metadata tests pass with MinIO via docker-compose.
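
A rough shape of the write/stat round-trip test, as a sketch only; the MinIO endpoint, credentials, and bucket below are illustrative defaults rather than the suite's actual harness values:

import { S3Client } from "bun";
import { expect, test } from "bun:test";

test("write then stat round-trips metadata", async () => {
  const client = new S3Client({
    endpoint: "http://localhost:9000", // MinIO from docker-compose (illustrative)
    accessKeyId: "minioadmin",
    secretAccessKey: "minioadmin",
    bucket: "test-bucket",
  });
  const file = client.file("metadata-test.txt");
  await file.write("content", { metadata: { SKU: "12345", category: "test" } });
  const stat = await file.stat();
  expect(stat.metadata).toEqual({ sku: "12345", category: "test" }); // keys lowercased
});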

Changelog

Added metadata option to S3Options for attaching custom metadata (x-amz-meta-* headers) to S3 objects during upload. Added metadata property to S3Stats returned by stat() to retrieve stored metadata. Keys are automatically lowercased per AWS requirements. Maximum ~2KB total metadata supported.

🤖 Generated with Claude Code

@cirospaciari cirospaciari requested a review from alii as a code owner January 16, 2026 02:06
@robobun
Collaborator

robobun commented Jan 16, 2026

Updated 7:41 PM PT - Jan 16th, 2026

@cirospaciari, your commit a54a38f has 2 failures in Build #35096 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 26154

That installs a local version of the PR into your bun-26154 executable, so you can run:

bun-26154 --bun

@cirospaciari cirospaciari marked this pull request as draft January 16, 2026 02:07
@coderabbitai
Contributor

coderabbitai bot commented Jan 16, 2026

Walkthrough

Adds support for user-defined S3 metadata (x-amz-meta-*) across the stack: types, signing, client APIs, multipart/streaming uploads, stat/list parsing, ownership/deinit handling, and tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Type defs & JS API**<br>packages/bun-types/s3.d.ts, src/bun.js/api/S3Stat.classes.ts | Added metadata option to S3Options and metadata on S3Stats; added S3Stat metadata getter (cached). |
| **WebCore: S3Stat, File, Blob, fetch**<br>src/bun.js/webcore/S3Stat.zig, src/bun.js/webcore/S3File.zig, src/bun.js/webcore/Blob.zig, src/bun.js/webcore/fetch.zig | Build/cache JS metadata from x-amz-meta-* headers; propagate/duplicate metadata into upload/presign/writableStream/uploadStream; transfer ownership and null originals to avoid double-free. |
| **Credentials & signing**<br>src/s3/credentials.zig, src/s3/simple_request.zig, src/s3/client.zig | Added MetadataMap, MAX_METADATA_HEADERS/MAX_HEADERS, wired metadata into SignRequest/presign and canonicalization; extended client APIs (stat, download, upload, uploadStream, writableStream, downloadStream, readableStream) to accept metadata; S3StatResult now exposes raw headers. |
| **Multipart & store**<br>src/s3/multipart.zig, src/s3/multipart_options.zig, src/bun.js/webcore/blob/Store.zig | MultiPartUpload and the S3 store gain an optional metadata field; metadata propagated through the part lifecycle; added MAX_SINGLE_UPLOAD_SIZE constant and deinit handling. |
| **Download streaming / state**<br>src/s3/download_stream.zig, src/s3/simple_request.zig | Added atomic packed State, processHttpCallback, and state getters/setters for streaming downloads; stat path includes response headers for metadata extraction. |
| **List / enums / errors**<br>src/s3/list_objects.zig, src/s3/acl.zig, src/s3/storage_class.zig, src/s3/error.zig | Implemented ListObjectsV2 parsing/result structs and JS conversion; added ACL enum and Map/toString; added StorageClass variants; added S3 error header docs. |
| **Tests**<br>test/js/bun/s3/s3.test.ts | New MinIO-gated tests covering write/stat, key normalization, presigned PUT with metadata, streaming uploads, and client write paths. |
| **Minor**<br>src/bun.js/webcore/Request.zig | Reordered import (FetchRedirect after FetchCacheMode); no behavior change. |


Suggested reviewers

  • alii
🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(s3): add custom metadata support (x-amz-meta-*)' clearly and concisely describes the main feature added—custom metadata support for S3 objects via x-amz-meta-* headers. |
| Description check | ✅ Passed | The PR description comprehensively covers both required template sections: it clearly explains what the PR does (metadata support for write/read/presigned paths) and demonstrates verification through six passing tests with MinIO. |




Add support for S3 custom metadata headers that allow users to attach
key-value pairs to S3 objects.

**Write path**: Accept `metadata: Record<string, string>` in S3Options
- Convert `{ sku: "12345" }` to `x-amz-meta-sku: 12345` headers
- Works with `write()`, `writer()`, and presigned PUT URLs
- Keys are automatically normalized to lowercase per AWS requirements

**Read path**: Return metadata in S3Stats via `stat()`
- Extract x-amz-meta-* headers from response
- Strip prefix and return as `metadata: Record<string, string>`

**Presigned URLs**: Include metadata as signed headers in PUT URLs
- Client uploading to presigned URL must include matching headers

Added `metadata` option to S3Options for attaching custom metadata (x-amz-meta-* headers) to S3 objects during upload. Added `metadata` property to S3Stats returned by `stat()` to retrieve stored metadata. Keys are automatically lowercased per AWS requirements. Maximum ~2KB total metadata supported.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix BASE_HEADERS constant: 9 → 11 (content_encoding was added upstream)
- Add named constants for buffer sizes with size calculations
- Document signRequest() with step-by-step AWS Sig V4 process
- Add AWS documentation links for all x-amz-* headers:
  - x-amz-content-sha256, x-amz-date, x-amz-acl, x-amz-security-token
  - x-amz-storage-class, x-amz-request-payer, x-amz-meta-*
- Add AWS spec references to all src/s3/ files:
  - acl.zig: Canned ACL spec
  - storage_class.zig: Storage class descriptions
  - multipart.zig: Multipart upload constraints
  - credentials.zig: SignedHeaders and CanonicalRequest format docs
  - client.zig, simple_request.zig, error.zig, list_objects.zig, download_stream.zig

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
autofix-ci bot and others added 3 commits January 16, 2026 21:27
Document all public functions and key internal structs in:
- simple_request.zig: Result types (S3StatResult, S3DownloadResult, etc.), S3HttpSimpleTask lifecycle, executeSimpleS3Request()
- download_stream.zig: S3HttpDownloadStreamingTask state machine, thread safety, atomic State struct
- list_objects.zig: S3ListObjectsOptions params, XML parser, S3ListObjectsV2Result.toJS()
- client.zig: All 10 public functions (stat, download, downloadSlice, delete, listObjects, upload, writableStream, uploadStream, downloadStream, readableStream)
- multipart.zig: MultiPartUpload lifecycle, UploadPart struct, commit/rollback with AWS API links

Each function includes AWS API reference links where applicable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace manual JSValue protect/unprotect with JSRef for proper
lifecycle management of the metadata JS object.

- Change metadata field type from jsc.JSValue to jsc.JSRef
- Use JSRef.initStrong() instead of manual protect()
- Use JSRef.tryGet() instead of direct access
- Use JSRef.deinit() instead of manual unprotect()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@cirospaciari
Copy link
Member Author

@coderabbitai review it

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2026

@cirospaciari: I'll review the changes in this PR that adds S3 custom metadata support (x-amz-meta-*). Let me analyze the implementation for the write path, read path, presigned URLs, and the test coverage.


🧠 Learnings used
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T03:22:50.711Z
Learning: Run `bun run zig:check-all` to compile Zig code on all platforms when making platform-specific changes

Learnt from: RiskyMH
Repo: oven-sh/bun PR: 25111
File: packages/bun-types/wasm.d.ts:103-104
Timestamp: 2025-11-27T15:18:21.607Z
Learning: In the Bun repository, autofix-ci automatically formats code in the main branch, which can cause formatting diffs to appear in pull requests even when the PR author didn't make those changes. These formatting changes from autofix-ci are expected and not the PR author's responsibility.

Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T03:22:50.711Z
Learning: Only push changes after running `bun bd test <file>` and ensuring tests pass

Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2026-01-05T23:04:01.518Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `bun bd test <...test file>` to run tests with compiled code changes. Do not use `bun test` as it will not include your changes.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
src/bun.js/webcore/Blob.zig (1)

965-983: Apply consistent metadata ownership pattern across all S3 upload call sites.

The S3 upload APIs store direct references to metadata.keys/values within SignResult headers (lines 1584-1586 of credentials.zig), which are then used asynchronously by S3HttpSimpleTask. However, defer aws_options.deinit() at line 934 immediately frees the metadata arrays after scheduling the upload, creating a use-after-free condition. This pattern is inconsistent with the writableStream path (line 2717), which correctly duplicates metadata to take ownership. Apply metadata_dupe consistently across all upload call sites (including simple uploads, uploadStream, and writableStream) to ensure metadata remains valid for the entire async operation lifecycle.

This affects: 965-983, 1115-1132, 1161-1179, 1190-1206, 1399-1415, 1463-1479, 2443-2459, and 2701-2730.

src/s3/download_stream.zig (1)

228-228: Use method-call syntax for updateState.

The call at line 228 invokes updateState as a free function (updateState(this, ...)), while the surrounding code uses method-call syntax; prefer this.updateState(...) for consistency.

🐛 Proposed fix
-        const wait_until_done = updateState(this, async_http, result, &state);
+        const wait_until_done = this.updateState(async_http, result, &state);
src/s3/list_objects.zig (4)

14-35: Use #-prefixed private backing fields in S3ListObjectsOptions.

The backing ZigString.Slice fields are internal and should use the #-prefix convention (written with Zig's quoted-identifier syntax, @"#name") to avoid accidental public exposure. As per coding guidelines, please align private field naming.

♻️ Suggested change
 pub const S3ListObjectsOptions = struct {
     continuation_token: ?[]const u8,
     delimiter: ?[]const u8,
     encoding_type: ?[]const u8,
     fetch_owner: ?bool,
     max_keys: ?i64,
     prefix: ?[]const u8,
     start_after: ?[]const u8,

-    _continuation_token: ?jsc.ZigString.Slice,
-    _delimiter: ?jsc.ZigString.Slice,
-    _encoding_type: ?jsc.ZigString.Slice,
-    _prefix: ?jsc.ZigString.Slice,
-    _start_after: ?jsc.ZigString.Slice,
+    @"#continuation_token": ?jsc.ZigString.Slice,
+    @"#delimiter": ?jsc.ZigString.Slice,
+    @"#encoding_type": ?jsc.ZigString.Slice,
+    @"#prefix": ?jsc.ZigString.Slice,
+    @"#start_after": ?jsc.ZigString.Slice,

     pub fn deinit(this: *@This()) void {
-        if (this._continuation_token) |slice| slice.deinit();
-        if (this._delimiter) |slice| slice.deinit();
-        if (this._encoding_type) |slice| slice.deinit();
-        if (this._prefix) |slice| slice.deinit();
-        if (this._start_after) |slice| slice.deinit();
+        if (this.@"#continuation_token") |slice| slice.deinit();
+        if (this.@"#delimiter") |slice| slice.deinit();
+        if (this.@"#encoding_type") |slice| slice.deinit();
+        if (this.@"#prefix") |slice| slice.deinit();
+        if (this.@"#start_after") |slice| slice.deinit();
     }
 };

@@
-        ._continuation_token = null,
-        ._delimiter = null,
-        ._encoding_type = null,
-        ._prefix = null,
-        ._start_after = null,
+        .@"#continuation_token" = null,
+        .@"#delimiter" = null,
+        .@"#encoding_type" = null,
+        .@"#prefix" = null,
+        .@"#start_after" = null,
@@
-                listObjectsOptions._continuation_token = str.toUTF8(bun.default_allocator);
-                listObjectsOptions.continuation_token = listObjectsOptions._continuation_token.?.slice();
+                listObjectsOptions.@"#continuation_token" = str.toUTF8(bun.default_allocator);
+                listObjectsOptions.continuation_token = listObjectsOptions.@"#continuation_token".?.slice();
@@
-                listObjectsOptions._delimiter = str.toUTF8(bun.default_allocator);
-                listObjectsOptions.delimiter = listObjectsOptions._delimiter.?.slice();
+                listObjectsOptions.@"#delimiter" = str.toUTF8(bun.default_allocator);
+                listObjectsOptions.delimiter = listObjectsOptions.@"#delimiter".?.slice();
@@
-                listObjectsOptions._encoding_type = str.toUTF8(bun.default_allocator);
-                listObjectsOptions.encoding_type = listObjectsOptions._encoding_type.?.slice();
+                listObjectsOptions.@"#encoding_type" = str.toUTF8(bun.default_allocator);
+                listObjectsOptions.encoding_type = listObjectsOptions.@"#encoding_type".?.slice();
@@
-                listObjectsOptions._prefix = str.toUTF8(bun.default_allocator);
-                listObjectsOptions.prefix = listObjectsOptions._prefix.?.slice();
+                listObjectsOptions.@"#prefix" = str.toUTF8(bun.default_allocator);
+                listObjectsOptions.prefix = listObjectsOptions.@"#prefix".?.slice();
@@
-                listObjectsOptions._start_after = str.toUTF8(bun.default_allocator);
-                listObjectsOptions.start_after = listObjectsOptions._start_after.?.slice();
+                listObjectsOptions.@"#start_after" = str.toUTF8(bun.default_allocator);
+                listObjectsOptions.start_after = listObjectsOptions.@"#start_after".?.slice();

Also applies to: 552-624


50-56: Fix the checksumAlgorithme typo in the public API.

The struct field and JS property are misspelled (Algorithme), which is inconsistent with the parsed XML tag and expected camelCase naming. This is user-facing and will surprise consumers.

🐛 Suggested fix
 const S3ListObjectsContents = struct {
     key: []const u8,
     etag: ?bun.ptr.OwnedIn([]const u8, bun.allocators.MaybeOwned(bun.DefaultAllocator)),
     checksum_type: ?[]const u8,
-    checksum_algorithme: ?[]const u8,
+    checksum_algorithm: ?[]const u8,
@@
-                    var checksum_algorithme: ?[]const u8 = null;
+                    var checksum_algorithm: ?[]const u8 = null;
@@
-                                } else if (strings.eql(inner_tag_name_or_tag_end, "ChecksumAlgorithm")) {
+                                } else if (strings.eql(inner_tag_name_or_tag_end, "ChecksumAlgorithm")) {
                                     if (strings.indexOf(xml[i..], "</ChecksumAlgorithm>")) |__tag_end| {
-                                        checksum_algorithme = xml[i .. i + __tag_end];
+                                        checksum_algorithm = xml[i .. i + __tag_end];
                                         i = i + __tag_end + 20;
                                     }
                                 }
@@
-                            .checksum_algorithme = checksum_algorithme,
+                            .checksum_algorithm = checksum_algorithm,
@@
-                if (item.checksum_algorithme) |checksum_algorithme| {
-                    objectInfo.put(globalObject, jsc.ZigString.static("checksumAlgorithme"), try bun.String.createUTF8ForJS(globalObject, checksum_algorithme));
+                if (item.checksum_algorithm) |checksum_algorithm| {
+                    objectInfo.put(globalObject, jsc.ZigString.static("checksumAlgorithm"), try bun.String.createUTF8ForJS(globalObject, checksum_algorithm));
                 }

Also applies to: 155-157, 265-266, 311-315, 421-424


239-244: Parser can skip the tail of the XML response.

The loop bounds use xml[delete_result_pos..].len but index from xml[i], which truncates scanning when <ListBucketResult> isn’t at offset 0. That can drop elements near the end of the response.

🐛 Suggested fix
-    if (strings.indexOf(xml, "<ListBucketResult")) |delete_result_pos| {
-        var i: usize = 0;
-        while (i < xml[delete_result_pos..].len) {
+    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+        var i: usize = list_result_pos;
+        while (i < xml.len) {

236-237: Avoid leaking contents/common_prefixes when <ListBucketResult> is missing.

These lists are allocated unconditionally but only deinitialized inside the if branch. If the XML is malformed or the tag is absent, the allocations leak.

🐛 Suggested fix
-    var contents = std.array_list.Managed(S3ListObjectsContents).init(bun.default_allocator);
-    var common_prefixes = std.array_list.Managed([]const u8).init(bun.default_allocator);
-
-    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+        var contents = std.array_list.Managed(S3ListObjectsContents).init(bun.default_allocator);
+        var common_prefixes = std.array_list.Managed([]const u8).init(bun.default_allocator);
         ...
     }

Also applies to: 534-536

src/s3/client.zig (1)

284-300: Metadata ownership ambiguity requires clarification or defensive cloning.

Metadata is accepted by value but stored directly in request/multipart state without cloning, unlike path and content_type parameters which are explicitly cloned with .dupe(). This creates inconsistent ownership semantics: the request task will deinit metadata (simple_request.zig:170, multipart.zig:318), but there's no defensive clone to prevent double-free if a caller attempts to reuse or deinit the original map.

Either document this as a transfer-of-ownership API (caller must not deinit), or apply defensive cloning consistent with other string parameters using metadata.dupe(allocator).

Also applies to: 319-366, 512-559

🤖 Fix all issues with AI agents
In `@src/bun.js/webcore/S3Stat.zig`:
- Around line 56-68: The length check excludes headers whose name equals the
prefix exactly and should be defensive: change the condition from
header.name.len > prefix_len to header.name.len >= prefix_len so the slice
header.name[prefix_len..] is valid for empty suffixes; also ensure failures are
propagated/handled by treating obj.put as fallible—use try (or otherwise handle
its error) when calling obj.put(globalThis, key, value_js). Reference symbols:
the for (headers) loop, header.name.len, prefix_len,
strings.eqlCaseInsensitiveASCII, header.name[prefix_len..],
bun.String.createUTF8ForJS, and obj.put.

In `@src/s3/credentials.zig`:
- Around line 1003-1007: The code currently treats list.items.len == 0 as a
silent fallback to SignedHeaders.get(header_key) via the brk_signed label when
the fixed buffer allocator overflows; instead, change this behavior to surface
the allocator failure by returning an error (or at minimum logging a warning) so
callers know metadata signing was skipped. Locate the check on list.items.len
and the brk_signed SignedHeaders.get(header_key) path and replace the silent
break with an explicit error return (or call to the module logger with a clear
message including allocator/list context) so signature mismatches can be
detected and handled upstream.
- Around line 83-86: When duplicating this.keys/this.values into
new_keys/new_values using allocator.dupe, ensure partial allocations are cleaned
up on failure: if allocator.dupe fails for either a key or a value at index i,
loop backwards to free all previously allocated new_keys[0..i-1] and
new_values[0..i-1], then free the new_keys and new_values arrays themselves (and
any other owned resources) before returning null; update the duplication loop
around new_keys/new_values to handle failures for both key and value
duplications and reference the symbols allocator.dupe, new_keys, new_values,
this.keys, and this.values when implementing the cleanup.
- Around line 411-414: The code silently converts non-string metadata values to
empty strings (values[i] = ""), which hides bad input; change this to match the
function's other type-validation behavior by returning an error or converting
the value to a string. Specifically, in the block that currently sets values[i]
= "", inspect the metadata entry (use the metadata key variable and index i) and
either: 1) return a descriptive error (e.g., error.InvalidMetadataType)
including the metadata key and actual type, or 2) coerce the value to a string
representation before assigning to values[i]; ensure the new behavior is
consistent with existing validation patterns in this function and include the
metadata key in the error message for easier debugging.

In `@src/s3/download_stream.zig`:
- Line 259: The call to processHttpCallback uses free-function syntax; update the
call site so it invokes this.processHttpCallback(async_http, result)
(method-call syntax, passing this implicitly) to match other method calls on the
same object and ensure the method resolves correctly; locate the usage of
processHttpCallback in the function around async_http/result and switch it to
the this. form.

In `@src/s3/multipart.zig`:
- Around line 124-125: MultiPartUpload and SimpleRequest are both holding and
deinitting the same MetadataMap, causing a double-free; fix by making
MultiPartUpload duplicate the metadata before passing it into
executeSimpleS3Request (so SimpleRequest owns its copy) using
MetadataMap.dupe(), or alternatively transfer ownership by not
storing/deinitting metadata in one of the structs. Specifically, update the call
sites in executeSimpleS3Request (the invocations from MultiPartUpload at the
locations referenced) to pass metadata.dupe() instead of the original, and
ensure deinit() of SimpleRequest and MultiPartUpload continue to call deinit()
only on the metadata they own.

cirospaciari and others added 5 commits January 16, 2026 14:44
- S3Stat.zig: Use >= instead of > for prefix length check and skip empty keys
- credentials.zig: Fix memory leak on partial allocation failure in MetadataMap.dupe
- credentials.zig: Throw error for non-string metadata values instead of silent fallback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MultiPartUpload stores metadata and frees it in deinit(). When passing
metadata to uploadStream/writableStream, the caller's S3CredentialsWithOptions
also frees the same metadata on defer deinit() - causing double-free.

Fix: Duplicate metadata before passing to multipart functions, then null
out the original to prevent double-free in the caller's cleanup.

Also refactors credentials.zig to use errdefer pattern for cleaner
memory management in parseMetadataFromJS() and MetadataMap.dupe().

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use method call syntax for processHttpCallback (download_stream.zig:259)
- Return error.MetadataSigningBufferOverflow instead of silently falling
  back when signed headers buffer overflows (credentials.zig)
  - Previous behavior: silent fallback caused signature mismatches
  - New behavior: explicit error helps diagnose metadata signing issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace 4 instances of `catch bun.outOfMemory()` with `bun.handleOom()`
in the S3 metadata parsing code. The former pattern is banned because
it can catch unrelated errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When minioCredentials is undefined (no Docker), the describe callback
is still evaluated even though the tests are skipped. Using optional
chaining with fallbacks prevents the TypeError during evaluation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
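
The pattern described in the commit above might look roughly like this sketch (minioCredentials is the harness value named in the commit; the client option values are illustrative):

describe.skipIf(!minioCredentials)("s3 metadata (MinIO)", () => {
  // Optional chaining keeps this body safe to evaluate when Docker is absent,
  // since the describe callback runs even when the tests are skipped.
  const client = new S3Client({
    endpoint: minioCredentials?.endpoint ?? "",
    accessKeyId: minioCredentials?.accessKeyId ?? "",
    secretAccessKey: minioCredentials?.secretAccessKey ?? "",
    bucket: minioCredentials?.bucket ?? "",
  });
  // ...metadata tests...
});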
@cirospaciari cirospaciari marked this pull request as ready for review January 17, 2026 01:41
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/bun.js/webcore/Blob.zig (1)

2715-2731: Guard metadata_dupe on writableStream errors.

Both locations (lines 2715 and 2734) have a potential memory leak. When metadata_dupe is allocated and passed to S3.writableStream(), if the function throws before the task takes ownership, the duplicate metadata leaks because the original metadata is nulled (to prevent double-free) and there is no cleanup for the allocated copy.

Line 2734 is particularly vulnerable—it has no defer pattern at all protecting metadata_dupe.

Add an errdefer to clean up the metadata if writableStream fails:

Proposed fix
-                const metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+                var metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+                errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
                 credentialsWithOptions.metadata = null; // Prevent double-free in deinit
                 defer credentialsWithOptions.deinit();
-        const metadata_dupe = if (s3.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+        var metadata_dupe = if (s3.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+        errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
         return try S3.writableStream(
src/bun.js/webcore/fetch.zig (1)

1312-1329: Add errdefer for metadata_dupe to prevent leak if uploadStream fails.

Line 1314 creates metadata_dupe with .dupe(), and line 1315 nulls the original. If uploadStream on line 1316 throws an error before taking ownership of the duplicate, metadata_dupe leaks. Change const to var and add errdefer to clean it up:

-            const metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+            var metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+            errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
             credentialsWithOptions.metadata = null; // Prevent double-free in deinit
             _ = try bun.S3.uploadStream(
🤖 Fix all issues with AI agents
In `@src/s3/credentials.zig`:
- Around line 124-181: After counting properties in parseMetadataFromJS, enforce
a hard cap using the existing MAX_METADATA_HEADERS constant: if property_count
== 0 return null (keep), but if property_count > MAX_METADATA_HEADERS return an
error (e.g., use globalObject.throwRangeError or a suitable throw* helper) so we
never allocate/produce more entries than the fixed-size metadata arrays. Add
this guard immediately after the property_count loop and before allocating
keys/values (refer to property_count, MAX_METADATA_HEADERS, and
parseMetadataFromJS).
♻️ Duplicate comments (1)
src/bun.js/webcore/S3Stat.zig (1)

56-68: Confirm whether obj.put is fallible and should be handled.

If put can throw/return an error, this path should propagate it to avoid silently dropping metadata entries.

#!/bin/bash
# Inspect JSValue.put signature to determine if it returns an error union.
rg -n "pub fn put\\(" --type=zig

- Add guard in parseMetadataFromJS to enforce MAX_METADATA_HEADERS limit
  (32 headers max) and throw RangeError if exceeded
- Add errdefer to clean up metadata_dupe if writableStream/uploadStream
  fails before taking ownership (prevents memory leaks in Blob.zig and
  fetch.zig)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
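
Seen from JavaScript, the new guard would surface roughly like this sketch (assuming the RangeError behavior lands as described; the 32-header cap is MAX_METADATA_HEADERS):

// 33 keys exceeds the 32-header cap enforced in parseMetadataFromJS
const tooMany = Object.fromEntries(
  Array.from({ length: 33 }, (_, i) => [`key${i}`, "value"]),
);
try {
  await file.write("content", { metadata: tooMany });
} catch (err) {
  console.log(err instanceof RangeError); // expected: true, per the guard above
}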
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/s3/credentials.zig`:
- Around line 124-190: parseMetadataFromJS currently enforces only
MAX_METADATA_HEADERS but not the total bytes, which can overflow
CANONICAL_REQUEST_BUF_SIZE; add a running counter (e.g., total_metadata_bytes)
and for each key and value increment it by key_owned.len + val_slice.len (plus
any separators if relevant) and if it exceeds the AWS-enforced limit (2048
bytes) return a RangeError via globalObject.throwRangeError; perform this check
before duplicating/allocating (before bun.default_allocator.dupe/alloc for
key_lower and values[i]) to avoid OOM/overflow and mirror the same guard in any
other metadata-parsing code paths that build canonical requests.
♻️ Duplicate comments (1)
src/s3/credentials.zig (1)

972-1005: Handle FixedBufferAllocator overflow in presigned SignedHeaders build.

In the presigned branch, appendSlice failures are ignored; the resulting signed_headers can be truncated without error, yielding invalid signatures. Mirror the overflow handling used in the header-based branch.

🐛 Proposed fix
-                        list.appendSlice("host") catch {};
+                        list.appendSlice("host") catch return error.MetadataSigningBufferOverflow;
@@
-                            list.appendSlice(";x-amz-meta-") catch {};
-                            list.appendSlice(meta.keys[idx]) catch {};
+                            list.appendSlice(";x-amz-meta-") catch return error.MetadataSigningBufferOverflow;
+                            list.appendSlice(meta.keys[idx]) catch return error.MetadataSigningBufferOverflow;

Comment on lines +124 to +190
fn parseMetadataFromJS(metadata_value: jsc.JSValue, globalObject: *jsc.JSGlobalObject) bun.JSError!?MetadataMap {
    const metadata_obj = try metadata_value.toObject(globalObject);

    // Count properties first
    var property_count: usize = 0;
    var iter = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
    defer iter.deinit();
    while (try iter.next()) |_| {
        property_count += 1;
    }

    if (property_count == 0) return null;

    // Enforce maximum metadata headers limit to prevent buffer overflows
    if (property_count > MAX_METADATA_HEADERS) {
        return globalObject.throwRangeError(@as(i64, @intCast(property_count)), .{
            .min = 0,
            .max = MAX_METADATA_HEADERS,
            .field_name = "metadata properties",
        });
    }

    // Allocate arrays for keys and values
    const keys = bun.handleOom(bun.default_allocator.alloc([]const u8, property_count));
    errdefer bun.default_allocator.free(keys);

    const values = bun.handleOom(bun.default_allocator.alloc([]const u8, property_count));
    errdefer bun.default_allocator.free(values);

    // Track allocations for cleanup on error
    var keys_allocated: usize = 0;
    var values_allocated: usize = 0;

    errdefer {
        for (0..keys_allocated) |j| bun.default_allocator.free(keys[j]);
        for (0..values_allocated) |j| bun.default_allocator.free(values[j]);
    }

    // Second pass to extract keys and values
    var iter2 = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
    defer iter2.deinit();

    var i: usize = 0;
    while (try iter2.next()) |key| {
        // Convert key to lowercase (AWS requires lowercase metadata keys)
        const key_owned = try key.toOwnedSlice(bun.default_allocator);
        defer bun.default_allocator.free(key_owned);

        const key_lower = bun.handleOom(bun.default_allocator.alloc(u8, key_owned.len));
        keys[i] = strings.copyLowercase(key_owned, key_lower);
        keys_allocated = i + 1;

        // Get value - metadata values must be strings
        const val = iter2.value;
        if (!val.isString()) {
            return globalObject.throwInvalidArgumentTypeValue("metadata value", "string", val);
        }

        const val_str = try bun.String.fromJS(val, globalObject);
        defer val_str.deref();
        const val_slice = val_str.toUTF8(bun.default_allocator);
        defer val_slice.deinit();

        values[i] = bun.handleOom(bun.default_allocator.dupe(u8, val_slice.slice()));
        values_allocated = i + 1;

        i += 1;
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

AWS S3 user-defined metadata size limit bytes per object

💡 Result:

User-defined metadata per object is limited to 2 KB (2,048 bytes). The PUT request header overall is limited to 8 KB (8,192 bytes). [1][2]

Sources:

  • AWS S3 — Working with object metadata (notes on 2 KB user metadata and 8 KB PUT header). [1]
  • AWS S3 — Editing object metadata (restates 2 KB user-defined metadata). [2]

🏁 Script executed:

# Find and examine the credentials.zig file
fd credentials.zig src/

Repository: oven-sh/bun

Length of output: 77


🏁 Script executed:

# Check the file size and structure
wc -l src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 82


🏁 Script executed:

# Look for the constants mentioned in the review
rg "MAX_METADATA_HEADERS|SIGNED_HEADERS_BUF_SIZE|CANONICAL_REQUEST_BUF_SIZE" src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 1202


🏁 Script executed:

# Read the parseMetadataFromJS function and surrounding code
sed -n '100,200p' src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 3584


🏁 Script executed:

# Read lines 464-473
sed -n '450,480p' src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 1559


🏁 Script executed:

# Read lines 1335-1489
sed -n '1320,1510p' src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 9002


🏁 Script executed:

# Search for where metadata is added to canonical request to understand buffer usage
rg -A 5 -B 5 "canonical" src/s3/credentials.zig | head -100

Repository: oven-sh/bun

Length of output: 5182


🏁 Script executed:

# Search for any bounds checking on pos variable in canonical request building
rg -A 2 -B 2 "pos >" src/s3/credentials.zig | head -30

Repository: oven-sh/bun

Length of output: 37


🏁 Script executed:

# Also check if there's any validation on metadata size elsewhere
rg "total.*byte|total.*size|metadata.*byte|metadata.*size" src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 149


🏁 Script executed:

# Check the exact TEMP_BUFFER_SIZE and other related constants
rg "TEMP_BUFFER_SIZE|BUFFER_SIZE" src/s3/credentials.zig

Repository: oven-sh/bun

Length of output: 846


🏁 Script executed:

# Final verification: check if there's any runtime bounds checking around pos assignments
sed -n '1400,1500p' src/s3/credentials.zig | grep -E "(pos|CANONICAL|unreachable|panic)"

Repository: oven-sh/bun

Length of output: 2709


Add total metadata size guard in parseMetadataFromJS to prevent buffer overflow in canonical request building.

The current implementation only validates the count of metadata headers (max 32), but does not cap the total byte size. This allows attackers to supply large key/value pairs that overflow the fixed buffers (CANONICAL_REQUEST_BUF_SIZE = 8192 bytes) during signature generation. AWS enforces a 2 KB limit on user-defined metadata, which must be enforced in parsing to prevent silent memory corruption in release builds.

Example overflow: 32 headers × 256 bytes each = 8192 bytes for metadata alone, plus METHOD/PATH/QUERY/base headers = buffer overflow.

Proposed fix
 fn parseMetadataFromJS(metadata_value: jsc.JSValue, globalObject: *jsc.JSGlobalObject) bun.JSError!?MetadataMap {
     const metadata_obj = try metadata_value.toObject(globalObject);
+    const max_total_bytes: usize = 2048; // AWS user metadata limit

@@
     // Second pass to extract keys and values
     var iter2 = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
     defer iter2.deinit();
 
     var i: usize = 0;
+    var total_bytes: usize = 0;
     while (try iter2.next()) |key| {
@@
         const val_slice = val_str.toUTF8(bun.default_allocator);
         defer val_slice.deinit();
 
+        total_bytes += "x-amz-meta-".len + key_owned.len + 1 + val_slice.slice().len;
+        if (total_bytes > max_total_bytes) {
+            return globalObject.throwRangeError(@as(i64, @intCast(total_bytes)), .{
+                .min = 0,
+                .max = @as(i64, @intCast(max_total_bytes)),
+                .field_name = "metadata total size",
+            });
+        }
+
         values[i] = bun.handleOom(bun.default_allocator.dupe(u8, val_slice.slice()));
         values_allocated = i + 1;

Also applies to: 464-473, 1335-1489

