feat(s3): add custom metadata support (x-amz-meta-*) #26154
base: main
Conversation
Updated 7:41 PM PT - Jan 16th, 2026
❌ @cirospaciari, your commit a54a38f has 2 failures in
🧪 To try this PR locally: `bunx bun-pr 26154`. That installs a local version of the PR as `bun-26154`, which you can run with `bun-26154 --bun`.
Walkthrough

Adds support for user-defined S3 metadata (`x-amz-meta-*`) across the stack: types, signing, client APIs, multipart/streaming uploads, stat/list parsing, ownership/deinit handling, and tests.
🚥 Pre-merge checks: ✅ Passed checks (2 passed)
Add support for S3 custom metadata headers that allow users to attach
key-value pairs to S3 objects.
**Write path**: Accept `metadata: Record<string, string>` in S3Options
- Convert `{ sku: "12345" }` to `x-amz-meta-sku: 12345` headers
- Works with `write()`, `writer()`, and presigned PUT URLs
- Keys are automatically normalized to lowercase per AWS requirements
**Read path**: Return metadata in S3Stats via `stat()`
- Extract x-amz-meta-* headers from response
- Strip prefix and return as `metadata: Record<string, string>`
**Presigned URLs**: Include metadata as signed headers in PUT URLs
- Client uploading to presigned URL must include matching headers
Added `metadata` option to S3Options for attaching custom metadata (x-amz-meta-* headers) to S3 objects during upload. Added `metadata` property to S3Stats returned by `stat()` to retrieve stored metadata. Keys are automatically lowercased per AWS requirements. Maximum ~2KB total metadata supported.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
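For a concrete picture of the API described above, here is a minimal sketch. The bucket name, key, and data are placeholders, credentials are assumed to come from the environment, and the `metadata` option plus the `stat().metadata` field are the additions this PR proposes:

```ts
import { S3Client } from "bun";

// Placeholder bucket; credentials resolve from the usual S3 environment variables.
const client = new S3Client({ bucket: "my-bucket" });
const file = client.file("products/12345.json");

// Write path: each pair becomes an x-amz-meta-* header, with keys lowercased.
await file.write(JSON.stringify({ name: "Widget" }), {
  metadata: { SKU: "12345", batch: "A7" },
});

// Read path: stat() returns the pairs with the x-amz-meta- prefix stripped.
const stat = await file.stat();
console.log(stat.metadata); // { sku: "12345", batch: "A7" }
```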
Force-pushed from d6d98df to 2fe0b05.
- Fix BASE_HEADERS constant: 9 → 11 (content_encoding was added upstream)
- Add named constants for buffer sizes with size calculations
- Document signRequest() with step-by-step AWS Sig V4 process
- Add AWS documentation links for all x-amz-* headers:
  - x-amz-content-sha256, x-amz-date, x-amz-acl, x-amz-security-token
  - x-amz-storage-class, x-amz-request-payer, x-amz-meta-*
- Add AWS spec references to all src/s3/ files:
  - acl.zig: Canned ACL spec
  - storage_class.zig: Storage class descriptions
  - multipart.zig: Multipart upload constraints
  - credentials.zig: SignedHeaders and CanonicalRequest format docs
  - client.zig, simple_request.zig, error.zig, list_objects.zig, download_stream.zig

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed from ce31fce to 01e22ae.
Document all public functions and key internal structs in:

- simple_request.zig: Result types (S3StatResult, S3DownloadResult, etc.), S3HttpSimpleTask lifecycle, executeSimpleS3Request()
- download_stream.zig: S3HttpDownloadStreamingTask state machine, thread safety, atomic State struct
- list_objects.zig: S3ListObjectsOptions params, XML parser, S3ListObjectsV2Result.toJS()
- client.zig: All 10 public functions (stat, download, downloadSlice, delete, listObjects, upload, writableStream, uploadStream, downloadStream, readableStream)
- multipart.zig: MultiPartUpload lifecycle, UploadPart struct, commit/rollback with AWS API links

Each function includes AWS API reference links where applicable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace manual JSValue protect/unprotect with JSRef for proper lifecycle management of the metadata JS object.

- Change metadata field type from jsc.JSValue to jsc.JSRef
- Use JSRef.initStrong() instead of manual protect()
- Use JSRef.tryGet() instead of direct access
- Use JSRef.deinit() instead of manual unprotect()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai review it
🧠 Learnings used. ✅ Actions performed: review triggered.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
src/bun.js/webcore/Blob.zig (1)
965-983: Apply consistent metadata ownership pattern across all S3 upload call sites. The S3 upload APIs store direct references to `metadata.keys`/`values` within SignResult headers (lines 1584-1586 of credentials.zig), which are then used asynchronously by S3HttpSimpleTask. However, `defer aws_options.deinit()` at line 934 immediately frees the metadata arrays after scheduling the upload, creating a use-after-free condition. This pattern is inconsistent with the writableStream path (line 2717), which correctly duplicates metadata to take ownership. Apply `metadata_dupe` consistently across all upload call sites (including simple uploads, uploadStream, and writableStream) so metadata remains valid for the entire async operation lifecycle.

This affects: 965-983, 1115-1132, 1161-1179, 1190-1206, 1399-1415, 1463-1479, 2443-2459, and 2701-2730.
src/s3/download_stream.zig (1)
228-228: Missing `this.` prefix when calling `updateState`. The call to `updateState` at line 228 is written as a free-function call; for consistency with the other method calls in this file, invoke it as a method on `this`.

🐛 Proposed fix

```diff
-    const wait_until_done = updateState(this, async_http, result, &state);
+    const wait_until_done = this.updateState(async_http, result, &state);
```

src/s3/list_objects.zig (4)
14-35: Use `#`-prefixed private backing fields in `S3ListObjectsOptions`. The backing `ZigString.Slice` fields are internal and should use `#` to avoid accidental public exposure and to align with the project's private-field naming conventions, as per coding guidelines.

♻️ Suggested change
```diff
 pub const S3ListObjectsOptions = struct {
     continuation_token: ?[]const u8,
     delimiter: ?[]const u8,
     encoding_type: ?[]const u8,
     fetch_owner: ?bool,
     max_keys: ?i64,
     prefix: ?[]const u8,
     start_after: ?[]const u8,
-    _continuation_token: ?jsc.ZigString.Slice,
-    _delimiter: ?jsc.ZigString.Slice,
-    _encoding_type: ?jsc.ZigString.Slice,
-    _prefix: ?jsc.ZigString.Slice,
-    _start_after: ?jsc.ZigString.Slice,
+    @"#continuation_token": ?jsc.ZigString.Slice,
+    @"#delimiter": ?jsc.ZigString.Slice,
+    @"#encoding_type": ?jsc.ZigString.Slice,
+    @"#prefix": ?jsc.ZigString.Slice,
+    @"#start_after": ?jsc.ZigString.Slice,

     pub fn deinit(this: *@This()) void {
-        if (this._continuation_token) |slice| slice.deinit();
-        if (this._delimiter) |slice| slice.deinit();
-        if (this._encoding_type) |slice| slice.deinit();
-        if (this._prefix) |slice| slice.deinit();
-        if (this._start_after) |slice| slice.deinit();
+        if (this.@"#continuation_token") |slice| slice.deinit();
+        if (this.@"#delimiter") |slice| slice.deinit();
+        if (this.@"#encoding_type") |slice| slice.deinit();
+        if (this.@"#prefix") |slice| slice.deinit();
+        if (this.@"#start_after") |slice| slice.deinit();
     }
 };
@@
-    ._continuation_token = null,
-    ._delimiter = null,
-    ._encoding_type = null,
-    ._prefix = null,
-    ._start_after = null,
+    .@"#continuation_token" = null,
+    .@"#delimiter" = null,
+    .@"#encoding_type" = null,
+    .@"#prefix" = null,
+    .@"#start_after" = null,
@@
-    listObjectsOptions._continuation_token = str.toUTF8(bun.default_allocator);
-    listObjectsOptions.continuation_token = listObjectsOptions._continuation_token.?.slice();
+    listObjectsOptions.@"#continuation_token" = str.toUTF8(bun.default_allocator);
+    listObjectsOptions.continuation_token = listObjectsOptions.@"#continuation_token".?.slice();
@@
-    listObjectsOptions._delimiter = str.toUTF8(bun.default_allocator);
-    listObjectsOptions.delimiter = listObjectsOptions._delimiter.?.slice();
+    listObjectsOptions.@"#delimiter" = str.toUTF8(bun.default_allocator);
+    listObjectsOptions.delimiter = listObjectsOptions.@"#delimiter".?.slice();
@@
-    listObjectsOptions._encoding_type = str.toUTF8(bun.default_allocator);
-    listObjectsOptions.encoding_type = listObjectsOptions._encoding_type.?.slice();
+    listObjectsOptions.@"#encoding_type" = str.toUTF8(bun.default_allocator);
+    listObjectsOptions.encoding_type = listObjectsOptions.@"#encoding_type".?.slice();
@@
-    listObjectsOptions._prefix = str.toUTF8(bun.default_allocator);
-    listObjectsOptions.prefix = listObjectsOptions._prefix.?.slice();
+    listObjectsOptions.@"#prefix" = str.toUTF8(bun.default_allocator);
+    listObjectsOptions.prefix = listObjectsOptions.@"#prefix".?.slice();
@@
-    listObjectsOptions._start_after = str.toUTF8(bun.default_allocator);
-    listObjectsOptions.start_after = listObjectsOptions._start_after.?.slice();
+    listObjectsOptions.@"#start_after" = str.toUTF8(bun.default_allocator);
+    listObjectsOptions.start_after = listObjectsOptions.@"#start_after".?.slice();
```

Also applies to: 552-624
50-56: Fix the `checksumAlgorithme` typo in the public API. The struct field and JS property are misspelled (`Algorithme`), which is inconsistent with the parsed XML tag and the expected camelCase naming. This is user-facing and will surprise consumers.

🐛 Suggested fix
```diff
 const S3ListObjectsContents = struct {
     key: []const u8,
     etag: ?bun.ptr.OwnedIn([]const u8, bun.allocators.MaybeOwned(bun.DefaultAllocator)),
     checksum_type: ?[]const u8,
-    checksum_algorithme: ?[]const u8,
+    checksum_algorithm: ?[]const u8,
@@
-    var checksum_algorithme: ?[]const u8 = null;
+    var checksum_algorithm: ?[]const u8 = null;
@@
     } else if (strings.eql(inner_tag_name_or_tag_end, "ChecksumAlgorithm")) {
         if (strings.indexOf(xml[i..], "</ChecksumAlgorithm>")) |__tag_end| {
-            checksum_algorithme = xml[i .. i + __tag_end];
+            checksum_algorithm = xml[i .. i + __tag_end];
             i = i + __tag_end + 20;
         }
     }
@@
-    .checksum_algorithme = checksum_algorithme,
+    .checksum_algorithm = checksum_algorithm,
@@
-    if (item.checksum_algorithme) |checksum_algorithme| {
-        objectInfo.put(globalObject, jsc.ZigString.static("checksumAlgorithme"), try bun.String.createUTF8ForJS(globalObject, checksum_algorithme));
+    if (item.checksum_algorithm) |checksum_algorithm| {
+        objectInfo.put(globalObject, jsc.ZigString.static("checksumAlgorithm"), try bun.String.createUTF8ForJS(globalObject, checksum_algorithm));
     }
```

Also applies to: 155-157, 265-266, 311-315, 421-424
239-244: Parser can skip the tail of the XML response. The loop bounds use `xml[delete_result_pos..].len` but index from `xml[i]`, which truncates scanning when `<ListBucketResult>` isn't at offset 0. That can drop elements near the end of the response.

🐛 Suggested fix
```diff
-    if (strings.indexOf(xml, "<ListBucketResult")) |delete_result_pos| {
-        var i: usize = 0;
-        while (i < xml[delete_result_pos..].len) {
+    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+        var i: usize = list_result_pos;
+        while (i < xml.len) {
```
236-237: Avoid leaking `contents`/`common_prefixes` when `<ListBucketResult>` is missing. These lists are allocated unconditionally but only deinitialized inside the `if` branch. If the XML is malformed or the tag is absent, the allocations leak.

🐛 Suggested fix
```diff
-    var contents = std.array_list.Managed(S3ListObjectsContents).init(bun.default_allocator);
-    var common_prefixes = std.array_list.Managed([]const u8).init(bun.default_allocator);
-
-    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+    if (strings.indexOf(xml, "<ListBucketResult")) |list_result_pos| {
+        var contents = std.array_list.Managed(S3ListObjectsContents).init(bun.default_allocator);
+        var common_prefixes = std.array_list.Managed([]const u8).init(bun.default_allocator);
         ...
     }
```

Also applies to: 534-536
src/s3/client.zig (1)
284-300: Metadata ownership ambiguity requires clarification or defensive cloning. Metadata is accepted by value but stored directly in request/multipart state without cloning, unlike the path and content_type parameters, which are explicitly cloned with `.dupe()`. This creates inconsistent ownership semantics: the request task will deinit the metadata (simple_request.zig:170, multipart.zig:318), but there is no defensive clone to prevent a double-free if a caller reuses or deinits the original map.

Either document this as a transfer-of-ownership API (the caller must not deinit), or apply defensive cloning consistent with the other string parameters using `metadata.dupe(allocator)`.

Also applies to: 319-366, 512-559
🤖 Fix all issues with AI agents
In `@src/bun.js/webcore/S3Stat.zig`:
- Around line 56-68: The length check excludes headers whose name equals the
prefix exactly and should be defensive: change the condition from
header.name.len > prefix_len to header.name.len >= prefix_len so the slice
header.name[prefix_len..] is valid for empty suffixes; also ensure failures are
propagated/handled by treating obj.put as fallible—use try (or otherwise handle
its error) when calling obj.put(globalThis, key, value_js). Reference symbols:
the for (headers) loop, header.name.len, prefix_len,
strings.eqlCaseInsensitiveASCII, header.name[prefix_len..],
bun.String.createUTF8ForJS, and obj.put.
In `@src/s3/credentials.zig`:
- Around line 1003-1007: The code currently treats list.items.len == 0 as a
silent fallback to SignedHeaders.get(header_key) via the brk_signed label when
the fixed buffer allocator overflows; instead, change this behavior to surface
the allocator failure by returning an error (or at minimum logging a warning) so
callers know metadata signing was skipped. Locate the check on list.items.len
and the brk_signed SignedHeaders.get(header_key) path and replace the silent
break with an explicit error return (or call to the module logger with a clear
message including allocator/list context) so signature mismatches can be
detected and handled upstream.
- Around line 83-86: When duplicating this.keys/this.values into
new_keys/new_values using allocator.dupe, ensure partial allocations are cleaned
up on failure: if allocator.dupe fails for either a key or a value at index i,
loop backwards to free all previously allocated new_keys[0..i-1] and
new_values[0..i-1], then free the new_keys and new_values arrays themselves (and
any other owned resources) before returning null; update the duplication loop
around new_keys/new_values to handle failures for both key and value
duplications and reference the symbols allocator.dupe, new_keys, new_values,
this.keys, and this.values when implementing the cleanup.
- Around line 411-414: The code silently converts non-string metadata values to
empty strings (values[i] = ""), which hides bad input; change this to match the
function's other type-validation behavior by returning an error or converting
the value to a string. Specifically, in the block that currently sets values[i]
= "", inspect the metadata entry (use the metadata key variable and index i) and
either: 1) return a descriptive error (e.g., error.InvalidMetadataType)
including the metadata key and actual type, or 2) coerce the value to a string
representation before assigning to values[i]; ensure the new behavior is
consistent with existing validation patterns in this function and include the
metadata key in the error message for easier debugging.
In `@src/s3/download_stream.zig`:
- Line 259: The call to processHttpCallback is missing the instance prefix;
update the call site so it invokes this.processHttpCallback(async_http, result)
(i.e., use method-call syntax on this) to match other method calls on the same
object and ensure the method resolves correctly; locate the usage of
processHttpCallback in the function around async_http/result and add the this.
prefix.
In `@src/s3/multipart.zig`:
- Around line 124-125: MultiPartUpload and SimpleRequest are both holding and
deinitting the same MetadataMap, causing a double-free; fix by making
MultiPartUpload duplicate the metadata before passing it into
executeSimpleS3Request (so SimpleRequest owns its copy) using
MetadataMap.dupe(), or alternatively transfer ownership by not
storing/deinitting metadata in one of the structs. Specifically, update the call
sites in executeSimpleS3Request (the invocations from MultiPartUpload at the
locations referenced) to pass metadata.dupe() instead of the original, and
ensure deinit() of SimpleRequest and MultiPartUpload continue to call deinit()
only on the metadata they own.
- S3Stat.zig: Use >= instead of > for prefix length check and skip empty keys
- credentials.zig: Fix memory leak on partial allocation failure in MetadataMap.dupe
- credentials.zig: Throw error for non-string metadata values instead of silent fallback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MultiPartUpload stores metadata and frees it in deinit(). When passing metadata to uploadStream/writableStream, the caller's S3CredentialsWithOptions also frees the same metadata on defer deinit() - causing double-free. Fix: Duplicate metadata before passing to multipart functions, then null out the original to prevent double-free in the caller's cleanup. Also refactors credentials.zig to use errdefer pattern for cleaner memory management in parseMetadataFromJS() and MetadataMap.dupe(). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use method call syntax for processHttpCallback (download_stream.zig:259)
- Return error.MetadataSigningBufferOverflow instead of silently falling back when the signed headers buffer overflows (credentials.zig)
  - Previous behavior: silent fallback caused signature mismatches
  - New behavior: explicit error helps diagnose metadata signing issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace 4 instances of `catch bun.outOfMemory()` with `bun.handleOom()` in the S3 metadata parsing code. The former pattern is banned because it can catch unrelated errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When minioCredentials is undefined (no Docker), the describe callback is still evaluated even though the tests are skipped. Using optional chaining with fallbacks prevents the TypeError during evaluation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/bun.js/webcore/Blob.zig (1)
2715-2731: Guard metadata_dupe on writableStream errors. Both locations (lines 2715 and 2734) have a potential memory leak. When `metadata_dupe` is allocated and passed to `S3.writableStream()`, if the function throws before the task takes ownership, the duplicate metadata leaks because the original metadata is nulled (to prevent double-free) and there is no cleanup for the allocated copy. Line 2734 is particularly vulnerable: it has no defer pattern at all protecting `metadata_dupe`.

Add an `errdefer` to clean up the metadata if `writableStream` fails:

Proposed fix

```diff
-    const metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    var metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
     credentialsWithOptions.metadata = null; // Prevent double-free in deinit
     defer credentialsWithOptions.deinit();
```

```diff
-    const metadata_dupe = if (s3.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    var metadata_dupe = if (s3.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
     return try S3.writableStream(
```

src/bun.js/webcore/fetch.zig (1)
1312-1329: Add errdefer for metadata_dupe to prevent leak if uploadStream fails. Line 1314 creates `metadata_dupe` with `.dupe()`, and line 1315 nulls the original. If `uploadStream` on line 1316 throws an error before taking ownership of the duplicate, `metadata_dupe` leaks. Change `const` to `var` and add `errdefer` to clean it up:

```diff
-    const metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    var metadata_dupe = if (credentialsWithOptions.metadata) |meta| meta.dupe(bun.default_allocator) else null;
+    errdefer if (metadata_dupe) |*meta| meta.deinit(bun.default_allocator);
     credentialsWithOptions.metadata = null; // Prevent double-free in deinit
     _ = try bun.S3.uploadStream(
```
🤖 Fix all issues with AI agents
In `@src/s3/credentials.zig`:
- Around line 124-181: After counting properties in parseMetadataFromJS, enforce
a hard cap using the existing MAX_METADATA_HEADERS constant: if property_count
== 0 return null (keep), but if property_count > MAX_METADATA_HEADERS return an
error (e.g., use globalObject.throwRangeError or a suitable throw* helper) so we
never allocate/produce more entries than the fixed-size metadata arrays. Add
this guard immediately after the property_count loop and before allocating
keys/values (refer to property_count, MAX_METADATA_HEADERS, and
parseMetadataFromJS).
♻️ Duplicate comments (1)
src/bun.js/webcore/S3Stat.zig (1)
56-68: Confirm whether `obj.put` is fallible and should be handled. If `put` can throw or return an error, this path should propagate it to avoid silently dropping metadata entries.

```shell
#!/bin/bash
# Inspect JSValue.put signature to determine if it returns an error union.
rg -n "pub fn put\(" --type=zig
```
- Add guard in parseMetadataFromJS to enforce MAX_METADATA_HEADERS limit (32 headers max) and throw RangeError if exceeded
- Add errdefer to clean up metadata_dupe if writableStream/uploadStream fails before taking ownership (prevents memory leaks in Blob.zig and fetch.zig)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/s3/credentials.zig`:
- Around line 124-190: parseMetadataFromJS currently enforces only
MAX_METADATA_HEADERS but not the total bytes, which can overflow
CANONICAL_REQUEST_BUF_SIZE; add a running counter (e.g., total_metadata_bytes)
and for each key and value increment it by key_owned.len + val_slice.len (plus
any separators if relevant) and if it exceeds the AWS-enforced limit (2048
bytes) return a RangeError via globalObject.throwRangeError; perform this check
before duplicating/allocating (before bun.default_allocator.dupe/alloc for
key_lower and values[i]) to avoid OOM/overflow and mirror the same guard in any
other metadata-parsing code paths that build canonical requests.
♻️ Duplicate comments (1)
src/s3/credentials.zig (1)
972-1005: Handle FixedBufferAllocator overflow in the presigned `SignedHeaders` build. In the presigned branch, `appendSlice` failures are ignored; the resulting `signed_headers` can be truncated without error, yielding invalid signatures. Mirror the overflow handling used in the header-based branch.

🐛 Proposed fix

```diff
-    list.appendSlice("host") catch {};
+    list.appendSlice("host") catch return error.MetadataSigningBufferOverflow;
@@
-    list.appendSlice(";x-amz-meta-") catch {};
-    list.appendSlice(meta.keys[idx]) catch {};
+    list.appendSlice(";x-amz-meta-") catch return error.MetadataSigningBufferOverflow;
+    list.appendSlice(meta.keys[idx]) catch return error.MetadataSigningBufferOverflow;
```
Reviewed snippet (`parseMetadataFromJS` in `src/s3/credentials.zig`):

```zig
fn parseMetadataFromJS(metadata_value: jsc.JSValue, globalObject: *jsc.JSGlobalObject) bun.JSError!?MetadataMap {
    const metadata_obj = try metadata_value.toObject(globalObject);

    // Count properties first
    var property_count: usize = 0;
    var iter = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
    defer iter.deinit();
    while (try iter.next()) |_| {
        property_count += 1;
    }

    if (property_count == 0) return null;

    // Enforce maximum metadata headers limit to prevent buffer overflows
    if (property_count > MAX_METADATA_HEADERS) {
        return globalObject.throwRangeError(@as(i64, @intCast(property_count)), .{
            .min = 0,
            .max = MAX_METADATA_HEADERS,
            .field_name = "metadata properties",
        });
    }

    // Allocate arrays for keys and values
    const keys = bun.handleOom(bun.default_allocator.alloc([]const u8, property_count));
    errdefer bun.default_allocator.free(keys);

    const values = bun.handleOom(bun.default_allocator.alloc([]const u8, property_count));
    errdefer bun.default_allocator.free(values);

    // Track allocations for cleanup on error
    var keys_allocated: usize = 0;
    var values_allocated: usize = 0;

    errdefer {
        for (0..keys_allocated) |j| bun.default_allocator.free(keys[j]);
        for (0..values_allocated) |j| bun.default_allocator.free(values[j]);
    }

    // Second pass to extract keys and values
    var iter2 = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
    defer iter2.deinit();

    var i: usize = 0;
    while (try iter2.next()) |key| {
        // Convert key to lowercase (AWS requires lowercase metadata keys)
        const key_owned = try key.toOwnedSlice(bun.default_allocator);
        defer bun.default_allocator.free(key_owned);

        const key_lower = bun.handleOom(bun.default_allocator.alloc(u8, key_owned.len));
        keys[i] = strings.copyLowercase(key_owned, key_lower);
        keys_allocated = i + 1;

        // Get value - metadata values must be strings
        const val = iter2.value;
        if (!val.isString()) {
            return globalObject.throwInvalidArgumentTypeValue("metadata value", "string", val);
        }

        const val_str = try bun.String.fromJS(val, globalObject);
        defer val_str.deref();
        const val_slice = val_str.toUTF8(bun.default_allocator);
        defer val_slice.deinit();

        values[i] = bun.handleOom(bun.default_allocator.dupe(u8, val_slice.slice()));
        values_allocated = i + 1;

        i += 1;
```
🧩 Analysis chain
🌐 Web query:
AWS S3 user-defined metadata size limit bytes per object
💡 Result:
User-defined metadata per object is limited to 2 KB (2,048 bytes). The PUT request header overall is limited to 8 KB (8,192 bytes). [1][2]
Sources:
- AWS S3 — Working with object metadata (notes on 2 KB user metadata and 8 KB PUT header). [1]
- AWS S3 — Editing object metadata (restates 2 KB user-defined metadata). [2]
🏁 Script executed:

```shell
# Find and examine the credentials.zig file
fd credentials.zig src/
```

Repository: oven-sh/bun
Length of output: 77

🏁 Script executed:

```shell
# Check the file size and structure
wc -l src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 82

🏁 Script executed:

```shell
# Look for the constants mentioned in the review
rg "MAX_METADATA_HEADERS|SIGNED_HEADERS_BUF_SIZE|CANONICAL_REQUEST_BUF_SIZE" src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 1202

🏁 Script executed:

```shell
# Read the parseMetadataFromJS function and surrounding code
sed -n '100,200p' src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 3584

🏁 Script executed:

```shell
# Read lines 464-473
sed -n '450,480p' src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 1559

🏁 Script executed:

```shell
# Read lines 1335-1489
sed -n '1320,1510p' src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 9002

🏁 Script executed:

```shell
# Search for where metadata is added to canonical request to understand buffer usage
rg -A 5 -B 5 "canonical" src/s3/credentials.zig | head -100
```

Repository: oven-sh/bun
Length of output: 5182

🏁 Script executed:

```shell
# Search for any bounds checking on pos variable in canonical request building
rg -A 2 -B 2 "pos >" src/s3/credentials.zig | head -30
```

Repository: oven-sh/bun
Length of output: 37

🏁 Script executed:

```shell
# Also check if there's any validation on metadata size elsewhere
rg "total.*byte|total.*size|metadata.*byte|metadata.*size" src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 149

🏁 Script executed:

```shell
# Check the exact TEMP_BUFFER_SIZE and other related constants
rg "TEMP_BUFFER_SIZE|BUFFER_SIZE" src/s3/credentials.zig
```

Repository: oven-sh/bun
Length of output: 846

🏁 Script executed:

```shell
# Final verification: check if there's any runtime bounds checking around pos assignments
sed -n '1400,1500p' src/s3/credentials.zig | grep -E "(pos|CANONICAL|unreachable|panic)"
```

Repository: oven-sh/bun
Length of output: 2709
Add total metadata size guard in parseMetadataFromJS to prevent buffer overflow in canonical request building.
The current implementation only validates the count of metadata headers (max 32), but does not cap the total byte size. This allows attackers to supply large key/value pairs that overflow the fixed buffers (CANONICAL_REQUEST_BUF_SIZE = 8192 bytes) during signature generation. AWS enforces a 2 KB limit on user-defined metadata, which must be enforced in parsing to prevent silent memory corruption in release builds.
Example overflow: 32 headers × 256 bytes each = 8192 bytes for metadata alone, plus METHOD/PATH/QUERY/base headers = buffer overflow.
Proposed fix
```diff
 fn parseMetadataFromJS(metadata_value: jsc.JSValue, globalObject: *jsc.JSGlobalObject) bun.JSError!?MetadataMap {
     const metadata_obj = try metadata_value.toObject(globalObject);
+    const max_total_bytes: usize = 2048; // AWS user metadata limit
@@
     // Second pass to extract keys and values
     var iter2 = try jsc.JSPropertyIterator(.{ .skip_empty_name = true, .include_value = true }).init(globalObject, metadata_obj);
     defer iter2.deinit();
     var i: usize = 0;
+    var total_bytes: usize = 0;
     while (try iter2.next()) |key| {
@@
         const val_slice = val_str.toUTF8(bun.default_allocator);
         defer val_slice.deinit();
+        total_bytes += "x-amz-meta-".len + key_owned.len + 1 + val_slice.slice().len;
+        if (total_bytes > max_total_bytes) {
+            return globalObject.throwRangeError(@as(i64, @intCast(total_bytes)), .{
+                .min = 0,
+                .max = @as(i64, @intCast(max_total_bytes)),
+                .field_name = "metadata total size",
+            });
+        }
+
         values[i] = bun.handleOom(bun.default_allocator.dupe(u8, val_slice.slice()));
         values_allocated = i + 1;
```

Also applies to: 464-473, 1335-1489
🤖 Prompt for AI Agents
In `@src/s3/credentials.zig` around lines 124 - 190, parseMetadataFromJS currently
enforces only MAX_METADATA_HEADERS but not the total bytes, which can overflow
CANONICAL_REQUEST_BUF_SIZE; add a running counter (e.g., total_metadata_bytes)
and for each key and value increment it by key_owned.len + val_slice.len (plus
any separators if relevant) and if it exceeds the AWS-enforced limit (2048
bytes) return a RangeError via globalObject.throwRangeError; perform this check
before duplicating/allocating (before bun.default_allocator.dupe/alloc for
key_lower and values[i]) to avoid OOM/overflow and mirror the same guard in any
other metadata-parsing code paths that build canonical requests.
Summary
Add support for S3 custom metadata headers (`x-amz-meta-*`) that allow users to attach key-value pairs to S3 objects.

Write path

- Accept `metadata: Record<string, string>` in S3Options
- Convert `{ sku: "12345" }` to `x-amz-meta-sku: 12345` headers
- Works with `write()`, `writer()`, and presigned PUT URLs

Read path

- Return metadata in S3Stats via `stat()`
- Extract `x-amz-meta-*` headers from the HEAD response
- Strip the prefix and return as `metadata: Record<string, string>`

Presigned URLs

- Include metadata as signed headers in PUT URLs
- A client uploading to the presigned URL must include matching headers

Usage
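A hedged sketch of the intended flow, assuming the option shapes described above (bucket, keys, and values are placeholders):

```ts
import { s3 } from "bun";

const file = s3.file("invoices/2026-01.pdf");
const body = new TextEncoder().encode("%PDF-1.7 ...");

// Upload with custom metadata; each pair is sent as an x-amz-meta-* header.
await file.write(body, {
  metadata: { customer: "acme", invoice: "2026-01" },
});

// Read it back: stat() exposes the stored pairs with the prefix stripped.
const { metadata } = await file.stat();
console.log(metadata.customer); // "acme"

// Presigned PUT: metadata is part of the signed headers, so whoever uploads
// to this URL must send matching x-amz-meta-* headers or the signature
// check fails.
const url = file.presign({
  method: "PUT",
  expiresIn: 3600,
  metadata: { customer: "acme" },
});
await fetch(url, {
  method: "PUT",
  headers: { "x-amz-meta-customer": "acme" },
  body,
});
```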
Test plan
- `S3File.write()` and `stat()`
- `S3Client.write()` static method with metadata
- `writer()` and metadata

All 6 metadata tests pass with MinIO via docker-compose.
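A rough sketch of what one of these round-trip tests might look like against a MinIO instance; the endpoint, credentials, bucket, and skip logic are assumptions, and the PR's actual tests may be structured differently:

```ts
import { describe, expect, test } from "bun:test";
import { S3Client } from "bun";

// Placeholder MinIO settings; the real suite pulls these from the test harness
// and skips when Docker/MinIO is unavailable.
const client = new S3Client({
  endpoint: "http://localhost:9000",
  bucket: "bun-test",
  accessKeyId: "minioadmin",
  secretAccessKey: "minioadmin",
});

describe("S3 custom metadata", () => {
  test("round-trips metadata through write() and stat()", async () => {
    const file = client.file(`metadata-${Date.now()}.txt`);

    await file.write("hello", { metadata: { SKU: "12345" } });

    const stat = await file.stat();
    // Keys come back lowercased, per AWS requirements.
    expect(stat.metadata).toEqual({ sku: "12345" });

    await file.delete();
  });
});
```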
Changelog
Added
`metadata` option to S3Options for attaching custom metadata (`x-amz-meta-*` headers) to S3 objects during upload. Added `metadata` property to S3Stats returned by `stat()` to retrieve stored metadata. Keys are automatically lowercased per AWS requirements. Maximum ~2KB total metadata supported.

🤖 Generated with Claude Code
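Since AWS caps user-defined metadata at roughly 2 KB per object, a small pre-flight check on the caller's side can fail fast before attempting an upload. A sketch that only sums key and value bytes (whether the `x-amz-meta-` prefix counts toward the limit is not restated here):

```ts
// Sum the UTF-8 bytes of every metadata key and value and compare against
// the ~2 KB AWS limit for user-defined metadata.
function metadataSize(metadata: Record<string, string>): number {
  const encoder = new TextEncoder();
  let total = 0;
  for (const [key, value] of Object.entries(metadata)) {
    total += encoder.encode(key).length + encoder.encode(value).length;
  }
  return total;
}

const metadata = { sku: "12345", batch: "A7" };
if (metadataSize(metadata) > 2048) {
  throw new RangeError(`S3 metadata is ${metadataSize(metadata)} bytes; limit is 2048`);
}
```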