
HDDS-15600. Fix ListObjects response for encoding-type, empty delimiter, and control-character prefix #10586

Open
Gargi-jais11 wants to merge 4 commits into apache:master from Gargi-jais11:HDDS-15600
Gargi-jais11 commented Jun 23, 2026

What changes were proposed in this pull request?

Ozone S3 Gateway fails s3-tests ListObjects cases that AWS S3 passes:

  • test_bucket_list_encoding_basic: With EncodingType=url, keys and CommonPrefixes containing spaces are encoded with + (Java URLEncoder form encoding) instead of AWS-style %20.
    Example: prefix quux ab/ is returned as quux+ab/ but should be quux%20ab/.
  • test_bucket_list_delimiter_empty: When Delimiter='' is sent, listing behavior is correct (all keys returned, no CommonPrefixes), but the response incorrectly includes a Delimiter field. AWS omits Delimiter from the XML when the client passes an empty delimiter.
  • test_bucket_list_prefix_unreadable: ListObjects with Prefix='\x0a' (newline) should echo the prefix in the response and return empty Contents/CommonPrefixes. Ozone always URL-encodes the echoed prefix (returning %0A), even when no encoding-type is requested.
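
The difference between form encoding and the percent encoding expected by the first test can be illustrated with a small standalone sketch. The class and method names here are hypothetical and are not taken from the patch; the sketch only covers the two cases visible in the outputs of this PR (space and `/`) and does not claim to match AWS for every character.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the Ozone code base.
final class S3UrlEncodeSketch {
  private S3UrlEncodeSketch() { }

  /** Percent-encodes a key or prefix the way the expected outputs in this PR look. */
  static String encode(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8")
          // Form encoding writes a space as '+'. A literal '+' in the input
          // is already "%2B" at this point, so this replace only hits spaces.
          .replace("+", "%20")
          // Keep path separators readable, as in "quux%20ab/".
          .replace("%2F", "/");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(encode("quux ab/")); // prints quux%20ab/
    System.out.println(encode("asdf+b"));   // prints asdf%2Bb
  }
}
```

The order of the two replacements does not matter here, because `URLEncoder` has already turned any literal `+` or `%` in the input into `%2B` or `%25` before they run.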

https://ozone.s3.peterxcli.dev/#latest-run-section

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-15600

How was this patch tested?

Before fix:

  1. encoding type:
bash-5.1$ aws s3api list-objects   --bucket buck1   --encoding-type url   --delimiter /   --endpoint-url http://s3g:9878/
{
    "Contents": [
        {
            "Key": "asdf%2Bb",
            "LastModified": "2026-06-19T09:51:39.569Z",
            "ETag": "\"f75b8179e4bbe7e2b4a074dcef62de95\"",
            "Size": 8,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "testuser",
                "ID": "bb2bd7ca4a327f84e6cd3979f8fa3828a50a08893c1b68f9d6715352c8d07b93"
            }
        }
    ],
    "CommonPrefixes": [
        {
            "Prefix": "foo/"
        },
        {
            "Prefix": "foo%2B1/"
        },
        {
            "Prefix": "quux+ab/"                        <----------------- wrong output
        }
    ],
    "RequestCharged": null,
    "Prefix": ""
}
  2. empty delimiter:
bash-5.1$ aws --debug s3api list-objects \
  --bucket buck1 \
  --delimiter "" \
  --endpoint-url http://s3g:9878 2>&1 \
  | grep -oE '<Prefix>[^<]*</Prefix>|<Delimiter>[^<]*</Delimiter>|<KeyCount>[^<]*</KeyCount>'
<Prefix></Prefix>
<KeyCount>0</KeyCount>
<Delimiter></Delimiter>            <---------- should not be present

  3. echoed prefix is always URL-encoded:

bash-5.1$ aws s3api list-objects \
  --bucket buck1 \
  --prefix $'\n' \
  --endpoint-url http://s3g:9878/
{
    "RequestCharged": null,
    "Prefix": "%0A"               <------------------ wrong behaviour
}

After fix:

  1. encoding-type:
bash-5.1$ aws s3api list-objects   --bucket buck1   --encoding-type url   --delimiter /   --endpoint-url http://s3g:9878/
{
    "Contents": [
        {
            "Key": "asdf%2Bb",
            "LastModified": "2026-06-19T09:08:58.933Z",
            "ETag": "\"f75b8179e4bbe7e2b4a074dcef62de95\"",
            "Size": 8,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "testuser",
                "ID": "bb2bd7ca4a327f84e6cd3979f8fa3828a50a08893c1b68f9d6715352c8d07b93"
            }
        }
    ],
    "CommonPrefixes": [
        {
            "Prefix": "foo/"
        },
        {
            "Prefix": "foo%2B1/"
        },
        {
            "Prefix": "quux%20ab/"                <------------------------correct output
        }
    ],
    "RequestCharged": null,
    "Prefix": null
}
  2. empty delimiter:
bash-5.1$ aws --debug s3api list-objects \
  --bucket buck1 \
  --delimiter "" \
  --endpoint-url http://s3g:9878 2>&1 \
  | grep -oE '<Prefix>[^<]*</Prefix>|<Delimiter>[^<]*</Delimiter>|<KeyCount>[^<]*</KeyCount>'

<KeyCount>0</KeyCount>                  <-------------------- no empty delimiter present
  3. echoed prefix should not be URL-encoded:
bash-5.1$ aws s3api list-objects \
  --bucket buck1 \
  --prefix $'\n' \
  --endpoint-url http://s3g:9878/
{
    "RequestCharged": null,
    "Prefix": "\n"
}
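
The empty-delimiter behavior can be sketched as follows. This is a minimal standalone illustration and not the code in the patch: the class and method names are hypothetical, and the idea is simply that an empty or missing delimiter maps to null so that no Delimiter element is written.

```java
// Hypothetical helper, not part of the Ozone code base.
final class DelimiterSketch {
  private DelimiterSketch() { }

  /** Returns the delimiter to echo, or null when the element should be omitted. */
  static String delimiterToEcho(String requested) {
    return (requested == null || requested.isEmpty()) ? null : requested;
  }

  /** Minimal XML escaping for element text, enough for this sketch. */
  static String escape(String text) {
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  /** Renders the Delimiter element, or an empty string when it is omitted. */
  static String renderDelimiter(String requested) {
    String delimiter = delimiterToEcho(requested);
    return delimiter == null ? "" : "<Delimiter>" + escape(delimiter) + "</Delimiter>";
  }

  public static void main(String[] args) {
    System.out.println("[" + renderDelimiter("") + "]");  // prints []
    System.out.println(renderDelimiter("/"));              // prints <Delimiter>/</Delimiter>
  }
}
```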

adoroszlai added the s3 (S3 Gateway) label Jun 23, 2026

adoroszlai left a comment

Thanks @Gargi-jais11 for the patch.

Gargi-jais11 marked this pull request as ready for review June 23, 2026 08:29
Gargi-jais11 marked this pull request as draft June 29, 2026 08:11
Gargi-jais11 marked this pull request as ready for review June 29, 2026 11:57

chungen0126 left a comment

Thanks @Gargi-jais11 for working on this. Overall, looks good to me. Just a nit.

@Gargi-jais11

@ivandika3 please review this patch.

Russole left a comment

Thanks @Gargi-jais11 for the patch. I’ve left a few review comments.

Comment on lines +174 to +176
    if (prefixSpecified) {
      response.setPrefix(EncodingTypeObject.createNullable(prefix, null));
    }

Prefix should also be encoded when encoding-type=url is specified.

AWS docs say EncodingType applies to Delimiter, Prefix, Key, and StartAfter:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#AmazonS3-ListObjectsV2-response-EncodingType

Local test:

aws s3api list-objects \
  --bucket buck1 \
  --prefix $'\n' \
  --encoding-type url \
  --endpoint-url http://localhost:9878/

Actual:

"Prefix": "\n"

Expected:

"Prefix": "%0A"

Looks like Prefix is created with a null encoding type here. Should this use encodingType instead?
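
To make the expectation concrete, here is a minimal standalone sketch of the behavior described above. It does not use the real EncodingTypeObject class; `PrefixEchoSketch` and `echoPrefix` are hypothetical names, and the only assumption is that the prefix is echoed raw unless the request asked for encoding-type=url.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the Ozone code base.
final class PrefixEchoSketch {
  private PrefixEchoSketch() { }

  /** Echoes the prefix raw, or percent-encoded when encodingType is "url". */
  static String echoPrefix(String prefix, String encodingType) {
    if (prefix == null || !"url".equals(encodingType)) {
      return prefix; // no encoding requested: echo exactly what the client sent
    }
    try {
      return URLEncoder.encode(prefix, "UTF-8")
          .replace("+", "%20")   // form-encoded space to percent-encoded space
          .replace("%2F", "/");  // keep path separators readable
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(echoPrefix("\n", "url"));              // prints %0A
    System.out.println("\n".equals(echoPrefix("\n", null)));  // prints true
  }
}
```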
