Translate char as libc::c_char in unsafe by lucic71 · Pull Request #194 · Cpp2Rust/cpp2rust

lucic71 · 2026-06-13T19:02:32Z

Translates char as c_char instead of u8 in unsafe. Refcount is unchanged.

This contains:

1. change all unsafe rules to use c_char instead of u8
2. use c_char instead of u8 in mapper and VisitBuiltinType
3. use c_char in argv for unsafe
4. delete the IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc hacks
5. use the edition 2024 c"" syntax to define c_char compatible string literals
6. use getTypedLiteral to write 0 as c_char (correct) instead of 0_c_char (incorrect)

The big advantage of this PR is item 4. IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc were temporary workarounds for the mismatch between our u8 and libc's c_char. After this PR, both functions are deleted.
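The last item in the list above comes down to how Rust literal suffixes work: a suffix must name a primitive numeric type, and c_char is only a type alias, so 0_c_char does not compile while 0 as c_char does. A minimal sketch, assuming std::ffi::c_char in place of libc::c_char to stay dependency-free; the helper name zeroed_char_buffer is hypothetical and not part of cpp2rust:

```rust
use std::ffi::c_char;

/// Hypothetical helper: a zero-initialized buffer, roughly what a translated
/// `char buf[4] = {0};` could look like.
fn zeroed_char_buffer() -> [c_char; 4] {
    // `0_c_char` would be rejected by the compiler: literal suffixes must be
    // primitive types such as `i8` or `u8`, and `c_char` is an alias.
    [0 as c_char; 4]
}
```

Calling zeroed_char_buffer() yields four zero bytes whether the target defines c_char as i8 or as u8.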

lucic71 · 2026-06-22T17:45:19Z

@nunoplopes this is ready for review

nunoplopes · 2026-06-24T08:02:21Z

What's the advantage of using c_char vs u8? The code seems a lot more verbose and less idiomatic. Refcount should not use c_char.

lucic71 · 2026-06-24T08:32:44Z

> What's the advantage of using c_char vs u8? The code seems a lot more verbose and less idiomatic. Refcount should not use c_char.

u8 is incompatible with char* or char[] from libc or FFI. We use u8 while libc and FFI use c_char (typedef over i8). The job of IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc was to work around this discrepancy. I have now deleted both functions.

We can use c_char instead of core::ffi::c_char to make the code less verbose and more idiomatic.
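To illustrate the discrepancy, here is a small sketch (not taken from the PR) using std::ffi::CStr::from_ptr, which expects a *const c_char. A buffer stored as u8 needs a pointer cast at every such boundary, while a c_char buffer passes through directly. The function names are hypothetical:

```rust
use std::ffi::{c_char, CStr};

/// Length of a NUL-terminated buffer stored as `u8`.
/// The pointer cast bridges `u8` to the `c_char` that the FFI-style API expects.
fn len_of_u8_buffer(buf: &[u8]) -> usize {
    assert!(buf.contains(&0), "buffer must be NUL-terminated");
    // SAFETY: the slice contains a NUL byte, so the read stays inside it.
    let s = unsafe { CStr::from_ptr(buf.as_ptr() as *const c_char) };
    s.to_bytes().len()
}

/// Same operation when the buffer already uses `c_char`: no cast is needed.
fn len_of_c_char_buffer(buf: &[c_char]) -> usize {
    assert!(buf.contains(&0), "buffer must be NUL-terminated");
    // SAFETY: the slice contains a NUL byte, so the read stays inside it.
    let s = unsafe { CStr::from_ptr(buf.as_ptr()) };
    s.to_bytes().len()
}
```

For example, len_of_u8_buffer(b"abc\0") returns 3.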

It's ok for refcount to use c_char. core::ffi::c_char does not depend on libc and c_char is the correct and portable solution, consider the following code:

#include <cassert>
int main() {
    char a = -2;
    char b = 1;
    assert(a + b == -1);
    return 0;
}

With the u8 translation this fails. With the c_char translation, this is ok.

nunoplopes · 2026-06-24T09:40:30Z

I disagree. I don't think core::ffi::c_char is more idiomatic. Native Rust code does not use that.
The sign of char is platform dependent. It's true that on x86 it's signed (so equivalent to i8). On ARM it's often unsigned (equivalent to u8).

I don't know why the translation of that example fails, but c_char is not the answer, for refcount at least.

lucic71 · 2026-06-24T11:27:20Z

> I disagree. I don't think core::ffi::c_char is more idiomatic. Native Rust code does not use that. The sign of char is platform dependent. It's true that on x86 it's signed (so equivalent to i8). On ARM it's often unsigned (equivalent to u8).
>
> I don't know why the translation of that example fails, but c_char is not the answer, for refcount at least.

Then for unsafe I will continue with c_char so that I can get rid of IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc. For refcount we keep u8 instead of c_char because we don't expect to interact with libc.

> I don't know why the translation of that example fails

On a platform where char is signed, in a + b both operands are promoted to signed int, so -2 + 1 evaluates to -1 and the comparison with -1 succeeds. But because we translate char as u8, a is stored as -2_i32 as u8 = 254, and after promotion the sum becomes 254 + 1 = 255, which is not -1:

fn main_0() -> i32 {
    let a: Value<u8> = Rc::new(RefCell::new((-2_i32 as u8)));
    let b: Value<u8> = Rc::new(RefCell::new(1_u8));
    assert!(((((*a.borrow()) as i32) + ((*b.borrow()) as i32)) == -1_i32));
    return 0;
}
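The same arithmetic can be reproduced without the refcount wrappers. The sketch below is illustrative, not generated output, and its function names are hypothetical. It compares the u8 and i8 translations, then shows that c_char simply follows the target's char signedness:

```rust
use std::ffi::c_char;

/// `a + b` with `char` translated as `u8`: -2 wraps to 254 before promotion.
fn sum_with_u8() -> i32 {
    let a = -2_i32 as u8;
    let b = 1_u8;
    a as i32 + b as i32
}

/// The same expression with `char` translated as `i8` (signed `char`).
fn sum_with_i8() -> i32 {
    let a = -2_i32 as i8;
    let b = 1_i8;
    a as i32 + b as i32
}

/// With `c_char`, the result depends on the target's `char` signedness.
fn sum_with_c_char() -> i32 {
    let a = -2_i32 as c_char;
    let b = 1 as c_char;
    a as i32 + b as i32
}

/// True when `c_char` is a signed type on the current target.
fn c_char_is_signed() -> bool {
    (c_char::MIN as i32) < 0
}
```

sum_with_u8() is 255 and sum_with_i8() is -1 on every target. sum_with_c_char() matches the i8 result where c_char_is_signed() is true and the u8 result otherwise, which is the platform dependence raised in the review.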

lucic71 marked this pull request as draft June 13, 2026 19:02

lucic71 marked this pull request as ready for review June 22, 2026 17:45

lucic71 added 23 commits June 25, 2026 22:16

Use C size when size_of diverges between C and Rust

0dca34e

Add byte_size in ByteRepr to capture C size

8fa8bec

Delete unused function

a07a40f

Add correct from_bytes type

7db5002

Update tests

c66a712

Add panic default implementations

5af65da

Implement byte_size for primitives

8084fc9

Allow enums to have byterepr

532c0b5

Don't add ByteRepr for system header types

0573f9c

Ignore unions in ByteRepr for now

32d916a

Update tests

96cd270

Memcpy a pointer to a struct through ErasedPtr

3402042

Add memcpy_struct_bytes test

fb7f030

Update tests

e4bb83e

Merge branch 'master' into anyptr-reinterpreted

d979745

Trigger CI

48d6908

Merge branch 'master' into anyptr-reinterpreted

3436b46

Use reinterpret_cast to round-trip through void

658b9ed

Update tests

ba3fd40

clang-format

93681bd

Fix expected output

70bf770

Merge branch 'master' into void-round-trip

b1705c3

Merge branch 'fix-union' into void-round-trip

dd22145

lucic71 added 19 commits July 1, 2026 14:30

Fix 0 initialization for c_char

97cbfa6

Fix implicit and explicit casts between u8/c_char

7050c08

Fix definition of main to use c_char

6098f86

Convert Vec<c_char> to Vec<u8> when printing

6d0a3e9

Fix char array initialization to use c_char

7206596

Add libcc2rs::char_array to initialize fixed sized arrays

cbcbf3d

cargo fmt

470fc0e

Snapshot/write-back the entire region between offset and end of original alloc

fff36f8

Replace ::core::ffi::c_char with core::ffi::c_char

d28f225

Merge branch 'void-round-trip' into u8-as-c_char

0962678

Use CharRustType() between unsafe and refcount

455df3c

Fix argv for unsafe and refcount

068573a

Update tests

85c0c13

Use c_char in unsafe and u8 in refcount

0f3d428

Fix merge artifact

f299656

Use u8 in rc.rs

49bea74

Use c_char in unsafe and u8 in refcount

574ecf0

Delete libcc2rs::char_array

d9499af

Update tests

3bb2814

lucic71 force-pushed the u8-as-c_char branch from 43865ca to 3bb2814 Compare July 1, 2026 14:17

lucic71 added 4 commits July 1, 2026 15:34

Delete useless cast from rule

3e71cd3

Delete separate Ptr impl for u8

2e89ab2

Drop unsafe from transmute

7830f8a

Use c_char in unsafe

503ea51

lucic71 changed the title ~~Translate char as core::ffi::c_char instead of u8~~ Translate char as libc::c_char in unsafe Jul 1, 2026

lucic71 added 3 commits July 1, 2026 15:43

Update tests

e5356d0

Add safe rules using u8 instead of c_char

0904ff6

Merge branch 'master' into u8-as-c_char

b8ead47

nunoplopes merged commit 1b372c1 into Cpp2Rust:master Jul 1, 2026
9 checks passed

lucic71 deleted the u8-as-c_char branch July 1, 2026 16:09

nunoplopes merged 60 commits into Cpp2Rust:master from lucic71:u8-as-c_char