Skip to content

Translate char as libc::c_char in unsafe#194

Merged
nunoplopes merged 60 commits into
Cpp2Rust:masterfrom
lucic71:u8-as-c_char
Jul 1, 2026
Merged

Translate char as libc::c_char in unsafe#194
nunoplopes merged 60 commits into
Cpp2Rust:masterfrom
lucic71:u8-as-c_char

Conversation

@lucic71

@lucic71 lucic71 commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Translates char as c_char instead of u8 in both unsafe. Refcount is unchanged.

This contains:

  1. change all unsafe rules to use c_char instead of u8
  2. use c_char instead of u8 in mapper and VisitBuiltinType
  3. use c_char in argv for both unsafe
  4. delete the IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc hacks
  5. use the edition 2024 c"" syntax to define c_char compatible string literals
  6. use getTypedLiteral to write 0 as c_char (correct) instead of 0_c_char (incorrect)

The big advantage of this PR is 4. IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc were temporary solutions to interact between our u8 and libc's c_char. After this PR, both functions are deleted.

@lucic71 lucic71 marked this pull request as draft June 13, 2026 19:02
@lucic71 lucic71 marked this pull request as ready for review June 22, 2026 17:45
@lucic71

lucic71 commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@nunoplopes this is ready for review

@nunoplopes

Copy link
Copy Markdown
Contributor

What's the advantage of using c_char vs u8? The code seems a lot more verbose and less idiomatic. Refcount should not use c_char.

@lucic71

lucic71 commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

What's the advantage of using c_char vs u8? The code seems a lot more verbose and less idiomatic. Refcount should not use c_char.

u8 is incompatible with char* or char[] from libc or FFI. We use u8 while libc and FFI use c_char (typedef over i8). The job of IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc was to workaround this discrepancy. I deleted both functions now.

We can use c_char instead of core:ffi::c_char to make the code less verbose and more idiomatic.

It's ok for refcount to use c_char. core::ffi::c_char does not depend on libc and c_char is the correct and portable solution, consider the following code:

#include <cassert>
int main() {
    char a = -2;
    char b = 1;
    assert(a + b == -1);
    return 0;
}

With the u8 translation this fails. With the c_char translation, this is ok.

@nunoplopes

Copy link
Copy Markdown
Contributor

I disagree. I don't think core:ffi::c_char is more idiomatic. Native Rust code does not use that.
The sign of char is platform dependent. It's true that in x86 it's signed (so equivalent to i8). On ARM it's often unsigned (equivalent to u8).

I don't know why the translation of that example fails, but c_char is not the answer, for recount at least.

@lucic71

lucic71 commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

I disagree. I don't think core:ffi::c_char is more idiomatic. Native Rust code does not use that. The sign of char is platform dependent. It's true that in x86 it's signed (so equivalent to i8). On ARM it's often unsigned (equivalent to u8).

I don't know why the translation of that example fails, but c_char is not the answer, for recount at least.

Then for unsafe I will continue with c_char so that I can get rid of IsCharPointerFieldFromLibc and IsCharArrayFieldFromLibc. For refcount we keep u8 instead of c_char because we don't expect to interact with libc.

I don't know why the translation of that example fails

On a platform where char is signed, in a + b both arguments are promoted to signed int, -2 + 1 becomes -1 and the result is downgraded back to char which results in (char) -1. But becaue we translate char as u8, the result becomes: -2i32 as u8 + 1_u8 = 254 + 1 = 255

fn main_0() -> i32 {
    let a: Value<u8> = Rc::new(RefCell::new((-2_i32 as u8)));
    let b: Value<u8> = Rc::new(RefCell::new(1_u8));
    assert!(((((*a.borrow()) as i32) + ((*b.borrow()) as i32)) == -1_i32));
    return 0;
}

@lucic71 lucic71 changed the title Translate char as core::ffi::c_char instead of u8 Translate char as libc::c_char in unsafe Jul 1, 2026
@nunoplopes nunoplopes merged commit 1b372c1 into Cpp2Rust:master Jul 1, 2026
9 checks passed
@lucic71 lucic71 deleted the u8-as-c_char branch July 1, 2026 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants