Jixin Chen <jchen304 at ucsc dot edu>
mentored by Micro Architecture at Santa Cruz Group, SOE and Center for Research in Open Source Software at UC Santa Cruz:
sponsored by Google Summer of Code
When I first dived into the LiveHD project, it did not conform to many of the best practices in the open source world. This is not unexpected for a project that started as academic research and has a limited number of external users at the moment.
My first instinct, upon seeing a long list of commits marked with a red cross ❌ (failing since Jan 2021), was that the project's infrastructure needed a cleanup and that CI should be made green again.
Bazel is a relatively new build system, primarily developed by Google. Bazel projects (e.g. LiveHD) often need to write build rules for non-Bazel dependencies.
rules_hdl aims to consolidate those efforts. It is a set of rules that allows Bazel projects to easily depend on crucial HDL (Hardware Description Language) libraries and tools.
Switching to it lowers the maintenance burden through collaboration, and lets other members of the HDL community benefit from our work.
Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc (a minimal C standard library) and busybox. It is popular for embedded and container use cases.
However, most Linux distributions, including Debian and Enterprise Linux, use glibc (GNU C standard library). Developers and software distributors often make the assumption that a Linux system has glibc. As a result, prebuilt executables are often not runnable on Alpine Linux, and source code often relies on specific glibc behaviors.
Not only do I have to port LiveHD to Alpine Linux, but I also have to port its dependencies. The most challenging tool that I have to port is the build system (Bazel).
rules_hdl patched for better portability, merged upstream
Three types of char are specified: signed, plain, and unsigned. A plain char may be represented as either signed or unsigned, depending upon the implementation, as in prior practice. - Section 3.1.2.5, ANSI C Rationale
x86 and ARM64 diverge on the signedness of char: on Linux, x86 uses signed char but ARM64 uses unsigned char. It is common for developers who only use x86 machines to make the inappropriate assumption that char is signed. To support ARM64, it is necessary to fix those instances.
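As a minimal sketch of what such a fix can look like (the helper names below are hypothetical and not taken from LiveHD), the idea is to spell out the signedness wherever the numeric value of a char matters, so the result no longer depends on the platform default:

```cpp
#include <cassert>
#include <limits>

// True on platforms where plain `char` is signed (typical for x86 Linux),
// false where it is unsigned (typical for ARM64 Linux).
constexpr bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;

// Hypothetical helper: interpret a byte as a signed 8-bit delta (-128..127).
// The explicit `signed char` cast gives the same answer on both platforms.
inline int signed_delta(char raw) {
  return static_cast<signed char>(raw);
}

// Hypothetical helper: use a byte as a table index (0..255).
// Going through `unsigned char` avoids negative indices where `char` is signed.
inline unsigned table_index(char raw) {
  return static_cast<unsigned char>(raw);
}
```

With these helpers, a byte such as '\xff' decodes to -1 through signed_delta and to 255 through table_index, regardless of how the compiler treats plain char.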
Additionally, unlike x86-64, ARM architectures have poor support for unaligned access, which could cause exceptions in some cases. Moreover, code that performs unaligned access usually does so by casting pointers, which violates the strict aliasing rule, and accessing an object through a misaligned pointer is undefined behavior in C/C++.
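A common way to avoid both problems is to copy the bytes with std::memcpy instead of dereferencing a casted pointer. The sketch below assumes a 32-bit value stored at an arbitrary byte offset; load_u32 and store_u32 are illustrative names, not LiveHD functions:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value from any byte address.
// memcpy has no alignment requirement and does not break strict aliasing;
// compilers usually lower it to a single load where the hardware allows.
inline std::uint32_t load_u32(const unsigned char* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper: write a 32-bit value to any byte address.
inline void store_u32(unsigned char* p, std::uint32_t v) {
  std::memcpy(p, &v, sizeof v);
}
```

The value is read back in the host's native byte order, so this sketch only addresses alignment and aliasing, not endianness.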
The last and the most obvious problem is the x86 intrinsics. Intrinsics are minimal wrappers around a small piece of assembly code, which allow developers to use assembly directly, but in a more elegant way. Due to their architecture-specific nature, they have to be replaced with pure C implementations on non-x86 platforms.
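The post does not list which intrinsics were involved, so the following is only an illustration of the pattern, using a population count as a stand-in. The intrinsic path is taken only when the compiler targets x86-64 with POPCNT enabled; everywhere else a plain loop is used. popcount64 is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) && defined(__POPCNT__)
#include <immintrin.h>  // declares _mm_popcnt_u64
#endif

// Hypothetical helper: count the set bits in a 64-bit word.
inline int popcount64(std::uint64_t x) {
#if defined(__x86_64__) && defined(__POPCNT__)
  // x86 path: a thin wrapper around the POPCNT instruction.
  return static_cast<int>(_mm_popcnt_u64(x));
#else
  // Portable path: clear the lowest set bit until none remain.
  int n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
#endif
}
```

Keeping both paths behind one function means callers never mention the architecture, and the fallback can be tested on any machine.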
Assumptions about char signedness found and corrected
macOS is a different operating system:
The first hurdle was a series of error: no member named ??? in namespace 'std' compile errors. It turns out that each version of C++ not only adds features but also removes existing ones. The GCC developers chose to keep implementing the removed features whenever possible, possibly for better compatibility, whereas LLVM chose to be stricter. LiveHD uses C++17, but it contained uses of deprecated and removed standard library functions.
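The post does not name the members that were missing. As one well-known example of this class of error, std::random_shuffle was removed in C++17, and its replacement is std::shuffle with an explicit random engine. The sketch below assumes a hypothetical helper, shuffle_in_place, to show the migration:

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: shuffle a vector using a caller-supplied seed.
// Pre-C++17 code might have written
//   std::random_shuffle(v.begin(), v.end());
// which no longer exists in a strict C++17 standard library.
inline void shuffle_in_place(std::vector<int>& v, unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(v.begin(), v.end(), rng);
}
```

The rewritten call compiles with both libstdc++ and libc++, and the explicit seed makes runs repeatable with a given standard library.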
After I finally got LiveHD to compile, most tests failed, and the logs suggested that no operation had been performed. With some breakpoints in the command line argument parsing functions, I found that invalid arguments were being passed to the main executable 🤨. So I added some echoes before the test script's invocations of the executable. The arguments were passed to the executable unparsed! After more digging, I noticed that getopt produces different results on macOS. A quick search revealed that BSD's getopt behaves differently from the GNU one.
With that trouble gone, only one test failure remained. Weirdly, similar tests in the same group did not fail, and the differences between the results did not make sense. I had to put breakpoints around the crucial functions and switch between macOS and Linux to compare the intermediate results. It turns out that the algorithm behind std::sort is different in libc++, and one of the less functions in LiveHD made incorrect assumptions about the sorting algorithm.
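The post does not say exactly what the assumption was. One common form of this bug is relying on the relative order of elements that compare equal, which std::sort does not guarantee. A minimal sketch of a fix, assuming a hypothetical Pin record and assuming the (level, name) pairs are unique, is to break ties explicitly so the comparator defines a total order:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Pin {
  int level;
  std::string name;
};

// Comparing on `level` alone would leave ties in an unspecified order,
// which can differ between libstdc++ and libc++. Breaking ties on `name`
// makes the sorted result independent of the sorting algorithm.
inline bool pin_less(const Pin& a, const Pin& b) {
  return std::tie(a.level, a.name) < std::tie(b.level, b.name);
}

inline void sort_pins(std::vector<Pin>& pins) {
  std::sort(pins.begin(), pins.end(), pin_less);
}
```

If the original input order of tied elements is what matters, std::stable_sort with the level-only comparison is the other usual option.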
I would like to thank the members of the MASC group and the CROSS staff, especially Professor Renau, for their guidance.
This work is funded by Google, via the Summer of Code program. Google's commitment to open source is much appreciated!
I intend to continue working in the MASC lab, as an undergraduate student and (possibly) in the future as a graduate student. Currently, I am working on other tasks in LiveHD, some assigned to me by Professor Renau.