Towards a better libc primitive for handling path names

Lab: INRIA
Team: CONVECS
Supervisor: Hubert Garavel

Context

The standard C library [1] is used by hundreds of thousands of programmers to develop a wealth of software. This library is available on many operating systems, including BSD, Linux, Solaris, and Windows. It provides many predefined functions for accomplishing basic programming tasks. In particular, two primitives, basename() and dirname(), handle path names, i.e., character strings that denote files and directories [2]. Implementations of these functions can be found, e.g., in [3] and [4].
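As a reminder of how these two primitives behave, here is a minimal sketch that splits a path into its directory part and its final component. The helper name split_path and the fixed 4096-byte scratch buffer are assumptions made for this illustration; only basename() and dirname() from <libgen.h> are real library functions.

```c
#include <assert.h>
#include <libgen.h>   /* POSIX basename() and dirname() */
#include <stdio.h>
#include <string.h>

/* Copies the directory part and the final component of `path` into the
 * caller-supplied buffers `dir` and `base`, each of `size` bytes.
 * Returns 0 on success, -1 if a buffer is too small.
 * split_path is a hypothetical helper, not a libc function. */
int split_path(const char *path, char *dir, char *base, size_t size)
{
    char copy[4096];   /* scratch buffer; the size is an arbitrary choice */
    int n;

    assert(path != NULL && dir != NULL && base != NULL);
    if (strlen(path) >= sizeof copy)
        return -1;

    /* POSIX allows both functions to modify their argument and to return
     * a pointer to static storage, so each call works on a fresh writable
     * copy and its result is copied out immediately. */
    strcpy(copy, path);
    n = snprintf(dir, size, "%s", dirname(copy));
    if (n < 0 || (size_t)n >= size)
        return -1;

    strcpy(copy, path);
    n = snprintf(base, size, "%s", basename(copy));
    if (n < 0 || (size_t)n >= size)
        return -1;

    return 0;
}
```

With "/usr/lib/libc.so" as input, dir receives "/usr/lib" and base receives "libc.so". The defensive copies already hint at how much care the current interface demands from its callers.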

Project description

The proposed work starts from the observation that basename() and dirname() are usable but not optimal for the usual tasks on path names, such as checking whether a file name has a given extension, adding an extension to a file name if it is missing, or replacing the extension of a file name with another one. To perform such tasks, one needs to write extra C code that is cumbersome, repetitive, and error-prone. Although basename() and dirname() have been around and stable for nearly 20 years, it is perhaps time to revisit them in light of the experience accumulated over these two decades.
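To make the problem concrete, the following sketch shows the kind of extra code that programs typically end up writing for extension handling. It is only an illustration of the boilerplate, not the function proposed by this project; the names find_extension, has_extension, and replace_extension are hypothetical, and the rule that a leading dot (as in ".bashrc") does not start an extension is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns a pointer to the '.' that starts the extension of the last
 * component of `path`, or NULL if there is none. A leading dot, as in
 * ".bashrc", is treated as part of the name (assumption of this sketch). */
static const char *find_extension(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *name = (slash != NULL) ? slash + 1 : path;
    const char *dot = strrchr(name, '.');

    return (dot != NULL && dot != name) ? dot : NULL;
}

/* Returns 1 if `path` has extension `ext` (given with its dot), else 0. */
int has_extension(const char *path, const char *ext)
{
    const char *dot;

    assert(path != NULL && ext != NULL);
    dot = find_extension(path);
    return dot != NULL && strcmp(dot, ext) == 0;
}

/* Writes into `out` (of `size` bytes) the string `path` with its extension
 * replaced by `ext`, or with `ext` appended if `path` has no extension.
 * Returns 0 on success, -1 if `out` is too small. */
int replace_extension(const char *path, const char *ext,
                      char *out, size_t size)
{
    const char *dot;
    size_t stem;
    int n;

    assert(path != NULL && ext != NULL && out != NULL);
    dot = find_extension(path);
    stem = (dot != NULL) ? (size_t)(dot - path) : strlen(path);
    n = snprintf(out, size, "%.*s%s", (int)stem, path, ext);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Even this small sketch has to decide how to treat dots in directory names, hidden files, and undersized buffers, which is exactly where hand-written variants tend to diverge.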

Contributions

To address this issue, a new library function should be developed, which would be better adapted to the usual tasks related to path names. The goals of the proposed project are the following:

  • Analyse how basename() and dirname() are used in existing software, e.g., by inspecting large open-source code bases. Document the various code fragments and provide statistics about their usage.
  • Contribute to the specification and implementation (in C language) of this new library function.
  • Make sure that this function is correctly written (no memory leaks, no buffer overflows, no extra memory allocation).
  • Implement a C program that provides a command-line wrapper for this function, in the same way as the Unix programs printf(1), basename(1), and dirname(1) are command-line wrappers for the C library functions printf(3), basename(3), and dirname(3).
  • Write tests to exercise the written C code in all possible cases.
  • Check the resulting C code with automated source-code quality tools, such as Coverity Scan.
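The testing goal above could start from a table-driven harness such as the sketch below. Since the new function is not yet specified, the sketch exercises the existing POSIX dirname() and basename() as stand-ins; the expected values are those listed in the Linux basename(3) man page [2], while the names path_case and run_path_cases are hypothetical.

```c
#include <assert.h>
#include <libgen.h>   /* POSIX basename() and dirname() */
#include <stdio.h>
#include <string.h>

/* One test case: an input path and the expected results. */
struct path_case {
    const char *path;
    const char *dir;
    const char *base;
};

/* Expected results taken from the table in the Linux basename(3) page. */
static const struct path_case cases[] = {
    { "/usr/lib", "/usr", "lib" },
    { "/usr/",    "/",    "usr" },
    { "usr",      ".",    "usr" },
    { "/",        "/",    "/"   },
    { ".",        ".",    "."   },
    { "..",       ".",    ".."  },
};

/* Runs every case and returns the number of mismatches found. */
int run_path_cases(void)
{
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        char copy[64];   /* both functions may modify their argument */

        assert(strlen(cases[i].path) < sizeof copy);

        strcpy(copy, cases[i].path);
        if (strcmp(dirname(copy), cases[i].dir) != 0) {
            fprintf(stderr, "dirname(\"%s\") mismatch\n", cases[i].path);
            failures++;
        }

        strcpy(copy, cases[i].path);
        if (strcmp(basename(copy), cases[i].base) != 0) {
            fprintf(stderr, "basename(\"%s\") mismatch\n", cases[i].path);
            failures++;
        }
    }
    return failures;
}
```

Keeping the cases in a table makes it cheap to add corner cases (empty strings, repeated slashes, very long names) as the specification of the new function takes shape.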

All the software will be released as open source under the BSD license. A long-term goal of this project is to have this code accepted and integrated into major C libraries (e.g., BSD, GNU/Linux, etc.).

References

[1] https://en.wikipedia.org/wiki/C_standard_library

[2] http://man7.org/linux/man-pages/man3/basename.3.html, https://man.openbsd.org/basename.3, https://man.openbsd.org/dirname.3

[3] https://github.com/lattera/glibc/blob/master/string/basename.c, https://github.com/freebsd/freebsd/blob/master/lib/libc/gen/basename.c

[4] https://github.com/lattera/glibc/blob/master/misc/dirname.c, https://github.com/freebsd/freebsd/blob/master/lib/libc/gen/dirname.c