Best Practices and Personal Tech Notes
by Jeffrey S. Jonas
I've been a Unix/Linux user and developer for over 30 years, starting Sept 1978
with Unix version 6 on the Cooper Union's PDP-11/45.
I've used many versions and derivatives of Unix (and Linux)
so I tend to use tried-and-true portable
tricks of the trade
to avoid getting too dependent on any particular environment.
Code portability used to be a virtue.
At home, I'm running Linux.
Most of my work is from the bash command line, using shortcuts such as
- "du -a | grep filename" to find files, with full pathname
- "locate" instead of "find" (only if ALL the files existed before the most recent database rebuild)
- my .bashrc
- hard vs. soft (symbolic) links
- file permission bits
- essential & required reading
- other rants
- Silly Linux trinkets
Here's my .bashrc, full of useful aliases
hard vs. soft (symbolic) links
There are many reasons to assign more than one name to a file (some of them valid!)
From the very beginning, the classic Unix file system allowed hard links
(primarily due to the way files are described by inodes
but given names by separate directory entries).
How to make hard links
ln(1) or link(2) creates another name for a pre-existing file.
The names do not have to be in the same directory, but must reside within the same file system.
ALL hard link filenames are equally valid.
That has interesting implications. It means that the file continues to exist
even if any of the names are deleted, moved or renamed, so long as at least one name remains.
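A minimal sketch of that behavior in Python, assuming a temporary directory on a file system that supports hard links. The function name hard_link_demo and the file names are made up for illustration; os.link and os.unlink are thin wrappers around link(2) and unlink(2).

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, give it a second name, then remove the first name."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first_name")    # hypothetical file names
        second = os.path.join(tmp, "second_name")
        with open(first, "w") as f:
            f.write("same data\n")

        os.link(first, second)                     # link(2): add another name
        links_with_two_names = os.stat(first).st_nlink
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino

        os.unlink(first)                           # unlink(2): remove one name
        with open(second) as f:                    # the data is still reachable
            survived = f.read()
        links_after_unlink = os.stat(second).st_nlink

    return links_with_two_names, same_inode, survived, links_after_unlink


if __name__ == "__main__":
    print(hard_link_demo())
```

Both names report the same i-number, and removing the first name only drops the link count; the data stays reachable through the second name.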
How to find and track hard links
The link count, as reported by stat(2) or ls(1), is the number of names
a file has
(that's why the system call is unlink(2) instead of delete or remove).
If the link count is >1 then the file has more than one name!
To find all the other names:
- "ls -i" to find the file's i-number (the internal number that really describes the file)
- "df ." (or the file's pathname) to find the filesystem containing the file (the last field)
- "find <filesys> -inum <i-number> -print"
prints all the names of the file.
Note: not all names may be found because you may not have permission to reach all the directories.
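The same search can be sketched in Python, as a rough equivalent of the find -inum step above rather than a replacement for it. The helper name find_names is hypothetical. It compares the device number as well as the i-number, because i-numbers are only unique within one file system.

```python
import os


def find_names(target, root):
    """Return every pathname under root that names the same file as target."""
    wanted = os.lstat(target)
    matches = []
    # os.walk does not follow symbolic links by default and silently skips
    # directories it cannot read, so some names may be missed (see the note above).
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (wanted.st_dev, wanted.st_ino):
                matches.append(path)
    return sorted(matches)
```

For example, find_names("notes.txt", "/home") would list every name of notes.txt under /home that you have permission to reach (both paths here are placeholders).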
What are symbolic links?
Symbolic (or soft) links are another way to give alternate names (and locations) to a file.
A symbolic link alters the name of the file as if it's a character string
(similar to what Windows calls a shortcut).
Unlike hard links, which work only within one file system,
they work on any and all file types, across file systems.
Why I don't like symbolic links
Many utilities are confused by symbolic links and default to NOT following symbolic links
(such as creating backups or TAR files).
Unlike hard links, there's no way to backtrack soft links since they can be ANYWHERE,
even on remote systems that are offline.
They can point to nonexistent files.
There's little to no checking on their creation, allowing directory loops.
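Two of these complaints are easy to reproduce. The sketch below assumes a scratch directory where creating symbolic links is allowed; the function name symlink_pitfalls and the link names are made up for illustration.

```python
import os
import tempfile


def symlink_pitfalls():
    """Show a dangling symbolic link and a directory loop."""
    with tempfile.TemporaryDirectory() as tmp:
        # Nothing checks that the target exists when the link is created.
        dangling = os.path.join(tmp, "dangling")
        os.symlink(os.path.join(tmp, "no_such_file"), dangling)
        link_exists = os.path.lexists(dangling)    # the link itself is there
        target_exists = os.path.exists(dangling)   # following it finds nothing

        # A link inside a directory that points back at that directory:
        # tmp/up, tmp/up/up, tmp/up/up/up ... all name the same place.
        os.symlink(tmp, os.path.join(tmp, "up"))
        loops_back = os.path.samefile(os.path.join(tmp, "up", "up", "up"), tmp)

    return link_exists, target_exists, loops_back


if __name__ == "__main__":
    print(symlink_pitfalls())
```

A tree walker that follows symbolic links would keep descending through "up" until it hits a path-length or link-count limit, which is one reason many utilities refuse to follow them by default.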
Hard links have the following ADVANTAGES:
- All the names work the same (no confusion about follow/no-follow symlinks)
- All file names are equally valid
(delete one name and the file remains so long as ANY name links to the file)
- The link count (in ls or stat) always shows the number of names to the file
- All the names can be found by search: find(1) or ncheck(1)
- Only allowed on leaf-nodes (NON-directories) so the file system tree structure is guaranteed
Hard links have the following DRAWBACKS:
- Limited to files within the file system
- Not supported by all file system types (such as FAT)
- BeeGFS (formerly FhGFS)
supports hard links only within the same directory
because the file system is spread among many servers.
Symbolic/Soft links have the following ADVANTAGES:
- Works for all file types (regular file, directory, special file), even across mount points
- May work in file system dependent ways (allowing new features such as conditional symbolic links)
Symbolic/Soft links have the following DRAWBACKS:
- Works for all file types (may create loops or invalid tree structures)
- May work in file system dependent ways that are inconsistent or unexpected
- May point to nonexistent files (because they were deleted, or not mounted anymore)
- May work differently depending on whether it's a full or relative pathname
- Not all utilities understand symbolic links (will usually follow them, unaware of possible consequences)
Wikipedia also explains
more to come!
This is a work-in-progress.
I will give examples of creating links, side-effects and programs I use to tame them.
file permission bits
One of my peeves: the chmod(2) man page IS STILL WRONG.
First of all, the modes still have their Unix v6 descriptions,
which do not properly describe their current context-sensitive meanings:
S_ISUID  04000  set user ID on execution
S_ISGID  02000  set group ID on execution
S_ISVTX  01000  sticky bit
S_IRUSR  00400  read by owner
S_IWUSR  00200  write by owner
S_IXUSR  00100  execute/search by owner
S_IRGRP  00040  read by group
S_IWGRP  00020  write by group
S_IXGRP  00010  execute/search by group
S_IROTH  00004  read by others
S_IWOTH  00002  write by others
S_IXOTH  00001  execute/search by others
How to describe the mode bits properly yet clearly? Let me try.
File mode = file type + permissions
The file type is immutable: it cannot be changed once a file is created.
In Unix-type file systems, every file has a mode, as shown by ls, stat(2) and such.
Historically, the bits are represented in octal, or may be shown symbolically
(hence old timers using "chmod 0444" instead of "chmod go-w").
"ls -l" shows file types as:
[although file system & OS specific file types may be added]
- Regular file
b Block special file
c Character special file
d Directory
l Symbolic link
p FIFO (p is for pipe)
s Socket
w Whiteout (relates to stacking file systems such as translucent file system)
The lower 12 bits are file permissions, determining access by file owner, group and all-others (world).
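The split between file type and permission bits can be sketched with Python's standard stat module. The helper name describe_mode is made up for illustration; S_IFMT, S_IMODE and filemode are the module's own functions.

```python
import os
import stat


def describe_mode(mode):
    """Split an st_mode value into (file type bits, permission bits, ls-style string)."""
    file_type = stat.S_IFMT(mode)      # the upper bits: regular, directory, ...
    permissions = stat.S_IMODE(mode)   # the lower 12 bits, including setuid/setgid/sticky
    return file_type, permissions, stat.filemode(mode)


if __name__ == "__main__":
    # Inspect the current directory; the output depends on your system.
    file_type, permissions, text = describe_mode(os.lstat(".").st_mode)
    print(oct(file_type), oct(permissions), text)
```

The first character of the ls-style string is the file type letter from the list above; the remaining nine characters fold the setuid, setgid and sticky bits into the execute positions.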
The many meanings of the permission bits
The meanings of the permission bits are
overloaded: they depend on the context.
For directories:
- read allows listing the directory (ls, find, du, open/getdents, shell filename expansion).
Directories with execute permission but NOT read permission
allow filename access IF YOU KNOW THE FILE NAME since listing/reading the directory is forbidden.
- write allows modifying directory entries (create, delete, move, rename files in that directory)
- execute allows searching the directory (using it as part of pathname)
- sticky bit restricts file deletion to the file's owner or directory owner
(useful for shared directories such as /tmp, preventing people from deleting each other's files).
This explains sticky directory nicely.
- set group ID:
according to Wikipedia (setgid on directories),
setting the setgid permission on a directory (chmod g+s)
causes new files and subdirectories created within it to inherit its group ID,
rather than the primary group ID of the user who created the file.
This is not supported for all OS and file system types.
- set user ID works similarly for certain system implementations, see
Wikipedia: setgid on directories.
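The sticky-bit item in the list above can be tried with a short Python sketch that creates a /tmp-style shared directory. The function name make_shared_dir is hypothetical, and whether the kernel then enforces the deletion restriction still depends on the OS and file system.

```python
import os
import stat


def make_shared_dir(path):
    """Create a world-writable directory with the sticky bit set, like /tmp."""
    os.mkdir(path)
    # chmod explicitly, because the mode passed to mkdir is filtered by the umask.
    os.chmod(path, 0o1777)
    return stat.filemode(os.stat(path).st_mode)
```

The trailing "t" in the returned mode string is how ls -l shows a sticky directory that is also searchable by others.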
For symbolic links, the permission bits have no meaning
because that's determined by the target file (which may be of any type: directory, etc.).
For regular files, things get tricky because the Unix file system presents all regular files
as a series of bytes with no structure or record format.
Pure executable (binary) files have no special status to the file system,
although many file formats identify themselves with a header and magic number
as reported by file(1).
When a regular file is accessed by the open(2) or creat(2) system calls (directly or indirectly):
- read allows read(2)
- write allows write(2) (file writing, modification or appending)
- set group ID (with group execute turned off) enables mandatory file locking enforcement:
a conflicting write(2) blocks, or fails with EAGAIN if O_NONBLOCK is enabled,
whereas advisory file locking depends on all processes properly collaborating
with flock(2) or fcntl(2) for file locking.
- execute has no meaning in this context
- set user ID has no meaning in this context
- sticky bit has no meaning in this context
For a regular file containing a pure executable (machine code),
exec(2) interprets the file mode differently:
- execute is required to access the file
- set user ID sets the process effective UID to the file's UID (instead of inheriting it from the exec's environment)
- set group ID similarly sets the process effective GID to the file's GID
- sticky bit used to mean "keep in swap" in old swapping systems.
That allowed faster loading of frequently used binaries (such as the editor).
According to Wikipedia,
a few systems still support it, but not Linux
- read is not required (but debuggers may require it)
- write has no effect where the virtual-memory-system makes code read-only instead of COW (copy-on-write)
(a clue: some systems allow deleting a file while executing, others don't.)
This may be needed for debugging live code.
caveats and details
- File mode handling depends on the OS (operating system, sometimes version specific),
the file system type (ext, reiser) and the way it's implemented on the OS.
- mount(2) options override many properties,
such as read-only, no-set-UID, no special devices, enforce mandatory file locking.
- File permissions are often supplemented by other facilities
such as ACL: Access Control Lists, SELinux, tripwire, etc.
One of my friends
It was originally a good idea for cost savings
but has been perverted into a form of planned obsolescence:
deliberately determining how much can be REMOVED
before an item or design is rendered useless or too unreliable.
Another friend refers to this trend as
"The Race To The Bottom".
That's why most DVD/VHS/CD players are totally dependent on their remote controls:
too many of the buttons and front panel controls have been removed
just to save a few pennies.
A friend's now forced to buy professional video cameras
just for an external mic input.
To me, it's an insult to my profession and an irritation as a user.
Antoine de Saint-Exupery:
"Perfection is achieved, not when there is nothing more to add,
but when there is nothing left to take away."
Sadly, this is being taken to illogical extremes.
For instance: ATAPI/IDE hard drives no longer have any LED for disk activity or diagnostics.
Not even a connector for an external LED
(not that cases or drive bays offer an LED for that anymore).
This is not just for blinkenlights entertainment: it's a useful diagnostic
to assure that the drive is active when expected, particularly for backups,
RAID operation, during formatting, etc.
Similarly, the write protect switch has been eliminated
from hard drives, USB hard drives and most SD-card readers/sockets.
SCSI drives always had those features.
But even Apple and Sun went the economy route and switched to ATAPI drives.
Yea, new drives use the SCSI protocol and offer SMART monitoring,
but most systems don't have any software monitoring or logging of disk errors
(or case temperature or fan tach). It's enough to make a hardware engineer cry.
Which brings me to …
essential and required reading
Donald Norman's The Design of Everyday Things
(originally published as The Psychology of Everyday Things)
ought to be REQUIRED READING for all engineers of all disciplines.
It goes way beyond the user interface.
Read it and take it to heart before your former customers do!
It ranks high enough for inclusion in
this top 13 list.
Anybody who has ever complained that "they don't make things like they used to"
will immediately connect with this book.
Norman's thesis is that when designers fail to understand the processes
by which devices work, they create unworkable technology.
Director of the Institute for Cognitive Sciences at University of California, San Diego,
the author examines the psychological processes needed in operating and comprehending devices.
Examples include doors you don't know whether to push or pull
and VCRs you can't figure out how to program.
Written in a readable, anecdotal, sometimes breezy style,
the book's scholarly sophistication is almost transparent.
Essential reading for Electrical Engineers:
Robert A. Pease, the legendary analog expert
Why did Bob Pease declare himself the Czar of Bandgaps??
Because a lot of people were repeating old mistakes in their new bandgap reference circuits.
Pease has been able to cut down the repetition of old errors.
From here on in, engineers have to make NEW errors.