When archiving large folders for storage across multiple DVDs, encryption can be added to enhance security. The following details the steps for archiving, splitting, and encrypting, with a focus on flexible and secure methods.
To efficiently manage large folders for single-layer DVD storage (approx. 4.5GB), the tar and split commands can be combined:
tar -zcvf - folder_name | split -b 4500M - archive_name.tar.gz.
The tar command creates a gzipped .tar.gz archive and writes it to standard output; split reads that stream and cuts it into 4.5GB chunks. The chunks are named sequentially, e.g., archive_name.tar.gz.aa, archive_name.tar.gz.ab, etc. To restore the split archive, concatenate the parts and extract:
cat archive_name.tar.gz.* > full_archive.tar.gz
tar -zxvf full_archive.tar.gz
Alternatively, both steps may be combined into one:
cat archive_name.tar.gz.* | tar -zxvf -
type archive_name.tar.gz.* | tar -zxvf - (Windows)
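On macOS/Linux, the round trip can be verified byte-for-byte with checksums before burning. The following self-contained sketch demonstrates the split/concatenate round trip on scratch data (cksum is POSIX; any hash tool works the same way):

```shell
# Demonstrate that split + cat reproduces the original byte-for-byte.
# A 1 MiB random file stands in for the real tarball.
tmpdir=$(mktemp -d)
head -c 1048576 /dev/urandom > "$tmpdir/archive_name.tar.gz"
split -b 300k "$tmpdir/archive_name.tar.gz" "$tmpdir/archive_name.tar.gz."
cat "$tmpdir"/archive_name.tar.gz.* > "$tmpdir/full_archive.tar.gz"
orig=$(cksum < "$tmpdir/archive_name.tar.gz")
rest=$(cksum < "$tmpdir/full_archive.tar.gz")
echo "original: $orig"
echo "restored: $rest"
```

The same check applies to the real archive: compare the checksum of the concatenated parts against a checksum recorded before splitting.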
The split size can be adjusted to match the target DVD capacity:

# Single-layer DVD (4.7GB):
FOLDER="./folder_name" && tar -zcvf - "$FOLDER" | split -b 4480M - "${FOLDER}.tar.gz."

# Dual-layer DVD (8.5GB):
FOLDER="./folder_name" && tar -zcvf - "$FOLDER" | split -b 8150M - "${FOLDER}.tar.gz."
In each case, the archive is split into chunks that can fit onto DVDs, with the files being named sequentially.
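For planning, the number of discs follows directly from ceil(archive size / chunk size). A small illustrative helper (the 12 GiB figure is just an example):

```shell
# parts = ceil(total_MB / chunk_MB), computed with integer arithmetic.
parts_needed() { awk -v total="$1" -v chunk="$2" 'BEGIN { print int((total + chunk - 1) / chunk) }'; }

sl=$(parts_needed 12288 4480)   # 12 GiB archive, single-layer chunks
dl=$(parts_needed 12288 8150)   # 12 GiB archive, dual-layer chunks
echo "single-layer discs needed: $sl"
echo "dual-layer discs needed:   $dl"
```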
Encryption can be added using external tools like GPG or OpenSSL since neither tar nor split directly support password protection.
GPG provides AES-256 symmetric encryption for securing the tarball with a password.
tar -zcvf - folder_name | gpg --symmetric --cipher-algo AES256 -o archive_name.tar.gz.gpg
split -b 4500M archive_name.tar.gz.gpg archive_name_split.
cat archive_name_split.* > full_archive.tar.gz.gpg
gpg -o full_archive.tar.gz -d full_archive.tar.gz.gpg
tar -zxvf full_archive.tar.gz
This restores the archive by combining the parts, decrypting, and extracting the tarball.
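The three restore commands can also be collapsed into a single pipeline: cat archive_name_split.* | gpg -d | tar -zxvf -. A minimal end-to-end sketch on scratch data (assuming gpg is installed; the --batch/--passphrase flags are only for the non-interactive demo, and GNUPGHOME is redirected so the real keyring is untouched):

```shell
set -e
tmp=$(mktemp -d)
export GNUPGHOME="$tmp/gnupg"
mkdir -m 700 "$GNUPGHOME"
mkdir "$tmp/folder_name"
echo "hello" > "$tmp/folder_name/data.txt"
cd "$tmp"

# Archive, encrypt, split (as in the steps above, on scratch data)
tar -zcf - folder_name \
  | gpg --batch --pinentry-mode loopback --passphrase demo-pass \
        --symmetric --cipher-algo AES256 -o archive_name.tar.gz.gpg
split -b 1k archive_name.tar.gz.gpg archive_name_split.

# Restore in one pipeline: concatenate -> decrypt -> extract
mkdir restored && cd restored
cat ../archive_name_split.* \
  | gpg --batch --pinentry-mode loopback --passphrase demo-pass -d 2>/dev/null \
  | tar -zxf -
cat folder_name/data.txt
```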
OpenSSL is another option for adding password-based encryption to the tar archive.
tar -zcvf - folder_name | openssl enc -aes-256-cbc -e -k 'password' -out archive_name.tar.gz.enc
split -b 4500M archive_name.tar.gz.enc archive_name_split.
cat archive_name_split.* > full_archive.tar.gz.enc
openssl enc -aes-256-cbc -d -k 'password' -in full_archive.tar.gz.enc -out full_archive.tar.gz
tar -zxvf full_archive.tar.gz
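Note that modern OpenSSL (1.1.1+) warns that the legacy key derivation behind a bare -k is weak; adding -salt and -pbkdf2 with an iteration count is the usual fix. A round-trip sketch on scratch data (these flags assume OpenSSL 1.1.1+; LibreSSL builds may not accept -pbkdf2):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir folder_name
echo "secret" > folder_name/data.txt

# Encrypt with salted PBKDF2 key derivation instead of the legacy scheme
tar -zcf - folder_name \
  | openssl enc -aes-256-cbc -e -salt -pbkdf2 -iter 100000 -k 'password' \
      -out archive_name.tar.gz.enc

# Decrypt with matching parameters and extract in one pipeline
mkdir restored && cd restored
openssl enc -aes-256-cbc -d -pbkdf2 -iter 100000 -k 'password' \
    -in ../archive_name.tar.gz.enc \
  | tar -zxf -
cat folder_name/data.txt
```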
This process ensures that large folders can be efficiently archived, split, and encrypted for secure storage across multiple DVDs.
sips (Scriptable Image Processing System)

To convert a PNG file to a JPEG file, the following command can be used:
sips -s format jpeg input.png --out output.jpg
- sips: The command-line utility for image processing on macOS.
- -s format jpeg: Sets the output format to JPEG.
- input.png: Refers to the input PNG file.
- --out output.jpg: Specifies the name of the output file, which will be in JPEG format.

Note: The extensions .jpg and .jpeg are interchangeable; sips processes both formats the same way.
To convert a JPEG (or JPG) file to a PNG file, the following command is used:
sips -s format png input.jpg --out output.png
- -s format png: Sets the output format to PNG.
- input.jpg: Refers to the input JPEG or JPG file.
- --out output.png: Specifies the name of the output file, which will be in PNG format.

Reducing JPG File Sizes (sips & ImageMagick)

On macOS, JPG files in the current directory can be reduced from approximately 3 MB to around 250 KB either by using the built-in sips tool or by employing ImageMagick if it is installed via Homebrew. Two approaches are outlined below:
sips (built-in macOS tool)

The sips command allows adjustment of image quality and compression. Example:
mkdir resized
for f in *.jpg; do
sips -s formatOptions 50 "$f" --out "resized/$f"
done
- -s formatOptions 50 sets JPEG quality on a scale from 0 to 100.
- Output files are written to the resized folder.

The sips utility does not provide a direct way to target a specific file size; formatOptions must be adjusted until the desired average file size is achieved.
ImageMagick offers more precise control of file size. It may be installed via Homebrew as follows:
brew install imagemagick
Once installed, the following command sequence can be applied:
mkdir resized
for f in *.jpg; do
magick "$f" -define jpeg:extent=250kb "resized/$f"
done
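Since JPEG encoders only approximate a size target, it helps to list any outputs still above the limit afterward. A portable sketch using stand-in files (wc -c reports bytes; 250 KB ≈ 256000 bytes):

```shell
# Report which JPEGs exceed the size target.
tmp=$(mktemp -d)
head -c 300000 /dev/zero > "$tmp/big.jpg"     # 300 KB stand-in
head -c 100000 /dev/zero > "$tmp/small.jpg"   # 100 KB stand-in

limit=256000
over=""
for f in "$tmp"/*.jpg; do
  size=$(wc -c < "$f")
  [ "$size" -gt "$limit" ] && over="$over ${f##*/}"
done
echo "over limit:$over"
```

In the real workflow, pointing the loop at resized/*.jpg identifies files that need a lower quality setting or a second compression pass.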
-define jpeg:extent=250kb automatically compresses the output to fit under approximately 250 KB.

Written on September 1, 2025
Batch-Converting HEIC to PNG (sips)

macOS includes a built-in command-line utility called sips (Scriptable Image Processing System), which can convert HEIC images to PNG in batch mode. The following example scans the current directory for all files with the extensions .heic or .HEIC and converts them to PNG format with the same base filenames:
for f in *.(heic|HEIC); do
base="${f%.*}"
sips -s format png "$f" --out "${base}.png"
done
- for f in *.(heic|HEIC); do ... done: Iterates through all files ending with .heic or .HEIC in the current directory (this alternation glob is zsh syntax, the default shell on macOS; in bash, use for f in *.heic *.HEIC instead).
- base="${f%.*}": Removes the file extension from the filename and stores the base name in the variable base.
- sips -s format png "$f" --out "${base}.png": Converts the original HEIC file into a PNG file with the same base name.

Example:
If a file named photo1.HEIC exists, this script will produce photo1.png in the same directory.
Tip: If you want to place all converted files into a separate folder (e.g., converted), create the directory first and adjust the output path like so:
mkdir converted
for f in *.(heic|HEIC); do
base="${f%.*}"
sips -s format png "$f" --out "converted/${base}.png"
done
This approach keeps the original HEIC files intact while storing the converted PNG images in a dedicated folder.
Written on October 10, 2025
This document provides a systematic, reproducible method to prepare and print passport-size photos on macOS while preserving the original aspect ratio and physical dimensions. The guidance covers two reliable workflows: a professional 300 dpi flow and a macOS-friendly 144 dpi flow that avoids unintended enlargement during printing. Practical shell commands and a complete, ready-to-run script are included.
| Item | Details |
|---|---|
| macOS tools | sips (built into macOS), ImageMagick (install via Homebrew) |
| Homebrew (optional) | /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" |
| Install ImageMagick | brew install imagemagick |
| Source image | Front-facing ID photo with adequate headroom and neutral background (PNG or JPG) |
Most ID photos are 3.5 cm × 4.5 cm. The correct pixel size depends on the dpi used for printing. When macOS or printer drivers ignore the intended dpi or default to a different value, physical output scales incorrectly. Correctly setting dpi metadata avoids this behavior.
| Target physical size | DPI assumption | Pixel size (width × height) | Recommended usage |
|---|---|---|---|
| 3.5 cm × 4.5 cm | 300 dpi | 413 × 531 px | Photo labs and professional printers |
| 3.5 cm × 4.5 cm | 144 dpi | 198 × 255 px | macOS desktop printing; minimizes unintended enlargement |
| 3.5 cm × 4.5 cm | 96 dpi | 132 × 170 px | Some Windows desktop environments |
| 3.5 cm × 4.5 cm | 72 dpi | 99 × 128 px | Legacy or dpi-agnostic viewers (not recommended) |
The pixel sizes above are rounded to whole pixels. Setting correct dpi metadata is critical so that macOS interprets the physical size correctly at print time.
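The table values follow directly from px = cm / 2.54 × dpi. A small awk helper (illustrative, not part of the printing workflow) reproduces them:

```shell
# Convert a physical size in cm to pixels at a given dpi, rounded to nearest.
cm_to_px() { awk -v cm="$1" -v dpi="$2" 'BEGIN { printf "%.0f", cm / 2.54 * dpi }'; }

w300=$(cm_to_px 3.5 300); h300=$(cm_to_px 4.5 300)
w144=$(cm_to_px 3.5 144); h144=$(cm_to_px 4.5 144)
echo "300 dpi: ${w300} x ${h300} px"   # 413 x 531
echo "144 dpi: ${w144} x ${h144} px"   # 198 x 255
```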
This flow yields a 413 × 531 px image tagged at 300 dpi, suitable for professional printing environments.
sips -Z 531 input.jpg --out passport_300dpi.jpg
sips --setProperty dpiWidth 300 --setProperty dpiHeight 300 passport_300dpi.jpg
sips -c 531 413 passport_300dpi.jpg --out passport_300dpi_exact.jpg
montage passport_300dpi.jpg passport_300dpi.jpg passport_300dpi.jpg \
passport_300dpi.jpg passport_300dpi.jpg passport_300dpi.jpg \
-geometry 413x531+50+50 -tile 2x3 -background white -density 300 A4_passport_300dpi.pdf
This flow has proven robust on macOS desktops, preventing oversized prints by aligning image pixels and dpi metadata with common macOS interpretations.
sips -Z 255 input.jpg --out passport_144dpi.jpg
sips --setProperty dpiWidth 144 --setProperty dpiHeight 144 passport_144dpi.jpg
montage passport_144dpi.jpg passport_144dpi.jpg passport_144dpi.jpg \
passport_144dpi.jpg passport_144dpi.jpg passport_144dpi.jpg \
-geometry 198x255+40+40 -tile 2x3 -background white -density 144 A4_passport_144dpi.pdf
When unexpected enlargement occurs on macOS, the 144 dpi flow is typically the most reliable solution.
Written on October 12, 2025
A reproducible command-line workflow for (i) rotating all .HEIC files in a directory by 90° counterclockwise, then (ii) generating a single multi-page PDF named bible.pdf in alphabetical order.
Inputs are the *.HEIC files in bible/; intermediates are rotated/ and rotated/_tmp_png/; the final output is rotated/bible.pdf.

| Step | Working directory | Command | Purpose / notes |
|---|---|---|---|
| 1 | bible/ | mkdir rotated | Create a destination folder to avoid overwriting originals. |
| 2 | bible/ | for f in *.HEIC; do sips -r -90 "$f" --out "rotated/$f"; done | Rotate all HEIC files by 90° counterclockwise and write rotated copies into rotated/. |
| 3 | rotated/ | mkdir _tmp_png | Create a staging folder for PNG conversion. |
| 4 | rotated/ | for f in $(ls *.HEIC \| sort); do sips -s format png "$f" --out "_tmp_png/${f%.HEIC}.png" >/dev/null; done | Convert rotated HEIC files to PNG in alphabetical order, outputting into _tmp_png/. |
| 5 | rotated/_tmp_png/ | img2pdf $(ls *.png \| sort) -o ../bible.pdf | Attempt PDF compilation; failed initially because img2pdf was not installed. |
| 6 | rotated/_tmp_png/ | sips -s format pdf $(ls *.png \| sort) --out ../bible.pdf | Attempt via sips; failed because sips does not merge multiple inputs into a single PDF when --out is a file path. |
| 7 | rotated/_tmp_png/ | brew install img2pdf | Install the missing PDF merge utility. |
| 8 | rotated/_tmp_png/ | img2pdf $(ls *.png \| sort) -o ../bible.pdf | Successful creation of bible.pdf in alphabetical order. |
Question. How to rotate all *.HEIC images in the current directory by 90° counterclockwise using macOS command line?
Answer. Use sips (built-in) for batch rotation. For safe output (no overwrite), create an output directory and write rotated copies there.
mkdir rotated
for f in *.HEIC; do
  sips -r -90 "$f" --out "rotated/$f"
done
Notes:
- -r -90 applies 90° counterclockwise rotation.
- Rotated copies are written to rotated/.

Question. After rotation, how to combine all HEIC files into bible.pdf after alphabetical sorting?
Answer. Convert HEIC to PNG in sorted order, then merge PNGs into a multi-page PDF. This avoids HEIC-to-PDF edge cases and ensures stable ordering.
cd rotated
mkdir _tmp_png
for f in $(ls *.HEIC | sort); do
  sips -s format png "$f" --out "_tmp_png/${f%.HEIC}.png" >/dev/null
done
Notes:
- ls *.HEIC | sort enforces alphabetical processing.
- ${f%.HEIC}.png maps each HEIC filename to a PNG filename.
- Redirecting to /dev/null suppresses conversion logs.

Question. How to merge the sorted PNG files into a single PDF?
Answer. Use img2pdf for deterministic, multi-page PDF generation. If not installed, install it with Homebrew.
cd _tmp_png
brew install img2pdf
img2pdf $(ls *.png | sort) -o ../bible.pdf
cd ..
Question. Why did img2pdf initially fail with zsh: command not found: img2pdf?
Answer. The utility was not installed in the environment. Installing it via brew install img2pdf resolved the missing command.
Question. Why did sips -s format pdf ... --out ../bible.pdf fail with an error indicating no destination directory?
Answer. For multiple input files, sips expects --out to be a destination directory, not a single output file. sips is suitable for one-image-to-one-PDF conversions but is not suitable for reliably merging multiple images into one PDF via a single output file path.
# Example of the failing pattern (multiple inputs -> single file via sips)
sips -s format pdf $(ls *.png | sort) --out ../bible.pdf
Rotate HEIC images into a separate directory.
mkdir rotated
for f in *.HEIC; do
  sips -r -90 "$f" --out "rotated/$f"
done
Convert rotated HEIC to PNG in sorted order.
cd rotated
mkdir _tmp_png
for f in $(ls *.HEIC | sort); do
  sips -s format png "$f" --out "_tmp_png/${f%.HEIC}.png" >/dev/null
done
Install and use img2pdf to create a single multi-page PDF.
cd _tmp_png
brew install img2pdf
img2pdf $(ls *.png | sort) -o ../bible.pdf
cd ..
Confirm PDF presence and size.
ls -lh bible.pdf
Open the PDF for visual verification.
open bible.pdf
Remove intermediate PNG files after confirming output.
rm -rf _tmp_png
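The steps above can be collected into a single script. Since the workflow depends on macOS-only sips plus img2pdf, the sketch below only writes the script out and syntax-checks it; the filename heic2pdf.sh is arbitrary:

```shell
# Write the consolidated workflow to a script file and syntax-check it.
tmp=$(mktemp -d)
cat > "$tmp/heic2pdf.sh" <<'EOF'
#!/bin/sh
# Rotate all HEIC files 90 degrees CCW, convert to PNG in sorted order,
# merge into bible.pdf, then remove the PNG staging folder.
set -e
mkdir -p rotated rotated/_tmp_png
for f in *.HEIC; do
  sips -r -90 "$f" --out "rotated/$f"
done
cd rotated
for f in *.HEIC; do
  sips -s format png "$f" --out "_tmp_png/${f%.HEIC}.png" >/dev/null
done
cd _tmp_png
img2pdf $(ls *.png | sort) -o ../bible.pdf
cd ..
rm -rf _tmp_png
EOF
sh -n "$tmp/heic2pdf.sh" && echo "syntax OK"
```

Run it from the bible/ directory on a Mac with img2pdf installed.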
Written on December 29, 2025
Trimming Videos with FFmpeg or Graphical Editors on Windows and macOS, and Resolving Homebrew PATH Issues

This guide provides a structured approach to trimming a specific section from a video using either FFmpeg or a graphical editor on Windows and macOS. It explains how to install and use FFmpeg, explores alternative editing methods, and offers troubleshooting steps to resolve PATH issues when installing FFmpeg via Homebrew on macOS.
FFmpeg is a free, open-source tool that supports a wide range of audio and video operations. It is available on both Windows and macOS.
| Platform | Installation Steps |
|---|---|
| Windows | Download an official build linked from ffmpeg.org, extract it, and add the bin folder to the system PATH. |
| macOS | Install Homebrew if necessary, then run brew install ffmpeg. |
Note: If Homebrew is used on Apple Silicon (M1/M2) Macs, binaries often reside in
/opt/homebrew/bin. On Intel Macs, they often reside in/usr/local/bin.
Once FFmpeg is installed, the following command trims a segment from abc.mp4—starting at 00:00:49 and ending at 00:04:41—and saves the trimmed content into abc_edited.mp4:
ffmpeg -i abc.mp4 -ss 00:00:49 -to 00:04:41 -c copy abc_edited.mp4
| Feature | Stream Copy | Re-encoding |
|---|---|---|
| Quality | Original (No quality loss) | May degrade slightly depending on settings |
| Speed | Very fast (no compression needed) | Slower (requires processing and compression) |
| Editing | Limited to cutting/trimming | Flexible (supports format conversion, resizing, etc.) |
| Command Example | -c copy | -c:v libx264 -c:a aac (or other codecs) |

Removing -c copy and specifying codecs (e.g., -c:v libx264 -c:a aac) forces FFmpeg to re-encode:
ffmpeg -i abc.mp4 -ss 00:00:49 -to 00:04:41 -c:v libx264 -c:a aac abc_edited.mp4
Placing -ss before -i can cause FFmpeg to seek to the nearest keyframe, which occasionally introduces slight timing differences. If necessary, experiment with placing -ss either before or after -i:
ffmpeg -ss 00:00:49 -i abc.mp4 -to 00:04:41 -c copy abc_edited.mp4
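As a quick sanity check after trimming, the expected clip length can be computed from the two timestamps and compared against the output's actual duration (e.g., via ffprobe). The helper below is illustrative and not part of the original workflow; here 00:04:41 − 00:00:49 should give 232 seconds:

```shell
# Convert HH:MM:SS to seconds, then subtract to get the expected length.
to_sec() { echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'; }

start=$(to_sec 00:00:49)
end=$(to_sec 00:04:41)
echo "expected trimmed length: $((end - start)) seconds"
```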
In most cases, placing -ss after -i is sufficient with -c copy.

Although FFmpeg is command-line based, some may prefer graphical methods. These editors typically re-encode video, which can take longer and potentially reduce quality, but they offer an intuitive visual interface.
Windows (Photos / Video Editor): Right-click abc.mp4 and select Open with → Photos (or Video Editor), trim the desired range, and save the result as abc_edited.mp4.

macOS (e.g., iMovie): Import abc.mp4 into a new or existing project timeline, trim the clip, and export it as abc_edited.mp4.

Occasionally, macOS users who install FFmpeg via Homebrew experience “command not found” errors. This typically indicates that the shell cannot locate the installed FFmpeg binary, often due to PATH misconfiguration.
brew list ffmpeg
or
brew info ffmpeg
These commands display details about the FFmpeg package. If no information appears, consider reinstalling:
brew reinstall ffmpeg
Homebrew generally installs software in one of the following directories:
- /opt/homebrew/bin
- /usr/local/bin

To confirm the exact location of FFmpeg, run:
find "$(brew --prefix)" -name ffmpeg -type f
This command returns the full path to the installed ffmpeg binary (for example, /opt/homebrew/bin/ffmpeg).
To see if the correct installation directory is in the PATH, run:
echo $PATH
If /opt/homebrew/bin (Apple Silicon) or /usr/local/bin (Intel) is absent, the shell will not be able to locate FFmpeg.
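Rather than scanning the echo $PATH output by eye, a small helper can test for an exact component match. This sketch is illustrative only; the demo PATH value is made up so the check can be shown for both the present and absent cases:

```shell
# Is a directory on PATH as an exact component (not a substring)?
on_path() { case ":$PATH:" in *":$1:"*) echo yes ;; *) echo no ;; esac; }

demo="/usr/bin:/opt/homebrew/bin:/bin"        # illustrative PATH value
r1=$(PATH=$demo; on_path /opt/homebrew/bin)   # present in the demo PATH
r2=$(PATH=$demo; on_path /usr/local/bin)      # absent from the demo PATH
echo "/opt/homebrew/bin on PATH: $r1"
echo "/usr/local/bin on PATH:    $r2"
```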
Apple Silicon (M1/M2) Macs
If /opt/homebrew/bin is missing, add the following line to the shell configuration file (e.g., ~/.zshrc), then reload:
export PATH="/opt/homebrew/bin:$PATH"
source ~/.zshrc
Intel Macs
If /usr/local/bin is missing (uncommon, but possible), add the following line to the shell configuration file (e.g., ~/.zshrc or ~/.bash_profile):
export PATH="/usr/local/bin:$PATH"
source ~/.zshrc # or source ~/.bash_profile if using bash
Running ffmpeg -version afterward verifies a successful configuration.
Written on February 12, 2025
Trimming and Concatenating Videos on macOS with FFmpeg

FFmpeg is a powerful, open-source multimedia framework capable of handling a wide range of video and audio operations. On macOS, it provides an efficient way to trim, concatenate, and re-encode video clips via command-line instructions. This guide focuses on installing and configuring FFmpeg on macOS, trimming videos (both single and multiple segments), and verifying the tool’s installation path.
brew install ffmpeg
Alternatively, download a static build and move the ffmpeg binary to a convenient directory, such as ~/ffmpeg, then add that directory to the PATH in the shell configuration file (~/.zshrc or ~/.bash_profile):
echo 'export PATH="/opt/homebrew/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
ffmpeg -version
A successful output indicates FFmpeg is correctly installed and accessible.
In some cases, it may be necessary to confirm the exact path where FFmpeg has been installed (for example, when configuring external tools or diagnosing “command not found” errors). The following command uses Homebrew’s prefix to locate the ffmpeg binary:
find "$(brew --prefix)" -name ffmpeg -type f
brew --prefix returns the base directory where Homebrew is installed. On Apple Silicon systems, this is commonly /opt/homebrew; on Intel-based Macs, /usr/local.
$(...) instructs the shell to execute brew --prefix and insert that output into the find command.
find "$(brew --prefix)" -name ffmpeg -type f searches all subdirectories under Homebrew’s prefix for any file named ffmpeg, restricting results to regular files (-type f).
Trimming one continuous portion of a video is simple using -ss (start time), -to (end time), and -c copy (stream copy). Stream copy avoids re-encoding, preserving original quality and saving time.
ffmpeg -i abc.mp4 -ss 00:00:49 -to 00:04:41 -c copy abc_edited.mp4
When multiple non-contiguous sections of a video need to be combined into a single output, there are two primary approaches:
Assume three segments are required from abc.mp4:
# Segment 1: 2:03–3:12
ffmpeg -i abc.mp4 -ss 00:02:03 -to 00:03:12 -c copy part1.mp4
# Segment 2: 3:40–4:03
ffmpeg -i abc.mp4 -ss 00:03:40 -to 00:04:03 -c copy part2.mp4
# Segment 3: 5:02–5:55
ffmpeg -i abc.mp4 -ss 00:05:02 -to 00:05:55 -c copy part3.mp4
Create a text file (mylist.txt) listing each extracted segment in order:
file 'part1.mp4'
file 'part2.mp4'
file 'part3.mp4'
ffmpeg -f concat -safe 0 -i mylist.txt -c copy abc_edited.mp4
The output abc_edited.mp4 contains the segments in the order listed in mylist.txt.

Note: This two-step method is fast and lossless but requires creating multiple intermediate files.
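Rather than writing mylist.txt by hand, it can be generated from the extracted parts so the listing always matches what is on disk. A self-contained sketch with stand-in files:

```shell
# Generate mylist.txt from the part files; the glob expands in sorted order,
# so segments are listed in the sequence part1, part2, part3.
tmpdir=$(mktemp -d)
cd "$tmpdir"
touch part1.mp4 part2.mp4 part3.mp4   # stand-ins for the extracted segments
for f in part*.mp4; do
  printf "file '%s'\n" "$f"
done > mylist.txt
cat mylist.txt
```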
For a one-step method or when advanced processing (like overlays, resizing, or format changes) is needed, FFmpeg’s filter_complex can be used. This process involves re-encoding:
ffmpeg -i abc.mp4 \
-filter_complex "
[0:v]trim=start=123:end=192,setpts=PTS-STARTPTS[v0];
[0:a]atrim=start=123:end=192,asetpts=PTS-STARTPTS[a0];
[0:v]trim=start=220:end=243,setpts=PTS-STARTPTS[v1];
[0:a]atrim=start=220:end=243,asetpts=PTS-STARTPTS[a1];
[0:v]trim=start=302:end=355,setpts=PTS-STARTPTS[v2];
[0:a]atrim=start=302:end=355,asetpts=PTS-STARTPTS[a2];
[v0][a0][v1][a1][v2][a2]concat=n=3:v=1:a=1[v][a]
" \
-map "[v]" -map "[a]" \
-c:v libx264 -c:a aac -crf 18 -preset veryfast abc_edited.mp4
Note: Re-encoding can reduce quality unless CRF or bitrate settings are high, and it generally takes longer than stream copy.
| Criteria | Concat Demuxer | Filter Complex |
|---|---|---|
| Workflow | Two-step (extract → concatenate) | Single command |
| Re-encoding | No (lossless) | Yes (may affect quality unless configured carefully) |
| Speed | Faster (stream copy only) | Slower (due to re-encoding) |
| Flexibility | Limited to trimming and joining | Supports resizing, overlays, format changes, etc. |
Placing -ss before -i can sometimes lead to frame-inaccurate trims in stream copy mode. If exact frame accuracy is critical, consider placing -ss after -i or re-encoding for more precise cuts.
Written on February 21, 2025
Purpose: Provide a compact, reliable set of commands for two recurring tasks.
Assumption: ffmpeg is installed and available in the terminal environment.
| Task | Reliable method used | Output characteristics |
|---|---|---|
| Rotate 90° counterclockwise | -vf "transpose=2" with prores_ks (-profile:v 3) | High-fidelity (visually lossless), larger files, broadly compatible |
| Concatenate MOV files sequentially | filter_complex concat with ProRes + PCM | Stable timestamps, reliable seeking/scrubbing, larger files |
How to rotate a MOV video 90° counterclockwise on macOS, preserving high visual fidelity and avoiding playback/compatibility issues?
Command (single input)
ffmpeg -i INPUT.MOV -vf "transpose=2" -c:v prores_ks -profile:v 3 -c:a copy OUTPUT_ROTATED.mov
Step-by-step procedure
1. Identify the input file (INPUT.MOV).
2. Rotate with the video filter transpose=2.
3. Re-encode the video with prores_ks and copy the audio to the output file.

Notes
- transpose=2: 90° counterclockwise rotation.
- prores_ks -profile:v 3: Produces an editing-friendly, high-quality output (file size may increase significantly).
- -c:a copy: Preserves the original audio without re-encoding (when compatible).

How to concatenate several MOV clips sequentially into a single MOV output, avoiding timestamp errors and maintaining stable playback?
Command (three inputs)
ffmpeg \
-i CLIP_01.MOV -i CLIP_02.MOV -i CLIP_03.MOV \
-filter_complex "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[v][a]" \
-map "[v]" -map "[a]" \
-c:v prores_ks -profile:v 3 \
-c:a pcm_s16le \
-movflags +faststart \
OUTPUT_CONCAT.mov
Step-by-step procedure
1. Prepare the clips in playback order: CLIP_01.MOV, CLIP_02.MOV, CLIP_03.MOV (any filenames are acceptable; placeholders are shown).
2. Run the command above to produce OUTPUT_CONCAT.mov.

Why this method is preferred in practice
Stream-copy concatenation (-c copy) can emit warnings such as Non-monotonic DTS and may produce incorrect timestamps. The filter-based concat method generates a clean, continuous timeline and has proven stable.
Notes
- concat=n=3:v=1:a=1: Concatenates 3 segments, producing 1 continuous video stream and 1 continuous audio stream.
- -c:a pcm_s16le: Uncompressed PCM audio minimizes timestamp edge cases and improves editability (file size increases).
- -movflags +faststart: Improves responsiveness for preview/streaming workflows by placing key metadata earlier in the file.

Which filename conventions support repeated use without rewriting commands?
Rotation naming
- Input: PROJECT_SCENE_TAKE.mov
- Output: PROJECT_SCENE_TAKE_rotCCW90_prores.mov

Concatenation naming

- Inputs: SEG_01.mov, SEG_02.mov, SEG_03.mov
- Output: SEG_01-03_concat_prores.mov

Written on January 1, 2026
ProRes 422 HQ (prores_ks -profile:v 3) is an editing mezzanine codec with a far higher bitrate than typical phone/camera “delivery” MOV files.
As a result, 20 MB source clips commonly expand to hundreds of MB or multiple GB after rotation/concatenation when ProRes HQ is enforced.
| Goal | Method | Video codec | Audio codec | File size | Notes |
|---|---|---|---|---|---|
| Smallest output, good quality | Re-encode | H.264 (libx264) | AAC | Low–Medium | Best general-purpose choice |
| Smaller output than H.264 at same quality | Re-encode | H.265 (libx265) | AAC | Low | Slower encode; compatibility depends on devices/editors |
| Fastest, no quality loss | Stream copy | Copy existing | Copy existing | Same as inputs (no re-encode) | Only works when inputs already match perfectly (codec/params) |
| Editing-friendly, but not huge | Re-encode | ProRes LT / Proxy | PCM or AAC | High | Use profile 1 (LT) or 0 (Proxy) instead of HQ |
ffmpeg -i INPUT.MOV \
-vf "transpose=2" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-c:a aac -b:a 160k \
-movflags +faststart \
OUTPUT_ROTATED.mp4
yuv420p improves broad compatibility.

ffmpeg -i INPUT.MOV \
-vf "transpose=2" \
-c:v prores_ks -profile:v 1 \
-c:a pcm_s16le \
-movflags +faststart \
OUTPUT_ROTATED_PRORES_LT.mov
- -profile:v 1 = ProRes LT (materially smaller than HQ).
- For the smallest ProRes, use -profile:v 0 (Proxy).

This is the smallest and fastest approach, but requires matching codecs/parameters across clips.
Create list.txt:

printf "file '%s'\n" CLIP_01.MOV CLIP_02.MOV CLIP_03.MOV > list.txt
ffmpeg -f concat -safe 0 -i list.txt \
-c copy \
-movflags +faststart \
OUTPUT_CONCAT.mov
If this fails (errors about non-monotonous DTS, differing streams, or missing audio), use re-encode below.
ffmpeg \
-i CLIP_01.MOV -i CLIP_02.MOV -i CLIP_03.MOV \
-filter_complex "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[v][a]" \
-map "[v]" -map "[a]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-c:a aac -b:a 160k \
-movflags +faststart \
OUTPUT_CONCAT.mp4
If frame rates differ, add -r 30 (or -r 60) as appropriate.

Small ┃ H.265 (CRF ~22–26)
┃ H.264 (CRF ~18–23)
┃ ProRes Proxy (profile 0)
┃ ProRes LT (profile 1)
Large ┃ ProRes HQ (profile 3)
- Default recommendation: libx264 -crf 20 -preset slow, AAC 160k
- Editing-friendly alternative: prores_ks -profile:v 1 (LT)
- When inputs already match exactly: -c copy

Written on January 4, 2026
Provide a compact, repeatable workflow to combine two video files into a single output in either side-by-side (left–right) or vertical stack (top–bottom) layouts, including reliable handling for unequal durations.
The examples below use IMG_9438.MOV and ai_yolov8.mp4.

| Goal | Layout | Duration rule | Recommended command | Result |
|---|---|---|---|---|
| Visual comparison, clean ending | Side-by-side (left–right) | Stop at shorter clip | hstack + shortest=1 | Output ends when either input ends |
| Visual comparison, full run | Side-by-side (left–right) | Match longer clip | tpad clone + hstack | Shorter clip freezes on last frame |
| AI result overlay review, stacked views | Vertical (top–bottom) | Stop at shorter clip | vstack + shortest=1 | Output ends when either input ends |
| AI result overlay review, full run | Vertical (top–bottom) | Match longer clip | tpad clone + vstack | Shorter clip freezes on last frame |
Each input is normalized to a shared height (720) while preserving aspect ratio via scale=-1:720.
Pixel aspect ratio is normalized using setsar=1 to avoid unintended stretching.
- Stacking: hstack for left–right, vstack for top–bottom.
- Duration handling: stop at the shorter clip (shortest=1) or freeze the shorter clip's last frame (tpad stop_mode=clone).
- Encoding: libx264 -crf 20 -preset slow for good quality and manageable size.
- Audio is disabled (-an) for clarity in visual comparisons.

ffmpeg -i IMG_9438.MOV -i ai_yolov8.mp4 \
-filter_complex "[0:v]scale=-1:720,setsar=1[v0];[1:v]scale=-1:720,setsar=1[v1];[v0][v1]hstack=inputs=2:shortest=1[v]" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an \
-movflags +faststart \
side_by_side.mp4
- Inputs: IMG_9438.MOV (left) and ai_yolov8.mp4 (right).
- Output: side_by_side.mp4.

ffmpeg -i IMG_9438.MOV -i ai_yolov8.mp4 \
-filter_complex "[0:v]scale=-1:720,setsar=1,tpad=stop_mode=clone:stop_duration=3600[v0];[1:v]scale=-1:720,setsar=1,tpad=stop_mode=clone:stop_duration=3600[v1];[v0][v1]hstack=inputs=2[v]" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an \
-movflags +faststart \
side_by_side_full_length.mp4
- Output: side_by_side_full_length.mp4.
- To limit padding, reduce stop_duration from 3600 to a smaller number (in seconds).

ffmpeg -i IMG_9438.MOV -i ai_yolov8.mp4 \
-filter_complex "[0:v]scale=-1:720,setsar=1[v0];[1:v]scale=-1:720,setsar=1[v1];[v0][v1]vstack=inputs=2:shortest=1[v]" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an \
-movflags +faststart \
vertical_stack.mp4
- Inputs: IMG_9438.MOV (top) and ai_yolov8.mp4 (bottom).
- Output: vertical_stack.mp4.

ffmpeg -i IMG_9438.MOV -i ai_yolov8.mp4 \
-filter_complex "[0:v]scale=-1:720,setsar=1,tpad=stop_mode=clone:stop_duration=3600[v0];[1:v]scale=-1:720,setsar=1,tpad=stop_mode=clone:stop_duration=3600[v1];[v0][v1]vstack=inputs=2[v]" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an \
-movflags +faststart \
vertical_stack_full_length.mp4
- Output: vertical_stack_full_length.mp4.
- To limit padding, reduce stop_duration from 3600 to a smaller number (in seconds).
The default commands above disable audio (-an). To keep audio from a single input, remove -an and add one of the mappings below.
Audio from the first input:

-map 0:a -c:a aac -b:a 160k

Audio from the second input:

-map 1:a -c:a aac -b:a 160k

To keep no audio, keep -an instead.

- hstack is used for side-by-side, and vstack is used for vertical stacking; select [v0][v1] followed by the intended stack operator.
- Both inputs are normalized to the same height (scale=-1:720). For higher resolution, replace 720 with 1080 (or another height) consistently for both inputs.
- If frame rates differ, add -r 30 (or -r 60) to output options.
When two inputs have unequal durations, a naive use of vstack will stop
the output as soon as one input ends. To ensure the final stacked video runs for the
full duration of the longer clip, the shorter clip must be
extended by freezing its final frame.
This is achieved using the tpad filter with
stop_mode=clone, which repeats the last decoded frame for a specified
duration. To avoid unintended infinite or excessively long outputs, the padding
duration should be chosen deliberately.
- tpad=stop_mode=clone freezes the last frame of the shorter clip.
- -t caps the overall output duration explicitly.
ffmpeg -t 26.62 -i IMG_9438.MOV -t 26.62 -i ai_yolov8.mp4 \
-filter_complex "
[0:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
tpad=stop_mode=clone:stop_duration=7.8
[v0];
[1:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1
[v1];
[v0][v1]
vstack=inputs=2
[v]
" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an -movflags +faststart \
vertical_stack_full_length.mp4
In this example, the shorter clip is padded by approximately the duration difference
(26.62 − 18.83 ≈ 7.8 seconds), and the output is explicitly capped at the
longer clip’s duration.
This approach avoids the common pitfall of using arbitrarily large padding values
(e.g. 3600 seconds), which can cause unexpectedly long or seemingly
“never-ending” encodes.
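The padding value can be derived from the two clip durations rather than estimated by hand (in practice the durations would come from ffprobe; the awk helper below is illustrative):

```shell
# stop_duration = longer clip length - shorter clip length, in seconds.
longer=26.62
shorter=18.83
pad=$(awk -v a="$longer" -v b="$shorter" 'BEGIN { printf "%.2f", a - b }')
echo "tpad stop_duration: $pad seconds"
```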
Text labels are inserted before stacking, directly onto each individual video stream. This ensures that each label remains confined to its own panel (top or bottom), rather than spanning the combined output.
The drawtext filter is used with explicit centering logic based on
rendered text width.
- Horizontal centering: x=(w-text_w)/2
- Fixed top margin: y=40
ffmpeg -t 26.62 -i IMG_9438.MOV -t 26.62 -i ai_yolov8.mp4 \
-filter_complex "
[0:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='PyScript version':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12,
tpad=stop_mode=clone:stop_duration=7.8
[v0];
[1:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='JavaScript version':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[v1];
[v0][v1]
vstack=inputs=2
[v]
" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an -movflags +faststart \
vertical_stack_labeled_full_length.mp4
Video tiles are scaled with trunc(iw*720/ih/2)*2 to keep widths even, satisfying H.264 yuv420p requirements.
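A worked example of the even-width expression (illustrative helper, not part of the ffmpeg command):

```shell
# trunc(iw*720/ih/2)*2: scale to 720 high, then round the width down to even.
even_width() { awk -v iw="$1" -v ih="$2" 'BEGIN { print int(iw * 720 / ih / 2) * 2 }'; }

echo "1080x1920 portrait -> $(even_width 1080 1920)x720"   # 404x720
echo "4032x3024 photo    -> $(even_width 4032 3024)x720"   # 960x720
```

Without the trunc/×2 step, the 1080×1920 source would scale to 405×720, and libx264 would reject the odd width in yuv420p mode.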
A common refinement after experimenting with multi-row stacks is to eliminate unused (empty) space and present content as a compact 2×2 grid.
Both hstack and vstack require strict dimensional compatibility:
For a 2×2 grid, the safest pattern is to force every tile to a fixed geometry
(example: 1280×720 per tile), producing a combined frame of 2560×1440.
Static screenshots rarely match the target aspect ratio. A predictable approach is:
scale to fit within the tile, then pad to the exact target size.
This guarantees the final vstack does not fail due to width drift (e.g., 2528 vs 2560).
- Fit: scale=1280:720:force_original_aspect_ratio=decrease
- Pad to the exact tile size: pad=1280:720:(ow-iw)/2:(oh-ih)/2

ffmpeg \
-i IMG_9574.MOV \
-i IMG_9577.MOV \
-i IMG_9581.MOV \
-loop 1 -i third.png \
-filter_complex "
[0:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='Traditional motor w/ remote control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r1l];
[1:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='LEGO Mindstorms w/ manual motor control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r1r];
[r1l][r1r]
hstack=inputs=2
[row1];
[2:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='MicroPython script automated motor control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r2l];
[3:v]
scale=1280:720:force_original_aspect_ratio=decrease,
pad=1280:720:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1,
drawtext=
text='Source code (VS Code capture)':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r2r];
[r2l][r2r]
hstack=inputs=2
[row2];
[row1][row2]
vstack=inputs=2
[v]
" \
-map "[v]" \
-t 35.37 \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an -movflags +faststart \
grid_2x2_videos_plus_code.png.mp4
- -loop 1 -i third.png keeps the screenshot available throughout the encode.
- The static image is normalized to 1280×720 via scale+pad, ensuring row2 becomes exactly 2560 pixels wide.
- -t 35.37 caps the output to the intended master duration (often the longest video).
- If freezing the last frame is preferred over early termination, apply tpad=stop_mode=clone to the shorter video streams.
In multi-panel compositions that combine several videos and static images into a 2×2 grid, visual information is distributed across panels, but audio should typically originate from one authoritative video source.
This section presents a complete, standalone workflow that builds the labeled 2×2 grid from three videos plus a static code screenshot, selects audio from a single designated input, and accounts for shell quoting (macOS zsh). Audio is taken exclusively from Video C.
ffmpeg \
-i IMG_9574.MOV \
-i IMG_9577.MOV \
-i IMG_9581.MOV \
-loop 1 -i third.png \
-filter_complex "
[0:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='Traditional motor w/ remote control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r1l];
[1:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='LEGO Mindstorms w/ manual motor control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r1r];
[r1l][r1r]
hstack=inputs=2
[row1];
[2:v]
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='MicroPython script automated motor control':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r2l];
[3:v]
scale=1280:720:force_original_aspect_ratio=decrease,
pad=1280:720:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1,
drawtext=
text='Source code (VS Code capture)':
x=(w-text_w)/2:
y=40:
fontsize=36:
fontcolor=white:
box=1:
boxcolor=black@0.45:
boxborderw=12
[r2r];
[r2l][r2r]
hstack=inputs=2
[row2];
[row1][row2]
vstack=inputs=2
[v]
" \
-map "[v]" \
-map "2:a?" \
-t 35.37 \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-c:a aac -b:a 192k \
-movflags +faststart \
grid_2x2_videos_plus_code_with_audio.mp4
Inputs are indexed in declaration order. In this command:
- 0 → IMG_9574.MOV
- 1 → IMG_9577.MOV
- 2 → IMG_9581.MOV (audio source)
- 3 → static image (no audio)
The mapping -map "2:a?" selects audio from the third video input.
The trailing ? makes the mapping optional, preventing failure if
the audio stream is absent.
Quotation marks are required on macOS to prevent zsh from interpreting
? as a filename wildcard.
- The even-dimension scale expression trunc(iw*720/ih/2)*2 satisfies H.264 yuv420p requirements.
- Explicit stream selection with -map is deterministic and reproducible.
This workflow composes four echocardiography clips into a labeled 2×2 grid, then appends
a short PNG slide sequence (1.png → 4.png, 2 seconds each)
at the end. Audio is intentionally disabled (-an), which is appropriate for typical echo captures that
carry no meaningful audio stream.
Inputs and labels:
- PLAX.mp4 → label PLAX
- PLAX_c.mp4 → label PLAX (Color Doppler: blood flow across MV & AV)
- PSAX.mp4 → label PSAX
- PSAX_c.mp4 → label PSAX (Color Doppler: blood flow across Pulmonary valve)

Processing steps:
- All inputs are normalized to a common frame rate (fps=30) to reduce irregular timing effects across clips that may have different original frame rates.
- Each clip is scaled to a fixed height (720) while preserving aspect ratio, with even widths enforced for H.264 yuv420p compatibility.
- Labels are applied with drawtext prior to stacking, keeping text confined to each tile.
- The four tiles are combined with xstack, producing a combined frame of approximately 2560×1440.
- The concat filter joins grid + 1.png + 2.png + 3.png + 4.png into a single output.
ffmpeg -y \
-i PLAX.mp4 \
-i PLAX_c.mp4 \
-i PSAX.mp4 \
-i PSAX_c.mp4 \
-loop 1 -t 2 -i 1.png \
-loop 1 -t 2 -i 2.png \
-loop 1 -t 2 -i 3.png \
-loop 1 -t 2 -i 4.png \
-filter_complex "
[0:v]
fps=30,
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='PLAX':
x=(w-text_w)/2:
y=40:
fontsize=40:
fontcolor=white:
box=1:
boxcolor=black@0.50:
boxborderw=14
[r1l];
[1:v]
fps=30,
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='PLAX (Color Doppler\: blood flow across MV & AV)':
x=(w-text_w)/2:
y=40:
fontsize=40:
fontcolor=white:
box=1:
boxcolor=black@0.50:
boxborderw=14
[r1r];
[2:v]
fps=30,
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='PSAX':
x=(w-text_w)/2:
y=40:
fontsize=40:
fontcolor=white:
box=1:
boxcolor=black@0.50:
boxborderw=14
[r2l];
[3:v]
fps=30,
scale=trunc(iw*720/ih/2)*2:720,
setsar=1,
drawtext=
text='PSAX (Color Doppler\: blood flow across Pulmonary valve)':
x=(w-text_w)/2:
y=40:
fontsize=40:
fontcolor=white:
box=1:
boxcolor=black@0.50:
boxborderw=14
[r2r];
[r1l][r1r][r2l][r2r]
xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0:fill=black
[grid];
[4:v]
fps=30,
scale=2560:1440:force_original_aspect_ratio=decrease,
pad=2560:1440:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1
[p1];
[5:v]
fps=30,
scale=2560:1440:force_original_aspect_ratio=decrease,
pad=2560:1440:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1
[p2];
[6:v]
fps=30,
scale=2560:1440:force_original_aspect_ratio=decrease,
pad=2560:1440:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1
[p3];
[7:v]
fps=30,
scale=2560:1440:force_original_aspect_ratio=decrease,
pad=2560:1440:(ow-iw)/2:(oh-ih)/2:color=black,
setsar=1
[p4];
[grid][p1][p2][p3][p4]
concat=n=5:v=1:a=0
[v]
" \
-map "[v]" \
-c:v libx264 -preset slow -crf 20 -pix_fmt yuv420p \
-an \
-movflags +faststart \
echo_grid_2x2_then_4pngs.mp4
Prerequisites:
- PLAX.mp4, PLAX_c.mp4, PSAX.mp4, PSAX_c.mp4 in the working directory.
- 1.png, 2.png, 3.png, 4.png in the same directory.
- The output is echo_grid_2x2_then_4pngs.mp4, with each slide held for two seconds (-t 2).

Customization:
- Label appearance can be tuned via fontsize, boxcolor, and boxborderw while keeping the overlay applied before xstack.
- The tile height can be changed from 720 to 540 or 1080; update the slide scaling target accordingly (grid becomes ~2×width by 2×height).
- Slide duration can be changed from -t 2 per PNG to another value.
- The layout string controls tile placement within xstack.
- Disabling audio (-an) simplifies output when source clips are silent and prevents unexpected stream mapping issues.
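The four slide branches ([4:v] through [7:v]) in the filter graph are identical apart from their output labels, so the filter text can be generated programmatically. The sketch below only prints the fragment; it assumes the same 2560×1440 target and label names used in the command above:

```shell
# Build the repetitive scale+pad filter chain for the PNG slide inputs (4-7).
filter=""
i=4
for label in p1 p2 p3 p4; do
  filter="${filter}[${i}:v]fps=30,scale=2560:1440:force_original_aspect_ratio=decrease,pad=2560:1440:(ow-iw)/2:(oh-ih)/2:color=black,setsar=1[${label}];"
  i=$((i + 1))
done
printf '%s\n' "$filter"
```

Generating the fragment this way keeps the per-slide settings in one place, so changing the grid size or frame rate requires editing a single line.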
Written on January 4, 2026
This workflow documents a pragmatic procedure to reduce file size for multiple GoPro-class MP4 sources and then concatenate them into a single MP4 efficiently. The approach emphasizes operational clarity and reproducibility, so that the procedure may be recalled and repeated without ambiguity at a later time.
| Stage | Action | Output |
|---|---|---|
| Encode | Re-encode each source using a fixed target bitrate. | *_small.mp4 |
| List | Prepare a concat demuxer list file in playback order. | list.txt |
| Join | Concatenate streams without further re-encoding. | GH_combined.mp4 |
The following block is intended to be executed as a single shell workflow. Line breaks are preserved for natural reading and later reuse.
# Step 1 — Re-encode all sources to a smaller, uniform format
for f in GH010880.MP4 GH020880.MP4 GH030880.MP4 GH040880.MP4; do
  base="${f%.MP4}"
  ffmpeg -y -i "$f" \
    -c:v libx264 -preset slow -b:v 900k \
    -c:a aac -b:a 128k \
    -movflags +faststart \
    "${base}_small.mp4"
done

# Step 2 — Create concat list
printf "file 'GH010880_small.mp4'\nfile 'GH020880_small.mp4'\nfile 'GH030880_small.mp4'\nfile 'GH040880_small.mp4'\n" > list.txt

# Step 3 — Concatenate without re-encoding
ffmpeg -f concat -safe 0 -i list.txt -c copy GH_combined.mp4
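The concat list in Step 2 can also be derived from the same filename list with a loop, which avoids typos when the set of clips changes. This is a sketch using the same GoPro filenames as above:

```shell
# Step 2 (alternative) — derive the concat list from the source names.
: > list.txt
for f in GH010880.MP4 GH020880.MP4 GH030880.MP4 GH040880.MP4; do
  printf "file '%s_small.mp4'\n" "${f%.MP4}" >> list.txt
done
cat list.txt
```

Because the loop reuses the exact list from Step 1, the encode and concat stages can never drift out of sync.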
- For smaller files, lower the video bitrate (e.g., -b:v 700k).
- For higher quality, raise it (e.g., -b:v 1200k).

This workflow records a quality-oriented approach that preserves the original spatial resolution while reducing file size through constant-quality encoding. The intent is to maintain visual coherence and fine detail, particularly for action-camera footage, while still enabling fast concatenation.
| Stage | Action | Output |
|---|---|---|
| Encode | Re-encode using CRF to maintain consistent perceptual quality. | *_hq.mp4 |
| List | Create concat demuxer list file. | list.txt |
| Join | Concatenate without re-encoding. | GH_combined_hq.mp4 |
CRF is set to 18 as a high-quality baseline. Line breaks are preserved to maintain readability and recall.
# Step 1 — High-quality re-encode (constant quality, full resolution)
for f in GH010880.MP4 GH020880.MP4 GH030880.MP4 GH040880.MP4; do
  base="${f%.MP4}"
  ffmpeg -y -i "$f" \
    -c:v libx264 -preset slow -crf 18 \
    -pix_fmt yuv420p \
    -c:a aac -b:a 160k \
    -movflags +faststart \
    "${base}_hq.mp4"
done

# Step 2 — Concatenate without re-encoding
printf "file 'GH010880_hq.mp4'\nfile 'GH020880_hq.mp4'\nfile 'GH030880_hq.mp4'\nfile 'GH040880_hq.mp4'\n" > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy GH_combined_hq.mp4
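One practical difference between the two workflows is predictability of output size. With a fixed bitrate, size ≈ (video bitrate + audio bitrate) × duration; with CRF, size depends on scene complexity and cannot be computed in advance. The arithmetic can be sketched in shell (the 20-minute duration is a hypothetical example):

```shell
# Estimate output size in MB for a fixed-bitrate encode.
# Bitrates in kbit/s, duration in seconds; 1 MB = 8000 kbit (decimal).
estimate_mb() {
  v_kbps=$1; a_kbps=$2; dur_s=$3
  echo $(( (v_kbps + a_kbps) * dur_s / 8000 ))
}

estimate_mb 900 160 1200   # 20 min at ~1060 kbit/s total -> 159 (MB)
```

This estimate ignores container overhead, so real files come out slightly larger, but it is close enough for judging whether a batch will fit on a given medium.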
| CRF | Intent | Expected outcome |
|---|---|---|
| 16 | Near-archival | Very large files with strong detail retention. |
| 18 | High-quality baseline | Balanced size reduction with minimal perceptual loss. |
| 20 | Moderate compression | Generally transparent for many scenes. |
| 22 | Size-prioritized | Compression artifacts may become visible. |
Written on February 9, 2026
During undergraduate studies in Computer Science, Emacs was recommended and has been used for over two decades. Familiarity with its shortcuts has facilitated work in C kernel programming and debugging. This document serves both as a guide for readers to grasp the benefits of Emacs and as a resource for personal learning, combining well-known features with newly explored aspects intended for future use.
Emacs is particularly well-suited for individuals who prefer a fully keyboard-driven workflow. This feature enables the execution of virtually any task—be it editing text, managing files, running commands, or browsing the web—without relying on a mouse. Such efficiency stands as one of the most compelling reasons users continue to utilize Emacs even after many years.
- Ctrl+a moves to the beginning of a line.
- Ctrl+e moves to the end of a line.
- Ctrl+f and Ctrl+b move the cursor forward and backward by characters, respectively.

Buffers in Emacs are fundamental components that refer to any open file, running process, or even a help screen. They allow the management of multiple tasks or documents simultaneously without cluttering the workspace with numerous windows or applications.
- Switching buffers: Ctrl+x b. This command presents a list of buffers, enabling quick navigation. Alternatively, Ctrl+x Ctrl+b opens a more detailed buffer list, displaying all current buffers, including unsaved ones, shell outputs, or other processes.
- Splitting windows: Ctrl+x 3 (vertical split) and Ctrl+x 2 (horizontal split) allow viewing multiple buffers concurrently. This is particularly useful for comparing documents, keeping notes open while coding, or reading documentation alongside writing. Additionally, windows can be resized dynamically using Ctrl+x + or Ctrl+x - to adjust the layout according to current tasks.
- Navigating windows: Ctrl+x o cycles through the open windows, enabling seamless multitasking. Each window can display a different buffer, and Emacs retains the window configuration, facilitating easy return to specific setups.

Emacs provides robust tools for compiling and debugging code, which are essential for tasks such as kernel programming in C. These features streamline the development process by integrating compilation and debugging directly within the editor.
| Functionality | Command | Description |
|---|---|---|
| Executing the Compile Command | ESC+x compile | Initiates the compilation process for the current project or file. Prompts for the compile command, which can be customized as needed (e.g., make for kernel programming). |
| Navigating Compilation Errors | Ctrl+x ` (backtick) | Jumps to the next error in the compilation output. Emacs parses the compilation buffer and highlights errors, enabling quick navigation to problematic lines in the source code. |
| Launching GDB | ESC+x gdb | Launches the GNU Debugger (GDB) within Emacs, providing an interface to set breakpoints, step through code, inspect variables, and evaluate expressions directly from the editor. |
| Setting Breakpoints | Ctrl+x SPC (in a source buffer) | Sets a breakpoint at the current line in the source code. Breakpoints allow the debugger to pause execution at specific points, facilitating the inspection of program state. |
| Stepping Through Code | n (next), s (step), c (continue) | Executes the next line of code, steps into functions for detailed inspection, and continues execution until the next breakpoint or end of the program. |
| Inspecting Variables | ESC+x gdb-many-windows | Opens multiple debugging windows, including source code, assembly, registers, and variable lists, aiding in monitoring the state of variables and program flow during debugging sessions. |
Emacs' compilation and debugging capabilities make it a powerful tool for kernel programming in C, offering an all-encompassing environment that supports efficient and effective development practices.
EWW (Emacs Web Wowser) is a built-in web browser in Emacs that allows browsing the web within a text-based environment. Although minimal compared to graphical browsers, EWW provides an efficient means to navigate the web while fully leveraging the keyboard-driven workflow appreciated by many Emacs users.
| Functionality | Command | Description |
|---|---|---|
| Opening a URL | ESC+x eww | Enter the URL or search term to visit a webpage. EWW will load the page within a buffer. |
| Navigating Between Pages | l (Back), r (Forward), g (Reload) | l returns to the previous page, r moves forward in history, and g reloads the current page. |
| Scrolling | 1' | Scroll through the page by screen or line increments. |
| Following Links | Enter | Position the cursor over a link and press Enter to follow it. |
| Opening Links in New Buffers | Ctrl + Shift + Enter | Opens the link in a new buffer, allowing multitasking across several web pages. |
| Returning to the Home Page | h | Navigates back to the home page (if set) or the default Emacs home page. |
| Bookmark a Page | b | Bookmarks the current page for quick access later without remembering the URL. |
| View Bookmarks | B | Lists all bookmarks, allowing direct access to any saved page. |
| Viewing Browsing History | H | Displays a list of previously visited pages, navigable with arrow keys or by entering corresponding numbers. |
| Toggle Images | I | Toggles the display of images on or off. |
| Source View | 2' | Opens the raw HTML source code of the current page in a new buffer. |
| Change Search Engine | 3' | Customizes the default search engine used by EWW. |
1': Ctrl+v (Page Down), Meta+v (Page Up), Arrow Keys / Ctrl+n (Down) / Ctrl+p (Up)
2': ESC+x eww-view-source
3': Add (setq eww-search-prefix "https://www.google.com/search?q=") to configuration
While EWW does not replace full-featured browsers like Firefox for multimedia-heavy browsing or complex web applications, it offers an efficient, minimalistic browsing experience for those who prefer staying within the Emacs ecosystem and rely on text-based content.
Dired (Directory Editor) mode in Emacs provides a powerful and interactive method for managing files. It facilitates browsing and manipulating files and directories within the editor, thereby streamlining file system operations.
| Functionality | Command | Description |
|---|---|---|
| Launching Dired | ESC+x dired | Opens Dired mode, prompting for a directory path. The specified directory is then displayed for file and directory management within Emacs. |
| File Operations | C (Copy), R (Rename), D (Delete) | Executes basic file operations such as copying, renaming, and deleting. Can be performed on single or multiple files for batch operations. |
| Directory Navigation | Enter, ^ | Enter opens the directory or file under the cursor, while ^ moves up one directory level. |
| Marking Files | m (Mark), u (Unmark) | Marks files for batch operations and unmarks them as needed, allowing multiple files to be acted upon simultaneously. |
| Opening Files | Enter or f | Opens the file under the cursor in a new buffer. |
| Sorting Files | s (Sort) | Sorts files by various criteria such as name, size, or modification date to enhance file management efficiency. |
| Recursive Directory Management | g (Revert Buffer) | Performs recursive operations on files within subdirectories without needing to navigate into each one individually. |
| Executing Shell Commands | ! (Shell Command) | Executes shell commands directly from within Dired on selected files, facilitating tasks like batch renaming or compression. |
Dired mode transforms Emacs into a comprehensive file management system, providing the necessary tools to handle complex file operations without leaving the editor environment.
Emacs Lisp (Elisp) is the programming language embedded within Emacs, allowing for extensive customization and extension of the editor's capabilities. Emacs Lisp enables the writing of scripts, defining new commands, and creating custom workflows tailored to individual needs.
Emacs Lisp can be used to remap existing key bindings or create new ones, enhancing the efficiency of the keyboard-driven workflow. For example, binding a frequently used command to a simpler key combination can streamline operations.
;; Example: Bind F5 to save all buffers
(global-set-key (kbd "<f5>") 'save-some-buffers)
Repetitive tasks can be automated using Emacs Lisp, reducing the need for manual intervention and minimizing the potential for errors. Automating file operations, text transformations, or buffer management are common applications.
;; Example: Automatically delete trailing whitespace on save
(add-hook 'before-save-hook 'delete-trailing-whitespace)
Users can define new interactive commands to perform specialized functions, enhancing the editor's functionality to suit specific workflows or projects.
;; Example: Define a command to insert the current date
(defun insert-current-date ()
  "Insert the current date at point."
  (interactive)
  (insert (format-time-string "%Y-%m-%d")))
(global-set-key (kbd "C-c d") 'insert-current-date)
Emacs Lisp allows for the creation of new major or minor modes, providing tailored environments for different programming languages, file types, or project requirements.
;; Example: Define a simple minor mode
(define-minor-mode my-custom-mode
"A simple custom minor mode."
:lighter " MyMode"
:keymap (let ((map (make-sparse-keymap)))
(define-key map (kbd "C-c m") 'insert-current-date)
map))
(add-hook 'text-mode-hook 'my-custom-mode)
Emacs Lisp empowers users to transform Emacs into a highly personalized and powerful development environment. By leveraging Emacs Lisp, users can tailor Emacs to meet their unique requirements, enhancing productivity and fostering an efficient workflow.
Several command-line switches enhance Emacs' operation, similar to the -nw (no-window) option. These switches provide flexibility in how Emacs is launched, catering to various user needs and preferences.
| Switch Options | Description |
|---|---|
| -q | Starts Emacs without loading the initialization file (.emacs or init.el). Useful for troubleshooting configuration issues or starting Emacs with default settings. |
| --no-splash | Launches Emacs without displaying the splash screen, resulting in a cleaner and faster startup experience. |
| --daemon | Runs Emacs in the background as a daemon, allowing subsequent Emacs instances to open more quickly by connecting to the already running process. Particularly beneficial for users who frequently start and stop Emacs sessions. |
| -batch | Executes Emacs in batch mode, without opening the graphical or text interface. Typically used for script execution or automation tasks, enabling Emacs to process files and perform operations without user interaction. |
| --debug-init | Starts Emacs with debugging enabled for the initialization process, aiding in the identification and resolution of errors within startup configuration files. |
These switches provide users with the ability to customize the Emacs startup behavior, enhancing the overall user experience by aligning Emacs' operation with specific requirements and use cases.
- Ctrl+k: Deletes from the cursor to the end of the line and stores the deleted content in the kill ring (Emacs' clipboard equivalent).
- ESC+x compile: Executes the compile command, enabling code compilation within Emacs, which is particularly useful for developers.
- ESC+x query-replace: Initiates an interactive find-and-replace operation, prompting for confirmation before each replacement.
- ESC+x replace-string: Performs a non-interactive find-and-replace, replacing all occurrences of the specified string.
- ESC+x shell: Opens a shell within Emacs, providing access to a command-line interface directly from the editor.
- Ctrl+space, ESC+w: Marks a region for copying and then copies the selected text into the kill ring.
- Ctrl+y: Pastes (or "yanks") the most recently copied or cut text from the kill ring.
- Ctrl+y followed by ESC+y: Cycles through the kill ring, enabling the pasting of previously copied or cut items.
- Ctrl+x u: Undoes the most recent changes. This command can be repeated to undo multiple actions.

Emacs offers several native commands for interactive or automatic string substitution. The macOS convention is used throughout (⌥ = Meta (M), ⌘ = Super (s)).
- Interactive (query-replace) — step‑by‑step confirmation for each match.
- Regexp (query-replace-regexp) — regular‑expression variant with identical prompts.

| Command | Scope & confirmation | Pattern type | Typical keystroke |
|---|---|---|---|
| query‑replace | Interactive, buffer or region | Literal | M % |
| query‑replace‑regexp | Interactive, buffer or region | Emacs Lisp regexp | M ⇧ % |
| replace‑string | Automatic, buffer or region | Literal | M‑x replace-string |
The following examples illustrate practical refactoring patterns and the reasoning behind each step.
M ⇧ % ^\(defun\s-+\)old_\(.*\)$ RET \1new_\2 RET !
What happens:
- ^\(defun\s-+\) captures the function keyword plus its required space into Group 1.
- old_\(.*\)$ captures the remainder of the symbol (e.g. old_process) into Group 2.
- \1new_\2 rebuilds each definition as (defun new_process …), preserving the original suffix.

This technique is ideal for systematic API renaming after a naming‑policy change.
Regional replacement is particularly useful when refactoring temporary variables inside a long file while leaving other sections untouched.
C-- M %

The negative prefix argument C-- calls query-replace in reverse, scanning from point toward the beginning of the buffer (BOB).
Reverse traversal prevents accidental double replacements when iterating through matches already passed during forward edits.
M ⇧ % ,\s-*\\n RET ,\n\t RET !
Purpose: re‑formatting comma‑separated JSON arrays so that each element begins on a new, indented line.
- \s-* matches any horizontal whitespace.
- The pattern ends with \\n, ensuring the match includes the line break itself.
- The replacement inserts a newline (\n) followed by a tab (\t) before the next array element.
The command may be combined with narrowing (C‑x n n) to focus on a JSON block without disturbing surrounding code.
- M‑x occur followed by C‑c C‑o turns the *Occur* buffer writable; committed changes propagate back.
- In Dired, Q invokes dired-do-query-replace-regexp across marked files.
- Replacement commands combine well with next-error or grep.
- Use narrowing (C‑x n n) or operate within occur/grep buffers to avoid unintended files.
- Replacement escapes (\n \t \1) follow Emacs Lisp conventions, not POSIX syntax.

Written on May 11, 2025
In 2019, Apple officially adopted Zsh (Z Shell) as the default shell, starting with macOS Catalina (10.15). This transition marked a significant change from the previously utilized Bash, which had been the default since the inception of macOS. The switch was largely driven by licensing issues and the enhanced features offered by Zsh, making it a more appealing choice for modern developers and power users.
Apple's decision to shift from Bash to Zsh was influenced substantially by licensing concerns. Until version 3.2, Bash was licensed under the GNU General Public License v2 (GPLv2), which posed fewer restrictions on redistribution and modification. Apple continued using this version for many years.
However, with the release of Bash 4.0, the license changed to GPLv3, which introduced stricter conditions. Under GPLv3:
By transitioning to Zsh, which is licensed under an MIT-like license, Apple was able to circumvent these issues. This permissive license allowed Apple to include Zsh without the obligation to disclose proprietary modifications, aligning more effectively with Apple’s distribution model.
Apart from addressing licensing concerns, Zsh provided various technical advantages that improved the user experience and rendered it a more suitable choice for Apple’s ecosystem.
1. Permissive Licensing
The MIT-like license associated with Zsh afforded Apple greater flexibility. Unlike GPLv3, it does not impose the requirement to share modifications, permitting Apple to distribute Zsh freely without concerns over proprietary rights.
2. Enhanced Features for Power Users
Zsh offers a range of features that enhance productivity and streamline shell interactions, which are particularly beneficial for developers:
3. User-Configurable Options and Prompt Customization
Zsh supports a broad spectrum of configuration options, enabling users to personalize nearly every aspect of the shell. This includes the capability to create dynamic prompts that display real-time information, contributing to a more informative and engaging terminal experience.
Zsh’s popularity among developers and system administrators has fostered a vibrant community that actively provides resources, such as:
In adopting Zsh as the default shell, Apple aligned with the preferences of a considerable portion of its developer user base. Many developers had already embraced Zsh for its advanced features, and the switch made macOS more intuitive and appealing to this audience.
Shifting to Zsh also facilitated a departure from the aging Bash 3.2, bringing several advantages in terms of security and maintainability:
scp in Zsh

1. Local vs. Remote Expansion
When utilizing wildcards with scp, it is important to recognize that Zsh may attempt to expand these wildcards locally before executing the command. For example, a command intended to copy all .txt files from a remote server might resemble:
scp user@remote:/path/to/files/*.txt /local/destination/
Zsh might expand *.txt based on the local file system, potentially leading to unintended behavior. This happens because Zsh’s default behavior involves expanding wildcards during the globbing phase, which occurs before the command is executed. If matching files exist in the specified local path, Zsh replaces the wildcard with these files.
2. Why Escaping Wildcards Works
To ensure the wildcard is interpreted on the remote server rather than locally, escaping the wildcard with \* is necessary:
scp user@remote:/path/to/files/\*.txt /local/destination/
Escaping the asterisk directs Zsh to pass the wildcard to scp without local expansion, allowing the remote shell to interpret *.txt and carry out the intended file selection.
Several techniques can prevent Zsh from performing local expansion on wildcards meant for remote servers:
scp 'user@remote:/path/to/files/*.txt' /local/destination/
noglob: Zsh’s noglob directive disables wildcard expansion for the specified command.
noglob scp user@remote:/path/to/files/*.txt /local/destination/
rsync for Complex Transfers: For advanced file transfers, especially those involving recursion and selective inclusion/exclusion, rsync offers better control over wildcard patterns.
rsync -av --include='*.txt' --exclude='*' user@remote:/path/to/files/ /local/destination/
scpFor verification and troubleshooting, the manner in which Zsh interprets a command can be checked by prepending it with echo:
echo scp user@remote:/path/to/files/\*.txt /local/destination/
Alternatively, using the -v option with scp yields verbose output, aiding in the diagnosis of file transfer issues:
scp -v 'user@remote:/path/to/files/*.txt' /local/destination/
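The local-expansion behavior can be observed without any remote host. The sketch below creates two throwaway files in a temporary directory and shows that an unquoted pattern is expanded by the shell before the command runs, while a quoted pattern is passed through literally — exactly the distinction that matters for scp:

```shell
# Demonstrate local glob expansion vs. quoting (no remote host needed).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt

unquoted=$(echo *.txt)     # shell expands the pattern locally
quoted=$(echo '*.txt')     # quoted pattern is passed through literally

echo "$unquoted"   # a.txt b.txt
echo "$quoted"     # *.txt
```

Substituting scp for echo gives the same two outcomes: the unquoted form sends already-expanded local names to the command, while the quoted form lets the remote shell perform the expansion.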
- Use \*, quotes, or noglob to ensure wildcards are processed on the remote server.
- The setopt and unsetopt commands in Zsh allow for adjustments to wildcard handling. Reviewing these settings can assist in tailoring Zsh’s behavior to specific needs.

Configuring environment variables and adding aliases or functions for frequently used commands can greatly enhance efficiency in the command-line environment. This guide provides detailed instructions on how to set environment variables temporarily and permanently, both for individual users and system-wide, as well as how to add aliases and functions in zsh or bash shells, applicable to both Linux and macOS systems.
To set an environment variable for the current terminal session, use the export command. This change will only persist for the duration of the session and will be cleared once the terminal is closed.
Example: To temporarily set the PYTHONPATH environment variable:
# Temporarily set PYTHONPATH
export PYTHONPATH="/path/to/python/libs"
This sets the PYTHONPATH variable to include the specified directory for the current session.
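The reason export matters is that only exported variables reach child processes; a plain assignment stays local to the current shell. A minimal check (the paths are placeholders):

```shell
# Only exported variables appear in a child process's environment.
PLAIN_VAR="/path/a"                          # shell-local, not exported
export PYTHONPATH="/path/to/python/libs"     # exported to children

# sh -c starts a child shell; it sees PYTHONPATH but not PLAIN_VAR.
child_pp=$(sh -c 'printf %s "$PYTHONPATH"')
child_plain=$(sh -c 'printf %s "$PLAIN_VAR"')

echo "$child_pp"                 # /path/to/python/libs
echo "${child_plain:-<empty>}"   # <empty>
```

This is why forgetting export can make a variable visible in the interactive shell yet invisible to Python, make, or any other program launched from it.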
To make environment variables persist across sessions, they must be added to the shell's configuration file. For zsh users, this is typically ~/.zshrc; for bash users, it is ~/.bashrc.
# Open .zshrc with a text editor
emacs ~/.zshrc
# Open .bashrc with a text editor
emacs ~/.bashrc
For example, to set the PYTHONPATH environment variable permanently:
# Set PYTHONPATH permanently
export PYTHONPATH="/path/to/python/libs"
If using pyenv, it may be necessary to add:
# Set up pyenv
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
Save the file and exit the text editor.
# Apply changes
source ~/.zshrc
# Apply changes
source ~/.bashrc
Note: On macOS, the default shell is zsh (since macOS Catalina). The same steps apply for setting environment variables in zsh on macOS.
For environment variables that should be available to all users on the system, add them to system-wide configuration files. On Linux, these files are /etc/environment, /etc/profile, or /etc/bash.bashrc. On macOS, system-wide configurations for zsh can be added to /etc/zshenv or /etc/zshrc.
/etc/environment:
sudo emacs /etc/environment
/etc/profile:
sudo emacs /etc/profile
sudo emacs /etc/zshrc
For example, to set PYTHONPATH globally:
# Set PYTHONPATH globally
export PYTHONPATH="/usr/local/lib/python3.9/site-packages"
Save the file and exit the text editor.
To apply the changes, log out and log back in, or source the configuration file. Note that changes to some system-wide files may require a system reboot or re-login to take effect.
Aliases and functions allow for efficient command reuse and can be added to the shell's configuration files.
Aliases are shortcuts for commands. To add aliases:
# Open .zshrc. If absent, use .zprofile
emacs ~/.zshrc
# Open .bashrc
emacs ~/.bashrc
# Alias to compress and split a folder
alias compress_folder='FOLDER="folder_name" && tar -zcvf - "$FOLDER" | split -b 4480M - "${FOLDER}.tar.gz."'
# Alias to search a specific folder for a pattern
alias grep_designated_folder='find /path/to/designated/folder -type f -print0 | xargs -0 grep -i "###" 1> tmp1 2> tmp2'
compress_folder: Compresses and splits a folder into chunks.
grep_designated_folder: Searches a specific folder for a pattern.
# Apply alias changes for zsh
source ~/.zshrc
# Apply alias changes for zsh when using .zprofile
source ~/.zprofile
# Apply alias changes for bash
source ~/.bashrc
Functions provide more flexibility with parameters than aliases. To add functions:
Add Function Definitions
# Function to compress a folder with a given name
compress_folder() {
FOLDER="$1"
tar -zcvf - "$FOLDER" | split -b 4480M - "${FOLDER}.tar.gz."
}
# Function to search a specified folder for a given pattern
grep_designated_folder() {
find "$1" -type f -print0 | xargs -0 grep -i "$2" 1> tmp1 2> tmp2
}
compress_folder: Accepts a folder name as an argument and compresses it.
grep_designated_folder: Searches a specified folder for a given pattern.
alias gonginx="cd /opt/homebrew/etc/nginx/"
alias gohttp="cd /opt/homebrew/var/www/"
alias mm="make clean && make"
alias kprint="sudo journalctl -k -f"
cd /opt/homebrew/var/www
function tar_backup_prototype() {
cd /opt/homebrew/var || return
filename="WEB$(date +"%Y%m%d")"
tar -zcvf "${filename}.tar.gz" www/
cd /opt/homebrew/var/www || return
}
function tar_backup() {
cd /opt/homebrew/var || return
# Initial base filename
base_filename="WEB$(date +"%Y%m%d")"
filename="${base_filename}.tar.gz"
# Check if the filename already exists, and append a counter if necessary
counter=1
while [ -e "$filename" ]; do
filename="${base_filename}_${counter}.tar.gz"
counter=$((counter + 1))
done
# Create the tar archive with the unique filename
tar -zcvf "$filename" www/
# Return to the specified directory
cd /opt/homebrew/var/www || return
}
function tar_web() {
# Check if a filename argument is provided
if [ -z "$1" ]; then
echo "Usage: tar_web <filename>"
return 1 # Exit the function with a non-zero status
fi
# Navigate to the specified directory or exit if it fails
cd /opt/homebrew/var || return
# Create the tar.gz archive with the provided filename
tar -zcvf "${1}.tar.gz" www/
# Navigate back to the www directory or exit if it fails
cd /opt/homebrew/var/www || return
}
function scp_backup_today() {
scp "ngene.org:/opt/homebrew/var/WEB$(date +"%Y%m%d")*.tar.gz" ~/Desktop/
}
function scp_backup_all() {
scp "ngene.org:/opt/homebrew/var/*.tar.gz" ~/Desktop/
}
scp2web() {
local filename="$1"
scp "${filename}"* ngene.org:/opt/homebrew/var/www/
}
scp2src() {
local filename="$1"
scp "${filename}"* ngene.org:/opt/homebrew/var/www/src/
}
scp2src_wild() {
if [[ $# -eq 0 ]]; then
echo "Usage: scp2src_wild PATTERN..."
return 1
fi
local files=()
# Collect matches for all patterns given
for pattern in "$@"; do
matches=( ${~pattern}(N) )  # ${~pattern} forces glob expansion in zsh; (N) yields an empty list on no match
if [[ ${#matches[@]} -eq 0 ]]; then
echo "No files found matching: $pattern"
else
files+=("${matches[@]}")
fi
done
if [[ ${#files[@]} -eq 0 ]]; then
echo "No files to upload."
return 1
fi
echo "Uploading files: ${files[*]}"
scp "${files[@]}" ngene.org:/opt/homebrew/var/www/src/
}
eval "$(/opt/homebrew/bin/brew shellenv)"
export PATH="/opt/homebrew/sbin:$PATH"
alias zprofile_change='emacs ~/.zprofile'
alias zprofile_apply='source ~/.zprofile'
# Function to search for a specific term within files under a specified directory, case-insensitive.
function file_grep() {
# Check if both search term and search path are provided
if [[ -z "$1" || -z "$2" ]]; then
echo "Usage: file_grep <search_term> <search_path>"
echo "Example: file_grep \"nginx\" /opt/homebrew"
return 1
fi
# Assign arguments to variables for clarity
search_term="$1"
search_path="$2"
# Execute the search command
sudo find "$search_path" -type f -print0 | xargs -0 grep -i "$search_term"
}
# Function to search files by regex in a specified directory (case-insensitive)
function find_re() {
# Display usage instructions if arguments are missing
if [[ -z "$1" || -z "$2" ]]; then
echo "Usage: find_re <search_path> <regex_pattern>"
echo "Example: find_re /opt/homebrew '.*frank.*'"
return 1
fi
# Assign arguments to variables for readability
search_path="$1"
regex_pattern="$2"
# Execute the find command with case-insensitive regex
find "$search_path" -type f -iregex "$regex_pattern"
}
# Function to search for files with a case-insensitive substring match in the filename
function find_str() {
# Display usage instructions if arguments are missing
if [[ -z "$1" || -z "$2" ]]; then
echo "Usage: find_str <search_path> <search_text>"
echo "Example: find_str /opt/homebrew '###'"
return 1
fi
# Assign arguments to variables for clarity
search_path="$1"
search_text="$2"
# Execute the find command with case-insensitive name matching
find "$search_path" -type f -iname "*$search_text*"
}
##################################
alias emacs="emacs -nw"
Once the environment variables, aliases, or functions are set in the shell's configuration file, they become available in every new terminal session.
Using the compress_folder Function: To compress and split a folder named folder_name, run:
# Compress and split a folder
compress_folder folder_name
Automatic Environment Variables: The PYTHONPATH variable will be automatically set upon opening a new terminal, allowing Python to locate additional libraries specified in the path.
Linux provides several commands to inspect and manipulate text data. Some of the most frequently used are wc, head, cut, grep, sort, and uniq. These tools can be combined in pipelines to filter and process textual information efficiently.
wc – Word, Line, and Byte Count
The wc command ("word count") counts lines, words, and bytes (characters) in files or input. It is often used to get a quick size estimate of a file or output.
wc -l filename (use -l to count lines only).
wc -w filename (use -w to count words).
wc -c filename (use -c for byte count).
For example, wc -l /var/log/syslog will output the number of lines in the system log file.
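wc also works on piped input, which is how it most often appears in pipelines. A small sketch with an inline stream:

```shell
# Two lines, three words in total
printf 'one two\nthree\n' | wc -l   # counts lines from a pipe
printf 'one two\nthree\n' | wc -w   # counts words from a pipe
```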
head – View Beginning of Files
The head command displays the first lines of a file. By default, it shows the first 10 lines. It is useful for previewing large files or outputs.
head -n 20 filename will display the first 20 lines.
Pipe output into head to preview results, e.g. ps aux | head to see the top of a process list.
$ head -n 3 sample.txt
Line 1 of the file
Line 2 of the file
Line 3 of the file
cut – Extract Columns of Text
cut is used to extract columns or fields from lines of text, especially when data is structured with a delimiter (such as CSV or log files). It can cut by bytes, characters, or fields separated by a delimiter.
cut -d ':' -f1,3 file.txt splits each line by ':' and extracts the 1st and 3rd fields.
cut -c 1-10 file.txt extracts the first 10 characters of each line.
$ echo "user:password:uid:gid" | cut -d ':' -f1,4
user:gid
grep – Search Text with Patterns
grep finds lines in text that match a given pattern. It is a powerful search tool supporting plain text searches as well as regular expressions. By default, grep prints the matching lines.
grep -i "error" logfile.txt matches “error” in any case (Error, ERROR, etc.).
grep -n "main" source.c shows matching lines with line numbers.
grep -v "^#" filters out lines starting with # (common to skip comments).
$ grep -n "root" /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
sort – Sort Text Lines
The sort command sorts lines of text alphabetically or numerically. It’s often used to organize data or prepare for other operations (like finding duplicates).
sort -n numbers.txt treats the content as numbers (so 2 comes before 10).
sort -r outputs results in descending order.
sort -k 2 file.txt sorts by the 2nd field (default delimiter is whitespace).
uniq – Uniquify and Count Duplicates
uniq filters out adjacent duplicate lines in a sorted file (or stream). It is commonly used after sort to count or remove duplicates.
sort file.txt | uniq will output each line once (all duplicates collapsed).
uniq -c prefixes each unique line with the number of occurrences. (Use sort first, since uniq only collapses consecutive duplicates.)
uniq -d shows only lines that were duplicated.
$ sort items.txt | uniq -c
3 apple
1 banana
2 orange
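The items.txt example can be reproduced with an inline stream; here printf stands in for the file:

```shell
# Count occurrences of each line, most frequent first
printf 'apple\norange\napple\nbanana\napple\norange\n' \
  | sort | uniq -c | sort -nr
```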
md5sum and Pipelines
Identifying duplicate files can be automated using cryptographic hashes. The md5sum tool computes MD5 hashes of files, which can be used as unique fingerprints. By hashing files and then sorting or counting identical hashes, duplicates can be found.
Using md5sum with sorting and uniq: One approach is to generate an MD5 for each file and sort the results so that identical hashes line up consecutively, then use uniq to find duplicates. For example:
$ md5sum *.jpg | sort | uniq -w32 -d
In this command, md5sum *.jpg prints lines with the hash and filename for each .jpg file. We sort them (so identical hashes are adjacent) and then uniq -w32 -d checks for duplicates considering only the first 32 characters (the MD5 hash), printing lines that are duplicated (i.e., files with the same hash). This quickly lists files that have the same content.
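A self-contained sketch of the same technique, using throwaway files in a temporary directory (md5sum is the GNU/Linux tool; on macOS the equivalent command is md5, with a different output format):

```shell
tmpdir=$(mktemp -d)
echo "same content" > "$tmpdir/a.txt"
echo "same content" > "$tmpdir/b.txt"
echo "different"    > "$tmpdir/c.txt"
# Identical hashes sort together; -w32 -d reports one line per duplicate group
md5sum "$tmpdir"/*.txt | sort | uniq -w32 -d
rm -rf "$tmpdir"
```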
Using awk for detailed duplicate reports: For more detail, awk can aggregate file names by their hash:
$ md5sum *.jpg | awk '{ hash=$1; file=$2; files[hash] = files[hash] file " "; count[hash]++; } END { for(h in files) if(count[h] > 1) { print count[h] " duplicates of hash " h ": " files[h]; } }' | sort -nr
This one-liner computes all hashes, builds a list of files for each hash, and at the end prints out any hash that appeared more than once, along with the count and file names. The output is sorted numerically reverse (sort -nr) to show the largest groups of duplicates first.
Using checksums is much faster than manually comparing files byte by byte, and it helps to detect duplicates even if file names differ.
printenv
Environment variables are dynamic values that affect the processes and behavior of the shell and system. They include settings like PATH (which lists directories to search for executables), HOME (the user’s home directory), and many others. They can be viewed and set in the shell.
The printenv command prints all environment variables (or a specific one if you provide a name). Similarly, env without arguments also displays the environment.
Inspect a single variable with echo $VARIABLE or printenv VARIABLE. For example, echo $HOME prints your home directory path.
Use the export command to set a variable and export it to the environment, e.g. export EDITOR=nano (so that $EDITOR is now “nano”).
$ printenv | head -n 3
HOME=/home/alice
LANG=en_US.UTF-8
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
Environment variables can influence program behavior. For example, a program might check $LANG to determine language, or $TZ for timezone. The env command can also run a command under a modified environment, e.g. env TZ=UTC date will run the date command with the timezone set to UTC (without changing the global environment).
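The per-command override can be checked directly; date’s %Z format prints the effective time zone:

```shell
# date runs with TZ=UTC; the shell's own TZ setting is untouched
env TZ=UTC date +"%Z"
```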
.bashrc File
Frequently used or long commands can be turned into shortcuts using aliases. An alias is like a custom command that expands into a longer command. Aliases are usually defined in the shell startup file (like ~/.bashrc for Bash) so they are available in every session.
Define an alias with alias name='command'. For example, alias ll='ls -l --color' creates an alias ll that lists directory contents in long format with color.
Adding the definition to ~/.bashrc (or ~/.bash_profile on some systems) ensures it is set for each new shell. You can edit the file and append such lines.
After editing .bashrc, new terminals will pick up the changes. To apply them immediately in the current shell, use the source command (or . which is a synonym). For example, source ~/.bashrc reloads the configuration without logging out and back in.
$ alias gs='git status'
$ gs
On branch main
...
In the above example, an alias gs is set to run git status. After defining it (or after reloading your config), typing gs will execute git status.
Aliases are a simple way to save keystrokes. For more complex logic (like multiple commands or conditional behavior), a shell function or script might be more appropriate.
There are multiple commands to locate files and identify executables in your system. Depending on what you need (to find an executable’s path, or search for files by name or content), you might use different tools like which, type, find, grep, and xargs.
which vs type
The which command searches your PATH for an executable file and prints the first match. For example, which python might output /usr/bin/python. It is an external command and only knows about executables in the PATH.
Bash’s built-in type command is more powerful: it tells you how a given name would be interpreted by the shell. It can identify if a command is built-in, an alias, a function, or an external executable. For instance, if ls is an alias, type ls would show the alias definition, whereas which ls would just show the path to the ls executable.
$ which ls
/bin/ls
$ type ls
ls is aliased to `ls --color=auto`
In the above example, which finds the ls binary’s location, while type reveals that ls is actually an alias (which eventually calls the binary with options). If you define custom aliases or functions, which won’t show them, but type will.
find
The find command performs a recursive search through directories. It is extremely flexible, with options to filter by name, type, size, modification time, and more. Basic usage requires a path to search in (like /etc or . for current directory) and a condition.
find /etc -name "*.conf" finds all files under /etc ending with .conf.
find . -type d -name "test*" finds directories (-type d) in current folder matching “test*”. Use -type f for files.
find . -maxdepth 2 -name "*.txt" limits how deep to search in subdirectories.
$ find /etc -type f -name "*.conf"
/etc/nginx/nginx.conf
/etc/ssh/sshd_config
...
If file names contain spaces or special characters, find will handle them safely (printing each path on a separate line). You can combine multiple conditions as well, for example finding files modified in the last day and matching a name pattern.
-exec
find not only locates files; it can also perform actions on them using the -exec option. This allows you to execute a command for each file found. In the -exec syntax, {} is replaced by the current file name, and it ends with either \; or +.
find /var/log -name "*.log" -exec ls -l {} \; will ls -l each log file found.
{} is a placeholder for the file path. It’s often wrapped in quotes to handle spaces, as in "{}", though in many cases find handles it.
For safety, replace the command with echo to preview. For example, find ~/tmp -name "*~" -exec echo rm {} \; will output lines like rm /home/alice/tmp/file~ without actually deleting anything.
$ find ~/tmp -type f -name "*~" -exec echo rm {} \;
rm /home/alice/tmp/doc1.txt~
rm /home/alice/tmp/image.png~
After verifying the list, you could run the actual remove by replacing echo rm with rm in the command above. The -exec action executes once per file match. If you use -exec ... {} + (note the plus sign), find will try to pass multiple files at once to the command for efficiency.
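A disposable sketch of the + form; both files reach a single ls invocation:

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/one.log" "$tmpdir/two.log"
# With +, find batches file names into as few command invocations as possible
find "$tmpdir" -name "*.log" -exec ls -l {} +
rm -rf "$tmpdir"
```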
grep and xargs
To search for text inside files across directories, you have a couple of approaches:
grep -R (recursive grep): grep -R "search text" /path/to/dir will recursively search files under the given directory for the pattern, printing matches. This is a quick one-command solution for finding text in files.
find and xargs: Another approach is to use find to list files, and then pipe the list to xargs which runs grep on batches of those files. xargs takes input (like filenames) from standard input and appends them as arguments to the given command. For example:
$ find . -name "*.py" -print0 | xargs -0 grep -H "TODO"
./app/main.py: # TODO: improve error handling
./app/utils.py: # TODO: optimize this function
Here, find lists all .py files in the current directory and below. -print0 causes find to separate file names with a null character (instead of newline), and xargs -0 reads those safely (this handles file names with spaces). The -H option in grep forces it to always print the filename before matches. The result is all occurrences of "TODO" in Python files, with their file paths.
The combination of find and xargs is powerful for cases where grep -R might not be available or if you want to perform other operations on the found files. Note that xargs will handle long lists by splitting into multiple command invocations if necessary. In most modern Linux systems, grep -R is the simplest method for searching within files.
The Bash shell keeps a history of commands that have been entered. This history allows quick access to previously run commands, making it easy to repeat or modify them without retyping. There are also settings to control how many commands to remember and how they are stored.
Bash by default remembers a certain number of commands (often 500). Two important environment variables control this:
HISTSIZE: the number of commands to keep in memory for the current session’s history.
HISTFILESIZE: the maximum number of lines to store in the ~/.bash_history file (which persists across sessions).
By increasing these values in your ~/.bashrc, you can have a larger history. For example, adding HISTSIZE=10000 and HISTFILESIZE=20000 will allow up to 10k commands in the session history and 20k in the file.
Another useful variable is HISTCONTROL, which can be set to ignoredups (to not record duplicate commands consecutively) or ignorespace (to not record lines starting with a space, allowing you to omit sensitive commands by prefixing a space).
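Putting the three variables together, a typical ~/.bashrc fragment might look like this (the values are illustrative, not required):

```shell
# Keep a large history in memory and on disk
HISTSIZE=10000
HISTFILESIZE=20000
# Skip consecutive duplicates and lines that start with a space
HISTCONTROL=ignoredups:ignorespace
```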
Bash provides “history expansion” shortcuts using the ! (exclamation mark) for quickly referring to commands in the history:
!! – Repeat the last command. (This is equivalent to !-1.)
!n – Run command number n from your history list (as shown by history or history | tail). For example, !1203 runs the command with index 1203.
!-n – Run the command from n lines back. For example, !-3 executes the third from last command. Adding :p at the end (e.g. !-3:p) will print that command instead of executing it.
!string – Run the last command that begins with string. For instance, !grep runs the most recent command that started with “grep”.
!?string? – Run the last command that contains string anywhere in it. Example: !?ssh? would find the last command that had “ssh” in it (whether at the beginning or middle).
^old^new – This is a quick substitution in the previous command. It finds the first occurrence of “old” in the last command and replaces it with “new”, then executes that command. For example, if you typed ls *.jp by mistake, entering ^jp^jpg will run ls *.jpg.
These shortcuts can save a lot of time. For example, !! is often used after running a command that needed sudo: if you get a permission error, you can just do sudo !! to rerun the last command with sudo.
You can search your history interactively as well by pressing Ctrl-R and typing a part of a command; Bash will autofill the most recent match, which you can then execute or edit.
The Bash shell has two editing modes for the command line: Emacs mode and Vi mode. By default, Bash uses Emacs keybindings, which allow familiar cursor movement and editing shortcuts similar to the Emacs text editor (or common shortcuts in many apps).
Emacs mode is the default (set -o emacs). It allows quick editing via keys like Ctrl-A (move to start of line), Ctrl-E (move to end of line), Alt-B (backward one word), Alt-F (forward one word), Ctrl-K (cut to end of line), Ctrl-Y (yank/paste the cut text), and Ctrl-R (search history backward). These shortcuts make it efficient to fix or rerun commands.
Vi mode is enabled with set -o vi for those who prefer Vi/Vim keybindings. In Vi mode, you press Esc to toggle between insert mode and command mode, and use h, j, k, l for navigation, b, w for word moves, etc., similar to editing text in Vi.
These modes affect only how you edit the command line, not how commands execute. Emacs mode is generally easier for beginners (with arrow keys and Ctrl shortcuts working), whereas Vi mode is appreciated by those with Vi experience. You can choose the one that best fits your workflow and muscle memory.
Navigating deep directory structures can be time-consuming. Bash allows creation of shell functions to create smart shortcuts. One example is a function that quickly changes directory based on a keyword. Consider this function defined in your .bashrc:
qcd() {
case "$1" in
work) cd /home/alice/projects/work ;;
recipes) cd /home/alice/Documents/Recipes ;;
*) echo "Unknown shortcut: $1" ;;
esac
pwd
}
This qcd function takes one argument and switches to a preset directory depending on the keyword. If qcd work is run, it jumps to /home/alice/projects/work; qcd recipes goes to /home/alice/Documents/Recipes. After changing directory, it prints the current directory (via pwd) as confirmation.
To make it even more convenient, bash completion can be set up for this function. For example:
complete -W "work recipes" qcd
This line, when added after the function, tells Bash that valid completions for qcd are “work” and “recipes”. So you can type qcd w and press TAB to auto-complete “work”.
Custom directory shortcuts like these can save a lot of typing. Another built-in tip for quick navigation is the CDPATH variable – you can list base directories in CDPATH, and then you can cd into subdirectories from anywhere. But defining explicit functions or aliases like qcd gives you more control over naming and behavior.
cd -
Bash provides a built-in shorthand to jump to the previous working directory: cd -. This command toggles you back to the last directory you were in, which is very handy when switching between two locations frequently.
$ pwd
/home/alice/projects
$ cd /var/www/html
$ pwd
/var/www/html
$ cd -
/home/alice/projects
In this example, the user was in /home/alice/projects, then went to /var/www/html. Executing cd - returns them to /home/alice/projects, and also prints that path. Running cd - again would flip back to /var/www/html, and so on.
This works because Bash remembers the previous directory in the variable OLDPWD, and cd - effectively does cd "$OLDPWD" and then prints it. It's a quick way to bounce between two directories without retyping the paths.
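The OLDPWD mechanism can be observed without cd - at all:

```shell
cd /tmp
cd /
# OLDPWD holds the directory we just left
echo "$OLDPWD"
```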
date, seq, yes, and Advanced grep
Beyond the basic text tools, here are a few more commands that prove useful in scripting and automation:
date – Display or Format Dates
The date command prints the current date and time, or can format dates in various ways. This is helpful for timestamps in scripts or logs.
date with no arguments shows the current date/time in a standard format.
date "+%Y-%m-%d %H:%M:%S" outputs the date as YYYY-MM-DD HH:MM:SS. The + sign indicates a format string, where %Y is year, %m month, etc. You can compose many variations (e.g., %A for weekday name).
In scripts, use $(date +%s) to get the current Unix timestamp (seconds since 1970) or include a date in a filename (e.g., backup_$(date +%Y%m%d).tar.gz).
seq – Generate Sequences of Numbers
seq outputs a sequence of numbers, one per line (by default). It’s handy for generating loops or numbering things.
seq 5 will output 1, 2, 3, 4, 5 on separate lines.
seq 2 2 10 starts at 2, steps by 2, and goes up to 10 (printing 2, 4, 6, 8, 10).
seq -w 1 10 prints 01, 02, …, 09, 10 – the -w option pads numbers with leading zeros to the width of the largest value.
Sequences often complement other commands. For example, generating numbered filenames or running a loop in a shell script by iterating over $(seq N).
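A sketch of the numbered-filename idea (the frame_ names are illustrative):

```shell
# -w pads to the width of the largest value (here two digits)
for i in $(seq -w 1 10); do
  echo "frame_$i.png"
done
```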
yes – Automatic Confirmation or Repetition
The yes command outputs a string repeatedly (by default “y” followed by a newline, if no string is provided). Its primary use is to feed a continuous stream of “yes” (or any text) into programs that prompt for input, effectively auto-confirming prompts.
Pipe yes into a program with yes | command to send it a stream of "y" answers. For example, yes | apt-get install -y package is redundant (since -y already auto-confirms in apt-get), but demonstrates the concept.
yes no | command would send "no" repeatedly. Use this carefully – it will continue until the receiving program terminates or the stream is closed.
yes can quickly generate a large volume of data. For instance, yes X | head -n 1000 > file.txt will produce a file with 1000 lines of "X". Remember to stop yes with Ctrl-C if you run it without a terminating pipe or limit, as it will otherwise run indefinitely.
Advanced grep Techniques
We introduced grep earlier for basic searches. Here are a few more useful options and patterns for powerful text searching:
grep -R (or grep -r) searches through directories recursively. Add -n to see line numbers and -H to always show filenames.
grep -A 2 -B 2 "ERROR" logfile will show 2 lines After and 2 lines Before each "ERROR" match, providing context around matches (-C is shorthand for same number before & after).
grep -c "pattern" file returns just the number of matching lines (useful for quick stats).
grep -E "fail|error|critical" enables extended regular expressions (alternation in this case to match “fail” or “error” or “critical”). grep -P can even use Perl-compatible regex for advanced needs.
grep -l "text" *.md lists just the filenames of files that contain the text (useful to find which files have a certain string).
With these options, grep becomes not only a finder of exact lines, but also a context-aware search tool and even a rudimentary file content scanner across directories.
Beyond the commonly used commands, there are a number of other text processing utilities that can be very handy in specific situations:
tac – Concatenate and print files in reverse. (tac is essentially cat backwards.) It reads a file from bottom to top, outputting the last line first. For example, tac file.txt will print the file with lines in reverse order.
paste – Merge lines of files. It takes lines from multiple files and pastes them side by side separated by tabs (by default). For instance, if you have one file with names and another with scores, paste names.txt scores.txt will produce a two-column output combining corresponding lines.
diff – Compare files line by line. It outputs the differences between two files. By default, diff file1 file2 produces a set of instructions to change file1 into file2. Using -u gives unified diff format (popular for patches), and tools like colordiff or graphical diff viewers can make it easier to interpret. diff -r can compare directories recursively.
tr – Translate or delete characters. It reads from standard input and writes to standard output, replacing characters as specified. For example, tr 'A-Z' 'a-z' will lowercase input text (it replaces uppercase letters with their lowercase counterparts). You can also use tr -d to delete characters (e.g., tr -d '\r' to remove carriage returns from Windows text files).
rev – Reverse the characters in each line. It’s the character-wise counterpart to tac. Running rev on a line like “Hello” outputs “olleH”. This can be useful for certain text processing tasks or just for quirky uses (like reversing a string quickly in shell).
Each of these commands solves a particular kind of problem. While they might not be used every day, knowing them means you won’t have to resort to more complex tools or code when a simple pipeline using these would do the job.
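Two of these can be tried instantly on a piped string:

```shell
printf 'Hello\n' | tr 'A-Z' 'a-z'   # lowercases each letter
printf 'Hello\n' | rev              # reverses each line character-wise
```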
awk and sed
awk and sed are two classic Unix utilities for processing text streams. They are very powerful, but even basic uses of them can save time.
sed – Stream Editor
sed is a non-interactive editor that reads input (from files or standard input), transforms it, and outputs the result. It is commonly used for substitution (find and replace) or deleting lines.
sed 's/old/new/g' file.txt will replace all occurrences of “old” with “new” in each line of file.txt. (The g at the end means “global” replacement on the line; without it, only the first occurrence on each line is replaced.) This outputs the result to standard output; add -i for in-place editing of the file (be careful with in-place edits, consider backing up first).
sed '/^$/d' file.txt will delete all empty lines (lines that match the regex ^$ for start and end with nothing between). Or use sed '5,10d' to delete lines 5 through 10, for example.
By default, sed prints all lines (after applying edits). You can suppress automatic printing with -n and use p to print only certain lines. For example, sed -n '1,3p' file.txt prints only lines 1 through 3 of the file.
awk – Pattern Scanning and Processing
awk is a programming language designed for text processing and typically used as a single-command tool. It operates on each line of input (or record) and can perform actions on lines that match patterns.
awk splits each line into fields (by whitespace by default, or by a custom delimiter using -F). Fields are referenced by $1, $2, etc. For example, awk '{print $2, $1}' names.txt will print the second field, then the first field from each line of names.txt, effectively swapping two columns.
A pattern can precede the { ... } action block. For instance, awk '$3 > 100 {print $1,$2}' data.txt would print the first two fields of lines where the third field is greater than 100. If you have a CSV with columns, this is a quick way to filter by a numeric value in a certain column.
awk has the concept of BEGIN and END blocks for actions before reading input and after processing all input. It also retains variables. For example, awk '{sum += $1} END {print sum}' numbers.txt will calculate and print the sum of the first field of all lines (assuming a file of numbers).
Both sed and awk can become very complex (each has entire books dedicated to them). But even using them for simple tasks as above can replace more cumbersome manual steps or multiple commands. They truly shine when you need to do text manipulation that goes beyond what grep or cut can do alone.
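Both idioms above can be exercised on inline input; printf stands in for the files:

```shell
# sed: global substitution, then a second pass to drop empty lines
printf 'old line\n\nanother old\n' | sed 's/old/new/g' | sed '/^$/d'
# awk: sum the first field across all lines
printf '10\n20\n30\n' | awk '{sum += $1} END {print sum}'
```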
&&, ||, and ;
In bash (and many shells), you can string commands together on one line or in a script using control operators. The three most common operators are &&, ||, and the semicolon ;. They allow conditional or sequential execution of commands:
command1 && command2 – Executes command2 only if command1 succeeded (i.e., returned an exit status of 0). This is useful for chaining dependent commands. Example: mkdir newdir && cd newdir (change directory only if folder creation succeeded).
command1 || command2 – Executes command2 only if command1 failed (non-zero exit status). Example: grep "pattern" file || echo "Pattern not found". Here, if the grep does not find the pattern (and thus exits with status 1), the echo will run; otherwise, if grep succeeds, the echo is skipped.
command1; command2 – Executes command1 and then command2 regardless of success or failure. Commands separated by ; always run sequentially, effectively ignoring exit status. Example: echo "Starting update"; sudo apt update (the message will echo, then apt update runs after).
Using && and || can make shell scripts and one-liners more robust by handling success/failure paths succinctly. They function like logical AND and OR for command execution.
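All three operators can be observed with true and false, which do nothing except succeed and fail:

```shell
true  && echo "after success"   # && fires only on success
false || echo "after failure"   # || fires only on failure
false ;  echo "always runs"     # ; ignores exit status
```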
$(...) to Embed Commands
Command substitution allows the output of one command to be inserted into the arguments of another command. In Bash, this is done with the syntax $( ... ). (Older shell scripts may use backticks `...`, but $(...) is preferred for its nesting capabilities and clarity.)
For example, suppose you have a bunch of .txt files and you want to move those that contain the word "Kansas" to a folder named "kansas". You can use grep to find which files contain the text, and then mv to move them:
$ mv $(grep -l "Artist: Kansas" *.txt) kansas/
Here’s what happens:
grep -l "Artist: Kansas" *.txt lists all .txt filenames in the current directory that contain the string "Artist: Kansas". The -l option makes grep output only the filenames of matching files, not the matching lines.
The $( ... ) around that command captures its output. So the shell replaces $(grep -l ...) with the list of files that grep found.
The mv command then sees something like mv file1.txt file2.txt file3.txt kansas/, which will move all those files into the kansas/ directory (which should already exist as a directory).
Command substitution is extremely useful for feeding the result of one operation into another. Another quick example: echo "Today is $(date +%A)" will print something like "Today is Wednesday" by substituting the output of the date +%A command.
Process substitution is a feature of Bash (and some other shells) that allows the output of a command to be referenced as if it were a file. The syntax uses <(command) for substituting a command’s output and >(command) (less common) for feeding input from a file into a command's input. It's often used with commands that normally compare or merge files.
A classic example is using diff to compare outputs of two commands, rather than files. Normally, diff compares two files line by line. With process substitution, you can do:
$ diff <(ls *.jpg | sort) <(seq -f "%g.jpg" 1 1000)
In this command:
ls *.jpg | sort lists all .jpg files in the current directory and sorts them lexicographically. This is one "virtual file".
seq -f "%g.jpg" 1 1000 generates a sequence of numbers from 1 to 1000 and formats them as "n.jpg" (using the -f format option). This produces lines like 1.jpg, 2.jpg, ... 1000.jpg. This is the second "virtual file".
diff then compares the output of those two commands as if they were files. Essentially, it's comparing the list of existing image files to the list of expected image filenames from 1.jpg to 1000.jpg. The diff output will show which files are missing or extra.

The <( ) syntax causes the shell to run the commands inside and provide a temporary file name (like /dev/fd/63) that diff can read from. This trick can be used for many situations where a command insists on a file, but you want to supply dynamic content. Note that process substitution is not POSIX standard and is specific to Bash, Zsh, and a few other shells.
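A self-checking sketch of the same idea, using comm instead of diff (its output is easier to capture) and a range shortened from 1..1000 to 1..3 for the demo; since process substitution is Bash-specific, it is wrapped in bash -c:

```shell
# Find "missing" files by comparing an actual list to an expected list,
# both supplied via process substitution (no temporary files on disk).
missing=$(bash -c 'comm -13 <(printf "1.jpg\n2.jpg\n") <(seq -f "%g.jpg" 1 3)')
echo "$missing"   # → 3.jpg (expected but not present)
```

comm -13 suppresses lines unique to the first list and lines common to both, leaving only the expected names that are absent from the actual list.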
Normally, we run commands directly in the shell or by executing a script file. However, you can also send a command (or series of commands) into a shell via a pipeline or redirect a file into the shell. This is useful for automation or dynamically generating commands.
For example, consider:
$ echo "ls -l /home" | bash
This will take the string "ls -l /home" and pipe it into the bash program. Bash will execute the commands it receives from its standard input. In this case, it will run ls -l /home and output the directory listing. Essentially, you are feeding a command to a new bash process via a pipe.
Similarly, you could have a file commands.txt that contains a list of commands, and run bash < commands.txt to have bash execute all the commands from that file in sequence.
Be cautious when using this technique, especially if the input is coming from an untrusted source, because you are effectively executing whatever commands are provided. However, it can be very powerful when used to automate tasks. One use case is generating a sequence of commands via script and piping them into sh or bash to execute, instead of writing them to a temporary file.
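As a sketch of that last use case, a loop can generate mkdir commands as plain text and hand them to a fresh bash process (the directory names are arbitrary):

```shell
# Generate commands as text, then let a new bash process execute them.
sandbox=$(mktemp -d)
for i in 1 2 3; do
  echo "mkdir $sandbox/dir$i"
done | bash   # bash reads and runs each generated line

ls "$sandbox"   # → dir1, dir2, dir3
```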
On a Linux terminal, appending an ampersand & to a command will run it in the background. This means you get the prompt back immediately and can continue doing other work while the command executes. Background jobs are useful for long-running processes.
For example:
$ long_running_script.sh &
[1] 4821
When you run this, you get a job number (in brackets) and a process ID. Here, [1] is the job number, and 4821 is the PID of the background process. The shell prompt returns right away.
You can manage and monitor background jobs with a few built-in commands:
jobs – Lists current background jobs with their job number and status (Running, Stopped, etc.).
fg %1 – Bring job number 1 back to the foreground. (Using %jobnumber references a job; fg without arguments brings the most recent job to foreground.)
bg %1 – If a job is stopped (for example, with Ctrl-Z), bg resumes it in the background.
kill – Send a signal to a process by PID or job ID. For example, kill 4821 will send SIGTERM to the process with PID 4821. You can also use kill %1 to kill job number 1. Use kill -9 (SIGKILL) if a process doesn't terminate with a gentle kill.

How to check a background process's progress depends on the process itself. Many command-line programs will output their progress periodically or upon completion. If the output is not redirected, you will see it appear in the terminal even while you have the prompt (the text may intermix with your typing). You can redirect output to a file (using > output.txt 2>&1) and then inspect that file or use tail -f output.txt to watch it live.
Bash will notify you with a message when a background job finishes or stops (next time you press Enter). For instance, you might see [1]+ Done long_running_script.sh when job 1 completes. This is a handy reminder of background job completion.
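A minimal, scriptable sketch of the lifecycle: start a job with &, note its PID in $!, and use wait to block until it finishes (the sleep and output file are stand-ins for real work):

```shell
# Run a "long" task in the background, then wait for it and read its result.
outfile=$(mktemp)
( sleep 1; echo "done" > "$outfile" ) &   # & backgrounds the subshell
bgpid=$!                                  # $! holds the PID of the last background job
wait "$bgpid"                             # block until that job exits
cat "$outfile"   # → done
```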
If you need a command to keep running even after you log out, consider using nohup (no hang-up) or running the process inside a terminal multiplexer like screen or tmux. nohup command & will make the process immune to hangup signals (so it won't terminate when your session ends), and it writes output to nohup.out by default.
A subshell is a separate shell process launched by your current shell. Parentheses ( ... ) around commands cause them to execute in a subshell. One common use of subshells is to localize a change in directory or environment that shouldn’t affect the main shell. This is frequently seen in advanced shell one-liners, especially involving tar for archiving files.
Consider these use cases:
cat archive.tar.gz | (mkdir -p /tmp/other && cd /tmp/other && tar xzvf -) – This command creates a directory /tmp/other (if it doesn't exist), then changes into it within a subshell, and extracts the archive from standard input (the - tells tar to read from stdin). The current shell's directory is not affected by the cd because that happens in the subshell. After extraction, you'll find the files in /tmp/other.
tar czf - dir1 | (cd /tmp/dir2 && tar xzf -) – Here tar czf - dir1 creates a tarball of dir1 and sends it to stdout (the - as the output file means stdout). That output is piped into a subshell where we cd /tmp/dir2 and then tar xzf - to extract from stdin into /tmp/dir2. This effectively copies dir1 into /tmp/dir2 in one line without creating an intermediate file on disk.
tar czf - dir1 | ssh user@remote "(cd /path/on/remote && tar xzf -)" – This variation uses ssh to run a subshell on the remote host. The local tar writes to stdout, which ssh sends to the remote shell's stdin for the tar extraction. The remote subshell ensures the extraction happens in the desired directory on the remote machine. This is a convenient way to transfer directories between machines without intermediate files.

These examples demonstrate the elegance of combining subshells with pipes. The subshell (enclosed in parentheses) can change directories or other settings temporarily. The && ensures that each step (like making the directory or changing directory) must succeed before proceeding to the next, adding safety. Using - with tar for stdin/stdout, along with ssh for remote execution, can accomplish in one line what might otherwise take several steps.
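The copy-through-a-pipe pattern can be rehearsed with temporary directories; a sketch using throwaway paths:

```shell
# Copy dir1 into another directory via a tar pipe, with no intermediate file.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir "$src/dir1"
echo "hello" > "$src/dir1/file.txt"

# Left side: archive dir1 to stdout. Right side: a subshell cd's to the
# destination and extracts from stdin. The outer shell's cwd never changes.
( cd "$src" && tar czf - dir1 ) | ( cd "$dst" && tar xzf - )

cat "$dst/dir1/file.txt"   # → hello
```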
When you run a command like bash from your current shell, you start a new subshell (a child process). Your prompt might change to indicate a new shell, and any changes or commands in that subshell won’t affect the parent shell once you exit it. For example:
$ echo $PS1
\u@\h:\w\$
$ bash # start a new Bash subshell
$ PS1="Doomed> " # change the prompt in subshell
Doomed> echo "Hello"
Hello
Doomed> exit # exit the subshell
$
In this snippet, the user’s normal prompt \u@\h:\w\$ (username@host:cwd$) is shown. After running bash, a new shell starts (which inherited the same prompt format by default). The user then changes PS1 (the prompt variable) to "Doomed> " inside the subshell, and we can see the prompt change. The echo command runs in the subshell and prints "Hello". After exiting the subshell, the original prompt and shell are restored.
Sometimes you might want to not just start a subshell, but actually replace the current shell process with a new program. This can be done with the exec command. exec takes a command and runs it in place of the current shell, meaning the current shell will not continue after exec. For instance, if you run exec bash in your shell, you don't get a new nested shell – your current shell becomes the new bash process. When you exit that bash, the session ends (since there was no parent shell to return to).
The exec command is often used in scripts when you want to hand over control to another program, or to replace a shell with a different one without spawning extra processes. It's also used to redirect file descriptors at a shell level (for advanced scripting scenarios).
In summary, parentheses ( ... ) create a subshell for grouped commands, launching a new shell like bash gives you a child shell that you can exit, and exec replaces the current shell process entirely with a new program. Each has its use: subshells for isolation, new shells for temporary environments, and exec for full replacement.
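The replace-don't-spawn behavior of exec can be observed safely inside a disposable sh -c shell:

```shell
# exec replaces the shell process, so the second echo never runs.
out=$(sh -c 'exec echo replaced; echo never-reached')
echo "$out"   # → replaced
```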
Throughout this guide, we've seen many one-liner commands – powerful combinations of tools strung together to perform complex tasks in one go. Mastering one-liners can significantly boost efficiency on the command line. Here are a few additional examples of handy bash one-liners:
du -ah . | sort -h -r | head -n 10 – This uses du (disk usage) to list sizes of all files and directories under the current directory (-a for all files, -h for human-readable sizes), then sorts them by size (-h sorts by human-readable sizes, -r for reverse order so largest first), and shows the top 10.
grep -R "needle" . – This will print all matches of "needle" in the current directory and subdirectories. Add -n for line numbers or -l to list only filenames. (Alternatively, rg "needle" if using the ripgrep tool for even faster recursive search.)
grep -o "apple" file.txt | wc -l – grep -o will output each occurrence of "apple" on a new line, and wc -l counts the lines, giving the number of occurrences. This is a quick way to count matches (otherwise grep -c counts matching lines, which may be fewer if multiple matches per line).
for f in *.jpeg; do mv "$f" "${f%.jpeg}.jpg"; done – This loop goes through each .jpeg file in the directory and renames it to have a .jpg extension. The expression ${f%.jpeg}.jpg takes the filename stored in $f and replaces the .jpeg suffix with nothing (effectively dropping it) then appends .jpg.
watch -n 5 'df -h' – The watch command runs the given command at regular intervals (every 5 seconds here) and updates the display. In this example, it repeatedly shows disk usage (df -h) every 5 seconds. This isn't exactly a pipeline one-liner, but it's extremely useful for watching system status changes.

By combining basic commands, redirections, and a bit of shell syntax, you can accomplish a lot without writing a full script. As you discover more commands and get comfortable with these patterns, you'll find yourself creating one-liners to solve problems on the fly. It's always a good practice to break down what a complex one-liner is doing (as we did above) to ensure you understand it and it's doing exactly what you intend.
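The rename loop is a good one to rehearse in a scratch directory before pointing it at real files:

```shell
# Rename every .jpeg file to .jpg using the ${f%.jpeg} suffix-strip expansion.
d=$(mktemp -d)
cd "$d"
touch a.jpeg b.jpeg notes.txt

for f in *.jpeg; do
  mv "$f" "${f%.jpeg}.jpg"   # drop the .jpeg suffix, append .jpg
done

ls   # → a.jpg, b.jpg, notes.txt
```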
wc -l file.txt – Print the number of lines in file.txt. (Use -w for words, -c for bytes.)
head -n 20 file.txt – Show the first 20 lines of a file. (tail -n 20 shows the last 20 lines.)
cut -d ',' -f1-3 data.csv – Extract the 1st through 3rd fields from a CSV (comma-separated) file.
sort -r -n numbers.txt – Sort the lines in numeric order (-n) and reverse (-r) to get highest to lowest.
uniq -c sorted.txt – Collapse duplicate lines in a sorted file, prefixing each unique line with its count (-c).
tac file.txt – Output the file with lines in reverse order (last line first).
paste file1.txt file2.txt – Combine two files line-by-line (joining lines with a tab by default).
tr '[:upper:]' '[:lower:]' – Translate uppercase to lowercase from standard input (outputs to stdout). Also use tr -d to delete characters (e.g., tr -d '\n' to remove newlines).
rev file.txt – Reverse each line of the file (characters in line reversed).
diff -u old.txt new.txt – Compare two files and produce a unified diff (prefixed with +/- to show changes).
which command – Show the path of the executable that would run for "command". (E.g., which python)
type command – Show how "command" is interpreted (alias, function, built-in, or file path).
find /path -type f -name "*.log" – Recursively find files under /path that match the name pattern *.log. Use -type d for directories.
find . -size +100M -exec ls -lh {} \; – Find files larger than 100MB in current directory and list them (using ls -lh). The \; ends the -exec clause.
find . -name "*.tmp" -exec rm {} \; – Find and remove (rm) all files ending with .tmp in current directory tree. (Use -exec echo rm ... first to verify.)
grep -R "foo" /path – Recursively search for "foo" in all files under /path. Add -n to show line numbers, -i for ignore case.
grep -v "^#" – From input, output all lines that do not start with # (useful for ignoring comment lines).
xargs – Construct argument lists and execute utility. Often used with find: e.g., find . -type f -print0 | xargs -0 grep "hello" to search "hello" in all files found (null-separated safe).
printenv – Print all environment variables (or printenv VAR for a specific variable).
export VAR=value – Set an environment variable in the current shell (and sub-processes). E.g., export EDITOR=vim.
alias ll='ls -la' – Create an alias ll for a command (ls -la in this case). Put aliases in ~/.bashrc to make them permanent.
source ~/.bashrc (or . ~/.bashrc) – Apply changes from the bashrc file to the current session (reload the config).
HISTSIZE, HISTFILESIZE – Variables controlling how many commands to remember in history (in-memory and in the ~/.bash_history file respectively).
HISTCONTROL=ignoredups:ignorespace – Setting to ignore consecutive duplicate commands and those starting with space in history.
Ctrl-R – (interactive) Reverse search through command history. Start typing to find a past command.
set -o vi / set -o emacs – Switch shell editing mode to Vi or Emacs style. (Emacs is default; Vi gives Vim-like command line editing.)
cd - – Switch to the previous directory (toggle back and forth between two directories).
pushd dir / popd – Navigate directories using a stack. pushd changes directory like cd but pushes the old directory onto a stack; popd goes back by popping from that stack. Useful for quickly jumping around and back.
command1 && command2 – Run command2 only if command1 succeeds.
command1 || command2 – Run command2 only if command1 fails.
command1; command2 – Run command1 and then command2 regardless of success/failure.
$(command) – Command substitution: use the output of command in place. E.g., echo "Files: $(ls | wc -l)".
diff <(cmd1) <(cmd2) – Process substitution: treat outputs of cmd1 and cmd2 as files to diff (or similarly, feed into any command that expects file arguments).
echo "command" | bash – Pipe a dynamically generated command or script into a new bash shell to execute it.
cmd & – Append & to run cmd in background. Use jobs to list jobs, fg %n to bring to foreground, kill to terminate if needed.
nohup cmd & – Run cmd immune to hangup (logout), in background. Output is redirected to nohup.out by default.
(cd /tmp && do_something) – Perform do_something in a subshell where current directory is /tmp. The enclosing shell's directory isn't affected.
exec program – Replace the current shell with program (the shell process ends, and program runs in its place).
for i in {1..5}; do ...; done – Loop from 1 to 5 (inclusive). Inside the loop, use $i for the current number. Useful for simple iterations in shell.

Written on November 12, 2025
macOS (like other Unix systems) provides several commands to inspect and manipulate text data. Some of the most frequently used are wc, head, cut, grep, sort, and uniq. These tools can be combined in pipelines to filter and process textual information efficiently.
wc – Word, Line, and Byte Count

The wc command ("word count") counts lines, words, and bytes (characters) in files or input. It is often used to get a quick size estimate of a file or output.
wc -l filename (use -l to count lines only).
wc -w filename (use -w to count words).
wc -c filename (use -c for byte count).

For example, wc -l /var/log/system.log will output the number of lines in the system log file.
head – View Beginning of Files

The head command displays the first lines of a file. By default, it shows the first 10 lines. It is useful for previewing large files or outputs.
head -n 20 filename will display the first 20 lines.
Pipe other commands into head to preview results. For example, ps aux | head will show the top of a process list.

$ head -n 3 sample.txt
Line 1 of the file
Line 2 of the file
Line 3 of the file
cut – Extract Columns of Text

cut is used to extract columns or fields from lines of text, especially when data is structured with a delimiter (such as CSV or log files). It can cut by bytes, characters, or fields separated by a delimiter.
cut -d ':' -f1,3 file.txt splits each line by : and extracts the 1st and 3rd fields.
cut -c 1-10 file.txt extracts the first 10 characters of each line.

$ echo "user:password:uid:gid" | cut -d ':' -f1,4
user:gid
grep – Search Text with Patterns

grep finds lines in text that match a given pattern. It is a powerful search tool supporting plain text searches as well as regular expressions. By default, grep prints the matching lines.
grep -i "error" logfile.txt matches "error" in any case (error, ERROR, etc.).
grep -n "main" source.c shows matching lines with line numbers.
grep -v "^#" filters out lines that start with # (commonly used to skip comments).

$ grep -n "root" /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
sort – Sort Text Lines

The sort command sorts lines of text alphabetically or numerically. It's often used to organize data or prepare for other operations (like finding duplicates).
sort -n numbers.txt treats the content as numbers (so 2 comes before 10).
sort -r outputs results in descending order.
sort -k 2 file.txt sorts by the 2nd field (default delimiter is whitespace).

uniq – Uniquify and Count Duplicates

uniq filters out adjacent duplicate lines in a sorted file (or stream). It is commonly used after sort to count or remove duplicates.
sort file.txt | uniq will output each line once (all duplicates collapsed).
uniq -c prefixes each unique line with the number of occurrences. (Use sort first, since uniq only collapses consecutive duplicates.)
uniq -d shows only lines that were duplicated.

$ sort items.txt | uniq -c
3 apple
1 banana
2 orange
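The same pipeline can be reproduced with inline data instead of items.txt, adding a final sort -nr so the most frequent line comes first:

```shell
# sort groups identical lines together; uniq -c then counts each group.
counts=$(printf 'apple\nbanana\napple\norange\napple\norange\n' | sort | uniq -c | sort -nr)
echo "$counts"
# →   3 apple
#     2 orange
#     1 banana
```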
md5sum and Pipelines

Identifying duplicate files can be automated using cryptographic hashes. The md5sum tool computes MD5 hashes of files, which can serve as unique fingerprints. By hashing files and then sorting or counting identical hashes, duplicates can be detected.
Using md5sum with sorting and uniq: One approach is to generate an MD5 for each file and sort the results so that identical hashes line up consecutively, then use uniq to find duplicates. For example:
$ md5sum *.jpg | sort | uniq -w32 -d
In this command, md5sum *.jpg prints lines with the hash and filename for each .jpg file. The output is piped into sort (so identical hashes are adjacent), and then uniq -w32 -d prints only the hashes (and associated line) that are duplicated (considering only the first 32 characters, which constitute the MD5 hash). This quickly lists files that have the same content.
Using awk for detailed duplicate reports: For more detail, an awk one-liner can aggregate file names by their hash:
$ md5sum *.jpg | awk '{ hash=$1; file=$2; files[hash] = files[hash] file " "; count[hash]++; } END { for(h in files) if(count[h] > 1) { print count[h] " duplicates of hash " h ": " files[h]; } }' | sort -nr
This command computes all hashes, builds a list of files for each hash, and at the end prints out any hash that appeared more than once, along with the count and file names. The output is sorted in reverse numeric order (sort -nr) to show the largest groups of duplicates first.
Using checksums is much faster than manually comparing files byte by byte, and it helps detect duplicates even if file names differ.
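A self-contained rehearsal of the technique (this assumes GNU md5sum and uniq, as on Linux; macOS ships md5 instead, and -D, which prints every member of each duplicate group rather than one line per group, is a GNU uniq extension):

```shell
# Create two identical files and one distinct file, then find the duplicates.
d=$(mktemp -d)
cd "$d"
echo "same content" > a.jpg
echo "same content" > b.jpg
echo "different"    > c.jpg

# Hash, sort so equal hashes are adjacent, and print all duplicated lines
# (comparing only the first 32 characters, i.e., the MD5 hash itself).
dupes=$(md5sum *.jpg | sort | uniq -w32 -D)
echo "$dupes"   # lists a.jpg and b.jpg, but not c.jpg
```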
printenv

Environment variables are dynamic values that affect the behavior of the shell and system. They include settings like PATH (which lists directories to search for executables), HOME (the user's home directory), and many others. These variables can be viewed and set in the shell.
The printenv command prints all environment variables (or a specific one if a name is provided). Similarly, running env with no arguments also displays the environment.
View a single variable with echo $VARIABLE or printenv VARIABLE. For example, echo $HOME prints the home directory path.
Use the export command to set a variable and mark it for export to child processes, e.g., export EDITOR=nano (this makes $EDITOR equal "nano" for that session and any processes started from it).

$ printenv | head -n 3
HOME=/Users/alice
LANG=en_US.UTF-8
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
Environment variables can influence program behavior. For example, a program might check $LANG to determine the language, or $TZ for the timezone. The env command can also run a command under a modified environment (for instance, env TZ=UTC date will run the date command with the timezone temporarily set to UTC, without changing the global environment).
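A sketch of that one-shot behavior, using an invented variable name (GREETING) so it cannot collide with real settings:

```shell
# env sets GREETING only for the child process; the current shell is untouched.
out=$(env GREETING=hello sh -c 'echo "$GREETING"')
echo "$out"                # → hello
echo "${GREETING:-unset}"  # → unset (the current shell never saw it)
```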
Note for macOS: The default shell on modern macOS is Zsh (replacing Bash). Configuration that would be placed in ~/.bashrc on Linux systems should be added to Zsh’s startup files on macOS. For login sessions (such as opening a new Terminal window), Zsh reads ~/.zprofile, and for interactive shells it reads ~/.zshrc. Therefore, to persistently set environment variables or define aliases on macOS, add them to ~/.zprofile (or ~/.zshrc) and reload the configuration.
For example, to ensure Homebrew’s installation is properly integrated into the environment, one can include the following lines in ~/.zprofile:
# Homebrew environment setup (macOS)
eval "$(/opt/homebrew/bin/brew shellenv)" # Set PATH and other Homebrew variables
export PATH="/opt/homebrew/sbin:$PATH" # Include Homebrew sbin in PATH
Explanation: The first line runs the Homebrew brew shellenv command, which outputs the necessary export statements to configure Homebrew in the shell; using eval applies those exports. The second line explicitly prepends Homebrew’s sbin directory to the PATH, ensuring any executables installed there (for example, some server binaries like Nginx) can be found. After adding these lines, the updated environment will be applied in new terminals, or immediately in the current session with source ~/.zprofile.
Frequently used or long commands can be turned into shortcuts using aliases. An alias is like a custom command that expands into a longer command. Aliases are usually defined in a shell startup file (for example, ~/.bashrc on Linux or ~/.zshrc on macOS) so they are available in every session.
Define an alias with alias name='command'. For example, alias ll='ls -l --color' creates an alias ll that lists directory contents in long format (with output colored, if supported).
Adding the definition to a startup file (~/.bashrc or ~/.bash_profile on Linux, and ~/.zshrc or ~/.zprofile on macOS) ensures it is configured for each new shell session.
To apply changes to the current session, reload the file with the source command. For example, source ~/.bashrc (or ~/.zshrc) reloads the configuration without the need to log out and back in.

$ alias gs='git status'
$ gs
On branch main
...
In the above example, an alias gs is set to run git status. After defining it (or after reloading the configuration), typing gs executes git status.
Aliases are a simple way to save keystrokes. For more complex logic (such as running multiple commands or conditional behavior), a shell function or script might be more appropriate.
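As a sketch of that trade-off, here is a hypothetical function, mkcd, that an alias could not express because it takes an argument and chains several commands:

```shell
# mkcd: create a directory (and any parents) and change into it in one step.
mkcd() {
  mkdir -p "$1" && cd "$1" && pwd
}

mkcd "$(mktemp -d)/project/src"   # prints the new working directory
```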
There are multiple commands to locate files and identify executables on the system. Depending on the need (finding an executable’s path, or searching for files by name or content), different tools can be used, such as which, type, find, grep, and xargs.
which vs type

The which command searches the directories listed in $PATH for an executable file and prints the first match. For example, which python might output /usr/bin/python. It is an external command and only knows about executables in the PATH.
The shell’s built-in type command is more powerful: it indicates how a given name would be interpreted by the shell. It can identify if a command is built-in, an alias, a function, or an external executable. For instance, if ls is an alias, type ls would show the alias definition, whereas which ls would just show the path to the ls executable.
$ which ls
/bin/ls
$ type ls
ls is aliased to `ls --color=auto`
In the above example, which finds the ls binary’s location, while type reveals that ls is actually an alias (which eventually calls the binary with options). If custom aliases or functions are defined, which will not show them, but type will.
find

The find command performs a recursive search through directories. It is extremely flexible, with options to filter by name, type, size, modification time, and more. Basic usage requires a path to search in (such as /etc or . for the current directory) and a search condition.
find /etc -name "*.conf" finds all files under /etc ending with .conf.
find . -type d -name "test*" finds directories (-type d) in the current folder whose names start with "test" (use -type f for files).
find . -maxdepth 2 -name "*.txt" limits how deep to search in subdirectories (in this case, 2 levels deep).

$ find /etc -type f -name "*.conf"
/etc/nginx/nginx.conf
/etc/ssh/sshd_config
...
If file names contain spaces or special characters, find will handle them safely (printing each path on a separate line). Multiple conditions can be combined as well (for example, finding files modified in the last day that also match a name pattern).
-exec

find can not only locate files, but also perform actions on them using the -exec option. This option executes a specified command on each found file. In the -exec syntax, {} is a placeholder for the current file name, and the command is terminated with either \; or +.
find /var/log -name "*.log" -exec ls -l {} \; will run ls -l on each log file found.
{} is the placeholder for the file path. It's often wrapped in quotes (like "{}") to handle file names with spaces, although find handles many such cases automatically.
Before running destructive commands, a dry run with echo is prudent. For example, find ~/tmp -name "*~" -exec echo rm {} \; will output lines like rm /Users/alice/tmp/file~ for each match, without actually deleting anything.

$ find ~/tmp -type f -name "*~" -exec echo rm {} \;
rm /Users/alice/tmp/doc1.txt~
rm /Users/alice/tmp/image.png~
After verifying the list, the actual remove can be done by running the command without echo. The -exec action executes once per matching file. If -exec ... {} + (with a plus) is used, find will attempt to pass multiple files to a single command invocation for efficiency.
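The dry-run-then-delete workflow, including the batched + form, can be rehearsed on throwaway files:

```shell
# Dry run with echo first, then delete for real using the batched '+' form.
d=$(mktemp -d)
touch "$d/a.tmp" "$d/b.tmp" "$d/keep.txt"

# Dry run: prints one "rm <path>" line per match instead of deleting
found=$(find "$d" -name "*.tmp" -exec echo rm {} \; | wc -l)
echo "$found"   # → 2

# Real delete: '+' passes both files to a single rm invocation
find "$d" -name "*.tmp" -exec rm {} +

ls "$d"   # → keep.txt
```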
grep and xargs

To search for text inside files across directories, a couple of approaches are available:
grep -R (recursive grep): grep -R "search text" /path/to/dir will recursively search through all files under the given directory for the pattern, printing any matches. This one-command solution is quick for finding text in files.
find and xargs: Another approach is to use find to list files and then pipe the list to xargs, which runs grep on batches of those files. xargs takes input (like filenames from find) and appends them as arguments to the specified command. For example:

$ find . -name "*.py" -print0 | xargs -0 grep -H "TODO"
./app/main.py: # TODO: improve error handling
./app/utils.py: # TODO: optimize this function
In this example, find lists all .py files in the current directory and below. The -print0 option separates file names with a null character (to handle spaces in names), and xargs -0 reads the input accordingly. The -H option in grep forces it to always print the filename with matches. The result is a list of all "TODO" occurrences in Python files, each prefixed by its file path.
The combination of find and xargs is powerful when grep -R isn’t available or if additional filtering of files is needed before searching. Note that xargs will handle long lists by invoking the command multiple times as necessary. On modern systems, grep -R is usually the simplest method to search within files.
Sometimes it’s handy to wrap common find/grep patterns in a shell function for quick reuse. Here are example functions that could be added to a shell configuration file (adjusting paths as needed):
file_grep <text> <path> – Recursively search for text in all files under path, case-insensitively. It uses sudo find <path> -type f -print0 | xargs -0 grep -i "<text>" to handle filenames safely (null-separated). Note: sudo is included to ensure access to all files under path; omit it if not needed. Add -n to the grep part for line numbers in results.
find_re <path> '<regex>' – Find files under path whose full file path matches the given regular expression (case-insensitive). It runs find <path> -type f -iregex "<regex>". For example, find_re /opt/homebrew '.*frank.*' would list files with "frank" anywhere in their path.
find_str <path> <substring> – Find files under path with substring in their filename (case-insensitive). It is a shortcut for using find -iname "*substring*". For instance, find_str /opt/homebrew conf might find files like nginx.conf or myconfig.txt (because "conf" appears in their names).

These custom functions leverage the flexibility of find combined with grep, encapsulating common search tasks into single commands. They can be defined in ~/.zshrc or another shell startup file for convenience.
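Possible definitions of these helpers (sudo is omitted here for portability; add it back for protected paths), followed by a quick self-check in a scratch directory:

```shell
# Sketch implementations of the helper functions described above.
file_grep() { find "$2" -type f -print0 | xargs -0 grep -i "$1"; }
find_re()   { find "$1" -type f -iregex "$2"; }
find_str()  { find "$1" -type f -iname "*$2*"; }

# Quick self-check against an invented file
d=$(mktemp -d)
echo "Hello FRANK" > "$d/notes.txt"
file_grep frank "$d"     # prints the matching line (case-insensitive)
find_str "$d" otes       # prints the path to notes.txt
```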
The Zsh and Bash shells keep a history of commands that have been entered. This history allows quick access to previously run commands, making it easy to repeat or modify them without retyping. There are settings to control how many commands to remember and how the history is stored.
By default, shells remember a certain number of commands (often 500). Two important environment variables control this memory:
HISTSIZE: The number of commands to keep in memory for the current session's history.
HISTFILESIZE: The maximum number of lines to store in the history file (persisted in ~/.bash_history for Bash or ~/.zsh_history for Zsh).

By increasing these values (for example, adding HISTSIZE=10000 and HISTFILESIZE=20000 in a startup file), the shell can retain a larger history (10,000 commands in memory and 20,000 in the file, in this example).
Another useful variable is HISTCONTROL (in Bash) or HIST_IGNORE_DUPS / HIST_IGNORE_SPACE in Zsh, which can be set to avoid recording duplicate commands or commands prefixed with a space (often used to avoid logging sensitive commands).
Shells provide “history expansion” shortcuts using the ! character for quickly reusing commands from history:
!! – Repeat the last command. (Equivalent to !-1.)
!n – Run command number n from the history list (as shown by history). For example, !1203 would run the command with index 1203.
!-n – Run the command from n lines back. For example, !-3 executes the third from last command. Adding :p at the end (e.g., !-3:p) will print that command instead of executing it.
!string – Run the most recent command that begins with string. For instance, !grep runs the last command that started with "grep".
!?string? – Run the most recent command that contains string anywhere in it. Example: !?ssh? would find the last command that had "ssh" in it (whether at the start or middle of the command).
^old^new – A quick substitution in the previous command. This finds the first occurrence of "old" in the last command and replaces it with "new", then executes the modified command. For example, if the previous command was ls *.jp (missing the 'g'), typing ^jp^jpg will run ls *.jpg.

These shortcuts can save time. For example, if a command fails due to insufficient permissions, using sudo !! will rerun the last command with elevated privileges (sudo).
It is also possible to search the command history interactively by pressing Ctrl-R and typing a part of a command; the shell will auto-fill the most recent matching command, which can then be executed or edited.
The shell has two primary editing modes for the command line: Emacs mode and Vi mode. By default, Zsh and Bash use Emacs keybindings, which allow familiar cursor movement and editing shortcuts (similar to common shortcuts in many applications, and to the Emacs text editor).
- Emacs mode is the default in both shells (it can be enabled explicitly with set -o emacs). It provides quick editing keys like Ctrl-A (move to start of line), Ctrl-E (move to end of line), Alt-B (move backward one word), Alt-F (move forward one word), Ctrl-K (cut text to end of line), Ctrl-Y (paste the cut text), and Ctrl-R (reverse search through history). These shortcuts make it efficient to fix typos or rerun commands.
- Vi mode is enabled with set -o vi for those who prefer Vi/Vim keybindings. In Vi mode, pressing Esc switches from insert mode to command mode. Navigation then uses h, j, k, l (left, down, up, right), and word movements with b (backward) and w (forward), etc., similar to editing in Vi.

These modes affect only how the command line is edited, not how commands execute. Emacs mode is generally easier for beginners (arrow keys and the Ctrl shortcuts are intuitive), whereas Vi mode is appreciated by those with Vi experience. Either mode can be used, depending on personal preference and familiarity.
Navigating deep directory structures can be time-consuming. Shells allow the creation of functions to act as smart shortcuts. One example is a function that quickly changes to a specific directory based on a keyword. Consider this function defined in a shell startup file:
qcd() {
case "$1" in
web) cd /opt/homebrew/var/www ;; # jump to web root
nginx) cd /opt/homebrew/etc/nginx ;; # jump to Nginx config
*) echo "qcd: unknown key '$1'"; return 1 ;;
esac
pwd # print current directory
}
This qcd function takes one argument and switches to a preset directory depending on the keyword. For example, qcd web jumps to /opt/homebrew/var/www, and qcd nginx goes to /opt/homebrew/etc/nginx. After changing directory, it prints the current directory (using pwd) as confirmation.
Custom directory shortcuts like this can save a lot of typing. Another built-in tip for quick navigation is the CDPATH environment variable – one can list base directories in CDPATH, and then cd will directly navigate into matching subdirectories from those bases. However, defining explicit functions or aliases (like qcd) provides more control over the shortcut names and target locations.
cd -

The shell provides a quick way to jump to the previous working directory using cd -. This command switches back to the last directory you were in, which is very handy for toggling between two locations.
$ pwd
/Users/alice/projects
$ cd /var/www/html
$ pwd
/var/www/html
$ cd -
/Users/alice/projects
In this example, the user was in /Users/alice/projects, then changed to /var/www/html. Executing cd - returns to /Users/alice/projects and also prints that directory path. Running cd - again would flip back to /var/www/html, and so on.
This works because the shell remembers the previous directory in the variable OLDPWD, and cd - is essentially a shorthand for cd "$OLDPWD" (with the added behavior of printing the directory after switching). It provides a quick way to bounce between two directories without retyping long paths.
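This can be verified directly; a small sketch:

```shell
cd /tmp
cd /usr            # the shell now remembers /tmp in OLDPWD
echo "$OLDPWD"     # prints /tmp
cd "$OLDPWD"       # same destination as `cd -`, minus the printed path
pwd                # prints /tmp
```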
date, seq, yes, and Advanced grep

Beyond the basic text tools, there are a few more commands that are useful in scripting and automation:
date – Display or Format Dates

The date command prints the current date and time, or formats the date/time in a specified way. This is helpful for timestamps in scripts or logs.
- date with no arguments shows the current date/time in a standard format.
- date "+%Y-%m-%d %H:%M:%S" outputs the date as YYYY-MM-DD HH:MM:SS. The + indicates a format string, where %Y is the year, %m is the month, etc. Many format specifiers are available (for example, %A for the full weekday name).
- In scripts, use $(date +%s) to get the current Unix timestamp (seconds since 1970-01-01), or embed a formatted date into a filename, e.g., backup_$(date +%Y%m%d).tar.gz.

seq – Generate Sequences of Numbers

seq outputs a sequence of numbers, one per line. It’s handy for generating loops or numbered lists.
- seq 5 produces 1, 2, 3, 4, 5 on separate lines.
- seq 2 2 10 starts at 2, steps by 2, and goes up to 10 (outputting 2, 4, 6, 8, 10).
- seq -w 1 3 prints 01, 02, 03 – the -w option pads numbers with leading zeros to equalize width.

Sequences often complement other commands – for example, generating numbered filenames in a script (by iterating over $(seq N)) or creating a simple loop in the shell.
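A few of these commands as runnable one-liners (xargs with no arguments is used here just to join the output onto one line):

```shell
seq 2 2 10 | xargs           # prints: 2 4 6 8 10
seq -w 1 3                   # prints 01, 02, 03 on separate lines
date "+%Y-%m-%d %H:%M:%S"    # current timestamp, e.g. 2025-11-12 09:30:00
printf 'backup_%s.tar.gz\n' "$(date +%Y%m%d)"   # timestamped filename
```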
yes – Automatic Confirmation or Repetition

The yes command outputs a string repeatedly (by default "y" followed by a newline, if no string is provided). Its primary use is to feed a continuous stream of a response into a program that asks for user input, effectively auto-confirming prompts.
- yes | command can be used to automatically answer "y" to every prompt. For example, if a script or installer repeatedly prompts for confirmation, piping the output of yes will send "y" responses continuously. (Many package managers and installers have their own -y or “assume yes” option, which is often preferable to using yes, but yes demonstrates how to handle generic cases.)
- yes no | command would continuously output "no". Be cautious, as yes will not stop until the receiving program terminates or its input stream is closed.
- yes can quickly produce a large volume of text. For instance, yes X | head -n 1000 > file.txt will create a file with 1000 lines of "X". Remember to stop yes with Ctrl-C if it’s running without a terminating condition, since it will otherwise run indefinitely.

Advanced grep Techniques

Earlier, this guide introduced grep for basic searches. Here are a few more useful options and patterns for powerful text searching:
- grep -R (or grep -r) searches directories recursively. Use -n to include line numbers in output and -H to always show filenames, especially when searching a single file.
- grep -A 2 -B 2 "ERROR" logfile shows 2 lines After and 2 lines Before each matching line (in this case, “ERROR”), providing context around each match. (The -C option can be used as a shorthand when the number of lines before and after is the same.)
- grep -c "pattern" file reports the number of matching lines (which can be useful for getting a quick count of occurrences).
- grep -E "fail|error|critical" file.log enables extended regular expression mode, allowing the use of alternation (in this example, matching lines that contain “fail”, “error”, or “critical”). For even more complex patterns, grep -P enables Perl-compatible regex (in GNU grep).
- grep -l "text" *.md lists only the filenames of files that contain the search text (useful to identify which files have a certain string without printing the lines).

With these options, grep becomes not just a tool to find exact lines, but also a context-aware search utility and a quick content scanner across many files.
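These options can be tried against a small sample file; a sketch (the file path and log contents are illustrative):

```shell
# Build a tiny sample log to search
cat > /tmp/sample.log <<'EOF'
starting up
ERROR: disk full
retrying
fail: timeout
done
EOF

grep -c "ERROR" /tmp/sample.log         # prints 1 (one matching line)
grep -E "fail|ERROR" /tmp/sample.log    # prints both matching lines
grep -A 1 "ERROR" /tmp/sample.log       # the ERROR line plus one line of context after it
rm /tmp/sample.log
```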
Apart from the commonly used commands, there are other text processing utilities that can be very handy in specific situations:
- tac – Concatenate and print files in reverse. (tac is essentially cat backwards.) It reads a file from bottom to top, outputting the last line first. For example, tac file.txt will display the file with lines in reverse order. Note: macOS does not include tac by default, but an equivalent effect can be achieved with tail -r (or by installing GNU coreutils, which provides tac).
- paste – Merge lines of files. It reads lines from multiple files and pastes them side by side, separated by tabs (by default). For instance, if one file contains names and another contains scores, paste names.txt scores.txt will produce a two-column output with names in the first column and corresponding scores in the second.
- diff – Compare files line by line. It outputs the differences needed to change one file into the other. By default, diff file1 file2 produces a series of lines indicating changes. The -u option provides unified diff format (commonly used for patches), and tools like colordiff or graphical diff viewers can make the output easier to interpret. Additionally, diff -r can compare directories recursively.
- tr – Translate or delete characters. It reads from standard input and writes to standard output, performing character-by-character substitution or deletion. For example, tr 'A-Z' 'a-z' will translate uppercase letters to lowercase in the input text. tr -d can delete characters (e.g., tr -d '\r' removes carriage return characters, which is useful for converting Windows-formatted text files).
- rev – Reverse the characters in each line. It’s the character-wise counterpart to tac. Running rev on a line like “Hello” outputs “olleH”. This is useful for certain text processing tricks or simply to reverse strings from the command line.

Each of these commands addresses a particular type of task. While they might not be needed every day, knowing about them means having a simple solution at hand for specific problems instead of writing more complex code or using a heavier tool.
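A quick, self-contained demonstration of paste, tr, and rev (the file paths are illustrative):

```shell
printf 'alice\nbob\n' > /tmp/names.txt
printf '90\n85\n'     > /tmp/scores.txt
paste /tmp/names.txt /tmp/scores.txt   # two tab-separated columns: names and scores
echo "Hello World" | tr 'A-Z' 'a-z'    # prints: hello world
echo "Hello" | rev                     # prints: olleH
rm /tmp/names.txt /tmp/scores.txt
```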
awk and sed

awk and sed are classic Unix utilities for processing text streams. They are extremely powerful, but even basic uses can save a lot of manual effort.
sed – Stream Editor

sed is a stream editor that reads input (from files or standard input), transforms it according to a set of commands, and outputs the result. It’s commonly used for substitution (find and replace) or for deleting lines.
- sed 's/old/new/g' file.txt will replace all occurrences of “old” with “new” in each line of file.txt. (The g at the end stands for “global,” meaning replace all occurrences in the line. Without it, only the first occurrence on each line is replaced.) By default, sed prints the transformed text to standard output. Use -i (with caution) to edit the file in place.
- sed '/^$/d' file.txt will delete all empty lines (lines matching the regex ^$, which means “start and end of line with nothing between”). Similarly, sed '5,10d' would delete lines 5 through 10 of the file.
- By default, sed prints every line (after applying edits). Using -n suppresses automatic printing, and the p command can be used to print specific lines. For example, sed -n '1,3p' file.txt will output only lines 1 through 3 of the file.

awk – Pattern Scanning and Processing Language

awk is a small programming language specialized for text processing. An awk command typically consists of a pattern and an action, and it operates on each line of input (by default).
- awk splits each input line into fields (by whitespace by default, or by a delimiter given with -F). Fields are accessed by $1, $2, etc. For example, awk '{print $2, $1}' names.txt will print the second field followed by the first field of each line in names.txt, effectively swapping the first two columns.
- A pattern placed before the { ... } action can filter which lines to act on. For instance, awk '$3 > 100 {print $1, $2}' data.txt will print the first two fields of lines where the third field is greater than 100. (If data.txt had multiple columns of numbers, this would extract those with a certain value in the third column.)
- awk has BEGIN and END blocks for actions before processing input and after processing all input, respectively. It also supports variables and arithmetic. For example, awk '{sum += $1} END {print sum}' numbers.txt will calculate and print the sum of the first field of all lines in numbers.txt.

Both sed and awk can accomplish complex text manipulations that might otherwise require writing a separate program. They have a learning curve, but mastering them (even at a basic level) greatly enhances your command-line text processing capabilities.
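Both tools can be exercised on inline input, with no files needed:

```shell
printf 'old old\n' | sed 's/old/new/g'                  # prints: new new
printf 'a\n\nb\n'  | sed '/^$/d'                        # drops the empty line
printf '1\n2\n3\n' | awk '{sum += $1} END {print sum}'  # prints: 6
printf 'a b\nc d\n' | awk '{print $2, $1}'              # swaps the two columns
```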
&&, ||, and ;

In shell scripting and on the command line, multiple commands can be strung together on one line using control operators. The three most common operators are &&, ||, and the semicolon ;. They allow conditional or sequential execution of commands:
- command1 && command2 – Executes command2 only if command1 succeeded (i.e., returned an exit status of 0). This is useful for chaining commands that depend on each other. For example, mkdir newdir && cd newdir creates a new directory and then changes into it only if the directory creation was successful.
- command1 || command2 – Executes command2 only if command1 failed (returned a non-zero exit status). For example, grep "pattern" file || echo "Pattern not found". Here, if grep does not find the pattern (and thus exits with status 1), the echo runs; if grep succeeds, the echo is skipped.
- command1; command2 – Executes command1 and then, regardless of its success or failure, executes command2. Commands separated by ; always run sequentially, ignoring exit status. For example: echo "Starting update"; brew update will print the message, then run brew update afterward.

Using && and || can make one-liners and scripts more concise and robust by handling success or failure in-line. They function like logical AND and OR, but for command execution flow.
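The three operators in action (the directory and messages are illustrative; the last line is wrapped in sh -c only so the deliberate failure in the middle doesn’t abort a strict-mode script):

```shell
d=$(mktemp -d) && echo "created $d"                      # echo runs only because mktemp succeeded
grep -q "pattern" /dev/null || echo "Pattern not found"  # grep fails on empty input, so echo runs
sh -c 'echo "first"; false; echo "second"'               # both echoes print; the failure between them is ignored
rmdir "$d"
```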
$(...) to Embed Commands

Command substitution allows the output of one command to be inserted into the arguments of another command. In Bash and Zsh, this is done with the syntax $( ... ). (Older syntax using backticks `...` is still supported, but $(...) is generally preferred for readability and nesting.)
For example, suppose there are many .txt files and you want to move those that contain the word “Kansas” to a directory named “kansas”. This can be done by combining grep with mv using command substitution:
$ mv $(grep -l "Artist: Kansas" *.txt) kansas/
Here’s how it works:
- grep -l "Artist: Kansas" *.txt lists all .txt filenames in the current directory that contain the string “Artist: Kansas”. The -l option makes grep output only the names of files with matches, rather than the matching lines.
- The $( ... ) around that command captures its output. The shell replaces the entire $(grep ...) expression with the list of files that grep found.
- The mv command then effectively sees mv file1.txt file2.txt ... kansas/, which moves all those files into the kansas/ directory (which should exist beforehand).

Command substitution is extremely useful for embedding the result of one command into another. Another simple example: echo "Today is $(date +%A)" might output “Today is Wednesday” (if run on a Wednesday), since $(date +%A) is replaced by the output of the date +%A command (which prints the weekday name).
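A couple of runnable substitution examples:

```shell
echo "Today is $(date +%A)"        # the weekday name is substituted at run time
echo "Files here: $(ls | wc -l)"   # embed a count in a message
newest=$(ls -t | head -n 1)        # capture command output in a variable
echo "Most recently modified: $newest"
```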
Process substitution is a feature of Bash and Zsh that allows the output of a command (or input to a command) to be referenced like a file. The syntax is <(command) for substituting output and >(command) for substituting input (the latter is less commonly used). It’s often used with commands that normally take file arguments, enabling them to instead operate on dynamic data.
One classic example is comparing the outputs of two commands using diff. Normally diff compares two files, but with process substitution:
$ diff <(ls *.jpg | sort) <(seq -f "%g.jpg" 1 1000)
This compares two “virtual” files:
- The output of ls *.jpg | sort (a sorted list of .jpg files in the current directory).
- The output of seq -f "%g.jpg" 1 1000 (which generates the sequence 1.jpg, 2.jpg, ..., 1000.jpg).

The diff command sees each of those outputs as a file (the shell creates temporary file descriptors for them) and thus can report differences between the actual files present and the expected sequence of files.
The <( ) syntax runs the commands inside the parentheses and provides paths like /dev/fd/63 that refer to their outputs. This can be used in many situations where a command expects a filename, allowing you to feed in process outputs directly. Keep in mind that process substitution is not a POSIX feature; it is specific to certain shells (like Bash and Zsh).
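A minimal self-contained comparison; note this must run under Bash or Zsh (not plain sh), and that diff exits non-zero when its inputs differ, hence the trailing || true:

```shell
diff <(printf 'a\nb\nc\n') <(printf 'a\nb\nd\n') || true
# Output:
# 3c3
# < c
# ---
# > d
```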
Usually, commands are executed by typing them in the shell or running a script file. However, it’s also possible to execute commands that come from standard input or a file, effectively treating command text as a script on the fly.
For instance:
$ echo "ls -l /Applications" | bash
This pipeline sends the string "ls -l /Applications" into the Bash shell’s standard input. Bash will execute the commands it reads from stdin, in this case listing the /Applications directory with details. The effect is similar to writing that command in a file and then running bash file, but it’s done without an intermediate file, using a pipe.
Similarly, if a file (say commands.txt) contains a series of commands, running bash < commands.txt will execute those commands in sequence as if they were typed into the shell. This is a simple way to run a batch of commands without creating a separate script file (though functionally it is like a temporary script).
When using this technique, especially with input from untrusted sources, be cautious: feeding text into sh or bash will execute whatever commands are present, which can be dangerous if the content is not known to be safe.
On the command line, appending an ampersand (&) to a command will execute it in the background. This means the shell prompt returns immediately and the command runs concurrently, allowing you to continue other work. Background jobs are useful for long-running processes that you don’t need to monitor actively.
For example:
$ long_running_script.sh &
[1] 4821
When run, this command outputs a job identifier in brackets (here [1]) and a process ID (4821). The shell prompt returns while long_running_script.sh continues executing in the background.
The shell provides built-in commands to manage these background jobs:
- jobs – Lists current jobs, showing their job number and status (Running, Stopped, etc.).
- fg %1 – Brings job number 1 to the foreground (so it becomes active and occupies the shell again). Simply fg without an argument brings the most recent background job to the foreground.
- bg %1 – If a job is stopped (perhaps via Ctrl-Z), this resumes it in the background.
- kill – Sends a signal to a process, usually to terminate it. You can use either the process ID or the job number with a percent sign (e.g., kill 4821 or kill %1) to stop a background job. Use kill -9 (SIGKILL) only if a normal kill doesn’t work, since it forces termination.

If the output of a background process isn’t redirected, it will still appear in the terminal (possibly intermixing with your typed commands). To avoid this, it’s common to redirect the output to a file (using > filename 2>&1) or to /dev/null if it’s not needed.
The shell will notify you when a background job completes or stops, typically with a message like [1]+ Done long_running_script.sh the next time you press Enter. This is a helpful reminder that the job finished (or was suspended).
For processes that need to continue even after logging out, consider using nohup (no hang-up) or a terminal multiplexer like screen or tmux. Running nohup command & will make the process ignore the hang-up signal (so it isn’t terminated when the session ends), and by default it will write output to a file named nohup.out if not redirected.
A subshell is a separate shell process launched by the current shell. Parentheses ( ... ) around a sequence of commands will execute them in a subshell. A common use of subshells is to localize a change in state (like the working directory or shell options) so that it doesn’t affect the main shell. This is frequently seen in advanced one-liners and scripting patterns, especially when working with archiving or file operations.
Consider these use cases:
- cat archive.tar.gz | (mkdir -p /tmp/other && cd /tmp/other && tar xzvf -) – This one-liner creates a directory /tmp/other (if it doesn’t exist), then changes into it within a subshell, and extracts the archive from standard input (the - tells tar to read from stdin). The current shell’s working directory is unchanged by the cd because it occurs in the subshell. After running, the files from the archive will be in /tmp/other.
- tar czf - dir1 | (cd /tmp/dir2 && tar xzf -) – Here, tar czf - dir1 creates a tarball of the directory dir1 and writes it to stdout (the - as the output file means “standard output”). That output is piped into a subshell where we first cd /tmp/dir2 and then run tar xzf - to extract from stdin into /tmp/dir2. This effectively copies the contents of dir1 into /tmp/dir2 in one line, without creating an intermediate tar file on disk.
- tar czf - dir1 | ssh user@remote "(cd /path/on/remote && tar xzf -)" – This variation uses ssh to run a subshell on a remote host. The local tar writes to stdout, which ssh then feeds into the remote shell’s tar xzf - command. The subshell on the remote side ensures that the extraction happens in the directory /path/on/remote. This is a convenient way to transfer an entire directory to a remote location in one command.
- Another common pattern is to cd into a directory in a subshell and run tar there so that the archive’s content doesn’t include unwanted path prefixes. (See the Tar Archiving and Backups section below for an example.)

These examples show how subshells can be combined with pipes to perform tasks in isolated environments. By using && between commands inside the subshell, each step (e.g., directory creation or changing directory) must succeed for the next to run, adding a measure of safety. Using - with tar instructs it to use standard input or output, which, combined with ssh for remote commands, enables powerful one-liners for copying or deploying files.
In summary, subshells (with ( ... )) allow you to create a temporary shell with its own environment, which is very useful for containing side effects. After the subshell command completes, you’re back in the original shell environment as if nothing changed, except for whatever output or files the subshell’s commands produced.
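The containment is easy to observe:

```shell
before=$PWD
(cd /tmp && echo "inside subshell: $PWD")   # the cd happens only inside the subshell
echo "after subshell: $PWD"                 # unchanged — same directory as $before
```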
Subshells and Replacing the Shell (exec)

When you run a command like bash (or zsh) from your current shell, you create a new subshell (a child process of your current shell). Your prompt will change (perhaps to indicate a new shell level), and you can exit this subshell to return to the original shell. Any changes (like changed directories or variables) made in the subshell won’t affect the parent shell once it’s exited.
For example:
$ echo $PS1
\u@\h:\w\$
$ bash # start a new Bash subshell
$ PS1="subshell$ " # change the prompt in the subshell
subshell$ exit # exit the subshell
$
In this snippet, the prompt \u@\h:\w\$ is the default format (user@host:cwd$). After starting bash, we’re in a subshell; changing PS1 only affects the subshell’s prompt (shown here as subshell$). Exiting the subshell returns to the original shell, which still has the original prompt and environment.
Sometimes, however, you might want to completely replace the current shell process with another program. The exec command achieves this: when you exec program, your current shell is replaced by program, and when program terminates, there is no shell process to return to.
For instance, running exec bash in a shell will not create a nested subshell. It will replace the current shell with a new Bash process. When you then exit that Bash, the session ends (since the original shell was replaced). This can be useful in scripts (to hand off control to another program) or when logging in (some configurations exec a login shell to replace the initial process).
In summary, parentheses ( ... ) launch a child shell (subshell) to execute commands without affecting the parent shell, whereas exec replaces the current shell process entirely with a new program. Subshells are great for containment and local scope; exec is for when you are done with the current process and want to run something else in its place.
Throughout this guide, many one-liner commands have been demonstrated – powerful combinations of tools strung together to perform complex tasks in one go. Mastering one-liners can significantly boost efficiency on the command line. Here are a few additional examples of handy one-liners:
- Find the largest files and directories: du -ah . | sort -h -r | head -n 10. This uses du (disk usage) with -a to list sizes of all files and directories under the current directory, and -h for human-readable sizes. The list is then sorted by size (sort -h understands the size suffixes due to -h) in reverse order (-r, so largest first), and head -n 10 picks the top 10 entries.
- Search for text recursively: grep -R "needle" . prints all occurrences of "needle" in files under the current directory. Add -n for line numbers or -l to list only the filenames containing the string. (For even faster recursive searching with ignore patterns, a tool like rg (ripgrep) can be used similarly.)
- Count occurrences of a word: grep -o "apple" file.txt | wc -l. Here, grep -o outputs each occurrence of “apple” on a new line, and wc -l counts those lines, yielding the number of times “apple” appears in the file. (Using grep -c would count matching lines, which counts a line only once even if the word appears multiple times on that line.)
- Batch-rename file extensions: for f in *.jpeg; do mv "$f" "${f%.jpeg}.jpg"; done. This loop iterates over all .jpeg files in the directory and, for each file, constructs a new name by replacing the .jpeg extension with .jpg (${f%.jpeg}.jpg) and then renames the file. The ${f%.jpeg} parameter expansion strips the .jpeg suffix from $f, and then .jpg is appended.
- Watch a command’s output change over time: watch -n 5 'df -h'. The watch command runs the given command repeatedly (every 5 seconds, in this case) and updates the display. This example re-runs df -h (disk free space in human-readable units) every 5 seconds, so you can observe changes over time. (On macOS, watch is not installed by default, but you can install it via Homebrew or use alternatives like running a loop in a script.)

With practice, it becomes easier to create one-liners to solve problems on the fly. It is always good practice to break down a complex one-liner (as done above) to ensure it is understood and accomplishes exactly what is intended.
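The ${f%.jpeg} expansion used in the rename loop can be checked in isolation, and the loop itself can be dry-run safely by prefixing mv with echo (filenames here are illustrative):

```shell
f="photo.jpeg"
echo "${f%.jpeg}.jpg"        # prints: photo.jpg

# Dry run of the rename loop — prints the mv commands without executing them
for f in photo.jpeg scan.jpeg; do
    echo mv "$f" "${f%.jpeg}.jpg"
done
```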
- wc -l file.txt – Print the number of lines in file.txt. (Use -w to count words, -c for bytes.)
- head -n 20 file.txt – Show the first 20 lines of a file. (tail -n 20 shows the last 20 lines.)
- cut -d ',' -f1-3 data.csv – Extract the 1st through 3rd fields from a CSV file (assuming comma delimiter).
- sort -r -n numbers.txt – Sort the lines numerically (-n) and reverse (-r) the order to get highest to lowest.
- uniq -c sorted.txt – Collapse adjacent duplicate lines in a sorted file, prefixing each unique line with its occurrence count (-c).
- tac file.txt – Output the file with lines in reverse order (last line first). (Not available by default on macOS; use tail -r as an alternative.)
- paste file1.txt file2.txt – Combine two files line-by-line (joining lines with a tab by default).
- tr '[:upper:]' '[:lower:]' – Translate uppercase to lowercase from standard input (outputs to standard output). Use tr -d to delete characters (e.g., tr -d '\n' to remove newlines).
- rev file.txt – Reverse each line of the file (characters in each line are reversed).
- diff -u old.txt new.txt – Compare two files and produce a unified diff (lines prefixed with +/- to show changes).
- which command – Show the path of the executable that would run for "command". (E.g., which python might output /usr/bin/python.)
- type command – Show how "command" is interpreted by the shell (alias, function, built-in, or file path).
- find /path -type f -name "*.log" – Recursively find files under /path that match the pattern *.log. (Use -type d to search for directories.)
- find . -size +100M -exec ls -lh {} \; – Find files larger than 100MB in the current directory (and subdirectories) and list them with details (ls -lh). The \; terminates the -exec clause.
- find . -name "*.tmp" -exec rm {} \; – Find and remove all files ending with .tmp in the current directory tree. (Do a dry run first: replace rm with echo rm to verify.)
- grep -R "foo" /path – Recursively search for "foo" in all files under /path. Add -n for line numbers, -i for case-insensitive search.
- grep -v "^#" – From input, output all lines that do not start with # (useful for filtering out comment lines).
- xargs – Construct argument lists and execute a command. Often used with find: e.g., find . -type f -print0 | xargs -0 grep "hello" to search for "hello" in all files found (using null-terminated strings for safety).
- printenv – Display all environment variables (or use printenv VAR for a specific variable).
- export VAR=value – Set an environment variable for the current shell (and its subprocesses). E.g., export EDITOR=vim.
- alias ll='ls -la' – Create an alias ll for the command ls -la. Add alias definitions to ~/.zshrc (or ~/.bashrc) to make them permanent.
- source ~/.zshrc (or . ~/.zshrc) – Apply changes from the Zsh configuration file to the current session (reload the startup file). Replace ~/.zshrc with ~/.bashrc for Bash.
- HISTSIZE, HISTFILESIZE – Variables controlling how many commands to keep in history (in memory and in the history file, respectively).
- HISTCONTROL=ignoredups:ignorespace (Bash) / HIST_IGNORE_DUPS, HIST_IGNORE_SPACE (Zsh) – Settings to ignore consecutive duplicate commands and those starting with a space in history.
- Ctrl-R – Search backward through command history interactively. Type to filter, and press Ctrl-R repeatedly to cycle through matches.
- set -o vi / set -o emacs – Switch the command-line editing mode to Vi or Emacs style. (Emacs mode is the default; Vi mode provides Vi/Vim key bindings.)
- cd - – Switch to the previous directory (toggle back and forth between two recent directories).
- pushd dir / popd – Use a directory stack. pushd changes directory (like cd) but pushes the old directory onto a stack; popd pops it off to return. Useful for jumping around and then back.
- command1 && command2 – Run command2 only if command1 succeeds (exit status 0).
- command1 || command2 – Run command2 only if command1 fails (non-zero exit status).
- command1; command2 – Run command1 and then command2 unconditionally (sequentially).
- $(command) – Command substitution: use the output of command in place in another command. (E.g., echo "Files: $(ls | wc -l)" prints “Files: X”.)
- diff <(cmd1) <(cmd2) – Process substitution: treat the output of cmd1 and cmd2 as files and compare them with diff (or use as inputs to other commands that expect files).
- echo "command" | bash – Pipe a dynamically generated command or script into a new shell (here Bash) to execute it.
- cmd & – Run cmd in the background. Use jobs to list background jobs, fg to foreground a job, kill to terminate.
- nohup cmd & – Run cmd immune to hangup (logout), in the background. Output goes to nohup.out by default.
- (cd /tmp && do_something) – Execute commands in a subshell (here changing directory to /tmp only within that subshell). The current directory of the main shell remains unchanged.
- exec program – Replace the current shell with program. The current shell process ends, and program takes over (when program exits, the session ends).
- for i in {1..5}; do ...; done – Loop from 1 to 5 (inclusive). Inside the loop, $i holds the current number. Useful for simple iterative tasks in shell scripts.

Written on November 12, 2025
qcd Navigation, Tar Backups, and Find Utilities (Written November 12, 2025)

Apple’s macOS now uses Zsh as the default shell (since macOS Catalina), replacing Bash. This means configuration that would go into ~/.bashrc on Linux is instead placed in a Zsh startup file. For login shells (the default in the Terminal app), Zsh reads ~/.zprofile (and ~/.zshrc for interactive shells). To permanently add environment variables or aliases on macOS, edit your ~/.zprofile (or ~/.zshrc) and add the desired export, alias, or function definitions. After saving, source the file (or reopen Terminal) to apply changes.
For example, to ensure Homebrew’s paths are set up in Zsh, include the lines below in ~/.zprofile. This runs Homebrew’s shell setup and adds Homebrew’s sbin directory to PATH:
# Homebrew environment setup (for macOS)
eval "$(/opt/homebrew/bin/brew shellenv)" # Homebrew: set PATH and variables
export PATH="/opt/homebrew/sbin:$PATH" # include Homebrew sbin in PATH
Explanation: The brew shellenv command outputs the necessary export statements to configure Homebrew, which we evaluate in the shell. We also explicitly prepend Homebrew’s sbin directory so commands like Nginx (if installed via Homebrew) are found. After adding these, you can use source ~/.zprofile (or restart the shell) to apply them.
qcd Function
If you frequently work in certain directories, you can define a “quick change directory” function qcd to jump to them by a short keyword. This avoids typing long paths each time. For instance, define qcd in your ~/.zprofile as follows:
# Quick change directory function (qcd) for common locations
qcd() {
case "$1" in
web) cd /opt/homebrew/var/www ;; # jump to web root
nginx) cd /opt/homebrew/etc/nginx ;; # jump to Nginx config
*) # fallback for unknown keyword
echo "qcd: unknown key '$1'"
return 1 ;;
esac
pwd # print the current directory to confirm location
}
In this example, running qcd web will cd into /opt/homebrew/var/www, and qcd nginx will go to /opt/homebrew/etc/nginx. The function uses a Bash/Zsh case statement to map a short key to each target directory. If an unsupported key is given, it prints an error message and returns a non-zero status. After changing directory, it prints the new working directory (pwd) so you always know where you ended up.
Tip: You can extend the qcd function with as many locations as needed by adding more case patterns. In Bash, you could even set up tab-completion for the keywords (e.g. using complete -W "web nginx" qcd), though this particular complete command works in Bash but not Zsh. (Zsh has its own completion system if desired.)
Alternatively, if you prefer separate commands per location, simple aliases can suffice. For example, you might add:
alias gonginx="cd /opt/homebrew/etc/nginx/"
alias gohttp="cd /opt/homebrew/var/www/"
in ~/.zprofile, which would let you type gohttp to jump to the web directory, etc. These aliases achieve a similar result, but using the single qcd function is a more scalable approach for multiple destinations.
Regular backups of your web directory are important. We can create a shell function to compress the /opt/homebrew/var/www folder into a timestamped tar.gz archive. A basic approach is: change to /opt/homebrew/var, run tar -zcvf on the www folder, then return. However, a cleaner method is to use a subshell to temporarily change directories for tar, which avoids affecting your current shell’s directory.
For example, here’s a tar_backup function that archives the www folder with a filename prefixed “WEB” plus the current date (YYYYMMDD). It ensures uniqueness by appending a counter if an archive of the same day already exists:
tar_backup() {
local base="WEB$(date +"%Y%m%d")" # e.g. WEB20251112 for Nov 12, 2025
local filename="${base}.tar.gz"
local counter=1
# If a file with this name exists, append _1, _2, ... to the base name
while [ -e "/opt/homebrew/var/$filename" ]; do   # check where the archive will actually be created
filename="${base}_${counter}.tar.gz"
counter=$((counter + 1))
done
# Create the tar.gz archive of the www directory
( cd /opt/homebrew/var && tar -zcvf "$filename" www )
}
This function uses a subshell ( ... ) to cd into /opt/homebrew/var for the tar command, then automatically returns to the original directory when done. The archive will contain the www/ folder at its top level (because we ran tar from /opt/homebrew/var) rather than an absolute path. Using a subshell in this way isolates the directory change to that subshell process. (Alternatively, GNU tar’s -C option can change directory for you: e.g. tar -C /opt/homebrew/var -zcvf "$filename" www, achieving the same effect without a subshell.)
The naming logic above checks if WEB20251112.tar.gz exists, and if so, tries WEB20251112_1.tar.gz, WEB20251112_2.tar.gz, etc., ensuring no backup is overwritten. The date +"%Y%m%d" part embeds the date in the filename for easy reference.
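The counter logic can be exercised in isolation. The sketch below uses a throwaway directory under /tmp (a hypothetical path, not /opt/homebrew/var) and a helper, next_name, that mirrors the loop from tar_backup above:

```shell
# Exercise the uniqueness-counter naming in a scratch directory (hypothetical path)
demo=/tmp/tar_name_demo
rm -rf "$demo" && mkdir -p "$demo" && cd "$demo"

next_name() {
  local base="WEB$(date +%Y%m%d)"
  local filename="${base}.tar.gz"
  local counter=1
  while [ -e "$filename" ]; do            # same loop as in tar_backup above
    filename="${base}_${counter}.tar.gz"
    counter=$((counter + 1))
  done
  echo "$filename"
}

next_name                                  # no file yet: prints WEB<date>.tar.gz
touch "WEB$(date +%Y%m%d).tar.gz"          # simulate today's backup already existing
next_name > chosen.txt                     # now picks WEB<date>_1.tar.gz
cat chosen.txt
```

Running it twice more after creating each chosen file would yield the _2 and _3 variants in turn.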
Usage: Simply call tar_backup in the terminal; the archive will be created in /opt/homebrew/var (because of the subshell cd), regardless of the directory you run it from. You can extend this idea further: for instance, a tar_web <name> function could take a custom name and produce <name>.tar.gz from the www folder, or a simplified tar_backup_prototype could skip the uniqueness checks, but the above version is robust for daily use.
To restore or inspect a backup, you can extract the tar.gz archive. For safety, it’s wise to extract into a specific directory. Here’s a tar_extract function to do that:
tar_extract() {
if [ -z "$1" ]; then
echo "Usage: tar_extract <archive.tar.gz> [target_dir]"
return 1
fi
local archive="$1"
local target="${2:-/opt/homebrew/var}" # default target is /opt/homebrew/var
mkdir -p "$target"
tar -zxvf "$archive" -C "$target" # extract here (z=gunzip, x=extract, v=verbose)
}
This will extract the given tarball into /opt/homebrew/var by default (or into a specified target directory if you provide one). It uses tar -zxvf where -z handles gzip compression and -C <dir> changes to the target directory before extraction. For example, tar_extract WEB20251112.tar.gz would unpack the www folder from that archive back under /opt/homebrew/var. (We created the archive with www/ as the top-level folder, so extracting into /opt/homebrew/var recreates /opt/homebrew/var/www and its contents.)
Verification: You can list the contents of a tar file without extracting using tar -tvf archive.tar.gz (t=list, v=verbose). Or perform a quick integrity check by combining commands: for instance, cat WEB20251112.tar.gz | tar -tzvf - will concatenate and list the archive contents in one step. This one-liner uses a pipe to feed the tar file into tar for listing, similar to how one might combine split files or process an archive from standard input.
Note on Subshell Pipelines: The backup function above used a subshell for convenience. This idea comes from the classic “tar pipe” trick to copy or move directory trees. For example, one can clone a directory hierarchy with permissions preserved by running: (cd src && tar -cf - .) | (cd dst && tar -xpf -). Here each tar runs in a subshell that cd to the proper directory before executing, and a pipe connects the archive stream between them. In modern usage, the tar -C flag can achieve the same without subshells, but it’s good to recognize this pattern as it shows how subshells isolate the working directory and environment for each part of a pipeline.
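The tar-pipe pattern is easy to verify locally. A minimal sketch, using scratch directories under /tmp (hypothetical paths chosen for illustration):

```shell
# Clone a small directory tree with the classic tar pipe
src=/tmp/tarpipe_src; dst=/tmp/tarpipe_dst
rm -rf "$src" "$dst" && mkdir -p "$src/sub" "$dst"
echo "hello" > "$src/a.txt"
echo "world" > "$src/sub/b.txt"

# Each tar runs in a subshell cd'ed into its own directory;
# the pipe carries the archive stream between them
(cd "$src" && tar -cf - .) | (cd "$dst" && tar -xpf -)

ls "$dst/sub"    # b.txt now exists in the clone
```

The -p flag on the extracting side preserves permissions, which is why this pattern was historically preferred over cp for copying trees.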
Sometimes you need to search through files or find files by name pattern. The following custom functions (to put in ~/.zprofile) make use of find and grep to simplify common search tasks. All searches are case-insensitive by default (using -i or -iname):
file_grep <text> <path> – Search within files for a given text string under the specified directory. It will recurse through all files under <path> and print any lines that match <text>. For example, file_grep "nginx" /opt/homebrew will find all occurrences of “nginx” in files under /opt/homebrew. Implementation: it uses find <path> -type f -print0 | xargs -0 grep -i "<text>" to handle filenames safely and pass them to grep. (The -print0 and xargs -0 ensure file names with spaces/newlines are handled correctly.) Note: sudo is included before find in case some subdirectories require root access; you can drop the sudo if not needed. The output will include filenames and the matching lines. If you prefer to see line numbers, add the -n option to the grep command (making it grep -in) so each result shows the line number of the match.
find_re <path> '<regex>' – Find files by a regular expression pattern on their full path. This uses find <path> -type f -iregex "<regex>", which matches the entire file path against the regex (case-insensitively). For example, find_re /opt/homebrew '.*frank.*' would list all files under /opt/homebrew whose path (including filename) contains “frank” as a substring (the .* wildcards before and after ensure the regex matches anywhere in the path). This is powerful for flexibly filtering files by name pattern beyond simple wildcards.
find_str <path> <substring> – Find files by a simpler substring match in the filename. It runs find <path> -type f -iname "*<substring>*", so any file whose name (case-insensitive) contains the given text will be listed. For instance, find_str /opt/homebrew 'conf' might find files like nginx.conf or myconfig.txt. This is essentially a shortcut for using the -iname wildcard match of find.
Each of these functions prints results to standard output.
They can be piped to other tools (for example, append | less for easier browsing if the output is very long). They illustrate the combination of find and grep for powerful searching in shell scripts. In fact, you could accomplish similar text search with a single grep command (e.g. grep -R "nginx" /opt/homebrew to recursively search within files, or adding -l to list filenames only), but the find | xargs grep approach is more controlled and can be more flexible – for example, you could modify it to restrict by file type or size using find options, or to avoid searching in certain subdirectories.
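One possible set of definitions matching the descriptions above (a sketch, not the author’s exact code; the sudo mentioned for file_grep is omitted here so the functions run unprivileged):

```shell
# Sketches of the three search helpers described above
file_grep() {   # file_grep <text> <path>: grep for <text> inside files under <path>
  find "$2" -type f -print0 | xargs -0 grep -i "$1"
}
find_re() {     # find_re <path> '<regex>': match full paths against a regex, case-insensitively
  find "$1" -type f -iregex "$2"
}
find_str() {    # find_str <path> <substring>: list filenames containing <substring>
  find "$1" -type f -iname "*$2*"
}

# Quick self-check in a scratch directory
d=/tmp/findfns_demo; rm -rf "$d"; mkdir -p "$d"
echo "nginx listens on 8080" > "$d/server.conf"
echo "unrelated" > "$d/notes.md"
file_grep nginx "$d"           # prints the matching line with its filename
find_re "$d" '.*server.*'      # lists server.conf by regex on the full path
find_str "$d" conf             # lists server.conf by filename substring
```

Dropping these into ~/.zprofile (with sudo restored if you search protected trees) reproduces the behavior described above.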
Place the above function and alias definitions into your ~/.zprofile (or ~/.zshrc), then run source ~/.zprofile to load them immediately. New Terminal sessions will load them automatically. With this setup on macOS, you benefit from a customized shell environment:
qcd web takes you to project directories instantly.
tar_backup safely archives your work with timestamps, and tar_extract helps restore them.
file_grep, find_re, and find_str let you locate files and content with ease.
These improvements streamline common tasks and align with macOS’s Zsh shell conventions, making your command-line workflow more efficient and enjoyable.
Written on November 12, 2025
Tar ( tar ) is the standard archiving tool on macOS for combining multiple files and directories into a single archive file, often with compression. It’s commonly used for backups, transferring directory structures, and bundling projects. This section covers how to create and extract tar archives in the Terminal, best practices for safe archiving, and advanced techniques for using tar in backup scripts and remote operations.
Creating tar Archives
The basic syntax to create a tar archive is:
tar -czf archive-name.tgz source-folder
In this command, -c means “create a new archive,” -z compresses it with gzip (producing a .tgz or .tar.gz file), and -f specifies the archive filename. For example, to archive a folder LotsOfFiles into a compressed tarball:
tar -czf LotsOfFiles.tgz LotsOfFiles
If the folder is large, add the verbose flag -v to monitor progress. This will list each file as it’s added:
tar -czvf LotsOfFiles.tgz LotsOfFiles
LotsOfFiles/
LotsOfFiles/file1.txt
LotsOfFiles/file2.txt
...
By default, tar archives store paths relative to the current working directory. To avoid capturing unnecessary path components and to isolate the operation, it’s best practice to either run tar from the directory containing the files or use tar’s -C option. For example, to archive the contents of /opt/homebrew/var/www without embedding the full path:
tar -czf ~/backups/www_site.tgz -C /opt/homebrew/var www
This changes to /opt/homebrew/var while archiving, so that inside www_site.tgz the files are stored under a relative www/ directory rather than /opt/homebrew/var/www . Equivalently, one can use a subshell to similar effect:
(cd /opt/homebrew/var && tar -czf ~/backups/www_site.tgz www)
Tip: Running archive commands in a subshell (as shown above with parentheses) ensures your current shell’s working directory is not changed by the cd. Using -C is functionally the same and improves path safety by preventing unintended parent directories from ending up in the archive.
To extract a tar archive, use the -x (extract) flag. For instance, to extract LotsOfFiles.tgz in the current directory:
tar -xzf LotsOfFiles.tgz
Adding -v will show each file as it’s unpacked:
tar -xzvf LotsOfFiles.tgz
LotsOfFiles/
LotsOfFiles/file1.txt
LotsOfFiles/file2.txt
...
It’s often wise to control where the files go when extracting. Use -C to specify a target directory. For example, to extract into a new folder on the Desktop:
mkdir ~/Desktop/Recovery
tar -xzvf project_backup.tgz -C ~/Desktop/Recovery
This ensures all files from project_backup.tgz end up in the Recovery directory. Before extracting unfamiliar archives, it’s good practice to list their contents first with tar -tf (or tar -ztf for gzip-compressed archives) to preview the files and folder structure. This helps avoid any surprises such as “tarbombs” (archives that unpack files all over the current directory). Always extract to an isolated directory or use -C to avoid overwriting important files in your working location.
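The list-then-extract habit can be rehearsed end to end. A minimal sketch using scratch paths under /tmp (hypothetical names, not a real backup):

```shell
# Build a small archive, preview its contents, then extract into an isolated directory
work=/tmp/tar_preview_demo
rm -rf "$work" && mkdir -p "$work/project"
echo "data" > "$work/project/file1.txt"
tar -czf "$work/project_backup.tgz" -C "$work" project

tar -ztf "$work/project_backup.tgz"          # preview: project/, project/file1.txt

mkdir -p "$work/Recovery"
tar -xzf "$work/project_backup.tgz" -C "$work/Recovery"
```

Because the preview shows a single top-level project/ directory, you know the archive is not a tarbomb before extracting it.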
tar_backup
To automate backups, you can use a shell function that creates a timestamped archive. Below is an example function that compresses a directory (e.g. a local website in /opt/homebrew/var/www ) into a date-stamped tar.gz file. It ensures each backup filename is unique by appending a number if a file for that date already exists:
tar_backup() {
  local src="/opt/homebrew/var/www"
  local dest="$HOME/backups"          # use $HOME rather than "~" so the path expands inside quotes
  local base="www_$(date '+%Y%m%d')"
  local archive="$dest/${base}.tar.gz"
  mkdir -p "$dest"                    # ensure destination directory exists
  local n=0
  while [ -e "$archive" ]; do
    n=$((n+1))
    archive="$dest/${base}_$n.tar.gz"
  done
  tar -czf "$archive" -C "$src" .
  echo "Backup archive created: $archive"
}
This tar_backup function will create an archive like www_20251112.tar.gz (using the current date). If you run it again on the same day, it will produce www_20251112_1.tar.gz , www_20251112_2.tar.gz , and so on, avoiding overwrites. The use of -C \"$src\" . means “change to the source directory and archive . (everything in it)”, so the archive contains only the contents of www , not the entire path. The function also creates the backup directory ( ~/backups ) if it doesn’t exist. For example:
$ tar_backup
Backup archive created: ~/backups/www_20251112.tar.gz
After running tar_backup , the ~/backups folder will contain the new compressed archive of the www directory. You can adjust the src and dest paths in the function to suit what you’re backing up and where you want the backup stored.
tar_extract
Just as creating backups can be scripted, extracting archives safely can be automated with a function. The following tar_extract function takes an archive filename and an optional target directory. It will create a directory (if not provided, it derives one from the archive name) and extract the archive into that location, keeping the files contained:
tar_extract() {
  local archive="$1"
  local dest="$2"
  if [ -z "$archive" ]; then
    echo "Usage: tar_extract <archive-file.tar.gz> [destination-folder]"
    return 1
  fi
  if [ -z "$dest" ]; then
    dest="${archive%%.tar*}"   # strip .tar, .tar.gz, .tar.bz2, etc.
  fi
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
  echo "Extracted $archive to $dest/"
}
With this function, extracting an archive is as simple as calling tar_extract archive.tar.gz . If destination-folder isn’t specified, it creates a folder derived from the archive’s name (the path minus its .tar* extension) to hold the contents. For example, running tar_extract ~/backups/www_20251112.tar.gz will make the directory ~/backups/www_20251112 (alongside the archive, since the derived name keeps the archive’s path) and unpack the archive into it. The function’s approach of always using a dedicated extraction directory and the -C flag helps prevent the accidental scattering of files into the wrong location. The output confirms where the files were extracted:
$ tar_extract ~/backups/www_20251112.tar.gz
Extracted ~/backups/www_20251112.tar.gz to ~/backups/www_20251112/
Tar’s ability to read from standard input and write to standard output makes it powerful for copying or deploying files between systems without intermediate files. By combining tar with SSH, an entire directory can be sent over the network and unpacked on the fly. For example, to copy a local directory to a remote server:
tar -czf - -C /opt/homebrew/var www | ssh user@remote-server "tar -xzf - -C /opt/homebrew/var"
This one-liner does the following: on the local machine, tar archives the /opt/homebrew/var/www directory and sends the compressed archive to stdout (the - after -czf signifies “write to standard output”). The SSH command connects to remote-server and runs another tar that reads from stdin ( -xzf - ) and extracts into /opt/homebrew/var . In effect, it streams the www directory to the remote host and unpacks it there, all in a single step. This is an efficient way to deploy files or back up data to a remote machine without creating a temporary archive file on either side. (Add a -v to one or both tar commands if you want to see the list of files transferred.)
The same technique can be reversed to pull files from a remote system. For instance, to back up /var/logs from a remote server onto your Mac:
ssh user@remote-server "tar -czf - /var/logs" | tar -xzf - -C ~/Downloads/remote-logs
Here, the tar on the remote side writes a compressed archive of /var/logs to stdout, which is piped over SSH to the local tar command that immediately extracts it into ~/Downloads/remote-logs . This way, the logs are retrieved and unpacked on the Mac without needing an intermediate file.
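Without a remote host handy, the shape of this streaming pattern can be rehearsed locally by replacing the SSH hop with a plain pipe. The paths below are scratch directories chosen for illustration:

```shell
# Local stand-in for the SSH streaming pattern: producer tar | consumer tar
rm -rf /tmp/stream_src /tmp/stream_dst
mkdir -p /tmp/stream_src/www /tmp/stream_dst
echo "index" > /tmp/stream_src/www/index.html

# Producer writes the archive to stdout (-f -); consumer reads stdin and extracts with -C
tar -czf - -C /tmp/stream_src www | tar -xzf - -C /tmp/stream_dst
```

Swapping either side of the pipe for an ssh invocation, as in the examples above, turns this into a real remote transfer.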
Advanced tar Pipelines
Because tar can work with standard input/output, it excels in advanced shell scenarios using pipes and process substitution. Here are a couple of powerful techniques:
Diffing two archives: To compare the contents of two tar archives (for example, two different backups) without extracting them, you can list their file names and use the diff command with process substitution:
diff -u <(tar -tzf backup-old.tar.gz) <(tar -tzf backup-new.tar.gz)
This command runs tar -tzf (list contents of a gzip-compressed archive) on each archive in a subshell. The <( ) syntax feeds those outputs to diff as if they were files. The result is a unified diff listing differences in file names between backup-old.tar.gz and backup-new.tar.gz – for instance, it will show which files were added or removed. This approach quickly spot-checks changes between archive versions without manual inspection.
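A self-contained sketch (scratch archives under /tmp; run under Bash or Zsh, since <( ) is not available in plain sh):

```shell
# Compare two archives' file lists with process substitution
d=/tmp/tardiff_demo; rm -rf "$d"; mkdir -p "$d/v1" "$d/v2"
touch "$d/v1/a.txt" "$d/v2/a.txt" "$d/v2/b.txt"   # v2 adds b.txt
tar -czf "$d/old.tar.gz" -C "$d/v1" .
tar -czf "$d/new.tar.gz" -C "$d/v2" .

# diff exits non-zero when the lists differ, so || true keeps the script going
diff -u <(tar -tzf "$d/old.tar.gz") <(tar -tzf "$d/new.tar.gz") || true
```

The diff output marks ./b.txt as present only in the new archive, which is exactly the kind of spot check described above.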
Combining split archives: If an archive was split into multiple parts (e.g., by using the split command), you can recombine and extract it on the fly. Suppose a large archive was split into backup.tar.gz.partaa , backup.tar.gz.partab , etc. You can use cat to join the parts and pipe into tar for extraction:
cat backup.tar.gz.part* | tar -xzf - -C ~/Downloads/Recovered
Here, cat concatenates all the part files in order and sends the complete archive stream to tar via the pipe. The tar -xzf - reads from stdin and extracts the contents into the Recovered directory. This saves time and disk space by avoiding the creation of a full recombined archive on disk first.
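The whole split-and-recombine cycle can be exercised with a small sample archive (scratch paths and part sizes chosen for illustration):

```shell
# Split an archive into small parts, then recombine and extract in one pipeline
w=/tmp/split_demo; rm -rf "$w"; mkdir -p "$w/data" "$w/Recovered"
head -c 100000 /dev/urandom > "$w/data/blob.bin"          # ~100 KB of sample data
tar -czf "$w/backup.tar.gz" -C "$w" data

split -b 32k "$w/backup.tar.gz" "$w/backup.tar.gz.part"   # parts: ...partaa, partab, ...
rm "$w/backup.tar.gz"                                     # keep only the parts

cat "$w"/backup.tar.gz.part* | tar -xzf - -C "$w/Recovered"
```

Because the shell expands the part* glob in lexical order, cat reassembles the stream in the same order split produced it.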
tar One-Liners
Tar’s flexibility lends itself to many convenient one-line commands for everyday tasks. Below are a few practical examples:
Listing archive contents: To quickly see what’s inside an archive without extracting, use tar -tf (or tar -ztf for compressed archives). For example:
tar -tvf www_20251112.tar.gz | head -n 5
This will display the first five entries of the archive www_20251112.tar.gz in long format (permissions, owner, size, date, and name). You can omit the v to just list file names, or pipe to grep to search for a specific file. This is useful for verifying the contents of a backup archive before doing a full extract.
Streaming a backup to a file or processor: Tar can be combined with other commands via pipelines. For instance, tar’s output could be piped directly into a compression tool or an encryption program. However, since flags like -z (gzip) are built-in, you often don’t need an external compressor. One common pattern is piping tar to openssl for encryption:
tar -czf - /path/to/important/data | openssl enc -aes-256-cbc -e -out secure_backup.tar.gz.enc
Here tar produces a compressed archive on stdout, which is encrypted by openssl and written to secure_backup.tar.gz.enc . The reverse (decrypting and extracting) can be done similarly by piping openssl decryption into tar -xzf - . This demonstrates how tar streams allow on-the-fly processing of archives.
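The full round trip can be sketched end to end. The passphrase below is a placeholder, and newer OpenSSL releases may warn that -k uses a legacy key derivation (adding -pbkdf2 on both sides silences that), so treat this as a sketch:

```shell
# Encrypt a tar stream with a passphrase, then decrypt and extract (placeholder password)
e=/tmp/enc_demo; rm -rf "$e"; mkdir -p "$e/secret" "$e/out"
echo "classified" > "$e/secret/note.txt"

tar -czf - -C "$e" secret \
  | openssl enc -aes-256-cbc -e -k 'changeme' -out "$e/secure.tar.gz.enc"

# Reverse: decrypt with the same passphrase and stream straight into tar
openssl enc -aes-256-cbc -d -k 'changeme' -in "$e/secure.tar.gz.enc" \
  | tar -xzf - -C "$e/out"
```

At no point does an unencrypted archive touch the disk; only the .enc file is written.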
Remote download and unpack: You can download and extract an archive in one step using curl (available by default on macOS). This is handy for installing tarball releases from the internet without manual steps:
curl -L https://example.com/project.tar.gz | tar -xz -C /opt/projects/
The -L option tells curl to follow redirects, and the pipeline feeds the downloaded archive straight into tar -xz , which extracts it into /opt/projects/ . No temporary file is saved locally; the data is streamed directly into the extraction process. This one-liner is frequently used for quick deployments or retrieving sample projects.
In summary, the tar command is a cornerstone for archiving and backups on the macOS command line. By following best practices (such as using -C for directory isolation and verifying contents with tar -t ) and leveraging shell pipelines, tar can handle everything from simple folder compression to complex backup workflows and remote file transfers, all while maintaining safety and efficiency.
Written on November 12, 2025
Recursive Listing ( ls -R )
The ls -R command lists all files and directories recursively. This means it will display the contents of the current directory and all subdirectories, which is useful for viewing a complete directory structure.
Human-Readable File Sizes ( ls -lh )
The ls -lh command displays file sizes in a human-readable format, showing sizes in kilobytes (KB), megabytes (MB), or gigabytes (GB), as appropriate. This makes it easier to understand file sizes at a glance compared to the default byte-based format.
Advanced Listing ( ls -lart )
The ls -lart command combines several options to provide an advanced view of files:
-l: Displays files in a long format, including permissions, ownership, size, and modification date.
-a: Includes hidden files (those starting with a dot) in the listing.
-r: Reverses the order, showing the oldest files first.
-t: Sorts files by modification time, placing the most recently modified files at the end of the list when used with -r.
Sorting by Size ( ls -lSr )
The ls -lSr command lists files by size in ascending order:
-l: Shows detailed file information, such as file size, permissions, and ownership.
-S: Sorts files by size, starting with the largest.
-r: Reverses the order, showing the smallest files first.
Colorized Output ( ls --color=auto )
The ls --color=auto command adds color to the output, distinguishing files, directories, and symbolic links by color. This visual enhancement simplifies identification of different types of files within the terminal.
rm – Handling File Names with Spaces
When a file name contains spaces, quotes are necessary to ensure the shell interprets it correctly. For example:
rm "file with spaces.txt"
In this case, either single or double quotes can be used to handle the spaces in the file name properly.
An alternative method for handling file names with spaces is to escape each space with a backslash (\):
rm file\ with\ spaces.txt
This approach is particularly effective when dealing with multiple files or file names containing special characters directly in the terminal.
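Both styles can be tried side by side. A minimal sketch using throwaway files in /tmp:

```shell
# Create, then remove, files whose names contain spaces
cd /tmp
touch "file with spaces.txt" "another spaced file.txt"

rm "file with spaces.txt"            # quoted form
rm another\ spaced\ file.txt         # backslash-escaped form
```

Tab completion in Bash and Zsh inserts the backslash-escaped form automatically, which is why it appears so often in interactive use.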
rm – Protecting the Root Directory ( --preserve-root )
The rm -rf --preserve-root command adds an extra safeguard, ensuring the root directory (/) is never deleted. This is crucial to prevent accidental system-wide deletion.
rm – Selective Deletion ( rm !(*.txt) )
The rm !(*.txt) command deletes all files except those matching a specific pattern, such as text files. This requires enabling extglob with the command:
shopt -s extglob
This method provides control over batch deletion while protecting specific file types.
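A runnable sketch of the pattern, using a scratch directory and invoking Bash with -O extglob so the !(...) pattern parses regardless of which shell runs the script:

```shell
# extglob deletion demo: remove everything except .txt files
g=/tmp/extglob_demo; rm -rf "$g"; mkdir -p "$g"; cd "$g"
touch keep.txt notes.txt image.png data.csv

# -O extglob enables the option before the command string is parsed,
# which matters because extglob patterns are rejected at parse time otherwise
bash -O extglob -c 'cd /tmp/extglob_demo && rm -- !(*.txt)'

ls   # only keep.txt and notes.txt remain
```

In an interactive Bash session, running shopt -s extglob once has the same effect for all subsequent commands.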
cp – Enhanced Copying
The cp command offers several enhancements:
cp -u source destination: Only copies files if the source is newer than the destination, making it useful for backup scripts.
cp --parents file /path/to/destination: Copies the file along with its directory structure, preserving the hierarchy.
For more efficient copying, consider using rsync:
rsync -avh source destination
rsync is optimized for large transfers, preserving permissions and utilizing compression.
mv – Moving with Precision
The mv command can be used with additional options:
mv -u source destination: Moves files only if the source is newer or if the destination file does not exist.
find . -name "*.bak" -exec mv {} /backup/ \;: Moves all .bak files from the current directory to /backup/, combining find with mv for complex file operations.
find – Complex Search
find /path -mtime -1: Finds files modified in the last day.
find /path -type f -exec chmod 644 {} \;: Finds files and changes their permissions in bulk.
find /path -name '*.log' -size +10M -delete: Deletes log files larger than 10MB.
awk – Advanced Text Processing
awk '{print $1, $3}' file: Extracts and prints specific columns from a file (e.g., column 1 and column 3).
awk '/pattern/ {print $0}' file: Prints lines matching a pattern.
Combined with process monitoring, the following command lists users and processes consuming more than 50% CPU:
ps aux | awk '$3 > 50 {print $1, $3, $11}'
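The same field-filter idea can be checked against canned input. The sample lines below are made-up ps-style rows (user, PID, %CPU, command), not real processes:

```shell
# Feed ps-like sample lines to awk: keep rows where field 3 (%CPU) exceeds 50
printf 'alice 1234 72.5 /usr/bin/ffmpeg\nbob 5678 3.2 /usr/bin/vim\n' \
  | awk '$3 > 50 {print $1, $3, $4}'
# → alice 72.5 /usr/bin/ffmpeg
```

Against real ps aux output, which has eleven columns, the command field is $11 rather than $4, exactly as in the one-liner above.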
sed – Streamlined Text Editing
sed -i 's/old/new/g' file: Replaces all occurrences of old with new within a file.
sed '/pattern/d' file: Deletes lines matching a pattern.
df and du – Disk Usage Analysis
df -hT: Displays disk usage in a human-readable format with file system type.
du -sh * | sort -rh: Sorts files and directories by size, providing a clear overview of storage usage.
du --max-depth=1 /path: Shows disk usage for directories up to a specific depth, helping to identify space hogs quickly.
ps and top – Process Monitoring
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem: Lists processes sorted by memory usage, helping to monitor resource-intensive tasks.
top -o %MEM: Sorts processes by memory usage, prioritizing resource consumption monitoring.
netstat and ss – Network Monitoring
netstat -tuln: Lists all open ports and their associated services.
ss -tuln: A faster alternative to netstat, providing similar insights into network activity.
xargs – Efficient Command Chaining
find /path -name '*.log' | xargs rm: Finds all log files and passes them to rm for deletion.
cat file.txt | xargs -n 1 echo: Feeds each line of a file into echo one by one, enabling efficient multi-line processing.
alias – Command Shortcuts
Setting an alias in ~/.bashrc or ~/.zshrc helps reduce repetitive typing:
alias proj="cd /home/user/projects"
alias ll="ls -lh"
alias cls="clear"
rsync – Smart Synchronization
rsync -avz --progress source/ destination/: Synchronizes files between two directories with compression and displays progress.
rsync -avz --delete source/ destination/: Deletes files from the destination that do not exist in the source, ideal for mirroring directories.
These advanced commands and techniques offer powerful control over file management, system monitoring, and data transfers in Linux.
(A) grep -r "###" /path/to/designated/folder
This command searches recursively for the string "###" in all files and subdirectories under the specified directory.
-r: The recursive flag, which enables searching through all files and subdirectories within the provided directory.
Limitations:
(B) grep -rl "###" /path/to/designated/folder
This command functions similarly to the first, with the addition of the -l flag, which modifies the output to display only filenames containing the matching text, without showing the matching lines.
-r: Recursively searches through all directories and files.
-l: Prints only the filenames where matches are found, excluding the matching lines from the output.
Advantages:
Limitations:
The search is case-sensitive unless the -i flag is added.
(C) find /path/to/designated/folder -type f | xargs grep -i "###" 1> tmp1 2> tmp2
This command begins by using find to list all files, then pipes the results to grep via xargs, which searches for the specified string "###".
-type f: Restricts the search to regular files only, excluding directories or other file types.
|: The pipe operator, which takes the output of find and passes it as input to the next command (xargs).
xargs: Takes the output from find and uses it as arguments for another command, in this case grep.
1> tmp1: Redirects the standard output (such as matching lines and filenames) to a file named tmp1.
2> tmp2: Redirects error messages (such as permission denied errors) to a file named tmp2.
Advantages:
Case-insensitive searching via the -i flag.
Limitations:
Filenames containing spaces or special characters can break the pipeline unless you use -print0 and xargs -0.
(D) find /path/to/designated/folder -type f -print0 | xargs -0 grep -i "###" 1> tmp1 2> tmp2
This command enhances the previous one by handling filenames that contain spaces or special characters more effectively.
-print0: Instructs find to output filenames terminated by a null character (\0), rather than a newline. This is beneficial for managing filenames that include spaces, special characters, or newlines.
|: The pipe operator, used to pass the output from find to xargs.
xargs -0: Instructs xargs to expect null-terminated input, ensuring compatibility with find's -print0 option and properly handling filenames with special characters.
Advantages:
Case-insensitive searching via the -i flag.
Limitations:
Slightly more complex syntax due to -print0 and xargs -0.
find "$(brew --prefix)" -name [tool_name] -type f
This document explains how to determine the installation path of a command-line tool on macOS. The instructions focus on scenarios involving Homebrew installations, though they can be adapted to any other method of tool installation.
find "$(brew --prefix)" -name [tool_name] -type f Command
This command is a convenient way to locate a specific file—such as a binary for a tool—within the directory structure managed by Homebrew. The example below uses [tool_name] as a placeholder; substitute the actual tool’s name (e.g., ffmpeg, git, or any other executable).
find "$(brew --prefix)" -name [tool_name] -type f
brew --prefix: Returns the base directory where Homebrew is installed. On Apple Silicon (M1/M2) systems, this is commonly /opt/homebrew; on Intel-based Macs, /usr/local; some custom Homebrew setups may differ.
$(...): When the shell processes brew --prefix inside $(...), it replaces that portion of the command with the actual path (e.g., /opt/homebrew), resulting in:
find "/opt/homebrew" -name [tool_name] -type f
find "$(brew --prefix)": Instructs the find utility to begin searching in the Homebrew prefix directory returned by brew --prefix.
-name [tool_name]: Tells find to look for files named exactly [tool_name].
-type f: Restricts the search to regular files, excluding directories, symlinks, or other file types.
Outcome: The command scans Homebrew’s installation tree for files named [tool_name] and helps pinpoint the exact location of the installed binary.
Some tools may not have been installed using Homebrew, or they may be installed in a location not covered by the Homebrew directory structure. In such cases, the following approaches can be used:
Using which or command -v
which [tool_name]
or
command -v [tool_name]
If [tool_name] is found in the PATH, these commands return the absolute path (for example, /usr/local/bin/[tool_name]). If nothing is returned, the tool is not on the system’s PATH.
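A quick illustration of both outcomes (the missing tool name below is a made-up placeholder):

```shell
# command -v prints the absolute path of a tool on the PATH, or nothing at all
command -v ls        # e.g. /bin/ls

# Its exit status makes it handy in scripts
if command -v nonexistent-tool-xyz >/dev/null 2>&1; then
  echo "found"
else
  echo "not on PATH"   # prints "not on PATH"
fi
```

command -v is generally preferred over which in scripts because it is a POSIX shell builtin with a reliable exit status.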
If the exact location remains unknown, it may be necessary to search the entire file system:
sudo find / -name [tool_name] -type f 2>/dev/null
/: Starts the search from the root directory.
2>/dev/null: Redirects error messages (such as permission denials) to /dev/null, creating a cleaner output.
This approach can take significantly longer than searching only the Homebrew prefix because it scans every accessible directory on the system.
find "$(brew --prefix)" -name [tool_name] -type f
Using brew --prefix ensures the command dynamically targets the correct location without manual path specification.
The find command is included by default on macOS (as well as most Linux and Unix-like systems). There is no need to install additional software to run find.
Tools installed outside the Homebrew tree can still be located with find by directing it to a suspected location or by conducting a system-wide search.
Although the examples reference a specific tool (e.g., ffmpeg), the same syntax applies to any file name. Substitute [tool_name] to locate other binaries or resources.
Use one of the two patterns below. Both search recursively from the current directory and safely handle filenames with spaces.
Show matching lines with line numbers (skip binary files):
find . -type f \( -name '*.txt' -o -name '*.html' -o -name '*.py' -o -name '*.js' \) -print0 \
| xargs -0 grep -I -n -i 'lecroy'
List filenames only that contain the keyword:
find . -type f \( -name '*.txt' -o -name '*.html' -o -name '*.py' -o -name '*.js' \) -exec grep -I -i -l 'lecroy' {} +
- -i: case-insensitive; -n: show line numbers; -l: print filenames only.
- -I tells grep to ignore binary files.
- -print0 with xargs -0 safely handles spaces and newlines in filenames.
- Adjust the -name patterns to include other extensions as needed.

Written on February 12, 2025
Managing processes is a fundamental aspect of system administration in both Linux and macOS environments. Understanding how to check, search for, and control processes is essential for maintaining system performance and stability. This guide provides detailed instructions on managing processes, incorporating tools and commands available in both Linux and macOS.
The ps (process status) command provides a snapshot of current processes.
ps aux
- a: Displays processes from all users.
- u: Shows processes with a user-oriented format.
- x: Includes processes without a controlling terminal.

ps -ef
- -e: Selects all processes.
- -f: Displays a full-format listing.

The top command provides a dynamic, real-time view of running processes.
Run top:

top

- M: Sort by memory usage.
- P: Sort by CPU usage.
- q: Quit.

Note: On macOS, top has some differences in options and display.
macOS top command differences:
Press o to change the sort order. For example, to sort by memory:

o mem
Press / to search for a process. Set the update interval with the -s option:

top -s 5
top -n 20
top -u username
htop is an interactive process viewer with a user-friendly interface.
Install htop:
sudo apt-get install htop # For Debian-based systems
brew install htop # For macOS
Run htop:
htop
The pstree command displays processes in a tree format, showing parent-child relationships.
pstree
pgrep searches for processes based on name and other attributes.
pgrep process_name
- -l: Lists the process name alongside the PID.
- -u user_name: Searches for processes owned by a specific user.

Combining ps with grep filters the list of processes to find specific ones.
ps aux | grep process_name
To exclude the grep process itself from the results:
ps aux | grep [p]rocess_name
Searching within top and htop:

- top: Press / and type the process name to search.
- htop: Press F3 and enter the process name.

Processes consuming excessive CPU or memory can degrade system performance.
Identify such processes in top or htop by sorting by CPU or memory usage, then terminate them from within top or htop if necessary. A related concern is processes that are not functioning correctly or have become defunct.
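The state of an individual process can also be read directly with ps's output-format options; the sketch below prints just the state field (the STAT column) for a given PID. This assumes a ps that accepts the state keyword, which both procps (Linux) and the BSD ps on macOS do.

```shell
# Print only the state field for one process.
# "$$" is the current shell's PID; the trailing "=" suppresses the header.
ps -o state= -p $$

# List any zombie processes system-wide (state contains "Z")
ps -eo pid=,state=,comm= | awk '$2 ~ /Z/ { print }'
```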
In ps output, the STAT column indicates the state:

- D: Uninterruptible sleep (usually I/O).
- Z: Zombie (terminated but not reaped by parent).
- T: Stopped.
- R: Running.
- S: Sleeping.

The lsof command lists open files and the processes that opened them.
lsof /path/to/file
lsof -i :port_number
sudo lsof -i :80
sudo lsof -i :22
lsof -p PID
The netstat command displays network-related information.
netstat -tulpn
netstat -anv | grep LISTEN
- -a: Display all sockets.
- -n: Show numerical addresses without resolving hostnames.
- -v: Verbose output.

The kill command sends a signal to a process to terminate it.
kill PID
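The common signals can be combined into a safe shutdown pattern: request a graceful exit first and escalate to a forced kill only if the process does not exit within a grace period. The sketch below assumes this pattern; soft_kill is a hypothetical helper name, not a standard utility.

```shell
# Sketch: send SIGTERM, wait up to $2 seconds, then SIGKILL as a last
# resort. "soft_kill" is a hypothetical helper name.
soft_kill() {
  local pid="$1" grace="${2:-5}"
  kill -TERM "$pid" 2>/dev/null || return 0   # already gone
  for _ in $(seq "$grace"); do
    kill -0 "$pid" 2>/dev/null || return 0    # exited on its own
    sleep 1
  done
  kill -KILL "$pid" 2>/dev/null || true       # force termination
}

# Example: terminate a background process gracefully
sleep 300 &
soft_kill "$!" 1
```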
- SIGTERM (15): Requests a graceful shutdown.
- SIGKILL (9): Forces immediate termination.
- SIGSTOP (19): Stops (pauses) a process.
- SIGCONT (18): Continues a stopped process.

(The numbers for SIGSTOP and SIGCONT above are for Linux; on macOS/BSD they differ, so prefer signal names over numbers.)

The killall command terminates processes by name rather than PID.
killall process_name
- -u user_name: Kills processes owned by a specific user.
- -signal: Sends a specific signal.

Note: On macOS, killall targets processes by their full command name as displayed in ps or top. The command is case-sensitive and requires the exact process name.
pkill is similar to pgrep, but sends signals to the matching processes.
pkill process_name
- -u user_name: Targets processes of a specific user.
- -signal: Specifies the signal to send.

In htop, press F9 or k to open the kill menu and choose a signal (the default is SIGTERM).

vmstat reports virtual memory statistics.
vmstat 2 5
Note: On macOS, use vm_stat (with an underscore).
vm_stat
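vm_stat reports counts of memory pages rather than bytes, so a small awk filter is handy for converting the free-page count into megabytes. This sketch assumes the standard vm_stat header format, which states the page size ("... page size of N bytes"); vmstat_free_mb is a hypothetical helper name.

```shell
# Convert vm_stat's "Pages free" count into megabytes.
# The page size is parsed from the header line vm_stat prints.
vmstat_free_mb() {
  awk '
    /page size of/ { ps = $8 }                       # bytes per page
    /^Pages free:/ { gsub(/\./, "", $3); free = $3 } # strip trailing dot
    END { if (ps && free) printf "%.1f MB free\n", free * ps / 1048576 }
  '
}

# Usage on macOS (guarded so the sketch is portable to Linux):
if command -v vm_stat >/dev/null; then
  vm_stat | vmstat_free_mb
fi
```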
iostat reports CPU and input/output statistics.
iostat 2 5
sar (Linux only) collects, reports, or saves system activity information.
sar -u 2 5
Ensuring the correct process is targeted helps prevent unintended system behavior.
- Prefer SIGTERM: Allows the process to close files and release resources gracefully.
- Use SIGKILL only if necessary: Forcefully terminates the process without cleanup.
- Use pstree to understand parent-child relationships before killing.
- On Linux, logs reside in /var/log/. Use tail for real-time monitoring:

tail -f /var/log/syslog

On macOS, query the unified log instead:
log show --last 1h
Some processes require root privileges to manage. Operate with the least privilege necessary.
Use sudo when needed:

sudo kill PID
By understanding and utilizing these tools and commands, processes can be effectively managed in both Linux and macOS environments, ensuring optimal system performance and stability.
In Unix-like environments and macOS, external drives such as SD cards or USB disks are typically mounted in specific directories, making them accessible from the command line. These drives may be automatically mounted in designated directories, or manual mounting can be employed for greater control.
On macOS, external drives are automatically mounted in the /Volumes directory. Each drive appears as a folder within this directory, named according to the drive’s label, allowing for organized and predictable access.
cd /Volumes
ls
After navigating to /Volumes, using the ls command lists the mounted drives. For instance, if an SD card is labeled "SDCARD," access it directly by specifying the drive’s path:
cd /Volumes/SDCARD
In most Linux distributions, external drives are generally mounted in either /media or /mnt, with specific mounting practices depending on distribution and user configuration:
Drives are usually mounted automatically in /media/username/DRIVENAME, where username represents the logged-in user.
cd /media/username/DRIVENAME
For manual mounting, /mnt is commonly used as a directory for temporary mounts. This process requires the use of the mount command.
sudo mount /dev/sdX1 /mnt
cd /mnt
Replace /dev/sdX1 with the correct device name for the external drive. For example, /dev/sdb1 often denotes the first partition on a USB disk.
For manual mounting, the mount command provides flexibility, allowing access to a variety of filesystems and external storage. The command is structured as follows:
sudo mount -o options device mount_point
- device: The device to mount, e.g., /dev/sdb1.
- mount_point: The target directory, e.g., /mnt or a subdirectory within /media.
- options: For example, ro for read-only access or loop for mounting ISO files.

1. Mounting a USB Drive Manually: To mount a USB drive (e.g., /dev/sdb1) to /mnt/usb, use:
sudo mount /dev/sdb1 /mnt/usb
Before accessing the device, ensure that the /mnt/usb directory exists, creating it if necessary:
sudo mkdir -p /mnt/usb
2. Mounting an ISO File as a Loop Device: ISO files are often mounted as loop devices, making their files accessible without burning them to physical media. This example mounts an ISO file as a read-only loop device using the iso9660 filesystem type:
sudo mount -o loop,ro -t iso9660 /path/to/file.iso /mnt/iso
3. Unmounting a Device: To safely remove a mounted device, unmount it using the umount command:
sudo umount /mnt/usb
Ensuring the device is unmounted before physically disconnecting it helps prevent data loss or corruption.
These practices allow for flexible and efficient management of external drives and ISO files across Unix-like environments, providing consistent access through automatic and manual mounting techniques.
emacs ~/.zsh_history

The following instructions describe the process for accessing the complete shell command history using Emacs. The procedure is outlined in a systematic manner, providing details for both bash and zsh shells.
A shell history file must be identified before proceeding:
- bash: ~/.bash_history
- zsh: ~/.zsh_history

Before opening the history file, it is advisable to ensure that the session’s command history is fully written to the history file.
| Shell | Command | Description |
|---|---|---|
| bash | history -a | Appends the session's recent commands to the history file. |
| zsh | fc -W | Writes the current session's history to the history file. |
Execute the corresponding command in the terminal to update the history file.
Once the history file is updated, Emacs can be used to view and search the command history. Launch Emacs with the appropriate file as follows:
emacs ~/.bash_history
emacs ~/.zsh_history

Opening the file in Emacs allows navigation, search, and editing of the complete command history.
Written on April 1, 2025
Purpose. Provides a detailed overview of files and directories within the current working directory, presenting sizes in readable units (KB, MB, GB) and showing metadata such as permissions, owner, and modification date.
Syntax. ls -lh
Example output.
-rw-r--r-- 1 user staff 12M Oct 12 14:32 sample.mov
drwxr-xr-x 5 user staff 160B Oct 12 13:10 project/
Usage notes. This command does not traverse subdirectories. To include hidden files, use ls -alh.
Purpose. Identifies the most space-consuming files across the current directory and all nested subdirectories, supporting targeted cleanup and optimization efforts.
Syntax. find . -type f -exec du -h {} + | sort -hr | head -n 5
Example output.
413M ./20251005_RickshawRide02.mov
128M ./20251004_keyboard.mov
61M ./20251003_nexus02.MOV
24M ./20251005_yufuin05.MOV
24M ./20251002_walk01.png
Usage notes. The pipeline performs three key tasks: du -h calculates human-readable sizes, sort -hr orders results from largest to smallest, and head -n 5 limits the output to the top five results.
Purpose. Generates a concise summary of disk usage for each immediate child directory, enabling quick identification of space distribution.
Syntax. du -sh */
Example output.
2.0M _dev/
193M attic/
23M driver/
94M media/
6.5M secure/
929M src/
5.4G travel/
Usage notes. The -s flag produces a total summary, -h formats sizes in human-readable units, and the */ glob limits the scope to directories one level below the current working path.
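To rank the subdirectories from largest to smallest, the same summary can be piped through sort; the -h flag of sort understands the human-readable suffixes that du -h emits (available in GNU coreutils and modern BSD sort):

```shell
# Summarize each immediate subdirectory, largest first.
# stderr is discarded so the pipeline succeeds even with no subdirectories.
du -sh -- */ 2>/dev/null | sort -hr
```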
Interpretation nuance. The “total” value shown by ls counts 512-byte blocks, whereas du -h reports sizes using binary units and rounds to one decimal. Minor discrepancies are normal and expected.
Written on October 12, 2025
This guide provides detailed instructions for downloading Debian ISO files using Jigdo on a macOS system. The steps are organized to ensure clarity and efficiency, addressing potential challenges that may arise during the process.
Homebrew serves as a package manager for macOS, facilitating the installation of various software packages, including Jigdo.
brew install jigdo
Execute this command in the Terminal to install Jigdo.
brew list jigdo
Confirm that Jigdo has been installed correctly. Typical output includes executable files located in /opt/homebrew/Cellar/jigdo/0.8.2/bin/, such as:
- jigdo-file
- jigdo-lite
- jigdo-mirror

To utilize Jigdo commands seamlessly, ensure that Homebrew’s binary directory is included in the system’s PATH.
echo $PATH
Verify if /opt/homebrew/bin is part of the PATH. If not present, proceed to update the PATH.
Add Homebrew’s binary directory to the PATH by editing the shell configuration file (~/.zshrc for Zsh or ~/.bashrc for Bash).
For Zsh users:
echo 'export PATH="/opt/homebrew/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
For Bash users:
echo 'export PATH="/opt/homebrew/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
which jigdo-lite
A valid path, such as /opt/homebrew/bin/jigdo-lite, indicates successful configuration.
If jigdo-lite is still not found, manual linking may be required. Execute the following commands:
brew unlink jigdo && brew link jigdo
which jigdo-lite
To reconstruct Debian ISO files, both .jigdo and .template files are required. These files provide the necessary information and structure for the ISO assembly.
Navigate to the Debian Jigdo DVD Images. Download the corresponding .jigdo and .template files and save them in ~/Desktop/Debian/.
With the necessary Jigdo files in place, proceed to download and assemble the Debian ISO.
cd ~/Desktop/Debian/
jigdo-lite debian-12.7.0-i386-DVD-1.jigdo
For scenarios involving multiple Jigdo and template files (e.g., 21 ISOs), scripting can automate the download process. Below are two versions of the Zsh script, each with different approaches for automating responses to Jigdo prompts.
For Zsh (using yes for continuous "Enter" presses):
#!/bin/zsh
# Directory containing all .jigdo and .template files
JIGDO_DIR=~/Desktop/Debian/
# Navigate to the Jigdo directory
cd "$JIGDO_DIR" || exit
# Loop through all .jigdo files and execute jigdo-lite, with continuous "Enter" key presses
for jigdo_file in *.jigdo; do
echo "Processing $jigdo_file..."
# Use 'yes' to simulate continuous "Enter" presses for each file
yes '' | jigdo-lite "$jigdo_file"
done
This version of the script employs the yes command to repeatedly send an empty string, simulating continuous "Enter" presses until all Jigdo prompts are satisfied. This method is useful if the number of prompts varies or if additional confirmations are required during the download process.
For Zsh (using printf for exactly two "Enter" presses):
#!/bin/zsh
# Directory containing all .jigdo and .template files
JIGDO_DIR=~/Desktop/Debian/
# Navigate to the Jigdo directory
cd "$JIGDO_DIR" || exit
# Loop through all .jigdo files and execute jigdo-lite, with exactly two "Enter" key presses
for jigdo_file in *.jigdo; do
echo "Processing $jigdo_file..."
# Use 'printf' to simulate pressing "Enter" twice for each file
printf '\n\n' | jigdo-lite "$jigdo_file"
done
This version of the script uses printf to send exactly two newline characters, simulating two "Enter" key presses. It is beneficial when only two prompts are expected, as it avoids continuous input and provides controlled interaction with the Jigdo process.
For Bash (using yes for continuous "Enter" presses):

#!/bin/bash
# Directory containing all .jigdo and .template files
JIGDO_DIR=~/Desktop/Debian/
# Navigate to the Jigdo directory
cd "$JIGDO_DIR" || exit
# Loop through all .jigdo files and execute jigdo-lite, with continuous "Enter" key presses
for jigdo_file in *.jigdo; do
echo "Processing $jigdo_file..."
# Use 'yes' to simulate continuous "Enter" presses for each file
yes '' | jigdo-lite "$jigdo_file"
done
For Bash (using printf for exactly two "Enter" presses):

#!/bin/bash
# Directory containing all .jigdo and .template files
JIGDO_DIR=~/Desktop/Debian/
# Navigate to the Jigdo directory
cd "$JIGDO_DIR" || exit
# Loop through all .jigdo files and execute jigdo-lite, with exactly two "Enter" key presses
for jigdo_file in *.jigdo; do
echo "Processing $jigdo_file..."
# Use 'printf' to simulate pressing "Enter" twice for each file
printf '\n\n' | jigdo-lite "$jigdo_file"
done
Choose the script version based on the expected interaction with Jigdo prompts. The yes command version is suitable for continuous responses, while the printf version provides a precise number of responses.
chmod +x ~/Desktop/download_isos.zsh
~/Desktop/download_isos.zsh
Replace with download_isos.sh for Bash.
To facilitate Debian installation from an SD card, the ISO file must be properly written to the SD card. This guide provides a brief overview for Windows users utilizing Rufus, followed by detailed steps for macOS users using built-in tools.
Download Rufus from the official website. Rufus is a straightforward tool for creating bootable media from ISO files on Windows.
This section provides detailed steps to create a bootable Debian SD card on macOS without third-party software.
Insert the SD card and open Terminal. Use the following command to display all drives and identify the SD card by its size:
diskutil list
Note the identifier for the SD card (such as /dev/disk9).
Unmount the SD card with the command:
diskutil unmountDisk /dev/disk9
Next, erase and format the SD card:
sudo diskutil eraseDisk FAT32 BOOT MBRFormat /dev/disk9
The command eraseDisk initiates the process of removing all existing data on the SD card. The FAT32 parameter specifies the filesystem to be used, ensuring compatibility across various operating systems. The label BOOT names the new partition, and MBRFormat sets the partition scheme to Master Boot Record, which is suitable for booting purposes.
/dev/disk0 (internal, physical):
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *2.0 TB disk0
1: Apple_APFS_ISC Container disk1 524.3 MB disk0s1
2: Apple_APFS Container disk3 2.0 TB disk0s2
3: Apple_APFS_Recovery Container disk2 5.4 GB disk0s3
/dev/disk3 (synthesized):
#: TYPE NAME SIZE IDENTIFIER
0: APFS Container Scheme - +2.0 TB disk3
Physical Store disk0s2
1: APFS Volume Macintosh HD - Data 807.5 GB disk3s1
2: APFS Volume Macintosh HD 10.8 GB disk3s3
3: APFS Snapshot com.apple.os.update-... 10.8 GB disk3s3s1
4: APFS Volume Preboot 12.3 GB disk3s4
5: APFS Volume Recovery 1.9 GB disk3s5
6: APFS Volume VM 2.1 GB disk3s6
/dev/disk4 (disk image):
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme +10.4 GB disk4
1: Apple_APFS Container disk5 10.4 GB disk4s1
/dev/disk5 (synthesized):
#: TYPE NAME SIZE IDENTIFIER
0: APFS Container Scheme - +10.4 GB disk5
Physical Store disk4s1
1: APFS Volume watchOS 10.5 21T575 ... 10.1 GB disk5s1
/dev/disk6 (disk image):
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme +17.6 GB disk6
1: Apple_APFS Container disk7 17.6 GB disk6s1
/dev/disk7 (synthesized):
#: TYPE NAME SIZE IDENTIFIER
0: APFS Container Scheme - +17.6 GB disk7
Physical Store disk6s1
1: APFS Volume iOS 17.5 21F79 Simul... 17.0 GB disk7s1
/dev/disk9 (internal, physical):
#: TYPE NAME SIZE IDENTIFIER
0: FDisk_partition_scheme *32.0 GB disk9
1: Windows_FAT_32 bootfs 268.4 MB disk9s1
2: Linux 31.7 GB disk9s2
The output of the diskutil list command provides a detailed view of the storage devices connected to the MacBook. Below is an interpretation of each entry:
This device, with a capacity of 2.0 TB, represents an internal and physical disk, meaning it is permanently installed within the MacBook rather than being a removable or virtual drive. The disk utilizes the GUID Partition Scheme and contains the following partitions:
This entry signifies a synthesized APFS container generated by macOS, encompassing various volumes related to /dev/disk0. It includes multiple essential system partitions such as:
These entries correspond to disk images, likely representing mounted virtual drives or other macOS system images:
- disk4 and disk6 represent GUID partition schemes for disk images.
- disk5 and disk7 are synthesized from disk4 and disk6, respectively, containing APFS volumes for watchOS and iOS simulators.

This 32.0 GB device is marked as both "internal" and "physical," indicating a physically removable medium recognized as part of the MacBook's internal hardware interface, such as an SD card slot. It employs an FDisk partition scheme, commonly associated with devices formatted for compatibility across various operating systems. The two partitions present are:
- A Windows_FAT_32 partition, typical for broader compatibility with Windows systems, often used in boot setups.
- A Linux partition, suggesting prior use for a Linux-based operating system or data storage.

The identifying characteristics of /dev/disk9—a size of 32.0 GB, an FDisk partition scheme, and a removable nature—indicate this is the SD card. Such removable drives are recognized as "internal, physical" due to their connection through a built-in card reader or slot, contrasting with virtual or purely internal SSDs and HDDs that are non-removable.
macOS requires the ISO to be in .img format for the dd utility. Convert the Debian ISO by running:
hdiutil convert -format UDRW -o ~/Desktop/debian.img ~/Desktop/debian-12.7.0-arm64-DVD-1.iso
The hdiutil command-line tool is utilized for working with disk images in macOS. The UDRW format specifies an uncompressed read/write image, which is necessary for the subsequent dd operation. The output location is set to ~/Desktop/debian.img, and the source file is ~/Desktop/debian-12.7.0-arm64-DVD-1.iso.
If macOS appends .dmg to the output file (resulting in debian.img.dmg), this extension remains acceptable for the next step.
Use dd to transfer the IMG file to the SD card. The dd utility performs a low-level copy of data from one location to another. The parameters used are:
- if=~/Desktop/debian.img.dmg: Specifies the input file.
- of=/dev/disk9: Specifies the output file (the SD card).
- bs=1m: Sets the block size to 1 megabyte, optimizing the copy process for speed.

The command to execute is:
sudo dd if=~/Desktop/debian.img.dmg of=/dev/disk9 bs=1m
Prior to executing this command, it is necessary to unmount the SD card using diskutil unmountDisk /dev/disk9. This ensures that no other processes are accessing the disk, preventing potential data corruption during the write operation.
Be cautious when using dd, as it can overwrite any specified drive without warning. This process can take a few minutes; no progress is shown by default (on macOS, pressing Ctrl+T sends SIGINFO to dd and prints its current status).
Once the process is complete, safely eject the SD card with:
diskutil eject /dev/disk9
The SD card is now prepared as a bootable Debian installation medium.
Configuring Debian to utilize local ISO files as repositories enhances package management efficiency, particularly in environments with limited or unreliable internet connectivity. This guide outlines the process of mounting multiple ISO files, updating the package manager’s sources list to prioritize local repositories, and automating the mounting process for sustained convenience.
A fundamental step involves creating directories designated for mounting each ISO file. Organizing these directories under /media maintains system orderliness.
sudo mkdir -p /media/debian-iso{1..21}
- mkdir -p: Creates the specified directories along with any necessary parent directories.
- /media/debian-iso{1..21}: Generates directories named /media/debian-iso1 through /media/debian-iso21.

Automating the mounting process ensures efficiency when handling multiple ISO files. A Bash script is employed to mount each ISO to its corresponding directory as a loop device.
emacs ~/mount_debian_isos.sh -nw
#!/bin/bash
# Base directory where ISO files are located
ISO_DIR=/home/frank/Downloads
MOUNT_DIR=/media
# Loop through all 21 ISOs and mount them
for i in {1..21}; do
ISO_FILE="$ISO_DIR/debian-12.7.0-arm64-DVD-$i.iso"
MOUNT_POINT="$MOUNT_DIR/debian-iso$i"
if [ -f "$ISO_FILE" ]; then
echo "Mounting $ISO_FILE to $MOUNT_POINT..."
sudo mount -o loop,ro "$ISO_FILE" "$MOUNT_POINT"
else
echo "Warning: $ISO_FILE does not exist."
fi
done
- Shebang (#!/bin/bash): Specifies that the script should be executed in the Bash shell.
- ISO_DIR: Directory containing the ISO files. Modify this path if the ISOs are stored elsewhere.
- MOUNT_DIR: Base directory for mounting the ISOs.
- loop: Mounts the ISO as a loop device.
- ro: Mounts the ISO as read-only to prevent modifications.

Make the script executable and run it:

chmod +x ~/mount_debian_isos.sh
~/mount_debian_isos.sh
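A matching unmount script reverses the process. Checking the current mount table first makes the loop safe to re-run; the sketch below follows the same layout as the mounting script and simply skips mount points that are not active.

```shell
#!/bin/bash
# Unmount all 21 ISO mount points, skipping any that are not mounted
MOUNT_DIR=/media

for i in $(seq 1 21); do
  MOUNT_POINT="$MOUNT_DIR/debian-iso$i"
  # The trailing/leading spaces in the pattern avoid matching
  # debian-iso1 against debian-iso10..19
  if mount | grep -q " $MOUNT_POINT "; then
    echo "Unmounting $MOUNT_POINT..."
    sudo umount "$MOUNT_POINT"
  else
    echo "Skipping $MOUNT_POINT (not mounted)."
  fi
done
```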
The mount command is versatile and used across various scenarios, from mounting ISO files to accessing network drives and USB devices. Below are several common examples of the mount command, demonstrating frequently used options and configurations:
1. Mount an ISO File as a Loop Device: This example mounts an ISO file as a read-only loop device using the ISO 9660 filesystem type.
sudo mount -t iso9660 -o loop /home/frank/Downloads/debian-9.5.0-amd64-DVD-1.iso /media/d1
2. Mount a USB Drive Automatically: Linux often automatically recognizes and mounts USB drives to /media/username/DRIVENAME. However, manual mounting is also possible:
sudo mount /dev/sdb1 /mnt/usb
- Identify the device name with lsblk or fdisk -l.
- Ensure the mount point exists: sudo mkdir -p /mnt/usb.

3. Mount a Windows NTFS Drive: For dual-boot systems, accessing Windows partitions from Linux may require specifying the NTFS filesystem type.
sudo mount -t ntfs-3g /dev/sda1 /mnt/windows
4. Mount a Network Share (NFS): Network File System (NFS) is widely used for accessing remote file systems across a network.
sudo mount -t nfs 192.168.1.100:/shared-folder /mnt/nfs
5. Mount a CIFS (Windows/Samba) Network Share: CIFS (Common Internet File System) is a network protocol that allows access to shared folders from Windows or Samba servers.
sudo mount -t cifs -o username=frank,password=yourpassword //192.168.1.101/shared-folder /mnt/cifs
6. Mount a Disk Partition as Read-Only: For forensic or data recovery purposes, mounting a partition in read-only mode prevents any accidental modifications.
sudo mount -o ro /dev/sdc1 /mnt/readonly
7. Mount a Bind Directory (Make One Directory Accessible at Another Path): The bind option allows one directory to be mounted at another path, effectively mirroring its contents.
sudo mount --bind /var/www/html /mnt/website
After this, the contents of /var/www/html will be accessible at /mnt/website.

Configuring the APT package manager to prioritize local ISO repositories while retaining the ability to access online sources involves editing the sources.list file. The order of entries dictates the priority, with earlier entries being preferred.
Open sources.list:

sudo emacs /etc/apt/sources.list
Append the following lines to the end of the file. Ensure that bookworm is the correct codename for the Debian release in use. Adjust accordingly if a different release is active.
# Local Debian ISO repositories
deb [trusted=yes] file:/media/debian-iso1/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso2/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso3/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso4/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso5/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso6/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso7/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso8/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso9/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso10/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso11/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso12/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso13/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso14/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso15/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso16/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso17/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso18/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso19/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso20/ bookworm main contrib
deb [trusted=yes] file:/media/debian-iso21/ bookworm main contrib
# Online Debian repositories
deb http://deb.debian.org/debian bookworm main contrib
deb-src http://deb.debian.org/debian bookworm main contrib
deb http://deb.debian.org/debian-security bookworm-security main contrib
deb-src http://deb.debian.org/debian-security bookworm-security main contrib
deb http://deb.debian.org/debian bookworm-updates main contrib
deb-src http://deb.debian.org/debian bookworm-updates main contrib
- deb [trusted=yes] file:/media/debian-isoX/ bookworm main contrib: Specifies each local ISO as a trusted repository. The trusted=yes option bypasses signature verification; ensure ISOs are obtained from official sources to maintain security.
- Both binary (deb) and source (deb-src) repositories are included as needed.

Refreshing the APT package database allows the system to recognize the newly configured repositories.
sudo apt update
This command updates the package index, enabling APT to acknowledge packages available from both local ISO repositories and online sources.
Packages can now be installed using APT, with the system prioritizing local ISO repositories before consulting online sources.
sudo apt install <package_name>
Replace <package_name> with the desired package to install.
For example, to install kernel development prerequisites on a standard Debian system:

apt-get install build-essential linux-headers-$(uname -r)
# Raspberry Pi
apt-get install build-essential gcc raspberrypi-kernel-headers raspberrypi-kernel
apt-get install build-essential emacs raspberrypi-kernel-headers git bc bison flex libc6-dev libncurses5-dev make
sudo apt install crossbuild-essential-armhf
To ensure that all ISO files are mounted automatically upon system boot, entries can be added to the /etc/fstab file. This guarantees that the local repositories are available without manual intervention each time the system starts.
Open /etc/fstab:

sudo emacs /etc/fstab
Append the following lines to the end of the file. Note that fstab does not expand ~, so absolute paths (e.g., /home/frank/Downloads) must be used.
# Mount Debian ISO repositories
/home/frank/Downloads/debian-12.7.0-i386-DVD-1.iso /media/debian-iso1 iso9660 loop,ro 0 0
/home/frank/Downloads/debian-12.7.0-i386-DVD-2.iso /media/debian-iso2 iso9660 loop,ro 0 0
# ...
/home/frank/Downloads/debian-12.7.0-i386-DVD-21.iso /media/debian-iso21 iso9660 loop,ro 0 0
- Filesystem type (iso9660): Standard filesystem for ISO images.
- loop: Mounts the file as a loop device.
- ro: Mounts the ISO as read-only.
- Dump and pass fields (0 0): Commonly set to 0 0 for ISO mounts, indicating no dump and no filesystem check.

Apply the entries without rebooting:

sudo mount -a
This command mounts all filesystems specified in /etc/fstab without necessitating a system reboot.
Before diving into device and kernel programming, it is crucial to understand the core system information. The following commands offer insights into the Linux system's hardware, kernel, and architecture.
The hostnamectl command provides a basic overview of the system, including the hostname, operating system, and kernel version.
# System Information
hostnamectl
The uname command displays kernel and architecture information.
# Kernel Version
uname -r
# Machine Architecture
uname -m
- uname -r: Displays the kernel version, which is essential for module development or custom kernel builds.
- uname -m: Reveals the machine architecture (e.g., x86_64 or arm64), ensuring compatibility with drivers and modules.

To check the Linux distribution and version details:
# Linux Distribution and Version
lsb_release -a
Detailed CPU information can be obtained using lscpu or by inspecting /proc/cpuinfo.
# Human-readable CPU Information
lscpu
# Detailed CPU Information
cat /proc/cpuinfo
- lscpu: Provides CPU architecture, cores, threads, and cache sizes in a human-readable format.
- cat /proc/cpuinfo: Offers detailed CPU information, useful for performance benchmarking when developing CPU-bound kernel modules.

Linux exposes detailed information about devices and their drivers through the /sys and /proc filesystems, as well as utility commands.
To check the drivers associated with devices and view system buses:
# List PCI Devices with Kernel Modules
lspci -k
# View Input Devices
cat /proc/bus/input/devices
- lspci -k: Lists PCI devices along with their associated kernel modules.
- cat /proc/bus/input/devices: Displays information about input devices connected to the system.

The /sys filesystem provides a way to interact with kernel objects.
For example, to check the status of a network interface:
# Network Interface Status
cat /sys/class/net/eth0/operstate
To list available filesystems supported by the kernel:
# Available Filesystems
cat /proc/filesystems
Understanding the hardware landscape and relationships between different buses and devices is essential for advanced device programming.
The lspci command is used to inspect PCI devices:
# PCI Devices in Tree View
lspci -tv
# PCI Devices with Kernel Modules
lspci -k
# Verbose PCI Device Information
lspci -vv
# Filter PCI Devices (e.g., USB Controllers)
lspci -v | grep USB
- lspci -tv: Shows a hierarchical tree of PCI devices and their connections.
- lspci -k: Links kernel modules to devices, verifying which drivers are in use.
- lspci -vv: Provides detailed device information, such as capabilities, power management, and interrupt settings.
- lspci -v | grep USB: Filters specific device classes, like USB controllers.

For USB device driver development:
# USB Devices in Tree View
lsusb -tv
# Verbose USB Device Information
lsusb -v
# Inspect Specific USB Device
lsusb -d <vendor_id:product_id>
lsusb -tv: Provides a tree view of connected USB devices.
lsusb -v: Gives verbose information about each USB device, including descriptors, vendor IDs, and product IDs.
lsusb -d <vendor_id:product_id>: Inspects a specific USB device.
For debugging or interacting with USB devices programmatically, tools like usbmon and Wireshark can capture USB traffic.
To view block devices such as disks and partitions:
# List Block Devices
lsblk
# List Block Devices with Filesystem Information
lsblk -f
lsblk: Provides a clear hierarchical structure of block devices.
lsblk -f: Adds filesystem and UUID information, useful for understanding mounted devices and their properties.
For low-level inspection of SCSI devices:
# List SCSI Devices
lsscsi
lsscsi: Lists all SCSI devices connected to the system.
Tracking interrupts and I/O performance is vital for optimizing system interactions with hardware.
To view the number of interrupts per CPU for each I/O device:
# Interrupts per Device
cat /proc/interrupts
This is useful when optimizing interrupt handling or diagnosing hardware IRQ conflicts.
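Each row of /proc/interrupts carries one counter per CPU; summing those columns gives a per-source total, which makes the busiest interrupt sources easy to spot. A sketch that sums only the numeric fields of each row:

```shell
# Total interrupt count per source, busiest first.
top_irqs=$(awk 'NR > 1 {
    total = 0
    for (i = 2; i <= NF; i++) {
        if ($i + 0 == $i) total += $i   # stop at the first non-numeric field
        else break
    }
    printf "%-10s %15d\n", $1, total
}' /proc/interrupts | sort -k2 -rn | head -5)
echo "$top_irqs"
```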
The iostat and iotop commands help monitor I/O device performance:
# Extended I/O Statistics (updates every second)
iostat -x 1
# Block Device Statistics
iostat -d 1
# Monitor I/O Usage by Process
iotop
iostat -x: Provides extended statistics like utilization, throughput, and wait time.
iostat -d: Focuses on block devices, updating statistics every second.
iotop: Displays real-time I/O usage by process, useful for identifying processes performing heavy I/O operations.
Memory management and caching are critical when programming kernel modules or drivers.
To view detailed memory statistics:
# Memory Information
cat /proc/meminfo
This provides information such as total memory, free memory, available swap, and cached memory.
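These counters are plain text and straightforward to post-process; for example, MemTotal and MemAvailable (the latter present since kernel 3.14) give a quick availability figure. A sketch:

```shell
# Percentage of RAM the kernel estimates is available for new workloads.
mem_report=$(awk '/^MemTotal:/ { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END { printf "%.1f%% of %d kB available", 100 * avail / total, total }' /proc/meminfo)
echo "$mem_report"
```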
To list memory cache information:
# Cache Information
sudo lshw -C memory
lshw -C memory: Displays information about the system's memory hierarchy, including caches.
Delving into specific hardware capabilities is facilitated by tools for inspecting buses, devices, and subsystems.
The lshw command provides detailed information about the hardware configuration.
# Hardware Path Tree View
sudo lshw -short
# Class-specific Information
sudo lshw -short -C bus
sudo lshw -short -C cpu
sudo lshw -short -C storage
# Bus Information
sudo lshw -businfo
lshw -short: Provides a concise list of hardware.
lshw -short -C [class]: Focuses on specific classes like bus, cpu, or storage.
lshw -businfo: Shows how devices are connected to various buses.
Device-specific information can be extracted using hwinfo and other utilities.
To list all supported filesystems in the kernel:
# Supported Filesystems
cat /proc/filesystems
This is useful when developing storage drivers or working with filesystems.
The hwinfo command provides detailed information about hardware components.
# Concise Hardware Summary
sudo hwinfo --short
# Detailed USB Information
sudo hwinfo --usb
hwinfo --short: Gives a concise summary of all detected devices.
hwinfo --usb: Provides detailed information about USB devices, including vendor and product IDs.
For storage systems or SCSI devices, in-depth tools are necessary to inspect and configure devices.
To show block devices in reverse dependency order (which lower-level devices a given device sits on):
# Block Device Hierarchy (reverse)
lsblk -s
lsblk -s: Displays block devices in reverse dependency order, showing the relationships between devices.
To obtain detailed SMART information for storage devices:
# SMART Information
sudo smartctl -a /dev/sda
smartctl: Provides disk health, performance data, and potential failure indicators.
Real-time monitoring of device and kernel activity is essential for performance tuning and debugging.
The top or htop commands can be used to monitor processes and system load:
# Real-time Process Monitoring
top
# Enhanced Process Monitoring
htop
top: Displays system processes and resource usage.
htop: An interactive process viewer with more details and a user-friendly interface.
To monitor block device I/O:
# Block Device I/O Statistics
iostat -d
iostat -d: Shows I/O statistics for block devices.
To monitor network interfaces:
# Real-time Network Interface Monitoring
iftop
iftop: Displays bandwidth usage on network interfaces.
Tools like perf are used to monitor kernel and application performance, helping identify bottlenecks.
To monitor CPU performance, system calls, and events:
# Live CPU Performance Monitoring
sudo perf top
To record a performance profile:
# Record Performance Data
sudo perf record -a
To display the recorded performance data:
# Report Performance Data
sudo perf report
perf top: Provides a real-time view of system performance.
perf record: Collects performance data over time.
perf report: Analyzes and displays the collected data.
Kernel modules extend system functionality without requiring a reboot and are commonly used in device driver development.
To list all loaded kernel modules:
# List Loaded Kernel Modules
lsmod
To load a kernel module manually:
# Load a Kernel Module
sudo modprobe <module_name>
To unload a kernel module:
# Unload a Kernel Module
sudo modprobe -r <module_name>
To get detailed information about a specific module:
# Module Information
modinfo <module_name>
This provides information such as module parameters, dependencies, and author.
Kernel headers are required when building or debugging modules:
# Install Kernel Headers
sudo apt-get install linux-headers-$(uname -r)
Kernel logs are essential for tracking system errors and debugging device driver issues.
To view real-time kernel logs:
# Real-time Kernel Logs
dmesg -w
Access kernel logs through the systemd journal:
# Kernel Logs via systemd journal
journalctl -k
To filter logs and focus on specific messages:
# Filter Kernel Logs
dmesg | grep <keyword>
Replace <keyword> with the module name or any relevant term to isolate specific log messages.
Monitoring how kernel modules interact with hardware is crucial for device driver development.
To locate messages related to a specific kernel module:
# Module-related Kernel Messages
dmesg | grep <module_name>
This assists in debugging issues like module loading or initialization.
Kernel module development often requires building, inserting, and testing custom modules.
To compile the kernel module for the currently running kernel:
# Compile Kernel Module
make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
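This invocation expects a small out-of-tree Makefile next to the module source. A minimal sketch, where mymodule.c is a hypothetical source file name (note that make requires a literal tab before each recipe line):

```make
# Minimal out-of-tree kernel module Makefile (mymodule.c is a placeholder)
obj-m := mymodule.o

all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
```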
To insert the compiled module into the running kernel:
# Insert Kernel Module
sudo insmod mymodule.ko
To remove the kernel module:
# Remove Kernel Module
sudo rmmod mymodule
Note: Use the module name without the .ko extension when removing.
For kernel module developers, debugfs and ftrace are invaluable for exposing internal kernel data and tracing function calls.
To mount the debugfs filesystem:
# Mount debugfs
sudo mount -t debugfs none /sys/kernel/debug
To trace function calls or events in the kernel:
# Set the Current Tracer to 'function'
echo function | sudo tee /sys/kernel/debug/tracing/current_tracer
To view the trace output:
# View Tracing Output
cat /sys/kernel/debug/tracing/trace
This allows tracing of function calls, which is invaluable for kernel debugging.
Kernel parameters can be inspected and tweaked using the /proc/sys directory or via the sysctl command.
To view all kernel parameters:
# View All Kernel Parameters
sysctl -a
To modify a kernel parameter (e.g., increasing the maximum number of open files):
# Increase Maximum Open Files
sudo sysctl -w fs.file-max=100000
To make the change permanent, add the parameter to /etc/sysctl.conf.
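For typical keys, sysctl names map directly onto /proc/sys paths: each dot becomes a slash. Reading the file and querying sysctl are therefore equivalent, as this sketch shows:

```shell
# The sysctl key fs.file-max corresponds to the file /proc/sys/fs/file-max.
key="fs.file-max"
path="/proc/sys/$(echo "$key" | tr '.' '/')"
value=$(cat "$path")
echo "$key = $value"
```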
Building custom kernels or kernel modules is sometimes necessary when developing for the Linux kernel.
To configure and build a custom kernel:
# Configure Kernel Options
make menuconfig
# Compile the Kernel
make
# Install the Kernel and Modules
sudo make modules_install install
For building and inserting custom kernel modules:
# Compile Kernel Module
make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
# Insert Kernel Module
sudo insmod mymodule.ko
# Remove Kernel Module
sudo rmmod mymodule
Linux kernel programming on the Raspberry Pi presents unique challenges and considerations compared to traditional desktop or server environments. The Raspberry Pi, being a single-board computer based on the ARM architecture, requires specific approaches for kernel development, module compilation, and device driver integration. This guide explores the differences, methodologies, and best practices for effective kernel programming on the Raspberry Pi.
To build modules on the Raspberry Pi itself, install the matching kernel headers:
# Install Raspberry Pi Kernel Headers
sudo apt-get install raspberrypi-kernel-headers
Also install the standard build tools: build-essential, gcc, make, and git.
To cross-compile from another machine, install an ARM cross-compiler:
# Install ARM Cross-Compiler
sudo apt-get install gcc-arm-linux-gnueabihf
Cross-compilation is controlled by a few environment variables:
ARCH=arm
CROSS_COMPILE=arm-linux-gnueabihf-
KERNELDIR=/path/to/kernel/sources
Build the module by invoking the make command with the appropriate cross-compilation flags:
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf-
Copy the resulting .ko module file to the Raspberry Pi and load it with insmod or modprobe:
sudo insmod module_name.ko
To unload it, use rmmod:
sudo rmmod module_name
Verify the module is loaded with lsmod and check dmesg for logs.
To build a complete kernel, clone the Raspberry Pi kernel sources:
git clone --depth=1 https://github.com/raspberrypi/linux
cd linux
For a 32-bit Raspberry Pi 2 or 3 build, set the kernel image name and generate the default configuration:
KERNEL=kernel7
make bcm2709_defconfig
Optionally adjust the configuration with menuconfig or xconfig:
make menuconfig
Then build the kernel image, modules, and device tree blobs (add ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- when cross-compiling):
make -j4 zImage modules dtbs
Finally, install the modules and copy the kernel image, device tree blobs, and overlays to the boot partition:
sudo make modules_install
sudo cp arch/arm/boot/zImage /boot/kernel7.img
sudo cp arch/arm/boot/dts/*.dtb /boot/
sudo cp arch/arm/boot/dts/overlays/*.dtb* /boot/overlays/
sudo cp arch/arm/boot/dts/overlays/README /boot/overlays/