Knowing Linux/Unix commands for manipulating and filtering content in a file can save time, increase efficiency, and provide a flexible and reproducible way to work with file data.
They are useful for several reasons:
- Efficiency: When working with large files, it can be time-consuming to manually edit or search through the file for specific content. Unix commands provide efficient ways to automate these tasks.
- Reproducibility: Using Unix commands to manipulate and filter content in a file creates a clear and repeatable process that can be easily applied to similar files in the future.
- Flexibility: Unix commands can be combined in many ways to achieve different goals, making them a flexible tool for a wide range of tasks.
- Compatibility: Most of these commands are available on Unix-like systems, including Linux and macOS, so the knowledge carries over between systems. A few, such as tac and shuf, come from GNU coreutils and may need to be installed separately on macOS.
Let’s begin:
cat file.txt | grep '^-' > file2.txt
This command uses cat and grep to filter content from a file and save the result to a new file. Here’s what each part of the command does:
- cat file.txt: Writes the contents of file.txt to standard output, which is the terminal unless the output is piped or redirected.
- |: The pipe symbol sends the output of the cat command to the next command (grep in this case) as input.
- grep '^-': Uses grep to keep only the lines that start with a hyphen (-). The ^ symbol anchors the match to the beginning of the line, and the hyphen that follows is matched literally. Avoid writing the pattern as '^-*': the * means "zero or more" of the preceding character, so that pattern also matches lines with no leading hyphen at all, which is every line in the file.
- > file2.txt: Redirects the output of grep to a file called file2.txt. If file2.txt already exists, it is overwritten with the new content.
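To see the difference between the two patterns, here is a small sketch. The sample file contents and the temporary directory are made up for illustration.

```shell
# Build a small sample file (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' '- first item' 'plain text' '-- second item' '' > "$tmpdir/file.txt"

# Keep only the lines that start with at least one hyphen.
grep '^-' "$tmpdir/file.txt" > "$tmpdir/file2.txt"
hyphen_lines=$(wc -l < "$tmpdir/file2.txt")

# '^-*' allows zero hyphens, so it matches every line, including the empty one.
all_lines=$(grep -c '^-*' "$tmpdir/file.txt")

echo "lines starting with a hyphen: $hyphen_lines"
echo "lines matched by '^-*': $all_lines"
rm -rf "$tmpdir"
```

Since grep can read files directly, grep '^-' file.txt > file2.txt gives the same result without cat.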
More cool commands
Extracting lines containing a specific keyword and saving them to a new file:
grep "keyword" file.txt > new_file.txt
Counting the number of occurrences of a specific word in a file (-w restricts matches to whole words):
grep -ow "word" file.txt | wc -l
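It is easy to mix up three different counts here: every occurrence, whole-word occurrences, and matching lines. A quick sketch with made-up sample text shows how they differ.

```shell
# Sample text (hypothetical): "swordfish" contains "word" as a substring.
tmpdir=$(mktemp -d)
printf '%s\n' 'word word' 'another word' 'swordfish' > "$tmpdir/file.txt"

# -o prints each match on its own line, so wc -l counts occurrences.
occurrences=$(grep -o 'word' "$tmpdir/file.txt" | wc -l)

# Adding -w ignores matches that are part of a longer word.
whole_words=$(grep -ow 'word' "$tmpdir/file.txt" | wc -l)

# -c counts matching lines, not occurrences.
matching_lines=$(grep -c 'word' "$tmpdir/file.txt")

echo "occurrences: $occurrences, whole words: $whole_words, lines: $matching_lines"
rm -rf "$tmpdir"
```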
Removing blank lines from a file:
sed '/^$/d' file.txt > new_file.txt
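Note that '/^$/d' only deletes lines that are completely empty. If the file may contain lines made up of spaces or tabs, a slightly wider pattern is one option, sketched here on made-up input.

```shell
# Sample input (hypothetical): one empty line and one line with only spaces.
tmpdir=$(mktemp -d)
printf 'alpha\n\n   \nbeta\n' > "$tmpdir/file.txt"

# Deletes only the truly empty line; the spaces-only line survives.
empty_removed=$(sed '/^$/d' "$tmpdir/file.txt" | wc -l)

# Deletes lines that are empty or contain only whitespace.
blank_removed=$(sed '/^[[:space:]]*$/d' "$tmpdir/file.txt" | wc -l)

echo "after /^\$/d: $empty_removed lines; after whitespace-aware delete: $blank_removed lines"
rm -rf "$tmpdir"
```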
Removing leading and trailing whitespace from lines in a file ([[:space:]] is used because \t inside brackets is not understood by every sed):
sed 's/^[[:space:]]*//;s/[[:space:]]*$//' file.txt > new_file.txt
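A quick way to check the trimming is to feed sed a single padded line. The input string below is made up, and the square brackets in the echo are only there to make the edges visible.

```shell
# A line padded with spaces and tabs on both sides (hypothetical input).
trimmed=$(printf '  \thello world \t \n' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

# Whitespace inside the line is left alone; only the edges are stripped.
echo "[$trimmed]"
```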
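Extracting a range of lines from a file (replace start_line and end_line with line numbers; for example, '5,10p' prints lines 5 through 10):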
sed -n 'start_line,end_linep' file.txt > new_file.txt
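If the line numbers live in shell variables, the sed script needs double quotes so the variables expand. A minimal sketch, with variable names and input chosen for illustration:

```shell
# Hypothetical range: lines 2 through 4.
start_line=2
end_line=4

# Double quotes let the shell substitute the variables before sed runs.
selected=$(printf '%s\n' one two three four five six | sed -n "${start_line},${end_line}p")

echo "$selected"
```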
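Sorting a file by a specific column (writing the column number twice limits the sort key to that one column; a single number would extend the key to the end of the line):
sort -k column_number,column_number file.txt > new_file.txt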
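The way -k picks its key is worth a small experiment. In this sketch the sample rows are made up, LC_ALL=C is set only to make the ordering predictable, and -s (supported by GNU and BSD sort) keeps tied lines in their input order.

```shell
# Sample rows (hypothetical): name, age, city separated by spaces.
tmpdir=$(mktemp -d)
printf '%s\n' 'carol 30 paris' 'alice 25 rome' 'bob 25 berlin' > "$tmpdir/file.txt"

# -k 2 uses everything from field 2 to the end of the line as the key,
# so the city decides between the two rows with age 25.
first_key_to_end=$(LC_ALL=C sort -k 2 "$tmpdir/file.txt" | head -n 1)

# -k 2,2 uses field 2 only; -s keeps tied rows in their original order.
first_single_field=$(LC_ALL=C sort -s -k 2,2 "$tmpdir/file.txt" | head -n 1)

echo "key to end of line: $first_key_to_end"
echo "single-field key:   $first_single_field"
rm -rf "$tmpdir"
```

For files that use another separator, such as commas, sort accepts -t ',' to set the field delimiter.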
Removing duplicates from a file:
sort file.txt | uniq > new_file.txt
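The sort step matters because uniq only collapses duplicate lines that sit next to each other. A short sketch on made-up input:

```shell
# Sample input (hypothetical): duplicates exist but are never adjacent.
tmpdir=$(mktemp -d)
printf '%s\n' b a b a c > "$tmpdir/file.txt"

# Without sorting, uniq finds no adjacent duplicates and removes nothing.
unsorted_count=$(uniq "$tmpdir/file.txt" | wc -l)

# Sorting first groups equal lines together so uniq can drop the repeats.
deduped=$(LC_ALL=C sort "$tmpdir/file.txt" | uniq | paste -s -d ' ' -)

# sort -u does the same thing in one step.
deduped_short=$(LC_ALL=C sort -u "$tmpdir/file.txt" | paste -s -d ' ' -)

echo "uniq alone keeps $unsorted_count lines; sort | uniq gives: $deduped"
rm -rf "$tmpdir"
```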
Reversing the order of lines in a file:
tac file.txt > new_file.txt
Extracting a specific column from a tab-delimited file (cut splits on tabs by default; use -d to choose another delimiter):
cut -f column_number file.txt > new_file.txt
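Here is a small sketch of cut on both tab-separated and comma-separated input. The sample data is made up.

```shell
# Tab-separated input: no -d needed, because tab is the default delimiter.
second_tab=$(printf 'a\tb\tc\n' | cut -f 2)

# Comma-separated input (hypothetical CSV): -d ',' sets the delimiter.
second_csv=$(printf 'id,name,city\n1,Ada,London\n' | cut -d ',' -f 2)

echo "tab column 2: $second_tab"
echo "csv column 2:"
echo "$second_csv"
```

Keep in mind that cut does not understand quoted CSV fields, so this only works for simple files without commas inside values.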
Replacing a specific word or pattern in a file:
sed 's/old_word/new_word/g' file.txt > new_file.txt
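The trailing g is what makes sed replace every match on a line instead of only the first one. A tiny sketch on a made-up line:

```shell
# Without g, only the first match on each line is replaced.
first_only=$(printf 'old old old\n' | sed 's/old/new/')

# With g, every match on the line is replaced.
all_matches=$(printf 'old old old\n' | sed 's/old/new/g')

echo "without g: $first_only"
echo "with g:    $all_matches"
```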
Counting the number of lines in a file:
wc -l file.txt
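One detail worth knowing: wc -l counts newline characters, so a final line without a trailing newline is not counted. A quick sketch with made-up input:

```shell
# Three lines, each ending in a newline.
with_newline=$(printf 'a\nb\nc\n' | wc -l)

# Same text, but the last line has no trailing newline.
without_newline=$(printf 'a\nb\nc' | wc -l)

echo "with trailing newline: $with_newline; without: $without_newline"
```

When only the number is wanted, wc -l < file.txt prints the count without the file name.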
Extracting lines containing a pattern and the lines immediately before and after:
grep -A num_lines_after -B num_lines_before "pattern" file.txt > new_file.txt
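As a concrete case, this sketch asks for one line of context on each side of the match. The input lines are made up.

```shell
# -B 1 prints one line before the match, -A 1 prints one line after it.
context=$(printf '%s\n' one two three four five | grep -A 1 -B 1 'three')

echo "$context"
```

When the same number is wanted on both sides, -C 1 is a shorter way to write it.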
Extracting lines that match any of several patterns:
grep -E "pattern1|pattern2" file.txt > new_file.txt
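The | inside the pattern means "or". If the goal is lines that contain all of the patterns, chaining two grep commands is one simple approach. Both cases are sketched here on made-up log lines.

```shell
# Sample log lines (hypothetical).
tmpdir=$(mktemp -d)
printf '%s\n' 'error: disk full' 'warning: disk low' 'info: ok' > "$tmpdir/file.txt"

# Lines matching either pattern.
either_count=$(grep -E 'error|warning' "$tmpdir/file.txt" | wc -l)

# Lines matching both patterns: filter once, then filter the result again.
both=$(grep 'disk' "$tmpdir/file.txt" | grep 'full')

echo "either: $either_count lines; both: $both"
rm -rf "$tmpdir"
```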
Sorting a file numerically by a specific column (the column number is written twice so that only that column is used as the key):
sort -n -k column_number,column_number file.txt > new_file.txt
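The -n flag matters because plain sort compares text character by character, which puts 100 before 9. A small sketch with made-up data (LC_ALL=C is set only to make the text ordering predictable):

```shell
# Sample rows (hypothetical): a name and a number.
tmpdir=$(mktemp -d)
printf '%s\n' 'disk 100' 'cpu 9' 'ram 20' > "$tmpdir/file.txt"

# Text comparison on column 2: "100" sorts before "20", which sorts before "9".
lexical=$(LC_ALL=C sort -k 2,2 "$tmpdir/file.txt" | cut -d ' ' -f 1 | paste -s -d ' ' -)

# Numeric comparison on column 2: 9, 20, 100.
numeric=$(LC_ALL=C sort -n -k 2,2 "$tmpdir/file.txt" | cut -d ' ' -f 1 | paste -s -d ' ' -)

echo "text order:    $lexical"
echo "numeric order: $numeric"
rm -rf "$tmpdir"
```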
Merging two sorted files into a single sorted file (-m merges inputs that are already sorted instead of sorting everything again):
sort -m file1.txt file2.txt > merged_file.txt
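When both inputs are already sorted, sort -m interleaves them without doing a full sort. A minimal sketch with two made-up files:

```shell
# Two small files, each already in sorted order (hypothetical content).
tmpdir=$(mktemp -d)
printf '%s\n' a c e > "$tmpdir/file1.txt"
printf '%s\n' b d f > "$tmpdir/file2.txt"

# -m merges the pre-sorted inputs into one sorted stream.
merged=$(LC_ALL=C sort -m "$tmpdir/file1.txt" "$tmpdir/file2.txt" | paste -s -d ' ' -)

echo "$merged"
rm -rf "$tmpdir"
```

If either input is not actually sorted, -m will not fix that, so a plain sort of both files is the safer choice when in doubt.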
Extracting a random sample of lines from a file:
shuf -n num_lines file.txt > new_file.txt
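Because the output is random, the only thing that can be checked reliably is the size of the sample. A short sketch, assuming shuf from GNU coreutils is installed:

```shell
# Pick 3 random lines out of 10 (input values are made up).
sample=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | shuf -n 3)

# The lines differ from run to run, but the count stays the same.
sample_size=$(printf '%s\n' "$sample" | wc -l)

echo "picked $sample_size lines:"
echo "$sample"
```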
These are just a few more examples of Unix commands that can be used to manipulate and filter content in a file. There are many other useful commands that can be combined in different ways to achieve specific goals.
Cheers!
Dan D.