Q: I need to use my Linux system to grep email addresses out of a text file. Is there a way I can tell grep to just look for emails?
A: You can use regular expressions with grep. With a well-constructed regex you can pull just about anything out of a text file. Below we use grep with the -E option, which makes grep interpret the pattern as an extended regular expression (so + works as a repetition operator without a backslash). The -o option tells grep to print only the matching text, not the whole line.
grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" filename.txt
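You can also use egrep instead of grep with the -E switch. Note that egrep is deprecated, and recent GNU grep releases print a warning when it is used, so prefer grep -E in new scripts.
egrep -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" filename.txt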
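That's it. This regular expression finds most ordinary email addresses in your file. It is not a full validator: addresses that use other characters in the part before the @, such as + or _, are missed or only partially matched.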
$ grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" test
awesome@putorius.net
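A file often mentions the same address more than once. The sketch below is one way to list each address a single time by piping the matches through sort -u. The function name extract_emails and the sample text are made up for illustration, and the pattern escapes the period before the final part so that it matches only a literal dot.

```shell
# Hypothetical helper (not a standard command): print each distinct
# email address found in the file given as the first argument.
extract_emails() {
    grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' "$1" | sort -u
}

# Build a small sample file (contents invented for this example).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Contact awesome@putorius.net for details.
Reply to: Bob <bob.smith@example.com>, awesome@putorius.net
No address on this line.
EOF

# Prints the two distinct addresses, one per line, in sorted order.
extract_emails "$sample"
rm -f "$sample"
```

Single quotes around the pattern keep the shell from touching the backslashes, which is a little safer than double quotes once a pattern grows.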
Let's break down the regular expression.
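\b is a word boundary, so we put one on each side. It matches the position between a word character (a letter, digit or underscore) and a non-word character or the start or end of the line, so the address does not need blank space around it; punctuation such as < or , works too.
[a-zA-Z0-9.-] lists the characters we accept at the beginning of the email address: lowercase a to z, uppercase A to Z, any digit, a period or a dash.
The plus sign means one or more of the preceding item, so [a-zA-Z0-9.-]+ matches a run of one or more of those characters.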
Then we specify the @ symbol, which is very recognizable.
Then we repeat the same character class twice, separated by a period. Written as \., the period matches only a literal dot; an unescaped . would match any single character. Together these pieces make up the basic structure of an email address.
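To see how the individual pieces behave, here is a small sketch you can paste into a shell. The matches helper is hypothetical, written only for this demonstration, and the shortened [a-z] patterns are simplifications of the full expression.

```shell
# Hypothetical helper: print "yes" if the extended regex in $1 matches
# the string in $2, otherwise print "no".
matches() {
    if printf '%s\n' "$2" | grep -E -q "$1"; then
        echo yes
    else
        echo no
    fi
}

# + means "one or more of the preceding item": this pulls out the digit run.
printf 'ab123cd\n' | grep -E -o '[0-9]+'

# \b needs no blank space; the angle brackets are enough of a boundary.
matches '\b[a-z]+@[a-z]+\.[a-z]+\b' '<user@example.com>'

# An unescaped . matches any character, so a string with no dot still matches.
matches '[a-z]+@[a-z]+.[a-z]+' 'user@examplecom'

# An escaped \. requires a literal period, so the same string is rejected.
matches '[a-z]+@[a-z]+\.[a-z]+' 'user@examplecom'
```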
From grep man pages:
-E = Interpret PATTERN as an extended regular expression.
-o = Show only the part of a matching line that matches PATTERN.
Resources:
GREP MAN PAGE: https://ss64.com/bash/grep.html