Putorius

Grep All Email Addresses from a Text File

Q: I need to use my Linux system to grep email addresses out of a text file. Is there a way I can tell grep to just look for emails?

A: You can use regular expressions with grep. With a well-constructed regex you can pull just about anything out of a text file. Below we use grep with the -E option, which tells grep to interpret the pattern as an extended regular expression (without it, grep uses basic regular expressions, where an unescaped + is treated as a literal character). The -o option tells grep to print only the part of each line that matches, not the whole line.

grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" filename.txt

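You can also use egrep instead of grep -E. The two are equivalent, but egrep is deprecated and recent GNU grep releases print a warning when it is used, so grep -E is the safer choice in scripts.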

egrep -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" filename.txt

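That's it. With the above regular expression you should be able to find most of the email addresses in your file. Keep in mind that the pattern only allows letters, digits, periods and dashes, so an address with an underscore or a plus sign before the @ will be matched only in part.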

$ egrep -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" test
awesome@putorius.net
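
If you want to try the command on known input first, here is a minimal sketch. The file name sample.txt and its contents are made up for this example, and the period before the final part of the pattern is escaped as \. so that it only matches a literal dot.

```shell
# Build a small throwaway file (hypothetical contents) to test against.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
Contact awesome@putorius.net for help.
Sales: sales@example.com, support@example.org
No address on this line.
EOF

# -E enables extended regex, -o prints each match on its own line.
emails=$(grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' "$tmpdir/sample.txt")
printf '%s\n' "$emails"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Each address is printed on its own line, even when two of them share a line in the file.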

Let's break down the regular expression.

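\b is a word boundary, so we put one on each side. It matches the position between a word character (a letter, digit or underscore) and anything else, including the start or end of the line. The address does not need a blank space around it: punctuation such as a comma, quote or angle bracket also counts as a boundary.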
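
As a quick sketch of how \b behaves, the line below wraps two addresses in punctuation with no spaces around them. The sample line is made up for illustration, and the pattern uses an escaped period (\.) before the last part.

```shell
# Two addresses surrounded by punctuation instead of spaces (hypothetical input).
line='<awesome@putorius.net>,"admin@example.com"'

# \b still matches next to <, >, comma and quote characters.
matches=$(printf '%s\n' "$line" | grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b')
printf '%s\n' "$matches"
```

Both addresses are found, without the surrounding brackets or quotes.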

[a-zA-Z0-9.-] tries to specify the valid characters for the part of the email address before the @. These are lowercase a to z, uppercase A to Z, any digit, a period or a dash.

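The plus sign is a quantifier. It means "one or more of the preceding item", so [a-zA-Z0-9.-]+ matches a run of one or more of those characters.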
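
A small sketch of what the plus sign does in an extended regex, compared with the star. The three input lines are made up for this example.

```shell
# Three hypothetical lines: one letter, no letters, and three letters before the @.
input=$(printf 'a@b\n@b\naaa@b\n')

# + needs at least one letter before the @, so the middle line is skipped.
plus=$(printf '%s\n' "$input" | grep -E -o '[a-z]+@')

# * allows zero letters, so the bare @ on the middle line matches too.
star=$(printf '%s\n' "$input" | grep -E -o '[a-z]*@')

printf 'with +:\n%s\n' "$plus"
printf 'with *:\n%s\n' "$star"
```

This is why the email pattern uses +: an @ with nothing in front of it should not count as an address.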

Then we specify the @ symbol, which is very recognizable.

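Then we repeat the same bracket expression two more times, separated by a literal period, to cover the domain name and its ending (putorius and net in the example above). The period has to be escaped as \. because an unescaped . in a regex matches any single character. Together this makes up the basic structure of an email address.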
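
To see what the backslash in front of the period changes, here is a minimal sketch that runs both forms of the pattern against a string with no dot in the domain. The test string is made up for illustration.

```shell
# Hypothetical input: there is no period after the @ at all.
text='user@localhost'

# Unescaped . matches any character, so this still reports a match.
loose=$(printf '%s\n' "$text" | grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+.[a-zA-Z0-9.-]+\b')

# Escaped \. requires a real period, so nothing matches.
# grep exits with status 1 on no match; || true keeps a strict shell from aborting.
strict=$(printf '%s\n' "$text" | grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' || true)

printf 'unescaped: [%s]\n' "$loose"
printf 'escaped:   [%s]\n' "$strict"
```

The unescaped version accepts user@localhost, while the escaped version prints nothing between the brackets.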

From grep man pages: 
-E = Interpret PATTERN as an extended regular expression.
-o = Show only the part of a matching line that matches PATTERN.
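
A short sketch of the difference -o makes, using the same pattern with and without it. The sample line and the variable names are made up for this example.

```shell
# Hypothetical line of text containing one address.
line='Write to awesome@putorius.net today.'
pattern='\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b'

# Without -o, grep prints the whole matching line.
whole=$(printf '%s\n' "$line" | grep -E "$pattern")

# With -o, grep prints only the text that matched the pattern.
only=$(printf '%s\n' "$line" | grep -E -o "$pattern")

printf 'without -o: %s\n' "$whole"
printf 'with -o:    %s\n' "$only"
```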

Resources:
GREP MAN PAGE: https://ss64.com/bash/grep.html
