Abstract
This paper will examine how to trace the origin of an e-mail message using message headers, message text,
and network logs. It will also discuss some of the challenges today in tracking the source of e-mail
messages and in determining the identity of the senders.
Keywords: E-mail forensics, e-mail headers, e-mail tracking, sender, receiver, identity
1 Introduction
E-mail is often used in criminal acts and inappropriate actions such as threats, fraud (phishing), sexual harassment
and stalking, mail bombing, and extortion. Almost any type of crime today involves some form of Internet related
evidence. Organizations such as private companies have to deal with a comparable range of Internet related issues,
from handling inappropriate e-mail, including threats, to preventing and co-operating in the fight against spamming
and phishing.
In principle, e-mail can be hard to connect to an individual, but in practice it can often be traced and connected to
the perpetrator. E-mail headers contain information that allows investigators to trace a message, and the e-mail itself
contains the most important clues regarding its authenticity. Together with information contained in network, router,
and e-mail server logs, this makes it possible to trace the path of the e-mail from the recipient back to the sender.
Not all e-mail can be traced, however, especially if the sender used identity concealing methods such as an
anonymizer service, spoofing techniques, bot networks, tunneling, open proxies, open mail relays, or
untraceable Internet connections. When such techniques are used, it can also be quite difficult to determine the
identity of the sender just by examining the electronic trail.
This paper will examine standard e-mail tracing techniques, and some of the techniques that e-mail senders use
to hide their electronic trail, making it difficult to track the source. Some advanced digital forensic techniques for
e-mail tracing will also be presented.
2 E-mail fundamentals
Users typically send e-mail using an e-mail client. The client sends the e-mail, usually consisting of a text file, to
an e-mail server. The server hands the e-mail to the client at the receiver’s machine if it is a local delivery, or it
hands it to other e-mail servers via the Internet. The e-mail may actually get relayed between other mail hosts before
it is eventually transferred to a mail server in the recipient’s domain [3].
At least four computers are generally needed to process e-mail. The first is the sender’s computer, where the
e-mail originated. The second is the e-mail server of the sender’s ISP (Internet Service Provider), which receives the
e-mail from the sender’s computer. The third is the e-mail server of the recipient’s ISP, which receives the e-mail
from the sender’s ISP. Finally, the fourth is the computer of the recipient, which receives the message from their
ISP’s e-mail server. Each of the computers has a unique identifier called an IP (Internet Protocol) address. If the
receiver and the sender use the same ISP, then a minimum of three computers are involved [10].
Following are some services that are important to know about in order to effectively trace e-mail.
The important thing to understand about these different protocols is that their use affects where mail messages
are stored. POP mail users always use their local machines for their e-mail archives, while clients using IMAP and
MAPI have the option of storing e-mails on the server. Incoming and outgoing messages for web based post office
service such as Google Mail, Yahoo Mail, or AOL are also stored on the server [10].
Outgoing e-mail uses a different protocol called Simple Mail Transfer Protocol (SMTP). SMTP is a simple
protocol consisting of a few text based commands or keywords. Servers that accept mail and relay it to other servers
(also called Mail Transfer Agents or MTAs) also use SMTP [10]. Non-SMTP transmissions can occur as well when
e-mails are exchanged between users on the same e-mail system. The important thing to understand here is that
when messages are relayed between servers over SMTP, each server records the IP address of the server connecting
to it, and these IP addresses are added to the information in the e-mail header. E-mail servers can also maintain
logging information, which makes them a reliable source of information about e-mail headers.
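The text based nature of SMTP can be illustrated with a minimal sketch. The function name smtp_commands and the example.org and example.net addresses are assumptions made for illustration only; the command keywords themselves (HELO, MAIL FROM, RCPT TO, DATA, QUIT) are standard SMTP commands.

```python
def smtp_commands(client_host, sender, recipient, message_text):
    """Return the lines a client would send in a minimal SMTP session.

    This only builds the text; it does not open a network connection.
    """
    return [
        f"HELO {client_host}",      # client announces a host name of its choosing
        f"MAIL FROM:<{sender}>",    # envelope sender, also supplied by the client
        f"RCPT TO:<{recipient}>",   # envelope recipient
        "DATA",                     # headers and body follow
        message_text,
        ".",                        # a line holding a single dot ends the message
        "QUIT",
    ]


if __name__ == "__main__":
    session = smtp_commands(
        "pc.example.org",
        "alice@example.org",
        "bob@example.net",
        "Subject: Hi\r\n\r\nHello Bob",
    )
    for line in session:
        print(line)
```

Note that the host name in HELO and the address in MAIL FROM are supplied by the sending side, whereas the IP address a receiving server records comes from the network connection itself.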
Each of these entities maintains a database of registered IP addresses and contact data that is freely available on
the internet [11].
3 E-mail tracing
Determining an e-mail’s origin is referred to as “tracing” [12]. This process requires examining e-mail headers, e-
mail message files, and e-mail server logs.
TRACING E-MAIL MESSAGES
Read Bottom Up
CC: Jayesh Naithani <jayesh.naithani@lawson.com>
Date: Sat, 1 May 2010 12:41:11 -0500
Subject: How are you?
Thread-Topic: How are you?
Thread-Index: AQHK6VV9wvdSp0Todk6XpkEWJuJyxw==
Message-ID: <AC162E333D854F429B9DFD32777480EB2D4365C090@UST-E2K7VS3.stthomas.edu>
Accept-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
MIME-Version: 1.0
X-Reverse-DNS: smtp4.stthomas.edu
X-MIMETrack: Itemize by SMTP Server on SMTP/Lawson at 05/01/2010 12:41:16 PM,
Serialize by Router on USSPC01/Server/Lawson(Release 7.0.3FP1|February 24, 2008) at
05/01/2010 12:41:35 PM, Serialize complete at 05/01/2010 12:41:35 PM
X-TNEFEvaluated: 1
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="us-ascii"
Content-Language: en-US
Return-Path: nait4086@stthomas.edu
Hi Jayesh,
- Jayesh
The header contains routing information about the e-mail, source and destination information, IP address of the
sender, and timestamps. The message body (in green) contains the actual message, and can optionally contain
attachments. The message headers (in teal) are the most important part for investigating and tracing the email.
X-headers are inserted by client programs and are a de facto standard for passing information to other e-mail
handling programs for processing. E-mail headers should always be viewed with caution by investigators, as they
can easily be faked [3]. Once the originating IP address is identified, it can be used to determine the owner of the IP
address. Inspection of the server logs obtained may reveal more information about the identity of the person using
that address at the date and time the e-mail was sent.
Using the bottom-to-top approach, the earliest line, at the bottom, indicates the mail being received by the
sender’s e-mail server. Subsequent “Received” lines show handoffs from one e-mail host to another. The
topmost or last “Received” line is the handoff to the final e-mail host in the routing chain involved in delivering the
e-mail to the receiver. Here is the format of a typical “Received” record:
Together the “Received” lines should form an unbroken chain from the sender’s e-mail host to the receiver’s e-
mail host, and describe the progress of the e-mail from the sender to the receiver. At each point, the host names
should be verified to resolve to their IP addresses. It is also important to examine the logs of all servers in the
received chain as soon as possible to verify the routing information indicated in the headers, because log files are
often removed quickly. The actual computer used to send the information, or to receive the message if using a
non-HTTP e-mail client, may also not be reported in the header [11].
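The bottom-to-top reading described above can be sketched with Python's standard email module. The sample message, its host names, and its IP addresses are invented placeholders rather than data from a real investigation, and received_hops is a hypothetical helper name. The sketch assumes each “Received” line carries the connecting IP address in square brackets, which is common but not guaranteed.

```python
import re
from email import message_from_string

# Invented sample; the topmost "Received" line was added last.
SAMPLE = """\
Received: from mx.recipient.example (mx.recipient.example [203.0.113.9])
 by mail.recipient.example; Sat, 1 May 2010 12:41:16 -0500
Received: from smtp.sender.example (smtp.sender.example [198.51.100.7])
 by mx.recipient.example; Sat, 1 May 2010 12:41:14 -0500
Received: from pc.sender.example (pc.sender.example [192.0.2.55])
 by smtp.sender.example; Sat, 1 May 2010 12:41:11 -0500
From: sender@sender.example
To: receiver@recipient.example
Subject: How are you?

Hi
"""

BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_hops(raw_message):
    """Return the Received records ordered from earliest (bottom) to latest (top)."""
    msg = message_from_string(raw_message)
    hops = []
    for value in reversed(msg.get_all("Received", [])):
        flat = " ".join(value.split())      # undo header line folding
        match = BRACKETED_IP.search(flat)   # first bracketed IPv4 address, if any
        hops.append({"ip": match.group(1) if match else None, "raw": flat})
    return hops


if __name__ == "__main__":
    for number, hop in enumerate(received_hops(SAMPLE), start=1):
        print(number, hop["ip"], hop["raw"])
```

The first entry returned corresponds to the handoff from the sender's machine to the sender's e-mail server, which is the record an investigator would normally look at first.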
Finally, the routing information added by servers after the message has left the sender is not under the control of
the sender, and cannot be forged. However, bogus “Received” fields can still be inserted before the message is sent,
and forensic investigators should always be on the lookout for these towards the lower end of the routing chain.
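One rough way to look for bogus entries is to check that the chain is unbroken: reading bottom up, the host named after "by" in one record should reappear after "from" in the next record. The following is a sketch under that simplifying assumption, with the hypothetical helper name chain_breaks and invented host names. In real mail the "from" name is whatever the connecting machine announced, so a mismatch is only a cue to check server logs, not proof of forgery.

```python
import re

# Capture the host after "from" and the host after the first "by".
HOP_RE = re.compile(r"from\s+(?P<src>\S+).*?\bby\s+(?P<dst>[^\s;]+)", re.S)


def chain_breaks(received_values):
    """Find breaks in a list of Received values given in header order (top first).

    Returns (position, expected_host, found_host) tuples, where position counts
    hops from the bottom of the chain starting at 0.
    """
    hops = []
    for value in reversed(received_values):
        match = HOP_RE.search(value)
        if match:
            hops.append((match.group("src").lower(), match.group("dst").lower()))
    breaks = []
    for position in range(1, len(hops)):
        expected = hops[position - 1][1]   # host that accepted the previous hop
        found = hops[position][0]          # host claimed as source of this hop
        if found != expected:
            breaks.append((position, expected, found))
    return breaks


if __name__ == "__main__":
    suspicious = [
        "from pc.sender.example (pc.sender.example [192.0.2.55]) by smtp.sender.example; Sat, 1 May 2010 12:41:11 -0500",
        "from mail.bogus.example (mail.bogus.example [10.1.1.1]) by relay.bogus.example; Sat, 1 May 2010 12:40:00 -0500",
    ]
    print(chain_breaks(suspicious))
```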
Message IDs help when searching e-mail logs to corroborate whether the message was in fact sent via a particular computer,
and can sometimes contain time related and other helpful information about the e-mail. The message-ID is a unique
string assigned by the mail system when the message was created. SMTP servers involved in receiving the message
and passing it along towards its destination also create message IDs (see Table 6).
Once the e-mail sender’s identity has been confirmed, the Message ID can be a strong piece of evidence to
associate the machine used to send the e-mail to the received e-mail headers. Specialized forensic tools are often
required to find deleted e-mail files on the sender’s computer [3].
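As a small illustration of using a Message ID when searching logs, the sketch below splits a Message-ID value into its unique string and host part and then filters log lines for it. The helper names split_message_id and find_in_log are hypothetical, and the log lines are invented placeholders, since log formats vary between mail systems. The Message-ID value is the one shown in the sample header earlier in this paper.

```python
def split_message_id(value):
    """Split '<unique@host>' into (unique, host); host is None if no '@' is present."""
    inner = value.strip().strip("<>")
    unique, at_sign, host = inner.rpartition("@")
    if not at_sign:
        return inner, None
    return unique, host


def find_in_log(log_lines, message_id):
    """Return the log lines that mention the given Message ID."""
    needle = message_id.strip().strip("<>")
    return [line for line in log_lines if needle in line]


if __name__ == "__main__":
    mid = "<AC162E333D854F429B9DFD32777480EB2D4365C090@UST-E2K7VS3.stthomas.edu>"
    print(split_message_id(mid))
    # Invented log lines, for illustration only.
    log = [
        "2010-05-01 12:41:11 accepted id=AC162E333D854F429B9DFD32777480EB2D4365C090@UST-E2K7VS3.stthomas.edu",
        "2010-05-01 12:42:03 accepted id=SOMETHING-ELSE@mail.example.org",
    ]
    print(find_in_log(log, mid))
```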
Using SMTP stand-alone client applications such as Microsoft Outlook, IBM Lotus Notes and others.
Using HTTP (web) based e-mail services such as AOL, Yahoo Mail, Google Mail and others.
Sending e-mail manually using other applications and scripts.
Tracing e-mail origin can differ depending on initiation methods. With HTTP based e-mail, if the sender and
receiver are using the same web service, no servers are involved in exchanging e-mail outside of the web service. In
such cases, tracing the e-mail becomes a matter of determining the ISP that owns the IP address in the “Received”
section of the e-mail header. If a sender uses a company’s SMTP server, then Network Address Translation (NAT),
which enables a corporate network to use one set of IP addresses for internal traffic and another set for external
traffic [5], is likely to be in use, and a non-published IP address could be recorded in e-mail headers [3].
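A quick check for such non-published addresses can be sketched with Python's standard ipaddress module. The sketch assumes that "non-published" means the private IPv4 ranges 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, and is_internal_address is a hypothetical helper name.

```python
from ipaddress import ip_address, ip_network

# Private IPv4 ranges commonly used behind NAT (assumed meaning of "non-published").
PRIVATE_RANGES = [
    ip_network("10.0.0.0/8"),
    ip_network("172.16.0.0/12"),
    ip_network("192.168.0.0/16"),
]


def is_internal_address(address):
    """Return True if the IPv4 address lies in one of the private ranges above."""
    ip = ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


if __name__ == "__main__":
    for candidate in ("10.0.0.5", "172.16.4.1", "192.168.1.20", "8.8.8.8"):
        print(candidate, is_internal_address(candidate))
```

An address flagged this way cannot be looked up in public registries; identifying the machine then depends on the internal records of the organization that operates the network.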
Some factors are shared between initiation methods, such as static and dynamic address policies of ISPs and
corporate networks, the importance of time information in e-mail headers, and the use of tunneling services such as
SSH (Secure Shell) [3, 4]. Other factors differ, such as false headers and use of open relay and open proxy servers
which are more relevant to SMTP based e-mails. All these factors are important when examining e-mail headers to
determine the origin of the e-mail, or when reviewing the e-mail headers for forged information.
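Because header timestamps carry their own time zone offsets, it can help to normalize them to UTC before comparing them with server logs. The sketch below uses the standard email.utils.parsedate_to_datetime function on the Date value from the sample header shown earlier; header_time_utc is a hypothetical helper name.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def header_time_utc(date_value):
    """Parse an e-mail date string and return it as a UTC datetime."""
    return parsedate_to_datetime(date_value).astimezone(timezone.utc)


if __name__ == "__main__":
    stamp = header_time_utc("Sat, 1 May 2010 12:41:11 -0500")
    print(stamp.isoformat())  # 2010-05-01T17:41:11+00:00
```

With dynamic address policies, an IP address only identifies a subscriber for a particular time window, so a consistent time base matters when requesting records from an ISP.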
Decoding the information in the above header reveals the IP address of the sender. Facebook has recently
changed this header to only include localhost (127.0.0.1) Base64 encoded, to preserve the anonymity of the sender
[14]. But this is enough to demonstrate that e-mail clients can provide additional clues about the sender’s location
by adding information via e-mail headers, and as a result of features which are typical of the e-mail client.
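As a minimal sketch of the decoding step, Python's standard base64 module can turn a Base64 encoded header value back into text. The helper name decode_header_value is hypothetical, and the sample value is simply the Base64 form of 127.0.0.1 generated in the code, not a value copied from a real header.

```python
import base64


def decode_header_value(encoded):
    """Decode a Base64 encoded header value into ASCII text."""
    return base64.b64decode(encoded).decode("ascii")


if __name__ == "__main__":
    # Encode localhost first so the round trip is visible.
    encoded = base64.b64encode(b"127.0.0.1").decode("ascii")
    print(encoded)                        # MTI3LjAuMC4x
    print(decode_header_value(encoded))   # 127.0.0.1
```

Base64 is an encoding rather than encryption, so any value stored this way can be read by whoever receives the message.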
An HTTP “<img src>” tag whose image source is placed on an HTTP server. When the receiver (here, the
person under investigation) opens the message, an entry containing the IP address of the receiver’s machine
is logged on the HTTP server holding the image. This helps track down the sender and validate the
ownership of the e-mail address. However, this technique may not always work, as some browsers and
e-mail clients block the downloading of images [3].
If the sender is using an open proxy server that makes it difficult to track them down, then forensic
investigators can try sending e-mails with an embedded Java applet or ActiveX control that extracts the
receiver’s IP address and other information from their machine and sends it back to the investigator [2].
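The image based technique in the first item can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular tool: build_tracking_message is a hypothetical helper, the addresses and the https://images.example.org/t URL are placeholders, and the idea of putting a random token in the image name (so that a request in the web server log can be matched to one specific message) is an assumption added here.

```python
import uuid
from email.message import EmailMessage


def build_tracking_message(sender, recipient, image_base_url):
    """Build an HTML e-mail whose image URL contains a unique token.

    Returns (token, message). The token is what the investigator would later
    look for in the access log of the HTTP server that hosts the image.
    """
    token = uuid.uuid4().hex
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Re: your message"
    msg.set_content("Please view this message in an HTML capable e-mail client.")
    html = (
        "<html><body><p>Hello,</p>"
        f'<img src="{image_base_url}/{token}.png" width="1" height="1">'
        "</body></html>"
    )
    msg.add_alternative(html, subtype="html")
    return token, msg


if __name__ == "__main__":
    token, message = build_tracking_message(
        "investigator@example.org", "suspect@example.net", "https://images.example.org/t"
    )
    print(token)
    print(message.get_body(preferencelist=("html",)).get_content())
```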
5.1 Spoofing
Spoofing is the process where a sender inserts fake headers into the e-mail to hide the network address of
their computer. For example, a sender can insert fake “From” and “Reply-To” headers into the e-mail [7]. When
performing an analysis of e-mail headers it is important to keep in mind that these headers can be forged.
The X-Originating-IP field can assist in the identification of the computer used to send the e-mail messages.
This field is not always included. If it is present, it should match the address in the bottom
“Received” line. If it does not, this may indicate that some of the header information has been forged.
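The comparison described above can be sketched with Python's standard email module. The helper name originating_ip_matches is hypothetical, and the sketch assumes IPv4 addresses and a bottom “Received” line that records the connecting address in square brackets.

```python
import re
from email import message_from_string

QUAD = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
BRACKETED_QUAD = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def originating_ip_matches(raw_message):
    """Compare X-Originating-IP with the IP in the bottom Received line.

    Returns True or False when both values are present, and None when the
    comparison cannot be made.
    """
    msg = message_from_string(raw_message)
    claimed = msg.get("X-Originating-IP")
    received = msg.get_all("Received", [])
    if claimed is None or not received:
        return None
    claimed_ip = QUAD.search(claimed)
    bottom_ip = BRACKETED_QUAD.search(received[-1])   # last header = earliest hop
    if claimed_ip is None or bottom_ip is None:
        return None
    return claimed_ip.group(0) == bottom_ip.group(1)


if __name__ == "__main__":
    sample = (
        "Received: from pc.sender.example (pc.sender.example [192.0.2.55]) "
        "by smtp.sender.example; Sat, 1 May 2010 12:41:11 -0500\n"
        "X-Originating-IP: [198.51.100.23]\n"
        "Subject: How are you?\n"
        "\n"
        "Hi\n"
    )
    print(originating_ip_matches(sample))  # False: the two addresses differ
```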
Once a sender’s IP address has been identified, there are several resources, such as http://samspade.org, that can
be used to identify the owner of the address.
5.5 Tunneling
Tunneling can be used by e-mail senders to evade being traced by law enforcement [3]. SSH tunneling is one
approach used by home users. When tunneling is used, only the IP address of the tunneling server shows up in the
“Received” section of e-mail headers. Also, use of the SSH server for tunneling may not be logged. In
addition, a number of users may be logged on to the same server using SSH at the time the e-mail was sent, which
makes it hard to prove the identity of the sender in court. All this makes tracing the source of a tunneled e-mail
quite complex.
6 Conclusion
Tracing e-mail messages is a large and complex topic. This paper briefly describes the basic method for tracing e-
mail messages and determining the identity of the sender using header information, server network logs, and
message text.
E-mail messages consist of a header and body, and optionally attachments. The e-mail header contains the
information a forensic investigator needs to track the origin of the message. E-mail logs and server information can
be used to additionally track the source of e-mail messages.
Finally, the paper outlines a few of the common challenges facing any forensic investigator when attempting to
identify the source and identity of the e-mail and sender. Senders can forge header information and make use of
insecure computers, open relay hosts and proxies, and untraceable Internet connections to hide their source and
identity. In such cases, a message can only be traced back to the point where the forgery begins, or to the point
beyond which routing information is untrustworthy or unavailable.
References
1. Qi, M., Edgar-Nevill, D., Wang, Y., and Xu, R. 2008. Tracking online trails. Int. J. Electron. Secur. Digit.
Forensic 1, 4 (Nov. 2008), 353-361.
2. Ickin Vural, HS Venter. Investigating Identity Concealing and Email Tracing Techniques
3. Al-Zarouni Marwan. 2004. Tracing E-mail Headers. We-B Centre & Edith Cowan University
4. Akin, T. 2003. Webmail Forensics. Retrieved 12/5/2004, from
http://opensores.thebunker.net/pub/mirrors/blackhat/presentations/bh-usa-03/bh-us-03-akin.pdf
5. http://www.wikipedia.com
6. Bill Nelson, Amelia Phillips, and Christopher Steuart. 2010. Guide to Computer Forensics and
Investigations, Fourth Edition.
7. Boneh, Dan. 2004. The Difficulties of Tracing Spam Email. Department of Computer Science Stanford
University.
8. Jones, H. 2001. Removing the Mystery from E-mail Tracing. Retrieved 6/5/2004, from
http://ncfs.ucf.edu/Email%20Tracing2.ppt
9. Venit, A. J. 2000. The Key to Unlocking E-Mail Headers. Retrieved 6/5/2004, from
http://ncfs.ucf.edu/email%20tracing%20SA%20Venit.ppt
10. Rick’s Spam Digest, http://www.rickconner.net/spamweb/anatomy.html
11. Thomas J. E. Schwarz, S.J. 2006. Email Fundamentals.
http://www.cse.scu.edu/~tschwarz/COEN252_06/Lectures/emailTracing.html
12. Thomas J. E. Schwarz, S.J. 2006. Email and Internet Investigations.
http://www.cse.scu.edu/~tschwarz/COEN252_06/Lectures/Email%20Investigation.html
13. Donald Cheung, 2010. SEIS722 Email Tracing Lecture Notes.
14. Chester Wisniewski’s Blog, http://www.sophos.com/blogs/chetw/g/2010/05/08/facebook-notifications-leak-ip-addresses