
Analysis of the File Transfer Protocol (FTP)

by Priscilla Oppenheimer

There are many myths about how FTP works. The goal of this white paper is to briefly describe FTP protocol processes to dispel some of the myths. More detailed information about FTP behavior is available in Chapter 9 of the Troubleshooting Campus Networks book by Priscilla Oppenheimer and Joseph Bardwell.

FTP was one of the first Internet protocols. It was designed for use on the Internet when the Internet was still a closed system that connected universities, government agencies, and a few commercial companies involved in the development of the system. FTP's behavior on modern networks, especially networks where security is a big concern, is problematic. This paper provides the theoretical background you will need to get FTP working properly on a modern network that uses firewalls and Network Address Translation (NAT).

FTP was designed long before the term client/server came into vogue, but it does behave like a client/server protocol nonetheless. FTP uses two TCP connections, one for control information and one for the actual data. Analysis of an FTP session involves an examination of the FTP commands sent on the control connection and an assessment of the TCP segments sent on the data connection. With normal or active FTP, the control connection is initiated by the client side and the data connection is initiated by the server side. (Active mode is also sometimes called port mode.) The other option is passive FTP, in which case the client side establishes the data connection.

FTP and TCP Port Numbers


FTP uses different TCP port numbers depending on whether active or passive FTP is in use. Before we cover FTP in more detail, we'll briefly discuss some basic concepts regarding TCP port numbers. TCP uses port numbers to identify the sending and receiving application. A port number helps TCP demultiplex byte streams and deliver bytes to the correct application. TCP ports can be semi-permanent or ephemeral (short-lived). Servers listen on the semi-permanent ports for clients wishing to access services. Clients use ephemeral ports to identify their end of a conversation. The client port only lasts while the client is using a service, whereas a server port is usually open the entire time that a server is running. TCP port numbers also fall into these three categories:

Well-known port numbers are used to identify standard services that run above TCP, including FTP, HTTP, Telnet, SMTP, and so on. Well-known port numbers are 0 to 1,023.

Registered port numbers identify an application that has been registered with the Internet Assigned Numbers Authority (IANA). Registered port numbers are 1,024 to 49,151.

Private port numbers are unregistered and can be dynamically assigned to any application. Private port numbers are 49,152 to 65,535.

A registered port number is intended for use by only the registered application. However, you will see port numbers that are supposedly "registered" used as ephemeral ports by applications that are not related to the registered application. You can get an up-to-date list of port numbers from IANA.
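The three ranges above can be expressed as a small lookup. This is only an illustrative sketch; the function name classify_port is hypothetical and not part of any library.

```python
def classify_port(port: int) -> str:
    """Return the category of a TCP port number, using the ranges listed above."""
    if not 0 <= port <= 65535:
        raise ValueError("TCP port numbers are 0 to 65,535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "private"


if __name__ == "__main__":
    for p in (21, 1663, 49152):
        print(p, classify_port(p))
```

Note that the category says nothing about how a port is actually being used: as described above, a client may pick a "registered" number as its ephemeral port.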

FTP Active Versus Passive Mode


When troubleshooting FTP problems, one of the first questions you should ask is whether active or passive mode is in use. Because their behaviors are quite different, the two modes experience different problems. In the past, client implementations defaulted to active mode. These days, many client implementations default to passive mode due to security concerns with active mode.

FTP Active Mode

The steps for active FTP are as follows:

1. The client sends a TCP SYN to the well-known FTP control port (port 21) on the server. The client uses an ephemeral port as its source port.
2. The server sends the client a SYN ACK from port 21 to the ephemeral port on the client.
3. The client sends an ACK. The client uses this connection to send FTP commands and the server uses this connection to send FTP replies.
4. When the user requests a directory listing or initiates the sending or receiving of a file, the client software sends a PORT command that includes an ephemeral port number that the client wishes the server to use when opening the data connection. The PORT command also includes an IP address, which is usually the client's own IP address, although FTP also supports a third-party mode where a client can tell a server to open a connection with a different host.
5. The server sends a SYN from port 20 to the client's ephemeral port number, which was provided to the server in the client's PORT command.
6. The client sends a SYN ACK from its ephemeral port to port 20.
7. The server sends an ACK.
8. The host that is sending data uses this new connection to send the data in TCP segments, which the other host ACKs. (With some commands, such as STOR, the client sends data. With other commands, such as RETR, the server sends data.)
9. After the data transfer is complete, the host sending data closes the data connection with a FIN, which the other host ACKs. The other host also sends its own FIN, which the sending host ACKs.
10. The client can send more commands on the control connection, which may cause additional data connections to be opened and then closed. At some point, when the user is finished, the client closes the control connection with a FIN. The server ACKs the client's FIN. The server also sends its own FIN, which the client ACKs.

Figure 1 shows a graphical representation of the first few steps of FTP active mode.

Figure 1. FTP active mode.


The FTP PORT Command

The FTP PORT command causes problems for network support engineers in many ways. For one thing, the encoding of the IP address and port number in a PORT message is not straightforward. In addition, an application-layer protocol command theoretically shouldn't include network-layer information (an IP address). This breaks the principles of protocol layering and can cause both interoperability and security problems. Figure 2 shows an example of an FTP PORT command.

Figure 2. An FTP PORT command from a client to a server.

In the detail view of the packet shown in Figure 2, the WildPackets EtherPeek protocol analyzer decodes the IP address parameter for the PORT command, followed by the port number. (See PORT 192,168,10,232,6,127.) The 6,127 portion becomes a port number by multiplying the first number by 256 and adding the second number. So the client specified a port number that is (6 x 256) + 127, which equals 1663. Figure 3 verifies that the server did indeed open a TCP connection from port 20 to port 1663.
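The arithmetic is easy to get wrong by hand, so a short sketch may help. The two helper names below are hypothetical; the only thing taken from the text is the encoding itself (four address bytes, then the high and low bytes of the port, all comma-separated).

```python
from typing import Tuple


def decode_port_argument(arg: str) -> Tuple[str, int]:
    """Turn 'h1,h2,h3,h4,p1,p2' into (dotted IP address, port number)."""
    fields = [int(f) for f in arg.split(",")]
    if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("expected six comma-separated values in 0..255")
    ip = ".".join(str(f) for f in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte times 256, plus low byte
    return ip, port


def encode_port_argument(ip: str, port: int) -> str:
    """Build the PORT argument for a dotted IP address and a port number."""
    high, low = divmod(port, 256)
    return ",".join(ip.split(".") + [str(high), str(low)])


if __name__ == "__main__":
    print(decode_port_argument("192,168,10,232,6,127"))
    print(encode_port_argument("192.168.10.232", 1663))
```

Run against the value from Figure 2, the decoder yields the address 192.168.10.232 and port 1663, matching the calculation above.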

Figure 3. A server in active mode opens a TCP connection for data transfer.

When FTP is used, network firewalls must be "stateful." The firewall must track FTP sessions and be on the lookout for FTP PORT commands. The firewall needs to anticipate the connection establishment coming from the FTP server into the client's port specified in the PORT command. If the network uses NAT, the NAT gateway must be stateful also. The gateway needs to translate the IP address in the FTP PORT command to the address assigned to the client, and then recalculate the TCP checksum. If the gateway doesn't perform these operations correctly, FTP can fail.

Hackers can exploit the third-party feature of FTP by setting the IP address and port number parameters in the PORT command to a target host's address and port numbers (sometimes called an FTP bounce attack). For example, a hacker can cause an FTP server to continually send TCP SYN packets from port 20 to a set of target ports, making it look like the server is initiating a port scan. The target won't know that the attack is coming from the hacker's machine. It will appear that it's coming from the FTP server.

Some commonly-used FTP implementations set the IP address in the PORT command to 0.0.0.0, with the intention being that the server should simply open the data connection on the same client that opened the control connection. Setting the IP address to 0.0.0.0 can confuse firewalls.
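To make the bounce-attack defense concrete, here is a minimal sketch of the kind of check a stateful device or a hardened server might apply to a PORT argument. It is not any vendor's implementation. The function name is hypothetical, and the rule that the port must be 1,024 or higher is an added assumption, not something stated above.

```python
def port_command_allowed(port_arg: str, control_peer_ip: str) -> bool:
    """Accept a PORT argument only if it points back at the control-connection peer.

    Assumed policy (illustrative only): the address must equal the peer's
    address, and the port must not be in the well-known range.
    """
    parts = port_arg.split(",")
    if len(parts) != 6:
        return False
    try:
        numbers = [int(p) for p in parts]
    except ValueError:
        return False
    if any(not 0 <= n <= 255 for n in numbers):
        return False
    requested_ip = ".".join(str(n) for n in numbers[:4])
    requested_port = numbers[4] * 256 + numbers[5]
    return requested_ip == control_peer_ip and requested_port >= 1024


if __name__ == "__main__":
    peer = "192.168.10.232"
    print(port_command_allowed("192,168,10,232,6,127", peer))  # own address
    print(port_command_allowed("10,0,0,5,0,25", peer))         # third party
    print(port_command_allowed("0,0,0,0,6,127", peer))         # 0.0.0.0 case
```

A strict comparison like this also rejects the well-meaning 0.0.0.0 clients, which is exactly the side effect described in the text.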

For example, with Cisco IOS release 6.x(x), the PIX fixup protocol for FTP does not allow the IP address for the data connection to be different from the one already in use for the control connection. The reason for this is to thwart a hacker who could use the PORT command to launch an attack on another host. Although the FTP implementations that set the IP address to 0.0.0.0 are not intentional hacks, they do trigger a problem with the fixup protocol and other firewalls that were designed to disallow FTP third-party mode and to avoid FTP bounce attacks.

FTP Passive Mode

The steps for passive FTP are described in the following list. Steps 1 through 3 are the same as the first three steps for active mode. Also, steps 9 through 11 are the same as the last three steps for active mode.

1. The client sends a TCP SYN to the well-known FTP control port (port 21) on the server. The client uses an ephemeral port as the source port.
2. The server sends the client a SYN ACK from port 21 to the ephemeral port on the client.
3. The client sends an ACK. The client uses this connection to send FTP commands and the server uses the connection to send FTP replies.
4. When the user requests a directory listing or initiates the sending or receiving of a file, the client software sends a PASV command to the server indicating the desire to enter passive mode.
5. The server replies. The reply includes the IP address of the server and an ephemeral port number that the client should use when opening the connection for data transfer.
6. The client sends a SYN from a client-selected ephemeral port to the server's ephemeral port number, which was provided to the client in the reply to the client's PASV command.
7. The server sends a SYN ACK from its ephemeral port to the client's ephemeral port.
8. The client sends an ACK.
9. The host that is sending data uses this new connection to send the data in TCP segments, which the other host ACKs. (With some commands, such as STOR, the client sends data. With other commands, such as RETR, the server sends data.)
10. After the data transfer is complete, the host sending data closes the data connection with a FIN, which the other host ACKs. The other host also sends its own FIN, which the sending host ACKs.
11. The client can send more commands on the control session, which may cause additional data connections to be opened and then closed. At some point, when the user is finished, the client closes the control connection with a FIN. The server ACKs the client's FIN. The server also sends its own FIN, which the client ACKs.

Figure 4 shows a graphical representation of the first few steps of FTP passive mode.

Figure 4. FTP passive mode.


The FTP PASV Command

A PASV request asks the server to accept a data connection on a new TCP port selected by the server. There are no parameters to the PASV command. The server's response is a single line showing the IP address of the server and the TCP port number where the server is accepting connections. Figure 5 shows a server's reply to a client's PASV command. The server tells the client that the server is listening on port 5365. (See 192,168,179,100,20,245). To calculate the port number, multiply 20 times 256 and add 245. (20 x 256 +245 = 5365.)
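A client has to pull those six numbers out of the reply line. The sketch below shows one way to do it. The reply code 227 and the comma-separated layout come from RFC 959; the surrounding wording "Entering Passive Mode" in the sample is an assumption, since Figure 5 is not reproduced here, and the function name is hypothetical.

```python
import re
from typing import Tuple

# Six comma-separated decimal numbers anywhere in the reply text.
_SIX_NUMBERS = re.compile(r"(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})")


def parse_pasv_reply(line: str) -> Tuple[str, int]:
    """Extract (server IP, server data port) from a 227 reply line."""
    if not line.startswith("227"):
        raise ValueError("not a reply to PASV")
    match = _SIX_NUMBERS.search(line)
    if match is None:
        raise ValueError("no address found in reply")
    numbers = [int(g) for g in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("address field out of range")
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return ip, port


if __name__ == "__main__":
    sample = "227 Entering Passive Mode (192,168,179,100,20,245)"
    print(parse_pasv_reply(sample))
```

For the values shown in Figure 5, this gives 192.168.179.100 and port 5365, the port the client then connects to.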

Figure 5. A server's reply to a client's PASV command.

After receiving a reply to its PASV command, the client opens a TCP connection from an ephemeral port to the port number supplied by the server. Figure 6 shows the TCP connection establishment from the client. Notice that the destination port is 5365, as expected.

Figure 6. A client in passive mode opens a TCP connection for data transfer.

Most people assume that passive mode causes fewer problems for firewalls than active mode, but note that the client opens a connection to an ephemeral (not well-known) port when using passive mode. Some firewalls and Cisco access lists block this. Also, the response from the server comes from an ephemeral port and goes to an ephemeral port. Firewalls and Cisco access lists might block this also. On a Cisco router you can use the established keyword with an access list to avoid this second problem. The established keyword tells the router to allow packets with the ACK bit set. The SYN ACK from the server has the ACK bit set.
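The ACK-bit test can be illustrated with the TCP flag values. This is a sketch that models only the rule as described above, not Cisco's implementation; the function name is hypothetical, and the flag constants are the standard bit positions in the TCP header.

```python
# Standard TCP header flag bits.
SYN = 0x02
ACK = 0x10


def permitted_by_established(tcp_flags: int) -> bool:
    """Model the rule described above: allow segments that have the ACK bit set."""
    return bool(tcp_flags & ACK)


if __name__ == "__main__":
    print(permitted_by_established(SYN))        # initial SYN from outside
    print(permitted_by_established(SYN | ACK))  # server's SYN ACK reply
```

The first segment of a new inbound connection carries SYN without ACK and is rejected, while the server's SYN ACK and every later segment of the data connection pass.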

Clear Text Username and Password


Another infamous problem with FTP is that it sends the username and password as clear text, that is, unencrypted. Anyone with a strategically-placed protocol analyzer can see the name and password. FTP sends the data in the files being transferred as clear text also. This might not seem like a huge problem because it's difficult to place a protocol analyzer in the correct place to glean the unencrypted information, but it's not impossible to do so. The problem is made worse by the fact that users tend to use the same password for many different applications. If hackers glean an FTP password, they may be getting a password that is also used to access online checking accounts or other confidential data.

Alternatives to FTP
In this day and age, there's probably no excuse for using FTP any more, considering that there are other more secure options for file transfer. Both SCP and SFTP, for example, are similar in function to FTP but use Secure Shell (SSH) authentication and encryption. If you use a Unix-based server, you should be able to invoke scp or sftp from the command line. For more information about Secure Shell, see the OpenSSH Web site.

If your use for FTP is limited to updating Web pages, there is another alternative called Web-based Distributed Authoring and Versioning (WebDAV). WebDAV is a set of extensions to the Hypertext Transfer Protocol (HTTP) to allow users to collaboratively edit and manage files on remote Web servers. See RFC 2518 for more information.

Summary
FTP was designed in the 1970s. At that time, the Internet was a closed network and security was not a big concern. When FTP is used in modern environments with NAT gateways, firewalls, and Cisco access lists, problems can arise, whether you use active or passive mode. FTP is often used for mission-critical applications on the public Internet, which is probably a mistake. There have been many attempts to make FTP more secure. These attempts cause troubleshooting issues and also fail to fix the most glaring security problem with FTP, which is that it sends the username and password as clear text. Many alternatives to FTP are available, including SCP, SFTP, and WebDAV.

Other FTP Resources


Bernstein, D. J., FTP: File Transfer Protocol.
CERT Coordination Center, Problems with the FTP PORT Command or Why You Don't Want Just any PORT in a Storm.
Postel, J. and J. Reynolds, RFC 959 (the official specification for FTP).

What is FTP?
FTP is an acronym for File Transfer Protocol. It is a widely-used method for transferring files between two computers. One of the computers must be set up as a host or server. The other, known as the client, is the computer that you are using. There are two basic types of operations: When you retrieve a file from the server, you are performing a "get" operation. This is also known as "downloading." When you post a file to the server, you are performing a "put" operation. This is also known as "uploading."

When you connect to an FTP server, you must send a username and password. In many cases the username "anonymous" may be used to mean "I am a visitor." Even if you are visiting an FTP site and are not prompted for a username, the FTP server will assign the anonymous username to you. Most FTP servers ask which username you would like to use.

Types of FTP "Programs"


There are a number of programs to help transfer files using FTP. Technical people tend to use a command line program built into most operating systems, and it has a simple name: 'ftp'. But in order to keep network sniffers from reading your password when you connect to our ftp server, we use a secure ftp server. This means that you must use a secure ftp client that uses "SFTP", or secure file transfer protocol. There are a number of popular freeware FTP programs such as WinSCP, CuteFTP, FileZilla, Fugu, and SmartFTP. In addition there are commercially available programs for those with too much money. For class purposes, FileZilla is a great program to use because it is free to use and runs on Windows, Macs, and Linux environments. Side note: another way to access files is to use your web browser. However, there are lots of traces of your username and password that will be left behind, so refrain from using this method.

I want to watch the ftp traffic and find which ftp urls are being accessed with tshark. Wireshark's display filters include the fields http.request.uri and http.host, but these options are not available for ftp traffic.

The problem is that FTP is not a stateless transaction protocol like HTTP - with HTTP the client does a single request which details all the parameters required to deliver the file, and the server responds with a single message that contains all the metadata and the file contents. By comparison, FTP is a chat-style protocol: to get something done you open a connection to the server and start chatting with it - login, change to some directory, list files, get me this file, etc. You can listen in on this conversation using wireshark like this:

tshark -i lo -f 'port 21' -l -t ad -n -R ftp.request.command -T fields -e ftp.request.command -e ftp.request.arg

The output received when a user tries to retrieve a file from the FTP server (in this example using the client software curl) might look like this:

USER    username
PASS    password
PWD
CWD     Documents
EPSV
TYPE    I
SIZE    somefile.ext
RETR    somefile.ext
QUIT

A bit of processing over that might give you a URL-like log of file retrievals. For example, I came up with this thing using perl:

tshark -i lo -f 'port 21' -l -t ad -n -R ftp.request.command \
    -T fields -e ftp.request.command -e ftp.request.arg | \
perl -nle '
    m|CWD\s*(\S+)| and do {
        $dir=$1;
        if ($dir =~ m,^/,) { $cwd=$dir } else { $cwd .= "/$dir"; }
    };
    m|RETR\s*(\S+)| and print "$cwd/$1";'

For the same FTP session above, this script will yield a single line of output:

/Documents/somefile.ext
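For readers who would rather not parse the perl, the same bookkeeping can be sketched in Python. This is an illustrative equivalent, not the original poster's code. The function name is hypothetical, and it assumes each input line is a command followed by whitespace and an optional argument, as in the output above.

```python
from typing import Iterable, List


def retrieved_paths(lines: Iterable[str]) -> List[str]:
    """Track CWD commands and return the full path of every RETR."""
    cwd = ""
    paths = []
    for line in lines:
        parts = line.split(None, 1)  # command, then the rest as the argument
        if len(parts) < 2:
            continue                 # commands without an argument (PWD, QUIT)
        command, arg = parts[0].upper(), parts[1].strip()
        if command == "CWD":
            # An absolute path replaces the directory, a relative one extends it.
            cwd = arg if arg.startswith("/") else cwd + "/" + arg
        elif command == "RETR":
            paths.append(cwd + "/" + arg)
    return paths


if __name__ == "__main__":
    session = [
        "USER\tusername", "PASS\tpassword", "PWD", "CWD\tDocuments", "EPSV",
        "TYPE\tI", "SIZE\tsomefile.ext", "RETR\tsomefile.ext", "QUIT",
    ]
    for path in retrieved_paths(session):
        print(path)
```

One difference from the perl version: the argument here is everything after the command, so file names containing spaces are kept whole. Like the perl, it does not resolve ".." and it treats all lines as one session, so captures with several interleaved clients would need to be split per connection first.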

A classic example that shows people why using the FTP protocol is a bad idea:

You can begin to capture traffic by going to the left panel of the Wireshark window and clicking on Capture/Interfaces. Choose the interface that goes "out" to the network (for example eth1) and click on Start, and Wireshark will start examining all the packets in transit on the network.

Now open a second terminal window and invoke a normal FTP session. Enter the login name and password, run some FTP commands, then close the session. Return to the main Wireshark window, and you should see that many packets have passed over the network since the moment you started capturing. Click Stop Capture (or press Ctrl+E); then you can examine the traffic you've got.

Figuring out something from what is probably a large amount of traffic is not simple, until you use a display filter. You want one that shows only packets that are part of an FTP connection, so in the Filter field type ftp. Immediately only the traffic of your session should be shown, and in a stunning display of poor security, you'll see clearly your username and password. It will look something like this:

No.   Time        Source       Destination  Protocol  Length  Info
356   101.676753  10.100.1.1   192.168.0.4  FTP       86      Response: 220 (vsFTPd 1.1.3)
360   104.546659  192.168.0.4  10.100.1.1   FTP       77      Request: USER wazi
362   104.594520  10.100.1.1   192.168.0.4  FTP       100     Response: 331 Please specify the password.
366   106.530150  192.168.0.4  10.100.1.1   FTP       77      Request: PASS mytest
371   108.922240  10.100.1.1   192.168.0.4  FTP       88      Response: 530 Login incorrect.

If this result doesn't convince your colleagues to quit using FTP and turn to OpenSSH, there's no hope for them.

The Most Common Filters for Wireshark


On top of all of the filters we've looked at so far, here are a few more useful ones.

ip.addr==192.168.0.1 --- Show all traffic from and to 192.168.0.1
tcp.port==80 --- Show all the traffic with 80 as a source or destination port
ip.src==192.168.0.1 and ip.dst==10.100.1.1 --- Show all the traffic that starts from 192.168.0.1 and has as target 10.100.1.1
ftp --- Show only the traffic for the ftp protocol
http --- Show only the traffic for the http protocol
dns --- Show only the traffic for the dns protocol
http.request.uri contains string --- Show all http traffic where the url contains the word "string."

As I said, FTP and HTTP are two of the application protocols that pass the username and password across the network in clear text. So here, we will sniff the FTP login details of a victim. Install and run Wireshark.

Go to Capture and select Interfaces from the drop-down.

I select the VMWare network adapter since I am using VMWare for demonstrating this. Go to Options.

Check that the Promiscuous mode checkbox is selected and I also opted to use the Capture filter to narrow the sniffed packets. That is, it targets the packet activity of the specific IP used as the Capture filter. Done with the setting, press the Start button. Now, wait for the victim to login to the FTP server. I am using Ubuntu OS at the victim end on the same network.

I have set up the FTP server at IP 192.168.48.128. As soon as the victim logs in, the login details get sniffed and are listed as follows:

Check the rows highlighted in blue, in the above snapshot. One shows the user-name and the other shows the password of the victim.
The best alternative is FTPS (FTP and SSL) - from the wikipedia page: FTPS (also known as FTP Secure and FTP-SSL) is an extension to the commonly used File Transfer Protocol (FTP) that adds support for the Transport Layer Security (TLS) and the Secure Sockets Layer (SSL) cryptographic protocols. FTPS should not be confused with the SSH File Transfer Protocol (SFTP), an incompatible secure file transfer subsystem for the Secure Shell (SSH) protocol. It is also different from Secure FTP, the practice of tunneling FTP through an SSH connection.

For FTP, there are always two sessions between client and server. The control session is opened by the client on TCP port 21. Then for every subsequent data transfer another data session is required. The control session is always long-lived.

There is support for creating your own authentication and authorization providers. You can also create providers for custom FTP logging and for determining the home directory information for your FTP users.

The FTP Binding Component is bound to either a service consumer or a service provider; the interfaces it exposes are defined by a WSDL. The FTP Binding Component implements a set of specific extensibility elements so that a service can be defined and bound to the FTP protocol.

The FTP Binding Component supports the following extensibility elements:

1. Address: Specifies the FTP connectivity information, such as the FTP URL (host, port, login, password), the directory listing style, and user-defined heuristics for directory listing parsing.
2. Binding: FTP binding element, a marker element indicating an FTP binding. This element does not have attributes.
3. Operation: FTP operation element, a marker element indicating an FTP operation. This element does not have attributes.
4. Transfer: FTP transfer element, specifies a message transfer from a sender and receiver perspective. For example, to specify a message transfer for a service request, there is a sender and a receiver involved, and the WSDL author specifies:
   A. The target the sender sends to, represented by attribute ftp:sendTo, and the target the receiver receives from, represented by attribute ftp:receiveFrom. Also, the additional operations performed before a message is sent (PUT) to the target or after a message is received (GET) from the target, called Pre/Post operations.
   B. messageCorrelate: If enabled, a UUID tagging based message correlation scheme will be used to correlate the request-response of a synchronous service invocation.
5. Message: FTP message element, specifies a message transfer from a service consumer and/or service provider perspective. The WSDL author can specify:
   A. The message repository, represented as attribute ftp:messageRepository. This is a base directory where all the working directories for a message transfer are created, such as:
      inbox: Is used for posting request (by consumer) and polling request (by provider)
      instage: Is used for staging request
      outbox: Is used for posting response (by provider) and polling response (by consumer)
      outstage: Is used for staging response
   B. messageCorrelate: If enabled, a UUID tagging based message correlation scheme will be used to correlate the request-response of a synchronous service invocation.
Beginning with FTP 7.5, the FTP service supports extensibility for custom authentication and roles, logging, and home directory lookups. When you create an extensibility provider, you must first register it in the .NET Global Assembly Cache (GAC) for managed-code providers, or in your system's registry for native-code (COM) providers.

The following table lists the managed-code interfaces that are exposed by the FTP service.
Interface                     Description
IFtpAuthenticationProvider    Provides an interface for authenticating an FTP user.
IFtpHomeDirectoryProvider     Provides an interface for retrieving the path to a user's FTP home directory.
IFtpLogProvider               Provides an interface for logging FTP activity.
IFtpRoleProvider              Provides an interface for authorization and role checks.

An Overview of the File Transfer Protocol


The File Transfer Protocol (FTP) was one of the first efforts to create a standard means of exchanging files over a TCP/IP network, so the FTP has been around since the 1970s. The FTP was designed with as much flexibility as possible, so it could be used over networks other than TCP/IP, as well as being engineered to have the capability of exchanging files with a broad variety of machines. The base specification is RFC 959 and is dated October 1985. There are some additional RFCs relating to FTP, but it should be noted that even as of this writing (December 2001) most of the new additions are not in widespread use. The purpose of this document is to provide general information about how the protocol works without getting into too many technical details. RFC 959 should be consulted for details on the protocol.

Control Connection -- the conversation channel

The protocol can be thought of as interactive, because clients and servers actually have a conversation where they authenticate themselves and negotiate file transfers. In addition, the protocol specifies that the client and server do not exchange data on the conversation channel. Instead, clients and servers negotiate how to send data files on separate connections, with one connection for each data transfer. Note that a directory listing is considered a file transfer.

To illustrate, we'll present an (admittedly contrived) example of how the FTP would work between human beings rather than computer systems. For our example, we'll assume we have a client, Carl Clinton, who wishes to transfer files from Acme Mail Service, which manages his post office box. Below is a transcript of a phone call between Carl Clinton and Acme Mail Service.
Clinton: (Dials the phone number for the mail service)
Service: "Hello, this is the Acme Mail Service. How may I help you today?"
Clinton: "Hello, this is Carl Clinton. I would like to access mailbox number MB1234."
Service: "OK, Mr. Clinton, I need to verify that you may access mailbox MB1234. What is your password?"
Clinton: "My password is QXJ4Z2AF."
Service: "Thank you Mr. Clinton, you may proceed."
Clinton: "For now, I'm only interested in looking at the bills and invoices, so look at the folder marked "bills" in my mailbox."
Service: "OK."
Clinton: "Please prepare to have your assistant call my secretary at +1 402 555 1234."
Service: "OK."
Clinton: "Now call my secretary and tell him the names of all the items in the bills folder of my mailbox. Tell me when you have finished."
Server: "My assistant is calling your secretary now."
Server: "My assistant has sent the names of the items."
Clinton: (Receives the list from his secretary and notices a bill from Yoyodyne Systems.) "Please prepare to have your assistant send to my fax machine +1 402 555 7777."
Service: "OK."
Clinton: "Now fax a copy of the bill from Yoyodyne Systems."
Server: "My assistant is calling your fax machine now."
Server: "My assistant has finished faxing the item."
Clinton: "Thank you, that is all. Good bye."
Server: "Goodbye."

Now let's look at how this same conversation would appear between computer systems communicating with the FTP protocol over a TCP/IP connection.
Client: (Connects to the FTP service at port 21 on the IP address 172.16.62.36.)
Server: 220 Hello, this is the Acme Mail Service.
Client: USER MB1234
Server: 331 Password required to access user account MB1234.
Client: PASS QXJ4Z2AF
    (Note that this password is not encrypted. The FTP is susceptible to eavesdropping!)
Server: 230 Logged in.
Client: CWD Bills
    (Change directory to "Bills.")
Server: 250 "/home/MB1234/Bills" is new working directory.
Client: PORT 192,168,1,2,7,138
    (The client wants the server to send to port number 1930 on IP address 192.168.1.2. In this case, 192.168.1.2 is the IP address of the client machine.)
Server: 200 PORT command successful.
Client: LIST
    (Send the list of files in "Bills.")
Server: 150 Opening ASCII mode data connection for /bin/ls.
    (The server now connects out from its port 20 on 172.16.62.36 to port 1930 on 192.168.1.2.)
Server: 226 Listing completed.
    (That succeeded, so the data is now sent over the established data connection.)
Client: PORT 192,168,1,2,7,139
    (The client wants the server to send to port number 1931 on the client machine.)
Server: 200 PORT command successful.
Client: RETR Yoyodyne.TXT
    (Download "Yoyodyne.TXT.")
Server: 150 Opening ASCII mode data connection for Yoyodyne.TXT.
    (The server now connects out from its port 20 on 172.16.62.36 to port 1931 on 192.168.1.2.)
Server: 226 Transfer completed.
    (That succeeded, so the data is now sent over the established data connection.)
Client: QUIT
Server: 221 Goodbye.
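Every server line in the exchange above begins with a three-digit code, and a client program acts on that code rather than on the wording. The following is a minimal sketch of how a client might split a reply line. The function names are hypothetical. The meaning of the first digit and the use of a hyphen after the code to mark a multi-line reply are taken from RFC 959.

```python
from typing import Tuple

# First-digit meanings as defined in RFC 959.
_CATEGORIES = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative completion",
    5: "permanent negative completion",
}


def parse_reply_line(line: str) -> Tuple[int, str, bool]:
    """Split a reply line into (code, text, is_final_line)."""
    code, separator, text = line[:3], line[3:4], line[4:]
    if len(code) != 3 or not code.isdigit() or separator not in ("", " ", "-"):
        raise ValueError("not an FTP reply line: %r" % line)
    # "230-..." announces more lines to come, "230 ..." ends the reply.
    return int(code), text, separator != "-"


def reply_category(code: int) -> str:
    """Describe a reply code by its first digit."""
    return _CATEGORIES[code // 100]


if __name__ == "__main__":
    code, text, final = parse_reply_line(
        "331 Password required to access user account MB1234.")
    print(code, reply_category(code), final)
```

For the 331 reply, the category is "positive intermediate", which tells the client that the server is waiting for a follow-up command (here, PASS) before the login is complete.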

When using FTP, users frequently employ FTP client programs rather than directly communicating with the FTP server. Here's our same example using the stock "ftp" program which is usually installed as /usr/bin/ftp on UNIX systems (and FTP.EXE on Windows). The items the user types are those that follow the ftp>, Name, and Password prompts.
ksh$ /usr/bin/ftp
ftp> open ftp.acmemail.example.com
Connected to ftp.acmemail.example.com (172.16.62.36).
220 Hello, this is the Acme Mail Service.
Name (ftp.acmemail.example.com:root): MB1234
331 Password required to access user account MB1234.
Password: QXJ4Z2AF
230 Logged in.
ftp> cd Bills
250 "/home/MB1234/Bills" is new working directory.
ftp> ls
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
-rw-r--r--   1 ftpuser  ftpusers     14886 Dec  3 15:22 Acmemail.TXT
-rw-r--r--   1 ftpuser  ftpusers    317000 Dec  4 17:40 Yoyodyne.TXT
226 Listing completed.
ftp> get Yoyodyne.TXT
local: Yoyodyne.TXT remote: Yoyodyne.TXT
200 PORT command successful.
150 Opening ASCII mode data connection for Yoyodyne.TXT.
226 Transfer completed.
317000 bytes received in 0.0262 secs (1.2e+04 Kbytes/sec)
ftp> quit
221 Goodbye.
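The same session can also be driven from a program instead of an interactive client. The sketch below uses Python's standard ftplib module. The function name list_and_fetch and its parameters are hypothetical. It differs from the session above in one respect: retrbinary transfers the file in binary (TYPE I) rather than ASCII mode.

```python
from ftplib import FTP
from typing import List, Optional


def list_and_fetch(host: str, user: str, password: str,
                   directory: str, filename: str,
                   local_path: Optional[str] = None) -> List[str]:
    """Log in, list a directory, and download one file.

    Returns the lines of the directory listing.
    """
    listing: List[str] = []
    with FTP(host) as ftp:          # opens the control connection (port 21)
        ftp.login(user, password)   # sends USER and PASS in clear text
        ftp.set_pasv(False)         # use PORT (active mode), as in the session above
        ftp.cwd(directory)          # sends CWD
        ftp.retrlines("LIST", listing.append)
        with open(local_path or filename, "wb") as out:
            ftp.retrbinary("RETR " + filename, out.write)
    return listing
```

Against the server in the example, a call might look like list_and_fetch("ftp.acmemail.example.com", "MB1234", "QXJ4Z2AF", "Bills", "Yoyodyne.TXT"). Leaving out the set_pasv(False) line lets ftplib use its default, which is passive mode.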

As you can see, FTP is designed to allow users to browse the file system much like you would with a regular UNIX login shell or MS-DOS command prompt. This differs from other protocols that are transactional (e.g., HTTP), where a connection is established, clients issue a single message to a server that replies with a single reply, and the connection is closed. On the other hand, client programs can be constructed to simulate a transactional environment if they know in advance what they need to do. In effect, FTP is a stateful sequence of one or more transactions.

Command primitives, result codes and textual responses

The client is always responsible for initiating requests. These requests are issued with FTP command primitives, which are typically 3 or 4 characters each. For example, the command primitive to change the working directory is CWD. The server replies are specially formatted to contain a 3-digit result code first, followed by a space character, followed by descriptive text (there is also a format for multi-line responses). The protocol specifies that clients must rely only upon the numeric result code, since the descriptive text is allowed to vary (with a few exceptions). In practice, the result text is often helpful for debugging, but is generally no longer useful for end users.

Authentication

Although it is not required by the protocol, in effect clients must always log in to the FTP server with a username and password before the server will allow the client to access the service. There is also a de facto standard for guest access, where "anonymous" (or "ftp") is used as the username and an e-mail address is customarily used as the password, as a way for a polite netizen to let the server administrator know who is using the guest login. Because users are reluctant to divulge their e-mail addresses, for fear of unsolicited bulk e-mail, this has evolved to the point where the password is just some arbitrary text.

Types of data connections

The protocol has built-in support for different types of data transfers. The two mandated types are ASCII for text (specified by the client sending "TYPE A" to the server), and "image" for binary data (specified by "TYPE I"). ASCII transfers are useful when the server machine and client machine have different standards for text. For example, MS-DOS and Microsoft Windows use a carriage return and linefeed sequence to denote an end-of-line, but UNIX systems use just a linefeed. When an ASCII transfer is specified, the client can always translate the data into its own native text format.
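The end-of-line translation that an ASCII transfer implies can be sketched as follows. This is a minimal illustration, assuming the carriage return plus linefeed convention that RFC 959 specifies for text on the wire. The function names are hypothetical, and files that use a lone carriage return as their line ending are not handled.

```python
def to_wire(data: bytes) -> bytes:
    """Convert text with LF or CRLF line endings to CRLF for sending."""
    # Normalize to LF first so existing CRLF pairs are not doubled.
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def from_wire(data: bytes, native_eol: bytes = b"\n") -> bytes:
    """Convert CRLF text received from the network to the native ending."""
    return data.replace(b"\r\n", native_eol)


print(to_wire(b"line one\nline two\n"))
print(from_wire(b"line one\r\nline two\r\n"))
```

A UNIX client would keep the default `native_eol`, while a Windows client would pass `b"\r\n"` and leave the received bytes unchanged.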
Binary transfers can be used for any type of raw data that requires no translation. Client programs should use binary transfers unless they know that the file in question is text.

The protocol does not have any advanced support for character sets for pathnames or file contents. There is no way to specify Unicode, for example. For ASCII, it is 7-bit ASCII only.

Unfortunately, the burden of deciding what transfer type to use is left to the client, unlike HTTP, which can inform the client what type of data is being sent. Clients often simply choose to transfer everything in binary, and perform any necessary translation after the file is downloaded. Additionally, binary transfers are inherently more efficient to send over the network, since the client and server do not need to perform on-the-fly translation of the data.

It should be noted that ASCII transfers are mandated by the protocol as the default transfer type unless the client requests otherwise!

The PORT and PASV conundrum -- Active and Passive data connections

Although it was purposely designed into the protocol as a feature, FTP's use of separate data connections causes numerous problems for firewalls, routers, and proxies that want to restrict or delegate TCP connections, as well as for IP stacks that want to do dynamic stateful inspection of TCP connections.

The protocol does not mandate a particular port number or a direction that a data connection uses. For example, the easy way out would have been for the protocol's designers to mandate that all data connections must originate from the client machine and terminate at port 20 on the server machine.

Instead, for maximum flexibility, the protocol allows the client to choose one of two methods. The first method, which we'll call "Active", is where the client requests that the server originate a data connection and terminate it at an IP address and port number of the client's choosing. The important thing to note here is that the server connects out to the client.
Client: "Please connect to me at port 1931 on IP address 192.168.1.2, then send the data."
Server: "OK"

Or, the client can request that the server assign an IP address and port number on the server side, and then the client originates a connection to that server address. We call this method "Passive" and note that the client connects out to the server.
Client: "Please tell me where I can get the data."
Server: "Connect to me at port 4023 on 172.16.62.36."

The active method uses the FTP command primitive PORT, so the first example using the actual FTP protocol would resemble this:
Client: PORT 192,168,1,2,7,139
Server: 200 PORT command successful.

The passive method uses the FTP command primitive PASV, so the second example using the actual FTP protocol would resemble this:
Client: PASV
Server: 227 Entering Passive Mode (172,16,62,36,133,111)
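The six comma-separated numbers in the PORT argument and the PASV reply encode an IPv4 address (the first four numbers) and a 16-bit port number (the last two, high byte first). A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
def decode_host_port(argument):
    """Turn 'h1,h2,h3,h4,p1,p2' into an (ip_address, port) pair."""
    fields = [int(part) for part in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError("expected six numbers in the range 0-255")
    host = ".".join(str(field) for field in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte first
    return host, port


def encode_host_port(host, port):
    """Turn an IPv4 address and port into the 'h1,h2,h3,h4,p1,p2' form."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    return ",".join(host.split(".") + [str(port // 256), str(port % 256)])


print(decode_host_port("192,168,1,2,7,139"))  # ('192.168.1.2', 1931)
print(encode_host_port("192.168.1.2", 1931))  # 192,168,1,2,7,139
```

The first call shows why the PORT command above corresponds to port 1931 on 192.168.1.2: 7 * 256 + 139 = 1931.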

It should be noted that FTP servers are required to implement PORT, but are not required to implement PASV. The default has traditionally been PORT for this reason, but in practice it is now preferred to use PASV whenever possible, because firewalls on the client side often block the inbound connections that PORT requires.

Partial data connections -- resuming downloads

The protocol provides a means to transfer only a portion of a file, by having a client specify a starting offset into the file (using the REST primitive, i.e., "restart point"). If an FTP session fails while a data transfer is in progress and has to be reestablished, a client can request that the server restart the transfer at the offset the client specifies. Note that not all FTP servers support this feature.
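As a rough sketch of how a client might resume an interrupted download, the function below builds the control-connection commands from the number of bytes already saved locally. The function name is hypothetical. Setting up the data connection (PORT or PASV) is omitted, and the sketch assumes a binary transfer so that the local byte count matches the server's offset.

```python
def resume_commands(bytes_already_received, remote_name):
    """Return the commands a client would send to resume a binary download.

    REST is sent only when part of the file is already present, and it
    comes directly before the RETR it applies to.
    """
    if bytes_already_received < 0:
        raise ValueError("offset cannot be negative")
    commands = ["TYPE I"]
    if bytes_already_received > 0:
        commands.append("REST %d" % bytes_already_received)
    commands.append("RETR %s" % remote_name)
    return commands


print(resume_commands(120000, "Yoyodyne.TXT"))
```

After the server accepts the REST, the client would append the incoming data to its partial local file instead of overwriting it. A server that does not support the feature would reject the REST command, and the client would have to start again from offset zero.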

Directory listings

The base standard of the FTP protocol provides two types of listings, a simple name list (NLST) and a human-readable extended listing (LIST). The name list consists of lines of text, where each line contains exactly one file name and nothing else. The extended listing is not intended to be machine-readable and the protocol does not mandate any particular format. The de facto standard is UNIX "/bin/ls -l" format, and most servers try to emulate that format even on non-UNIX systems, but it is still common for servers to provide their own proprietary format. The important thing to note here is that this listing can contain any type of data and cannot be relied upon. Additionally, even listings that appear to be in "/bin/ls -l" format cannot be relied upon for the validity of the fields. For example, the date and time could be in local time or GMT.

Newer FTP server implementations support a machine-readable listing primitive (MLSD) which is suitable for client programs to get reliable metadata about files, but this feature is still relatively rare. That leaves the simple name list as the only reliable way to get filenames, but it doesn't tell a client program anything else (such as whether the item is a file or a directory!).

Functional concerns

Despite a rich feature set, there are some glaring omissions. For example, the base specification doesn't even provide for clients to query a file's size or modification date. However, most FTP servers in use now support a de facto extension to the specification which provides the SIZE and MDTM primitives, and even newer servers support the extremely useful MLSD and MLST primitives, which can provide a wealth of information in a standardized format.

There is also no 100% accurate way for a client to determine if a particular pathname refers to a file or directory, unless MLSD or MLST is available.
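To show why a machine-readable listing helps, here is a minimal sketch that parses one MLSD-style entry. It assumes the RFC 3659 layout: a run of "name=value;" facts, a single space, then the pathname. The sample line is made up for illustration, and the function names are hypothetical.

```python
def parse_mlsd_line(line):
    """Split one listing entry into (pathname, dictionary of facts)."""
    # Facts contain no spaces, so the first space separates them from the name.
    facts_part, _, name = line.rstrip("\r\n").partition(" ")
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue  # skip the empty piece after the final ';'
        key, _, value = fact.partition("=")
        facts[key.lower()] = value  # fact names are case-insensitive
    return name, facts


def is_directory(facts):
    """True when the 'type' fact marks a directory entry."""
    return facts.get("type", "").lower() in ("dir", "cdir", "pdir")


name, facts = parse_mlsd_line("type=file;size=317000; Yoyodyne.TXT")
print(name, facts, is_directory(facts))
```

With the "type" fact available, a client no longer has to guess whether a name refers to a file or a directory.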
Since the protocol also does not provide a way to transfer an entire directory of items at once, the consequence is that there is no 100% accurate way to download an entire directory tree. The end result is that FTP is not particularly suited to "mirroring" files and directories, although FTP client programs use heuristics to make calculated guesses when possible.

Despite the guesswork that clients can use for determining metadata for files to download, there's little they can do for files that they upload. There is no standard way to preserve an uploaded file's modification time. FTP is platform agnostic, so there aren't standard ways to preserve platform-specific metadata such as UNIX permissions and user IDs or Mac OS file type and creator codes.

Separate connections for data transfers are also a mixed blessing. For high performance, it would be best to use a single connection and perform multiple data transfers before closing it. Even better would be a method that uses a single connection for both the control conversation and the data transfers. Since each data connection uses an ephemeral (random) port number, it is possible to "run out" of connections. For details on this phenomenon, a separate article is available.

Security concerns

It is important to note that the base specification, as implemented by the vast majority of the world's FTP servers, does not have any special handling for encrypted communication of any kind. When clients log in to FTP servers, they are sending clear text usernames and passwords! This means that anyone with a packet sniffer between the client and server could surreptitiously steal passwords.

Besides passwords, potential attackers could not only monitor the entire conversation on the FTP control connection, they could also monitor the contents of the data transfers themselves.

There have been proposals to make the FTP protocol more secure, but these proposals have not seen widespread adoption. Therefore, unless the IP protocol layer itself is secure (for example, encrypted using IPsec), FTP should not be used if sensitive login information is to be exchanged over an insecure network, or if files containing sensitive material are being transferred over an insecure network.

A TCP/IPv4 connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client user connects to a server computer, an established connection can be thought of as the 4-tuple of (server IP, server port, client IP, client port). Usually three of the four are readily known -- the client machine uses its own IP address, and when connecting to a remote service, the server machine's IP address and service port number are required.

What is not immediately evident is that, when a connection is established, the client side of the connection also uses a port number. Unless a client program explicitly requests a specific port number, the port number used is an ephemeral port number. Ephemeral ports are temporary ports assigned by a machine's IP stack, and are assigned from a range of ports designated for this purpose.
When the connection terminates, the ephemeral port is available for reuse, although most IP stacks won't reuse that port number until the entire pool of ephemeral ports has been used. So, if the client program reconnects, it will be assigned a different ephemeral port number for its side of the new connection. Similarly, for UDP/IP, when a datagram is sent by a client from an unbound port number, an ephemeral port number is assigned automatically so the receiving end can reply to the sender.
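A quick way to see an ephemeral port being assigned is to bind a socket to port 0, which asks the IP stack to choose an unused port. The sketch below uses Python's standard socket module. The function name is hypothetical, and the port printed depends on the operating system and its configured ephemeral range.

```python
import socket


def pick_ephemeral_port(host="127.0.0.1"):
    """Bind a TCP socket to port 0 and return the port the stack assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "any unused port"
        return sock.getsockname()[1]


print(pick_ephemeral_port())
```

An FTP client in active mode does something similar before sending PORT: it obtains an unused local port, listens on it, and tells the server which port it was given.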

Limits Implied by the Ephemeral Port Range


Another important ramification of the ephemeral port range is that it limits the maximum number of connections from one machine to a specific service on a remote machine! The TCP/IP protocol uses the connection's 4-tuple to distinguish between connections, so if the ephemeral port range is only 4000 ports wide, there can be only 4000 unique connections from a client machine to a remote service at one time.

A port range of 4000 may seem large, but it is actually small for 21st century computing demands when you consider that a TCP connection must pass through the TIME_WAIT state before it is really completed. For example, even if both sides of a connection properly close their ends of the connection, TCP's error control requires each side to wait until the TIME_WAIT state has expired before the connection's resources can really be disposed of. The TIME_WAIT state lasts twice the MSL (maximum segment lifetime) which, depending on the IP stack, is usually configured to be 240 seconds total. That means that you could have only 4000 connections per 240 second window, and in practice this can be exhausted.
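The limit described above can be put into numbers. The sketch below uses the 4000-port range and 240-second TIME_WAIT quoted in the text; the rate of 100 new connections per second is a made-up workload for illustration. It makes the simplifying assumptions that every connection goes to the same remote service and that each port stays unavailable for the full TIME_WAIT period.

```python
def max_sustained_rate(ephemeral_ports, time_wait_seconds):
    """Highest steady rate of new connections per second to one service."""
    return ephemeral_ports / time_wait_seconds


def seconds_until_exhausted(ephemeral_ports, connections_per_second):
    """Time until every ephemeral port is in use or waiting in TIME_WAIT."""
    return ephemeral_ports / connections_per_second


print(round(max_sustained_rate(4000, 240), 1))  # 16.7
print(seconds_until_exhausted(4000, 100))       # 40.0
```

In other words, a client opening more than about 17 connections per second to the same service cannot keep that pace indefinitely, and at 100 per second the pool would be exhausted in 40 seconds.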

Problems Exacerbated by the File Transfer Protocol


For example, the File Transfer Protocol (FTP) uses a separate data connection for each data transfer, so this phenomenon can be experienced when FTP'ing hundreds of small files over a fast local network. When all ephemeral ports are in use, a compatible BSD Sockets TCP/IP stack will return "Address already in use" errors.

The second problem with FTP is that, in the default mode of operation, the separate data transfer connections originate from the server machine and connect back to an ephemeral port number on the client machine that the client program requested. Since many firewalls restrict incoming connections, when the FTP data connection tries to connect to an ephemeral port number on the client machine, the connection may be denied by the firewall, since the port number isn't a well-known service port. It may be best to illustrate this by showing a sample FTP session:
Client: USER anonymous
Server: 331 Guest login ok, send your e-mail address as password.
Client: PASS NcFTP@
Server: 230 Logged in anonymously.
Client: PORT 192,168,1,2,7,138
    (The client wants the server to send to port number 1930 on IP address 192.168.1.2.)
Server: 200 PORT command successful.
Client: LIST
Server: 150 Opening ASCII mode data connection for /bin/ls.
    (The server now connects out from port 20 to port 1930 on 192.168.1.2. That succeeded, so the data is now sent over the established data connection.)
Server: 226 Listing completed.
Client: QUIT
Server: 221 Goodbye.

Now, had there been a restrictive firewall on the client machine, the connection from the server to port 1930 on the client would have failed. Note that the client program did not explicitly ask for port 1930 -- it just asked for an unused port number to use for this temporary data connection.

Traditional Configuration of the Ephemeral Port Range


The BSD Sockets TCP/IP stack used ports 1024 through 4999 as ephemeral ports. Additionally, ports 1 through 1023 were intended for system services running as the superuser, so those ports are called reserved ports. As discussed earlier, BSD's choice of the ephemeral port range is unfortunate because of its relatively small size (3976 ports) and its low numbered position. We feel that the default range should be 49152 through 65535, which is both much larger (16384 ports) and at the very top of the full port range.

Firewalling the Ephemeral Port Range


For firewalls, administrators often choose to restrict access to as many port numbers as possible. For cases where inbound connections to the ephemeral ports are required, an entire range of ports must be opened. It is imperative, when opening a range of ports on the firewall, that no system services are listening on ports in the open range.

Administrators will often want to open a specific range on the firewall, and then for each machine on the internal network, make sure that the ephemeral port range on the machine coincides with the open range on the firewall. It must be made clear that the ephemeral port ranges of machines on the internal network often do not coincide with each other, since different operating systems may use different ranges. That is why it can be time consuming to manually configure each machine's ephemeral port range so it coincides with the open range on the firewall. As a result, administrators often end up switching to a policy of allowing all incoming ports and denying access to specific ports when needed.

Luckily, it may not be necessary to open the ephemeral port range. It is usually only necessary when FTP is being served to the outside world (Passive "PASV" data connections use inbound ephemeral ports), or when FTP client access must work in non-passive mode ("PORT" connections from the server are inbound to clients using ephemeral ports).

Changing the Ephemeral Port Range


It is desirable to change which port numbers are used for the ephemeral port range for any of the following reasons:

To use a larger range so that more simultaneous connections are possible.

To shift the range to the higher numbered ports. The higher numbered ports should be used as ephemeral ports because they are less likely to be used as port numbers for system services. Well-known service ports have traditionally been assigned to lower port numbers.

To change the range to coincide with other systems for purposes of firewalls and automatic network address translation.

When changing the range, we suggest you change it to 49152 through 65535. If you need a larger range, continue downward from 49152, but leave 65535 as your upper bound.
