2.1 A Little Background on SMB
Like NetBIOS, the Server Message Block protocol originated a long time ago at IBM. Microsoft embraced it, extended it, and in 1996 gave it a marketing upgrade by renaming it "CIFS". Over the years there have been several attempts to document and
standardize the SMB/CIFS protocol:
Change is the essential process of all existence. -- Spock (Leonard Nimoy) Let That Be Your Last Battlefield, stardate 5730.2
Without a current and authoritative protocol specification, there
is no external reference against which to measure the
"correctness" of an implementation, and no way to hold anyone
accountable. Since Microsoft is the market leader, with a proven
monopoly on the desktop, the behavior of their clients and servers is
the standard against which all other implementations are measured.
You knew the job was dangerous when you took it. -- Super Chicken Jay Ward and Bill Scott, ABC TV, 1967-1968
Jeremy Allison, the Samba Team's First Officer[3], has stated that "The level of detail required to interoperate successfully is simply not documentable". One reason that this is true is that Microsoft can "enhance" SMB behavior at will. Combined with the dearth of authoritative references, this means that the only criterion for a well-behaved SMB implementation is that it works with Microsoft products. As a result, subtle inconsistencies and variations have crept into the protocol. They are discovered in much the same way that a dog-owner discovers poop in the yard in springtime when the snow melts[4].
Many people dread spring chores, but spring also brings the flowers. The children play, the dog chases a butterfly, the birds sing...and it all seems suddenly worthwhile. Likewise with the work we have ahead. Things are not really too bad, once you've gotten started.

2.1.1 Getting Started

This part of the book will cover the basics of SMB, enumerate and describe some of the SMB message types (commands), discuss protocol dialects, give some details on authentication, and provide a few examples. That should be enough to help you develop a working knowledge of the protocol, a working SMB client, and possibly a simple server.

Bear in mind, though, that SMB is more complex and less well defined than NBT. In the NBT section it was possible to describe every message type and provide a comprehensive review of the entire NBT protocol. It is not practical to cover all of SMB in the same way. Instead, the goal here is to explain the basics of SMB, provide details that are missing from other sources, and describe how to go about exploring SMB on your own. In other words, the goal is to develop understanding rather than simply providing knowledge.

The textbook for this class is the latest version of the SNIA CIFS Technical Reference. Additional sources are listed in the References section near the end of this book. The most important tool, however, is probably the protocol analyzer. Warm up your copy of Ethereal or NetMon, and get ready to do some packet shoveling.

2.1.2 NBT or Not NBT

Before we actually start, there is one more thing to mention: The SMB protocol is supposed to be "transport independent". That is, SMB should work over any reliable transport that meets a few basic criteria. NBT is one such transport, but SMB does not really require the NetBIOS API. It can, for instance, be run directly over TCP/IP. Just for fun, we will refer to SMB over TCP/IP without NBT as "naked" or "raw".
When running naked, SMB defaults to using TCP port 445 instead of the NBT Session Service port (TCP/139). Windows2000, WindowsXP, and Samba all support raw transport, but the large number of "legacy" Windows clients still in use suggests that NBT will not go away any time soon.

Other than the new port number, there are only two notable changes between NBT and naked transport. The first is that naked transport does not make use of the NBT SESSION REQUEST and POSITIVE SESSION RESPONSE messages. The second is that the two transports interpret the SESSION MESSAGE header a bit differently. Recall (from section 1.6) that the NBT Session Service prepends a four-byte header to each SESSION MESSAGE, like so:
The LENGTH field, as shown, is 17 bits wide[5]. Raw TCP transport also prepends a four-byte header, but the full 24 bits are available for the LENGTH:
Your mileage may vary.
Appendix B of the SNIA CIFS Technical Reference is the only source we found that clearly shows the naked transport LENGTH field as being 24 bits wide. 24 bits translates to 16 megabytes, though, and that's a bigbunch--more than is typically practical. Fortunately, the actual maximum message size is something that is negotiated when the client and server establish the session.

When we discuss the SMB messages themselves we will ignore the SESSION MESSAGE headers, since they are part of the transport, not the SMB protocol.
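The difference between the two four-byte headers is easier to see in code. What follows is a minimal sketch, not a reference implementation. It assumes the NBT layout from RFC 1002 (a type byte, a flags byte whose low-order bit extends the length, and a 16-bit length) and assumes that the naked header is a zero byte followed by a 24-bit length. The function names are ours, invented for illustration.

```python
import struct

NBT_SESSION_MESSAGE = 0x00  # NBT Session Service type code for a SESSION MESSAGE


def pack_nbt_header(length, msg_type=NBT_SESSION_MESSAGE):
    """Build a four-byte NBT Session Service header (17-bit LENGTH)."""
    if not 0 <= length < (1 << 17):
        raise ValueError("NBT LENGTH must fit in 17 bits")
    flags = (length >> 16) & 0x01  # the 17th length bit rides in the flags byte
    return struct.pack(">BBH", msg_type, flags, length & 0xFFFF)


def unpack_nbt_header(header):
    """Return (type, length) from a four-byte NBT Session Service header."""
    msg_type, flags, low = struct.unpack(">BBH", header[:4])
    return msg_type, ((flags & 0x01) << 16) | low


def pack_raw_header(length):
    """Build a four-byte naked transport header (24-bit LENGTH)."""
    if not 0 <= length < (1 << 24):
        raise ValueError("raw LENGTH must fit in 24 bits")
    # Network byte order; the first byte is zero because length < 2**24.
    return struct.pack(">I", length)


def unpack_raw_header(header):
    """Return the 24-bit length from a four-byte naked transport header."""
    (word,) = struct.unpack(">I", header[:4])
    return word & 0x00FFFFFF
```

Under those assumptions the two headers are byte-for-byte identical for any message shorter than 64 Kbytes, which is part of why the two transports are so easy to confuse on the wire.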
2.2 An Introductory Tour of SMB

We will start with a quick museum tour of SMB. Our guide will be the venerable Universal Naming Convention (UNC). You may remember UNC from the brief introduction way back in section 1.1. UNC will provide directions and point out highlights along the tour. Please stay together, everyone.

The UNC directions are presented in terms of a path, much like the Uniform Resource Identifier (URI) paths that are used on the World Wide Web. To explain UNC, let us first consider something more modern and familiar:

  http://ubiqx.org/cifs/SMB.html

That string is in URI syntax, as used by web browsers, and it breaks down to provide these landmarks:
The landmarks guide us along a path which eventually leads us to the file we wanted to access. The SMB protocol pre-dates the use of URIs and was originally designed for use on LANs, not internetworks, so it naturally has a different (though surprisingly similar) way of specifying paths. A Universal Naming Convention (UNC) path comparable to the URI path above might look something like this:
  \\ubiqx\cifs\SMB.html

...and would parse out like this:
Very similar indeed.
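To make the comparison concrete, here is a small sketch that pulls the landmarks out of both kinds of string. It is only an illustration: split_unc() and split_uri() are names we made up, and the UNC splitter does nothing more than separate the server field from the path components that follow it.

```python
from urllib.parse import urlsplit


def split_uri(uri):
    """Return (scheme, host, path components) for a URI string."""
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname, [p for p in parts.path.split("/") if p]


def split_unc(unc):
    """Return (server, path components) for a UNC string."""
    if not unc.startswith("\\\\"):
        raise ValueError("a UNC path begins with two backslashes")
    fields = [f for f in unc[2:].split("\\") if f]
    if not fields:
        raise ValueError("the server field is missing")
    # Note that there is no protocol field to extract; UNC does not carry one.
    return fields[0], fields[1:]
```

Calling split_uri("http://ubiqx.org/cifs/SMB.html") yields the scheme, the host, and the two path components, while split_unc(r"\\ubiqx\cifs\SMB.html") yields only the server and the same two path components.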
The devil is in the details.
One obvious difference between the two formats is that UNC doesn't provide a protocol specification. That's not because it always assumes SMB. The UNC format can support all sorts of filesharing protocols, but it is up to the underlying operating system or application to try to figure out which one to use. Protocol and transport discovery are handled by trial-and-error, with each possibility tested until something works. As you might imagine, a system with AppleTalk, NetWare, and SMB all enabled may have a lot of work to do.

The UNC format is handled natively by Microsoft & IBM's extended family of operating systems: DOS, OS/2, and Windows[6]. Samba's smbclient utility can also parse UNC names, but it does so at the application level rather than within the OS and it only ever tries to deal with SMB. Even so, smbclient must handle both NBT and naked transport, which can be tricky.

2.2.1 The Server Identifier

The first stop on our UNC tour of SMB is the server name field, which is really a server identifier field because it will accept addresses in addition to names. This book concerns itself with only two transports--NBT and naked TCP transport--so the only identifiers we care about are:
NetBIOS and DNS names both resolve to IP addresses, so all three are equivalent. Sort of... Recall that the NBT SESSION REQUEST packet requires a CALLED NAME in order to set up an NBT session with the server. Without a correct CALLED NAME, the NBT SESSION REQUEST may be rejected (different implementations behave differently). So...
...then we're in a bit of a pickle. How do we find the correct NetBIOS name to put into the CALLED NAME field?

There really is no "right" way to reverse-map an IP address to a particular NetBIOS service name. The solution to this problem involves some guessing, and it's not pretty. We will go into detail when we discuss the interface between SMB and the transport layer.

Of course, if SMB is running over raw transport then there is no NBT SESSION REQUEST message and, therefore, no CALLED NAME. In that case, the NetBIOS name isn't needed at all, which saves a lot of fuss and bother.

2.2.2 The Directory Path
A path! A path!
The directory path looks just like a directory path, but there is one small thing that makes it different. That thing is called the "share name". Whenever a resource is made available (shared) via SMB it is given a share name. The share name doesn't need to be the same as the actual name of the object being shared as it exists on the server. For example, consider the directory path below:
  /dogs/corgi/stories/jolyon/

Suppose we just want to share the /stories subdirectory. If we simply call it "stories" no one will know what kind of stories it contains, so we should give it a more descriptive name. We might, for example, call it "dogbytes". The share name takes the place of the actual directory name when the share is accessed via SMB. If the server is named "petserver", then the UNC path to the same directory would be:
  \\petserver\dogbytes\jolyon\

As shown in figure 2.1, there can be more than one share name pointing to the same directory and access rules may be applied on a per-share basis. The idea is similar, in some ways, to that of symbolic links (symlinks) in Unix, or shortcuts in Windows. The share is a named pointer--with its own set of attributes--to the object being made available by the server.

2.2.3 The File

This is the last stop on our quick UNC tour of SMB. Files, like directories, should be fairly familiar and fairly straight-forward. As has been continually demonstrated, however, things in the CIFS world are not always as simple as they ought to be. Our point of interest on this part of the tour is the distinction between server filesystem syntax and semantics and client expectations...a very gnarled knot for CIFS implementors.

Consider, for example, a bunch of Windows clients connecting to an SMB server running on Linux. On the Linux system the filenames Corgi, corgi, and CORGI would all be distinct because Linux filesystems are typically case-sensitive. Windows, however, expects filenames to be case-insensitive, so all three names are the same from the Windows point of view. Thus, we have a conflict. How does a Linux server make all three files available to the Windows client?

Other difficult issues include:
These are complex problems, not easily solved. The CIFS protocol suite is not designed to be agnostic with regard to such things. In fact, CIFS goes out of its way at times to support features that are specific to DOS, OS/2, and Windows.

...and that concludes our tour. It's time to visit the gift shoppe.

2.2.4 The SMB URL

The UNC format is specific to one family of operating systems. Earlier on, though, we compared UNC with the more portable and modern URI format. That's called foreshadowing. It's a literary trick used to build suspense and anticipation.

There is, in fact, such a thing as an SMB URL. It fits into the general URI syntax[7] and can be used to specify files, directories, and other SMB-shared stuff. It is intended as a more portable, and more complete way to specify SMB paths at the application level.

As of this writing, the SMB URL is only documented in an IETF Internet Draft, and is not yet any kind of standard. That hasn't stopped folks from implementing it, though. The SMB URL is supported in a wide variety of products including the KDE and GNOME desktop GUI environments, web browsers such as Galeon and Konqueror, and Open Source CIFS projects like jCIFS and libsmbclient (the latter is included with Samba). Thursby Software and Apple Computer also make use of the SMB URL in their commercial CIFS implementations. That's good news for CIFS implementors because it means that there is an accepted, cross-platform way to identify SMB-shared resources, both within LANs and across the Internet.

2.2.5 Was That Trip Really Necessary?

Our quick UNC tour provided an introduction to some of the basic concepts, and annoyances, of SMB. We will expand upon those ideas as we dig more deeply into the protocol.

The UNC format itself is also important for a variety of reasons, both historical and practical. Not least among these is that UNC strings are used within some of the SMB messages that cross the wire. The SMB URL format is equally significant.
It is portable, flexible, and gaining in popularity. It will also form the basis for examples given later in the text. If you are implementing an SMB client, you will most likely want to have some convention for identifying resources. You could invent your own, or use UNC, but the SMB URL is probably your best option.
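Because the SMB URL fits into general URI syntax, a stock URI parser will do most of the work. The sketch below leans on Python's urllib.parse. The function name, the dictionary keys, and the choice to treat the first path segment as the share name are our own assumptions for illustration, and the sketch ignores anything else the Internet Draft may allow in the string.

```python
from urllib.parse import urlsplit, unquote


def parse_smb_url(url):
    """Split an SMB URL into server, optional port, share, and path."""
    parts = urlsplit(url)
    if parts.scheme != "smb":
        raise ValueError("not an SMB URL: %r" % url)
    segments = [unquote(s) for s in parts.path.split("/") if s]
    return {
        "server": parts.hostname,   # urlsplit folds the host to lower case
        "port": parts.port,         # None unless the URL names a port
        "share": segments[0] if segments else None,
        "path": segments[1:],
    }
```

Given smb://petserver/dogbytes/jolyon/ the sketch reports the server as "petserver", the share as "dogbytes", and a one-element path holding "jolyon".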
2.3 First Contact: Reaching the Server
Getting there is half the fun. -- Unknown
We are approaching this thing in layers. A little history, a quick introductory tour...and now this. It may seem like a bit of a diversion, but the goal in this section is to figure out how a client finds the server and initiates a connection. No, we're not dealing with SMB protocol yet, but we can't send SMB messages until we can talk to a server.

Think of a telephone call. If you want to call your cousin in New York the first thing you need to know is the telephone number. You could ask your uncle for the number or look it up in the telephone book, or perhaps you have it written on a scrap of paper somewhere in the kitchen with your favorite tofu recipes. If you dial the wrong number you will annoy some guy in a gas station in Brooklyn. When you dial the correct number, the underlying system will go through a complex process to set up the connection so that you can start talking to your cousin (or, more likely, to the answering machine).

...and if you want to connect with an SMB server you might need to resolve a NetBIOS or DNS name to an IP address. Once you have the address, you can attempt to open a session with the server.

Consider this simple SMB URL:

  smb://server/

From the user's perspective, that should be enough to build an initial connection to an SMB server named "server". From an implementation point of view, the first thing to do with this example is to parse out the "server" substring. In URI parlance, the field we are looking for is called the "host non-terminal"[8], and it contains the name or address of the server to which we are trying to connect. Our term for the parsed-out string is "Server Identifier". Once we have extracted it, the next thing we need to know how to do is interpret it so that we can use the information to create the session.

2.3.1 Interpreting the Server Identifier

The SMB URL format supports the use of three different identifier types in the host field. We went over them briefly before. They are the IP address, DNS name, or NetBIOS name of the destination. Our next task is to figure out which is which.
If you want something done right you have to do it yourself. -- Well-known axiom
Presentation is everything, and it turns out that the code for interpreting the Server Identifier is verbose and tedious. Most of the busywork for handling NetBIOS names was covered in section 1, and there are plenty of tools for dealing with IP addresses and DNS names, so to save time we will describe how to interpret and resolve the address (and let you write the code yourself[9]).
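As a head start on that homework, here is one crude way to make a first guess at the identifier type. It is a rough heuristic of our own and does no name resolution at all. It assumes that no NBT scope ID is in use, so that a name containing a dot cannot be a NetBIOS name, and it relies on the fact that a NetBIOS name has at most 15 usable characters. The function name and the returned labels are invented for this example.

```python
import ipaddress

NETBIOS_NAME_MAX = 15  # 16 bytes on the wire, the last of which is the suffix byte


def classify_server_identifier(ident):
    """Return the plausible identifier types, most likely first."""
    try:
        ipaddress.ip_address(ident)
        return ["ip"]
    except ValueError:
        pass  # not an address literal, so it must be some kind of name
    # Assumption: no scope ID, so dots or excess length rule out NetBIOS.
    if "." in ident or len(ident) > NETBIOS_NAME_MAX:
        return ["dns"]
    # A short, dot-free name is ambiguous; try an NBT name query, then DNS.
    return ["netbios", "dns"]
```

The ordering of the result is only a suggestion for which resolver to try first. A short name such as SMEDLEY comes back as a possible NetBIOS name or DNS name, and only an actual query can settle the matter.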
You are likely to be eaten
That is how to go about determining which kind of Server Identifier you've been given. Isn't overloading fun? Now you see why the code for handling all of this is tedious and verbose. It really is not very difficult, though; it just takes a bit of work to get it all coded up.

2.3.2 The Destination Port

Port 139 is for NBT, and port 445 is for raw TCP--good rules of thumb. Recall, though, that the NBT Session Service provides a mechanism for redirection. In addition, some security protocols use high-numbered ports to tunnel SMB connections through firewalls. That means that the use of non-standard ports should be supported on the client side.

The SMB URL allows the specification of a destination port number, like so:

  smb://server:1928/

Once again, that fits into standard URI syntax. If you spend any time using a web browser, the port field should be familiar.

What this all means, however, is that the port number does not always indicate which transport should be used. Rather the opposite; if the port number is not specified, the default port depends upon the transport. Knowing which transport to choose is, once again, something that requires some figuring out.

2.3.3 Transport Discovery

As has been stated previously, we are only considering the NBT and naked TCP transports. Both of these are IP-based and the behavior of SMB over these two is nearly identical, so it does not seem as though separating them would be very important...but this is CIFS we're talking about.

The crux of the problem is whether or not the NBT SESSION REQUEST message is required. If the server is expecting correct NBT semantics, then we will need to find a valid NetBIOS name to place into the CALLED NAME field. This is a complicated process, involving a lot of trial-and-error. The recipe presented below is only one way to go about it. A good chef knows how to adjust the ingredients and choose seasonings to get the desired result. This is as much an art as it is a science.
2.3.3.1 Run Naked

Running naked is probably the easiest transport test to try first. The procedure is tasteful and dignified: simply assume that the server is expecting raw TCP transport. Open a TCP connection to port 445 on the server, but do not send an NBT SESSION REQUEST--just start sending SMB messages and see if that works.

There are four possible results from this test:
All of the above applies if the user did not specify a non-standard port number. If the input looks more like this:

  smb://server:2891/

...then the option of falling back to NBT on port 139 is excluded. In addition, there is no way to guess which transport type should be used if a port number other than 139 or 445 is specified. (In theory, it is also possible to run NBT transport on port 445 and naked transport on port 139. If you catch anyone doing such a twisted thing you should probably notify the authorities.)

Fortunately, Windows systems (Windows95, '98, and W2K were tested) return an NBT NEGATIVE SESSION RESPONSE if they get naked semantics on an NBT service port. This makes sense, because it lets the client know that NBT semantics are required. Samba's smbd goes one better and simply ignores the lack of a SESSION REQUEST message. Samba's behavior effectively merges the two transport types and makes the distinction between them irrelevant, which simplifies things on the server side and makes life easier for the client.
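The port rules above boil down to an ordered list of things to try. The following sketch captures that ordering without opening any sockets. It is one possible reading of the recipe, not the only one: the function name and the "raw"/"nbt" labels are ours, and trying both sets of semantics on a non-standard port (naked first) is our assumption, since the port number alone gives no hint.

```python
NBT_PORT = 139  # NBT Session Service
RAW_PORT = 445  # naked (raw TCP) transport


def transport_attempts(port=None):
    """Return (transport, port) pairs in the order a client might try them."""
    if port is None:
        # No port given: run naked first, then fall back to NBT.
        return [("raw", RAW_PORT), ("nbt", NBT_PORT)]
    if port == RAW_PORT:
        return [("raw", port)]
    if port == NBT_PORT:
        return [("nbt", port)]
    # Non-standard port: no fallback to port 139, and no way to guess the
    # transport, so try both kinds of semantics on the port we were given.
    return [("raw", port), ("nbt", port)]
```

A client would walk the list, moving on to the next entry when a connection is refused, times out, or draws an NBT NEGATIVE SESSION RESPONSE.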
Real Programmers don't draw flowcharts. -- Unknown
The transport discovery process is illustrated using the anachronistic flowchart presented in figure 2.2.

2.3.3.2 Using the NetBIOS Name

If running naked didn't work, then you will probably need to try NBT transport. Also, back in section 2.3.1 we talked about the different types of Server Identifiers that most implementations support. One of those is the NetBIOS name, and it seems logical to assume that if the Server Identifier is a NetBIOS name then the transport will be NBT. That's two good reasons to give NBT transport a whirl.

As stated earlier, the critical difference between the raw TCP and NBT transports is that NBT requires the SESSION REQUEST/POSITIVE SESSION RESPONSE exchange before the SMB messages can start flowing. The SESSION REQUEST, in turn, must contain a valid CALLED NAME. If the CALLED NAME is not correct, then some server implementations will reject the connection. (Windows seems to be quite picky, but Samba ignores the CALLED NAME field.)

Finding a valid CALLED NAME is easy if the Server Identifier is a NetBIOS name because, well... because there you are. The NetBIOS name is the correct CALLED NAME. Also, since the Server Identifier was resolved via an NBT Name Query, the server's IP address is known. That's everything you need.

There is one small problem with this scenario that could cause a little trouble: some NBNS servers can be configured to pass NetBIOS name queries through to the DNS system, which means that the DNS--not the NBNS--may have resolved the name to an IP address. That would mean that we have a false-positive and the Server Identifier is not, in fact, a NetBIOS name. If that happens, you could wind up trying to make an NBT connection to a system that isn't running NBT services. (The opposite of the "run naked" test described above.)

Detecting an SMB service that wants naked transport is not as clean and easy as detecting one that wants NBT. In testing, a Windows2000 system running naked TCP transport did not respond at all to an NBT SESSION REQUEST, and the client timed out waiting for the reply. This problem is neatly avoided if naked transport is attempted before NBT transport. Since Samba considers the SESSION REQUEST optional, this kind of transport confusion is not an issue when talking to a Samba server.

2.3.3.3 Reverse-Mapping a NetBIOS Name

Reverse-mapping is the last, desperate means for finding a workable NetBIOS CALLED NAME so that a valid SESSION REQUEST can be sent. Reverse-mapping is also quite common. Your code will need to try this technique if naked transport didn't work and the Server Identifier was a DNS name or IP address--a situation which is not unusual.

As stated before, there is no right way to do reverse-mapping. Fortunately, there are a few almost-right ways to go about it. Here they are:
"Guess," said Marvin.
If none of those options worked, then it is finally time to send an error message back to the user explaining that the Server Identifier is no good.
2.3.4 Connecting to the Server

We are still dealing with the transport layer and haven't actually seen any SMBs yet. It is, however, finally time for some code. Listing 2.1 handles the basics of opening the connection with an SMB server. It is example code so, of course, it takes a few shortcuts. For instance, it completely side-steps Server Identifier interpretation and transport discovery (that is, everything we just covered).

The code in listing 2.1 provides an outline for setting up the session via NBT or raw TCP. With that step behind us, we won't have to deal with the details of the transport layer any longer. Let's run through some code highlights quickly and put all that transport stuff behind us.
We leave this as an exercise for the reader. -- Unknown
Use the program above as a starting point for building your own SMB client utility. Add a parser capable of dissecting the UNC or SMB URL format, and then code up Server Identifier resolution and transport discovery, as described above. When you have all of that put together, you will have completed the foundation of your SMB client.
2.4 SMB in its Natural Habitat
Are we there yet? -- Kids in the back seat.
We have spent a lot of time and effort preparing for this expedition, and we are finally ready to venture into SMB territory. It can be a treacherous journey, though, so before we push ahead we should re-check our equipment.
Keep in mind that the goal of our first trip into the wilds of SMB-land is to become familiar with the terrain and to study SMBs in their natural habitat, so we can learn about their anatomy and behavior. We are not ready yet for a detailed study of SMB innards. That will come later.

2.4.1 Our Very First Live SMBs

We need to capture a few SMBs to see what they look like up close. That means it's time to take a look at the wire and see what's there to be seen. Fire up your protocol analyzer, and then your SMB client.

If you can configure your test server to allow anonymous connections (no username, no password) it will simplify things at this stage. If you can't, then things won't run quite as they are shown below. Don't worry, it will be close enough.

For this example, we will use the Exists.java program that comes with jCIFS. It is a very simple utility that does nothing more than verify the existence of the object specified by the given SMB URL string, like so:
The above shows that we were able to access the HOME share on node SMEDLEY. A similar test can be performed using Samba's smbclient, or with the NET USE command under Windows[12]:
Those simple commands will generate the packets we want to capture and study. Stop your sniffer and take a look at the trace. You should see a chain of events similar to the following:
The above is edited output from an Ethereal capture[13]. The packets were generated using the jCIFS Exists utility, as described above. In this case jCIFS was talking to an old Windows95 system, but any SMB server should produce the same or similar results.

The trace is reasonably simple. The first thing that node MARIKA does is send a broadcast NBT Name query to find node SMEDLEY, and SMEDLEY responds. Packets 3, 4, & 5 show the TCP session being created. (Note that netbios-ssn is the descriptive name given to port 139.) Packets 6 and 7 are the NBT SESSION REQUEST/SESSION RESPONSE exchange, and packet #8 is an ACK message, which is just TCP taking care of its business.

Packets 9 and 10 are what we want. These are our first SMBs.

2.4.2 SMB Message Structure
I never metaphor I couldn't mix. -- Me
Figure 2.3 provides an overview of SMB gross anatomy. It shows that SMBs are composed of three basic parts:
Either or both of the latter two segments may be vestigial (size == 0) in some specimens.

2.4.2.1 SMB Message Header

Starting at the top, the SMB header is arranged like so:
We can also dissect the header using the simple syntax presented previously:

  SMB_HEADER
    {
    PROTOCOL = "\xffSMB"
    COMMAND  = <SMB Command code (one byte)>
    STATUS   = <Status code>
    FLAGS    = <Old flags>
    FLAGS2   = <New flags>
    EXTRA    = <Sometimes used for additional data>
    TID      = <Tree ID>
    PID      = <Process ID>
    UID      = <User ID>
    MID      = <Multiplex ID>
    }

We now have a pair of perspectives on the header structure. Time for some good, old-fashioned descriptive text.
Be afraid. Be very afraid. -- Veronica Quaife (Geena Davis) The Fly (1986)
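For those who would rather read code than prose, here is a minimal sketch that unpacks the header fields named above. The field widths are assumptions on our part, based on the usual 32-byte SMB header: four bytes of PROTOCOL, one of COMMAND, four of STATUS, one of FLAGS, two of FLAGS2, twelve of EXTRA, and two bytes each for TID, PID, UID, and MID, all in little-endian order. STATUS and EXTRA are left as raw bytes because their interpretation varies. The function name is hypothetical.

```python
import struct

# "<" selects little-endian with no padding, so the sizes add up to 32 bytes.
SMB_HEADER = struct.Struct("<4sB4sBH12sHHHH")


def parse_smb_header(msg):
    """Unpack the fixed-size SMB header from the start of a message."""
    if len(msg) < SMB_HEADER.size:
        raise ValueError("too short to hold an SMB header")
    (protocol, command, status, flags, flags2,
     extra, tid, pid, uid, mid) = SMB_HEADER.unpack_from(msg)
    if protocol != b"\xffSMB":
        raise ValueError("not an SMB message")
    return {
        "command": command, "status": status,   # status kept as 4 raw bytes
        "flags": flags, "flags2": flags2,
        "extra": extra,                          # extra kept as 12 raw bytes
        "tid": tid, "pid": pid, "uid": uid, "mid": mid,
    }
```

Remember to strip the four-byte transport header before handing a captured message to a routine like this one.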
2.4.2.2 SMB Message Parameters

In the middle of the SMB message are two fields labeled WordCount and Words[]. For our purposes, we will identify these two fields as being the SMB_PARAMETERS block, which looks like this:
  SMB_PARAMETERS
    {
    WordCount        = <Number of words in the Words array>
    Words[WordCount] = <SMB parameters; varies with SMB command>
    }

The Words field is simply a block of data that is 2 × WordCount bytes in length. Perhaps at one time the intention was that it would contain only two-byte values (a quick look at COREP.TXT suggests that this is the case). In practice, all sorts of stuff is thrown in there.

Each SMB message type (species?) has a different record structure that is carried in the Words block. Think of that structure as representing the parameters passed to a function (the function identified by the SMB command code listed in the header).

2.4.2.3 SMB Message Data

Following the SMB_PARAMETERS is another block of data, the content of which also varies in structure on a per-SMB basis:
    SMB_DATA {
      ByteCount        = <Number of bytes in the Bytes field>
      Bytes[ByteCount] = <Contents vary with SMB command>
    }

The Bytes field holds the data to be manipulated. For example, it may contain the data retrieved in response to a READ operation, or the data to be written by a WRITE operation. In many cases, though, the SMB_DATA block is just another record structure with several subfields. Through time, SMB has evolved lazily and any functional distinction that may have separated the Parameter and Data blocks has been blurred.

Note that the SMB_DATA.ByteCount field is an unsigned short, while the SMB_PARAMETERS.WordCount field is an unsigned byte. That means that the SMB_PARAMETERS.Words block is limited in length to 510 bytes (2 × 255), while SMB_DATA.Bytes may be as much as 65535 bytes in length. If you add all that up, and then add in the SMB_PARAMETERS.WordCount field, the SMB_DATA.ByteCount field, and the size of the header, you will find that the whole thing fits easily into the 2^17 - 1 (131071) bytes made available in the NBT SESSION MESSAGE header.

2.4.3 Case in Point: NEGOTIATE PROTOCOL
Now that we have an overview of the structure of SMB messages, we can take a closer look at our live specimen. Remember packets 9 and 10 from the capture we made earlier? They show a NEGOTIATE PROTOCOL exchange. Let's get out the tweezers, the pocket knife, & dad's hammer and see what's inside.
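Before the dissection, the block layout from sections 2.4.2.2 and 2.4.2.3 can be sketched as a short routine that steps over the header and locates the two variable-length blocks. Treat this as an illustration only: the 32-byte header size is an assumption carried in from the commonly documented layout, and the names smb_blocks and smb_find_blocks are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SMB_FIXED_HDR 32   /* assumed size of the fixed SMB header */

/* Hypothetical view of the two variable-length blocks. */
typedef struct {
    uint8_t        word_count;  /* SMB_PARAMETERS.WordCount           */
    const uint8_t *words;       /* 2 * word_count bytes of parameters */
    uint16_t       byte_count;  /* SMB_DATA.ByteCount (little-endian) */
    const uint8_t *bytes;       /* byte_count bytes of data           */
} smb_blocks;

/* Locate the parameter and data blocks in a complete SMB message.
 * Returns 0 on success, -1 if the message is shorter than its own
 * counts claim. */
int smb_find_blocks(const uint8_t *msg, size_t len, smb_blocks *out)
{
    size_t pos = SMB_FIXED_HDR;

    if (len < pos + 1)
        return -1;
    out->word_count = msg[pos++];
    out->words = msg + pos;
    pos += 2u * out->word_count;

    if (len < pos + 2)
        return -1;
    out->byte_count = (uint16_t)(msg[pos] | (msg[pos + 1] << 8));
    pos += 2;

    if (len < pos + out->byte_count)
        return -1;
    out->bytes = msg + pos;
    return 0;
}

/* Worst case: header + WordCount + Words + ByteCount + Bytes. */
enum { SMB_MAX_MSG = SMB_FIXED_HDR + 1 + 2 * 255 + 2 + 65535 };
```

Under the 32-byte assumption the worst case works out to 32 + 1 + 510 + 2 + 65535 = 66080 bytes, comfortably below the 131071-byte limit of the session header.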
    NEGOTIATE_PROTOCOL_REQUEST {
      SMB_HEADER {
        PROTOCOL = "\xffSMB"
        COMMAND  = SMB_COM_NEGOTIATE (0x72)
        STATUS {
          ErrorClass = 0x00 (Success)
          ErrorCode  = 0x0000 (No Error)
        }
        FLAGS    = 0x18 (Pathnames are case-insensitive)
        FLAGS2   = 0x8001 (Unicode and long filename support)
        EXTRA {
          PidHigh   = 0x0000
          Signature = 0 (all bytes zero filled)
        }
        TID      = 0 (Not yet known)
        PID      = <Client Process ID>
        UID      = 0 (Not yet known)
        MID      = 2 (often 0 or 1, but varies per OS)
      }
      SMB_PARAMETERS {
        WordCount = 0
        Words     = <empty>
      }
      SMB_DATA {
        ByteCount = 12
        Bytes {
          BufferFormat = 0x02 (Dialect)
          Name         = "NT LM 0.12" (nul terminated)
        }
      }
    }

The breakdown of packet 9 shows the SMB NEGOTIATE PROTOCOL REQUEST as sent by the jCIFS Exists utility. Other clients will use slightly different values, but they are all variations on the same theme. Some features worth noting:
2.4.4 The AndX Mutation
In the trace given above, Ethereal has identified packets 11 and 12 as being a SESSION SETUP ANDX exchange[16]. The term "ANDX" at the end of the names indicates that these messages belong to a curious class of creatures known as "AndX messages". SMB AndX messages are actually several SMBs combined into a single symbiotic packet, as shown in figure 2.4. It is an efficient mutation.

<tpot> shouldn't that be an AntX?
-- Tim Potter on IRC
AndX messages work something like a linked list. Each Parameter block in an AndX message begins with the following structure:

    ANDX {
      AndXCommand  = <Command code of the next AndX block (one byte)>
      AndXReserved = <Reserved (one byte)>
      AndXOffset   = <Offset of the next AndX block (two bytes)>
    }

The AndXCommand field provides the SMB command code for the next AndX block in the list (not the current one). The AndXOffset contains the byte index, relative to the start of the SMB header, of that next AndX block--think of it as a pointer. Since the AndXOffset value is independent of the SMB_PARAMETERS.WordCount and SMB_DATA.ByteCount values, it is possible to provide padding between the AndX blocks as shown in figure 2.5.

Now that we have a general idea of what an SMB AndX message looks like we are ready to dissect packet 11. It looks like this:
    SESSION_SETUP_ANDX_REQUEST {
      SMB_HEADER {
        PROTOCOL = "\xffSMB"
        COMMAND  = SMB_COM_SESSION_SETUP_ANDX (0x73)
        STATUS {
          ErrorClass = 0x00 (Success)
          ErrorCode  = 0x0000 (No Error)
        }
        FLAGS    = 0x18 (Pathnames are case-insensitive)
        FLAGS2   = 0x0001 (Long filename support)
        EXTRA {
          PidHigh   = 0x0000
          Signature = 0 (all bytes zero filled)
        }
        TID      = 0 (Not yet known)
        PID      = <Client Process ID>
        UID      = 0 (Not yet known)
        MID      = 2 (often 0 or 1, but varies per OS)
      }
      ANDX_BLOCK[0] (Session Setup AndX Request) {
        SMB_PARAMETERS {
          WordCount                     = 13
          AndXCommand                   = SMB_COM_TREE_CONNECT_ANDX (0x75)
          AndXOffset                    = 79
          MaxBufferSize                 = 1300
          MaxMpxCount                   = 2
          VcNumber                      = 1
          SessionKey                    = 0
          CaseInsensitivePasswordLength = 0
          CaseSensitivePasswordLength   = 0
          Capabilities                  = 0x00000014
        }
        SMB_DATA {
          ByteCount     = 20
          AccountName   = "GUEST"
          PrimaryDomain = "?"
          NativeOS      = "Linux"
          NativeLanMan  = "jCIFS"
        }
      }
      ANDX_BLOCK[1] (Tree Connect AndX Request) {
        SMB_PARAMETERS {
          WordCount      = 4
          AndXCommand    = SMB_COM_NONE (0xFF)
          AndXOffset     = 0
          Flags          = 0x0000
          PasswordLength = 1
        }
        SMB_DATA {
          ByteCount = 22
          Password  = ""
          Path      = "\\SMEDLEY\HOME"
          Service   = "?????" (yes, really)
        }
      }
    }

There is a lot of information in that message, but we are not yet ready to dig into the details. There is just too much to cover all of it at once. Our goals right now are simply to highlight the workings of the AndX blocks, and to provide a glimpse inside the SESSION SETUP ANDX & TREE CONNECT ANDX sub-messages so that we will have something to talk about later on.

The block labeled ANDX_BLOCK[0] is the body of the SESSION SETUP REQUEST, and ANDX_BLOCK[1] contains the TREE CONNECT REQUEST. Note that the AndXCommand field in the final AndX block is given a value of 0xFF. This, in addition to the zero offset in the AndXOffset field, indicates the end of the AndX list.

2.4.5 The Flow of Conversation
SMB conversations start after the session has been established via the transport layer. As a rule, the client always speaks first.
Clients send requests, servers respond, and that's the way SMB is supposed to work. This is a hard-and-fast rule which means, of course, that there is an exception. Fortunately, we can (and will) put off talking about that exception until we talk about Opportunistic Locks (OpLocks).

The NEGOTIATE PROTOCOL REQUEST/RESPONSE is always the first SMB exchange in the conversation. The client and server need to know what language to speak before they can say anything else. This is also a hard-and-fast rule, but there are no exceptions (which is an exception to the rule that all hard-and-fast rules have exceptions).

Once the dialect has been selected, the next formality is to establish an SMB session using the SMB SESSION SETUP REQUEST message. We keep running into terminology twists, and here we have yet another. The SMB SESSION SETUP exchange sets up an SMB session within the NBT or naked TCP session. Huh? Well, yes, that's confusing. The problem is that we are talking about two different kinds of sessions here.
Ah, there's a clue! The SESSION SETUP is used to perform authentication and establish a user session with the server[17]. A quick look at the SESSION SETUP ANDX REQUEST block in the packet above shows that the Exists utility did in fact send a username--the name "GUEST", passed via the AccountName field--to the server.

Once the user session is established, the client may try to connect to a share using a TREE CONNECT SMB. It is a hard-and-fast rule that TREE CONNECT SMBs must follow the SESSION SETUP. There is an exception to this as well, which we will cover when we get to share-mode vs. user-mode authentication.

Figure 2.6 shows the right way to start an SMB conversation. Combining the SESSION SETUP ANDX and TREE CONNECT ANDX SMBs into a single AndX message is optional (jCIFS' Exists does, but Samba's smbclient doesn't). Once the conversation has been initiated using the above sequence, the client is free to improvise.

2.4.6 A Little More Code
There is another small detail you may have noticed while studying the captured SMB packets--or perhaps you remember this from one of the !Alert boxes in the NBT section: SMBs are written using little-endian byte order. If your target platform is big-endian, or if you want your code to be portable to big-endian systems, you will need to be able to handle the conversion between host and SMB byte order.

The htonl(), htons(), ntohl(), and ntohs() functions won't help us here. They convert between host and network order. We need to be able to convert between host and SMB order (and SMB order is definitely not the same as network order).

So, to solve the problem, we need a little bit of code, which is presented here mostly to get it out of the way so that we won't have to bother with it when we are dealing with more complex issues. The functions in Listing 2.2 read short and long integer values directly from incoming message buffers and write them directly to outgoing message buffers.
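Listing 2.2 itself is not reproduced here, so what follows is only a sketch of what such functions might look like. The names le_get16(), le_get32(), le_set16(), and le_set32() are hypothetical stand-ins chosen for this example. The one idea that matters is that each value is assembled or taken apart a byte at a time, which gives the same result on big-endian and little-endian hosts.

```c
#include <stdint.h>

/* Read a 16-bit little-endian value at buf[off]. */
uint16_t le_get16(const uint8_t *buf, int off)
{
    return (uint16_t)(buf[off] | (buf[off + 1] << 8));
}

/* Read a 32-bit little-endian value at buf[off]. */
uint32_t le_get32(const uint8_t *buf, int off)
{
    return (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
}

/* Write a 16-bit value at buf[off], least significant byte first. */
void le_set16(uint8_t *buf, int off, uint16_t val)
{
    buf[off]     = (uint8_t)(val & 0xFF);
    buf[off + 1] = (uint8_t)(val >> 8);
}

/* Write a 32-bit value at buf[off], least significant byte first. */
void le_set32(uint8_t *buf, int off, uint32_t val)
{
    buf[off]     = (uint8_t)(val & 0xFF);
    buf[off + 1] = (uint8_t)((val >> 8) & 0xFF);
    buf[off + 2] = (uint8_t)((val >> 16) & 0xFF);
    buf[off + 3] = (uint8_t)(val >> 24);
}
```

Because nothing here casts the buffer to a wider integer type, the sketch also sidesteps alignment trouble when a field lands on an odd offset.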
2.4.7 Take a Break
Our field trip into SMB territory is now over. We have covered a lot of ground, collected samples, and taken a look at SMBs in the wild. Our next step will be doing the lab work, studying our specimens under a microscope.

It is time to take a break, relax, and reflect on what we have learned so far. Time for a cup of tea. In the next section we will go back over the SMB header in a lot more detail with the goal of explaining some of the key concepts that we have only touched on so far. You will probably want to be well rested and in a good mood for that.

2.5 The SMB Header in Detail
During that first expedition into SMB territory we continually deferred studying the finer details of the SMB header, among other things. We were trying to cover the general concepts, but now we need to dig into the guts of SMB to see how things really work. Latex gloves and lab coats required.

Let's start by revisiting the header layout. Just for review, here's what it looks like:
The first four bytes are constant, so we won't worry about those. The COMMAND field is fairly straightforward too; it's just a one-byte field containing an SMB command code. The list of available codes is given in section 5.1 of the SNIA doc.

The rest of the header is where the fun lies....

2.5.1 The SMB_HEADER.STATUS Field Exposed
Things get interesting starting at the STATUS field. It wouldn't be so bad except for the fact that there are two possible error code formats to consider. There is the DOS & OS/2 format, and then there is the NT_STATUS format. In C-language terms, the STATUS field looks something like this:

    typedef union
      {
      ulong NT_Status;
      struct
        {
        uchar  ErrorClass;
        uchar  reserved;
        ushort ErrorCode;
        } DosError;
      } Status;

From the client side, one way to deal with the split personality problem is to use the DOS codes exclusively[18]. These are fairly well documented (by SMB standards), and should be supported by all SMB servers. Using DOS codes is probably a good choice, but there is a catch... There are some advanced features which simply don't work unless the client negotiates NT_STATUS codes.

Rats!
-- Charlie Brown, Peanuts, by Charles Schulz
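The union above describes the idea, but overlaying it directly on a message buffer would leave the result at the mercy of compiler padding and host byte order. As a hedged alternative, the sketch below picks the four STATUS bytes apart by hand. It assumes the caller already knows which format was negotiated; the nt_negotiated parameter and the names smb_status_view and smb_status_decode are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded form of the four-byte STATUS field. */
typedef struct {
    int      is_nt;        /* nonzero: nt_status is meaningful      */
    uint32_t nt_status;    /* all four bytes, read as little-endian */
    uint8_t  error_class;  /* DOS format: first byte                */
    uint16_t error_code;   /* DOS format: last two bytes            */
} smb_status_view;

/* Decode the STATUS bytes. The second byte of the DOS form is the
 * reserved field and is ignored. */
smb_status_view smb_status_decode(const uint8_t st[4], int nt_negotiated)
{
    smb_status_view v;

    v.is_nt       = nt_negotiated ? 1 : 0;
    v.nt_status   = (uint32_t)st[0]
                  | ((uint32_t)st[1] << 8)
                  | ((uint32_t)st[2] << 16)
                  | ((uint32_t)st[3] << 24);
    v.error_class = st[0];
    v.error_code  = (uint16_t)(st[2] | (st[3] << 8));
    return v;
}
```

Both interpretations are filled in on every call, so the caller only has to look at is_nt to decide which fields to trust.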
Another reason to support NT_STATUS codes is that they provide finer-grained diagnostics, simply because there are more of them defined than there are DOS codes. Samba has a fairly complete list of the known NT_STATUS codes, which can be found in the samba/source/include/nterr.h file in the Samba distribution. The list of DOS codes is in doserr.h in the same directory.

We have already described the structure of the DOS error codes. NT_STATUS codes also have a structure, and it looks like this:

    NT_STATUS {
      Level      = <Severity level (2 bits)>
      <reserved> = <Always zero (2 bits)>
      Facility   = <Error source (12 bits)>
      ErrorCode  = <Error code (16 bits)>
    }

In testing, it appears as though the Facility field is always set to zero (FACILITY_NULL) for SMB errors. That leaves us with the Level and ErrorCode fields to provide variety ... and, as we have suggested, there is quite a bit of variety. Samba's nterr.h file lists over 500 NT_STATUS codes, while doserr.h lists only 99 (and some of those are repeats).

Level is one of the following:

    00 == Success
    01 == Information
    10 == Warning
    11 == Error

Since the next two bits (the <reserved> bits) are always zero, the highest-order nibble will have one of the following values: 0x0, 0x4, 0x8, or 0xC. At the other end of the longword, the ErrorCode is read as an unsigned short (just like the DOS ErrorCode field).

The availability of Samba's list of NT_STATUS codes makes things easy. It took a bit of doing to generate that list, however, as most of the codes are not documented in an accessible form. Andrew Tridgell described the method below, which he used to generate a list of valid NT_STATUS codes. His results were used to create the nterr.h file used in Samba.
Okay, now for the next conundrum... Servers have it tougher than clients. Consider a server that needs to respond to one client using DOS error codes, and to another client using NT_STATUS codes. That's bad enough, but consider what happens when that server needs to query yet another server in order to complete some operation. For example, a file server might need to contact a Domain Controller in order to authenticate the user. The problem is that, no matter which STATUS format the Domain Controller uses when responding to the file server, it will be the wrong format for one of the clients.

To solve this problem the server needs to provide a consistent mapping between DOS and NT_STATUS codes. Windows NT and Windows 2000 both have such mappings built in but, of course, the details are not published (a partial list is given in section 6 of the SNIA doc).

Andrew Bartlett used a trick similar to Tridge's in order to generate the required mappings. His setup uses a Samba server running as a Primary Domain Controller (PDC), and a Windows 2000 system providing SMB file services. A third system, running Samba's smbtorture testing utility, acts as the client. When the client system tries to log on to the Windows server, Windows passes the login request to the Samba PDC. The test works like this:
Andrew's test must be rerun periodically. The mappings have been known to change when Windows service packs are installed. See the file samba/source/libsmb/errormap.c in the Samba distribution for more fun and adventure[19].

2.5.2 The FLAGS and FLAGS2 Fields Tell All
Most (but not all) of the bits in the older FLAGS field are of interest only to older servers. They represent features that have been superseded by newer features in newer servers. It would be nice if all of the old stuff would just go away so that we wouldn't have to worry about it. It does seem, in fact, as though this is slowly happening. (Maybe it would be better if the old stuff stayed and the new stuff had never happened. Hmmm...)

Duh... dat sounds logical!
-- Baby Huey, Harvey Entertainment
In any case, this next table presents the FLAGS bits in order of descending significance--the opposite of the order used in the SNIA doc. English speaking people tend to read from left to right and from top to bottom, so it seems logical (as this book is, more or less, written in English[20]) to transpose left-to-right order into a top-to-bottom table.
The NEGOTIATE PROTOCOL REQUEST that we dissected back in section 2.4.3 shows only the SMB_FLAGS_CANONICAL_PATHNAMES and SMB_FLAGS_CASELESS_PATHNAMES bits set, which is probably the best thing for new implementations to do. Testing with other clients may reveal other workable combinations. Now let's take a look at the newer flags in the FLAGS2 field.
Some of the flags are used to modify the interpretation of the SMB message, while others are used to negotiate features. Some do both. It may take some experimentation to find the safest way to handle these bits. Implementations are not consistent, so new code must be fine-tuned.

You may need to refer back to these tables as we dig further into the details. Note that the constant names listed above may not match those in the SNIA doc, or those in other docs or available source code. There doesn't seem to be a lot of agreement on the names.

2.5.3 EXTRA! EXTRA! Read All About It!
Um, actually we are going to delay covering the EXTRA field yet again. EXTRA.PidHigh will be thrown in with the PID field, and EXTRA.Signature will be handled as part of authentication.

2.5.4 TID and UID: Separated at Birth?
It would seem logical that the [V]UID and TID fields would be somehow related. Both are assigned and managed by the server, and we said before that the SESSION SETUP (where the logon occurs) is supposed to happen before the TREE CONNECT. Well, put all that aside and pay attention to this little story...
So what the purplebananafish does this have to do with TIDs and UIDs? Well, see, it's like this...

Early corporate LANs, such as those in our story, were small and self-contained. The driving goal was to make sure that the data were available to everyone in the office who could legitimately claim to need access. Security was not considered a top priority, so PC OSes (e.g., DOS) did not support complicated minicomputer features like user-based authentication. Given the environment, it is not surprising that the authentication system originally built into SMB was (by today's standards) quite primitive. Passwords, if they were used at all, were assigned to shares--not users--and everyone who wanted to access the same share would use the same password.

This early form of SMB authentication is now known as "Share Level" security. It does not include the concept of user accounts, so the UID field is always zero. The password is included in the TREE CONNECT message, and a valid TID indicates a successfully authenticated connection. In fact, though the UID field is listed in the SMB message format layout described in the ancient COREP.TXT scrolls, it is not mentioned again anywhere else in that document. There is no mention of a SESSION SETUP message either.

There are some interesting tricks that add a bit of flexibility to Share Level security. For example, a single share may have multiple passwords assigned, each granting different access rights. It is fairly common, for instance, to assign separate passwords for read-only vs. read/write access to a share.

Another interesting fudge is often used to provide access to user home directories. The server (which, in this case, understands user-based authentication even if the protocol and/or client do not) simply offers usernames as share names. When a user connects to the share matching their username, they give their own login password.
The server then checks the username/password pair using its normal account validation routines. Thus, user-based authentication can be mapped to Share Level security. (See figure 2.8.)

Share Level security, though still used, is considered deprecated. It has been replaced with "User Level" security which, of course, makes use of username/password instead of sharename/password pairs.

What a difference 15 years can make
-- John Lewkowicz, The Complete MUMPS
Under User Level security, the SESSION SETUP is performed as the authentication step before any TREE CONNECT requests may be sent. If the logon succeeds, the server will assign a valid (non-zero) UID. Subsequent TREE CONNECT attempts can use the UID as an authentication token when requesting access to a share. If User Level security is in use, the password field in the TREE CONNECT message will be blank.

So, with User Level security, the client must authenticate to get a valid UID, and then present the UID to gain access to shares. Thing is, more than one UID may be generated within a single connection, and the UID used to connect to the share does not need to be the same as the one used to access files within the share.

2.5.5 PID and MID Revealed
Simply put:
That's the idea, anyway. The client provides values for these fields when it sends a request to the server, and the server is supposed to echo the values back in the response. That way, the client can match the reply to the original request.

Some systems (such as Windows and OS/2) multiplex all of the SMB traffic between a client and a server over a single TCP connection. If the client OS is multi-tasking there may be several active SMB sessions running concurrently, so there may be several requests outstanding at any given time. The SMB conversations are all intertwined, so the client needs a way to sort out the replies and hand them off to the correct thread within the correct process. (See figure 2.9.)

The PID field is also used to maintain the semantics of local file I/O. Think about a simple program, like the one in listing 2.3 which opens a file in read-only mode and dumps the contents. Consider, in particular, the call to the open() function, which returns a file descriptor. File descriptors are maintained on a per-process basis--that is, each process has its own private set. The descriptor is an integer used by the operating system to identify an internal record that keeps track of lots of information about the open file, such as:
Now take all of that and stretch it out across a network. The files physically reside on the server and information about locks, offsets, etc. must be kept on the server side. The process that has opened the files, however, resides on the client and all of the file status information is relevant within the context of that process. That brings us back to what we said before: The PID identifies a client process. It lets the server keep track of client context, and associate it correctly with the right customer when the requests come rolling in.

Further complicating things, some clients support multiple threads running within a process. Threads share context (memory, file descriptors, etc.) with their sister threads within the same process, but each thread may generate SMB traffic all on its own. The MID field is used to make sure that server replies get back to the thread that sent the request.

The server really doesn't do much with the MID. It just echoes it back to the client so, in fact, the client could make whatever use it wanted of the MID field. Using it as a thread identifier is probably the most practical thing to do.

There is an important rule which the client should obey with regard to the MID and PID fields: Only one SMB request should ever be outstanding per [PID, MID] pair per connection. The reason for this rule is that the client will generally need to know the result of a request before sending the next request, especially if an error occurred. The problems which might result should this rule be broken probably depend upon the server, but defensive programming practices would suggest avoiding trouble.

2.5.5.1 EXTRA.PidHigh Dark Secrets Uncovered
Earlier on we promised to cover the EXTRA.PidHigh field. Well, a promise is a promise...

The PidHigh field is supposed to be a PID extension, allowing the use of 32-bit rather than 16-bit values as process identifiers. As with all extensions, however, there is the basic problem of backward compatibility.
In this case, trouble shows up if (and only if) the client supports 32-bit process IDs but the server does not. In that situation, the client must have a mechanism for mapping 32-bit process IDs to 16-bit values that can fit into the PID field. It doesn't need to be an elaborate mapping scheme, and it is unlikely that there will be 64K client processes talking to the same server at the same time, so it should be a simple problem to solve.

Since that mapping mechanism needs to be in place in order for the client to work with servers that don't support the PidHigh field, there's no reason to use 32-bit process IDs at all. In testing, it appears as though the PidHigh field is, in fact, always zero (except in some obscure security negotiations that are still not completely understood). Best bet, leave it zero.

2.5.6 SMB Header Final Report
Code...

The next two listings (2.4a and 2.4b) provide support for reading and writing SMB message headers. Most of the header fields are simple integer values, so we can use the smb_Set*() and smb_Get*() functions from listing 2.2 to move the data in and out of the header buffer. To make subsequent code easier to read, we provide a set of macros with nice clear names to front-end the function calls and assignments that are actually used.

The smb_hdrInit() and smb_hdrCheck() functions are there primarily to ensure that the SMB headers are reasonably sane. They check for things like the buffer size, and ensure that the "\xffSMB" string is included correctly in the header buffer.

Note that none of these functions or macros handle the reading and writing of the four-byte session header, though that would be trivial. The SESSION MESSAGE header is part of the transport layer, not SMB. It is handled as a simple network-byte-order longword; something from the NBT Session Service that has been carried over into naked transport. (We covered all this back in sections 1.6 and 2.1.2.)
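Listings 2.4a and 2.4b are not reproduced here. As a rough stand-in, the sketch below shows the kind of sanity checks described above, plus the "trivial" session header. Everything in it is an assumption made for illustration: the functions hdr_init(), hdr_check(), and session_hdr_put() are hypothetical (they are not the smb_hdrInit() and smb_hdrCheck() of the listings), the field offsets follow the commonly documented 32-byte header layout, and the session header is taken to be a zero type byte followed by a 17-bit length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE 32   /* assumed size of the fixed SMB header */

/* Zero a header buffer, then fill in the protocol string, the command
 * code, and the PID and MID (little-endian, at assumed offsets 26 and
 * 30). Returns the header size, or -1 if the buffer is too small. */
int hdr_init(uint8_t *buf, size_t bufsize, uint8_t command,
             uint16_t pid, uint16_t mid)
{
    if (buf == NULL || bufsize < HDR_SIZE)
        return -1;
    memset(buf, 0, HDR_SIZE);
    memcpy(buf, "\xffSMB", 4);
    buf[4]  = command;
    buf[26] = (uint8_t)(pid & 0xFF);
    buf[27] = (uint8_t)(pid >> 8);
    buf[30] = (uint8_t)(mid & 0xFF);
    buf[31] = (uint8_t)(mid >> 8);
    return HDR_SIZE;
}

/* Returns 0 if the buffer is long enough to hold a header and starts
 * with "\xffSMB", otherwise -1. */
int hdr_check(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < HDR_SIZE)
        return -1;
    return (memcmp(buf, "\xffSMB", 4) == 0) ? 0 : -1;
}

/* Write the four-byte session header in network (big-endian) order.
 * Returns -1 if the length does not fit in 17 bits. */
int session_hdr_put(uint8_t *buf, uint32_t msg_len)
{
    if (msg_len > 0x1FFFFu)
        return -1;
    buf[0] = 0x00;                              /* assumed message type */
    buf[1] = (uint8_t)((msg_len >> 16) & 0x01);
    buf[2] = (uint8_t)((msg_len >> 8) & 0xFF);
    buf[3] = (uint8_t)(msg_len & 0xFF);
    return 0;
}
```

A caller would typically reserve four bytes at the front of the outgoing buffer, build the SMB message after them, and only then call session_hdr_put() with the finished message length.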
2.6 Protocol Negotiation
This one goes to eleven.
-- Nigel Tufnel (Christopher Guest), This Is Spinal Tap
CIFS is a very rich and varied protocol suite, a fact that is evident in the number of SMB dialects that exist. Five are listed in the X/Open SMB protocol specification, and the SNIA doc--published ten years later--lists eleven. That's a bigbunch, and they probably missed a few. Each new dialect may add new SMBs, deprecate old ones, or extend existing ones. As if that were not enough, implementations introduce subtle variations within dialects.

All that in mind, our goal in this section will be to provide an overview of the available dialects, cover the workings of the NEGOTIATE PROTOCOL SMB exchange, and take a preliminary peek at some of the concepts that we have yet to consider (things like virtual circuits and authentication).

For the most part, the examples and discussion will be based on the "NT LM 0.12" dialect. The majority of the servers currently available support some variation of NT LM 0.12, and at least one client implementation (jCIFS) has managed to get by without supporting any others. Server writers should be warned, however, that there really are a lot of clients still around that use older calls. Even new clients will use older calls, simply because of the difficulty of acquiring reliable documentation on the newer stuff.

2.6.1 A Smattering of SMB Dialects
In keeping with tradition, the list of dialects is presented as a table with the dialect name in the left-hand column and a short description in the right, ordered from oldest to newest. Most of the references to these dialects seem to do it this way. Our list is not quite as complete as you might find elsewhere. The aim here is to highlight some of the better-known examples in order to provide a bit of context for the examination of the SMB_COM_NEGOTIATE message.

Where relevant, important differences between dialects will be noted. It would be very difficult, however, to try to document all of the features of each dialect and all of the changes between them. If you really, really need to know more (which is likely, if you are working on server code) see the SNIA doc, the X/Open doc, the expired IETF drafts, and the other old Microsoft documentation that is still freely available from their FTP server[22].

A language is a dialect with an army and a navy.
-- Uriel Weinreich
Section 3.16 of the SNIA CIFS Technical Reference, V1.0 provides a list of SMB message types categorized by the dialect in which they were introduced. There is also a slightly more complete list of dialects in section 5.4 of the SNIA doc.

2.6.2 Greetings: The NEGOTIATE PROTOCOL REQUEST
We have already provided a detailed breakdown of a NEGOTIATE PROTOCOL REQUEST SMB (back in section 2.4.3), so we don't need to go to the trouble of fully dissecting it again. The interesting part of the request is the data section (the parameter section is empty). If we were to write a client that supported all of the dialects in our chart, the NEGOTIATE_PROTOCOL_REQUEST.SMB_DATA field would break out something like this:

    SMB_DATA {
      ByteCount = 131
      Bytes {
        Dialect[0] = "\x02PC NETWORK PROGRAM 1.0"
        Dialect[1] = "\x02MICROSOFT NETWORKS 1.03"
        Dialect[2] = "\x02MICROSOFT NETWORKS 3.0"
        Dialect[3] = "\x02LANMAN1.0"
        Dialect[4] = "\x02LM1.2X002"
        Dialect[5] = "\x02LANMAN2.1"
        Dialect[6] = "\x02Samba"
        Dialect[7] = "\x02NT LM 0.12"
        Dialect[8] = "\x02CIFS"
      }
    }

Each dialect string is preceded by a byte containing the value 0x02. This, perhaps, was originally intended to make it easier to parse the buffer. In addition to the 0x02 prefix the dialect strings are nul-terminated, so if you go to the trouble of counting up the bytes to see if the ByteCount value is correct in this example, don't forget to add 1 to each string length.

Listing 2.5 provides code for creating a NEGOTIATE PROTOCOL REQUEST message. It also takes care of writing an NBT Session Message header for us--something we must not forget to do.

2.6.3 Gesundheit: The NEGOTIATE PROTOCOL RESPONSE
The NEGOTIATE PROTOCOL RESPONSE SMB is more complex than the request. In addition to the dialect selection, it also contains a variety of other parameters that let the client know the capabilities, limitations, and expectations of the server.
Most of these values are stuffed into the SMB_PARAMETERS block, but there are a few fields defined in the SMB_DATA block as well. 2.6.3.1 NegProt Response ParametersThe NEGOTIATE_PROTOCOL_RESPONSE.SMB_PARAMETERS.Words block for the NT LM 0.12 dialect is 17 words (34 bytes) in size, and is structured as shown below. Earlier dialects use a different structure and, of course, the server should always match the reply to the dialect it selects. typedef struct { uchar WordCount; /* Always 17 for this struct */ struct { ushort DialectIndex; /* Selected dialect index */ uchar SecurityMode; /* Server security flags */ ushort MaxMpxCount; /* Maximum Multiplex Count */ ushort MaxNumberVCs; /* Maximum Virtual Circuits */ ulong MaxBufferSize; /* Maximum SMB message size */ ulong MaxRawSize; /* Obsolete */ ulong SessionKey; /* Unique session ID */ ulong Capabilities; /* Server capabilities flags */ ulong SystemTimeLow; /* Server time; low bytes */ ulong SystemTimeHigh; /* Server time; high bytes */ short ServerTimeZone; /* Minutes from UTC; signed */ uchar EncryptionKeyLength; /* 0 or 8 */ } Words; } smb_NegProt_Rsp_Params;
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
She's so dull, come on rip her to shreds. -- Rip Her To Shreds, Blondie
That requires a lot of discussion. Let's tear it up and take a close look at the tiny pieces.
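Before going field by field, it may help to see how those 34 bytes could be pulled out of a receive buffer. What follows is a minimal sketch rather than one of the book's listings. The names negprot_rsp_params, get_u16le(), get_u32le(), and negprot_rsp_unpack() are made up for illustration, and the code assumes that buf points at the WordCount byte. SMB integer fields are little-endian on the wire, so each one is assembled byte by byte instead of being read through a struct pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* In-memory copy of the NT LM 0.12 NegProt response parameters.
 * (Illustrative type; field names follow the structure shown above.) */
typedef struct {
  uint16_t DialectIndex;
  uint8_t  SecurityMode;
  uint16_t MaxMpxCount;
  uint16_t MaxNumberVCs;
  uint32_t MaxBufferSize;
  uint32_t MaxRawSize;
  uint32_t SessionKey;
  uint32_t Capabilities;
  uint32_t SystemTimeLow;
  uint32_t SystemTimeHigh;
  int16_t  ServerTimeZone;
  uint8_t  EncryptionKeyLength;
} negprot_rsp_params;

/* Read little-endian integers from an unaligned byte buffer. */
static uint16_t get_u16le(const uint8_t *p)
{
  return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_u32le(const uint8_t *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Unpack WordCount plus 17 parameter words (35 bytes in all).
 * Returns 0 on success, -1 if the buffer is short or WordCount != 17. */
int negprot_rsp_unpack(const uint8_t *buf, size_t len,
                       negprot_rsp_params *out)
{
  const uint8_t *w;

  if (len < 35 || buf[0] != 17)
    return -1;
  w = buf + 1;                              /* skip WordCount */
  out->DialectIndex        = get_u16le(w + 0);
  out->SecurityMode        = w[2];
  out->MaxMpxCount         = get_u16le(w + 3);
  out->MaxNumberVCs        = get_u16le(w + 5);
  out->MaxBufferSize       = get_u32le(w + 7);
  out->MaxRawSize          = get_u32le(w + 11);
  out->SessionKey          = get_u32le(w + 15);
  out->Capabilities        = get_u32le(w + 19);
  out->SystemTimeLow       = get_u32le(w + 23);
  out->SystemTimeHigh      = get_u32le(w + 27);
  out->ServerTimeZone      = (int16_t)get_u16le(w + 31);
  out->EncryptionKeyLength = w[33];
  return 0;
}
```

The WordCount check matters because earlier dialects use a different layout; a reply that is not 17 words long should not be decoded with these offsets.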
Proof by Faith: I believe that this has been proven ...somewhere. -- Jonathan Young, PhD.
The job's not done until the paperwork's finished. -- Lavatory Axiom
Wow... a lot of stuff there. No time to sit and chat about it right now, though. We still need to finish out the NEGOTIATE_PROTOCOL_RESPONSE.SMB_DATA block.

2.6.3.2 NegProt Response Data
SMB_DATA, of course, is handed to us as an array of bytes with the length provided in the ByteCount field. The parsing of those bytes depends upon the values in the SMB_PARAMETERS block that we just examined. The structure is completely different depending upon whether Extended Security has been negotiated. Here is what it looks like, more or less, in the NT LM 0.12 dialect:

  typedef struct
    {
    ushort ByteCount;               /* Number of bytes to follow. */
    union
      {
      struct
        {
        uchar GUID[16];             /* 16-byte Globally Unique ID */
        uchar SecurityBlob[];       /* Auth-system dependent      */
        } ext_sec;                  /* Extended Security          */
      struct
        {
        uchar EncryptionKey[];      /* 0 or 8 bytes long          */
        uchar DomainName[];         /* nul-terminated string      */
        } non_ext_sec;              /* Non-Extended Security      */
      } Bytes;
    } smb_NegProt_Rsp_Data;

The first thing to note is that this SMB_DATA.Bytes block structure is the union of two smaller structures:
The second thing to note is that this is pseudo-code, not valid C code. Some of the array lengths are unspecified because we don't know the byte-length of the fields ahead of time. In real code, you will probably need to use pointers or some other mechanism to extract the variable-length data from the buffer. Okay, let's chop that structure into little bits...
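As an illustration of the "pointers or some other mechanism" remark, here is one way the Bytes block might be carved up without copying anything. This is a sketch under stated assumptions: negprot_rsp_data and negprot_rsp_data_split() are hypothetical names, the caller is assumed to have already worked out whether Extended Security was negotiated, and key_len is assumed to be the EncryptionKeyLength value from the parameter block. The DomainName bytes are handed back raw; decoding them is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointers into the caller's SMB_DATA.Bytes buffer (nothing is copied). */
typedef struct {
  const uint8_t *guid;        /* Extended Security: 16 bytes, else NULL     */
  const uint8_t *blob;        /* Extended Security: SecurityBlob, else NULL */
  size_t         blob_len;
  const uint8_t *key;         /* Non-extended: EncryptionKey, else NULL     */
  size_t         key_len;     /* 0 or 8                                     */
  const uint8_t *domain;      /* Non-extended: DomainName bytes, else NULL  */
  size_t         domain_len;  /* includes any terminating nul               */
} negprot_rsp_data;

/* Split byte_count bytes of SMB_DATA.Bytes into their sub-fields.
 * Returns 0 on success, -1 if the block is too short for the layout. */
int negprot_rsp_data_split(const uint8_t *bytes, size_t byte_count,
                           int ext_sec, size_t key_len,
                           negprot_rsp_data *out)
{
  out->guid = out->blob = out->key = out->domain = NULL;
  out->blob_len = out->key_len = out->domain_len = 0;

  if (ext_sec) {
    if (byte_count < 16)
      return -1;                    /* no room for the GUID */
    out->guid     = bytes;
    out->blob     = bytes + 16;
    out->blob_len = byte_count - 16;
    return 0;
  }

  if (key_len > byte_count)
    return -1;                      /* key would overrun the block */
  out->key        = bytes;
  out->key_len    = key_len;
  out->domain     = bytes + key_len;
  out->domain_len = byte_count - key_len;
  return 0;
}
```

Every length is checked against ByteCount before a pointer is formed, which is the important habit when the field sizes come from the other end of the wire.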
2.6.4 Are We There Yet?
Okay, let's be honest... Ripping apart that NEGOTIATE PROTOCOL RESPONSE SMB was about as exciting as the epic saga of undercooked toast. It doesn't get any better than that, though, and there's a lot more of it. Implementing SMB is a game of patience and persistence. It also helps if you get a cheap thrill from fiddly little details. (Just don't go parsing your packets in public or people will look at you funny.)

It seems, too, that our overview of the SMB Header and the NEGOTIATE PROTOCOL exchange has left a bit of a mess on the floor. We have pulled a lot of concepts off of the shelves and out of the closets, and we will need to do some sorting and organizing before we can put them back. Let's see what we've got:
The only way to approach all of these topics is one-at-a-time. ...but first, take another break. Every now and then, it is a good idea to stop and think about what has been covered so far. This is one of those times. We have finished tearing apart SMB headers and the body of a NEGOTIATE PROTOCOL message. That should provide some familiarity with the overall structure of SMBs. Try doing some packet captures, or skim through the SNIA CIFS Technical Reference. It should all begin to make a little more sense now than it did when we started.
2.7 Session Setup
Originally, the SESSION SETUP was not required by--or even defined as part of--the SMB protocol. It was introduced in the LANMAN days in order to handle User Level authentication and could be skipped if the server was in Share Level security mode.

These days, however, the SESSION SETUP takes care of a lot of unfinished business, like cleaning up some of the debris left by the NEGOTIATE PROTOCOL RESPONSE. In the NT LM 0.12 dialect there must be a SESSION SETUP exchange before a TREE CONNECT may be sent, even if the server is operating in Share Level security mode.

2.7.1 SESSION SETUP ANDX REQUEST Parameters
The SESSION SETUP SMB is actually a SESSION SETUP ANDX, which simply means that there's an AndX block in the parameter section. In the NT LM 0.12 dialect, the Parameter block is formatted as shown below:

  typedef struct
    {
    uchar WordCount;                /* 12 or 13 words */
    struct
      {
      struct
        {
        uchar  Command;
        uchar  Reserved;
        ushort Offset;
        } AndX;
      ushort MaxBufferSize;
      ushort MaxMpxCount;
      ushort VcNumber;
      ulong  SessionKey;
      ushort Lengths[];             /* 1 or 2 elements */
      ulong  Reserved;
      ulong  Capabilities;
      } Words;
    } smb_SessSetupAndX_Req_Params;

When looking at these C-like structures, keep in mind that they are intended as descriptions rather than specifications. On the wire, the parameters are packed tightly into the SMB messages, and they are not aligned. Though the structures show the type and on-the-wire ordering of the fields, the C programming language does not guarantee that the layout will be retained in memory. That's why our example code includes all of those functions and macros for packing and unpacking the packets [28].

Many of the fields in the SESSION_SETUP_ANDX.SMB_PARAMETERS block should be familiar from the NEGOTIATE PROTOCOL RESPONSE SMB. This time, though, it's the client's turn to set the limits.
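To make the point about packing concrete, here is a small sketch of writing the 13-word (non-Extended Security) form of the block into a byte buffer. It is not taken from the book's listings: put_u16le(), put_u32le(), and sesssetup_req_pack() are hypothetical helpers. It assumes that the two Lengths[] entries carry the lengths of the case-insensitive and case-sensitive password fields, in that order, and that an AndX.Command of 0xFF means no chained command follows; check both against the SNIA doc.

```c
#include <stddef.h>
#include <stdint.h>

/* Write little-endian integers into an unaligned byte buffer. */
static void put_u16le(uint8_t *p, uint16_t v)
{
  p[0] = (uint8_t)(v & 0xFF);
  p[1] = (uint8_t)(v >> 8);
}

static void put_u32le(uint8_t *p, uint32_t v)
{
  p[0] = (uint8_t)(v & 0xFF);
  p[1] = (uint8_t)((v >> 8) & 0xFF);
  p[2] = (uint8_t)((v >> 16) & 0xFF);
  p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Pack WordCount plus 13 parameter words (27 bytes in all).
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t sesssetup_req_pack(uint8_t *buf, size_t len,
                          uint16_t max_buffer_size,
                          uint16_t max_mpx_count,
                          uint16_t vc_number,
                          uint32_t session_key,
                          uint16_t ci_pw_len,
                          uint16_t cs_pw_len,
                          uint32_t capabilities)
{
  if (len < 27)
    return 0;
  buf[0] = 13;                          /* WordCount                     */
  buf[1] = 0xFF;                        /* AndX.Command: nothing chained */
  buf[2] = 0;                           /* AndX.Reserved                 */
  put_u16le(buf + 3, 0);                /* AndX.Offset                   */
  put_u16le(buf + 5, max_buffer_size);
  put_u16le(buf + 7, max_mpx_count);
  put_u16le(buf + 9, vc_number);
  put_u32le(buf + 11, session_key);
  put_u16le(buf + 15, ci_pw_len);       /* Lengths[0]                    */
  put_u16le(buf + 17, cs_pw_len);       /* Lengths[1]                    */
  put_u32le(buf + 19, 0);               /* Reserved                      */
  put_u32le(buf + 23, capabilities);
  return 27;
}
```

Writing each field at an explicit offset keeps the on-the-wire layout independent of whatever padding the compiler might add to a struct.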
You might notice, upon careful examination, that the client does not send back a MaxRawSize value. That's because it can specify raw read/write sizes in the SMB_COM_RAW_READ and SMB_COM_RAW_WRITE requests, if it sends them. These SMBs are considered obsolete, so newer clients really shouldn't be using them.

There are a couple of fields in the SESSION SETUP REQUEST which touch on esoteric concepts that we have been promising to explain for quite a while now--specifically virtual circuits and capabilities--so let's get it over with...

2.7.1.1 Virtual Circuits
It does seem as though there's a good deal of cruft in the SMB protocol. The SessionKey, for example, appears to be a vestigial organ, the purpose of which has been mostly forgotten. Originally, such fields may have been intended to compensate for a limitation in a specific transport or an older implementation, or to solve some other problem that isn't a problem any more. Consider virtual circuits...

The LAN Manager documentation available from Microsoft's ftp site provides the best clues regarding virtual circuits (see SMB-LM1X.PS, for instance). According to those docs a virtual circuit (VC) represents a single transport layer connection, and the VcNumber is a tag used to identify a specific transport link between a specific client/server pair.

That concept probably needs to be considered in context. The LANMAN dialects were developed in conjunction with OS/2 (an honest-to-goodness, really-truly, multitasking OS). OS/2 clients pass SMB traffic through a redirector--just like DOS and Windows--and it seems as though there was some concern that multiplexing the SMB traffic from several processes across a single connection might cause a bit of a bottleneck. So, to avoid congestion, the redirector could create additional connections to facilitate faster transfers for individual processes [29].
Under this scheme, all of the transport level connections from a client to a server were considered part of a single logical "session" (we now, officially, have way too many meanings for that term). Within that logical session there could, conversely, be multiple transport level connections--aka. virtual circuits--up to the limit set in the NEGOTIATE PROTOCOL RESPONSE. Figure 2.10 illustrates the point, and here's how it's supposed to work:
Ah-Hahhh! The mystery of the SessionKey field is finally revealed. Kind of a let-down, isn't it? Whenever a new transport-layer connection is created, the client is supposed to assign a new VC number. Note that the VcNumber on the initial connection is expected to be zero to indicate that the client is starting from scratch and is creating a new logical session. If an additional VC is given a VcNumber of zero, the server may assume that any existing connections with that same client are now bogus, and shut them down. Why do such a thing?
There is a finite amount of clue in the Universe... and the Universe is expanding. -- Unknown (thanks to John Ladwig and Marcus Ranum)
The explanation given in the LANMAN documentation, the Leach/Naik IETF draft, and the SNIA doc is that clients may crash and reboot without first closing their connections. The zero VcNumber is the client's signal to the server to clean up old connections. Reasonable or not, that's the logic behind it. Unfortunately, it turns out that there are some annoying side-effects that result from this behavior. It is possible, for example, for one rogue application to completely disrupt SMB filesharing on a system simply by sending Session Setup requests with a zero VcNumber. Connecting to a server through a NAT (Network Address Translation) gateway is also problematic, since the NAT makes multiple clients appear to be a single client by placing them all behind the same IP address [30].

The biggest problem with virtual circuits, however, is that they are not really needed any more (if, in fact, they ever were). As a result, they are handled inconsistently by various implementations and are not entirely to be trusted. On the client-side, the best thing to do is to ignore the concept and view each transport connection as a separate logical session, one VC per session. Oh! ...and contrary to the specs the client should always use a VcNumber of one, never zero.

On the server side, it is important to keep in mind that the TID, UID, PID, and MID are all supposed to be relative to the VC. In particular, TID and UID values negotiated on one VC have no meaning (and no authority) on another VC, even if both VCs appear to be from the same client. Another important note is that the server should not disconnect existing VCs upon receipt of a new VC with a zero VcNumber. As described above, doing so is impractical and may break things. The server should let the transport layer detect and report session disconnects. At most, a zero VcNumber might be a good excuse to send a keep-alive packet.

The whole VC thing probably seemed like a good idea at the time.
2.7.1.2 Capabilities Bits
Remember a little while back when we said that there were subtle variations within SMB dialects? Well, some of them are not all that subtle once you get to know them. The Capabilities bits formalize several such variations by letting the client and server negotiate which special features will be supported. The server sends its Capabilities field in the NEGOTIATE PROTOCOL RESPONSE, and the client returns its own set of capabilities in the SESSION SETUP ANDX REQUEST.

The table below provides a listing of the capabilities defined for servers. The client set is smaller.
*  <---- Tribble
.  <---- Tribble.gz
-- Karen Swanberg
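Since both sides send a Capabilities field, a client ends up juggling two bit masks: what it can do, and what the server says it can do. The sketch below shows one way to combine them so that the client only sends back bits the server also set. The function names are hypothetical. The four bit values are the ones we believe the SNIA doc assigns to these capabilities, and they are included only to make the example concrete; verify them against the reference before relying on them.

```c
#include <stdint.h>

/* A few capability bits (values as we understand them from the SNIA doc;
 * treat them as assumptions and check the reference). */
#define CAP_UNICODE      0x00000004u
#define CAP_LARGE_FILES  0x00000008u
#define CAP_NT_SMBS      0x00000010u
#define CAP_STATUS32     0x00000040u

/* The Capabilities value a client returns in the SESSION SETUP ANDX
 * REQUEST: only the bits that both the client and the server support. */
uint32_t client_capabilities(uint32_t client_supported,
                             uint32_t server_caps)
{
  return client_supported & server_caps;
}

/* Non-zero if every bit in 'bits' is present in 'caps'. */
int has_capability(uint32_t caps, uint32_t bits)
{
  return (caps & bits) == bits;
}
```

For example, a client that supports Unicode, large files, and 32-bit status codes, talking to a server that offers Unicode, NT SMBs, and 32-bit status codes, would send back only the Unicode and status-code bits.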
On the server side, the implementor's rule of thumb regarding capabilities is to start by supporting as few as possible and add new ones one at a time. Each bit is a cornucopia--or Pandora's box--of new features and requirements, and most represent a very large development effort. As usual, if there is documentation it is generally either scarce or encumbered. Things are not quite so bad if you are implementing a client, though the client also has a list of capabilities that it can declare. The client list is as follows:
The client should not set any bits that were not also set by the server. That is, the Capabilities bits sent to the server should be the intersection (bitwise AND) of the client's actual capabilities and the set sent by the server.

The Capabilities bits are like the razor-sharp barbs on a government fence. Attempting to hurdle any one of them can shred your implementation. Consider adding Unicode support to a system that doesn't already have it. Ooof! That's going to be a lot of work [32].

Some Capabilities bits indicate support for sets of function calls that can be made via SMB. These function calls, which are sometimes referred to as "sub-protocols", fall into two separate (but similar) categories:
Of the two, the RAP sub-protocol is older and (relatively speaking) simpler. Depending upon the SMB dialect, server support for some RAP calls is assumed rather than negotiated. Fortunately, much of RAP is documented...if you know where to look [33].

Microsoft's RPC system--known as MS-RPC--is newer, and has a lot in common with the better-known DCE/RPC system. MS-RPC over SMB allows the client to make calls to certain Windows DLL functions on the server side which, in turn, allows the client to do all sorts of interesting things. Of course, if you are building a server and you want to support the MS-RPC calls you have to implement all of the required functions in addition to SMB itself. Unfortunately, much of MS-RPC is undocumented [34].

The MS-RPC function call APIs are defined using a language called Microsoft Interface Definition Language (MIDL). There is a fair amount of information about MIDL available on the web and some of the function interface definitions have been published. CIFS implementors have repeatedly asked Microsoft for open access to all of the CIFS-relevant MIDL source files. Unencumbered access to the MIDL source would go a long way towards opening up the CIFS protocol suite. Since MIDL provides only the interface specifications and not the function internals, Microsoft could release them without exposing their proprietary DLL source code.

Both the RAP and MS-RPC sub-protocols provide access to a large set of features, and both are too big to be covered in detail here. Complete documentation of all of the nooks and crannies of CIFS would probably require a set of books large enough to cause an encyclopedia to cringe in awe, so it would seem that our attempt to clean up the mess we made with the NEGOTIATE PROTOCOL exchange has instead created an even bigger mess and left some permanent stains on the carpet.

Ah, well. Such is the nature of CIFS.

2.7.2 SESSION SETUP ANDX REQUEST Data
The dissection of the SMB_PARAMETERS portion of the SESSION SETUP ANDX REQUEST cleared up a few issues and exposed a few others. Now we get to look at the SMB_DATA block and see what further mysteries may lie uncovered.
...just another piece of useless information. -- Rain on the Hills, Judie Tzuke
Fortunately, the Data block is much less daunting. It contains a few fields used for authentication and the rest is just useful bits of information about the client's operating environment. The structure looks like this:

  typedef struct
    {
    ushort ByteCount;
    struct
      {
      union
        {
        uchar SecurityBlob[];
        struct
          {
          uchar CaseInsensitivePassword[];
          uchar CaseSensitivePassword[];
          uchar Pad[];
          uchar AccountName[];
          uchar PrimaryDomain[];
          } non_ext_sec;
        } auth_stuff;
      uchar NativeOS[];
      uchar NativeLanMan[];
      uchar Pad2[];
      } Bytes;
    } smb_SessSetupAndx_Req_Data;
Who is your user? -- Tron
We have done a lot of work ripping apart packet structures and studying the internal organs. Don't worry, that's the last of it. You should be familiar enough with this stuff by now, so from here on out we will rely on the SNIA doc and packet traces to provide the gory details.
2.7.3 The SESSION SETUP ANDX RESPONSE SMB
The SESSION SETUP ANDX RESPONSE SMB structure is described in section 4.1.2 of the SNIA doc. In the NT LM 0.12 dialect, there are two versions of the SESSION SETUP ANDX RESPONSE message. They differ, of course, based on whether or not Extended Security is in use. In the Extended Security version the Parameter block has a SecurityBlobLength field, and there is an associated SecurityBlob within the Data block. These two fields are missing from the non-Extended Security version. Other than that, the two are the same.

The SESSION SETUP ANDX RESPONSE message also has an interesting little bitfield called SMB_PARAMETERS.Action. Only the low-order bit (bit 0) of this field is defined. If set, it indicates that the username was not recognized by the server (that is, authentication failed--no such user) but the logon is being allowed to succeed anyway.

That's rather odd, eh?

What it means is this: If the username (in the AccountName field) is not recognized, the server may choose to grant anonymous or guest authorization instead. Anonymous access typically provides only very limited access to the server. For example, it may allow the use of a limited set of RAP function calls such as those used for querying the Browse Service.

So, the Action bit is used to indicate that the logon attempt failed, but anonymous access was granted instead. No error code will be returned in this case, so the Action bit is the only indication to the client that the rules have changed. Server-side support for this behavior is optional.
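Because no error code accompanies this kind of "successful" logon, a client that cares has to test the bit itself. A minimal sketch follows, assuming the Action field has already been unpacked into a 16-bit value; the macro and function names are invented for this example.

```c
#include <stdint.h>

/* Bit 0 of SMB_PARAMETERS.Action (hypothetical macro name). */
#define SESSSETUP_ACTION_GUEST 0x0001u

/* Returns 1 if the server granted guest or anonymous access in place of
 * the requested account, 0 if the logon was for the named user.
 * All other bits of Action are undefined, so they are ignored. */
int sesssetup_fell_back_to_guest(uint16_t action)
{
  return (action & SESSSETUP_ACTION_GUEST) != 0;
}
```

A client might use the result to warn the user, or to give up and ask for different credentials instead of carrying on with very limited access.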
2.8 Authentication
Now for the big one... If you are familiar with authentication schemes then this section
should be comfortable for you. If not, then perhaps it's time for a
fresh pot of tea. Some people find their first experience with the
innards of password security to be a bit intimidating, possibly
because the encryption formulae are sometimes made to look a lot like
mathematics. Authentication itself isn't really that complex, though.
The basic idea is that the would-be user needs to prove that they are
who they say they are in order to get what they want. The proof is
usually in the form of something private or secret--something that
only the user has or knows.
Car locks are there
Consider, for example, the key to an automobile (something you have). With the key in hand, you are able to unlock the door, turn the ignition switch, and start the engine. As far as the car is concerned, you have proven that you have the right to drive. Likewise with the password you use to access your computer (something you know). If you enter a valid username/password pair at the login prompt, then you can access the system. Unfortunately passwords, like keys, can be stolen or forged or copied. Just as locks can be picked, so passwords can be cracked [36].

In the early days of SMB, when the LANs were small and sheltered, there was very little concern for the safety of the password itself. It was sent in plaintext (un-encrypted) over the wire from the client to the server. Eventually, though, corporate networks got bigger, modems were installed to provide access from home and on the road, the "disgruntled employee" boogeyman learned how to use a keyboard, and everything got connected to the Internet. These were hard times for plaintext passwords, so a series of schemes was developed to keep the passwords safe--each more complex than its predecessor.

For SMB, the initial attempt was called LAN Manager Challenge/Response authentication, often simply abbreviated "LM". The LM scheme turned out to be too simple and too easy to crack, and was replaced with something stronger called WindowsNT Challenge/Response (known as "NTLM"). NTLM was superseded by NTLMv2 which has, in turn, been replaced with a modified version of MIT's Kerberos system.

Got that?

We'll go through them all in various degrees of detail. The LM algorithm is fairly simple, so we can provide a thorough description. At the other extreme, Kerberos is an entire system unto itself and anything more than an overview would be overkill.

2.8.1 Anonymous and Guest Login
Gather and study piles of SMB packet captures and you will notice that some SESSION SETUP requests contain no username and password at all.
These are anonymous logins, and they are used to access special-purpose SMB shares such as the hidden "IPC$" share (the Inter-Process Communications share). You can learn more about IPC$ in the Browsing section. Put simply, though, this share allows one system to query another using RAP function calls.

Anonymous login may be a design artifact; something created in the days of Share Level security when it seemed safe to leave a share unprotected, and still with us today because it cannot easily be removed. Maybe not. One guess is as good as another.

"GUEST" account logons are also often sent sans password. The guest login is sometimes used in the same way as the anonymous login, but there are additional permissions which a guest account may have. Guest accounts are maintained like other "normal" accounts, so they can be a security problem and are commonly disabled. When SMB is doing its housekeeping, the anonymous login is generally preferred over the guest login.

2.8.2 Plaintext Passwords
This is the easiest SMB authentication mechanism to implement--and the least secure. It's roughly equivalent to leaving your keys in the door lock after you've parked the car. Sure, the car is locked, but...

Plaintext passwords may still be sufficient for use in small, isolated networks, such as home networks or small office environments (assuming no disgruntled employees and a well-configured firewall on the uplink--or no Internet connection at all). Plaintext passwords also provide us with a nice opportunity to get our feet wet in the mired pool of authentication. We can look at the packets and clearly see what is happening on the wire. Note, however, that many newer clients are configured to prevent the use of plaintext. Windows clients have registry entries that must be twiddled in order to permit plaintext passwords, and jCIFS did not support them at all until version 0.7.
In order to set up a workable test environment you will need a server that does not expect encrypted passwords, and a client that doesn't mind sending the passwords in the clear. That is not an easy combination to come by. Most contemporary SMB clients and servers disable plaintext by default. It is easy, however, to configure Samba so that it requests unencrypted passwords. Just change the encrypt passwords parameter to no in the smb.conf file, like so:

  ; Disable encrypted passwords.
  encrypt passwords = no

Don't forget to signal smbd to reload the configuration file after making this change.

On the client side we will, once again, use the jCIFS Exists utility in our examples. If you would rather use a Windows client for your own tests, you can find a collection of helpful registry settings in the docs/Registry/ subdirectory of the Samba distribution. You will probably need to change the registry settings to permit the Windows client to send plaintext passwords. Another option as a testing tool is Samba's smbclient utility, which does not seem to argue if the server tells it not to encrypt the passwords.

This is what our updated Exists test looks like:
A few things to note:
2.8.2.1 User Level Security with Plaintext Passwords
User and Share Level security were described back in section 2.5.4, along with the TID and [V]UID header fields. The SecurityMode field of the NEGOTIATE PROTOCOL RESPONSE SMB will indicate the authentication expectations of the server. For User Level plaintext passwords, the value of the SecurityMode field will be 0x01.

Below is an example SESSION_SETUP_ANDX.SMB_DATA block such as would be generated by the jCIFS Exists tool. Note, once again, that the discussion is focused on the NT LM 0.12 dialect.

  SMB_DATA
    {
    ByteCount = 27
    Bytes
      {
      CaseInsensitivePassword = "p@ssw0rd"
      CaseSensitivePassword   = <NULL>
      Pad                     = <NULL>
      AccountName             = "PAT"
      PrimaryDomain           = "?"
      NativeOS                = "Linux"
      NativeLanMan            = "jCIFS"
      }
    }

There are always fiddly little details to consider when working with SMB. In this case, we need to talk about upper- and lower-case. (bLeCH.)

The example above shows that the AccountName field has been converted to upper-case. This is common practice, but it is not really necessary and some implementations don't bother. It is a holdover from the early days of SMB when lots of things (filenames, passwords, share names, NetBIOS names, bagels, and pop singers) were converted to upper-case as a matter of course. Some older servers (pre-NT LM 0.12) may require upper-case usernames, but newer servers shouldn't care. Converting to upper-case is probably the safest option, just in case...

Although the AccountName in the example is upper-case, the
CaseInsensitivePassword is not. Hmmm... Odd, eh? The
situation here is that some server operating systems (eg. most Unixy
OSes) use case-sensitive password verification algorithms. If the
password is sent all upper-case it probably won't match what the OS
expects, resulting in a login failure even though the user entered the
correct password. The field may be labeled case-insensitive (and that
really is what it is intended to be) but some server OSes
prefer to have the original password, case preserved, just as the user
entered it.
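Going back to the example block for a moment, the ByteCount of 27 can be reproduced by laying the five strings end to end. The sketch below does that for the ASCII (non-Unicode) case only. It assumes that every string, the plaintext password included, is nul-terminated on the wire and that no pad bytes are present; that assumption is what makes the count come out to 27. The function names put_str() and sesssetup_plaintext_bytes() are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append s, including its terminating nul, at *pos.
 * Returns 0 on success, -1 if the string will not fit. */
static int put_str(uint8_t *buf, size_t len, size_t *pos, const char *s)
{
  size_t n = strlen(s) + 1;

  if (n > len - *pos)
    return -1;
  memcpy(buf + *pos, s, n);
  *pos += n;
  return 0;
}

/* Fill buf with the SMB_DATA.Bytes content for a plaintext, ASCII,
 * User Level SESSION SETUP ANDX REQUEST.
 * Returns the ByteCount, or 0 if buf is too small. */
size_t sesssetup_plaintext_bytes(uint8_t *buf, size_t len,
                                 const char *password,
                                 const char *account,
                                 const char *domain,
                                 const char *native_os,
                                 const char *native_lanman)
{
  size_t pos = 0;

  if (put_str(buf, len, &pos, password)  != 0 ||   /* CaseInsensitivePassword */
      put_str(buf, len, &pos, account)   != 0 ||   /* AccountName             */
      put_str(buf, len, &pos, domain)    != 0 ||   /* PrimaryDomain           */
      put_str(buf, len, &pos, native_os) != 0 ||   /* NativeOS                */
      put_str(buf, len, &pos, native_lanman) != 0) /* NativeLanMan            */
    return 0;
  return pos;
}
```

Called with "p@ssw0rd", "PAT", "?", "Linux", and "jCIFS", the function returns 9 + 4 + 2 + 6 + 6 = 27, matching the ByteCount shown in the example. Note that the password is copied exactly as given, case preserved.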
A tradition is a mistake that you have made more than once. -- Stephanie Cohen
This is a sticky problem, though, because some clients insist on converting passwords to upper-case before sending them to the server. Windows95 and '98 may do this, for example. As you might have come to expect by now, the reason for this odd behavior is backward compatibility. There are older (pre-NT LM 0.12) servers still running that will reject passwords that are not all upper-case. Windows9x systems solve the problem by forcing all passwords to upper-case even when the NT LM 0.12 dialect has been selected. Samba's smbd server, which generally runs on case-sensitive platforms, must go through a variety of contortions to get upper-case plaintext passwords to be accepted [39].

Another annoyance is that Windows98 will pad the plaintext password string to 24 bytes, filling the empty space with semi-random garbage. This behavior was noted in testing, but there wasn't time to investigate the problem in-depth so it may or may not be wide-spread. Still, it's the odd case that will break things. Server implementors should be careful to both check the field length and look for the first terminating nul byte when reading the plaintext password.

In short, client-side handling of the plaintext CaseInsensitivePassword is inconsistent and problematic--and the server has to compensate. That's why you need piles of SMB packet captures and lots of different clients to test against when writing a server implementation. It can be done, but it takes a bit of perseverance.

When writing a new client, ensure that the client sends the password as the user intended. If that fails, and the dialect is pre-NT LM 0.12, then convert to upper-case and try again. Believe it or not, the use of challenge/response authentication bypasses much of this trouble.

...but that's only half the story. In addition to the CaseInsensitivePassword field there is also a CaseSensitivePassword field in the data block, and we haven't even touched on that yet.
This latter field is only used if Unicode has been negotiated, and it is rare that both Unicode and plaintext will be used simultaneously. It can happen, though. As mentioned earlier, Samba can be easily configured to provide support for Unicode plaintext passwords [40]. In theory, this should be a simple switch from ASCII to Unicode. In practice, no client really supports it yet--and weird things have been seen on the wire. For example:
Empirically, it would seem that Unicode plaintext passwords were never meant to be. An interesting fact-ette that can be gleaned from this discussion is that there is a linkage between the password fields and the negotiation of Unicode. Simply put:
That is, ASCII plaintext passwords are stored in the CaseInsensitivePassword field, and Unicode plaintext passwords should be placed into the CaseSensitivePassword field. Indeed, Ethereal names these two fields, respectively, as "ANSI Password" and "Unicode Password" instead of using the longer names shown above. This relationship carries over to the challenge/response passwords as well, as we shall soon see.

2.8.2.2 Share Level Security with Plaintext Passwords
We won't spend too much time on this. It is easy to see by looking at packet captures. Basically, in Share Level security mode the plaintext password is passed to the server in the TREE CONNECT ANDX request instead of the SESSION SETUP ANDX. In the NT LM 0.12 dialect, however, a valid username should also be placed into the SESSION SETUP AccountName field if at all possible. Doing so allows the server to map Share Level security to its own user-based authentication system.
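One last plaintext detail before moving on. The server-side advice from section 2.8.2.1 (check the field length and also look for the first nul byte, because at least one client pads the field with garbage) boils down to a very small loop. This is only a sketch, with a hypothetical function name, and it covers the ASCII password field only.

```c
#include <stddef.h>
#include <stdint.h>

/* Work out how many bytes of a plaintext password field are really
 * password.  The scan stops at the first nul byte or at field_len,
 * whichever comes first, so padding after the nul is ignored and a
 * field with no nul at all cannot cause an overrun. */
size_t plaintext_pw_len(const uint8_t *field, size_t field_len)
{
  size_t i;

  for (i = 0; i < field_len; i++) {
    if (field[i] == 0)
      break;
  }
  return i;
}
```

For a 24-byte field holding "p@ssw0rd", a nul, and fifteen bytes of junk, the result is 8. For a field that is exactly the password with no terminator, the result is the field length.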
2.8.3 LM Challenge/Response
In plaintext mode, the client proves that it knows the password by sending the password itself to the server. In challenge/response mode, the goal is to prove that the password is known without risking any exposure. It's a bit of a magic trick. Here's how it's done:
That's a rough, general overview of challenge/response. The details of its use in LAN Manager authentication are a bit more involved, but are fairly easy to explain. As we dig deeper, keep in mind that the goal is to protect the password while still allowing authentication to occur. Also remember that LM challenge/response was the first attempt to add encrypted password support to SMB.

2.8.3.1 DES
The formula used to generate the LM response makes use of the U.S. Department of Commerce Data Encryption Standard (DES) function, in block-mode. DES has been around a long time. There are a lot of references which describe it and a good number of implementations available, so we will not spend a whole lot of time studying DES itself [41]. For our purposes, the important thing to know is that the DES function--as used with SMB--takes two input parameters and returns a result, like so:

  result = DES( key, source );

The source and result are both eight-byte blocks of data, the result being the DES encryption of the source. In the SNIA doc, as in the Leach/Naik draft, the key is described as being seven bytes (56 bits) long. Documentation on DES itself gives the length of the key as eight bytes (64 bits), but each byte contains a parity bit so there really are only 56 bits worth of "key" in the 64-bit key. As shown in figure 2.12, there is a simple formula for mapping 56 bits into the required 64-bit format. The seven byte string is simply copied, seven bits at a time, into an eight byte array. A parity bit (odd parity) is inserted following each set of seven bits (but some existing DES implementations use zero and ignore the parity bit).

The key is used by the DES algorithm to encrypt the source. Given the same key and source, DES will always return the same result.

2.8.3.2 Creating the Challenge
The challenge needs to be very random, otherwise the logon process could be made vulnerable to "replay" attacks. A replay attack is fairly straight-forward.
1. The attacker captures the exchange between the server and the client and keeps track of the challenge, the response, and the username.
2. The attacker then tries to log on, hoping that the challenge will be repeated (this step is easier if the challenge is at all predictable).
3. If the server sends a challenge that is in the stored list, the attacker can use the recorded username and response to fake a logon.

No password cracking required.

Given that the challenge is eight bytes (64 bits) long, and that random number generators are pretty good these days, it is probably best to create the challenge using a random number function. The better the random number generator, the lower the likelihood (approaching 1 in 2^64) that a particular challenge will be repeated.

The X/Open doc (which was written a long time ago) briefly describes a different approach to creating the challenge. According to that document, a seven-byte pseudo-random number is generated using an internal counter and the system time. That value is then used as the key in a call to DES(), like so:

    Ckey = fn( time( NULL ), counter++ );
    challenge = DES( Ckey, "????????" );

(...the source string is honest-to-goodnessly given as eight question marks.)

That formula actually makes a bit of sense, though it's probably
overkill. The pseudo-random Ckey is non-repeating (because
it's based on the time), so the resulting challenge is likely
to be non-repeating as well. Also note that the pseudo-random value
is passed as the key, not the source, in the call to
DES(). That makes it much more difficult to reverse and,
since it changes all the time, reversing it is probably not useful
anyway.
Anybody remotely interesting is mad, in some way or another.
-- Doctor Who
As Andrew Bartlett [42] points out, however, the time and counter inputs are easily guessed so the challenge is predictable, which is a potential weakness. Adding a byte or two of truly random "salt" to the Ckey in the recipe above would prevent such predictability.
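For comparison, here is a minimal sketch of a challenge drawn straight from the operating system's cryptographic random source, with no time or counter input for an attacker to guess. It is written in Python purely for illustration, and the name new_challenge() is our own invention rather than anything taken from an SMB implementation.

```python
import secrets

CHALLENGE_LEN = 8  # the LM/NTLM challenge is eight bytes (64 bits) long


def new_challenge() -> bytes:
    """Return a fresh, unpredictable eight-byte challenge."""
    # secrets.token_bytes() reads from the OS random source, so the value
    # does not depend on guessable inputs such as the time or a counter.
    return secrets.token_bytes(CHALLENGE_LEN)
```

A server would call this once per NEGOTIATE PROTOCOL REQUEST and remember the value until the matching response arrives.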
Using a plain random number generator is probably faster, easier, and safer.

2.8.3.3 Creating the LM Hash

LM challenge/response authentication prevents password theft by ensuring that the plaintext password is never transmitted across a network or stored on disk. Instead, a separate value known as the "LM Hash" is generated. It is the LM Hash that is stored on the server side for use in authentication, and used on the client side to create the response from the challenge. The LM Hash is a sixteen byte string, created as follows:
That outline would make a lot more sense as code, wouldn't it? Well, you're in luck. Listing 2.6 shows how the steps given above might be implemented.

2.8.3.4 Creating the LM Response

Now we get to the actual logon. When a NEGOTIATE PROTOCOL REQUEST arrives from the client, the server generates a new challenge on the fly and hands it back in the NEGOTIATE PROTOCOL RESPONSE.

On the client side, the user is prompted for the password. The client generates the LM Hash from the password, and then uses the hash to DES-encrypt the challenge. Of course, it's not a straight-forward DES operation. As you may have noticed, the LM Hash is 16 bytes but the DES() function requires 7-byte keys. Ah, well... Looks as though there's a bit more padding and chopping to do.
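The padding and chopping can be sketched in a few lines. The example below is an illustration only, not code from any real implementation: it nul-pads the 16-byte hash to 21 bytes, cuts it into three seven-byte keys, and expands each key to the eight-byte form using the odd-parity rule described in the DES section. The function names are hypothetical, and the DES encryption itself is left to whatever DES library is at hand.

```python
def expand_des_key(key7: bytes) -> bytes:
    """Map a 7-byte (56-bit) key onto the 8-byte form that DES expects."""
    if len(key7) != 7:
        raise ValueError("expected exactly seven key bytes")
    bits = int.from_bytes(key7, "big")          # 56 key bits, MSB first
    out = bytearray()
    for i in range(8):
        seven = (bits >> (49 - 7 * i)) & 0x7F   # next group of seven bits
        byte = seven << 1                       # parity lives in the low bit
        if bin(seven).count("1") % 2 == 0:      # odd parity: make the count odd
            byte |= 1
        out.append(byte)
    return bytes(out)


def response_keys(hash16: bytes) -> list:
    """Pad the 16-byte hash to 21 bytes and cut it into three DES keys."""
    if len(hash16) != 16:
        raise ValueError("expected a 16-byte hash")
    padded = hash16 + b"\x00" * 5               # 16 + 5 = 21 = 3 * 7
    return [expand_des_key(padded[i:i + 7]) for i in (0, 7, 14)]
```

Each of the three expanded keys is then used to DES-encrypt the same eight-byte challenge, and the three eight-byte results are concatenated to form the 24-byte response.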
Once again, we provide demonstrative code. Listing 2.7 shows how the LM Response would be generated.

The server, which has the username and associated LM Hash tucked away safely in its authentication database, also generates the 24-byte response string. When the client's response arrives, the server compares its own value against the client's. If they match, then the client has authenticated.

Under User Level security, the client sends its LM Response in the SESSION_SETUP_ANDX.CaseInsensitivePassword field of the SESSION SETUP request (yes, the LM response is in the SESSION SETUP REQUEST). With Share Level security, the LM Response is placed in the TREE_CONNECT_ANDX.Password field.

2.8.3.5 LM Challenge/Response: Once More With Feeling

The details sometimes obfuscate the concepts, and vice versa. We have presented a general overview of the challenge/response mechanism, as well as the particular formulae of the LAN Manager scheme. Let's go through it once again, quickly, just to put the pieces together and cover anything that we may have missed.
Well, that's a lot of work and it certainly goes a long way towards
looking complicated. Unfortunately, looking complicated isn't enough
to truly protect a password. LM challenge/response is an improvement
over plaintext, but there are some problems with the formula and it
turns out that it is not, in fact, a very big improvement.
...and we'll have fun, fun, fun
'till somebody takes the keyboard away.
-- Not quite as The Beach Boys intended.
Let's consider what an attacker might do to try and break into a system. We've already explained the replay attack. Other common garden varieties include the "dictionary" and the "brute force" attack, both of which simply try pushing possible passwords through the algorithm until one of them returns the same response seen on the wire. The dictionary attack is typically faster because it uses a database of likely passwords, so tools tend to try this first. The brute force method tries all (remaining) possible combinations of bytes, which is usually a longer process.

Unfortunately, all of the upper-casing, nul-padding, chopping, and concatenating used in the LM algorithm makes LM challenge/response very susceptible to these attacks. Here's why:

The LM Hash formula pads the original password with nul bytes. If the password is short enough (seven or fewer characters) then, when the 14-byte padded password is split into two seven-byte DES keys, the second key will always be a string of seven nuls. Given the same input, DES produces the same output...

    0xAAD3B435B51404EE = DES( "\0\0\0\0\0\0\0", "KGS!@#$%" )

...which results in an LM Hash in which the second set of eight bytes are known:
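The effect is easy to see in a small sketch. The function below (a hypothetical helper, not part of any real library) performs only the upper-casing, padding, and splitting. It assumes a plain ASCII password and simply truncates anything longer than 14 bytes, both of which are simplifications made for the sake of the example.

```python
def lm_hash_keys(password: str) -> tuple:
    """Return the two 7-byte DES keys derived from an LM password."""
    # Upper-case, then force the result to exactly 14 bytes. Short passwords
    # are padded with nul bytes; truncating longer input is an assumption
    # made here to keep the sketch simple.
    raw = password.upper().encode("ascii")[:14].ljust(14, b"\x00")
    return raw[:7], raw[7:]
```

For a six-character password such as "secret", the first key is "SECRET" plus one nul and the second key is seven nuls, so the second half of the LM Hash is always the constant shown above.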
To create the LM Response, the LM Hash is padded with nuls to 21 bytes, and then split again into three DES keys:
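As a rough illustration, the sketch below builds the 21-byte string for a short password and counts how many bytes of each key an attacker already knows. The helper name and the way the "known" bytes are tallied are our own; only the constant and the 21-byte padding come from the description above.

```python
KNOWN_HALF = bytes.fromhex("AAD3B435B51404EE")  # DES of "KGS!@#$%" under a nul key


def known_key_bytes(first_half: bytes) -> list:
    """Split the padded hash into three 7-byte keys and count known bytes."""
    if len(first_half) != 8:
        raise ValueError("the first half of the hash must be eight bytes")
    lm_hash = first_half + KNOWN_HALF            # hash of a password <= 7 chars
    padded = lm_hash + b"\x00" * 5               # 16 bytes padded to 21
    known = [False] * 8 + [True] * 13            # only the first 8 are secret
    return [
        (padded[i:i + 7], sum(known[i:i + 7]))   # (key, count of known bytes)
        for i in (0, 7, 14)
    ]
```

The three counts come out as 0, 6, and 7: thirteen of the twenty-one key bytes are fixed before the attacker has done any work at all.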
...and now the problem is obvious. If the original password was seven bytes or less, then almost two-thirds of the encryption key used to generate the LM Response will be a known, constant value. The password cracking tools leverage this information to reduce the size of the keyspace (the set of possible passwords) that needs to be tested to find the password.

Less obvious, but clear enough if you study the LM Response algorithm closely, is that short passwords are only part of the problem. Because the hash is created in pieces, it is possible to attack the password in 7-byte chunks even if it is longer than 7 bytes.

Converting to upper-case also diminishes the keyspace, because lower-case characters do not need to be tested at all. The smaller the keyspace, the faster a dictionary or brute-force attack can run through the possible options and discover the original password [46].

2.8.4 NTLM Challenge/Response

At some point in the evolution of Windows NT a new, improved challenge/response formula was introduced. It was similar to the LAN Manager version, with the following changes:
...and that's basically it. The rest of the formula is the same. So what does it buy us? The first advantage of NTLM is that the passwords are more complex. They're mixed case and in Unicode, which means that the keyspace is much larger. The second advantage over LM is that the MD4() function doesn't require fixed length input. That means no padding bytes and no chopping to over-simplify the keys. The NTLM Hash itself is more robust than the LM Hash, so the NTLM Response is much more difficult to reverse. Unfortunately, the NTLM Response is still created using the same algorithm as is used with LM, which provides only 56-bit encryption. Worse, clients often include both the NTLM Response and the LM Response (derived from the weaker LM Hash) in the SESSION SETUP ANDX REQUEST. They do this to maintain backward-compatibility with older servers. Even if the server refuses to accept the LM Response, the client has sent it. Ouch.
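To put rough numbers on the keyspace difference, here is a back-of-the-envelope sketch. The figures rest on assumptions of our own: a 95-character printable ASCII alphabet, 26 of which vanish when lower-case is folded to upper-case, and a 14-character limit applied to both schemes for the sake of comparison. Unicode only makes the mixed-case number larger.

```python
def keyspace(alphabet_size: int, max_len: int) -> int:
    """Count every string of length 1..max_len over the given alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


PRINTABLE = 95                # printable ASCII characters (assumption)
UPPER_ONLY = PRINTABLE - 26   # lower-case letters fold onto upper-case

# LM: each 7-byte half can be attacked on its own, so the work is at most
# two searches of the 7-character, upper-case-only space.
lm_work = 2 * keyspace(UPPER_ONLY, 7)

# A 14-character, mixed-case password that must be attacked as one unit.
mixed_work = keyspace(PRINTABLE, 14)
```

Under these assumptions lm_work is on the order of 10^13 candidates while mixed_work is on the order of 10^27, which is why the chopping and case-folding matter so much.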
2.8.5 NTLM Version 2
There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another which states that this has already happened.
-- Douglas Adams, The Restaurant at the End of the Universe
NTLMv2, as it's called, has some additional safeguards thrown into the recipe that make it more complex--and hopefully more secure--than its predecessors. There are, however, two small problems with NTLMv2:
Regarding the first point, Appendix B of Luke K. C. Leighton's book DCE/RPC over SMB: Samba and Windows NT Domain Internals provides a recipe for NTLMv2 authentication. We'll do our best to expand on Luke's description. The other option, of course, is to look at available Open Source code.

The second point is really conjecture, based in part on the fact that it took a very long time to get NTLMv2 implemented in Samba and few seemed to care. Indeed, NTLMv2 support had already been added to Samba-TNG by Luke and crew, and needed only to be copied over. It seems that the delay in adding it to Samba was not a question of know-how, but of priorities.

Another factor is that NTLMv2 is not required by default on most Windows systems. When challenge/response is negotiated, even newer Windows versions will default to using the LM/NTLM combination unless they are specifically configured not to.

2.8.5.1 The NTLMv2 Toolbox

We have already fussed with the DES algorithm and toyed with the MD4 algorithm. Now we get to use the HMAC-MD5 Message Authentication Code hash. This one's a power tool with razor-sharp keys and swivel-action hashing. The kind of thing your Dad would never let you play with when you were a kid. Like all good tools, though, it's neither complex nor dangerous once you learn how it works.

HMAC-MD5 is actually a combination of two different algorithms: HMAC and MD5. HMAC is a Message Authentication Code (MAC) algorithm that takes a hashing function (such as MD5) and adds a secret key to the works so that the resulting hash can be used to verify the authenticity of the data. The MD5 algorithm is basically an industrial-strength version of MD4. Put them together and you get HMAC-MD5.

HMAC-MD5 is quite well documented [48], and there are a lot of implementations available. It's also much less complicated than it appears in figure 2.13, so we won't need to go into any of the details.
For our purposes, what you need to know is that the HMAC_MD5() function takes a key and some source data as inputs, and returns a 16-byte (128-bit) output.

Hmmm... Well, it's not actually quite that simple. See, MD4, MD5, and HMAC-MD5 all work with variable-length input, so they also need to know how big their input parameters are. The function call winds up looking something like this:

    hash16 = HMAC_MD5( Key, KeySize, Data, DataSize );

There is, as it turns out, more than one way to skin an HMAC-MD5. Some implementations use a whole set of functions to compute the result:
Conceptually, though, the multi-function approach is the same as the simpler example shown above. That is: Key and Data in, 16-byte hash out.
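Both styles can be tried out with Python's standard hmac and hashlib modules, as in the sketch below. The wrapper names are ours. Python byte strings carry their own length, so the KeySize and DataSize parameters disappear.

```python
import hashlib
import hmac


def hmac_md5(key: bytes, data: bytes) -> bytes:
    """One-shot form: key and data in, 16-byte hash out."""
    return hmac.new(key, data, hashlib.md5).digest()


def hmac_md5_chunks(key: bytes, chunks) -> bytes:
    """Multi-call form: feed the data in pieces, then collect the result."""
    ctx = hmac.new(key, digestmod=hashlib.md5)   # initialise with the key
    for chunk in chunks:
        ctx.update(chunk)                        # add more source data
    return ctx.digest()                          # finish and read the hash
```

Feeding b"Hi " and b"There" to the multi-call form gives the same 16 bytes as passing b"Hi There" to the one-shot form, which is a handy sanity check when porting to another HMAC-MD5 library.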
Another important tool is the older NTLM hash algorithm. It was described earlier but it is simple enough that we can present it again, this time in pseudo-code:

    uchar *NTLMhash( uchar *password )
      {
      UniPasswd = UCS2LE( password );
      KeySize   = 2 * strlen( password );
      return( MD4( UniPasswd, KeySize ) );
      }

The ASCII password is converted to Unicode UCS-2LE format, which requires two bytes per character. The KeySize is simply the length of that (Unicode) password string, which we calculate here by doubling the ASCII string length (which is probably cheating). Finally, we generate the MD4 hash (that's MD4, not MD5) of the password, and that's all there is to it.

Note that the string terminator is not counted in the KeySize. That is common behavior for NTLM and NTLMv2 challenge/response when working with Unicode strings.

The NTLM Hash is of interest because the SMB/CIFS designers at Microsoft (if indeed such people truly exist any more, except in legend) used it to cleverly avoid upgrade problems. With LM and NTLM, the hash is created from the password. Under NTLMv2, however, the older NTLM (v1) Hash is used instead of the password to generate the new hash. A server or Domain Controller being upgraded to use NTLMv2 may already have the older NTLM hash values in its authentication database. The stored values can be used to generate the new hashes--no password required. That avoids the nasty chicken-and-egg problem of trying to upgrade to NTLMv2 Hashes on a system that only allows NTLMv2 authentication.

2.8.5.2 The NTLMv2 Password Hash

The NTLMv2 Hash is created from:
The process works as shown in the following pseudo-code example:

    v1hash  = NTLMhash( password );
    UniUser = UCS2LE( upcase( user ) );
    UniDest = UCS2LE( upcase( destination ) );
    data    = uni_strcat( UniUser, UniDest );
    datalen = 2 * (strlen( user ) + strlen( destination ));
    v2hash  = HMAC_MD5( v1hash, 16, data, datalen );

Let's clarify that, shall we?
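One way to clarify it is to turn the pseudo-code into something that runs. The sketch below follows the recipe step for step, with two assumptions of our own: the NTLM (v1) Hash is passed in ready-made (MD4 is not reliably available in Python's hashlib), and Python's "utf-16-le" codec stands in for UCS-2LE, which gives the same bytes for the characters found in NetBIOS and user names.

```python
import hashlib
import hmac


def ntlmv2_hash(v1hash: bytes, user: str, destination: str) -> bytes:
    """Derive the 16-byte NTLMv2 Hash from an existing NTLM (v1) Hash."""
    if len(v1hash) != 16:
        raise ValueError("the NTLM (v1) Hash must be 16 bytes")
    # Upper-case both names, join them (user first), and encode the result
    # as two bytes per character with no string terminator.
    data = (user.upper() + destination.upper()).encode("utf-16-le")
    return hmac.new(v1hash, data, hashlib.md5).digest()
```

Because both names are upper-cased before hashing, "fred" and "FRED" produce the same NTLMv2 Hash for a given destination.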
A bit more explanation is required regarding the destination value (which gets converted to UniDest).

In theory, the client can use NTLMv2 challenge/response to log into a stand-alone server or to log into an NT Domain. In the former case, the server will have an authentication database of its very own, but an NT Domain logon requires authentication against the central database maintained by the Domain Controllers. So, in theory, the destination name could be either the NetBIOS name of the stand-alone server or the NetBIOS name of the NT Domain (no NetBIOS suffix byte in either case). In practice, however, the server logon doesn't seem to work reliably. The Windows systems used in testing were unable to use NTLMv2 authentication with one another when they were in stand-alone mode, but once they joined the NT Domain NTLMv2 logons worked just fine [49].

2.8.5.3 The NTLMv2 Response

The NTLMv2 Response is calculated using the NTLMv2 Hash as the Key. The Data parameter is composed of the challenge plus a blob of data which we will refer to as "the blob". The blob will be explained shortly. For now, just think of it as a mostly-random bunch of garblement. The formula is shown in this pseudo-code example:

    blob   = RandomBytes( blobsize );
    data   = concat( ServerChallenge, 8, blob, blobsize );
    hmac   = HMAC_MD5( v2hash, 16, data, (8 + blobsize) );
    v2resp = concat( hmac, 16, blob, blobsize );

Okay, let's take a closer look at that and see if we can force it to make some sense.
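A runnable version helps here, too. The sketch below mirrors the pseudo-code and adds the matching server-side check, which works because the blob travels in the clear behind the HMAC. The function names are hypothetical, and random_blob() treats the blob as nothing more than random bytes, exactly as the pseudo-code does.

```python
import hashlib
import hmac
import secrets


def random_blob(size: int) -> bytes:
    """Stand-in for the blob: just random bytes, as in the pseudo-code."""
    return secrets.token_bytes(size)


def ntlmv2_response(v2hash: bytes, server_challenge: bytes, blob: bytes) -> bytes:
    """Client side: HMAC the challenge plus blob, then append the blob."""
    mac = hmac.new(v2hash, server_challenge + blob, hashlib.md5).digest()
    return mac + blob


def verify_ntlmv2(v2hash: bytes, server_challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the HMAC from the blob carried in the response."""
    mac, blob = response[:16], response[16:]
    expected = hmac.new(v2hash, server_challenge + blob, hashlib.md5).digest()
    return hmac.compare_digest(mac, expected)   # constant-time comparison
```

The response is always 16 bytes longer than the blob, and a response computed against one challenge fails verification against any other.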
If the client sends the NTLMv2 Response, it will take the place of the NTLM Response in the SESSION_SETUP_ANDX.CaseSensitivePassword field. Note that, unlike the older NTLM Response, the NTLMv2 Response algorithm uses 128-bit encryption all the way through.

2.8.5.4 Creating The Blob

If you have ever taken a college-level Invertebrate Zoology course, you may find the dissection of the blob to be nauseatingly familiar. The rest of you... try not to be squeamish. One more warning before we cut into this: The blob's structure may not matter at all. We'll explain why a little later on.

Okay, now that the disclaimers are out of the way, we can get back to work. The blob does have a structure, which is more or less as follows:
The list of names near the end of the blob may contain the NT Domain and/or the server name. As with the names used to generate the NTLMv2 Hash, these are NetBIOS names in upper-case UCS-2LE Unicode with no string termination and no suffix byte. The name list also has a structure:
The blob structure is probably related to (the same as?) data formats used in the more advanced security systems available under Extended Security [50].

2.8.5.5 Improved Security Through Confusion

...they have weapons of mass confusion and aren't afraid to use them.
-- iomud on Slashdot
Now that we have the formula worked out, let's take a closer look at the NTLMv2 challenge/response algorithm and see how much better it is than NTLM.

With the exception of the password itself, all of the inputs to NTLMv2 are known or knowable from a packet capture. Even the blob can be read off the wire, since it is sent as part of the response. That means that the problem is still a not-so-simple case of solving for a single variable: the password.

The NTLMv2 Hash is derived directly from the NTLM (v1) Hash. Since there is no change to the initial input (the password) the keyspace is exactly the same. The only change is that the increased complexity of the algorithm means that there are more encryption hoops through which to jump than in the simpler NTLM process. It takes more computer time to generate a v2 response, which doesn't impact a normal login but will slow down dictionary and brute force attacks against NTLMv2 (though Moore's Law may compensate). Weak passwords (those that are near the beginning of the password dictionary) are still vulnerable.

Another thing to consider is the blob. If the blob were zero length (empty), the NTLMv2 Response formula would reduce to:

    v2resp = HMAC_MD5( v2hash, ServerChallenge );

...which would still be pretty darn secure. So the question is this: Does the inclusion of the blob improve the NTLMv2 algorithm and, if so, how? Well, see, it's like this... Instead of being produced by the key and challenge alone, the NTLMv2 Response involves the hash of a chunk of semi-random data. As a result, the same challenge will not always generate the same response. That's good, because it prevents replay attacks...in theory.

In practice, the randomness of the challenge should be enough to prevent replay attacks. Even if that were not the case, the only way that the blob could help would be if it, too, were non-repeating and if the server could somehow verify that the blob was not a repeat.
That, quite possibly, is why the timestamp is included. The timestamp could be used to let the server know that the blob is "fresh". That is, that it was created a reasonably short amount of time before it was received. Fresh packets can't easily be forged because the response is HMAC-signed using the v2hash as the key (and that's based on the password which is the very thing the cracker doesn't know). Of course, the timestamp test won't work unless the client and server clocks are synchronized, which is not always the case. In all likelihood the contents of the blob are never tested at all. There is code and commentary in the Samba-TNG source that shows that they have done some testing, and that their results indicate that a completely random blob of bytes works just fine. If that's true, then the blob does little to improve the security of the algorithm except perhaps by adding a few more CPU cycles to the processing time.
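If a server did want to test for freshness, the check itself would be tiny. The sketch below is entirely hypothetical: it assumes that both timestamps have already been converted to seconds, and the five-minute window is a number we picked out of thin air, not a value taken from any known implementation.

```python
def blob_is_fresh(blob_time: float, server_time: float,
                  max_age: float = 300.0) -> bool:
    """Accept a blob only if its timestamp is close to the server's clock."""
    # abs() tolerates a client clock that runs slightly ahead of the server.
    return abs(server_time - blob_time) <= max_age
```

The max_age parameter makes the clock-synchronization problem plain: make the window too small and honest clients with drifting clocks are locked out, make it too large and the test stops meaning anything.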
...it is a tale
Told by an idiot, full of sound and fury,
signifying nothing.
-- Macbeth, Act V, Scene v, William Shakespeare
This isn't the first time that we have put a lot of effort into figuring out some complex piece of the protocol only to discover that it's almost pointless, and it probably won't be the last time either.
2.8.5.6 Insult to Injury: LMv2

There is yet one more small problem with the NTLMv2 Response, and that problem is known as pass-through authentication. Simply put, a server can pass the authentication process through to an NT Domain Controller. The trouble is that some servers that use pass-through assume that the response string is only 24 bytes long.

You may recall that both the LM and NTLM responses are, in fact, 24 bytes long. Because of the blob, however, the NTLMv2 response is much longer. If a server truncates the response to 24 bytes before forwarding it to the NT Domain Controller almost all of the blob will be lost. Without the blob, the Domain Controller will have no way to verify the response so authentication will fail.

To compensate, a simpler response--known as the LMv2 response--is also calculated and returned alongside the NTLMv2 response. The formula is identical to that of NTLMv2, except that the blob is really small.

    blip     = RandomBytes( 8 );
    data     = concat( ServerChallenge, 8, blip, 8 );
    hmac     = HMAC_MD5( v2hash, 16, data, 16 );
    LMv2resp = concat( hmac, 16, blip, 8 );

The "blip", as we've chosen to call it, is sometimes referred to as the "Client Challenge". If you go back and look, you'll find that the blip value is also included in the blob, just after the timestamp. It is fairly easy to spot in packet captures. The blip is 8 bytes long so that the resulting LMv2 Response will be 24 bytes, exactly the number needed for pass-through authentication. If it is true that the contents of the blob are not checked, then the LMv2 Response isn't really any less secure than the NTLMv2 Response--even though the latter is bigger.

The LMv2 Response takes the place of the LM Response in the SESSION_SETUP_ANDX.CaseInsensitivePassword field.

2.8.5.7 Choosing NTLMv2

The use of NTLMv2 is not negotiated between the client and server. There is nothing in the protocol to determine which challenge/response algorithms should be used. So, um... how does the client know what to send, and how does the server know what to expect?

The default behavior for Windows clients is to send the LM and NTLM responses, and the default for Windows servers is to accept them. Changing these defaults requires fiddling in the Windows registry. Fortunately, the fiddles are well known and documented so we can go through them quickly and get them out of the way [51]. The registry path to look at is:

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\LSA

On Win9x the variable is called LMCompatibility, but on Windows NT and 2000 it is LMCompatibilityLevel. That variable may not be present in the registry, so you might have to add it. In general, it's best to follow Microsoft's instructions when editing the registry [52].

The settings for LMCompatibilityLevel are as follows:
That's just a quick overview of the settings and their meanings. The important points are these:
2.8.6 Extended Security: That Light at the End of the Tunnel

Our discussion of SMB authentication mechanisms is winding down now. There are a few more topics to be covered and a few others that will be carefully, but purposefully, avoided. Extended Security falls somewhere in between. We will dip our toes into its troubled waters, but we won't wade in too deep (or the monsters might get us).

One reason for trepidation is that--as of this writing--Extended Security is still an area of active research and development for the Samba Team and others. Though much has been learned, and much has been implemented, the dark pools are still being explored and the fine points are still being examined. Another deterrent is that Extended Security represents a full set of sub-protocols--a whole, vast world of possibilities to be explored ...some other day. As with MS-RPC (which we touched on just long enough to get our fingers burned), the topic is simply too large to cover here.

As suggested in figure 2.14, Extended Security makes use of nested protocols. Go back to section 2.6.3.2 and take a look at the NEGOTIATE_PROTOCOL_RESPONSE.SMB_DATA structure. Note that the ext_sec.SecurityBlob field is nothing more than a block of bytes--and it's what's inside that block that matters. If the client and server agree to use Extended Security, then the whole NEGOTIATE PROTOCOL RESPONSE / SESSION SETUP REQUEST business becomes a transport for the authentication protocol.

In some cases the security exchange may require several packets and
a few round trips to complete. When that happens, a single
NEGOTIATE PROTOCOL RESPONSE / SESSION SETUP REQUEST
pair will not be sufficient to handle it all. The solution to this
dilemma is fairly simple: The server sends an error message to force
the client to send another SESSION SETUP REQUEST containing
the next chunk of data.
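From the client's point of view the result is a simple loop, sketched below. The "error message" in question is the NT_STATUS code STATUS_MORE_PROCESSING_REQUIRED (0xC0000016). Everything else in the sketch is a placeholder of our own: send_session_setup() stands for whatever sends a SESSION SETUP REQUEST and returns the status and the server's blob, and next_token() stands for the authentication mechanism that digests the server's blob and produces the next chunk.

```python
STATUS_SUCCESS = 0x00000000
STATUS_MORE_PROCESSING_REQUIRED = 0xC0000016


def run_security_exchange(send_session_setup, next_token, first_blob: bytes) -> bool:
    """Drive SESSION SETUP round trips until the server stops asking for more.

    send_session_setup(blob) -> (nt_status, server_blob)   hypothetical transport
    next_token(server_blob)  -> bytes                      hypothetical mechanism
    """
    blob = first_blob
    while True:
        status, server_blob = send_session_setup(blob)
        if status == STATUS_MORE_PROCESSING_REQUIRED:
            blob = next_token(server_blob)      # build the next chunk and loop
            continue
        return status == STATUS_SUCCESS         # anything else ends the exchange
```

A real client would also want to cap the number of round trips so that a misbehaving server cannot keep it looping forever.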
The only spec I trust is written in C.
-- Andrew Bartlett, Samba Team
The process is briefly (and incompletely) described in section 4.1.2 of the SNIA doc as part of the discussion of the SESSION SETUP RESPONSE. Simply put, as long as there are more Extended Security packets required, the server will reply to the SESSION SETUP REQUEST by sending a NEGATIVE SESSION SETUP RESPONSE with an NT_STATUS value of 0xC0000016 (which is known as STATUS_MORE_PROCESSING_REQUIRED). The client then sends another SESSION SETUP REQUEST containing the additional data. This continues until the authentication protocol has completed.

There is no DOS error code equivalent for STATUS_MORE_PROCESSING_REQUIRED, something we have already whined about in the "!Strange Behavior Alert" box back in section 2.5.1. It seems that Extended Security expects that the client can handle NT_STATUS codes, which may be a significant issue for anyone trying to implement an SMB client [53].

2.8.6.1 The Extended Security Authentication Toolkit

There are several different authentication protocols which may
be carried within the SecurityBlob. Those protocols, in
turn, are built on top of a whole pile of different languages and APIs
and data transfer formats. The result is an alphabet soup of
acronyms. Here's a taste:

...mit Schlag.
Quite a list, eh?
I am friend to the undertow
I take you in, I don't let go
-- Undertow, Suzanne Vega
As you can see, there is a lot going on below the surface of Extended Security. We could try diving into a few of the above topics, but the waters are deep and the currents are strong and we would quickly be swept away. Out of necessity, we will spend a little time talking about Kerberos, but we won't swim out too far and we will be wearing a PFD (Personal Flotation Device--don'cha just love acronyms?).

2.8.7 Kerberos

As already stated, we won't be going into depth about Kerberos. There is a lot of documentation available on the Internet and in print, so the wiser course is to suggest some starting points for research. There are, of course, several starting points presented in the References section of this very book. A good place to get your feet wet is Bruce Schneier's Applied Cryptography, Second Edition.

Kerberos version 5 is specified in RFC 1510, but this is CIFS we're talking about. Microsoft has made a few "enhancements" to the standard. The best known is probably the inclusion of a proprietary Privilege Attribute Certificate (PAC) which carries Windows-specific authorization information. Microsoft heard a lot of grumbling about the PAC, and in the end they did publish the information required by third-party implementors. They even did so under acceptable licensing terms (and the CIFS community sighed a collective sigh of relief). The PAC information is available in a Microsoft Developer Network (MSDN) document entitled Windows 2000 Authorization Data in Kerberos Tickets.

There are a lot of Kerberos-related RFCs. The interesting ones for our purposes are:
There is also (as of this writing) a set of Internet Drafts that cover Microsoft Kerberos features, including a draft for Kerberos authentication over HTTP. Finally, a web search for "Microsoft" and "Kerberos" will toss up an abundant salad of opinions and references, both historical and contemporary.

Where CIFS is concerned, it seems that there is always either too little or too much information. Microsoft-compatible Kerberos falls under the latter curse. There is a lot of stuff out there, and it is easy to get overwhelmed. If you plan to dive in, find a buddy. Don't swim alone.

2.8.8 Random Notes on W2K and NT Domain Authentication

We have been delicately dancing around the role of the Domain Controller in authentication. It's time to face the music.

The concept is fairly simple: Take the password database that is normally kept locally by a stand-alone server and move it to a central authority so that it can be shared by multiple servers, then call the whole thing a "Domain". The central authority that stores the shared database is, of course, the Domain Controller. As shown in figure 2.15, the result is that the SMB fileserver must now consult the Domain Controller when a user tries to access SMB services.

That general description applies to both NT and W2K Domains, even though the two are implemented in very different ways. Windows 2000 Domains are based on Active Directory and Kerberos, while Windows NT Domains make use of a Security Accounts Manager (SAM) Database and MS-RPC. Let's see what bits of wisdom we can pull out of the hat regarding these two Domain systems...

2.8.8.1 A Quick Look at W2K Domains

As with Microsoft's Kerberos implementation, there is probably too much information available on this topic. A full description would also be very much beyond the stated scope of this book. So, as briefly as possible, here are some notes about W2K Domains and Domain Controllers:
...and that barely begins to scratch the surface.

CIFS client and server participation in a W2K Domain requires Kerberos support, but does not require a detailed understanding of Active Directory architecture. The points above are of interest here primarily for comparison with the NT Domain system notes, presented below.

2.8.8.2 A Few Notes about NT Domains

In contrast to W2K Domains, NT Domains have the following features:
There are two mechanisms that an SMB Server can use to ask a Domain Controller to validate a client logon attempt. These are known as pass-through and NetLogon authentication. The NetLogon mechanism uses MS-RPC, so we won't cover it here except to say that it provides a more intimate relationship between the SMB server and the Domain Controller than does the pass-through mechanism. There are several good sources for further reading listed in the References section. In particular:
Pass-through, in contrast to NetLogon, is really quite simple. It is also documented in (yet) an(other) expired Leach/Naik IETF draft, titled CIFS Domain Logon and Pass Through Authentication, which can be found on Microsoft's CIFS FTP site (under the name cifslog.txt). Basically, pass-through authentication is a man-in-the-middle mechanism. It goes like this:
It should be easy to capture an example of pass-through authentication using your network sniffer. Windows9x systems (and possibly other Windows varieties) do not support NetLogon so they always use the pass-through method if they are part of an NT Domain. Samba can be configured to use either method.
2.8.8.3 It's Good to Have a Backup
In the NT Domain system, there is a single Domain Controller that is primarily responsible for the maintenance of the domain's SAM database. This Domain Controller is known as the (surprise) Primary Domain Controller (PDC). The domain may also have zero or more Backup Domain Controllers (BDCs). The BDCs keep read-only replicas of the PDC's SAM database. BDCs can be used for authentication just as the PDC can, and if the PDC is accidentally thrown out of a twelfth-story window into an active volcano, a BDC can be "promoted" to fill the role of the dearly departed PDC.
Windows2000 Domains do things differently. They do not distinguish between Primary and Backup DCs. Instead, Active Directory makes use of something called "multimaster replication". Updates to any replica are propagated to all of the other replicas, so there is no longer any need to specify one copy of the database as the primary.
2.8.8.4 Trust Me On This
This is one of those concepts that we have to cover because--unless you're already familiar with it--you'll read about it somewhere else and think to yourself "What the heck is that all about?".
Somewhere back a few paragraphs it was stated that NT Domains are, conceptually, stand-alone entities ...and so they are, but it is possible to introduce them to one another and get them to cooperate. The agreements forged between the domains are known as "Inter-Domain Trust Relationships".
Let's use an example to explain what this is all about. Consider a large corporate organization with several divisions, departments, committees, consultants, and such-like. In this corporation, the Business Units Reassignment Planning Division runs the BURP_DIV domain, and the Displacement Entry Department calls theirs the DISENTRY domain. Now, let's say that the BURP_DIV folks need access to files stored on DISENTRY servers (so they can move the files around a bit).
One way to handle this would be to create accounts for the BURP_DIV users in the DISENTRY domain. That would cause a bit of a problem, however, because the BURP_DIV users would need two accounts, one per domain. That is likely to result in things like passwords, preferences, and web browser bookmarks getting a bit out of sync. Also, the Benefits Reduction Committee will want to know why all of the BURP_DIV employees are moonlighting in the DISENTRY department and how they could possibly be doing two jobs at once. It could become quite a mess, resulting in the hiring of dozens of consultants to ensure that the problem is properly ignored.
The better way to handle this situation is to create a trust relationship between the DISENTRY and BURP_DIV domains. With inter-domain trust established, the BURP_DIV folks can log on to DISENTRY servers using their BURP_DIV credentials. As shown in figure 2.16, the DISENTRY Domain Controller will ask the BURP_DIV Domain Controllers to validate the logon.
Note that, in the non-extended-security version of the SESSION SETUP REQUEST message, there is a field called PrimaryDomain. This field identifies the NT domain against which the client wishes to authenticate. That is, the PrimaryDomain field should contain the name of the NT Domain to which the user belongs.
Windows2000 domains also support trust relationships. This is useful for creating trust between two separate W2K Domain trees, or between W2K Domains and NT Domains. The mechanisms used to support inter-domain trust are very advanced topics, and won't be covered here.
2.8.9 Random Notes on Message Authentication Codes
Message Authentication Codes (MACs) are used to prevent "pickle-in-the-middle" attacks (more commonly known as "man-in-the-middle" attacks 54). This form of attack is simple to describe, but it can be difficult to pull off in practice (though wireless LAN technology has the potential to make it much easier). Figure 2.17 provides some visuals.
Generally speaking, in the pickle-in-the-middle attack an evil interloper allows the "real" client to authenticate with the server and then assumes ownership of the TCP/IP connection, thus bypassing the whole problem of needing to know the password. There are a number of ways to hijack the TCP session, but with SMB
that step isn't necessary. Instead, the evil interloper can simply
impersonate the server to fool the client. For instance, if the evil
interloper is on the same IP subnet as both the client and server (a
B-mode network) then it can usurp the server's name by responding to
broadcast name queries sent by the client faster than the server does.
Server identity theft can also be accomplished by "poisoning"
the NBNS database (or, possibly, the DNS). That is, by somehow
forcing it to swallow false information. A simple way to do that is
to register the server's name--with the interloper's IP address--in
the NBNS before the server does (perhaps by registering the name while
the server is down for maintenance or something).
Proof by Familiarity: Well, that looks familiar! -- Jonathan Young, PhD.
In any case, when the client tries to open an SMB session with the server it may wind up talking to the evil interloper instead. The evil interloper will pass the authentication request through to the real server, then pass the challenge back to the client, and then pass the client's response to the server... and that... um... um... um... that looks exactly like pass-through authentication. In fact, the basic difference between pass-through authentication and this type of attack is the ownership of the box that is relaying the authentication. If the crackers control the box, consider it an attack. This authentication stuff is fun, isn't it? So, given a situation in which you are concerned about evil interlopers gaining access to your network, you need a mechanism that allows the client and server to prove to one another on an ongoing basis that they are the real client and server. That's what the MACs are supposed to do.
2.8.9.1 Generating the Session Key
The server and client each generate a special key, known as the Session Key. There are several potential uses for the Session Key, but we will only be looking at its use in MAC signing.
The Session Key is derived from the password hash--something that only the client and server should know. There are several hash types available: LM, NTLM, LMv2, and NTLMv2. The hash chosen is probably the most advanced hash that the two systems know they share. So, if the client sent an LM Response--but did not send an NTLM Response--then the Session Key will be based on the LM Hash. The LM Session Key is calculated as follows:
    char eightnuls[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
    LM_Session_Key = concat( LM_Hash, 8, eightnuls, 8 );
That is, take the first eight bytes of the LM Hash and add eight nul bytes to the end for a total of 16 bytes. Note that the resulting Session Key is not the same as the LM Hash itself. As stated earlier, the password hashes can be used to perform all of the authentication functions we have covered so far, so they must be protected as if they were the actual password. Overwriting the last eight bytes of the hash with zeros serves to obfuscate the hash (though this method is rather weak).
A different formula is used if the client did send an NTLM Response. The NTLM Session Key is calculated like so:
    NTLM_Session_Key = MD4( NTLM_Hash );
Which means that the NTLM Session Key is the MD4 of the MD4 of the Unicode password. The SNIA doc says there's only one MD4, but that would make the NTLM Session Key the same as the NTLM Hash. Andrew Bartlett of the Samba Team says there are two MD4s; the second does a fine job of protecting the password-equivalent NTLM Hash from exposure.
Moving along to LMv2 and NTLMv2, we find that the Session Key recipe is slightly more complex, but it's all stuff we have seen before. We need the following ingredients:
The LMv2 and NTLMv2 session keys are computed as follows:
    LMv2_Session_Key = HMAC_MD5( v2hash, 16, lmv2_hmac, 16 );
    NTLMv2_Session_Key = HMAC_MD5( v2hash, 16, ntlmv2_hmac, 16 );
The client is able to generate the Session Key because it knows the password and other required information (because the user entered the required information at the logon prompt). If the server is stand-alone, it will have the password hash and other required information in its local SAM database, and can generate the Session Key as well. On the other hand, if the server relies upon a Domain Controller for authentication then it won't have the password hash and won't be able to generate the Session Key.
What's a server to do?
As we have already pointed out, the MAC protocol is designed to prevent a situation that looks exactly like pass-through authentication, so a pass-through server simply cannot do MAC signing. A NetLogon-capable server, however, has a special relationship with the Domain Controller. The NetLogon protocol is secured, so the Domain Controller can generate the Session Key and send it to the server. That's how an NT Domain member server gets hold of the Session Key without ever knowing the user's password or password hash.
2.8.9.2 Sequence Numbers
Both the client and server maintain an integer counter which they initialize to zero. This counter is used as a message sequence number, and it gets incremented for every message such that requests always have an even sequence number and replies always have an odd.
The zero-eth message is always a SESSION SETUP ANDX message, but it may not be the first SESSION SETUP ANDX of the session. Recall, from near the beginning of the Authentication section, that the client sometimes uses an anonymous or guest logon to access server information. Watch enough packet captures and you will see that MAC signing doesn't really start until after a real user logon occurs.
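To make the bookkeeping a little more concrete, here is a minimal C sketch of two of the pieces described so far: the LM Session Key and the even/odd sequence counter. It is an illustration under stated assumptions, not a reference implementation. The mac_state type and all of the function names are made up for this example, the 32-bit counter width is a guess (the description above only says "integer"), and the sketch pretends that every request receives exactly one reply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SESSION_KEY_LEN 16

/* Hypothetical per-session signing state; the names are illustrative only. */
typedef struct {
    uint8_t  session_key[SESSION_KEY_LEN];
    uint32_t next_seq;              /* assumed width: the text says "integer" */
} mac_state;

/* LM Session Key: the first 8 bytes of the 16-byte LM Hash, then 8 nul bytes. */
static void lm_session_key(const uint8_t lm_hash[16],
                           uint8_t key_out[SESSION_KEY_LEN])
{
    memcpy(key_out, lm_hash, 8);
    memset(key_out + 8, 0, 8);
}

/* Start signing: remember the key and reset the counter to zero. */
static void mac_state_init(mac_state *st, const uint8_t key[SESSION_KEY_LEN])
{
    assert(st != NULL);
    memcpy(st->session_key, key, SESSION_KEY_LEN);
    st->next_seq = 0;
}

/* Reserve the numbers for one request/reply exchange (simplified).
 * The request takes the current even value, the reply the odd one after it. */
static void mac_next_exchange(mac_state *st,
                              uint32_t *request_seq, uint32_t *reply_seq)
{
    assert(st != NULL && st->next_seq % 2 == 0);
    *request_seq = st->next_seq;
    *reply_seq   = st->next_seq + 1;
    st->next_seq += 2;
}
```

The first call to mac_next_exchange() after mac_state_init() hands out 0 for the request and 1 for the reply, which corresponds to the zero-eth SESSION SETUP ANDX exchange.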
Also, it appears from testing that the MAC Signature in the zero-eth message is never checked (and that existing clients send a bogus MAC Signature in the zero-eth packet). That's okay, since the authenticity of the zero-eth message can be verified by the fact that it contains a valid response to the server challenge.
Once the MAC signing has been initialized within a session, all messages are numbered using the same counters and signed using the same Session Key. This is true even if additional SESSION SETUP ANDX exchanges occur.
2.8.9.3 Calculating the MAC
The MAC itself is calculated using the MD5 function. That's the plain MD5, not HMAC-MD5 and not MD4. The input to the MD5 function consists of three concatenated blocks of data:
We start by combining the Session Key and the response into a single value known as the MAC Key. For LM, NTLM, and LMv2 the MAC Key is created like so:
    MAC_Key = concat( Session_Key, 16, Response, 24 );
The thing to note here is that all of the responses, with the exception of the NTLMv2 Response, are 24 bytes long. So, except for NTLMv2, all auth mechanisms produce a MAC Key that is 40 bytes long. (16 + 24 = 40).
Unfortunately, the formula for creating the NTLMv2 MAC Key is not yet known. It is probably similar to the above, however. Possibly identical to the calculation of the LMv2 MAC Key, or possibly the concatenation of the Session Key with the first 28 bytes of the blob.
Okay, now you need to pay careful attention. The last few steps of MAC Signature calculation are a bit fiddly.
...and that, to the best of our knowledge, is how it's done.
2.8.9.4 Enabling and Requiring MAC Signing
Windows NT systems offer four registry keys to control the use of SMB MAC signing. The first two manage server behavior, and the second pair represent client settings.
Study those closely and you may detect some small amount of similarity between the client and server parameter settings. (Well, okay, they are mirror images of one another.) Keep in mind that the client and server must have compatible settings or the SESSION SETUP will fail.
These options are also available under Windows2000, but are managed using security policy settings55.
2.8.10 Non Sequitur Time
A mathematician, a physicist, and an engineer were sitting together in a teashop, sharing a pot of Lapsang Souchong and discussing the relationship between theory and practice.
The mathematician said "One of my students asked me today whether all odd numbers greater than one were prime numbers, so I provided this simple proof:
"Stated: All odd numbers greater than one are prime.
"Interesting.", replied the physicist. "Perhaps I have the same student. I was asked the same question today. I solved the problem using a thought-experiment, as Galileo might have done. Our experiment was as follows:
"By observation we can see that:
Proof by Assertion: This has got to be true! -- Jonathan Young, PhD.
The engineer interrupted before the physicist could draw a conclusion, and said "Out in the field we don't have time to mess with theory. We just define all odd numbers as prime and work from there. It's simpler that way."
Consider this as you contemplate what you have learned about SMB authentication.
2.8.11 Further Study
You should now have all you need to create an SMB session with an SMB server. As you become more comfortable with the system you will likely become curious about the vast uncharted jungle of Extended Security. Don't be afraid to go exploring. With the background provided here, and the guidebooks listed in the References section, you are well prepared. If you get it all mapped out, do us all a favor: write it up so that everyone can share what you've learned.
A few more bits of advice before we move along...
These guidelines are quite general, but they apply particularly well to the study of SMB security and authentication.
2.9 Building Your SMB Vocabulary
Looking back over our shoulders, we see that we have performed only two SMB exchanges so far: the NEGOTIATE PROTOCOL and the SESSION SETUP. There may be a TREE CONNECT shoved into the packet with the SESSION SETUP as an AndX, but we haven't really described the TREE CONNECT in detail. So, although we have covered a tremendous amount of material, our progress seems rather pathetic, doesn't it? What if the rest of SMB is just as tedious, verbose, and difficult?
Relax. It's not.
Certainly there are other difficulties lying in wait, but the biggest ones have already been identified and we are carefully avoiding them. If you pursue your dream of creating a complete and competitive CIFS implementation then you may, some day, need to know how things like MS-RPC and Extended Security really work inside. Fortunately, you can do without them for now.
Let's just be clear on this before we move along: There is a lot you can do with CIFS without implementing any of the extended sub-protocols that SMB supports, but if you want to build a complete and competitive CIFS client/server implementation you will need to go well beyond the SMB protocol itself.
Everything should be made as simple as possible, but not simpler. -- Albert Einstein
That's why it has taken the Samba Team (with help from hundreds if not thousands of people across the Internet) more than ten years to make Samba the industrial-strength server system it is today. Tridge worked out the basics of NBT and SMB in a couple of weeks back in 1991, but new things keep getting tacked on to the system.
When implementing CIFS, the rule of thumb is this: Implement as little as possible to do the job you need to do. The minute you cross the border into uncharted territory you open up a whole new world to explore and discover. Sometimes, you just don't want to go there. Other times, you must.
Anyway, in the spirit of keeping things simple we will cover only a few more SMB messages, and those in much less depth than we have done so far. There really is no need to study every message, longword, bit, and string. If you've come this far, you should know how to read packet captures and interpret the message definitions in the SNIA doc. It is time to take the training wheels off and learn to ride.
2.9.1 That TREE CONNECT Thingy
We have talked a lot about the TREE CONNECT ANDX REQUEST SMB. There was even an example way back in section 2.4.4. The example looked like this:
    SMB_PARAMETERS
      {
      WordCount      = 4
      AndXCommand    = SMB_COM_NONE (0xFF)
      AndXOffset     = 0
      Flags          = 0x0000
      PasswordLength = 1
      }
    SMB_DATA
      {
      ByteCount = 22
      Password  = ""
      Path      = "\\SMEDLEY\HOME"
      Service   = "?????" (yes, really)
      }
Notice that the TREE CONNECT includes a Password field, but that in this example the Password field is almost empty (it contains a nul byte). If the server negotiates Share Level security, then the password that would otherwise be in the SESSION_SETUP_ANDX.CaseInsensitivePassword field will show up in the TREE_CONNECT_ANDX.Password field instead. The password may be plaintext, or it may be one of the response values we calculated earlier.
The TREE_CONNECT_ANDX.Path field is also worth mentioning.
It contains the UNC pathname of the share to which the client is trying to connect. In this example, the client is attempting to access the HOME share on node SMEDLEY. Note that the Path will be in Unicode if negotiated. Finally there is that weird quintuple question mark string in the TREE_CONNECT_ANDX.Service field. There are, as it turns out, five possible values for that field:
It's annoying for the client to need to know the kind of share to which it is connecting, which is probably why the wildcard option is available. The server will return the service type in the Service field of the Response. Note that the Service strings are always in 8-bit ASCII characters--never Unicode.
The response (for LANMAN2.1 and above) looks like this:
    SMB_PARAMETERS
      {
      WordCount       = 3
      AndXCommand     = <Next ANDX command>
      AndXOffset      = <Next ANDX block offset>
      OptionalSupport = <A bitfield>
      }
    SMB_DATA
      {
      ByteCount        = <variable>
      Service          = <"A:" | "LPT1:" | "IPC" | "COMM">
      NativeFileSystem = <"" | "FAT" | "NTFS">
      }
The example above shows the empty string, "FAT", or "NTFS" as the valid values for the NativeFileSystem field. Other values are possible. (Samba, for instance, has a configuration option that allows you to put in anything you like.) The empty string is used for the hidden IPC$ share.
There are two bits defined in the OptionalSupport bitfield:
There is a note in the SNIA doc that states that some servers will leave out the OptionalSupport field even if the LANMAN2.1 or later dialect is negotiated. It does not say whether SMB_SUPPORT_SEARCH_BITS should be assumed in such cases.
2.9.2 SMB Echo
Here's a toy we can play with. ECHO is really as simple as it sounds. It's sort of the SMB equivalent of ping. The client sends a packet with a data block full of bytes, and the server echoes the block back. Simple.
...but this is CIFS we're talking about.
Although the ECHO itself is simple, there are many quirks to be found in existing implementations. We will dig into this just a tiny bit to give you a taste of the kinds of problems you are likely to encounter. Let's start with a quick look at the ECHO REQUEST structure:
    SMB_PARAMETERS
      {
      WordCount = 1
      EchoCount = <In theory, anything from 0 to 65535>
      }
    SMB_DATA
      {
      ByteCount = <Number of data bytes to follow>
      Bytes     = <Your favorite soup recipe?>
      }
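As a rough sketch, the two ECHO REQUEST blocks shown above might be marshalled into wire format along these lines. The function names are hypothetical, and the 32-byte SMB header is deliberately left to the caller. The field widths used here (a one-byte WordCount, then two-byte little-endian EchoCount and ByteCount values) follow the usual SMB parameter/data block layout rather than anything spelled out in this section, so treat them as assumptions to be checked against a packet capture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value in little-endian order, as SMB does on the wire. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Pack the parameter and data blocks of an ECHO REQUEST into out[].
 * The SMB header is NOT written here; the caller must prepend it.
 * Returns the number of bytes written, or 0 if out[] is too small. */
static size_t echo_request_pack(uint8_t *out, size_t out_size,
                                uint16_t echo_count,
                                const uint8_t *payload, uint16_t payload_len)
{
    size_t need = 1 + 2 + 2 + (size_t)payload_len;

    assert(out != NULL && (payload != NULL || payload_len == 0));
    if (out_size < need)
        return 0;

    out[0] = 1;                         /* WordCount: one parameter word */
    put_le16(out + 1, echo_count);      /* EchoCount                     */
    put_le16(out + 3, payload_len);     /* ByteCount                     */
    if (payload_len > 0)
        memcpy(out + 5, payload, payload_len);
    return need;
}
```

Keeping the header separate means the same helper can be reused whether the message is sent before or after a SESSION SETUP.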
Neun und neunzig luftballons Auf ihrem Weg zum Horizont. -- 99 Luftballons, Nena
The EchoCount field is a multiplier. It tells the server to respond EchoCount times. If EchoCount is zero, you shouldn't get any reply at all. If EchoCount is 9,999, then you are likely to get nine thousand, nine hundred, and ninety-nine replies. We say likely because of the wide variety of weirdity that can be seen in testing. One bit of weirdation is that all of the systems that were tested would respond to an ECHO REQUEST even if no SESSION SETUP had been sent and no authentication performed. This behavior is, in fact, per design, but it means that any client that can talk to your server from anywhere can ask for EchoCount replies to a single request. (It would probably be safer for the server to send an ERRSRV/ERRnosupport error message in response to an un-authenticated ECHO REQUEST.) Other strangisms of note:
The ECHO SMB may be one of those things that gets coded up just because it's in the documentation and it seems easy. It also appears as though ECHO hasn't been tested much. Certainly, the more it is stressed the more variation can be seen. There is, however, something to note in the last example in the list above and in the message from Conrad: once you know what you're looking at, you will find common themes that appear and reappear across a given implementation. These common themes are derived from common internals, and they can provide many clues about the inner workings of the implementation.
Another fine point highlighted by our quick look at the ECHO SMB is that TCP is designed to carry streams of data--not discrete packets. This can be seen in the results of the tests against Samba, in which multiple replies were contained in a single TCP packet. At the other extreme, several TCP packets are needed to transfer a single ECHO if it has a very large data payload. As a result, a single read operation may or may not return one and only one complete SMB message.
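One way to cope with this is to collect incoming bytes in a buffer and hand a message to the SMB layer only when all of it has arrived. The sketch below assumes NBT transport, in which each SMB is preceded by the four-byte Session Service header: a type byte, a flags byte whose low bit extends the length, and a two-byte length in network byte order. The function name is hypothetical, and a real receiver would also check the type byte and impose a sanity limit on the length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBT_HDR_LEN 4

/* Look at the bytes received so far and decide whether a complete
 * session message sits at the front of the buffer.
 * Returns the total size (header plus payload) of that message,
 * or 0 if more bytes must be read first. */
static size_t nbt_complete_message(const uint8_t *buf, size_t avail)
{
    size_t payload_len;

    assert(buf != NULL || avail == 0);
    if (avail < NBT_HDR_LEN)
        return 0;                       /* the header itself is incomplete */

    /* 17-bit length: low bit of the flags byte, then two bytes, big-endian. */
    payload_len = ((size_t)(buf[1] & 0x01) << 16)
                | ((size_t)buf[2] << 8)
                |  (size_t)buf[3];

    if (avail < NBT_HDR_LEN + payload_len)
        return 0;                       /* the payload is still arriving */
    return NBT_HDR_LEN + payload_len;
}
```

A read loop would call this after every read, consume the returned number of bytes when it is non-zero, and then call it again on whatever is left, since one read may deliver several replies at once.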
2.9.3 Readin', Writin', and 'Rithmatic
Here is a quick run-down on some of the basic essentials of SMB.
Remember earlier when we talked about SMB messages as if we were dissecting some strange, new species of multi-legged critter? Well, we've moved beyond Entomology, Invertebrate Zoology, Taxonomy, and such. We're now studying really complex stuff like Sociology, Psychology, and Numismatics, and we get to put the little critters into Skinner boxes and see how they react to various stimuli. It's important research, and there are all sorts of interesting things to discover. Consider, for example, the SMB_COM_COPY command. It's supposed to allow you to copy a file from one location on the server to another location. That saves the client from having to read the data over the wire and write it back again. A good idea, eh? Unfortunately, no one seems to be able to get it to work--at least, not against Windows servers. There has been some limited success in the laboratory...
SMB is an old protocol, and it has gotten sloppy over the years. As you work your way through the SMB messages, implementing first the easy ones and then the more difficult ones, keep this thought in mind: It's not your fault.
Say it to yourself now: "It's not my fault."
Very good. That will prevent you from getting frustrated and doubting your own skill. It's really not your fault.
2.9.4 Transaction SMBs
We are going to blast through this, so you'd better get your running shoes on.
The purpose of the Transaction SMBs is to carry specialized sub-protocols. Examples include the Remote Administration Protocol (RAP) and Microsoft's implementation of DCE/RPC (MS-RPC). There are other, more esoteric sets of calls as well. We will play with some of them when we get to the Browse Service.
Think of these sub-protocols as sets of function calls that are stretched across the network. As suggested in figure 2.18, a function call is made on the client side and the parameters and data are packed up and shoved across the network. The call is then completed at the remote end and the results (if any) are packed up and shoved back. In CIFS jargon, that's called a transaction.
Transactions are designed to be able to transfer more data than the limit imposed by the negotiated buffer size. They do so by fragmenting the payload. The protocol for sending large Protocol Data Units (PDUs) is described in a variety of documents, but here is a quick run-down:
There are three primary Transaction SMBs:
Those are really long names, so folks on the various mailing lists tend to shorten them to "SMBtrans", "Trans2", and "NTtrans", respectively. Each of these also has a matching secondary:
There is very little difference between these three transaction types, except that the NTtrans SMB has 32-bit fields where the other two have 16-bit fields. That means that NTtrans can handle a lot more data (that is, much larger transactions). Besides that, the real difference between these three is the set of functions that are traditionally carried over each.
The SNIA doc and the Leach/Naik CIFS draft provide examples of transactions that use Trans2 and NTtrans. Calls that use SMBtrans are documented elsewhere. Places to look include Luke's book (DCE/RPC over SMB), the Leach/Naik Browser and RAP Internet Drafts, and the X/Open documentation (particularly IPC Mechanisms for SMB). These (as you already know) are listed in the References section.
2.9.4.1 Mailslots and Named Pipes
Just to simplify things even further, SMBtrans supports yet another layer of abstraction. Mailslots and Named Pipes are used to access specific sets of remote functions. For example, the "LANMAN" pipe (which is identified as \PIPE\LANMAN) is always used for RAP calls.
Named Pipes are two-way inter-process communications channels. Once opened, they can be read from or written to as if they were files. In contrast, Mailslots are used for one-way, connectionless communications.
...and this is where something unexpected happens.
Mailslot messages are sent using SMBs transported via the NBT Datagram Service. You'll have to see it to believe it, but that is easily arranged. All you need to do is grab a packet capture of port 138 on an active LAN, one with a few local servers that announce themselves to the working Network Neighborhood. If you don't like to wait, reboot something. A Windows/9x system that offers shares will do nicely.
This topic will be revisited in the Browse Service section. If you want to do some extra-curricular reading, the X/Open IPC Mechanisms for SMB document is recommended.
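Returning briefly to the fragmentation idea from the start of section 2.9.4, the slicing itself is simple arithmetic. The C sketch below only works out which bytes of a large payload would travel in the n-th message, given a per-message limit derived from the negotiated buffer size. The trans_fragment type and the function name are invented for illustration, and real transactions carry considerably more bookkeeping than this (parameters and data are tracked separately, for one thing), so take it as a picture of the idea rather than of the wire protocol.

```c
#include <assert.h>
#include <stddef.h>

/* One slice of a larger transaction payload (hypothetical type). */
typedef struct {
    size_t displacement;    /* offset of this slice within the whole payload */
    size_t count;           /* number of bytes carried by this slice         */
} trans_fragment;

/* Describe the n-th slice (0-based) of a payload of total_len bytes when at
 * most max_per_msg bytes fit in one message.
 * Returns 1 and fills *frag if slice n exists, 0 if n is past the end. */
static int trans_fragment_at(size_t total_len, size_t max_per_msg,
                             size_t n, trans_fragment *frag)
{
    size_t start;

    assert(frag != NULL && max_per_msg > 0);
    if (n > total_len / max_per_msg)    /* also keeps n * max_per_msg in range */
        return 0;
    start = n * max_per_msg;
    if (start >= total_len)
        return 0;

    frag->displacement = start;
    frag->count = (total_len - start < max_per_msg) ? total_len - start
                                                    : max_per_msg;
    return 1;
}
```

Slice zero would ride in the primary transaction message and the rest in the matching secondaries. A sender can simply loop, incrementing n, until the function returns 0.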
2.10 The Remaining Oddities
Promises were made, and promises should be kept.
Remember that closet full of concepts that burst open and spilled out all over the floor? Well, we have managed to clean up a good deal of the mess, but there are still a few things that we said we would put away--and we will. We can provide a brief explanation of each of these as we shove them back into the closet, just so you are not surprised when you stumble across them in the literature.
2.10.1 Opportunistic Locks (OpLocks)
OpLocks are a caching mechanism.
A client may request an OpLock from an SMB server when it opens a file. If the server grants the request, then the client knows that it can safely cache large chunks of the file and not tell the server what it is doing with those cached chunks until it is finished. That saves a lot of network I/O round-trip time and is a very big boost to performance.
The problem, of course, is that other clients may want to access the same file at the same time. As long as everyone is just reading the file things are okay, but if even one client makes a change then all of the cached copies held by the other clients will be out of sync. That's why OpLock handling is a bit tricky.
There are two types of OpLocks that a client may request:
We came across these two when digging into the SMB_HEADER.FLAGS field way back in section 2.5.2. In olden times, the client would request an OpLock by setting the SMB_FLAGS_REQUEST_OPLOCK bit and, optionally, the SMB_FLAGS_REQUEST_BATCH_OPLOCK bit in the FLAGS field when opening a file. Now-a-days the FLAGS bits are (supposedly) ignored and fields within newer-style SMBs are used instead. Anyway, an Exclusive OpLock can be granted if no other client or application is accessing the file at all. The client may then read, write, lock, and unlock the cached portions of the file without informing the server. As long as the client holds the Exclusive OpLock, it knows that it won't cause any conflicts. It's sort of like a kid sitting in a corner of the kitchen with a spoon and a big ol' carton of ice cream. As long as no one else is looking, that kid's world is just the spoon and the ice cream. Batch OpLocks are similar to Exclusive OpLocks except that they cause the client to delay sending a CLOSE SMB to the server. This is done specifically to bypass a weirdity in the way that DOS handles batch files (batch files are the DOS equivalent of shell scripts). The problem is that DOS executes these scripts in the following way:
Yes, you've read that correctly. The batch file is opened and closed for every line. It's ugly, but that's what DOS reportedly does and that's why there are Batch OpLocks.
To make Batch OpLocks effective, the client's SMB layer simply delays sending the CLOSE message. If the file is opened again, the CLOSE and OPEN simply cancel each other out and nothing needs to be sent over the wire at all. That also means that the client can keep hold of the cached copy of the batch file so that it doesn't have to re-read it for every line of the script.
There is also a third type of OpLock, known as a Level II OpLock, which the client cannot request but the server may grant. Level II OpLocks are, essentially, "read-only" OpLocks. They permit the client to cache data for reading only. All operations which would change the file or meta-data must still be sent to the server.
Level II OpLocks may be granted when the server cannot grant an Exclusive or Batch OpLock. They allow multiple clients to cache the same file at the same time so, unlike the other two, Level II OpLocks are not exclusive. As long as all of the clients are just reading their cached copies there is no chance of conflict. If one client makes a change, however, then all of the other clients need to be notified that their cached copies are no longer valid. That's called an OpLock Break.
2.10.1.1 OpLock Breaks
It's called an "OpLock Break" because it involves breaking an existing OpLock. The more formal term is "revocation", but no one actually says that when they get together after hours to sit around, drink tea, and whine about CIFS.
OpLock Breaks are sent from the server to the client. This is unusual, because SMB request/response pairs are always initiated by the client. The OpLock Break is sent out-of-band by the server, which is against the rules...but this is CIFS we're talking about. Who needs rules?
The OpLock Break is sent in the form of an SMB_COM_LOCKING_ANDX message.
The server may send this to reduce an Exclusive or Batch OpLock to a Level II OpLock, or to revoke an existing OpLock entirely. In either case, the client's immediate responsibility is to flush its cache to comply with the new OpLock status. If the client held an Exclusive or Batch OpLock, it must send all writes to the server and request any byte-range locks that it needs in order to continue processing. If the OpLock has been reduced to a Level II OpLock, the client may keep its local cache for read-only purposes.
Note that there is a big difference between OpLocks and the more traditional types of locks. With a traditional file or byte-range lock, the client is in charge once it has obtained a lock. It can maintain it as long as needed, relinquishing it only when it is finished using it. In contrast, an OpLock is like borrowing your neighbor's lawnmower. You have to give it back when your neighbor asks for it.
Support for OpLocks is optional on both the client and the server side, but implementing them provides a hefty performance boost.
More information on OpLocks may be found in the Paul Leach/Dan Perry article CIFS: A Common Internet File System (listed in the References), as well as the usual sources.
2.10.2 Distributed File System (DFS)
The CIFS Distributed File System (DFS) is not nearly as fancy as it sounds. It is simply a way to collect separate shares into a single, virtual tree structure. It also has some limited ability to provide fileserver redundancy and load balancing.
The key feature of DFS is that it can create links from within a shared tree on one server to shares and directories on another, thus providing a single point of entry to a virtual SMB tree. From the user's perspective, the whole thing looks like a single share, even though the resources are scattered across separate SMB servers.
Clear as mud? Perhaps an illustration will help...
In figure 2.19, the client is shown attempting to access a file on server PETSERVER.
Well, that's where the client thinks the file resides. On the server side, the name CORGIS in the DOGS directory is actually a link to another UNC pathname: \\DATADOG\CORGIS. Following that link leads us to a different share on a different server.
The server offering the DFS share (PETSERVER, in our example) does not act as a proxy for the client. That is, it won't follow the DFS links itself. Instead, the server sends an error code to the client indicating that there is some additional work to be done. The error code is either a DOS code of ERRSRV/ERRbadtype (0x02/0x0003), or an NT_STATUS code of STATUS_DFS_PATH_NOT_COVERED (0xC0000257).
The client's task, at this point, is to query the server to resolve the link. The client sends a TRANS2_GET_DFS_REFERRAL which is passed to the server via the Trans2 transaction mechanism, briefly described earlier. The client will use the information provided in the query response to create a new UNC path. It must then establish an SMB session with the new server. This whole mess is known as a "DFS referral".
It was mentioned above that DFS can provide a certain amount of redundancy. This is possible because the links in the DFS tree may contain multiple references. If the client fails to connect to the first server listed in the referral it can try the second, and so on. DFS can also provide a simple form of load balancing by reshuffling the order in which the list of links is presented each time it is queried. Of course, load balancing and redundancy are only workable if all of the linked copies are in sync.
A quick search of the web will turn up a lot of articles and papers that do a better job of describing the behavior of DFS than the blurb provided here. If you are planning on implementing DFS it is worthwhile to read up on the subject a bit, just to get a complete sense of how it is supposed to work from the user or network administrator's perspective.
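The client side of the referral dance described above can be sketched in a few lines of C. The error code values are the ones quoted in the text. Everything else is an assumption made for illustration: the function names, the connect_fn callback type, and the demo_connect() stand-in (which merely pretends that any server whose name starts with "DOWN" is unreachable) are hypothetical. Sending the TRANS2_GET_DFS_REFERRAL and parsing the reply are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Error codes quoted in the text above. */
#define ERRSRV                       0x02
#define ERRbadtype                   0x0003
#define STATUS_DFS_PATH_NOT_COVERED  0xC0000257u

/* True if a DOS-style error says "go ask for a DFS referral". */
bool dfs_referral_needed_dos(uint8_t err_class, uint16_t err_code)
{
  return err_class == ERRSRV && err_code == ERRbadtype;
}

/* True if an NT_STATUS code says the same thing. */
bool dfs_referral_needed_nt(uint32_t nt_status)
{
  return nt_status == STATUS_DFS_PATH_NOT_COVERED;
}

/* Hypothetical callback: try to set up an SMB session for a UNC path. */
typedef bool (*connect_fn)(const char *unc_path);

/* Walk the referral list in the order the server presented it.
 * Returns the index of the first target that connects, or -1 if none do.
 */
int dfs_try_referrals(const char *const targets[], size_t count,
                      connect_fn try_connect)
{
  size_t i;

  for (i = 0; i < count; i++) {
    if (try_connect(targets[i]))
      return (int)i;
  }
  return -1;
}

/* Stand-in for real session setup, for demonstration only:
 * any UNC path beginning with \\DOWN is treated as unreachable.
 */
bool demo_connect(const char *unc_path)
{
  return strncmp(unc_path, "\\\\DOWN", 6) != 0;
}
```

Given a referral list of \\DOWNDOG\CORGIS followed by \\DATADOG\CORGIS, dfs_try_referrals() with demo_connect() skips the first entry and returns index 1. Because the client simply takes the list in order, any load balancing comes from the server reshuffling the list between queries.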
The SNIA doc provides enough information to get you started building a working client implementation. The server side is more complex because doing it right involves implementing a set of management functions as well.
2.10.3 DOS Attributes, Extended File Attributes, Long Filenames, and Suchlike
These all present the same problem. The CIFS protocol suite is designed, in its heart and soul, to work with DOS, OS/2, and Windows systems. As a result, the protocols that make up the CIFS suite have a tendency to reflect the behavior of those operating systems.
DOS, of course, is the oldest and simplest of the IBM/Microsoft family of PC OSes. The filesystem used with DOS is the venerable File Allocation Table (FAT) filesystem which, according to legend, was originally coded up by Bill Gates himself. The characteristics of the FAT filesystem should be familiar to anyone who has spent any time working with DOS. Consider, for example, the following FAT features:
It's a fairly spartan system. There are improvements and extensions that have appeared over the years. The FAT32 filesystem, for example, is a modified version of FAT that uses disk space more efficiently and also supports much larger disk sizes than the original. There is also VFAT, which keeps track of both 8.3 format filenames and longer secondary filenames that may contain a wider variety of characters than the 8.3 format allows. VFAT long filenames are case-preserving (but not case sensitive) so, overall, VFAT allows a lot more creativity with file and directory names57.
Even with these extensions, the semantics of the FAT filesystem are not sufficient to meet the needs of more powerful OSes such as OS/2 and WindowsNT. These OSes have newer, more complex filesystems which they support in addition to FAT. Specifically, OS/2 has HPFS (High Performance File System) and WindowsNT & W2K can make use of NTFS (New Technology File System). These newer filesystems have lots and lots of features which, in turn, have to be supported by CIFS.
Problems arise when the server semantics (made available via CIFS) do not match those expected by the client. Consider, for instance, Samba running on a Unix system. Unix filesystems typically have these general characteristics:
Now consider a Windows application that requires the old 8.3 name format. (Such applications do exist. They make calls to older, 16-bit OS functions that assume 8.3 format.) Unlike VFAT, Unix filesystems do not normally keep track of both long and short names. That causes a problem, and Samba has to compensate by generating 8.3 format names on the fly. The process is called "Name Mangling".
There are other gotchas too. Indeed, name mangling is just the tip of the proverbial iceberg.
One solution that some CIFS vendors have been able to implement is to develop a whole new filesystem for their server platform, one that maintains all of the required attributes and maps between them as necessary. This is a pain, but it works in situations in which the server vendor has control over the deployment of their product. One such filesystem is Microsoft's NTFS, which can handle a very wide variety of attributes and map them to the semantics required by Apple Macintosh clients, Unix clients, DOS clients, OS/2 clients... You've got the basic idea.
Let's run through some of the trouble spots to give you a sense of what you're up against.
CIFS offers facilities to support all of these features and more. That's good news if you are writing client code, because you can pick and choose the sets of attributes you want to support. It's bad for server systems, which may need to offer various levels of compatibility in order to contend with client expectations.
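To make the name mangling mentioned above a bit more concrete, here is a deliberately naive sketch in C. It is not Samba's algorithm, nor anyone else's. The function name mangle_8dot3() and the fixed "~1" suffix are assumptions made purely for illustration. It keeps the first six usable characters of the base name, tacks on "~1", and keeps up to three usable characters of the extension, all in upper case.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Naive 8.3 name generator, for illustration only.
 * The out buffer needs 13 bytes: 8 + '.' + 3 + the terminating nul.
 */
void mangle_8dot3(const char *longname, char out[13])
{
  const char *dot     = strrchr(longname, '.');
  int         has_ext = (dot != NULL && dot != longname);
  size_t      base_len;
  size_t      i;
  size_t      n = 0;

  /* A leading dot (Unix "hidden" file) is not treated as an extension. */
  base_len = has_ext ? (size_t)(dot - longname) : strlen(longname);

  /* First six alphanumeric characters of the base name, uppercased. */
  for (i = 0; i < base_len && n < 6; i++) {
    unsigned char c = (unsigned char)longname[i];
    if (isalnum(c))
      out[n++] = (char)toupper(c);
  }
  out[n++] = '~';
  out[n++] = '1';

  /* Up to three alphanumeric characters of the extension, if any. */
  if (has_ext) {
    size_t mark = n;   /* Where the '.' goes, in case we must undo it. */
    size_t e    = 0;

    out[n++] = '.';
    for (i = 1; dot[i] != '\0' && e < 3; i++) {
      unsigned char c = (unsigned char)dot[i];
      if (isalnum(c)) {
        out[n++] = (char)toupper(c);
        e++;
      }
    }
    if (e == 0)
      n = mark;        /* Nothing usable after the dot; drop the dot. */
  }
  out[n] = '\0';
}
```

Feeding it "My Long Filename.html" produces "MYLONG~1.HTM". The sketch ignores the hard part: two long names that share their first six characters will collide, so a real server needs a hash or a counter in place of the fixed "~1", and it also needs a way to map the short name back to the long one.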
2.11 That Just About Wraps Things Up for SMB
If the Internet has proven anything it's that a very large number
of primates banging randomly on keyboards over a long enough period of
time can and will produce some amazingly useful software. On the
other hand, if you gather some of those primates together, place them
into cubicles, and train them to perform like circus animals...
Let's just get rid of these horrible protocols. -- Andrew Tridgell, Samba Team Leader
...well, we've just put a lot of effort into cleaning up the mess that was made in those cubicles. A shame, really. It was a nice little protocol when it started out.
Although the SNIA gave it their best shot, there are currently no industry committees or standards groups writing bona fide specifications to be reviewed and voted on, and no standard test suites to verify conformity. That's not to say that specifications and test suites don't exist--quite the contrary. The problem is that they have no teeth. With no real standards and no real enforcement, the only measure of correctness for an SMB implementation is whether or not it works most of the time. Since most of the clients out there are Windows clients, the formula simplifies down to whether or not an implementation works with Windows.
An additional problem is that SMB itself is not enough for true interoperability with Windows systems--particularly if you want to write a workable server.
In San Jose, California, there is a mansion known as the Winchester
Mystery House. It started out as a simple farmhouse, but it was
expanded over a period of thirty-eight years by a millionaire widow
with an obsessive compulsion to keep on adding new rooms. It has
stairways that rise directly into the ceiling, windows in the floor,
doors that open to solid walls...and that's just for starters. The
building covers four and a half acres and has an estimated 160 rooms.
CIFS is like that. The original SMB protocol was simple and well suited to its environment. Over the years, however, it has been greatly expanded. Several sub-protocols have been added on as well. These sub-protocols (which include such things as the Extended Security protocols, RAP, MS-RPC, etc.) are implemented by Windows so, if you want to build something truly compatible, SMB alone just isn't enough.
...but don't go away feeling that it is all just a hopeless mess. It is really a question of how much effort you are willing to put into solving the problems you will encounter. Take it one step at a time, because the individual pieces are much less daunting than the whole.
1 The X/Open SMB documentation is out of print, but electronic copies are now available on-line (free registration required). See: http://www.opengroup.org/products/publications/catalog/, and look for documents #C195 and #C209.
2 I must rely on anecdotal evidence to support this claim. Due to the licensing restrictions, I have not read these documents, which were released in March of 2002.
3 ...and Tactical Officer. He's the one with the prosthetic forehead.
4 I live in Minnesota, where it most definitely snows in winter. I share my home with a Pembrokeshire Welsh Corgi and a Golden Retriever, so the springtime scenario described above is vividly real and meaningful to me. Some of my Australian Samba Team friends have suggested that people in other parts of the world may find it less familiar. Use your imagination.
5 There are some old, archived conversations on Microsoft's CIFS mailing list which suggest that some implementors were--and possibly still are--only allowing for a 16 bit LENGTH field in the NBT SESSION MESSAGE.
6 Steve French says that OS/2 may have been the first OS to fully support the UNC scheme.
7 The distinction between a URL and a URI is subtle, and confuses me to no end. Fortunately, it is not something we need to worry about.
8 The "host" field is not really a field, but the name of a non-terminal in the BNF grammar presented in RFC 2396. That grammar has been amended to support IP version 6 (IPv6) addressing in RFC 2732. The SMB URL format adds support for the use of NetBIOS names and Scope IDs, so it is a further extension of the syntax.
9 Additional source code is available at http://ubiqx.org/libcifs/.
10 See RFC 2732 for information on the use of IPv6 addresses in URLs.
11 Samba's nmbd daemon spawns a separate process to handle DNS queries, just to get around this very problem.
12 When working with the NET USE command, it is important to remember to close the connection to the server using the /d command-line option. Type NET HELP at the DOS prompt for more information.
13 The original was much more detailed and interesting. It had to be edited so that it would fit on the page, and because all those details can be distracting.
14 ...to me.
15 The first place to look is Microsoft's CIFS FTP site: ftp://ftp.microsoft.com/developr/drg/CIFS/. The COREP.TXT file is formatted for printing on an old-style dot-matrix printer, which makes it look a little goofy in places (e.g., bold font is accomplished by typing a character, then backspacing, then re-typing the same character). The same content is available in an alternate format in the file SMB-CORE.PS. See the References section.
16 Ethereal version 0.9.3 will report the name of the last AndX Command in the chain, rather than the first. This was fixed somewhere between 0.9.3 and 0.9.6. The trick with Ethereal is to update early and often.
17 We are dealing with a vague definition here. According to the SNIA doc, the SESSION SETUP is meant to "set up" the session created by the NEGOTIATE PROTOCOL, which also makes some sort of sense. Thing is, there may be multiple SESSION SETUP exchanges following the NEGOTIATE PROTOCOL, meaning multiple SMB user sessions per NBT or naked TCP transport session. The waters are muddy.
18 This is exactly what jCIFS does (up through release 0.6.6 and the 0.7.0beta series). There has been a small amount of discussion about supporting the NT_STATUS codes, but it's not clear whether there is any need to change.
19 After all that work... Sometime around August of 2002, Microsoft posted a bit of documentation listing the DOS error codes that they have defined. Not all are used in CIFS, but it's a nice list to have. In addition, they have documented an NTDLL.DLL function that converts DOS error codes into NT_STATUS codes. [Thanks to Jeremy for finding these.]
20 The English language is Copyright (C) 1597 by William Shakespeare & Co., used by permission, all rights deserved.
21 One of the reasons that the jCIFS project was started is that Java has built-in Unicode support, which solves a lot of problems. That, plus the native threading model and a few other features, made an SMB implementation in Java very tempting. Support for Unicode in a CIFS implementation is not really optional any more except, perhaps, in the simplest of client systems. Unfortunately, Unicode is way beyond the scope of this book. See the References section for some web links to get you started with Unicode.
22 ...or was, last time I checked. Once again, that URL is: ftp://ftp.microsoft.com/developr/drg/CIFS/. See the References section for links to specific documents.
23 There may be a further problem with raw mode. Microsoft has made some obtuse references to obscure patents which may or may not be related to READ RAW and WRITE RAW. The patents in question have been around for quite some time, and were not mentioned in any of the SMB/CIFS documentation that Microsoft released up until March of 2002. Still, the best bet is to avoid ReadRAW and WriteRAW (since they are not particularly useful anyway) and/or check with a patent lawyer. The Samba Team released a statement regarding this issue. See: http://us1.samba.org/samba/ms_license.html.
24 There is no name for 10^-7 seconds. Other fractions of seconds have names with prefixes like deci, centi, milli, micro, nano, pico, even zepto, but there is no prefix that applies to 10^-7. In honor of the fact that this rare measure of time is used in the CIFS protocol suite, I propose that it be called a bozosecond.
25 January 1, 1970, 00:00:00.0 UTC, known as "the Epoch", is sometimes excused as being the approximate birthdate of Unix.
26 This is probably because Saint Paul is at the center of the universe. The biomagnetic center of the universe used to be located across the river in Minneapolis until they closed it down. It was a little out of whack in the same way that the magnetic poles are not quite where they should be. The magnetic north pole, for instance, is on or near an island in northern Canada instead of at the center of the Arctic Ocean where it belongs.
27 A lot of time was wasted trying to figure out which configuration options would change the behavior. The results were inconclusive. At first it seemed as though the DomainName was included if the Windows98 system was running in User Level security mode, and passing logins through to an NT Domain Controller. Further testing, however, showed that this was not a hard-and-fast rule. It should also be mentioned that if the systems are running naked transport there may not be an NT Domain or Workgroup name. SMB can be mightily inconsistent--but not all the time.
28 To be pedantic, the correct terms are "marshalling" and "unmarshalling". "Marshalling" means collecting data in system-internal format and re-organizing it into a linear format for transport to another system (virtual, physical, or otherwise). "Unmarshalling", of course, is the reverse process. These terms are commonly associated with Remote Procedure Call (RPC) protocols, but some have argued (not unreasonably) that SMB is a simple form of RPC.
29 If you enjoy digging into odd details, this is a great one. See the SMB-LM1X.PS file, also known as Microsoft Networks/SMB File Sharing Protocol Extensions, Version 2.0, Document Version 3.3. In particular, see the definition of a VC on page 2, and the description of the "Virtual Circuit Environment" in section 4.a on page 10.
30 See Microsoft Knowledge Base article #301673 for more information.
31 There are a few small notes scattered about the SNIA doc that suggest that the prescribed compression algorithm is something called LZNT. I haven't been able to find a definitive reference that explains what LZNT is, but it appears from the name that it is a form of Lempel-Ziv compression.
32 It was, in fact, a lot of work for the Samba Team. Those involved did a tremendous job, and they deserve several rounds of applause. Things were much easier for jCIFS because Java natively supports Unicode.
33 Information on RAP calls is scattered among several sources, including the archives of Microsoft's CIFS mailing list. The SNIA doc has enough to get you started with the basics of RAP, but see also the file cifsrap2.txt which can be found on Microsoft's aforementioned FTP site.
34 Luke Kenneth Casson Leighton's book DCE/RPC over SMB: Samba and Windows NT Domain Internals is an essential reference for CIFS developers who need to know more about MS-RPC.
35 I vaguely remember a conversation with Tridge in which he indicated that there was an obscure exception to the misalignment of the Data block. I'm not sure which SMB, or which dialect, but if I recall correctly there's one SMB that has an extra byte just before the ByteCount field. Keep your eyes open.
36 In addition to "something you have" and "something you know" there is another class of access token that is sometimes described as "something you are". This latter class, also known as "biometrics", includes such things as your fingerprints, your DNA pattern, your brainwaves, and your karmic aura. Some folks have argued that these features are simply "something you have" that is a little harder (or more painful) to steal. There was great hope that biometrics would offer improvements over the other authentication tokens, but it seems that they may be just as easy to crack. For example, a group of researchers in Japan was able to fool fingerprint scanners using fake fingertips created from gelatin and other common ingredients.
37 ...sort of. Support for inclusion of a password within a URL is considered very dangerous. The recommendation from the authors of RFC 2396 is that new applications should not recognize the password field and that the application should instead prompt for both the username and password.
38 Yet again we seek the wisdom of the RFCs. See Appendix A of RFC 2396 for the full generic syntax of URLs, and RFC 2732 for the IPv6 update.
39 See the discussion of the password level parameter in Samba's smb.conf(5) documentation for more information about these problems.
40 I don't know whether a Windows server can be configured to support Unicode plaintext passwords. To test against Samba, however, you need to use Samba version 3.0 or above. On the client side, Microsoft has a Knowledge Base article--and a patch--that addresses some of the message formatting problems in Windows2000. See: Microsoft Knowledge Base Article #257292. Thanks to Nir Soffer for finding this article.
41 If you are interested in the workings of DES, Bruce Schneier's Applied Cryptography, Second Edition provides a very complete discussion. See the References section.
42 ...without whom the Authentication section would never have been written.
43 Both the X/Open doc and the expired Leach/Naik draft state that the padding character is a space, not a nul. They are incorrect. It really is a nul.
44 The magic string was considered secret, and was not listed in the Leach/Naik draft. The story of Tridge and Jeremy's (pre-DMCA) successful effort to reverse-engineer this value is quite entertaining.
45 A "cracker", not a "hacker". The former is someone who cracks passwords or authentication schemes with the goal of cracking into a system (naughty). The latter is one who studies and fiddles with software and systems to see how they work and, possibly, to make them work better (nice). The popular media has mangled the distinction. Don't make the same mistake. If you are reading this book, you most likely are a hacker (and that's good).
46 Jeremy Allison proved it could be done with a little tool called PWdump. Mudge and other folks at the L0pht then expanded on the idea and built the now semi-infamous L0phtCrack tool. In July of 1997, Mudge posted a long and detailed description of the decomposition of LM challenge/response, a copy of which can be found at: http://www.insecure.org/sploits/l0phtcrack.lanman.problems.html. For a curious counterpoint, see Microsoft Knowledge Base Article #147706.
47 Andrew Bartlett prefers to call this the "NT Hash", stating that the NT Hash is passed through the LM response algorithm to produce the NTLM (NT+LM) response.
48 MD4 is explained in RFC 1320 and MD5 is in RFC 1321. HMAC in general, and HMAC-MD5 in particular, is written up in RFC 2104. ...an embarrassment of riches. As usual with this sort of thing, a deeper understanding can be gained by reading about it in Bruce Schneier's Applied Cryptography, Second Edition. See the References section.
49 The lab in the basement is somewhat limited which, in turn, limits my ability to do rigorous testing of esoteric CIFS nuances. You should probably verify these results yourself. Andrew Bartlett (him again!) turned up an interesting quirk regarding the NTLMv2 Response calculation when authenticating against a stand-alone server. It seems that the NT Domain name is left blank in the v2hash calculation. That is: destination = "";
50 Luke Kenneth Casson Leighton's book DCE/RPC over SMB: Samba and Windows NT Domain Internals gives an outline of the structure of the data blob used in NTLMv2 Response creation. Using Luke's book as a starting point, the details presented above were worked out during a late-night IRC session. My thanks to Andrew Bartlett, Richard Sharpe, and Vance Lankhaar for their patience, commitment, and sudden flashes of insight. Thanks also to Luke Howard for later clarifying some of the finer points.
51 A quick web search for "LMCompatibility" will turn up a lot of references, Microsoft Knowledge Base Article #147706 among them.
52 ...so that if something goes wrong you can blame them, and not me.
53 It might be worth doing some testing if you really want to use DOS codes in your implementation, but also want Extended Security. It may be possible to use the NT_STATUS codes for this exchange only, or you might try interpreting any unrecognized DOS error code as if it were STATUS_MORE_PROCESSING_REQUIRED.
54 The latter name--though decidedly less Freudian--is somewhat gender-biased.
55 Jean-Baptiste Marchand has done some digging and reports that starting with Windows2000 the SMB redirector (rdr) has been redesigned, which may impact which registry keys are fiddled. The preferred way to configure SMB MAC signing in Windows 2000 is to use the Local Security Settings/Group Policy Management Console (whatever that is). Basically, this means that Windows2000 and WindowsXP have MAC signing settings comparable to those in WindowsNT, but they are handled in a different way.
56 I vaguely remember a presentation given by David Korn, author of the Korn Shell (ksh), regarding AT&T's UWIN project. At the end of the presentation there was some discussion regarding the differences between standard Posix APIs and Win32 APIs. It was pointed out that there were hundreds or possibly thousands of permutations of parameter values that could be passed to the Posix open() function. The number of permutations for the equivalent Win32 function, it was reported, was on the order of millions. How the heck do you test all those possibilities?
57 Digging through the documentation, it appears that the FAT family consists of FAT12, FAT16, FAT32, and VFAT. There is documentation on the web that provides implementation details, if you are so inclined.
58 NTFS is a complex filesystem based on some simple concepts. One such is that each "file" is actually a set of "attributes" (records). Many of these attributes are pre-defined to contain such things as the short name, the long name, file creation and access times, etc. The actual content of the file is stored in a specific, pre-defined "stream", where a stream is a particular kind of attribute. NTFS supports OS/2-style Extended Attributes in another type of NTFS attribute... and it just gets more confusing from there. There is a lot of documentation on the web about the workings of NTFS, and there is a project aimed at implementing NTFS for Linux.
Copyright © 1999-2004 Christopher R. Hertel All rights reserved. $Revision: 1.289 $