1. NBT: NetBIOS over TCP/IP


    

1.1 A Short Bio of NetBIOS

In those days spirits were
brave, the stakes were high,
men were REAL men,
women were REAL women,
and small furry creatures from
Alpha Centauri were REAL
small furry creatures from
Alpha Centauri.
-- The Hitchhiker's Guide to the Galaxy,
Douglas Adams

  

It all started back in the frontier days of the PC when Microsoft was a lot smaller, IBM seemed a whole lot bigger, and Apple owned personal computer territory as far as the eye could see. Back then, you didn't need no dang standards. If you wanted to sell LANs, you just went out and branded yourself a protocol. Apple had AppleTalk, Digital had DECnet and, for their longhorn Mainframes, IBM had Systems Network Architecture (SNA). SNA was a mighty big horse for little PCs, so IBM hired on a company called Sytek and together they rustled up a product they called "PC Network". Not an inspiring name, but it was a simpler time.

PC Network was a Local Area Network (LAN) system designed to support about 80 nodes at best, with no provision for routing. NetBIOS (Network Basic Input Output System) was the software interface to the PC Network hardware. It offered a set of commands that could control the hardware, establish and delete sessions, transfer data, etc.

1.1.1 NetBIOS and DOS: The Early Years

Starting with DOS version 3.1, Microsoft used the NetBIOS API to transport SMB file service messages. They created something called a redirector, and its job was to catch disk drive or port references (e.g., "C:" or "LPT3:") and look them up in a table. If the device was not in the table, the call was passed along to DOS. If the device was in the table, then the call would be redirected. For example:

  • Using the SUBST command, a user could substitute a drive letter for a local path. This simple aliasing provided convenient shortcuts for long path names.

    subst S: C:\FILES\DEEP\IN\A\DIRECTORY
  • Using the NET command, a drive letter could be mapped to a remote file service. So, if the redirector found a remote service entry in its table, it would convert the request into an SMB packet and send it out via NetBIOS.

    net use N: \\SERVER\SERVICE

    Note the double backslash preceding the server name. This syntax is part of Microsoft's "Universal Naming Convention" (UNC) for network services.

[Annotation] These commands are still available from within the DOS shells of contemporary Windows products. It is worthwhile to fiddle with them a bit. From the DOS prompt you can type NET HELP for a summary of the NET command and its options [1].


1.2 Speaking NetBIOS

The hardware part of IBM's PC Network is no longer in use and the protocol that actually ran on the wire is all but forgotten, yet the NetBIOS API remains. Vast untold hordes of programs--including DOS itself--were written to use NetBIOS. Like COBOL, it may never die.
 

Genuine Imitation
-- well known oxymoron

  

Many vendors, eager for a piece of the Microsoft desktop pie, figured out how to implement the NetBIOS API on top of other protocols. There is NetBIOS over DECnet, NetBIOS over NetWare, NetBIOS over mashed potatoes and gravy with creamed corn, NetBIOS over SNA, NetBIOS over TCP/IP, and more. Of these, the most popular, tasty, and important is NetBIOS over TCP/IP, and that's what this chapter is really all about.

[Figure 1.1]

NetBIOS over TCP/IP is sometimes called NetBT or NBT. Folks from IBM--for reasons unfathomable--sometimes call it TCPBEUI. NBT is the simplest and most common name, so we'll stick with that.

On the 7-layer OSI reference model, NetBIOS is a session-layer (layer 5) API. Under DOS and its offspring, applications talk to NetBIOS by filling in a record structure known as a Network Control Block (NCB) and signaling an interrupt. The NCBs are used to pass commands and messages between applications and the underlying protocol stack.

Fortunately, the NetBIOS API is specific to DOS and its kin. Unix and other systems do not need to implement the NetBIOS API, as there is no legacy of programs that use it. Instead, these systems participate in NBT networks by directly handling the TCP and UDP packets described in two Internet Engineering Task Force (IETF) Request for Comments documents: RFC 1001 and RFC 1002 (known collectively as Internet Standard #19). These RFCs describe a set of services which work together to create virtual NetBIOS LANs over IP.

1.2.1 Emulating "NetBIOS LANs"

...there is often confusion in
the use of some of the terms...
-- NetBIOS NetBEUI NBF Networking,
Timothy D. Evans

  

At this point, we hit an interesting twist in the terminology. NetBIOS is a driver that presents an API; it is neither a protocol nor a topology. The API does, however, make a number of assumptions about the workings of the underlying network, and it presents some quirky restrictions. The terms "NetBIOS Network" and "NetBIOS LAN" are commonly used to identify the network architecture that is, essentially, defined by the NetBIOS API.

RFC 1001 and 1002 list three basic services which must be supported in order to implement NetBIOS LAN emulation. These are:

  • the Name Service,
  • the Datagram Service, and
  • the Session Service.

The Name Service is used to map NetBIOS names (addresses) to IP addresses in the underlying IP network. The Datagram Service provides for the delivery of NetBIOS datagrams via UDP, and the Session Service is used to establish and maintain point-to-point, connection-oriented NetBIOS sessions over TCP.

1.2.1.1 The NetBIOS Name Service

The NetBIOS name service is
the collection of procedures
through which nodes acquire,
defend, and locate the holders
of NetBIOS names.
-- RFC 1001, Section 15
  

The NetBIOS LAN architecture is very simple. No routers, no switches--just a bunch of nodes connected to a (virtual) wire. There is no need for separate hardware addresses, network addresses, or even port numbers as there is in IP. Instead, the communications endpoints are identified by 16-byte strings known as "NetBIOS Names".

NetBIOS addressing is dynamic. Applications may add names as needed, and remove those names when they are finished. Each node on the LAN will also have a default name, known as the Machine Name or the Workstation Service Name, which is typically added when NetBIOS starts. The process of adding a name is called registration.

There are two kinds of names that can be registered: unique and group. Group names may be shared by multiple clients, thus providing a mechanism for multicast. In contrast, unique names may only be used by one client per LAN. Keep in mind, though, that these are virtual LANs which may actually be spread out across different subnets in a routed IP internetwork.

[Figure 1.2]

The Name Service is supposed to keep track of all of the NetBIOS names in use within the virtual LAN, and ensure that messages sent to a given NetBIOS name are directed to the correct underlying IP address. It does this in two ways:
 
On an IP LAN:  Each node keeps a list of the names that it has registered (that is, the names it "owns"). When sending a message, the first step is to send an IP broadcast query, called a NAME QUERY REQUEST. If there is a machine on the IP LAN that owns the queried name, it will reply by sending a NAME QUERY RESPONSE.

So, to send a message to the node which has registered the name EADFRITH the sender calls out "Yo! Eadfrith!". EADFRITH responds with an "I am here!" message, giving its IP address.

[Figure 1.3]

This is known as 'B mode' (broadcast) name resolution, and participants are referred to as 'B nodes'. In B mode, each node keeps track of--and answers queries for--its own names, so the NetBIOS Name Service "database" is a distributed database.
 

Over a Routed Internet:  Broadcasts aren't meant to cross subnet boundaries, so a different mechanism is used when the nodes are separated by routers.

The Network Administrator chooses a machine to be the NetBIOS Name Server (NBNS, aka WINS Server [2]). Typically this will be a Unix host running Samba, or a Windows NT or W2K server. In order to use the NBNS, all of the nodes that are participating in the virtual NetBIOS LAN must be given the server's IP address. This can be done by entering the address in the client's NetBIOS configuration or, on Windows systems, via DHCP.

NBT client nodes send NetBIOS name registrations and queries directly to the NBNS, which maintains a central database of all registered names in the virtual LAN. This is known as 'P mode' (point-to-point) name resolution, and participants are referred to as 'P nodes'.

[Figure 1.4]

These are the two basic modes of NetBIOS Name Resolution over NBT. There are, of course, others. The RFCs describe 'M mode' (mixed mode), which combines P and B mode characteristics. 'H mode' (hybrid mode) was introduced later. It is similar to M mode except for the order in which B and P mode behavior is applied.

The Name Service runs on UDP port 137. According to the RFCs the use of TCP port 137 can be negotiated for some queries, though few (if any) implementations actually support this.

1.2.1.2 The NetBIOS Datagram Service

Upon receipt, the duct tape
is removed and the paper
copy of the datagram is
optically scanned into
a[n] electronically
transmittable form.
-- RFC 1149
  

In the IP world, TCP provides connection-oriented sessions in which packets are acknowledged, put in order, and retransmitted if lost. This creates the illusion of a continuous, sequential data stream from one end to the other. In contrast, UDP datagrams are simply sent. Thus, UDP requires less overhead, but it is less reliable than TCP. NetBIOS also provides connection-oriented (session) and connectionless (datagram) communications. Naturally, NBT maps NetBIOS sessions to TCP and NetBIOS datagrams to UDP.

The Datagram Distribution Service is the NBT service that handles NetBIOS datagram transport. It runs on UDP port 138, and can handle unicast (also known as "specific"), multicast (group), and broadcast NetBIOS datagrams.

Unicast (Specific):  

The handling of unicast datagrams is fairly straight-forward. The Name Service is used to resolve the destination name to an IP address. The NetBIOS packet is then encapsulated in a UDP packet and sent to the specified IP.
 

Multicast (Group) and Broadcast:
 

According to the RFCs, a B node can simply encapsulate NetBIOS multicast and broadcast datagrams in UDP and send them to the IP broadcast address. The UDP datagram will then be picked up by all local nodes listening on the Datagram Service port (138/UDP). Thus, NetBIOS broadcast datagrams will reach all nodes in the virtual LAN. In the case of multicast datagrams, nodes which are not members of the group (have not registered the group name) will discard the message.

P, M, and H nodes are a bit more complicated, as you might expect. When the virtual LAN extends beyond the physical LAN, an IP broadcast will not reach all of the nodes in the NetBIOS name space. In order to deliver group and broadcast datagrams, the NBNS database must be consulted. How this is (or isn't) actually done will be explained in strikingly painful detail later on. Section 1.5 is dedicated to the workings of datagram distribution.
 

In theory, theory and practice
are the same. In practice,
they're not.
-- Unknown
  

The Datagram Service is probably the second-least well understood aspect of NBT, most likely because correct implementation isn't critical to filesharing. Many implementations get it wrong, and there is much debate over the value of getting it right.

1.2.1.3 The NetBIOS Session Service

The Session Service is the traditional transport for SMB, and this is our primary reason for caring about NetBIOS at all. The Session Service runs on TCP port 139 [3]. There is no particular mechanism for multicast or broadcast because each session is, by definition, a one-to-one connection. The RFCs do, however, briefly discuss what might happen if a session setup request were sent to a group name (see RFC 1001, Section 16.1.1.2).

We will get to the details of session creation, use, and closure when we discuss Session Service implementation.

Weirdness Alert:

TCP/138 has no defined behavior under NBT and Microsoft never implemented support for NBT Name Resolution over TCP/137, yet some versions of Windows seem to listen on these two TCP ports when NBT is active.
C:\> netstat -a

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    paris:137              PARIS:0                LISTENING
  TCP    paris:138              PARIS:0                LISTENING
  TCP    paris:nbsession        PARIS:0                LISTENING
  UDP    paris:nbname           *:*
  UDP    paris:nbdatagram       *:*

It turns out that this is due to a known bug in the netstat utility included with older Windows releases:


From: Jean-Baptiste Marchand
To: Chris Hertel

Hello,

I've noticed that, in section 1.2, there is a weirdness alert about the Windows netstat program showing TCP ports 137 and 138 open, whereas only UDP ports 137 and 138 are actually opened by the NetBT driver.

In fact, it is a known problem in Windows NT (this is fixed in Windows 2000 and later) that netstat shows TCP ports opened, whereas only UDP ports with the same number are opened. This is documented in an entry of Microsoft's knowledge base (194171).

This article states that this is only a display problem. This is true and can be verified using any TCP port scanner.
 

1.2.2 Scope: The Final Frontier

This is a good point at which to get up, stretch, make a nice hot cup of tea for yourself, take a soothing bath, play with your cat, go for a long walk in the park, take dance lessons, volunteer in your community, sort and organize your old photographs, or join a United Nations Peace Keeping Force. The Datagram Service was previously described as 'the second-least well understood aspect of NBT'. Guess which bit wins first prize.

Scope is an oddity of NBT, not because it was a bad idea (though perhaps it was) but because few have ever bothered to really understand it. In practice this feature is rarely used, in part because it is rarely implemented to its full potential.

Proof by Scope: The proof
of this theorem is beyond the
scope of this book.
-- Jonathan Young, PhD.

In the RFCs, the term scope is used as a name for:

  • the set of NetBIOS nodes that participate in an NBT virtual LAN,
  • an identifier used to distinguish one virtual LAN from another, and
  • that which is included within the purpose of the RFC document.

...but the last of these is beyond the scope of this discussion, so let's take a closer look at the first two.

Scope is explained in RFC 1001, Section 9, which starts off by saying:

A "NetBIOS Scope" is the population of computers across which a registered NetBIOS name is known. NetBIOS broadcast and multicast datagram operations must reach the entire extent of the NetBIOS scope.

This basically means all nodes connected to the virtual LAN. So, for B nodes the NetBIOS scope consists of all nodes within the local IP broadcast domain that are running NBT. For P nodes, the NetBIOS scope includes all nodes across the routed internetwork that run NBT and share the same NBNS. For an M or H node, the scope is the union of the local broadcast and the NBNS scopes.

This is all quite straight-forward when all NBT nodes are of the same node type, but strange things can happen when you mix modes, particularly in a routed environment.

P & B: 

Two separate scopes are defined. The B nodes will only see other B nodes on the same wire, and the P nodes will only see other P nodes using the same NBNS. If creating separate NetBIOS vLANs is your goal, then mixing P and B nodes on the same wire is perfectly okay.
 

P & M: 

This results in a single scope. The M nodes perform all of the functions of a P node, including registering their names with the NBNS. Thus, all P nodes can see all M nodes, though M nodes on the same wire can bypass the NBNS when resolving names.
 

B & M: 

On a single, non-routed IP LAN there will be only one scope. The M nodes will register and resolve names via the broadcast mechanism, making their use of the NBNS pointless.

Things start going terribly wrong, though, when the NetBIOS vLAN is distributed across multiple subnets in a routed internetwork. When this happens the result is multiple, intersecting scopes. B nodes on one subnet will not be able to see any node on any other subnet. M nodes will see all other M nodes, but only the B nodes on their local wire. Thus, parts of the NetBIOS vLAN are hidden from other parts, yet all are somewhat connected via the common M node scope.

One result of this mess is the potential for name collisions. A B node could register a name that is already in the NBNS database, and an M node might register a name that one or more B nodes on remote subnets have already claimed. Name resolution then essentially fails because the same name does not resolve to the same IP address across the fractured scope.

The RFCs recognize this potential for disaster and warn against it. See RFC 1001, Section 10.
 

P, B, & M: 

From bad to worse. The P nodes can see all of the M nodes which can see some of the B nodes which cannot see any P nodes at all. B nodes and M (or H) nodes don't mix.

We now have a good handle on our first definition of scope: "the set of NetBIOS nodes that participate in a virtual LAN". What about the second: "an identifier used to distinguish one virtual LAN from another"? (This is a good point at which to get up, stretch, make a nice hot cup of tea for yourself...)

Every scope has a name, called the Scope Identifier (Scope ID). The most common Scope ID is the empty string: "". Indeed, this is the default in Windows, Samba, jCIFS, and every other system encountered so far. The only problem with this name is that it becomes too easy to forget that the Scope ID exists.

We have already seen that distinct NetBIOS vLANs can be created by using the behavior of B, P, M, and H nodes to create separate scopes. For example, multiple scopes are defined when multiple independent NBNS's provide service for P nodes. B nodes on separate IP LANs are also in separate scopes, and so on. The Scope ID provides another, more refined mechanism for separating scopes.
 

I can listen,
but I can't hear.
-- Turn a Deaf Ear,
Rab Noakes
  

Think of an IP LAN with a bunch of B nodes. Some of the B nodes have Scope ID DOG, and others have Scope ID CAT. Only members of scope DOG will listen to messages sent with that ID; the cats will ignore messages sent to the dogs. So, even though all of the B nodes are on the same wire, we have two separate scopes. The same applies to P and M nodes. The Scope IDs identify, and separate, virtual NetBIOS LANs. Note, though, that an NBNS will handle requests from any node regardless of scope. A single NBNS server can, therefore, support multiple scopes.

[Figure 1.5]

According to RFC 1001/1002, a node may belong to more than one scope. In practice, however, it is much easier to choose a single scope and stick with it. This is particularly true for DOS and Windows systems because NetBIOS itself has no concept of scope. The Scope ID is a feature of NBT, and programs that call the NetBIOS API have no way of telling NBT which scope to use.

The RFCs suggest that extensions might be added to NetBIOS to manage scope, but using those extensions would require changes to applications. Further, other NetBIOS transports would not support the extensions which would result in compatibility problems.

Confusion Alert:

Scope IDs are used by the Name Service and the Datagram Service, but not the Session Service. This seems awkward at first, but it makes sense when you consider that the NetBIOS API itself has no knowledge of Scope.

Once again, Scope IDs serve only to identify virtual NetBIOS LANs. They operate at a lower level than the NetBIOS API.
 

1.2.3 Thus Endeth the Overview

Now that you have a clear and precise understanding of the workings of NetBIOS over TCP, go read RFC 1001. That ought to muddy the waters a bit. Clear or not, the next step is to write some code and see what works--and what doesn't. Actual implementation will provide a lot of opportunity to discuss details, bugs, and common errors.


1.3 The Basics of NBT Implementation

This is where the rubber
meets the road.
-- Unknown
  

Ready?

We have identified the three key parts of NBT: the Name Service, the Datagram Service, and the Session Service. This is enough to get us started. We will begin by coding up a simple Name Service Query, just to see what kind of trouble that gets us into.

Before we start, though, it's probably a good idea to check our tools.

Sniffer
You need one of these. If you have Windows systems available, see if you can get a copy of Microsoft's NetMon (Network Monitor). You will want the latest and most complete version. The advantage of NetMon is that Microsoft have included parsers for many of their protocols.

Another excellent choice is Ethereal, an Open Source protocol analyzer portable to most Unix-ish platforms and to Windows. It can create its own captures or read captures made by several other sniffer packages, including TCPDump and NetMon. Richard Sharpe and Tim Potter of the Samba Team have worked on NetBIOS and SMB packet parsers for Ethereal, which helps a big bunch.

Language
There are a lot of programming languages out there. Samba is written in C, and jCIFS is in Java. The key factors when choosing a language for your implementation are:
  • Good network coding capabilities.
  • That warm fuzzy feeling you get when you code in a language you truly grok.
Meditate on that for a while. Bad karma in the coding environment will distract you from your purpose.

Test Environment
If you do not have a couple of hubs, a router, various Windows boxes, and some Samba servers in your home, you may need to do your testing at the office. Netiquette and job security would suggest that you test after hours. (Um... actually, you probably shouldn't do any testing on a production network, and check office policy before you sniff.)

Medication
An aromatic black tea, such as a good Earl Grey, is best. Try Lapsang Souchong to get through really difficult coding sessions. Those sweet, mass-produced, over-caffeinated soft drinks will disturb your focus.

Ready!

In this section, we will implement a broadcast NAME QUERY REQUEST. That is, B mode name resolution. This will allow us to introduce some of the basic concepts and establish a frame of reference. In other words, we have to start somewhere and this seems to be as good a place as any.

1.3.1 You Got the Name, Look Up the Number

Shirley Shirley Bo Birley
Banana Fanna Fo Firley
-- The Name Game,
Shirley Ellis
  

The structure of an NBT name query is similar to that of a Domain Name System query. As RFC 1001, section 11.1.1, explains:

The NBNS design attempts to align itself with the Domain Name System in a number of ways.

The goal of this attempted alignment was an eventual merger between the NBNS and the DNS system. The NBT authors even predicted dynamic DNS update. With Windows 2000, Microsoft did move CIFS naming services to Dynamic DNS, though the mechanism is not quite what was envisioned by the authors of the NBT RFCs.

1.3.1.1 Encoding NetBIOS Names

RFC 1001 & 1002 reference RFC 883 when discussing domain name syntax rules. RFC 883 was later superseded by RFC 1035, but both give the same preferred [4] syntax for domain names:

    <domain>      ::= <subdomain> | " "
    <subdomain>   ::= <label> | <subdomain> "." <label>
    <label>       ::= <letter> [ [ <ldh-str> ] <let-dig> ]
    <ldh-str>     ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
    <let-dig-hyp> ::= <let-dig> | "-"
    <let-dig>     ::= <letter> | <digit>
    <letter>      ::= any one of the 52 alphabetic characters
                      A through Z in upper case and a through
                      z in lower case
    <digit>       ::= any one of the ten digits 0 through 9

This is the syntax that the NBT authors tried to match. Unfortunately, except for the 16-byte length restriction, there are few syntax rules for NetBIOS names. With a few notable exceptions just about any octet value may be used, so the NBT authors came up with a scheme to force NetBIOS names into compliance. Here's how it works:

  • Names shorter than 16 bytes are padded on the right with spaces. Longer names are truncated.
  • Each byte is divided into two nibbles (4 bits each, unsigned) [5]. The result is a string of 32 integer values, each in the range 0..15.
  • The ASCII value of the letter 'A' (65, or 0x41) is added to each nibble and the result is taken as a character. This creates a string of 32 characters, each in the range 'A'..'P'.

This is called First Level Encoding, and is described in RFC 1001, Section 14.1.

Using First Level Encoding, the name "Neko" would be converted as follows:

    char   hex     split   + 'A'     hex     result
    N    = 0x4E    0x04  +  0x41  =  0x45  =  E
                   0x0E  +  0x41  =  0x4F  =  O
    e    = 0x65    0x06  +  0x41  =  0x47  =  G
                   0x05  +  0x41  =  0x46  =  F
    k    = 0x6B    0x06  +  0x41  =  0x47  =  G
                   0x0B  +  0x41  =  0x4C  =  L
    o    = 0x6F    0x06  +  0x41  =  0x47  =  G
                   0x0F  +  0x41  =  0x50  =  P
    ' '  = 0x20    0x02  +  0x41  =  0x43  =  C
                   0x00  +  0x41  =  0x41  =  A
    ' '  = 0x20    0x02  +  0x41  =  0x43  =  C
                   0x00  +  0x41  =  0x41  =  A
     :               :

This results in the string: EOGFGLGPCACACACACACACACACACACACA
Lovely, isn't it?
 

Coding style is very
personal...
-- Linus Torvalds
  

...and here is our first bit of code:

[Listing 1.1]

This function reads up to 16 characters from the input string name and converts each to the encoded format, stuffing the result into the target string dst. The space character (0x20) always converts to the two-character value "CA" so, if the source string is less than 16 bytes, we simply pad the target string with CACA. Note that the target character array must be at least 33 bytes long--one extra byte to account for the nul terminator [6].
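A function matching this description might be sketched as follows. The name L1_Encode() and its calling convention are taken from the way the function is used later in this section; the uchar typedef and the body are assumptions made for illustration, so treat this as one plausible shape rather than the definitive listing.

```c
typedef unsigned char uchar;

/* First Level Encode a NetBIOS name (sketch).
 * dst must have room for at least 33 bytes: 32 encoded
 * characters plus the nul terminator.  Returns dst.
 */
uchar *L1_Encode( uchar *dst, const uchar *name )
  {
  int i = 0;    /* Index into the source name.    */
  int j = 0;    /* Index into the encoded string. */

  /* Encode up to 16 bytes, high nibble first, adding 'A' to each. */
  while( (i < 16) && ('\0' != name[i]) )
    {
    dst[j++] = (uchar)('A' + ((name[i] >> 4) & 0x0F));
    dst[j++] = (uchar)('A' + (name[i] & 0x0F));
    i++;
    }

  /* A space (0x20) always encodes as "CA", so pad with that. */
  while( j < 32 )
    {
    dst[j++] = 'C';
    dst[j++] = 'A';
    }

  dst[32] = '\0';
  return( dst );
  }
```

Calling L1_Encode( buf, "Neko" ) with a 33-byte buf fills it with the string shown above. Note that this sketch encodes the name exactly as given; it makes no attempt to change the case of the input.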
 



Bedevere:
  Oooohoohohooo!
Lancelot:
  No, no. 'Aaaauugggh',
at the back of the throat.
Aaauugh.
-- Monty Python and the Holy Grail,
Monty Python's Flying Circus
  

Typo Alert:

RFC 1001 provides an example of First Level Encoding in Section 14.1. The string "The NetBIOS name" is encoded as:

FEGHGFCAEOGFHEECEJEPFDCAHEGBGNGF

Decoding this string, however, we get "Tge NetBIOS tame".
Perhaps it's a secret message.

The correct encoding would be:

FEGIGFCAEOGFHEECEJEPFDCAGOGBGNGF
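
The typo is easy to confirm by running the encoding in reverse. Here is a minimal decoding sketch; the name L1_Decode(), the uchar typedef, and the error convention are all invented for this example.

```c
typedef unsigned char uchar;

/* Reverse First Level Encoding (sketch).
 * src must hold 32 characters in the range 'A'..'P'.
 * dst must have room for 17 bytes: 16 decoded bytes plus a nul.
 * Returns 0 on success, or -1 if src is too short or contains
 * a character outside 'A'..'P'.
 */
int L1_Decode( uchar *dst, const uchar *src )
  {
  int i;

  for( i = 0; i < 16; i++ )
    {
    uchar hi = src[2 * i];
    uchar lo;

    /* A premature nul fails this range check too. */
    if( (hi < 'A') || (hi > 'P') )
      return( -1 );
    lo = src[(2 * i) + 1];
    if( (lo < 'A') || (lo > 'P') )
      return( -1 );

    /* Subtract 'A' from each character and rejoin the nibbles. */
    dst[i] = (uchar)(((hi - 'A') << 4) | (lo - 'A'));
    }

  dst[16] = '\0';
  return( 0 );
  }
```

Feeding the RFC's string to this function yields "Tge NetBIOS tame", while the corrected string decodes to "The NetBIOS name". Keep in mind that a decoded name may legitimately contain trailing spaces or other odd octets, so it is safer to treat the result as a 16-byte array than as a printable string.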

1.3.1.2 Fully Qualified NBT Names

Now that we've managed to convert the NetBIOS name into a DNS-aligned form, it is time to combine it with the NBT Scope ID. The result will be a fully-qualified NBT address, which we will call the "NBT Name". To be pedantic, when the RFCs talk about First Level Encoding, this fully qualified form is what they really mean.

As expected, the syntax of the Scope ID follows the DNS recommendations given in RFC 883 (and repeated in RFC 1035). That is, a Scope ID looks like a DNS name. So, if the Scope ID is cat.org, and the NetBIOS name is Neko, the resultant NBT name would be:

EOGFGLGPCACACACACACACACACACACACA.CAT.ORG

Imagine typing that into your web browser. This is why the RFC 1001/1002 scheme for merging the NBNS with the DNS never took hold.

1.3.1.3 Second Level Encoding

Now that we have an NBT name in a nice familiar format, it is time to convert it into something else.

DNS names (and, therefore, NBT names) are made up of labels separated by dots. Dividing the name above into its component labels gives us:

    length  label
    32      EOGFGLGPCACACACACACACACACACACACA
    3       CAT
    3       ORG
    0       <nul>

The Second Level Encoded NBT name is a concatenation of the lengths and the labels, as in:

'\x20' + "EOGFGLGPCACACACACACACACACACACACA" + '\x03' + "CAT" + '\x03' + "ORG" + '\0'

The empty label at the end is important. It is a label of zero length, and it represents the root of the DNS (and NBT) namespace. That means that the final nul byte is part of the encoded NBT name, and not a mere terminator. In practice, you can manipulate the encoded NBT name as if it were a nul-terminated string, but always keep in mind that it is really a series of length-delimited strings [7].
 

Any useful piece of code
deserves to be rewritten
at least once.
-- Unknown
  

Our second bit of code will convert a NetBIOS name and Scope ID into a Second Level Encoded string:

[Listing 1.2]

Not the prettiest piece of code, but it does the job. We will run through the function quickly, just to familiarize you with the workings of this particular programmer's twisted little brain. If the code is fairly obvious to you, feel free to skip ahead to the next section.

  if( NULL == L1_Encode( &dst[1], name ) )
    return( -1 );
  dst[0] = 0x20;
  lenpos = 33;

Call L1_Encode() to convert the NetBIOS name into its First Level Encoded form, then prefix the encoded name with a length byte. This gives us the first label of the encoded NBT name. Note that we check for a NULL return value. This is paranoia on the programmer's part since this version of L1_Encode() does not return NULL. (An improved version of L1_Encode() might return NULL if it detected an error.)

The variable lenpos is set to the offset at which the next length byte will be written. The L1_Encode() function has already placed a nul byte at this location so, if the scope string is empty, the NBT name is already completely encoded.

  if( '\0' != *scope )
    {
    do
      {
      :
      } while( '.' == *(scope++) );
    dst[lenpos] = '\0';
    }

The processing of scope labels is contained within the do..while loop. If the scope is empty, then we can skip this loop entirely. Note that the root label is added to the end of the target string, dst, following the scope labels.

    for( i = 0, j = (lenpos + 1);
         ('.' != scope[i]) && ('\0' != scope[i]);
         i++, j++)
      dst[j] = scope[i];

Run through the current label, copying it to the destination string. The variable i keeps track of the length of the label. A dot or a nul will mark the end of the current label.

      dst[lenpos] = (uchar)i;
      lenpos     += i + 1;
      scope      += i;

Write the length byte for the current label, and then move on to the next by advancing lenpos. The variable scope is advanced by the length of the current label, which should leave it pointing to the dot or nul that terminated the label. It will be advanced one more byte within the while clause at the end of the loop.
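
Putting the fragments back together gives something like the sketch below. A version of L1_Encode() is repeated so that the block stands on its own. The function name L2_Encode(), the uchar typedef, and the return value (the total encoded length, counting the root label) are assumptions made for illustration.

```c
#include <stddef.h>

typedef unsigned char uchar;

/* First Level Encoding; dst needs at least 33 bytes. */
uchar *L1_Encode( uchar *dst, const uchar *name )
  {
  int i = 0;
  int j = 0;

  while( (i < 16) && ('\0' != name[i]) )
    {
    dst[j++] = (uchar)('A' + ((name[i] >> 4) & 0x0F));
    dst[j++] = (uchar)('A' + (name[i] & 0x0F));
    i++;
    }
  while( j < 32 )
    {
    dst[j++] = 'C';
    dst[j++] = 'A';
    }
  dst[32] = '\0';
  return( dst );
  }

/* Second Level Encoding (sketch).
 * Returns the number of bytes written to dst, including the
 * zero-length root label, or -1 on error.  The caller must
 * supply a dst large enough for the name plus the scope.
 */
int L2_Encode( uchar *dst, const uchar *name, const uchar *scope )
  {
  int lenpos;
  int i;
  int j;

  /* First label: length byte 0x20, then the 32 encoded bytes. */
  if( NULL == L1_Encode( &dst[1], name ) )
    return( -1 );
  dst[0] = 0x20;
  lenpos = 33;      /* dst[33] already holds the root label. */

  if( '\0' != *scope )
    {
    do
      {
      /* Copy one scope label, counting its length in i. */
      for( i = 0, j = (lenpos + 1);
           ('.' != scope[i]) && ('\0' != scope[i]);
           i++, j++ )
        dst[j] = scope[i];

      dst[lenpos] = (uchar)i;   /* Length prefix for this label.  */
      lenpos     += i + 1;      /* Position of next length byte.  */
      scope      += i;          /* Now at the dot or the nul.     */
      } while( '.' == *(scope++) );
    dst[lenpos] = '\0';         /* The root label.                */
    }

  return( lenpos + 1 );
  }
```

With the name "Neko" and the scope "CAT.ORG" this produces the 42-byte sequence shown earlier; with an empty scope it produces 34 bytes. The sketch copies the scope as given and does not validate it, so overlong labels, empty labels, and undersized buffers are left for a more careful implementation to catch.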

Hopefully that was a nice, short waste of time. As we progress, it will become necessary to move more quickly and provide less code and less analysis of the code. There is a lot of ground to cover.

1.3.1.4 Name Service Packet Headers

Once again, our attention is drawn to the ancient lore of RFC 883, which was written about four years ahead of RFC 1001/1002 and was eventually replaced by RFC 1035. The comings and goings of the RFCs are a study unto themselves.

NBT Name Service packets are an intentional rip-off of DNS Messages. New flag field values, operation codes, and return codes were added but the design was in keeping with the goal of eventually merging NBNS services into the DNS.

This, conceptually, is what a Name Service packet header looks like:

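      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                  NAME_TRN_ID                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    | R|  OPCODE   |      NM_FLAGS      |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    QDCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ANCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    NSCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ARCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+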

...and here is a description of the fields:

NAME_TRN_ID:   A two-byte transaction identifier. Each time the NBT Name Service starts a new transaction it assigns an ID so that it can figure out which responses go with which requests. The obvious way to handle this is to start with zero and increment each time one is used, allowing rollover at 0xFFFF.

For our purposes any number will do. So we will pick something semi-random for ourselves. How 'bout 1964?
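
For a real implementation, the increment-and-roll-over approach needs very little code. A minimal sketch, with the function name NextTrnId() invented for this example:

```c
#include <stdint.h>

/* Return the current transaction ID and advance the counter.
 * The counter is an unsigned 16-bit value, so 0xFFFF + 1
 * wraps to 0x0000 without any special handling.
 */
uint16_t NextTrnId( uint16_t *counter )
  {
  uint16_t id = *counter;

  *counter = (uint16_t)(*counter + 1);
  return( id );
  }
```

The caller owns the counter, which keeps the function free of hidden state. A multi-threaded name service would need to protect the counter; that is beyond what this sketch attempts.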

R:   This one-bit field indicates whether the packet is:
0 ==  a request, or
1 ==  a response.

Ours is a request. It initiates a transaction, so we will use 0.

OPCODE:   Six operations are defined by the RFCs. These are:
0x0 ==  Query
0x5 ==  Name Registration
0x6 ==  Name Release
0x7 ==  WACK (Wait for Acknowledgement)
0x8 ==  Name Refresh
0x9 ==  Name Refresh (Alternate Opcode)

The 0x9 OpCode value is the result of a typo in RFC 1002. In section 4.2.1.1 a value of 0x8 is listed, but section 4.2.4 shows a value of 0x9. A sensible implementation will handle either, though 0x8 is the preferred value.

One more OpCode was added after the RFCs were published:

0xF ==  Multi-Homed Name Registration

Our immediate interest, of course, is with the Query operation--OpCode value 0x0.

NM_FLAGS:   As the name suggests, this is a set of one-bit flags, as follows:

  0   1   2   3   4   5   6
+---+---+---+---+---+---+---+
| AA| TC| RD| RA| 0 | 0 | B |
+---+---+---+---+---+---+---+

We will go into the details of this later on. For now, note that the B flag means "Broadcast" and that we are attempting to do a broadcast query, so we will want to turn this bit on. We will also set the RD flag. RD stands for "Recursion Desired"; for now, just take it on faith that this bit should be set. All others will be clear (zero).

RCODE:   Return codes. This field is always 0x00 in request packets (those with an R value of zero). Each response packet type has its own set of possible RCODE values.

QDCOUNT:   The number of names that follow in the query section. We will set this to 1 for our broadcast name query.

ANCOUNT:   The number of answers in a POSITIVE NAME QUERY RESPONSE message. This field will be used in the replies we get in response to our broadcast query.

NSCOUNT:  
ARCOUNT:  
These are "Name Service Authority Count" and "Additional Record Count", respectively. These can be ignored for now.

So, for a broadcast NAME QUERY REQUEST, our header will look like this:

 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
1964
0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0
1
0
0
0

To make it easier to write the code for the above query, we will hand-convert the header into a string of bytes. We could do this in code (in fact, that will be necessary for a real implementation), but dealing with such details at this point would be an unnecessary tangent. So...

  unsigned char header[] =
    {
    0x07, 0xAC, /* 1964 == 0x07AC.     */
    0x01, 0x10, /* 0 0000 0010001 0000 */
    0x00, 0x01, /* One name query.     */
    0x00, 0x00, /* Zero answers.       */
    0x00, 0x00, /* Zero authorities.   */
    0x00, 0x00  /* Zero additional.    */
    };
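For readers who would rather see the conversion done in code, here is a minimal sketch of one way to build the same twelve bytes from the individual fields. It is not one of the book's listings; the function name nbt_put_header() and the NBT_* constant names are invented for this illustration.

```c
#include <stdint.h>

/* Illustrative names, not taken from the RFCs or from the book's listings. */
#define NBT_OP_QUERY 0x0   /* OPCODE: Query                      */
#define NBT_NM_RD    0x10  /* NM_FLAGS bit 2: Recursion Desired  */
#define NBT_NM_B     0x01  /* NM_FLAGS bit 6: Broadcast          */

/* Store a 16-bit value in network byte order (high byte first). */
static void put16(unsigned char *p, uint16_t v)
{
  p[0] = (unsigned char)(v >> 8);
  p[1] = (unsigned char)(v & 0xFF);
}

/* Fill a 12-byte Name Service header.  The three trailing count
 * fields are set to zero.  Returns the number of bytes written. */
int nbt_put_header(unsigned char *hdr, uint16_t trn_id, unsigned r,
                   unsigned opcode, unsigned nm_flags, unsigned rcode,
                   uint16_t qdcount)
{
  /* R is 1 bit, OPCODE 4 bits, NM_FLAGS 7 bits, RCODE 4 bits. */
  uint16_t flags = (uint16_t)(((r        & 0x1u)  << 15) |
                              ((opcode   & 0xFu)  << 11) |
                              ((nm_flags & 0x7Fu) <<  4) |
                               (rcode    & 0xFu));

  put16(hdr + 0,  trn_id);   /* NAME_TRN_ID */
  put16(hdr + 2,  flags);    /* FLAGS       */
  put16(hdr + 4,  qdcount);  /* QDCOUNT     */
  put16(hdr + 6,  0);        /* ANCOUNT     */
  put16(hdr + 8,  0);        /* NSCOUNT     */
  put16(hdr + 10, 0);        /* ARCOUNT     */
  return 12;
}
```

Calling nbt_put_header( hdr, 1964, 0, NBT_OP_QUERY, NBT_NM_RD | NBT_NM_B, 0, 1 ) should produce the same bytes as the hand-converted array above.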

1.3.1.5 The Query Entry

The query entries follow the header. A query entry consists of a Level 2 Encoded NBT name followed by two additional fields: the QUESTION_TYPE and QUESTION_CLASS. Once again, this is taken directly from the DNS query packet.

Under NBT, the QUESTION_TYPE field is limited to two possible values, representing the two types of query that are defined in the RFCs. These are:

NB == 0x0020  The NAME QUERY REQUEST, which we will use to perform our broadcast query.
NBSTAT == 0x0021  The NODE STATUS REQUEST, also known as an "Adapter Status" query. The latter is a reference to the original NetBIOS API command name.

Only one QUESTION_CLASS is defined for NBT, and that is the Internet Class: 0x0001.

So, our completed NAME QUERY REQUEST packet will consist of:

  • the NBT header, as given above,
  • the Second Level encoded NBT name,
  • the unsigned short values 0x0020 and 0x0001.
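Stringing those three pieces together is mostly a matter of copying bytes. The following is a rough sketch, assuming the header and the Second Level encoded name have already been prepared; the function name nbt_build_query() is made up for this example and is not the code used in the book's listing.

```c
#include <stddef.h>
#include <string.h>

#define NBT_HDR_LEN 12  /* the Name Service header is always 12 bytes */

/* Concatenate header, encoded name, and the QUESTION_TYPE/QUESTION_CLASS
 * pair into pkt.  l2len must include the terminating nul of the encoded
 * name.  Returns the packet length, or -1 if pkt is too small. */
int nbt_build_query(unsigned char *pkt, size_t pktsize,
                    const unsigned char *hdr,
                    const unsigned char *l2name, size_t l2len)
{
  /* NB (0x0020) and IN (0x0001), already in network byte order. */
  static const unsigned char tail[4] = { 0x00, 0x20, 0x00, 0x01 };
  size_t total = NBT_HDR_LEN + l2len + sizeof tail;

  if (total > pktsize)
    return -1;

  memcpy(pkt, hdr, NBT_HDR_LEN);
  memcpy(pkt + NBT_HDR_LEN, l2name, l2len);
  memcpy(pkt + NBT_HDR_LEN + l2len, tail, sizeof tail);
  return (int)total;
}
```

With no Scope ID the encoded name is 34 bytes, so the query comes to 12 + 34 + 4 = 50 bytes. A scope of CAT.ORG adds two four-byte labels, giving 58 bytes.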

1.3.1.6 Some Trouble Ahead

He felt that if once he went
beyond the crown of the pass
and took one step veritably
down into the land of Mordor,
that step would be irrevocable.
He could never come back.
-- J.R.R. Tolkien
The Lord of the Rings
  

It would seem that it should now be easy to send a broadcast name query. Just put the pieces together and send them to UDP port 137 at the broadcast address. Yes, that should be easy...except that we are now crossing the line between theory and practice, and that means trouble. Be brave.

Upper Case/lower case

RFCs 883 and 1035 state that DNS name lookups should be case-insensitive. That is, CAT.ORG is equivalent to cat.org and Cat.Org, etc. Case-insensitive comparison is not difficult, and First Level Encoding always produces a string of upper-case characters in the range 'A'..'P', so we should have no trouble comparing

EOGFGLGPCACACACACACACACACACACACA.CAT.ORG  against
EOGFGLGPCACACACACACACACACACACACA.cat.org

...but what about the original NetBIOS name? The strings "Neko" and "NEKO" translate, respectively, to

EOGFGLGPCACACACACACACACACACACACA and
EOEFELEPCACACACACACACACACACACACA

These strings do not match, and so we seem to have a problem. Are the two original names considered equivalent? If so, how should we handle them?

RFC 1001 and 1002 do not provide answers to these questions, so we need to look to other sources. Of course, the ultimate source for Truth and Wisdom is empirical information. That is: what actually happens on the wire? A little packet sniffing and a few simple tests will provide the answers we need. Here's the plan:

  1. Use lower-case or mixed-case names when configuring your test hosts.
  2. Set up your sniffer to capture packets on UDP port 137.
  3. Start or restart your test hosts.
  4. After a few minutes, stop the capture.

If your sniffer can decode the NetBIOS names (Ethereal and NetMon can) then you will see that the NetBIOS names are all in upper case. This is normal behavior for NBT, even though it is not documented in the RFCs. Scope strings are also converted to upper case before on-the-wire use.

Here is another interesting test that you can perform if you have both Windows and Samba systems (and/or others) on your network:

  1. Modify the NBT Name Query code (in listing 1.3, below) so that it converts the NetBIOS name to lower case rather than upper case. (That is, change toupper to tolower.)
  2. Recompile.
  3. Start your sniffer.
  4. Use the code to send some queries. Try group names in particular.

In tests using Samba 2.0 and Windows, Samba servers respond to these lower-case queries while the Windows systems do not. This suggests that Windows compares the encoded string, while Samba is decoding the string and performing a case-insensitive comparison on the result [8]. One might argue that Samba's behavior is more robust, but comparing names without decoding them first is certainly faster.

NetBIOS Name Syntax

We have specified a syntax for NBT names. So far, however, we have said little about the syntax for the original NetBIOS name. RFC 1001 says only that the name may not begin with an asterisk ('*'), because the asterisk is used for 'wildcard' queries.

At the low level there are few real rules for forming NetBIOS names other than the length limit. Applications, however, often place restrictions on the names that users may choose. Windows, for example, will only allow a specific set of printable characters for workstation names, yet Microsoft's nbtstat program is much more accepting.

For the implementor, this is a bit of a problem. You want to be sure that your code can handle any bizarre name that reaches it over the network, yet help the user avoid choosing names that might cause another system to choke. A good rule of thumb is to warn users against choosing any character that is not legal in a "best practices" DNS label.
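As a rough illustration of that rule of thumb, the check below accepts only ASCII letters, digits, and interior hyphens. That is one common reading of a "best practices" DNS label, and it is an assumption on our part; the function name nb_name_is_dns_friendly() is likewise invented. Something like this belongs in the user interface, to generate a warning. It should not be used to reject names that arrive over the wire.

```c
#include <stddef.h>
#include <string.h>

/* ASCII-only test, so the result does not depend on the C locale. */
static int is_ascii_alnum(unsigned char c)
{
  return (c >= '0' && c <= '9') ||
         (c >= 'A' && c <= 'Z') ||
         (c >= 'a' && c <= 'z');
}

/* Return 1 if name is 1..15 characters long, uses only letters, digits,
 * and hyphens, and neither starts nor ends with a hyphen.  Return 0
 * otherwise (meaning "warn the user", not "the name is illegal"). */
int nb_name_is_dns_friendly(const char *name)
{
  size_t len = strlen(name);
  size_t i;

  if (len < 1 || len > 15)
    return 0;
  if (!is_ascii_alnum((unsigned char)name[0]) ||
      !is_ascii_alnum((unsigned char)name[len - 1]))
    return 0;
  for (i = 1; i + 1 < len; i++)
    if (!is_ascii_alnum((unsigned char)name[i]) && name[i] != '-')
      return 0;
  return 1;
}
```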

Padding Permutations

The space character (0x20) is the designated padding character for NetBIOS names, but there are a few exceptions. One of these is associated with wildcard queries. When sending a wildcard query, the name is padded with nul bytes rather than spaces, giving:

'*'  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Which translates to: CKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Samba will respond to either space or nul padded wildcard queries, but Windows will only respond if the name is nul padded [9]. Once again, this indicates that Samba decodes NBT names before comparison, but Windows does not.

Microsoft has added a few other non-space-padded names as well. These are special case names, used with particular applications. Still, they demonstrate the need for flexibility in our encoding and decoding functions.

Label Length Limits

We did not bother to mention earlier that the label length bytes placed before each label during Second Level Encoding are not 8-bit values. The uppermost two bits are used as flags, leaving only 6 bits for the label length. Normally these flag bits are both zero (unset), so we can ignore them for now and (as with so many other little details) deal with them later on.

With only 6 bits, the length of each label is limited to 63 characters. The overall length of the Second Level Encoded string is further limited to 255 bytes. Our example code does not have any checks to ensure that the Scope ID has the correct syntax, though such tests would be required in any "real" implementation.
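To give an idea of what such a test might involve, here is a small sketch that checks only the two length limits just described (it does not look at the characters used). The function name nbt_l2_length() is hypothetical. It walks a dotted Scope ID and returns the length the Second Level Encoded name would have, or -1 if a limit is broken.

```c
#include <stddef.h>

/* Return the length, in bytes, of the Second Level Encoded NBT name
 * that would result from the given Scope ID, or -1 if any label is
 * empty or longer than 63 bytes, or if the total exceeds 255 bytes.
 * An empty or NULL scope means "no Scope ID". */
int nbt_l2_length(const char *scope)
{
  int total  = 34;  /* length byte + 32 encoded bytes + terminating nul */
  int lablen = 0;

  if (scope == NULL || *scope == '\0')
    return total;

  for (;; scope++) {
    if (*scope == '.' || *scope == '\0') {
      if (lablen < 1 || lablen > 63)
        return -1;              /* empty or over-long label           */
      total += lablen + 1;      /* the label plus its length byte     */
      if (total > 255)
        return -1;              /* whole name is too long             */
      lablen = 0;
      if (*scope == '\0')
        break;
    } else {
      lablen++;
    }
  }
  return total;
}
```

For example, a scope of "cat.org" yields 42, while "cat..org" is rejected because of the empty label. Note that this sketch also rejects a trailing dot.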

The Fine Print at the End

The RFCs do not say so, but the last byte of the NetBIOS name is reserved.

The practice probably goes back to the early days and IBM. The 16th byte of a NetBIOS name is used to designate the purpose of the name. This byte is known as the "suffix" (or sometimes the "type byte"), and it contains a value which indicates the type of service that registered the name. Some example suffix values include:

0x00 == Workstation Service (aka Machine Name or Client Service)
0x03 == Messenger Service
0x20 == File Server Service
0x1B == Domain Master Browser

The care and feeding of suffix values is yet another topic to be covered in detail later on. A suffix value of 0x00 is fairly common, so we will use that in our broadcast query. Note that this changes the encoding of the NetBIOS name. Once again using the name Neko:

instead of EOEFELEPCACACACACACACACACACACACA,
you get EOEFELEPCACACACACACACACACACACAAA.
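The upper-casing, padding, and suffix rules can all be folded into the First Level Encoding step. What follows is a minimal sketch of that idea, assuming a destination buffer of at least 33 bytes. It is not the L1_Encode() function from the book's listings, and the name l1_encode_sketch() is made up.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* First Level Encode a NetBIOS name.  The name is converted to upper
 * case, truncated or padded (using pad) to 15 bytes, and sfx is used
 * as the 16th byte.  dst must have room for 32 characters plus a nul. */
void l1_encode_sketch(char *dst, const char *name,
                      unsigned char pad, unsigned char sfx)
{
  size_t len = strlen(name);
  size_t i;

  if (len > 15)
    len = 15;

  for (i = 0; i < 16; i++) {
    unsigned char c;

    if (i == 15)
      c = sfx;                  /* suffix byte: never upper-cased */
    else if (i < len)
      c = (unsigned char)toupper((unsigned char)name[i]);
    else
      c = pad;

    /* Each nibble becomes one character in the range 'A'..'P'. */
    dst[2 * i]     = (char)('A' + (c >> 4));
    dst[2 * i + 1] = (char)('A' + (c & 0x0F));
  }
  dst[32] = '\0';
}
```

Given "Neko" or "NEKO", a space pad, and a suffix of 0x00, this should produce the second string shown above. Given "*" with both pad and suffix set to '\0', it should produce the wildcard encoding CKAAAA... from the previous section.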

Shorthand Alert:

When writing a NetBIOS name, the suffix value is often specified in hex, surrounded by angle brackets. So, the name NEKO with a suffix of 0x1D would be written:

NEKO<1D>

It is also fairly common to use the '#' character to indicate the suffix:

NEKO#1D

We will use the angle bracket notation where appropriate.
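A debugging tool will want to print names in this notation. Here is one possible sketch, which takes the raw 16-byte name, strips trailing space padding, and appends the suffix in angle brackets. The function name nb_name_format() is invented for this example, and the sketch assumes the first 15 bytes are printable; a name containing nul bytes will be cut short.

```c
#include <stddef.h>
#include <stdio.h>

/* Write a 16-byte NetBIOS name to dst as NAME<XX>, where XX is the
 * suffix byte in hex.  Trailing spaces in the 15-byte name portion
 * are dropped.  Returns the value returned by snprintf(). */
int nb_name_format(char *dst, size_t dstsize, const unsigned char name[16])
{
  int len = 15;

  while (len > 0 && name[len - 1] == ' ')
    len--;

  /* "%.*s" prints at most len bytes of the name portion. */
  return snprintf(dst, dstsize, "%.*s<%02X>",
                  len, (const char *)name, (unsigned)name[15]);
}
```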
 

1.3.1.7 Finally! A Simple Broadcast Name Query

...almost, but not quite,
entirely unlike tea...
-- The Hitchhiker's Guide
To The Galaxy,
Douglas Adams
  

This next bit of code is full of shortcuts. The packet header is hard-coded, as are the QUESTION_TYPE and QUESTION_CLASS. No syntax checking is done on the scope string. Worst of all, the program sends the query but does not bother to listen for a reply. For that, we will use a sniffer.

Tools such as the nmblookup utility that comes with Samba, or Microsoft's nbtstat program, could also be used to send a name query. The goal, however, is to implement these tools on our own, and the next bit of code gives us a start [10].

[Listing 1.3]

The updated L1_Encode() function takes two new parameters: pad and sfx. These allow us to specify the padding character and the suffix, respectively. The L2_Encode() function also takes these additional parameters, so that it can pass them along to L1_Encode(), and both functions make use of toupper() to ensure that the NetBIOS name and Scope ID are in upper case.

The function Send_Nbtn_Bcast() does the job of transmitting a block of data via UDP. The destination is port UDP/137 at the universal broadcast address. The program mainline simply strings together the various pieces of the NBT query, taking the NetBIOS name and Scope ID from the command line.

Compile the code and give the executable the name namequery. The program takes one or two arguments. The first is the NetBIOS name, and the second is the Scope ID (the Scope ID is optional). For example, on a Unix system the command line (including the $ prompt) might be:

$ namequery neko cat.org

Start your sniffer with the filter set to capture only packets sent to/from UDP port 137. If you are using TCPDump or Ethereal the filter string is: udp port 137. Depending on your OS, you may need to have Root or Administrator privilege in order to run the sniffer.

Run namequery with the input shown above, and then stop the capture. You should get something like this:

    + Frame 1 (100 on wire, 100 captured)
    + Ethernet II
    + Internet Protocol
    + User Datagram Protocol
    - NetBIOS Name Service
         Transaction ID: 0x07ac
       + Flags: 0x0110 (Name query)
         Questions: 1
         Answer RRs: 0
         Authority RRs: 0
         Additional RRs: 0
       - Queries
         + NEKO           <00>.CAT.ORG: type NB, class inet

This example is copied from Ethereal output.

Compare the parsed output provided by the sniffer against the hard-coded information in the program. They should match up. Next, try a query using a name on your own network and take a look at the response. If you use the name of a Workgroup or NT Domain, you may get responses from several systems.

Another way to get multiple replies is to use a wildcard query. If all NBT nodes on your local LAN use the same Scope ID, and if they are not P nodes, then they will all respond to the wildcard name. To try this, you must first change the call to L2_Encode() within main() so that it passes '\0' as the padding character. That is:

total_len += L2_Encode( &bufr[total_len], name, '\0', '\0', scope );

...then recompile and give the asterisk as the NetBIOS name:

$ namequery "*"

Try using other tools such as nbtstat in Windows or Samba's nmblookup to generate queries, and spend a bit of time looking at the results of these captures. You can also simply let the sniffer run for a while. If your network is active you will see all sorts of NetBIOS packets fly by (particularly if you are on a shared rather than a switched LAN).

1.3.2 Interlude

We now have method, madness, and a vague sense of direction. We are ready to head out on the open code. Let us first take a moment to meditate on what we have covered so far. Start by considering this mental image...

Imagine a cold, rainy autumn day. Still thinking of summer, you have forgotten to wear a jacket. The chill of the rain runs through your entire body as you hurry along the street. You try to keep your neck dry by pulling up your thin sweater and hunching your shoulders. Down the road you spot a café. It looks warm and bright inside. You quicken your pace, then dash through the door as the drizzly rain becomes more enthusiastic and thunder rumbles in the distance.

The shop is cozy, but not too small. There are potted plants scattered about. Light jazz plays over well-hidden speakers. The clientele are trendy urban business types having quiet, serious discussions in pairs at small tables. Paintings by a local artist hang on the walls.

You step to the counter. A young woman with a dozen earrings and short-cut hair smiles and asks you what you would like. A nice, hot cup of tea. She reaches down behind the counter and grabs a large white mug. Then she opens a box and pulls out a tea bag that is at least three years old, drops it into the mug, and pours in hot water from the sink. "Three dollars" she says, still smiling.

If you are a coffee drinker, you probably don't understand. Replace the words "opens a box and pulls out a tea bag" with "opens a jar and scoops out one spoonful of freeze-dried instant" and you will get the point. The point is that details matter. Certainly, an old tea bag in warm water will make a cup of tea...but not one worth drinking [11].

Just so, our examples provide some working code but are far from satisfying. If we are going to write something truly enjoyable we need to dig into the details.

Let's get to it.


1.4 The Name Service in Detail

This is gonna hurt me
more than it does you.
-- common lie
  

Think of the Name Service as a database system. The data may be stored in an NBNS server (P mode), distributed across all of the participating nodes in an IP subnet (B mode), or a combination of the two (M or H mode).

Name Service messages are the transactions that maintain and utilize the NBT name-to-IP address mapping database. These transactions fall into three basic categories:

Name Registration/Refresh: The process by which an application adds and maintains a NetBIOS Name to IP address mapping within an NBT scope.
Name Query:  The process of resolving a NetBIOS name to an IP address.
Name Release: The process by which a NetBIOS name to IP address mapping is removed from within an NBT scope.

These three represent the lifecycle of an NBT name.
 

Hello, I love you
Won't you tell me your name?
-- Hello, I Love You
The Doors
  

The RFCs also specify support for the NetBIOS API Adapter Status Query function. Implementation of the Adapter Status Query is quite similar to that of the Name Query, so it gets lumped in with the Name Service. This is fairly reasonable, since the query packets are almost identical and the most important result of the status query is a list of names owned by the target node.

1.4.1 NBT Names: Once More With Feeling

Let's review what we've learned so far:

  • Though the RFCs do not say so, NetBIOS names should be converted to upper case before they are encoded. The practice probably goes back to early IBM implementations. Converting NetBIOS names to upper case allows for comparison of the encoded string, rather than requiring that NBT names be decoded and compared using a case-insensitive function. Some existing implementations use this shortcut, and will not recognize names with encoded lower-case characters.

  • The RFCs list NetBIOS names as being 16 bytes in length. It is common practice, however, to implement NetBIOS names as two subfields: a 15-byte name and a one-byte suffix. (That's what Microsoft does so everyone else has to do it too.) The suffix byte actually winds up being quite useful. The suffix byte is read as an integer in the range 0..255, so it is not converted to upper-case.

  • If the NetBIOS name is less than 15 bytes, it must be padded. The space character (0x20) is the designated padding character (though there are some rare, special-case exceptions).

  • Other than length and padding, the only restriction the RFCs place on the syntax of a NetBIOS name is that it may not begin with an asterisk ('*').

1.4.1.1 Valid NetBIOS Name Characters

Any octet value can be encoded using the first-level mechanism. In theory, then, any eight-bit value can be part of a NetBIOS name. Keep this in mind and be prepared. There are some very strange names in use in the wild.

In practice, implementations do place some restrictions on the characters that may be used in NetBIOS names. These restrictions are implemented at the application layer, and should be considered artificial. Under Windows 9x, for example, the "Network Identity" control panel allows only the following characters in a machine name:

Valid Windows 9x Machine Name Characters

  ' '  == 0x20                    '-' == 0x2D
  '!'  == 0x21                    '.' == 0x2E
  '#'  == 0x23                    '@' == 0x40
  '$'  == 0x24                    '^' == 0x5E
  '%'  == 0x25                    '_' == 0x5F
  '&'  == 0x26                    '{' == 0x7B
  '\'' == 0x27 (single quote)     '}' == 0x7D
  '('  == 0x28                    '~' == 0x7E
  ')'  == 0x29                    Alpha-numeric characters.

Yet the same Windows 9x system may also register the special-purpose name "\x01\x02__MSBROWSE__\x02\x01", which contains control characters as shown.

Note that the set of alpha-numeric characters may include extended characters, such as 'Å' and 'Ü'. Unfortunately, these are often represented by different octet values under different operating systems, or even under different configurations of the same operating system.

Some examples:

  Character   ISO Latin-1   DOS Code Page 437
     'Ä'         0xC4             0x8E
     'Ç'         0xC7             0x80
     'É'         0xC9             0x90
     'Î'         0xCE              -
     'Ö'         0xD6             0x99
     'Ñ'         0xD1             0xA5
     'Ù'         0xD9              -

As you can see, the mapping between character sets can be a bit of a challenge--particularly since there is no standard character set for use in NBT and no mechanism for negotiating a common character set [12].

One more thing to consider when dealing with NetBIOS name characters: Windows NT will generate a warning--and W2K an error--if the Machine Name is not also a valid DNS name. You may need to do some testing to determine which characters Windows considers valid DNS label characters.

1.4.1.2 NetBIOS Names Within Scope

Under NBT, NetBIOS names exist within a scope. The scope is the set of all machines which can "see" the name. For B nodes, the scope is limited to the IP broadcast domain. For P nodes, the scope is limited to the set of nodes that share the same NBNS. For M and H nodes, the scope is the union of the broadcast domain and the shared NBNS.

Scope can be further refined using a Scope ID. The Scope ID effectively sub-divides a virtual NetBIOS LAN into separate, named vLANs. Unfortunately, few (if any) implementations actually support multiple Scope IDs so this feature is of limited practical use.

The syntax of the Scope ID matches the best-practices recommendations for DNS domain names. (Some Windows flavors allow almost any character value in a Scope ID string. Sigh.) Scope IDs should be converted to upper case before use on the wire.
 



The best way to eliminate
the problem is to
remove Scopes completely.
-- John Terpstra,
Samba Team,
in a message to the
Samba-Technical mailing list.
  

Annoyance Alert:

In versions of Windows 95 and '98 that we tested, the Scope ID field in the network setup control panel is greyed out if no WINS server IP address is specified. That is, you cannot enter a Scope ID if your machine is running in B mode.

You can work around this by entering the Scope ID in the right place in the registry, or by entering a (bogus) WINS server IP, entering the Scope ID, saving your changes, rebooting, reopening the network control panel, removing the WINS IP entry, saving your changes, and rebooting again.

The system does not seem to clear the Scope ID once it has been entered. To clear the Scope ID you must either edit the registry, or enter a (bogus) WINS server IP, clear out the Scope ID in the control panel, save your changes, reboot, reopen the network control panel, remove the WINS IP entry, save your changes, and reboot.

Windows NT behaves correctly, and does allow the entry of a Scope ID in B mode.
 


1.4.1.3 Encoding and Decoding NBT Names

First Level Encoding converts a 16-byte NetBIOS name into a 32-byte encoded name, and then combines it with the Scope ID. For example:

"EOGFGLGPCACACACACACACACACACACAAA.CAT.ORG"

We have chosen to call this format the NBT Name. Second Level Encoding is applied to the NBT name to create the on-the-wire format, which we will refer to as the Encoded NBT Name:

"\x20EOGFGLGPCACACACACACACACACACACAAA\x03CAT\x03ORG\0"

As previously described, the maximum length of a label in an NBT name is 63 bytes. This is because the label length field is divided into two sub-fields, the first of which is a two-bit flag field with four possible values:

00 == 0: Label Length
01 == 1: Reserved (unused)
10 == 2: Reserved (unused)
11 == 3: Label String Pointer

With both bits clear (zero) the next 6 bits are the label LENGTH. The LENGTH field is an unsigned integer with a value in the range 0..63.

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|0 0|  LENGTH   |
+-+-+-+-+-+-+-+-+

If both flag bits are set, however, then the next fourteen bits are a "Label String Pointer"; the offset at which the real label can be found.

                     1 1 1 1 1 1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1|   LABEL STRING POINTER    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Label string pointers are used to reduce the size of Name Service messages that might otherwise contain two copies of the same NBT name. For example, a NAME REGISTRATION REQUEST message includes both a QUESTION_RECORD and an ADDITIONAL_RECORD, each of which would otherwise contain the same NBT name. Instead of duplicating the name, however, the ADDITIONAL_RECORD.RR_NAME field contains a label string pointer to the QUESTION_RECORD.QUESTION_NAME field.

Label string pointers are a prime example of the NBT theory/practice dichotomy, and another throw-back to the DNS system. As it turns out, the only label string pointer value ever used in NBT is 0xC00C. The reason for this is quite simple. The NBT header is a fixed size (12 bytes), and is always followed by a block that starts with an encoded NBT Name. Thus, the offset of the first name in the packet is always 12 (0x0C). Any further name field in the packet will point back to the first.
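Even though only one pointer value turns up in practice, code that reads names off the wire still has to look at the two flag bits before trusting the length. Here is a small sketch of that test. The function name nbt_label_kind() is hypothetical, and the caller is assumed to have checked that at least two bytes are available at p.

```c
/* Examine the byte(s) at p.  Returns the two-bit flag value (0..3).
 *   0: *value is the label length (0..63).
 *   3: *value is the 14-bit label string pointer (a packet offset).
 *   1 or 2: reserved; *value is set to zero.                        */
int nbt_label_kind(const unsigned char *p, unsigned int *value)
{
  int flags = p[0] >> 6;              /* the uppermost two bits */

  if (flags == 0)
    *value = p[0] & 0x3Fu;            /* remaining six bits     */
  else if (flags == 3)
    *value = ((p[0] & 0x3Fu) << 8) | p[1];
  else
    *value = 0;
  return flags;
}
```

Fed the bytes 0xC0 0x0C, this returns 3 with a pointer value of 12, which is the offset of the first name in the packet.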

So, the rule of thumb is that the encoded NBT name will always be found at byte offset 0x000C. As a short-cut, some implementations work directly with the encoded name and only bother to decode the name when interacting with a user. Decoding, however, is fairly straightforward:

[Listing 1.4]

The L2_Decode() function copies the encoded NBT name to the destination buffer, skipping the first label length byte and replacing internal label length bytes with the dot character. That is, given the input string:

"\x20EOGFGLGPCACACACACACACACACACACAAA\x03CAT\x03ORG\0"

it will produce the string:

"EOGFGLGPCACACACACACACACACACACAAA.CAT.ORG"

The L1_Decode() function decodes the First Level Encoded NetBIOS name, and hands back the suffix byte as its return value.
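For readers without the listing to hand, here is a compact sketch of the same two steps. These are not the book's L2_Decode() and L1_Decode() functions. The names carry a _sketch suffix to make that clear, and the extra buffer-size parameters are our own addition.

```c
#include <stddef.h>
#include <string.h>

/* Second Level decode: copy the labels in src to dst, separated by
 * dots.  Returns the length of the resulting string, or -1 on a bad
 * label length byte, a missing terminator, or lack of room in dst. */
int l2_decode_sketch(char *dst, size_t dstsize,
                     const unsigned char *src, size_t srclen)
{
  size_t out = 0, pos = 0;

  if (dstsize == 0)
    return -1;
  while (pos < srclen && src[pos] != 0) {
    size_t len = src[pos];

    if (len > 63 || pos + 1 + len > srclen)
      return -1;                     /* flag bits set, or truncated */
    if (pos != 0) {
      if (out + 1 >= dstsize)
        return -1;
      dst[out++] = '.';              /* internal length byte -> dot */
    }
    if (out + len >= dstsize)
      return -1;
    memcpy(dst + out, src + pos + 1, len);
    out += len;
    pos += len + 1;
  }
  if (pos >= srclen)
    return -1;                       /* never found the final nul   */
  dst[out] = '\0';
  return (int)out;
}

/* First Level decode: convert the first 32 characters of nbtname into
 * a 15-byte name (padding left in place) plus a nul.  name must have
 * room for 16 bytes.  Returns the suffix byte, or -1 on bad input. */
int l1_decode_sketch(char *name, const char *nbtname)
{
  unsigned char raw[16];
  int i;

  for (i = 0; i < 16; i++) {
    int hi = nbtname[2 * i] - 'A';
    int lo;

    if (hi < 0 || hi > 15)
      return -1;
    lo = nbtname[2 * i + 1] - 'A';
    if (lo < 0 || lo > 15)
      return -1;
    raw[i] = (unsigned char)((hi << 4) | lo);
  }
  memcpy(name, raw, 15);
  name[15] = '\0';
  return raw[15];
}
```

Run against the example above, l2_decode_sketch() should yield the 40-character NBT name, and l1_decode_sketch() should then give back "Neko" followed by eleven spaces, with a return value (suffix) of 0x00.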

1.4.2 NBT Name Service Packets

RFC 1002 lists 17 different Name Service packet types, constructed from three basic building blocks:

  • A Header
  • Query Records
  • Resource Records

These pieces are described in more detail below.

1.4.2.1 Name Service Headers

The header is an array of six 16-bit values, as follows:

  0   NAME_TRN_ID
  1   FLAGS
  2   QDCOUNT
  3   ANCOUNT
  4   NSCOUNT
  5   ARCOUNT

Managing Name Service headers is fairly straightforward. With the exception of the FLAGS field, all of the fields are simple unsigned integers. The entire thing can be represented in memory as an array of unsigned short int, or whatever is appropriate in your programming language of choice.

The FLAGS field is further broken down thus:

                     1 1 1 1 1 1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| OPCODE|  NM_FLAGS   | RCODE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Handling the bits in the FLAGS field is fairly trivial for any seasoned programmer. One simple solution is to shift the values given in RFC 1002, section 4.2.1.1 into their absolute positions. For example, an OPCODE value of 0x7 (WACK) would be left shifted 11 bits to align it properly in the OPCODE subfield:
 

It's just a jump to the left
-- Time Warp
Richard O'Brien
  

(0x0007 << 11) = 0x3800 = 0011100000000000(bin)

...which puts it where it's supposed to be:

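                     1 1 1 1 1 1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| OPCODE|  NM_FLAGS   | RCODE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|0 1 1 1|0 0 0 0 0 0 0|0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+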
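Going the other way, from a received FLAGS word back to its subfields, is a matter of shifting right and masking. The sketch below shows one way to do it. The struct and function names are invented here, and they are separate from the constants defined in the book's header file.

```c
#include <stdint.h>

/* The four subfields of the FLAGS word, each as a small integer. */
struct nbt_flags {
  unsigned r;         /* 1 bit:  0 == request, 1 == response */
  unsigned opcode;    /* 4 bits                              */
  unsigned nm_flags;  /* 7 bits: AA TC RD RA 0 0 B           */
  unsigned rcode;     /* 4 bits                              */
};

/* Split a FLAGS value (already converted to host byte order). */
struct nbt_flags nbt_split_flags(uint16_t flags)
{
  struct nbt_flags f;

  f.r        = (flags >> 15) & 0x1u;
  f.opcode   = (flags >> 11) & 0xFu;
  f.nm_flags = (flags >>  4) & 0x7Fu;
  f.rcode    =  flags        & 0xFu;
  return f;
}
```

Splitting 0x3800 gives an opcode of 0x7 (WACK) with every other subfield zero. Splitting 0x0110, the FLAGS value from our broadcast query, gives an NM_FLAGS value of 0x11 (RD and B set).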

Listing 1.5 presents nbt_nsHeader.h, a header file that will be referenced as we move forward. It provides a set of re-aligned FLAGS subfield values plus a few extra constants. These values will be covered below, when we explain how to use each of the Name Service message types.

[Listing 1.5]

The NAME_TRN_ID field is the transaction ID, which should probably be handled by the bit of code that sends and receives the NBT messages. Many implementations use a simple counter to generate new transaction IDs (Samba uses a random number generator), but these should always be checked to ensure that they are not, by chance, the same as the transaction ID of a conversation initiated by some other node. Better yet, the originating node's IP address should be used as an additional key for segregating transactions.

The four COUNT fields indicate the number of Question and Resource Records which follow. In theory, each of these fields can contain a value in the range 0..65535. In practice, however, the count fields will contain either 0 or 1 as shown in the record layouts in RFC 1002, section 4.2. It appears as though some implementations either ignore these fields or read them as simple booleans.

One final consideration is the byte order of NBT messages. True to its DNS roots, NBT uses network byte order (big-endian). Some microprocessors--including Alpha, MIPS, and Intel i386 family--use or can use little-endian byte order [13]. If your target system is little-endian, or if you want your code to be portable, you will need to ensure that your integers are properly converted to and from network byte order. Many systems provide the htonl(), htons(), ntohl(), and ntohs() functions for exactly this purpose.
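As a brief sketch of how those functions are typically used, the helpers below move 16-bit and 32-bit integers in and out of a packet buffer. They assume a POSIX system, where the conversion functions are declared in <arpa/inet.h> (other platforms put them elsewhere), and the helper names are our own. Using memcpy() sidesteps any alignment trouble, since fields in an NBT message are not guaranteed to be aligned.

```c
#include <arpa/inet.h>   /* htons(), ntohs(), ntohl() on POSIX systems */
#include <stdint.h>
#include <string.h>

/* Store a host-order 16-bit value at p in network byte order. */
void nbt_store16(unsigned char *p, uint16_t host)
{
  uint16_t net = htons(host);
  memcpy(p, &net, sizeof net);
}

/* Read a network-order 16-bit value from p; return it in host order. */
uint16_t nbt_load16(const unsigned char *p)
{
  uint16_t net;
  memcpy(&net, p, sizeof net);
  return ntohs(net);
}

/* Read a network-order 32-bit value from p; return it in host order. */
uint32_t nbt_load32(const unsigned char *p)
{
  uint32_t net;
  memcpy(&net, p, sizeof net);
  return ntohl(net);
}
```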
 


My brain hurts!
-- Attributed to
Mr. T. F. Gumby
Gumby Brain Specialist
Monty Python's Flying Circus
  

Bizarre Twist Alert:

The SMB protocol was originally built to run on DOS. DOS was originally built to run on Intel chips, so SMB is little-endian...the opposite of the NBT transport!
 

This next bit of code is nbt_nsHeader.c. It shows how to create and parse NBT Name Service headers. As with all of the code presented in this book, it is designed to be illustrative, not efficient. (We know you can do better.)

[Listing 1.6]

1.4.2.2 Name Service Question Records

The question record is also simple. It consists of an encoded NBT name (in the QUESTION_NAME field) followed by two unsigned 16-bit integer fields: the QUESTION_TYPE and QUESTION_CLASS.

The length of an encoded NBT name is at least 34 bytes, but it will be longer if a Scope ID is used, so the QUESTION_NAME field has no fixed length. There is also no padding done to align the integer fields. The QUESTION_TYPE and QUESTION_CLASS follow immediately after the QUESTION_NAME.

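    >= 34 bytes          2 bytes          2 bytes
+-------------------+---------------+----------------+
| QUESTION_NAME ... | QUESTION_TYPE | QUESTION_CLASS |
+-------------------+---------------+----------------+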

There are only two valid values for the QUESTION_TYPE field. These are:

NB  = 0x0020   Indicates a standard Name Query
NBSTAT  = 0x0021   Indicates a Node Status Query

The QUESTION_CLASS field always has a value of:

    IN  = 0x0001   Indicates the "Internet Class"

Go back and take a look at the broadcast name query example presented earlier. In that example, we hard-coded both the NBT Name Service header and the tail-end of the question record. Now that you have a clearer understanding of the fields involved, you should be able to design much more flexible code. Here's a start:

[Listing 1.7]

1.4.2.3 Name Service Resource Records

For convenience, we will break the Resource Record into three sub-parts:

  • the Name section
  • the TTL field
  • the Resource Data section

The Name section has the same structure as a Query Entry record, except that the RR_NAME field may contain a 16-bit label string pointer instead of a complete NBT name.

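 2 bytes or >= 34 bytes    2 bytes    2 bytes
+------------------------+----------+----------+
|      RR_NAME ...       | RR_TYPE  | RR_CLASS |
+------------------------+----------+----------+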

The RR_TYPE field is used to indicate the type of the resource record, which has an effect on the structure of the resource data section. The available values for this field are:

A  == 0x0001 (not used in practice)
NS  == 0x0002 (not used in practice)
NULL  == 0x000A (not used in practice)
NB  == 0x0020
NBSTAT  == 0x0021

The values marked "not used in practice" are described in the RFCs, and indicated as valid values, but are never really used in modern implementations. The value of RR_TYPE will be NB except in a NODE STATUS REPLY, in which case NBSTAT is used.

As with the question record, the RR_CLASS field always has a value of:

    IN  == 0x0001

The TTL field follows the name section. It indicates the "Time To Live" value associated with a resource record. Each NBT name-to-IP address mapping in the NBNS database has a TTL value. This allows records to "fade out" if they are not renewed or properly released. The TTL field is an unsigned 32-bit integer, measured in seconds. A value of zero indicates infinite TTL.

0 1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1
3
1
4
1
5
1
6
1
7
1
8
1
9
2
0
2
1
2
2
2
3
2
4
2
5
2
6
2
7
2
8
2
9
3
0
3
1
TTL

The last sub-part of the resource record is the resource data section, which is made up of two fields:

  2 bytes    RDLENGTH bytes
+----------+----------------+
| RDLENGTH |   RDATA ...    |
+----------+----------------+

The RDLENGTH field is an unsigned 16-bit integer value indicating the length, in bytes, of the RDATA field. The structure of the contents of the RDATA field will vary from message type to message type.
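Putting the three sub-parts together, a parser has to step over the RR_NAME (which may be a two-byte pointer or a full name), then read ten bytes of fixed fields, then note where the RDATA lives. The sketch below is one way to do that, with bounds checks. The struct and function names are made up for this example; this is not the book's listing.

```c
#include <stddef.h>
#include <stdint.h>

struct nbt_rr {
  uint16_t type;               /* RR_TYPE                      */
  uint16_t rr_class;           /* RR_CLASS                     */
  uint32_t ttl;                /* TTL, in seconds              */
  uint16_t rdlength;           /* length of RDATA, in bytes    */
  const unsigned char *rdata;  /* points into the packet       */
};

static uint16_t get16(const unsigned char *p)
{
  return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get32(const unsigned char *p)
{
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Parse the resource record starting at pkt[offset].  Returns the
 * offset of the byte following the record, or -1 if it is malformed
 * or runs past the end of the packet. */
int nbt_parse_rr(const unsigned char *pkt, size_t pktlen,
                 size_t offset, struct nbt_rr *rr)
{
  size_t pos = offset;

  if (pos >= pktlen)
    return -1;
  if ((pkt[pos] & 0xC0) == 0xC0) {
    pos += 2;                      /* label string pointer           */
  } else {
    while (pos < pktlen && pkt[pos] != 0) {
      if (pkt[pos] > 63)
        return -1;                 /* reserved flag bits             */
      pos += (size_t)pkt[pos] + 1;
    }
    pos++;                         /* step over the terminating nul  */
  }

  /* RR_TYPE, RR_CLASS, TTL, and RDLENGTH occupy ten bytes. */
  if (pos > pktlen || pktlen - pos < 10)
    return -1;
  rr->type     = get16(pkt + pos);
  rr->rr_class = get16(pkt + pos + 2);
  rr->ttl      = get32(pkt + pos + 4);
  rr->rdlength = get16(pkt + pos + 8);
  pos += 10;

  if (pktlen - pos < rr->rdlength)
    return -1;
  rr->rdata = pkt + pos;
  return (int)(pos + rr->rdlength);
}
```

The meaning of the bytes at rr->rdata depends on the message type, so this sketch leaves them untouched.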

The Resource Record structure, as described in section 4.2.1.3 of RFC 1002, looks just like this:

                     1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
/                            RR_NAME                            /
/                                                               /
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           RR_TYPE             |          RR_CLASS             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              TTL                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           RDLENGTH            |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
/                                                               /
/                             RDATA                             /
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

It is always good to have some code to play with. This next set of functions can be used to manipulate Resource Records.

[Listing 1.8]

1.4.3 Conversations with the Name Service

We will now introduce a simple syntax for describing how to fill network packets. This syntax is neither standard nor rigorous, just something the author whipped up to help explain what goes into a message. If it looks like someone else's syntax (one which perhaps took long hours of study, concentration, and thought to develop) then apologies are probably in order.

Disclaimer Alert:

Any resemblance to an actual syntax, living or dead, real or imaginary, is entirely coincidental.

A broadcast name query, described using our little syntax, would look like this:

    NAME QUERY REQUEST (Broadcast)
      {
      HEADER
        {
        NAME_TRN_ID = <Set when packet is transmitted>
        FLAGS
          {
          OPCODE = 0x0
          RD     = TRUE
          B      = TRUE
          }
        QDCOUNT = 1
        }
      QUESTION_RECORD
        {
        QUESTION_NAME  = <Encoded NBT Name>
        QUESTION_TYPE  = NB (0x0020)
        QUESTION_CLASS = IN (0x0001)
        }
      }

Basically, the rules are these:

  • If a record (a header, question record, or resource record) is not specified, it is not included in the packet. In the example above there are no resource records specified. We know from the example code that there are no resource records in a NAME QUERY REQUEST.
  • If a field is not specified, it is zeroed. In the example above the RCODE field of the FLAGS sub-record has a value of 0x0, and the NSCOUNT field (among others) also has a value of 0.
  • Comments in angle brackets are short explanations, describing what should go into the field. More complete explanations, if needed, will be found in the accompanying text.
  • Comments in parentheses provide additional information, such as the value of a specified constant.
  • ...and yes, each squirrelly bracket gets its own line.

It's not a particularly formal syntax, but it will serve the purpose.
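To show how the syntax maps onto bytes, here is a minimal sketch that fills in the 12-byte header of the broadcast NAME QUERY REQUEST shown above. The names (nq_bcast_header, NM_RD_BIT, and friends) are invented for this example, and the bit positions assume the FLAGS layout given in RFC 1002. Note how the "unspecified fields are zeroed" rule turns into a single memset().

```c
#include <stdint.h>
#include <string.h>

#define NBT_HDR_LEN      12      /* size of the fixed header       */
#define NM_OPCODE_SHIFT  11      /* OPCODE occupies bits 0x7800    */
#define NM_RD_BIT        0x0100  /* Recursion Desired              */
#define NM_B_BIT         0x0010  /* Broadcast                      */

/* Fill in the header of a broadcast NAME QUERY REQUEST.
 * Every field not named in the packet description stays zero.
 */
void nq_bcast_header(uint8_t hdr[NBT_HDR_LEN], uint16_t trn_id)
{
  uint16_t flags = (uint16_t)((0x0 << NM_OPCODE_SHIFT)  /* OPCODE = 0x0 */
                              | NM_RD_BIT               /* RD = TRUE    */
                              | NM_B_BIT);              /* B  = TRUE    */

  memset(hdr, 0, NBT_HDR_LEN);           /* unspecified means zero */
  hdr[0] = (uint8_t)(trn_id >> 8);       /* NAME_TRN_ID            */
  hdr[1] = (uint8_t)(trn_id & 0xFF);
  hdr[2] = (uint8_t)(flags >> 8);        /* FLAGS                  */
  hdr[3] = (uint8_t)(flags & 0xFF);
  hdr[5] = 1;                            /* QDCOUNT = 1            */
  /* ANCOUNT, NSCOUNT, and ARCOUNT (bytes 6..11) remain zero. */
}
```

With RD and B set and everything else clear, the FLAGS field works out to 0x0110.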

1.4.3.1 Name Registration

Nodes send NAME REGISTRATION REQUEST messages when they wish to claim ownership of a name. The messages may be broadcast on the local LAN (B mode), or sent directly to an NBNS (P mode). (M and H mode are combinations of B and P mode with their own special quirks. We will get to those further on.)

A NAME REGISTRATION REQUEST message looks like this:

    NAME REGISTRATION REQUEST
      {
      HEADER
        {
        NAME_TRN_ID = <Set when packet is transmitted>
        FLAGS
          {
          OPCODE = 0x5 (Registration)
          RD     = TRUE (1)
          B      = <TRUE for broadcast registration, else FALSE>
          }
        QDCOUNT = 1
        ARCOUNT = 1
        }
      QUESTION_RECORD
        {
        QUESTION_NAME  = <Encoded NBT name to be registered>
        QUESTION_TYPE  = NB (0x0020)
        QUESTION_CLASS = IN (0x0001)
        }
      ADDITIONAL_RECORD
        {
        RR_NAME  = 0xC00C (Label String Pointer to QUESTION_NAME)
        RR_TYPE  = NB (0x0020)
        RR_CLASS = IN (0x0001)
        TTL      = <Zero for broadcast, about three days for unicast>
        RDLENGTH = 6
        RDATA
          {
          NB_FLAGS
            {
            G   = <TRUE for a group name, FALSE for a unique name>
            ONT = <Owner type>
            }
          NB_ADDRESS = <Requesting node's IP address>
          }
        }
      }

The NAME REGISTRATION REQUEST includes both a QUESTION_RECORD and an ADDITIONAL_RECORD. In a sense, it is two messages in one. It says "Does anyone own this name?" and "I want to own this name!", both in the same packet.

The NAME REGISTRATION REQUEST gives us our first look at a Label String Pointer in its native habitat. In the packet above the QUESTION_NAME and the RR_NAME are the same name, so the latter field contains a pointer back to the former. The size of the header is constant; if there is a QUESTION_NAME in a packet it will always be found at offset 0x000C (12). The field value is 0xC00C because (as is always the case with label string pointers) the first two bits are set in order to indicate that the remainder is a pointer rather than a 6-bit label length. So, Label String Pointers in NBT messages always have the value 0xC00C.
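The pointer arithmetic is simple enough to sketch in a few lines. The helper names below are hypothetical; the only facts relied upon are the ones just described: the top two bits mark a pointer, and the remaining 14 bits give the offset from the start of the message.

```c
#include <stdint.h>

/* Return non-zero if the byte pair at p is a label string pointer,
 * that is, if the top two bits of the first byte are both set.
 */
int is_label_pointer(const uint8_t *p)
{
  return (p[0] & 0xC0) == 0xC0;
}

/* Extract the 14-bit message offset from a label string pointer. */
uint16_t label_pointer_offset(const uint8_t *p)
{
  return (uint16_t)(((p[0] & 0x3F) << 8) | p[1]);
}

/* Build a label string pointer to the given message offset. */
uint16_t make_label_pointer(uint16_t offset)
{
  return (uint16_t)(0xC000 | (offset & 0x3FFF));
}
```

Calling make_label_pointer(12) yields 0xC00C, the value used in the RR_NAME field above. An encoded NBT name, by contrast, begins with the length byte 0x20, which has the top two bits clear.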

The TTL field in the ADDITIONAL_RECORD provides a Time-To-Live value, in seconds, for the name. In B mode, the TTL value is not significant and is generally set to zero. In P mode, the TTL is used by the NBNS to determine when to purge old entries from the database, and is typically set to something on the order of three days in the NAME REGISTRATION REQUEST. The NBNS may override the client's request and reply with a different TTL value, which the client must accept.

The ADDITIONAL_RECORD.RDATA field is 6 bytes long (as shown in ADDITIONAL_RECORD.RDLENGTH) and contains two subfields. The first is the NB_FLAGS field, which provides information about the name and its owner. It looks something like this:

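  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| G |  ONT  |                      UNUSED                       |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+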

The NB_FLAGS.G bit indicates whether the name is a group name or a unique name, and NB_FLAGS.ONT identifies the owner node type. ONT is a two-bit field with the following possible values:

00 == B node
01 == P node
10 == M node
11 == H node (added by Microsoft)
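In code, the NB_FLAGS field can be handled with a few masks. This sketch assumes the RFC convention that bit 0 is the most significant bit of the 16-bit field, so G lands at 0x8000 and ONT at 0x6000. The macro and function names are illustrative only.

```c
#include <stdint.h>

#define NB_G_BIT   0x8000  /* group name when set, unique when clear */
#define ONT_MASK   0x6000  /* the two Owner Node Type bits           */
#define ONT_B      0x0000  /* 00 == B node                           */
#define ONT_P      0x2000  /* 01 == P node                           */
#define ONT_M      0x4000  /* 10 == M node                           */
#define ONT_H      0x6000  /* 11 == H node (added by Microsoft)      */

/* Compose an NB_FLAGS value; the unused bits are left at zero. */
uint16_t nb_flags_make(int is_group, uint16_t ont)
{
  return (uint16_t)((is_group ? NB_G_BIT : 0) | (ont & ONT_MASK));
}

/* Return non-zero if the flags describe a group name. */
int nb_flags_is_group(uint16_t nb_flags)
{
  return (nb_flags & NB_G_BIT) != 0;
}

/* Return the owner node type, one of the ONT_* values. */
uint16_t nb_flags_ont(uint16_t nb_flags)
{
  return (uint16_t)(nb_flags & ONT_MASK);
}
```

For example, a group name owned by an H node would be 0xE000, and a unique name owned by a B node would be 0x0000.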

The ADDITIONAL_RECORD.RDATA.NB_ADDRESS holds the 4-byte IPv4 address that will be mapped to the name. This should, of course, match the address of the node registering the name.

Take a good look at the structure of the RDATA subrecord in the NAME REGISTRATION REQUEST. This is the most common RDATA format, which gives us an excuse for writing a little more code...

[Listing 1.9]

1.4.3.1.1 Broadcast Name Registration

You've seen the basic form of the NAME REGISTRATION REQUEST packet. When sending a broadcast registration, the following rules apply.

  • The B bit is set.
  • The TTL is zero.
  • The RDATA.NB_FLAGS.ONT should never be ONT_P, since P nodes never register their names via broadcast.

A node sending a broadcast NAME REGISTRATION REQUEST (the requester) may receive a unicast NEGATIVE NAME REGISTRATION RESPONSE from another node that already claims ownership of the name (the owner). That is the only valid message in response to a broadcast registration.

    NAME REGISTRATION RESPONSE (Negative)
      {
      HEADER
        {
        NAME_TRN_ID = <Must match REQUEST transaction ID>
        FLAGS
          {
          R      = TRUE (1; This is a response packet)
          OPCODE = 0x5 (Registration)
          AA     = TRUE (1)
          RD     = TRUE (1)
          RA     = TRUE (1)
          RCODE  = ACT_ERR (0x6)
          B      = FALSE (0; Message is unicast back to requester)
          }
        ANCOUNT = 1
        }
      ANSWER_RECORD
        {
        RR_NAME  = <The Encoded NBT Name>
        RR_TYPE  = NB (0x0020)
        RR_CLASS = IN (0x0001)
        TTL      = 0 (TTL has no meaning in this context)
        RDLENGTH = 6
        RDATA
          {
          NB_FLAGS
            {
            G   = <TRUE for a group name, FALSE for a unique name>
            ONT = <Owner type>
            }
          NB_ADDRESS = <Owner's IP address>
          }
        }
      }

When a requester receives a NEGATIVE NAME REGISTRATION RESPONSE, it is obliged to give up. Registration has failed because another node has a prior--and conflicting--claim to the name. That is, the name already has an owner.

The RCODE field of the response will be ACT_ERR (0x6), indicating that the name is in use. The RDATA field should contain the real owner's name information:

  • NB_FLAGS.G indicates whether the name in use is a group or unique name,
  • NB_FLAGS.ONT is the owner's node type,
  • NB_ADDRESS is the owner's IP address.

Recall that the NAME REGISTRATION REQUEST contains a name query, so the ANSWER_RECORD in the reply should be constructed as it would be in a POSITIVE NAME QUERY RESPONSE. It is wrong to simply parrot back the information in the request [14].

NEGATIVE NAME REGISTRATION RESPONSE messages are only sent if a unique name is involved. Owners of a group name will not complain if a requester tries to join the group. If, however, a requester tries to register a unique name that matches an already registered group name, the members of the group will send negative responses. In a broadcast environment, a single unique name registration request can generate a large number of negative replies.
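From the owner's side, the rule boils down to one test on the G bits. The sketch below uses hypothetical names and assumes the caller has already established that the requested name matches one this node owns. It also spells out the FLAGS value of the negative response, composed from the bit positions in RFC 1002.

```c
#include <stdint.h>

#define NB_G_BIT  0x8000   /* NB_FLAGS.G: set for a group name */

/* HEADER.FLAGS of a NEGATIVE NAME REGISTRATION RESPONSE:
 * R | OPCODE=0x5 | AA | RD | RA | RCODE=ACT_ERR(0x6)
 */
#define NEG_REG_RESPONSE_FLAGS \
  (0x8000 | (0x5 << 11) | 0x0400 | 0x0100 | 0x0080 | 0x6)

/* Decide whether the owner of a name should answer a broadcast
 * registration for that same name with a negative response.
 * Only a group joining a group is allowed to pass unchallenged.
 */
int should_send_negative(uint16_t owned_nb_flags,
                         uint16_t requested_nb_flags)
{
  int owned_is_group     = (owned_nb_flags & NB_G_BIT) != 0;
  int requested_is_group = (requested_nb_flags & NB_G_BIT) != 0;

  return !(owned_is_group && requested_is_group);
}
```

The composed FLAGS value comes out as 0xAD86.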

[Figure 1.6]
 

I want it, I want it, I want it...
You can't have it!
-- Magic Bus
The Who
  

If there are no conflicts the requesting node will hear no complaints, in which case it must retry the request two more times...just to be sure. The RFCs specify a minimum timeout of 250 milliseconds between broadcast retries (Windows uses 750ms). After the third query has timed out, the requesting node should broadcast a NAME OVERWRITE DEMAND declaring itself the victor and owner of the name. The NAME OVERWRITE DEMAND message is identical to the NAME REGISTRATION REQUEST, except that the RD bit is clear (Recursion Desired is 0).
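The retry rule can be expressed as a small decision function. This is a sketch, not a complete sender: the enum, the function names, and the idea of counting timeouts are all inventions for this example. The caller is assumed to send a packet, wait up to BCAST_REG_TIMEOUT_MS for a negative response, and then ask what to do next.

```c
#include <stdint.h>

#define NM_OPCODE_REGISTRATION  0x5
#define NM_RD_BIT               0x0100
#define NM_B_BIT                0x0010

#define BCAST_REG_TRIES         3    /* requests sent before claiming  */
#define BCAST_REG_TIMEOUT_MS    250  /* RFC minimum; Windows uses 750  */

typedef enum
{
  REG_SEND_REQUEST,           /* send (or resend) the registration    */
  REG_SEND_OVERWRITE_DEMAND,  /* no complaints heard: claim the name  */
  REG_GIVE_UP                 /* a negative response was received     */
} reg_action;

/* Choose the next step, given how many requests have timed out so
 * far and whether a negative response has arrived.
 */
reg_action bcast_reg_next(int timeouts, int got_negative)
{
  if (got_negative)
    return REG_GIVE_UP;
  if (timeouts < BCAST_REG_TRIES)
    return REG_SEND_REQUEST;
  return REG_SEND_OVERWRITE_DEMAND;
}

/* HEADER.FLAGS for the packet that goes with an action.  The two
 * packets differ only in the RD bit, which the demand leaves clear.
 */
uint16_t bcast_reg_flags(reg_action action)
{
  uint16_t flags = (uint16_t)((NM_OPCODE_REGISTRATION << 11) | NM_B_BIT);

  if (action == REG_SEND_REQUEST)
    flags |= NM_RD_BIT;
  return flags;
}
```

The request carries FLAGS 0x2910 and the NAME OVERWRITE DEMAND carries 0x2810.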

This next program will allow you to play around with broadcast name registration. It uses functions and constants from previous listings to format a NAME REGISTRATION REQUEST and broadcast it on the local IP subnet, then it listens for and reports any replies it receives.

[Listing 1.10]

The transaction ID in the NAME_TRN_ID field should be the same for all three registration attempts, for the final NAME OVERWRITE DEMAND, and for any negative response packets a remote node may care to send. All of these are part of the same transaction.
 



...the correct expression
is "up and died."
-- from the errata for
Applied Cryptography, 2nd. Ed.,
by Bruce Schneier
  

Blue Screen of Death Alert:

Some OEM versions of Windows 95 had a bug that would cause the system to go into "Blue Screen of Death" mode (that is, system crash) if the NetBIOS Machine Name was in conflict. The problem was made worse by PC vendors who would ship systems with NBT turned on, all preconfigured with the same name. Customers who purchased several computers for local networks would turn them on for the first time and all but one would crash.
 

1.4.3.1.2 Unicast (NBNS) Name Registration

Unicast name registrations are subtly different from the broadcast variety.

  • The B bit is cleared (zero) and the destination IP is the unicast address of the NBNS.

    The message is sent "point-to-point" directly to the NBNS, rather than being broadcast on the local LAN. This is the fundamental difference between B and P mode.

  • The TTL field has real meaning when you are talking to an NBNS.

    The RFCs do not specify a default TTL value. Windows systems use 300,000 seconds, which is three days, eleven hours and twenty minutes. Samba uses 259,200 seconds, which is three days even. Both of these values are ugly in hex [15].

  • The timeout between retries is longer.

    The longer timeout between retries is based on the assumption that routed links may have higher latency than the local LAN. RFC 1002 specifies a timeout value of five seconds, which is excessive on today's Internet. A client will try to register a name three times, so the total (worst case) timeout would be fifteen seconds. Samba uses a two second per-packet timeout instead, for a total of six seconds. The timeout under Windows is only 1.5 seconds per packet.
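The numbers quoted in the list above are easy to check. This sketch uses made-up names (ttl_split, worst_case_timeout_ms) and simply redoes the arithmetic: splitting a TTL into days, hours, and minutes, and multiplying a per-packet timeout by the number of tries.

```c
#include <stdint.h>

#define TTL_WINDOWS  300000UL  /* 0x493E0 */
#define TTL_SAMBA    259200UL  /* 0x3F480 */
#define REG_TRIES    3         /* registration attempts */

typedef struct
{
  unsigned days;
  unsigned hours;
  unsigned minutes;
  unsigned seconds;
} ttl_parts;

/* Break a TTL given in seconds into days, hours, minutes, seconds. */
ttl_parts ttl_split(uint32_t ttl)
{
  ttl_parts t;

  t.days    = (unsigned)(ttl / 86400);
  t.hours   = (unsigned)((ttl % 86400) / 3600);
  t.minutes = (unsigned)((ttl % 3600) / 60);
  t.seconds = (unsigned)(ttl % 60);
  return t;
}

/* Total time spent waiting if every attempt times out. */
unsigned worst_case_timeout_ms(unsigned per_packet_ms, unsigned tries)
{
  return per_packet_ms * tries;
}
```

With three tries, the RFC 1002 value of 5000 ms per packet gives 15000 ms in total, and Samba's 2000 ms gives 6000 ms.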

The NBNS should respond with a NAME REGISTRATION RESPONSE, which will include one of the following RCODE values:

0x0:  Success
POSITIVE NAME REGISTRATION RESPONSE
You win! The NBNS has accepted the registration. Do not forget to send a refresh before the TTL expires (see the section on Name Refresh below).
 
FMT_ERR (0x1):  Format Error
The NBNS did not like your message. Something was wrong with the packet format (perhaps it was mangled on the wire).
 
SRV_ERR (0x2):  Server failure
The NBNS is sick and cannot handle requests just now.
 
IMP_ERR (0x4):  Unsupported request error
This one is a bit of a mystery. It basically means that the NBNS does not know how to handle a request. The only clue we have to its intended usage is a poorly worded note in RFC 1002, which says:

Allowable only for challenging NBNS when gets an Update type registration request.

Huh?

This error occurs only under odd circumstances, which will be explained in more detail later on in this section. Basically, though, an IMP_ERR should only be returned by an NBNS if it receives an unsolicited NAME UPDATE REQUEST from a client. (Be patient, we'll get there.)
 

RFS_ERR (0x5):  Refused error
This indicates that the NBNS has made a policy decision not to register the name.
 
ACT_ERR (0x6):  Active error
The NBNS has verified that the name is in use by another node. You can't have it.

Note that the difference between a positive and negative NAME REGISTRATION RESPONSE is simply the RCODE value.
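When debugging, it helps to turn the RCODE into something readable. The following is a minimal lookup sketch; the function name rcode_name is hypothetical, and the values are the ones listed above.

```c
/* RCODE values that may appear in a NAME REGISTRATION RESPONSE. */
#define RCODE_SUCCESS  0x0
#define FMT_ERR        0x1
#define SRV_ERR        0x2
#define IMP_ERR        0x4
#define RFS_ERR        0x5
#define ACT_ERR        0x6

/* Return a short description of an RCODE value. */
const char *rcode_name(unsigned rcode)
{
  switch (rcode)
  {
    case RCODE_SUCCESS: return "Success";
    case FMT_ERR:       return "FMT_ERR: Format Error";
    case SRV_ERR:       return "SRV_ERR: Server failure";
    case IMP_ERR:       return "IMP_ERR: Unsupported request error";
    case RFS_ERR:       return "RFS_ERR: Refused error";
    case ACT_ERR:       return "ACT_ERR: Active error";
    default:            return "Unknown RCODE";
  }
}

/* A response is positive only if its RCODE is zero. */
int rcode_is_positive(unsigned rcode)
{
  return rcode == RCODE_SUCCESS;
}
```

The RCODE itself is the low four bits of HEADER.FLAGS, so a caller would typically pass (flags & 0x000F).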

If you get no response then it is correct to assume that the NBNS is "down". If the name cannot be registered then your node does not own it, and your application should recover as gracefully as possible. In P mode, handle a non-responsive NBNS as you would a NEGATIVE NAME REGISTRATION RESPONSE. (If the client is running in H or M mode, then it may--with caution--revert to B mode operation until the NBNS is available again.)

There are two other packet types that you may receive when registering a name with an NBNS. These are WACK and END-NODE CHALLENGE NAME REGISTRATION RESPONSE. The WACK message tells the client to wait while the NBNS figures things out. This is typically done so that the NBNS has time to send queries to another node that has claimed ownership of the requested name. A WACK looks like this:

    WAIT FOR ACKNOWLEDGEMENT (WACK) RESPONSE
      {
      HEADER
        {
        NAME_TRN_ID = <Must match REQUEST transaction ID>
        FLAGS
          {
          R      = TRUE (1; This is a response packet)
          OPCODE = 0x7 (WACK)
          AA     = TRUE (1)
          }
        ANCOUNT = 1
        }
      ANSWER_RECORD
        {
        RR_NAME  = <The Encoded NBT Name from the request>
        RR_TYPE  = NB (0x0020; note the typo in RFC 1002, 4.2.16)
        RR_CLASS = IN (0x0001)
        TTL      = <Number of seconds to wait; 0 == Infinite>
        RDLENGTH = 2
        RDATA    = <Copy of the two-byte HEADER.FLAGS field
                   of the original request>
        }
      }

The key field in the WACK is the TTL field, which tells the client how long to wait for a response. This is used to extend the timeout period on the client, and give the NBNS a chance to do a reality check.

Samba uses a TTL value of 60 seconds, which provides ample time to generate a proper reply. Unless it is shut down after sending the WACK message, Samba's NBNS service will always send a NAME REGISTRATION RESPONSE (positive or negative) well before the 60 seconds has elapsed. Microsoft's WINS takes a different approach, using a value of only 2 seconds. If the 2 seconds expire, however, the requesting client will simply send another NAME REGISTRATION REQUEST, and then another for a total of three tries. WINS should be able to respond within that total timeframe.
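On the client side, handling a WACK amounts to recognizing the packet and stretching the timeout. The sketch below assumes the FLAGS layout from RFC 1002; the function names are invented, and the use of -1 to stand for "wait indefinitely" is simply a convention chosen for this example.

```c
#include <stdint.h>

#define NM_R_BIT        0x8000  /* set in all response packets */
#define NM_OPCODE_MASK  0x7800
#define NM_OPCODE_WACK  0x7

/* Return non-zero if HEADER.FLAGS identifies a WACK response. */
int is_wack(uint16_t flags)
{
  return (flags & NM_R_BIT) != 0
      && ((flags & NM_OPCODE_MASK) >> 11) == NM_OPCODE_WACK;
}

/* Convert the TTL of a WACK into a new timeout in milliseconds.
 * A TTL of zero means "infinite", reported here as -1.
 */
int64_t wack_wait_ms(uint32_t ttl_seconds)
{
  if (ttl_seconds == 0)
    return -1;
  return (int64_t)ttl_seconds * 1000;
}
```

A WACK built as described above (R, OPCODE 0x7, and AA set) has a FLAGS value of 0xBC00. Samba's 60-second TTL becomes a 60000 ms wait, and the 2 seconds used by WINS becomes 2000 ms.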

WACK messages are sent by honest, hard-working servers that take good care of their clients. In contrast, a lazy and careless NBNS server will send an END-NODE CHALLENGE NAME REGISTRATION RESPONSE. This latter response tells the client that the requested name has a registered owner, but the NBNS is not going to bother to do the work to check that the owner is still up and running and using the name.

Once again, the format of this message is so familiar that there is no need to list all of the fields. The END-NODE CHALLENGE NAME REGISTRATION RESPONSE packet is just a NAME REGISTRATION RESPONSE with:

  • RCODE = 0x0
  • RA = 0 (Recursion Available clear)
  • ANSWER_RECORD.RDATA = <Information retrieved from the NBNS database.>
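Since the three kinds of NAME REGISTRATION RESPONSE differ only in a couple of header bits, a client can sort them out by inspecting HEADER.FLAGS. This is a sketch with hypothetical names. It assumes the RFC 1002 bit layout, and that a positive response has the RA bit set, as the negative response shown earlier does.

```c
#include <stdint.h>

#define NM_R_BIT                0x8000
#define NM_OPCODE_MASK          0x7800
#define NM_RA_BIT               0x0080
#define NM_RCODE_MASK           0x000F
#define NM_OPCODE_REGISTRATION  0x5

typedef enum
{
  REG_RESP_POSITIVE,            /* RCODE == 0, RA set   */
  REG_RESP_NEGATIVE,            /* RCODE != 0           */
  REG_RESP_END_NODE_CHALLENGE,  /* RCODE == 0, RA clear */
  REG_RESP_OTHER                /* not a registration response */
} reg_resp_kind;

/* Classify a packet by its HEADER.FLAGS field. */
reg_resp_kind classify_reg_response(uint16_t flags)
{
  if ((flags & NM_R_BIT) == 0
   || ((flags & NM_OPCODE_MASK) >> 11) != NM_OPCODE_REGISTRATION)
    return REG_RESP_OTHER;

  if ((flags & NM_RCODE_MASK) != 0)
    return REG_RESP_NEGATIVE;

  return (flags & NM_RA_BIT) ? REG_RESP_POSITIVE
                             : REG_RESP_END_NODE_CHALLENGE;
}
```

Under these assumptions a positive response carries FLAGS 0xAD80, while the END-NODE CHALLENGE variant carries 0xAD00.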

The annoying thing a