Configuring NAT66 on ScreenOS

I recently had to redeploy an old NetScreen 5GT to segregate a production LAN and create a small lab network. The production LAN has a routable IPv6 /64 prefix, delivered via a Hurricane Electric IPv6 tunnel. The lab network also required IPv6 connectivity. We also recently obtained a /48 prefix from Hurricane Electric, so I could have just assigned a /64 from that to the lab network and routed it via the 5GT. However, this 5GT gets moved around to various networks, including trade show networks. As such, the untrust interface of the 5GT needed to be auto-configuring, so that it would obtain an IPv6 prefix from router advertisements on whichever LAN it was connected to. This also meant that assigning a /64 from our Hurricane Electric /48 prefix was out of the question.

The first step is to enable IPv6 support on your device, if it isn’t already:

set envar ipv6=yes
save
reset save-config yes

After restarting your device, you should find that IPv6 configuration options are available, and the web UI will also have additional pages and config options.

Exporting an X.509 certificate public key from Junos

I’ve just spent the last couple of hours trying to find a way to export the public key from a locally generated, self-signed X.509 certificate on a Juniper SRX-100 firewall. Apparently there is no Junos CLI command to do this, so after poking around the filesystem from a shell on the box, I found the location where Junos stores local certificates and key-pairs.

In the directory /cf/var/db/certs/common you should see at least the following two subdirectories:

drwx------   2 root   wheel   512 Jul 20 16:14 key-pair
drwx------   2 root   wheel   512 Jul 20 16:16 local

In the key-pair directory you should see files for each of your key-pairs, e.g. self-cert.priv. In the local directory you will find the actual certificates, e.g. self-cert.cert. Both the key-pairs and certificates are stored in binary DER format, which can be read by OpenSSL and converted to the more universal base64-encoded PEM format.

Private keys should really be left alone where they are on the Juniper, but you can safely copy the certificate to other locations, since it does not contain any private key material. Simply scp the certificate file to somewhere that has the OpenSSL tools installed, then use “openssl x509” to read the certificate:

daniel@thinkpad:~$ openssl x509 -in self-cert.cert -inform DER
-----BEGIN CERTIFICATE-----
MIIDSDCCAjCgAwIBAgIRAKybOo9tijNXkhgT9fe0ZFMwDQYJKoZIhvcNAQEFBQAw
TzELMAkGA1UEBhMCREUxJTAjBgNVBAoTHFNldmVudGggU2lnbmFsIEx0ZC4gJiBD
.....
co9vOYXqYv81xnIxg5I0brLqCzruKULy4zc6YHzJAGICMOw2wS9BwRkQUR1B2EZH
2QWUSn4Enj2JJkT3p044U8/q4BKdJ9V52mxQfA==
-----END CERTIFICATE-----

There you have it folks… a base64-encoded PEM-format public certificate.
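For the curious, the DER-to-PEM step is nothing more than base64 plus two armour lines. Here is a minimal Python sketch of that conversion; the function names are my own and the sample bytes are a placeholder, not a real certificate.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der: bytes) -> str:
    """Wrap DER-encoded certificate bytes in PEM armour (base64, 64 columns)."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Reverse the conversion: drop the armour lines and decode the base64."""
    body = [ln for ln in pem.strip().splitlines() if not ln.startswith("-----")]
    return base64.b64decode("".join(body))


if __name__ == "__main__":
    # Placeholder bytes standing in for the contents of self-cert.cert.
    sample = bytes(range(100))
    print(der_to_pem(sample))
```

To use it on a real file, read self-cert.cert in binary mode and pass the bytes to der_to_pem().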

Cisco Wireless LAN Controllers and DHCP Option 43

I recently had to install a Cisco Wireless LAN Controller (2112, if you’re interested), and had the usual fun and games with getting it to properly understand DHCP Option 43. For the uninitiated, option 43 is a vendor-specific option, which, in the case of Cisco WLCs, carries the management IP address(es) of the controllers that LWAPP access points should attempt to join when they boot up.

Different model APs require this option to be in different formats. For example, Aironet 1000 units require the option response to be type 0x66, and a comma-separated ASCII list of controller IP addresses, whereas Aironet 1130, 1200, 1230 and 1240 units require the response to be type 0xF1, followed by the length (number of addresses x four), then the hexadecimal representation of the controller IP address(es).

Cisco documentation exists for this; however, their documentation for ISC’s dhcpd is incorrect. Unlike most corporate customers I run into, who run Microsoft DHCP server (for better or worse), this particular customer was running ISC’s dhcpd.

The first step is setting the option 43 type. I’m going to concentrate on the 1130, 1200, 1230 and 1240 units, since this is the area where Cisco’s documentation is incorrect. To begin with, I’ll follow Cisco’s documentation.

option space LWAPP;
option LWAPP.controller code 43 = string;

Then we have a vendor class, for the 1200 series units:

class "Cisco AP c1200" {
  match if option vendor-class-identifier = "Cisco AP c1200";
  option vendor-class-identifier "Cisco AP c1200";
  vendor-option-space LWAPP;
  option LWAPP.controller f1:04:c0:a8:f7:05;
}

Note the “f1:04” at the start of the string. This means type 0xF1, followed by four bytes of vendor specific data. The “c0:a8:f7:05” is the hexadecimal representation of the IP address 192.168.247.5. This results in dhcpd transmitting “2b 08 2b 06 f1 04 c0 a8 f7 05” for option 43.

Ok, let’s take a look at this string. The “2b” indicates this is a vendor encapsulated options field (type 43), and “08” means it’s eight bytes long. The next “2b” is where things start to go wrong. This is because the Cisco documentation told us to define LWAPP.controller as type 43 also, which is incorrect. The “06” indicates that six bytes follow for this sub-code, and then we have our “f1 04 c0 a8 f7 05” string verbatim. This causes the WLC to report an error parsing the option 43, saying that it cannot parse “2b 06 f1 04 c0 a8 f7 05”.
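To make the nesting easier to see, here is a small Python sketch that walks those bytes as code/length/value triplets. It is only an illustration of the layout described above, not how the WLC or the AP actually parses the option; the function name is my own.

```python
def parse_tlvs(data: bytes):
    """Split a byte string into (code, value) pairs of code/length/value options."""
    options = []
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError("truncated option header")
        code, length = data[i], data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options.append((code, value))
        i += 2 + length
    return options


if __name__ == "__main__":
    wire = bytes.fromhex("2b 08 2b 06 f1 04 c0 a8 f7 05")
    # Outer layer: DHCP option 43 (0x2b) carrying eight bytes.
    (outer_code, outer_value), = parse_tlvs(wire)
    print(hex(outer_code), outer_value.hex(" "))
    # Inner layer: the first sub-option code is 0x2b again, not 0xf1.
    for code, value in parse_tlvs(outer_value):
        print(hex(code), value.hex(" "))
```

The inner pass finds a sub-option 0x2b whose value happens to start with “f1 04”, which is why the 0xF1 sub-option is never seen.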

What we should have configured in dhcpd.conf is actually:

option space LWAPP;
option LWAPP.controller code 241 = string;

class "Cisco AP c1200" {
  match if option vendor-class-identifier = "Cisco AP c1200";
  option vendor-class-identifier "Cisco AP c1200";
  vendor-option-space LWAPP;
  option LWAPP.controller c0:a8:f7:05;
}

Note that we also dropped the “f1:04” from the hex string, since we are now correctly specifying LWAPP.controller as code 241 (0xF1), and dhcpd automatically populates the “04” for us, after counting the length of our hex string (four bytes = one IP address). This results in dhcpd sending “2b 06 f1 04 c0 a8 f7 05”.

Again we have our “2b”, indicating vendor encapsulated options, but this time the field is only six bytes long. Then we have “f1 04”, indicating our LWAPP.controller code, with four bytes of data – our controller IP address. This time around, the AP will correctly see the option 43 “payload” of just “f1 04 c0 a8 f7 05”, and correctly parse the sub-option 0xF1.

Of course, what this field really is (and this is more clearly detailed in Cisco’s instructions for configuring Microsoft DHCP server) is an array of IP addresses. You can eliminate the need to specify the addresses in hexadecimal by defining the LWAPP.controller as:

option LWAPP.controller code 241 = array of ip-address;

and then simply listing your controller IP addresses:

option LWAPP.controller 192.168.247.5, 192.168.247.6;

This would result in dhcpd sending “2b 0a f1 08 c0 a8 f7 05 c0 a8 f7 06”. Note the “f1 04” changed to “f1 08”, since the array length is now eight bytes (two IP addresses).
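If you want to check the expected bytes for your own controller list before pointing an AP at it, the encoding is easy to reproduce. The following Python sketch builds the same byte strings as quoted above; it mimics the layout only, and the helper names are my own rather than anything from dhcpd.

```python
import ipaddress


def tlv(code: int, value: bytes) -> bytes:
    """Encode one option as code, one-byte length, then the value."""
    if len(value) > 255:
        raise ValueError("value too long for a one-byte length field")
    return bytes([code, len(value)]) + value


def option43(controllers) -> bytes:
    """Option 43 carrying sub-option 0xF1 with a packed list of IPv4 addresses."""
    payload = b"".join(ipaddress.IPv4Address(a).packed for a in controllers)
    return tlv(43, tlv(0xF1, payload))


def hexdump(data: bytes) -> str:
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    print(hexdump(option43(["192.168.247.5"])))
    print(hexdump(option43(["192.168.247.5", "192.168.247.6"])))
```

The first line printed matches the single-controller string, the second the two-controller string.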

Why Cisco didn’t simply publish this is beyond me. They’ve made it very confusing for users who don’t understand DHCP vendor specific information. I suspect the person who wrote the dhcpd section of the Cisco documentation didn’t fully understand how ISC dhcpd handles vendor specific options.

In any case, our configuration can be made somewhat clearer, and consistent with dhcpd’s documentation, as follows:

option space LWAPP;
option LWAPP.controller code 241 = array of ip-address;

class "LWAPP" {
  match option vendor-class-identifier;
}

subclass "LWAPP" "Cisco AP c1200" {
  vendor-option-space LWAPP;
  option LWAPP.controller 192.168.247.5;
}

For each additional type of AP you have to support, just add another subclass, using the appropriate vendor class identifier string.

The Amazing Unmanaged Trunk Mode Switch

Have you ever needed to set up a bunch of equipment on a boardroom table or some other temporary location, and needed both native and 802.1q tagged VLANs, but only had one available switchport?

A quick n’ dirty solution is to use an unmanaged switch, such as one of the numerous 8-port desktop switches from manufacturers such as D-Link, Netgear, Linksys etc. Configure the upstream (managed) switchport that it plugs into as a trunk port, thus allowing your required VLANs to pass tagged frames to your unmanaged desktop switch.

Wait a second, you say…. unmanaged switches can’t do trunk ports. How can an unmanaged switch understand VLAN frames?

It doesn’t need to. What is an 802.1q tagged frame, other than a standard 802.3 ethernet frame with four additional bytes inserted? These four bytes are a two-byte tag protocol identifier (0x8100), followed by two bytes holding the 3-bit 802.1p CoS field, a single drop-eligible bit and the 12-bit 802.1q VLAN ID. As long as the unmanaged switch does not truncate frames to the 802.3 standard 1518 bytes, it will happily forward the 1522-byte 802.1q tagged frames just like any other. The last time I encountered a switch that would not forward these slightly “oversized” frames was about four years ago… and it was a very cheap and nasty brand (name withheld to protect the guilty).
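Here is a rough Python sketch of what the tagging looks like on the wire. The tag sits right after the destination and source MAC addresses, and consists of the 0x8100 tag protocol identifier followed by two bytes carrying the priority and VLAN ID. The frame contents, VLAN 200 and priority 5 are arbitrary example values, and the sketch leaves the FCS as a placeholder rather than recalculating it.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for 802.1Q


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert a 4-byte 802.1Q tag after the two 6-byte MAC addresses."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | vlan_id  # drop-eligible bit left at 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


if __name__ == "__main__":
    # Dummy maximum-size untagged frame: MACs, ethertype, payload, FCS placeholder.
    untagged = bytes(6) + bytes(6) + b"\x08\x00" + bytes(1500) + bytes(4)
    tagged = add_vlan_tag(untagged, vlan_id=200, priority=5)
    print(len(untagged), len(tagged))
    print(tagged[12:16].hex(" "))
```

A switch that forwards purely on the MAC addresses never needs to look at those four bytes, which is why the trick works.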

This trick also comes in handy when you have a user with a two-port VoIP phone (such as most Cisco, Snom, Polycom etc phones), using a voice-VLAN, and the user requires more switchports than are currently available at his/her desk. Simply connect the 8-port unmanaged switch before the IP phone (ie. to the upstream port), and connect the IP phone to the unmanaged switch. The phone still gets its tagged voice-VLAN frames, the PC gets its untagged data-VLAN frames (tag-stripped if necessary by the IP phone), and the user has 6 other ports available to connect whatever… including, if necessary, other VLANs (so long as they’re tagged, and the end device can work with tagged frames, since the unmanaged switch won’t strip the 802.1q tag).

Beware though, this should only ever be used as a temporary measure, since it does open a few security holes. If the “allowed VLANs” is not carefully configured on the upstream port, the opportunity exists to VLAN-hop, or flood traffic into other VLANs. And of course, since the unmanaged switch is, well, unmanaged, there is no individual “allowed VLANs” security on those 8 ports. All ports are effectively the same as that one upstream trunk port.

Have you used this method before? What brand/model unmanaged switch did you use? What were your experiences with it, and did you encounter any problems?

Cisco 857 router

I’ve finally replaced my trusty old D-Link DSL500, which I’ve had for about four years, with a Cisco 857. What can I say about these routers… well…

My 857 router arrived with SDM Express, but not SDM, installed on the flash drive. While SDM Express is an improvement over the old Cisco Router Web Setup (CRWS), one of the reasons I bought the 857 was to see whether SDM is as good for routers as ASDM is for PIX. So I set up the router using SDM Express, and had a look at the lovely mess of a config it generated. It would probably have sufficed for a non-technical user, but being three-quarters of the way to a CCNP, I don’t think I qualify as that anymore.

First up is the ATM0.1 sub-interface that SDM creates. Ok, this probably is a good way to do it, since, even when configuring a single DLCI with frame-relay, I’ve got into the habit of using a sub-interface. But in NZ, I think we’re far less likely to have the option of multiple ATM PVC’s on ADSL than we are of having multiple DLCI’s on frame-relay.

The 857 (in fact, all the 850 and 870 series routers) has a built-in four-port fast ethernet switch. While this shows up as four individual interfaces in the config, and you can manually set some options per interface (layer 2 options only, I suspect), it does not function as a VLAN-capable switch, unlike the one in the 870 series routers.

So, now for some of the gotchas. If you plan to run a server behind a router like this (and this probably would affect any Cisco ADSL router), and you only have the one public IP assigned to the Dialer interface, there are two ways you can go about it. If you run a large number of public services on that server, you may be tempted to do something like:

ip nat inside source static 10.0.0.5 interface Dialer0
ip nat inside source list 1 interface Dialer0 overload
access-list 1 permit 10.0.0.0 0.0.0.255

Of course, you should apply an access-list inbound on the Dialer0 interface, so you don’t completely expose that server. Cisco IOS is smart enough that you can have other hosts on your internal network NAT outbound. You can even specify individual inbound port-NAT entries, such as:

ip nat inside source static tcp 10.0.0.31 4662 interface Dialer0 4662

for a P2P eMule client, and the port NAT will take precedence over the whole IP NAT for the server.

Where this comes unstuck however, is if you want to terminate an IPSEC tunnel on your router. Remember, we’ve only got one public IP on our Dialer0 interface. Unfortunately, IOS is not smart enough to figure out that it should locally process incoming ESP and ISAKMP traffic – and instead forwards it to the server that you specified. So, faced with this situation myself, I have had to create individual port NAT entries for all the services I host on my server. Fortunately, IOS no longer seems to suffer from a bug I encountered years ago, where UDP DNS packets didn’t NAT properly. Since DNS uses UDP for most queries (traditionally, whenever the response fits in 512 bytes), this bug used to make it impossible to host a DNS server behind a router like this.

The next gotcha I came across is the “ip inspect” command having a fit when confronted with out-of-sequence packets. When running an IPSEC tunnel to a NetScreen 25, I found that certain protocols that were in my “ip inspect” list were stalling. Debug revealed that large numbers of packets were being dropped, due to being out-of-sequence. After some research, I learned that Cisco’s IOS firewall inspection (ip inspect) really doesn’t like having to deal with fragments. I suspect this is the reason for the relatively new IOS command “ip virtual-reassembly”, which attempts to reassemble packets prior to “ip inspect” checking them. I suspect my problem was that I was getting a lot of fragments over the VPN, due to incorrect TCP-MSS settings, and the smaller fragments were arriving before the larger fragments – hence “ip inspect” considered them out-of-sequence. Debugging “ip virtual-reassembly” revealed “invalid parameters” – which I could find no further information on.

It seemed the best course of action would be to eliminate the fragmentation to begin with. After spending several hours unsuccessfully experimenting with MTU and TCP-MSS settings, the solution finally came down to setting one parameter on the far-end NetScreen – “set flow path-mtu”. Once this was enabled, everything worked fine. Presumably, PMTU discovery then let the end hosts shrink their TCP segments to account for the ESP encapsulation overhead. This turned out to be a preferable solution to manually clamping the TCP-MSS for all traffic.
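To get a feel for how much room ESP takes, here is a back-of-the-envelope Python sketch. Everything in it is an assumption for illustration rather than a description of my actual tunnel: tunnel-mode ESP over IPv4, AES-CBC (16-byte blocks and IV), a 12-byte HMAC-SHA1-96 ICV, a 1500-byte path MTU, no NAT-T, and plain 20-byte inner IP and TCP headers.

```python
def esp_packet_size(inner_len: int, block: int = 16, iv: int = 16, icv: int = 12) -> int:
    """Size of a tunnel-mode ESP packet carrying an inner IP packet of inner_len bytes."""
    trailer = inner_len + 2                  # pad-length and next-header bytes
    padded = -(-trailer // block) * block    # round up to the cipher block size
    return 20 + 8 + iv + padded + icv        # outer IP header + ESP header + IV + data + ICV


def max_inner_packet(path_mtu: int, **cipher) -> int:
    """Largest inner packet whose ESP packet still fits in path_mtu."""
    size = path_mtu
    while size > 0 and esp_packet_size(size, **cipher) > path_mtu:
        size -= 1
    return size


def tcp_mss(inner_packet: int) -> int:
    """MSS for an inner packet, assuming 20-byte IP and 20-byte TCP headers."""
    return inner_packet - 40


if __name__ == "__main__":
    inner = max_inner_packet(1500)
    print(inner, tcp_mss(inner))
```

With a different cipher, NAT-T or a smaller ADSL MTU the numbers change, which is a good argument for letting PMTU discovery work it out instead of hard-coding an MSS.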

Getting back to SDM, I installed the full SDM on my router via TFTP (since the actual SDM installer just hung repeatedly, despite following Cisco’s instructions for retro-fitting existing routers with SDM). SDM is certainly more feature rich than SDM Express, but I don’t rate it quite as highly as ASDM for PIX. I ended up doing the bulk of my config by hand, from CLI, and using SDM just as a monitoring front end. It does have an audit tool however, which can be a nice security check of your config. It mostly suggests turning off services like pad and finger. Hopefully someday soon, these will be off by default anyway.

A few complaints about SDM – setting the timezone for your router is kind of weird. It called my timezone “Napier”, which, although it is in NZ and in the same timezone as Auckland, I’ve never seen referred to like that before. Officially, our timezone should be NZST/NZDT or Pacific/Auckland. SDM also configured absolute dates for daylight saving start/end. This is not correct – DST start/end is determined by week number in October and March respectively.

Configuring the IPSEC tunnel initially in SDM was a lesson in Cisco etiquette. It had some default IPSEC proposals that it wouldn’t let me delete, so I had to add my preferred proposals as secondary options. Afterwards, I tweaked the crypto map by hand in the CLI.

Don’t rely on SDM to get the ordering right of access-list entries. For ease of editing, I no longer use numeric access-lists, except for simple one or two-liners. Instead I use the “ip access-list extended <name>” format, which makes it easy to remove individual entries. You can also easily insert entries by specifying the entry line-number, a bit like a BASIC program listing. Lastly, be careful when closing the SDM window, because it closes all your browser windows!

A couple of things to beware of with the 857 (as opposed to the 877). The 857 is the successor to the SOHO 97, not the 827 or 837 as one might think. As such, it is not particularly grunty, and if you run a lot of sessions or IPSEC tunnels (maximum of 5), you might find the CPU getting quite bogged down. The 857 does not support IPv6, which is surprising, since an 827/837 can, with the right IOS image. It also does not support class-based queuing, which can be a problem if you wanted to reserve bandwidth for, and prioritise VoIP traffic. I haven’t yet found a way to run the router’s SSH on a non-standard port, since the vty complains if you try to assign it to a different rotary group.

So, while the 857 is successfully doing firewalling, NAT and IPSEC for me now, I’m sorta wishing I’d spent the extra money and bought an 877.

Catalyst Express(?!) 500 switches

On the weekend I helped a friend set up some Cisco kit involving Catalyst Express 500 switches, Aironet 1310 wireless bridges, and various Cisco IP phones. What should have been a relatively simple job ended up taking about 10 hours.

My friend had already configured the 2800 series router, with Cisco Call Manager Express. Phones attached to a local Cat 500 worked fine – upon booting up, they configured their VLAN, got a DHCP lease, and registered to CCME. Next came the Aironets. Having never configured VLAN trunking on a wireless link, this was new ground for me. At first we had each VLAN (native VLAN 1, and voice VLAN 200) using its own SSID. It wasn’t until later that we realised we could run both over the same SSID.

Where the problems arose was on the remote Cat 500 (ie, on the other side of the wireless bridge). The phones connected to that switch appeared to time out, trying to get a DHCP lease. Since we had previously plugged them in on the local Cat 500, they eventually fell back to their previous DHCP lease, and were actually able to make calls. This wouldn’t help us for plugging new phones in though. Since my friend had done the initial configuration of the Cat 500’s, and wasn’t sure at the time whether the Aironets were supposed to run as bridges, access points or both, he had set the Smart Port type on the Aironets’ Cat 500 switchports to “Access Point”.

When I looked at the interface stats on each of the Aironets, neither of them reported any broadcasts for the radio interface. I thought this was odd, since a DHCP request is a broadcast. I even tried debugging IP packets on the Aironet, while a phone booted up. Nope, don’t see any DHCP requests. I started to get the feeling the switches were blocking broadcasts somehow, but at that point still assumed my friend had set up the switch correctly. Also, without a CLI on the Catalyst 500’s we couldn’t debug packets or VLAN encapsulation on the suspect port. In between a few drinks of vodka and Coke, some pizza, and a few rounds of the game “Burnout”, we chased our tails for several hours, eventually giving up and sleeping on it.

The next morning, I revisited my theory of the Cat 500 being at fault. We upgraded the IOS of all the switches, and the Aironet bridges. Still no luck. I went back into the Cat 500 web config (did I mention these things have no CLI?), and examined the switchport config. Neither my friend nor I was exactly sure what the various Smart Port roles did, in terms of spanning tree, VLAN access/trunking, and QoS. We could only guess, since the documentation doesn’t really make it very clear either, and without a CLI, we couldn’t even verify the config against our own knowledge of Cisco IOS. Then I said, “Hey, we have a VLAN trunk going across the radio, these Aironet ports are gonna have to be in trunk mode too. What does the ‘Switch’ Smart Port role do?”

And then it worked.

Yep, a “Switch” Smart Port allows VLAN trunking. Apparently an “Access Point” Smart Port does too, otherwise we wouldn’t have been able to make calls earlier (remember the Skinny & RTP traffic would have been in VLAN 200). For some reason that I’m still scratching my head about, though, it was not passing broadcasts (even from the native VLAN, when we tried to get a notebook attached to the remote Cat 500 to pick up a DHCP lease). The only hint given by the documentation is that an “Access Point” Smart Port allows up to 30 connected users on the access point. Ok, so it’s enforcing a MAC address limit of 31 on that switch port. Why not just say that… it still didn’t explain not forwarding broadcasts though. How is a client associated to an access point supposed to obtain an IP address?

So that was all good, and we proceeded to set up the encryption on the wireless link. That was Brainfuck #2. Over the years, Cisco have added so many extensions to IEEE and IETF standards, that setting up Aironets is like eating alphabet soup. It was only due to my previous training and experience with Proxim wireless gear that I was able to correctly choose the ciphers that are more commonly known as WPA2. After another treasure hunt through the config to find where to set the pre-shared key for WPA, we had encryption working.

In the meantime however, the Catalyst 500’s had disabled the ports that the Aironets were connected to. According to the Cat 500 event log, it was because it had detected one way communication on the port. Err… what? Ok, sure, the wireless link was down while we were setting up the encryption, but disabling a port because of that could get real annoying, especially if the link got knocked out by weather. We re-enabled the ports (a tedious process of going into the Cat 500 web GUI, disabling the port, applying that, re-enabling the port, applying that again).

Finally, it seemed everything was working. Looking back on it after some thought, I suspect the Catalyst 500’s are getting upset with the “Switch” Smart Ports when they don’t hear any spanning tree BPDU’s on them. STP was enabled for both VLANs on the Catalyst 500’s, we told it we had a switch connected to those ports, and it recognised the active ethernet link status. With the wireless link down however, the switches wouldn’t have seen any BPDU’s arrive on that port for several minutes (we are not running STP on the Aironets). That would be long enough for spanning tree to get upset – while on the other hand, the ethernet link status was still ok. The switch might have thought there was a faulty TX at the other end of the cable. Since these Catalyst 500 switches are seriously dumbed down, your guess is as good as mine.

Either that, or it was secretly using UDLD (unidirectional link detection), a feature often used by “real” Catalyst switches to detect faulty transmit circuitry on GBIC interfaces. Except that UDLD should only be applied on fibre interfaces (ie, where one fibre of the pair could be damaged). Copper interfaces shouldn’t need UDLD.

So, since we are not running STP on the Aironet bridges, my only guess is that enabling it might keep the Catalyst 500’s happy in the event of a wireless link failure. The Cat 500 will still see BPDU’s from its directly-attached Aironet… Enabling spanning tree on the Aironets could be Brainfuck #3 though, since there is no way (that I could find) to set the bridge priority on a Catalyst 500. In an environment such as the one I was helping set up, there were no other “real” Catalysts that we could have set as the spanning tree root bridge. So the Catalyst 500 (or Aironet) with the lowest MAC address would be elected as the root bridge. It would be seriously uncool if that switch just happened to be the one at the far end of the wireless link.
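The root bridge concern is easy to illustrate. Spanning tree elects the bridge with the lowest bridge ID, which is the priority followed by the MAC address, so with every device left at the default priority of 32768 the MAC address alone decides. In this Python sketch the device names and MAC addresses are made up for the example.

```python
from typing import NamedTuple


class Bridge(NamedTuple):
    name: str
    priority: int
    mac: str


def bridge_id(bridge: Bridge):
    """Bridge ID as a sortable tuple: priority first, then the MAC address."""
    return (bridge.priority, bytes.fromhex(bridge.mac.replace(":", "")))


def elect_root(bridges):
    """The bridge with the numerically lowest bridge ID becomes the root."""
    return min(bridges, key=bridge_id)


if __name__ == "__main__":
    # Hypothetical devices, all at the default priority.
    bridges = [
        Bridge("local-cat500", 32768, "00:1b:54:aa:00:10"),
        Bridge("remote-cat500", 32768, "00:1a:2f:00:00:01"),
        Bridge("local-aironet", 32768, "00:1c:0e:12:34:56"),
    ]
    print(elect_root(bridges).name)
```

In this made-up example the remote switch wins purely because of its MAC address, which is exactly the situation you can’t prevent without a way to set the priority.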

So what did I learn from that experience? I won’t be using a Catalyst 500 anytime soon, and even if I do, only as an access layer switch in conjunction with a more fully featured Catalyst (at least a 2950).