Linux Neophyte Troubleshooting (by Jarred)
I have to give Chris credit: he knows a lot about Linux. In fact, I'm pretty sure he's forgotten more about the subject than I have yet learned! However, being (relatively) new to Linux allows me to provide some insight that he may have glossed over. If you're an experienced Linux user, nothing I say here is likely to help, but for the rest of you: before posting this article, I thought I'd take a stab at setting up my own proxy server. The "simple" process ended up taking a couple of days of on-and-off troubleshooting to get everything working properly. What follows is a brief summary of the things I learned and experienced during my Linux proxy crash course.
First, let's start with the hardware. I had a mini-ITX motherboard and parts available, which would have been perfect! Sadly, the board only has a single network adapter and a lone x16 PCI-E slot intended for graphics, leaving no good way to add a second wired NIC, so I looked elsewhere. I ended up piecing together a system from spare parts.
Jarred's Test System

| Component | Description |
|---|---|
| Processor | Intel Core 2 Quad Q6600 (2.40GHz, 65nm, 2x4MB cache, quad-core, 1066FSB, 105W) |
| Memory | 2x2048MB DDR2-800 RAM |
| Motherboard | ASRock Conroe1333-eSATA2 |
| Hard Drives | 300GB Maxtor SATA |
| Video Card | NVIDIA GeForce 7600GT |
| Operating Systems | Arch Linux (64-bit) |
| Network Cards | Onboard NVIDIA Gigabit (nForce); PCI TRENDnet Gigabit (Realtek 8169) |
Obviously, my spare hardware is a bit more potent than what Chris had lying around, and frankly it's complete overkill for this sort of box. On the bright side, it runs 64-bit Linux quite well, and the NVIDIA GPU makes it gaming capable (if you're not too demanding). Getting Arch installed was the easy part, though; configuring things properly took quite a bit more effort.
I followed the directions and… nothing worked. Ugh. Now, I have to put a disclaimer here: I initially used an old Compaq PCI NIC as my secondary network adapter… and I discovered it was non-functional after a while spent troubleshooting. Or at least, it didn't work with Linux and caused the PC to lock up when I tried to load the driver. Good times! So make sure your hardware works properly in advance and you'll save yourself a headache or two. I picked up the TRENDnet Gigabit NIC at a local shop for just $20 and it installed without a hitch.
Old hardware isn't a problem with Linux; broken on the other hand…
As far as configuring Linux, the wikis Chris linked were generally helpful, though they're more detailed than most people will want/need. The "Arch Way" essentially boils down to giving you a fishing pole and some bait and trying to teach you to fish rather than providing you with a nice salmon dinner. Arch has benefits, and you will learn something about Linux (whether you want to or not), but if you're a newbie plan on spending a fair amount of time reading wikis and searching for solutions as you come to grips with the OS.
After Arch was running and I discovered my Compaq NIC was dead, installing the second NIC required a bit of unexpected work. Since it wasn't present during the OS install, the drivers weren't loaded by default. Using lspci, I was able to find my new NIC and determine it was a Realtek 8169 chipset, and a short Google search later I had the necessary driver loaded via modprobe r8169. After spending some time reading about ifconfig and trying a few settings, I got the NIC installed and (apparently) functional, so now it was time to get squid and shorewall configured. (Note that this would likely have been unnecessary had the NIC been present during the Arch install.)
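For reference, the steps involved look something like this (a rough sketch; the interface name and addressing are from my setup and may differ on yours):

    # Identify the add-in NIC and its chipset
    lspci | grep -i ethernet
    # Load the Realtek 8169 driver module
    modprobe r8169
    # Bring the new interface up with a static address (interface name may differ)
    ifconfig eth1 192.168.1.1 netmask 255.255.255.0 up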
While Chris likes the 10.4.20.x network, I prefer the customary 192.168.x.x. Chris listed a global DNS name server of 216.242.0.2, which will work fine (it's a name server from CiberLynx), but I put in the name servers from my ISP (Comcast). I grabbed this information from the /etc/resolv.conf file, placed there by DHCP from the cable modem. I also wanted to use DHCP as much as possible. The result is that I have my onboard NIC plugged into my cable modem, and the TRENDnet NIC connected to my wireless router. I set a static IP of 192.168.1.1 for the TRENDnet NIC, with DHCP providing IPs from 192.168.1.5 through 192.168.1.250. Really, though, I only need one address for the wireless router, which then provides its own DHCP for a different subnet: 192.168.10.x. The good thing about this setup is that I never had to touch the configuration on my wireless router, which has been working fine. I just unplugged it from the cable modem and connected it to the Linux box.
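A minimal /etc/dhcpd.conf for that layout might look roughly like this (a sketch, not my exact file; the single name server shown is the CiberLynx one Chris listed, which you'd normally swap for your own ISP's servers from /etc/resolv.conf):

    # /etc/dhcpd.conf (sketch): hand out addresses on the internal 192.168.1.x subnet
    option domain-name-servers 216.242.0.2;
    subnet 192.168.1.0 netmask 255.255.255.0 {
      range 192.168.1.5 192.168.1.250;
      option routers 192.168.1.1;
    }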
Configuring shorewall was simple, but I ended up not getting network access from my Linux box. That was a "works as intended" feature, but I wanted to surf from the Linux box as well. I had to add ACCEPT $FW net tcp www to the /etc/shorewall/rules file to get my local networking back, and I added a line to allow FTP to work as well. Getting squid to work wasn't a problem… after figuring out that Chris forgot the "transparent" option for the http_port setting. I created the directory /home/squidcache for the proxy (mkdir /home/squidcache then chmod 777 /home/squidcache), just because I liked having the cache as a root folder. With everything finally configured properly, I did some testing and found everything worked about as expected. Great! I also installed X Windows, the NVIDIA driver, and the KDE desktop environment as per the Beginner's Guide wiki—useful for editing multiple text files, surfing the web for configuration information, etc. Then I decided to reboot the Linux box to make sure it was truly working without a hitch.
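Roughly, the changes up to this point amounted to the following (a sketch; the FTP rule is illustrative of the line I added rather than a verbatim copy):

    # /etc/shorewall/rules - let the firewall box itself reach the web (and FTP)
    ACCEPT  $FW  net  tcp  www
    ACCEPT  $FW  net  tcp  ftp

    # /etc/squid/squid.conf - the transparent option Chris omitted
    http_port 3128 transparent

    # Cache directory for squid (then initialize it with squid -z)
    mkdir /home/squidcache
    chmod 777 /home/squidcache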
After the reboot, sad to say, I was back to nothing working… locally or via the proxy. Some poking around (using dmesg and ifconfig) eventually led me to the discovery that my NICs had swapped names after the reboot, so the NForce NIC was now eth1 and the TRENDnet was eth0. One suggestion I found said that if I put the drivers for my NICs into the MODULES section of rc.conf, I could specify the order. That didn't work, unfortunately, but another option involved creating a file called /etc/udev/rules.d/10-network.rules with two lines to name my NICs. (Get your MAC address via dmesg | grep [network module] or udevadm info -a -p /sys/class/net/[device: eth0/eth1/wlan0/etc.].) So I added:
SUBSYSTEM=="net", ATTR{address}=="[NVIDIA NForce MAC]", NAME="eth0"
SUBSYSTEM=="net", ATTR{address}=="[TRENDnet MAC]", NAME="eth1"
At this point, everything worked properly, but I did run into a few minor quirks over the next day or so of testing. One problem was that Futuremark's Peacekeeper benchmark stopped working. Troubleshooting by Chris ended up showing that there was a problem with the header being sent from the Futuremark server (Message: "Invalid chunk header" in /var/log/squid/cache.log). Telling squid not to cache that IP/server didn't help, as the malformed header problem persisted, but we were able to work around the issue by modifying the shorewall rules. Now the redirect line reads: REDIRECT loc 3128 tcp www - !service.futuremark.com—in other words, redirect all web traffic except for service.futuremark.com through the proxy.
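In the rules file itself, the final web-related entries end up looking something like this (a sketch using the Shorewall column layout):

    # /etc/shorewall/rules - proxy all LAN web traffic except Futuremark's service host
    REDIRECT  loc  3128  tcp  www  -  !service.futuremark.com
    ACCEPT    $FW  net   tcp  www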
Wrapping things up, here are the final configuration files that I modified for my particular setup. Providing these files almost certainly goes against the Arch Way, but hopefully having a sample configuration can help a few of you out.
/etc/dhcpd.conf: Put your own ISP name servers in here (from /etc/resolv.conf).
/etc/rc.conf: Specify your network setup, server name, and startup daemons.
/etc/shorewall/rules: The necessary redirect for web traffic to work with your proxy.
/etc/shorewall/shorewall.conf: Only changed the one line to STARTUP_ENABLED=Yes.
/etc/squid/squid.conf: Huge file full of proxy options; here's the short version without comment lines.
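To give a rough idea of what that boils down to, here's a minimal squid.conf sketch covering the setup described in this article (the cache sizes and the localnet ACL are illustrative values, not necessarily the exact ones from my final file):

    # /etc/squid/squid.conf (sketch)
    http_port 3128 transparent
    cache_mem 256 MB                               # in-RAM object cache; tune to your hardware
    cache_dir aufs /home/squidcache 20000 16 256   # ~20GB on-disk cache (see the diskd note below)
    acl localnet src 192.168.1.0/24                # internal subnet from this setup
    http_access allow localnet
    http_access deny all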
Update: It seems my proxy was throttling performance when using "diskd" for the cache directory; changing it to aufs has fixed the situation. With diskd, transfers would come in intermittent bursts at full Ethernet speeds, with other transfers limited to <500KB/s. We're not sure why this happened, but you may want to check your network transfer rates with iptraf (pacman -S iptraf, then run it and choose the "S" option to view real-time network transfers).
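The fix is a one-line change to the cache_dir entry in squid.conf (sizes here are illustrative), and iptraf makes it easy to watch the before/after transfer rates:

    # /etc/squid/squid.conf - use aufs instead of diskd for the on-disk cache
    cache_dir aufs /home/squidcache 20000 16 256

    # Install and run iptraf; the "S" option shows real-time transfer statistics
    pacman -S iptraf
    iptraf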
So what are the benefits to running the proxy cache? If you run multiple machines (I've got more than a dozen at present, with systems constantly arriving and leaving), the proxy cache means things like Windows Updates won't have to go to the web every time and download several hundred megabytes of data. That same benefit is potentially available for other services (e.g. FTP), and in an ideal world I'd be able to cache the various Steam updates. Sadly, Valve doesn't appear to like that, so all of my systems need to go out to the Valve servers to update. That said, you can manually copy your steamapps folder from one system to another and avoid the downloads. But I digress. The squid proxy can also provide a host of other capabilities, from anti-virus support to web filtering and even limiting access to certain times of the day.
The bottom line is that if you have an old system lying around—certainly my quad-core proxy is overkill, and even a Pentium 4 is more than you actually need—you can definitely benefit. A small ITX box or perhaps even an Atom nettop would be perfect for this sort of thing, but most of those lack the requisite dual NICs. You could try a PCIe NIC with mini-ITX, though it's questionable whether the x1 cards will function properly in a mini-ITX board with a single x16 slot intended for graphics use. Barring that, a uATX setup would work fine. Our only recommendation is that you consider the cost of electricity compared with the hardware. Sure, Linux will run fine on "free" old hardware, but a proxy server will generally need to be up and running 24/7, so you don't want to have a box sucking down 100W (or more) if you can avoid it.
96 Comments
ChrisRice - Tuesday, May 11, 2010
FreeBSD would certainly be my second choice for home firewall systems (first choice in the corporate scene). That being said, I've always been a fan of having the newer packages of Arch compared to Debian, in order to get many new features that you would be without in a Debian environment. As far as bugs/security holes go, comparing a rolling release to a distro with a slow-moving release system, I think they both have their own downsides.

mfenn - Tuesday, May 11, 2010
I agree with dezza that Arch should *not* be used in a "set it and forget it" box. The great thing about Debian or Red Hat is that you can choose to only receive security updates. The maintainers also backport security fixes for the supported life of the release (which for RHEL is 7 years!). Arch only provides the upstream package versions, so if you want the latest security fixes, you also get the latest functionality-killing bugs. Also, for somebody who isn't religiously running "pacman -Syu" every week or so, Arch will quickly fall into the dist-upgrade hell that you get with other distros. You've got to realize that rolling release doesn't eliminate the dist-upgrade problem, it just allows the user to spread the problems across a longer span of time (e.g. I can update every month for a year and encounter 1 problem each time, or I can upgrade every year and encounter 12 problems). For an infrequently updated system (i.e. one built by any reader of this article, because let's face it, if they were Linux geeks they would have one already) you *will* have upgrade problems. In summary, a growing trend in the Linux community is to treat Arch as a panacea, which it most certainly is not. It's great for some things (desktops for tinkerers, development with the latest and greatest, supporting oddball hardware), but a server distro it isn't.

KaarlisK - Tuesday, May 11, 2010
About x1 cards in x16 slots: could you please test that? :D You have the motherboard.
And why is the cache_mem line always half the RAM? Even if you have, for example, 8GB?
JarredWalton - Tuesday, May 11, 2010
I think Chris assumes you don't have that much RAM. You probably only need to use half the RAM or leave 1GB free, or you can get by with just caching 512MB in RAM. I have my proxy set to a 2GB RAM cache, and so most of the data comes across from the proxy at GbE speeds. If it goes to the HDD then the speed will drop to around 50MB/s, which is still plenty fast.

KaarlisK - Tuesday, May 11, 2010
Thanx for the explanation! So basically my usage pattern can determine the cache size, and there IS a use for a large cache, as your 2GB example shows.
ChrisRice - Tuesday, May 11, 2010
I would recommend starting with a smaller cache and then tweaking it up. I run a 256MB RAM cache and that works just fine for me. That being said, if I had more RAM on the hardware I am using, I would run at least 1GB.

KaarlisK - Wednesday, May 12, 2010
Thanks for the reply. And thanks again for the article! This might finally be the way to sneak... a kind-of replacement for WSUS onto a certain network. I know it's horrible, but I do not really have a choice.
enterco - Tuesday, May 11, 2010
Hi! I would like to make a few observations:
- many owners of an old PC suitable for a caching proxy are using ATX motherboards and enclosures, making the proxy a 'big noisy box', only good for keeping in the basement, if you have one.
- an old P4 computer will add quite a few bucks to the electricity bill.
- a typical computer user is not familiar with the requirements of configuring a Linux box, and will avoid this kind of setup.
- the most bandwidth-hungry application in a home is not HTTP downloads but P2P transfers.
Personally, I don't have a basement, and I don't want the noise made by such a box, nor do I want to waste space or money on electricity in my home. So, instead of bringing an old machine back to life, I would prefer to configure QoS on a wireless router.
pkoi - Sunday, May 16, 2010
Go the VMware way.

medys - Tuesday, May 11, 2010
These days virtualisation is the answer :) I know that most people do not need as much as I do, and a lot of them do not care about backups, but in case you do, there is a great way to have everything you need in one box :)
Get a semi-old PC with at least 2GB of RAM (4GB is recommended).
Install a Linux distribution on it that can run virtualisation software (VirtualBox, VMware Server, KVM).
Configure Linux as a NAS server.
Install the virtualisation software.
Create virtual machines for anything you want :) router, proxy, LAMP, application server, etc.