How to Set Up a Transparent Proxy

Setting up a Linux Box for a Transparent Malware Scanning Proxy

Written by William Keeley, Author of "Tech Tactics Money Saving Secrets"

This article is a work in progress. At the moment, it only covers filtering email and web connections. As time permits the author to experiment, other protections will be added,
such as instant messaging scanning, DNS filtering, a DHCP server built into the proxy, IRC filtering, and other services.

Instead of spending thousands on a Cisco router or equivalent Internet appliance, it is possible to build a transparent malware scanning proxy appliance with less than
five hundred dollars worth of equipment. A transparent malware scanning proxy server is a device connected between the Internet and the devices on a local area network. It
protects those devices from malicious software and hacking attempts while requiring little or no configuration on each device.

The transparent malware scanning proxy server discussed here is the frontline defense against malicious software. It should not be the only defense. Although this proxy
server should offer protection against the vast majority of badware out there, there is the possibility that some badware can slip through. Because of this
possibility, it is a good idea for each computer or device susceptible to malware infection to also run a real time antivirus scanning program. It is also a good idea to
update programs regularly in order to address known security vulnerabilities.

Setting up this transparent proxy server requires several things. These include configuring the Linux kernel, installing an antivirus program, installing appropriate
transparent proxies, and properly configuring iptables. Also included is information on how to set up a script to start each program. At present, this document
covers web (HTTP) and POP3 (email) proxying. Other features and services will be added as the author has time to do so.

This article provides a brief overview of how to set up each required software package. It does not get into the specifics of compiling each program from source. This
information can be found on each software package's website. In most cases, each of the programs listed is available as a package specific to each Linux distribution.
To go into great detail for every distribution would take an entire book.

Hardware requirements for this transparent proxy server are actually pretty minimal. All that is really required is a relatively decent computer with two network interfaces. One
connects to the Internet, modem, or other parent device. The other connects to the protected network. The system should preferably have at least 1 GB of RAM and at least a 1.2 GHz
processor. However, less powerful systems have been known to work quite well on networks with only a few computers.

Configuring the Linux Kernel and Iptables

Configuring the Linux kernel is generally easy to do, but precautions need to be taken to properly back up the system in case something goes wrong. Plenty of information is
available online on how to set up the Linux kernel, so this section only lists the features that need to be enabled. This
article was written using Gentoo Linux kernel version 2.6.34. It is recommended that the entire set of network filtering features be enabled. Below is the list of kernel
filtering options that are enabled on the example machine:

Networking Support ->

      Networking options ->
            Network packet filtering framework (Netfilter) ->
                  IP: Netfilter Configuration ->

                        <*> IPv4 connection tracking support (required for NAT)      
                        [*]   proc/sysctl compatibility with old connection tracking 
                        < > IP Userspace queueing via NETLINK (OBSOLETE)  
                        <*> IP tables support (required for filtering/masq/NAT)
                        <*>   "addrtype" address type match support    
                        <*>   "ah" match support           
                        <*>   "ecn" match support           
                        <*>   "ttl" match support            
                        <*>   Packet filtering               
                        <*>     REJECT target support      
                        <*>   LOG target support            
                        <*>   ULOG target support          
                        <*>   Full NAT                      
                        <*>     MASQUERADE target support    
                        <*>     NETMAP target support                
                        <*>     REDIRECT target support               
                        <*>     Basic SNMP-ALG support                    
                        <*>   Packet mangling                            
                        <*>     CLUSTERIP target support (EXPERIMENTAL)  
                        <*>     ECN target support   
                        <*>   "TTL" target support                       
                        <*>   raw table support (required for NOTRACK/TRACE)  
                        <*>   Security table  
                        <*> ARP tables support
                        <*>   ARP packet filtering
                        <*>   ARP payload mangling

After the correct features are selected, the kernel and modules must be compiled and installed to the system. The bootloader needs to be set to recognize the newly compiled
kernel. The iptables software can be either downloaded by visiting www.netfilter.org/downloads.html or by using distribution specific package managers.
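
Before rebooting into the new kernel, it can help to confirm that the options NAT and redirection depend on actually made it into the configuration. The following is a minimal sketch, assuming a POSIX shell and a plain-text kernel config file. The helper name missing_kernel_opts is hypothetical, and the symbol names are an assumed subset matching 2.6-era kernels; newer kernels rename some of them, so the list should be checked against the kernel in use.

```shell
#!/bin/sh
# Sketch: print each NAT-related option that is NOT enabled in a kernel config.
# The option list is an assumed subset for 2.6-era kernels, not the full menu.

missing_kernel_opts() {
    # $1 = path to a kernel config file
    for opt in CONFIG_NF_CONNTRACK_IPV4 CONFIG_IP_NF_IPTABLES \
               CONFIG_IP_NF_FILTER CONFIG_NF_NAT \
               CONFIG_IP_NF_TARGET_MASQUERADE CONFIG_IP_NF_TARGET_REDIRECT; do
        # An option counts as enabled when set to y (built in) or m (module).
        grep -Eq "^${opt}=(y|m)$" "$1" || echo "$opt"
    done
}

# Example call (the path is an assumption and varies by system):
# missing_kernel_opts /usr/src/linux/.config
```

An empty result means every option in the list is enabled; any name printed needs to be switched on before the kernel is rebuilt.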

After the kernel filtering and iptables setup is completed and tested, the next step involves setting up the antivirus program.

Setting up the Antivirus Program

The transparent proxy can use just about any antivirus program that runs on Linux and allows console communication. A few examples of such programs include Arcavir Socket Scanner,
Avast, AVG, ClamAV, Dr. Web, F-Prot, Kaspersky, NOD32, Sophos, Trend Micro Library Scanner, and many others.

The example system on which this article was written uses the free, open source Clam Antivirus. Clam Antivirus may be free and open source, but it is supported by many contributors who use the platform to both save and make money. It is not a
half-hearted hobby project. Clam Antivirus is used and supported by many companies that manufacture Internet security equipment.
It can be downloaded from http://www.clamav.net or installed via a distribution's package manager. ClamAV is relatively easy to set up. The configuration file used by the Clam
Antivirus platform is usually located at /etc/clamd.conf. The options used on the example system are shown below (most comments included in the configuration file are removed):

# /etc/clamd.conf
LogFile /var/log/clamav/clamd.log
LogTime yes
ExtendedDetectionInfo yes
PidFile /var/run/clamav/clamd.pid
LocalSocket /var/run/clamav/clamd.sock
TCPSocket 3310
User clamav
AllowSupplementaryGroups yes
DetectPUA yes
ScanMail yes

A dedicated user account should be created for Clam Antivirus as well. It must match the User setting in clamd.conf; on the example system the user is clamav.
Antivirus definitions also need to be updated regularly for any antivirus program to remain effective. The Clam Antivirus platform uses a program
called freshclam to update its definitions. The configuration file for freshclam is usually located at /etc/freshclam.conf. The settings used on the example system are shown below:

# /etc/freshclam.conf
UpdateLogFile /var/log/clamav/freshclam.log
PidFile /var/run/clamav/freshclam.pid
DatabaseOwner clamav
AllowSupplementaryGroups yes
DatabaseMirror database.clamav.net
ScriptedUpdates yes
NotifyClamd /etc/clamd.conf

This file also has its comments removed for brevity's sake. A perusal of the configuration files as installed will show all of the comments that are normally included.

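It is a good idea to have the virus definition databases checked at least a few times each day in order to catch the newest known malware. To run freshclam
every 3 hours, try adding the following line to /etc/crontab (cron implementations such as Vixie cron and cronie expect a user name, for example root, between the schedule and the command in this file):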

0 */3 * * *  /usr/bin/freshclam --quiet

For those who do not want to change the crontab file, freshclam can also be run in daemon mode. This is accomplished by issuing the following command:

/usr/bin/freshclam -d -c 8

The -d parameter tells freshclam to run as a daemon or background process. The -c 8 parameter tells freshclam to check for updates 8 times each day. This amounts to one check
every three hours.
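
The relationship between the -c value and the update interval is simple division. As a small worked example, the sketch below uses a hypothetical helper named checks_to_interval_hours (not part of freshclam) to convert a number of checks per day into hours between checks. It uses integer arithmetic, so values that do not divide 24 evenly are rounded down.

```shell
#!/bin/sh
# Sketch: hours between update checks for a given number of checks per day.

checks_to_interval_hours() {
    # $1 = checks per day, as passed to freshclam -c
    echo $(( 24 / $1 ))
}

checks_to_interval_hours 8    # prints 3
checks_to_interval_hours 12   # prints 2
```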

Setting Up the Web Proxy

Once the antivirus platform is set up, it is time to set up the individual proxy servers. For the web proxy, the example system is set up to use HTTP Antivirus Proxy, or HAVP.
HAVP can be downloaded by visiting http://www.server-side.de/ . It can also be installed by using distribution specific package managers. HAVP has several features beyond
those of a simple antivirus proxy server. HAVP can block certain web URLs by using a simple blacklist. HAVP can also allow other URLs to be loaded without being scanned for
malware by being included on a whitelist. The HAVP http proxy does not protect https URLs because communications via https are encrypted and HAVP has no way of
scanning the information passed. However, at the time of writing, most web traffic happens over unencrypted http channels.

HAVP has three main configuration files and many reporting template files. The template files are basically html files that are served to a client device when HAVP detects a
transmission error, a malware infected file, or a file that is blacklisted. These html templates are stored in subdirectories usually located under /etc/havp/templates . Each subdirectory
is named after the two letter abbreviation of a supported language. One can modify these html files in order to provide custom branding. The three
configuration files are usually named /etc/havp/havp.config, /etc/havp/whitelist, and /etc/havp/blacklist .

The havp.config file tells the HAVP proxy how to behave: how to start up, which port to listen on, whether the proxy is a transparent proxy (the setup discussed here is),
and which antivirus platform to use. There is also a line in havp.config that brings the proxy to a screeching halt unless it is commented out or
removed. This line is:

REMOVETHISLINE deleteme
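
Forgetting this line is an easy mistake to make, so a startup script may want to check for it before launching the proxy. The following is a minimal sketch, assuming a POSIX shell; the helper name havp_config_ready is hypothetical and not part of HAVP. It treats the directive as active only when it is not commented out.

```shell
#!/bin/sh
# Sketch: refuse to continue while an active REMOVETHISLINE directive remains.

havp_config_ready() {
    # $1 = path to the HAVP configuration file
    if grep -Eq '^[[:space:]]*REMOVETHISLINE' "$1"; then
        echo "REMOVETHISLINE is still active in $1" >&2
        return 1
    fi
    return 0
}

# Example call (path as used in this article):
# havp_config_ready /etc/havp/havp.config && /usr/sbin/havp
```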

There are many other features for havp that can be turned on or off simply by modifying the configuration files. These include whether the blacklist or whitelist takes precedence,
the number of connections havp can handle at a time, whether and which parent proxy is used as well as many other features. It is best to scroll through the configuration file
and take a look at everything there.

The /etc/havp/havp.config that is used on the example system is listed below. Most comments are removed for the sake of brevity, and a few explanatory comments have been added.

# /etc/havp/havp.config
# This is a transparent proxy, therefore transparent needs to be true
TRANSPARENT true
# On the example computer havp is instructed to listen on port 777
PORT 777
# Havp can link to the clamav libraries instead of relying on clamd. This results in less overhead, so it is used on the example computer.
ENABLECLAMLIB true
# On the example computer the antivirus database is located in /var/lib/clamav. Please check the corresponding config files for clamav if there are problems.
CLAMDBDIR /var/lib/clamav
# Since the example system is using havp's ability to link to Clamav's libraries, it doesn't need to use clamd
ENABLECLAMD false
# On the example system, clamav is being used and the others are not; therefore, they are disabled
ENABLEFPROT false
ENABLEAVG false
ENABLEAVESERVER false
ENABLESOPHIE false
ENABLENOD32 false
ENABLEAVAST false
ENABLEARCAVIR false
ENABLEDRWEB false
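
With this many on/off switches, it is worth confirming which scanners a given configuration file actually turns on. The sketch below is one way to do that, assuming a POSIX shell with grep and awk available. The helper name enabled_scanners is hypothetical, and it relies only on the "NAME true" layout shown in the listing above.

```shell
#!/bin/sh
# Sketch: print every ENABLE... directive that is set to true.
# Commented-out lines and directives set to false are skipped.

enabled_scanners() {
    # $1 = path to the HAVP configuration file
    grep -E '^[[:space:]]*ENABLE[A-Z0-9]+[[:space:]]+true[[:space:]]*$' "$1" |
        awk '{ print $1 }'
}

# Example call (path as used in this article):
# enabled_scanners /etc/havp/havp.config
```

For the configuration listed above, the only name printed would be ENABLECLAMLIB.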

The next two files that affect the behavior of HAVP are the whitelist and blacklist files. While it is generally good to block most executable content, there are some cases where
it may be desirable for users behind the firewall to be able to download executable files. Examples include valuable program updates, some malware
database updates, and Microsoft Windows updates. The whitelist allows the use of the wildcard character (*). This lets the whitelist allow content from a
whole range of sites as well as a range of individual pages without having to use thousands of lines of information. The whitelist used on the example system (minus many of the
comments for brevity) is listed below:

# /etc/havp/whitelist

# Let's allow content from Adobe's network
*.adobe.com/*

# Allow computers to download clamav
*sourceforge.net/*clamav-*

# allow computers to download updates and executables from Microsoft's websites
*.microsoft.com/*
*.windowsupdate.com/*

The last part of the HAVP configuration is the blacklist file. The blacklist is where most of the protective filtering takes place. On the example system, this is where
most of the executable content is blocked. On systems that are set up to use a parent proxy such as squid or privoxy, it is best to filter content using the parent proxy and leave
the whitelist and blacklist for HAVP empty so that each and every file that is passed to the protected network from the web is scanned for malicious software. Using parent proxies will
likely be discussed in future versions of this article. The blacklist file, like the whitelist file, allows the use of wildcard characters, so it can block content
from a whole range of sites as well as a range of individual pages without needing thousands of lines. One other consideration when blocking access to content based
on Windows file extensions is that HAVP treats EXE, exe, and ExE as different, while Windows does not. This means that if a click-happy user receives a link in
her email inviting her to visit http://www.badwaregreetingcards.com/beautiful/infectedflowerscard.eXe and decides to visit the site, HAVP will allow the executable file to be downloaded,
provided it contains no malware known to the antivirus software, unless files ending with .eXe are also blocked. The blacklist file used on the example system is listed below:

# /etc/havp/blacklist
*/*.bat
*/*.baT
*/*.bAt
*/*.bAT
*/*.Bat
*/*.BaT
*/*.BAt
*/*.BAT

*/*.bz2
*/*.bZ2
*/*.Bz2
*/*.BZ2

*/*.com
*/*.coM
*/*.cOm
*/*.cOM
*/*.Com
*/*.CoM
*/*.COm
*/*.COM

*/*.cmd
*/*.cmD
*/*.cMd
*/*.cMD
*/*.Cmd
*/*.CmD
*/*.CMd
*/*.CMD

*/*.cpl
*/*.cpL
*/*.cPl
*/*.cPL
*/*.Cpl
*/*.CpL
*/*.CPl
*/*.CPL

*/*.exe
*/*.exE
*/*.eXe
*/*.eXE
*/*.Exe
*/*.ExE
*/*.EXe
*/*.EXE

*/*.hta
*/*.htA
*/*.hTa
*/*.hTA
*/*.Hta
*/*.HtA
*/*.HTa
*/*.HTA

# */*.js
# */*.jse

*/*.lnk
*/*.lnK
*/*.lNk
*/*.lNK
*/*.Lnk
*/*.LnK
*/*.LNk
*/*.LNK

*/*.msi
*/*.msI
*/*.mSi
*/*.mSI
*/*.Msi
*/*.MsI
*/*.MSi
*/*.MSI

*/*.pif
*/*.piF
*/*.pIf
*/*.pIF
*/*.Pif
*/*.PiF
*/*.PIf
*/*.PIF

*/*.reg
*/*.reG
*/*.rEg
*/*.rEG
*/*.Reg
*/*.ReG
*/*.REg
*/*.REG

*/*.scf
*/*.scF
*/*.sCf
*/*.sCF
*/*.Scf
*/*.ScF
*/*.SCf
*/*.SCF

*/*.scr
*/*.scR
*/*.sCr
*/*.sCR
*/*.Scr
*/*.ScR
*/*.SCr
*/*.SCR

*/*.sfx
*/*.sfX
*/*.sFx
*/*.sFX
*/*.Sfx
*/*.SfX
*/*.SFx
*/*.SFX

*/*.vb
*/*.vB
*/*.Vb
*/*.VB

*/*.vbe
*/*.vbE
*/*.vBe
*/*.vBE
*/*.Vbe
*/*.VbE
*/*.VBe
*/*.VBE

*/*.vbs
*/*.vbS
*/*.vBs
*/*.vBS
*/*.Vbs
*/*.VbS
*/*.VBs
*/*.VBS

*/*.ws
*/*.wS
*/*.Ws
*/*.WS

*/*.wsc
*/*.wsC
*/*.wSc
*/*.wSC
*/*.Wsc
*/*.WsC
*/*.WSc
*/*.WSC

*/*.wsf
*/*.wsF
*/*.wSf
*/*.wSF
*/*.Wsf
*/*.WsF
*/*.WSf
*/*.WSF

*/*.wsh
*/*.wsH
*/*.wSh
*/*.wSH
*/*.Wsh
*/*.WsH
*/*.WSh
*/*.WSH
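
Typing every case variant by hand is tedious and easy to get wrong. The following is a minimal sketch, assuming a POSIX shell, of a hypothetical helper named case_variants that prints one blacklist line for every upper and lower case combination of an extension. The helper is not part of HAVP; it only generates text in the */*.ext form used in the list above. Characters without a case, such as digits, are left as they are.

```shell
#!/bin/sh
# Sketch: print "*/*.ext" for every upper/lower case combination of an extension.
# The function body runs in a subshell so its variables do not leak out.

case_variants() (
    rest=$1
    prefixes="."                      # start with just the dot before the extension
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}        # first character of what is left
        rest=${rest#?}                # drop that character
        lower=$(printf '%s' "$ch" | tr '[:upper:]' '[:lower:]')
        upper=$(printf '%s' "$ch" | tr '[:lower:]' '[:upper:]')
        next=""
        for p in $prefixes; do
            if [ "$lower" = "$upper" ]; then
                next="$next $p$ch"    # digit or symbol: only one form exists
            else
                next="$next $p$lower $p$upper"
            fi
        done
        prefixes=$next
    done
    for p in $prefixes; do
        printf '*/*%s\n' "$p"
    done
)

case_variants bz2
```

Run as written, this prints the four .bz2 lines in the same order as the list above. Redirecting the output of several calls into a file is one way to build a blacklist for additional extensions.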

With all three configuration files discussed, it is time to work on screening incoming email for malicious content. HAVP does not do this, but there is
another program that does.

Setting Up The Mail Proxy

P3scan is a program that works with antivirus programs to screen incoming email for malicious content. P3scan can be downloaded by visiting
http://p3scan.sourceforge.net/ or by using distribution specific package managers. Although P3scan can be used with any number of antivirus programs, the example system on
which this article was written uses the Clamav antivirus suite for p3scan as well as for the other scans. One good thing about p3scan is that it uses one main configuration
file, which is usually located at /etc/p3scan/p3scan.conf, and one template file that tells the user why an email message has been altered. The template file is
/etc/p3scan/p3scan.mail, and it is usually a symbolic link pointing to the notification template written in the language appropriate to the users behind the firewall.
For English-speaking users, /etc/p3scan/p3scan.mail is symbolically linked to /etc/p3scan/p3scan-en.mail.

The following configuration file is what is used on the example system. Most comments are removed for brevity.

# /etc/p3scan/p3scan.conf

maxchilds = 20
scannertype = clamd
# Below line corresponds to the line TCPSocket 3310 in /etc/clamd.conf
scanner = 127.0.0.1:3310
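
Because the port appears in two separate files, it is easy for the two to drift apart after an edit. The sketch below, assuming a POSIX shell with sed and awk, compares the TCPSocket value in clamd.conf with the port on the scanner line of p3scan.conf. The helper names clamd_port, p3scan_port, and ports_match are hypothetical, and the parsing assumes the simple "TCPSocket N" and "scanner = host:port" layouts shown in this article.

```shell
#!/bin/sh
# Sketch: check that p3scan points at the TCP port clamd listens on.

clamd_port() {
    # $1 = path to clamd.conf; print the value of the first TCPSocket line
    awk '$1 == "TCPSocket" { print $2; exit }' "$1"
}

p3scan_port() {
    # $1 = path to p3scan.conf; print the digits after the last colon
    # on the "scanner = host:port" line
    sed -n 's/^[[:space:]]*scanner[[:space:]]*=.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" |
        head -n 1
}

ports_match() {
    # $1 = clamd.conf, $2 = p3scan.conf; succeeds only when both ports agree
    [ -n "$(clamd_port "$1")" ] && [ "$(clamd_port "$1")" = "$(p3scan_port "$2")" ]
}

# Example call (paths as used in this article):
# ports_match /etc/clamd.conf /etc/p3scan/p3scan.conf || echo "port mismatch"
```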

After p3scan is set up, one final task remains: putting together a script that makes everything work together.

Now that the hard work is done, the fun begins. The last part of this article is where things come together. This is where the decisions are made about which programs
are started and where they are started. This part can and should be different for each administrator depending on what the network is used for and how much
security is actually needed. The example system on which this article was written is used in a computer repair shop where malware infected computers are repaired. The example
transparent proxy uses a wireless connection to connect to the Internet and a wired ethernet port to connect to a Linksys router, which provides the protected computer network
with its Internet connectivity. This allows the author to download good programs while denying Internet access to any rogue program that may be running. On this example system,
eth0 is the ethernet port that provides connectivity to the protected computers or network, and eth1 is the interface that connects the transparent proxy server to the Internet.
The transparent proxy server described in this article is a Dell laptop computer running Gentoo Linux. The example system is also used for many other purposes,
such as editing video and audio and writing books and documents.

Each executable line in the script is preceded by a comment that explains what the line does.

The script starts below:

#!/bin/bash

#Remount root file system with mandatory locks in place so that HAVP works
/bin/mount / -o mand -o remount

#Update the antivirus signatures to catch the latest known malware
/usr/bin/freshclam

#Run freshclam so that it checks for and downloads any new definition file every 3 hours
/usr/bin/freshclam -d -c 8

#start clamd for pop3 scanning
/usr/sbin/clamd

#Start HAVP transparent proxy
/usr/sbin/havp

#start p3scan transparent proxy
/usr/sbin/p3scan

#Set up network interface that supplies the protected network with connectivity. On this system, it's eth0
/sbin/ifconfig eth0 192.168.5.1 up

#Tell Linux Kernel it is OK to forward Internet packets
echo 1 > /proc/sys/net/ipv4/ip_forward

#Forward Web Requests to Transparent Proxy for virus filtering. Havp is configured to run on port 777
/sbin/iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 777


#Forward pop3 (email receiving) requests to transparent proxy for filtering.  P3scan is configured to listen on port 8110
/sbin/iptables -t nat -A PREROUTING -p tcp --dport 110 -j REDIRECT --to-ports 8110 

#Forward all DNS requests
/sbin/iptables -t nat -A POSTROUTING -p tcp --dport 53 -o eth1 -j MASQUERADE
/sbin/iptables -t nat -A POSTROUTING -p udp --dport 53 -o eth1 -j MASQUERADE

#Forward all encrypted web requests to sites without any filtering or virus protection.  This may be modified or commented out as desired.
/sbin/iptables -t nat -A POSTROUTING -p tcp --dport 443 -o eth1 -j MASQUERADE

#Forward all other requests as normal.  It is listed here for information purposes but is commented out for author's needs
#/sbin/iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

#These next two lines give information for manual configuration. DHCPD is not used because protected network has its own router
echo Use 192.168.5.1 for gateway.
cat /etc/resolv.conf
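
The PREROUTING rules in the script match port 80 and port 110 traffic arriving on any interface. Some administrators may prefer to limit the redirect to traffic arriving from the protected network. The following sketch is an assumption about how the script could be tightened, not part of the example system. It uses a hypothetical helper named redirect_rule to print the same rules with the iptables -i option naming the LAN interface. It only prints the commands; nothing changes until they are run by hand as root.

```shell
#!/bin/sh
# Sketch: print REDIRECT rules limited to one incoming interface.
# Nothing is executed here; the output is meant to be reviewed first.

redirect_rule() {
    # $1 = LAN interface, $2 = original destination port, $3 = local proxy port
    echo "/sbin/iptables -t nat -A PREROUTING -i $1 -p tcp --dport $2 -j REDIRECT --to-ports $3"
}

# Interface and ports as used on the example system in this article.
redirect_rule eth0 80 777     # web traffic to HAVP
redirect_rule eth0 110 8110   # POP3 traffic to p3scan
```

Passing the interface as a parameter keeps the rules in one place if the LAN side is ever moved to a different port such as eth2.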

Final Word

This concludes this article. It is the hope of the author that it provides needed and useful information to many people. As
previously stated, this article is a work in progress, and more material will be added in the future.

Tech Tactics - Money Saving Secrets

Tuesday, January 31, 2012