Bug #7763

nothing works

Added by Anonymous over 1 year ago.

Status: New
Start date: 07/17/2016
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Component:

Description

#!/usr/bin/perl #
  1. CGIProxy 2.1.17 #
  2. CGIProxy (nph-proxy.cgi): a proxy in the form of a CGI script.
  3. Retrieves the resource at any HTTP or FTP URL, updating embedded URLs
  4. in HTML and all other resources to point back through this script. By
  5. default, no user info is sent to the server. Options include
  6. text-only proxying to save bandwidth, cookie filtering, ad filtering,
  7. script removal, user-defined encoding of the target URL, and much more.
  8. Besides running as a CGI script, can also run under mod_perl, as a
  9. FastCGI script, or can use its own embedded HTTP server.
  10. Requires Perl 5. #
  11. Copyright (C) 1996, 1998-2016 by James Marshall,
  12. All rights reserved. Free for non-commercial use; commercial use
  13. requires a license. #
  14. For the latest, see https://jmarshall.com/tools/cgiproxy/ # #
  15. IMPORTANT NOTE ABOUT ANONYMOUS BROWSING: #
  16. CGIProxy was originally made for indirect browsing more than
  17. anonymity, but since people are using it for anonymity, I've tried
  18. to make it as anonymous as possible. Suggestions welcome. For best
  19. anonymity, browse with JavaScript turned off. That said, please notify
  20. me if you find any privacy holes, even when using JavaScript.
  21. Anonymity is good, but may not be bulletproof. For example, if even
  22. a single unchecked JavaScript statement can be run, your anonymity
  23. can be compromised. I've tried to handle JS in every place it can
  24. exist, but please tell me if I missed any. Also, browser plugins
  25. or other executable extensions may be able to reveal you to a server.
  26. Also, be aware that this script doesn't modify PDF files or other
  27. third-party document formats that may contain linking ability, so
  28. you will lose your anonymity if you follow links in such files.
  29. If you find any other way your anonymity can be compromised, please let
  30. me know. # #
  31. INSTALLATION: #
  32. First, edit this file (nph-proxy.cgi) to configure it-- see the CONFIGURATION
  33. section just below for certain options that may be required. All
  34. configuration variables are set in the "user configuration" section starting
  35. around line 338.
  36. After copying nph-proxy.cgi to your server, run "./nph-proxy.cgi init"
  37. from the server command line (on Windows, run "perl nph-proxy.cgi init").
  38. This creates needed directories, installs all optional Perl (CPAN) modules,
  39. and creates the database that CGIProxy uses. Ignore the scrolling text,
  40. and hit <return> if asked any questions. Ideally you can run this command
  41. as root to set file permissions and ownership optimally, but even if run as
  42. non-root these will be handled as well as possible and the script should
  43. still work.
  44. To see a simple usage message, run "./nph-proxy.cgi -?".
  45. It's fine to rename this file, as long as your Web server is set up to
  46. recognize it. All of the documentation refers to "nph-proxy.cgi",
  47. but replace that with whatever you renamed the file to. #
  48. For complete installation instructions, see
  49. https://jmarshall.com/tools/cgiproxy/install.html # #
  50. CONFIGURATION: #
  51. . Set $PROXY_DIR and $RUN_AS_USER -- see the comments above those settings
  52. for details.
  53. . If you don't have root access on your server, set $LOCAL_LIB_DIR so that
  54. the Perl (CPAN) modules can be installed under your own directory. Do
  55. this before running "./nph-proxy.cgi init", as described above.
  56. . If you're using either a MySQL/MariaDB or Oracle database to store cookies,
  57. you need to set $DB_DRIVER, $DB_USER, $DB_PASS, and possibly $DB_SERVER .
  58. See the notes by those settings for more details. Note that you need to
  59. purge the database periodically by running "./nph-proxy.cgi purge-db",
  60. with a cron job on Unix or Mac, or with the Task Scheduler in Windows.
  61. The default database driver is SQLite, which doesn't need a username or
  62. password or even a running database engine, but still requires periodic
  63. purging.
  64. . If you're using another HTTP or SSL proxy, set $HTTP_PROXY,
  65. $SSL_PROXY, and $NO_PROXY as needed. If those proxies use
  66. authentication, set $PROXY_AUTH and $SSL_PROXY_AUTH accordingly.
  67. . If you're using a SOCKS proxy, set $SOCKS_PROXY and possibly
  68. $SOCKS_USERNAME and $SOCKS_PASSWORD .
  69. . If this is running on an insecure server that doesn't use port 80, set
  70. $RUNNING_ON_SSL_SERVER=0 (otherwise, the default of '' is fine).
  71. . If you plan to run CGIProxy as a FastCGI script, set at least
  72. $SECRET_PATH and see the configuration section "FastCGI configuration".
  73. . If you plan to run CGIProxy using its own embedded server, set
  74. $SECRET_PATH and see the configuration section "Embedded server configuration".
  75. You'll also need a certificate and private key (key pair) in PEM
  76. format.
  77. . See http://www.jmarshall.com/tools/cgiproxy/options.html#env , in the section
  78. "OPTIONS RELATED TO YOUR SERVER/NETWORK ENVIRONMENT", for other options
  79. you may need to set. #
  80. Other options include:
  81. . Set $TEXT_ONLY, $REMOVE_COOKIES, $REMOVE_SCRIPTS, $FILTER_ADS,
  82. $HIDE_REFERER, and $INSERT_ENTRY_FORM as desired. Set
  83. $REMOVE_SCRIPTS if anonymity is important.
  84. . To let the user choose all of those settings (except $TEXT_ONLY),
  85. set $ALLOW_USER_CONFIG=1.
  86. . To change the encoding format of the URL, modify the
  87. proxy_encode() and proxy_decode() routines. The default
  88. routines are suitable for simple PATH_INFO compliance.
  89. . To encode cookies, modify the cookie_encode() and cookie_decode()
  90. routines.
  91. . You can restrict which servers this proxy will access, with
  92. @ALLOWED_SERVERS and @BANNED_SERVERS.
  93. . Similarly, you can specify allowed and denied server lists for
  94. both cookies and scripts.
  95. . For security, you can ban access to private IP ranges, with
  96. @BANNED_NETWORKS.
  97. . If filtering ads, you can customize this with a few settings.
  98. . To insert your own block of HTML into each page, set $INSERT_HTML
  99. or $INSERT_FILE.
  100. . As a last resort, if you really can't run this script as NPH,
  101. you can try to run it as non-NPH by setting $NOT_RUNNING_AS_NPH=1.
  102. BUT, read the notes and warnings above that line. Caveat surfor.
  103. . For crude load-balancing among a set of proxies, set @PROXY_GROUP.
  104. . Other config is possible; see the user configuration section.
  105. . If heavy use of this proxy puts a load on your server, see the
  106. "NOTES ON PERFORMANCE" section below. #
  107. For more info, read the comments above any config options you set. #
  108. For a full list of options, see https://jmarshall.com/tools/cgiproxy/options.html #
  109. This script MUST be installed as a non-parsed header (NPH) script.
  110. In Apache and many other servers, this is done by simply starting the
  111. filename with "nph-". It MAY be possible to fake it as a non-NPH
  112. script, MOST of the time, by using the $NOT_RUNNING_AS_NPH feature.
  113. This is not advised. See the comments by that option for warnings. # #
  114. TO USE:
  115. Start a browsing session by visiting the script's URL with no parameters.
  116. You can bookmark pages you browse to through the proxy, or link to
  117. the URLs that are generated. # #
  118. NOTES ON PERFORMANCE:
  119. Unfortunately, this has gotten slower through the versions, mostly
  120. because of optional new features. Configured equally, version 1.3
  121. takes 25% longer to run than 1.0 or 1.1 (based on *cough* highly
  122. abbreviated testing). Compiling takes about 50% longer.
  123. Leaving $REMOVE_SCRIPTS=1 adds 25-50% to the running time.
  124. Remember that we're talking about tenths of a second here. Most of
  125. the delay experienced by the user is from waiting on two network
  126. connections. These performance issues only matter if your server
  127. CPU is getting overloaded. Also, these mostly matter when retrieving
  128. JavaScript and Flash, because modifying those is what takes most of the
  129. time.
  130. If you can, use mod_perl. Starting with version 1.3.1, this should
  131. work under mod_perl, which requires Perl 5.004 or later. If you use
  132. mod_perl, be careful to install this as an NPH script, i.e. set the
  133. "PerlSendHeader Off" configuration directive (or "PerlOptions -ParseHeaders"
  134. if using mod_perl 2.x). For more info, see the mod_perl documentation.
  135. If you can't use mod_perl, try using FastCGI. Configure the section
  136. "FastCGI configuration" below, and run nph-proxy.cgi from the command
  137. line to see a usage message. You'll also need to configure your
  138. Web server to use FastCGI.
  139. If you can't use mod_perl or FastCGI, try running CGIProxy as its own
  140. embedded server. Configure the section "Embedded server configuration",
  141. and run nph-proxy.cgi from the command line to see a usage message.
  142. You'll also need a key pair (certificate and private key).
  143. If you use mod_perl, FastCGI, or the embedded server, and modify this
  144. script, see the note near the "reset 'a-z'" line below, regarding
  145. UPPER_CASE and lower_case variable names. #
  146. If performance on the browser is bad for JS-heavy sites like Facebook,
  147. then close other browser windows and other CPU-heavy processes, and
  148. see the comments above the setting of %REDIRECTS below. Also, try
  149. using a browser other than MSIE-- it seems to have the most problems. # #
  150. TO DO:
  151. What I want to hear about:
  152. . Any HTML tags not being converted here.
  153. . Any method of introducing JavaScript or other script, that's not
  154. being handled here.
  155. . Any script MIME types other than those already in @SCRIPT_MIME_TYPES.
  156. . Any MIME types other than text/html that have links that need to
  157. be converted.
  158. plug any other script holes (e.g. MSIE-proprietary, other MIME types?)
  159. more error checking?
  160. find a simple encryption technique for proxy_encode()
  161. For ad filtering, add option to disable images from servers other than
  162. that of the containing HTML page? Is it worth it? # #
  163. BUGS:
  164. Anonymity may not be perfect. In particular, there may be some remaining
  165. JavaScript or Flash holes. Please let me know if you find any.
  166. Since ALL of your cookies are sent to this script (which then chooses
  167. the relevant ones), some cookies could be dropped if you accumulate a
  168. lot, resulting in "Bad Request" errors. To fix this, use a database
  169. server for cookies. # #
  170. I first wrote this in 1996 as an experiment to allow indirect browsing.
  171. The original seed was a program I wrote for Rich Morin's article
  172. in the June 1996 issue of Unix Review, online at
  173. http://www.cfcl.com/tin/P/199606.shtml. #
  174. Confession: I didn't originally write this with the spec for HTTP
  175. proxies in mind, and there are probably some violations of the protocol
  176. (at least for proxies). This whole thing is one big violation of the
  177. proxy model anyway, so I hereby rationalize that the spec can be widely
  178. interpreted here. If there is demand, I can make it more conformant.
  179. The HTTP client and server components should be fine; it's just the
  180. special requirements for proxies that may not be followed. #
    #--------------------------------------------------------------------------
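The notes above say the URL encoding can be changed by modifying proxy_encode() and proxy_decode(). As a rough illustration only, a replacement pair might look like the sketch below, assuming a plain hex encoding. The names my_proxy_encode() and my_proxy_decode() are hypothetical, chosen so they are not mistaken for the script's own routines, and this is not the encoding CGIProxy ships with. Hex is obfuscation, not encryption.

```perl
use strict;
use warnings;

# Hypothetical replacement pair; NOT CGIProxy's default routines.
# Hex output uses only [0-9a-f], so it is safe to carry in PATH_INFO.
sub my_proxy_encode {
    my ($url) = @_;
    return unpack('H*', $url);
}

# Inverse of my_proxy_encode(): turn the hex string back into the URL.
sub my_proxy_decode {
    my ($encoded) = @_;
    return pack('H*', $encoded);
}

my $encoded = my_proxy_encode('http://example.com/a?b=c');
my $decoded = my_proxy_decode($encoded);
print "$encoded\n$decoded\n";
```

Whatever scheme is chosen, the two routines have to be exact inverses, which is the property the round trip above checks.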

use strict ;
use warnings ;
no warnings qw(uninitialized redefine) ; # we use defaults all the time

use Encode ;
use IO::Handle ;
use IO::Select ;
use File::Spec ;
use Time::Local ;
use Getopt::Long ;
use Socket qw(:all) ;
use Net::Domain qw(hostfqdn) ;
use Fcntl qw(:DEFAULT :flock) ;
use POSIX qw(:sys_wait_h setsid);
use Time::HiRes qw(gettimeofday tv_interval) ;
use Errno qw(EINTR EAGAIN EWOULDBLOCK ENOBUFS EPIPE) ;

  1. First block below is config variables, second block is sort-of config
  2. variables, third block is persistent constants, fourth block is would-be
  3. persistent constants (not set until needed), fifth block is constants for
  4. JavaScript processing (mostly regular expressions), and last block is
  5. variables.
  6. Removed $RE_JS_STRING_LITERAL to help with Perl's long-literal-string bug,
  7. but can replace it later if/when that is fixed. Added
  8. $RE_JS_STRING_LITERAL_START, $RE_JS_STRING_REMAINDER_1, and
  9. $RE_JS_STRING_REMAINDER_2 as part of the workaround.
    use vars qw(
    $PROXY_DIR $SECRET_PATH $LOCAL_LIB_DIR
    $FCGI_SOCKET $FCGI_MAX_REQUESTS_PER_PROCESS $FCGI_NUM_PROCESSES
    $PRIVATE_KEY_FILE $CERTIFICATE_FILE $RUN_AS_USER $EMB_USERNAME $EMB_PASSWORD
    $DB_DRIVER $DB_SERVER $DB_NAME $DB_USER $DB_PASS $USE_DB_FOR_COOKIES
    %REDIRECTS %TIMEOUT_MULTIPLIER_BY_HOST
    $DEFAULT_LANG
    $TEXT_ONLY
    $REMOVE_COOKIES $REMOVE_SCRIPTS $FILTER_ADS $HIDE_REFERER
    $INSERT_ENTRY_FORM $ALLOW_USER_CONFIG
    $ENCODE_DECODE_BLOCK_IN_JS
    @ALLOWED_SERVERS @BANNED_SERVERS @BANNED_NETWORKS
    $NO_COOKIE_WITH_IMAGE @ALLOWED_COOKIE_SERVERS @BANNED_COOKIE_SERVERS
    @ALLOWED_SCRIPT_SERVERS @BANNED_SCRIPT_SERVERS
    @BANNED_IMAGE_URL_PATTERNS $RETURN_EMPTY_GIF
    $USER_IP_ADDRESS_TEST $DESTINATION_SERVER_TEST
    $INSERT_HTML $INSERT_FILE $ANONYMIZE_INSERTION $FORM_AFTER_INSERTION
    $INSERTION_FRAME_HEIGHT
    $RUNNING_ON_SSL_SERVER $NOT_RUNNING_AS_NPH $USER_FACING_PORT
    $HTTP_PROXY $SSL_PROXY $NO_PROXY $PROXY_AUTH $SSL_PROXY_AUTH
    $SOCKS_PROXY $SOCKS_USERNAME $SOCKS_PASSWORD
    $MINIMIZE_CACHING
    $SESSION_COOKIES_ONLY $COOKIE_PATH_FOLLOWS_SPEC $RESPECT_THREE_DOT_RULE
    @PROXY_GROUP
    $USER_AGENT $USE_PASSIVE_FTP_MODE $SHOW_FTP_WELCOME
    $PROXIFY_SCRIPTS $PROXIFY_SWF $ALLOW_RTMP_PROXY $ALLOW_UNPROXIFIED_SCRIPTS
    $PROXIFY_COMMENTS
    $USE_POST_ON_START $ENCODE_URL_INPUT
    $REMOVE_TITLES $NO_BROWSE_THROUGH_SELF $NO_LINK_TO_START $MAX_REQUEST_SIZE
    @TRANSMIT_HTML_IN_PARTS_URLS
    $QUIETLY_EXIT_PROXY_SESSION
    $ALERT_ON_CSP_VIOLATION
    $OVERRIDE_SECURITY

    @SCRIPT_MIME_TYPES @OTHER_TYPES_TO_REGISTER @TYPES_TO_HANDLE
    $NON_TEXT_EXTENSIONS
    @RTL_LANG
    $PROXY_VERSION

    $RUN_METHOD
    @MONTH @WEEKDAY %UN_MONTH
    %RTL_LANG
    @BANNED_NETWORK_ADDRS
    $DB_HOSTPORT $DBH $STH_UPD_COOKIE $STH_INS_COOKIE $STH_SEL_COOKIE $STH_SEL_ALL_COOKIES
    $STH_DEL_COOKIE $STH_DEL_ALL_COOKIES $STH_UPD_SESSION $STH_INS_SESSION $STH_SEL_IP
    $STH_PURGE_SESSIONS $STH_PURGE_COOKIES
    $USER_IP_ADDRESS_TEST_H $DESTINATION_SERVER_TEST_H
    $RUNNING_ON_IIS
    @NO_PROXY
    $NO_CACHE_HEADERS
    @ALL_TYPES %MIME_TYPE_ID $SCRIPT_TYPE_REGEX $TYPES_TO_HANDLE_REGEX
    $THIS_HOST $ENV_SERVER_PORT $ENV_SCRIPT_NAME $THIS_SCRIPT_URL
    $SSL_SUPPORTED
    $RTMP_SERVER_PORT
    %ENV_UNCHANGING $HAS_INITED

    %MSG @MSG_KEYS $CUSTOM_INSERTION %IN_CUSTOM_INSERTION

    $RE_JS_WHITE_SPACE $RE_JS_LINE_TERMINATOR $RE_JS_COMMENT
    $RE_JS_IDENTIFIER_START $RE_JS_IDENTIFIER_PART $RE_JS_IDENTIFIER_NAME
    $RE_JS_PUNCTUATOR $RE_JS_DIV_PUNCTUATOR
    $RE_JS_NUMERIC_LITERAL $RE_JS_ESCAPE_SEQUENCE
    $RE_JS_STRING_LITERAL
    $RE_JS_STRING_LITERAL_START $RE_JS_STRING_REMAINDER_1 $RE_JS_STRING_REMAINDER_2
    $RE_JS_REGULAR_EXPRESSION_LITERAL
    $RE_JS_TOKEN $RE_JS_INPUT_ELEMENT_DIV $RE_JS_INPUT_ELEMENT_REG_EXP
    $RE_JS_SKIP $RE_JS_SKIP_NO_LT
    %RE_JS_SET_TRAPPED_PROPERTIES %RE_JS_SET_RESERVED_WORDS_NON_EXPRESSION
    %RE_JS_SET_ALL_PUNCTUATORS
    $JSLIB_BODY $JSLIB_BODY_GZ

    $HTTP_VERSION $HTTP_1_X
    $URL
    $STDIN $STDOUT
    $now $session_id $session_id_persistent $session_cookies
    $packed_flags $encoded_URL $doing_insert_here $env_accept
    $e_remove_cookies $e_remove_scripts $e_filter_ads $e_insert_entry_form
    $e_hide_referer
    $images_are_banned_here $scripts_are_banned_here $cookies_are_banned_here
    $scheme $authority $path $host $port $username $password
    $csp $csp_ro $csp_is_supported
    $cookie_to_server %auth
    $script_url $url_start $url_start_inframe $url_start_noframe $lang $dir
    $is_in_frame $expected_type
    $base_url $base_scheme $base_host $base_path $base_file $base_unframes
    $default_style_type $default_script_type
    $status $headers $body $charset $meta_charset $is_html
    %in_mini_start_form
    $does_write
    $swflib $AVM2_BYTECODES
    $xhr_origin
    $temp_counter
    $debug ) ;

#--------------------------------------------------------------------------
  1. user configuration
    #--------------------------------------------------------------------------
  1. For certain purposes, CGIProxy may need to create files. This is where
  2. those will go. For example, use "/home/username/cgiproxy", where "username"
  3. is replaced by your username.
  4. This directory has to be readable and writeable by the userID that CGIProxy
  5. runs as; that userID is set in the Web server configuration (if this is running
  6. as a CGI script or under mod_perl), or else it's the userID used to start
  7. the FastCGI server or the embedded server.
  8. This can be either a relative or absolute path. If it's a relative path, it
  9. will be interpreted relative to the home directory of this script file's owner.
  10. If you have root access and can run "./nph-proxy.cgi init" as root (which has
  11. advantages), then set this to an absolute path so it doesn't go under
  12. the /root directory.
  13. Note that you need to use "\\" to represent a single backslash.
  14. Leading drive letters (e.g. for Windows) are allowed.
  15. The default will use the directory "cgiproxy" under your home directory (which
  16. varies with your operating system). If it doesn't work, manually set
  17. $PROXY_DIR to an absolute path. You can name it whatever you want.
  18. Also see $RUN_AS_USER, just below. Note that many special users, probably
  19. including your Web server's user, don't have a home directory to put $PROXY_DIR
  20. under. For such a case, you need to set $PROXY_DIR to another directory somewhere
  21. that the Web server's user can read and write.
  22. Note that in Unix or Mac, using a directory on a mounted filesystem (which often
  23. includes home directories) may prevent that filesystem from being unmounted,
  24. which may bother your sysadmin. If so, try setting this to something starting
  25. with "/tmp/", like "/tmp/.your-username/".
  26. If you get "mkdir" permission errors, create the directory yourself with mkdir.
  27. You may also need to "chmod 777 directoryname" to make the directory writable
  28. by the Web server, but note that this makes it readable and writable by
  29. everybody. You might ask your webmaster if they provide a safe way for CGI
  30. scripts to read and write files in your directories. With Apache, the suEXEC
  31. feature is often used to let multiple website owners use the same server
  32. securely: each CGI or mod_perl script is run as the owner of the script file.
    $PROXY_DIR= 'cgiproxy' ;
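The rule described above (an absolute path is used as-is, a relative path is taken relative to the home directory of the script file's owner) could be sketched roughly as follows. The helpers resolve_proxy_dir() and script_owner_home() are hypothetical names for illustration, not functions from the script, and the home-directory lookup assumes a Unix-like system where getpwuid() is available.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: resolve a $PROXY_DIR-style setting against a
# given home directory. Absolute paths are returned unchanged.
sub resolve_proxy_dir {
    my ($proxy_dir, $home) = @_;
    return $proxy_dir if File::Spec->file_name_is_absolute($proxy_dir);
    return File::Spec->catdir($home, $proxy_dir);
}

# Hypothetical helper: home directory of this script file's owner.
# Unix-like systems only; returns undef if the owner has no passwd entry.
sub script_owner_home {
    my $uid = (stat $0)[4];
    return unless defined $uid;
    return (getpwuid($uid))[7];
}

# '/home/username' is a stand-in; a real caller might use script_owner_home().
print resolve_proxy_dir('cgiproxy', '/home/username'), "\n";
print resolve_proxy_dir('/tmp/.username', '/home/username'), "\n";
```

If the script owner is a special user without a home directory, the lookup has nothing useful to return, which is why an absolute $PROXY_DIR is the safer setting in that case.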
  1. If you have root access and can run "./nph-proxy.cgi init" as root, then set this
  2. to either the username or numeric user ID that the script will run as. When
  3. run as a CGI script or under mod_perl, this is usually the Web server's
  4. username, or possibly the script owner's username if using Apache with the
  5. suEXEC feature turned on.
  6. Setting this lets "./nph-proxy.cgi init" create the needed directories ($PROXY_DIR
  7. and subdirectories) and a SQLite database file (if using SQLite) with the right
  8. permissions and ownership.
  9. If you run this script as the root user in order to use port 443 with the
  10. embedded server, it's a good idea to change the user ID to something with
  11. fewer permissions. You can also do this by setting $RUN_AS_USER .
  12. In any case, this has to be set to an existing user on the server, i.e. CGIProxy
  13. doesn't create the user if it doesn't already exist.
  14. If this is not set, it will default to the owner of this script file.
  15. Also see $PROXY_DIR, just above. Note that many special users, probably including
  16. your Web server's user, don't have a home directory to put $PROXY_DIR under.
  17. For such a case, you need to set $PROXY_DIR to another directory somewhere that
  18. the Web server's user can read and write.
  19. This probably won't work on Windows, though note that you don't need root
  20. access to use port 443 on Windows.
    #$RUN_AS_USER= 'nobody' ;
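Since $RUN_AS_USER may be either a username or a numeric user ID, and must name an existing user, a lookup along these lines is one way to handle both forms. This is a sketch under assumptions, not the script's own code: lookup_user() and drop_privileges() are hypothetical names, and the example only works on Unix-like systems.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: accept a username or numeric user ID and
# return (uid, gid), or an empty list if no such user exists.
sub lookup_user {
    my ($user) = @_;
    my @pw = $user =~ /\A\d+\z/ ? getpwuid($user) : getpwnam($user);
    return unless @pw;
    return @pw[2, 3];
}

# Hypothetical helper: give up root. The group is changed first,
# because after setuid() the process may no longer be allowed to.
# Supplementary groups are not handled in this sketch.
sub drop_privileges {
    my ($uid, $gid) = @_;
    POSIX::setgid($gid) or die "setgid($gid) failed: $!\n";
    POSIX::setuid($uid) or die "setuid($uid) failed: $!\n";
}

my ($uid, $gid) = lookup_user('root');
print defined $uid ? "root is uid $uid, gid $gid\n" : "no such user\n";
```

drop_privileges() is deliberately not called here; it would only succeed in a process that starts as root.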
  1. IMPORTANT: CHANGE THIS IF USING FASTCGI OR THE EMBEDDED SERVER!
  2. If using FastCGI or the embedded server, the path in the URL will begin with a
  3. fixed alphanumeric sequence (string) to help conceal the proxy. You can set
  4. this to any alphanumeric string. The URL of your proxy will be
  5. "https://example.com/secret" (replace "secret" with your actual secret).
  6. If we didn't do this, then a censor could check if a site hosts a proxy by
  7. merely accessing "https://example.com" .
  8. Note that this is not a secret from the users, just from anyone watching
  9. network traffic. Also, it won't be kept secret if your server is insecure.
    $SECRET_PATH= 'secret' ;
  1. If you don't have root access on your server, set this so that Perl (CPAN)
  2. modules are installed under your own directory. Be sure to follow the
  3. instructions about the environment variables after you run "./nph-proxy.cgi init".
  4. If this script is not running as your user ID (such as a Web server running
  5. as its own user ID), and you're using the local::lib module, then
  6. set this to the directory where your modules are installed with local::lib .
  7. This is normally just the "perl5" directory under your home directory, unless
  8. you renamed it or configured local::lib to use a different directory.
  9. If you set this before installing modules, then CPAN (Perl) modules will be
  10. installed into this directory.
    #$LOCAL_LIB_DIR= '/home/your-username/perl5' ; # this example works for Unix or Mac
  1. If you're running CGIProxy such that the Web server that the user sees is different
  2. from the Web server CGIProxy is running on (though maybe on the same machine),
  3. the SERVER_PORT environment variable might not be set to the port that the
  4. user is connecting to, and so all the generated URLs will have the wrong
  5. port in them. In this case, you can set $USER_FACING_PORT to the port number
  6. that should be in the URLs, i.e. the port that the user connects to.
  7. For example, this would be useful when the user connects to nginx on a server where
  8. nginx then calls an internal Apache process to run this script (perhaps to take
  9. advantage of mod_perl). In such a case, the SERVER_PORT set by Apache will be
  10. the port used for internal nginx-to-Apache communication, not the port the user
  11. connects to nginx with. In this case, you would set $USER_FACING_PORT to the
  12. outward-facing port that nginx listens on.
    #$USER_FACING_PORT= 443 ;
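To make the port problem concrete, here is a small sketch of how a generated URL changes depending on which port is used. script_url() is a hypothetical helper, and the port numbers are example values for the nginx-in-front-of-Apache case described above.

```perl
use strict;
use warnings;

# Hypothetical helper: build a URL, leaving the port out when it is
# the default for the scheme.
sub script_url {
    my ($scheme, $host, $port, $path) = @_;
    my %default_port = (http => 80, https => 443);
    my $authority = $host;
    $authority .= ":$port"
        if defined $port && $port != ($default_port{$scheme} || 0);
    return "$scheme://$authority$path";
}

my $server_port      = 8080;   # assumed: what the internal Apache reports
my $user_facing_port = 443;    # assumed: what the user connects to

# Without the override, generated URLs point at the internal port.
print script_url('https', 'example.com', $server_port, '/secret'), "\n";

# With the override, they match what the user actually connects to.
my $port = $user_facing_port || $server_port;
print script_url('https', 'example.com', $port, '/secret'), "\n";
```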

#---- FastCGI configuration ---------------------

  1. FastCGI is a mechanism that can speed up CGI-like scripts. It's purely
  2. optional and requires some web server configuration as well, and if you
  3. don't use it you can ignore this section.
  1. FastCGI uses a local Internet socket to communicate between the FastCGI client
  2. (e.g. the web server software) and the FastCGI server (e.g. a CGI script that
  3. has been converted to run as a listening daemon, such as CGIProxy).
  4. Set this to a port number for this script to listen on as a FastCGI script.
  5. You'll need to set it in your HTTP server's configuration file too (e.g. in
  6. httpd.conf or nginx.conf). For details of that, see
  7. http://www.jmarshall.com/tools/cgiproxy/install.html#fastcgi
  8. This used to use a "Unix-domain socket" instead of an Internet socket, but
  9. there was trouble with the FCGI module and Unix-domain sockets, so as of
  10. CGIProxy 2.1.14 we use an Internet socket.
  11. Note that this no longer requires a ":" at the start, though that is allowed.
    $FCGI_SOCKET= 8002 ;
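Because the setting is accepted both with and without a leading ":", a normalizing step like the one below is one plausible way to treat the two forms the same. fcgi_port() is a hypothetical helper written for this illustration; how the script itself parses the value is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: accept "8002" or ":8002" and return the port
# number, dying on anything that is not a valid TCP port.
sub fcgi_port {
    my ($setting) = @_;
    (my $port = $setting) =~ s/\A://;   # the leading ":" is optional
    die "Invalid FastCGI port: '$setting'\n"
        unless $port =~ /\A\d+\z/ && $port >= 1 && $port <= 65535;
    return $port + 0;
}

print fcgi_port(8002), "\n";
print fcgi_port(':8002'), "\n";
```

The same port number then has to appear in the web server's FastCGI configuration, as the comments above point out.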
  1. FastCGI uses multiple processes to listen on its socket, where each
  2. process can handle one request at a time. This is a performance tuning
  3. parameter, so the optimal number depends on your server environment
  4. (hardware and software).
  5. If you don't understand this, the default should be fine. You can experiment
  6. with different numbers if performance is an issue.
  7. This can be overridden with the "-n" command-line parameter.
    $FCGI_NUM_PROCESSES= 100 ;
  1. As a FastCGI process gets used for many requests, it slowly takes more and
  2. more memory, due to the copy-on-write behavior of forked processes. Thus,
  3. it's cleaner if you kill a process and restart a fresh one after it handles
  4. some number of requests. This is a performance tuning parameter, so the
  5. optimal number depends on your server environment (hardware and software).
  6. If you don't understand this, the default should be fine. You can experiment
  7. with different numbers if performance is an issue.
  8. This can be overridden with the "-m" command-line parameter.
    $FCGI_MAX_REQUESTS_PER_PROCESS= 1000 ;

#---- End of FastCGI configuration --------------

  1. Much initialization of unchanging values is now in this routine. (Ignore
  2. this if you don't know what it means.)
    sub init {

#---- Embedded server configuration -------------

  1. For the embedded server, you need to a) put a certificate and private key,
  2. in PEM format, into the $PROXY_DIR directory, and b) set these two
  3. variables to the two file names. (A "certificate" contains the public
  4. key, along with identity information signed by its issuer.)
  5. You can either pay a certificate authority for a key pair, or you can
  6. generate your own "self-signed" key pair. The disadvantage of using a
  7. self-signed key pair is that your users will see a browser warning about
  8. an untrusted certificate. This is all true of any secure server.
    #$CERTIFICATE_FILE= 'plain-cert.pem' ;
    #$PRIVATE_KEY_FILE= 'plain-rsa.pem' ;
  1. It's important to use $SECRET_PATH, but you can require a username and
  2. password too. All users must login with whatever you set below, using
  3. HTTP Basic authentication. Leave these commented out to disable
  4. password protection.
  5. This is very simple right now. In the future there will likely be
  6. more authentication methods, including support for multiple users.
    #$EMB_USERNAME= 'free' ;
    #$EMB_PASSWORD= 'speech' ;
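For readers unfamiliar with HTTP Basic authentication, the check amounts to decoding the Authorization header and comparing the result with the configured pair. The sketch below shows the idea; basic_auth_ok() is a hypothetical function, not the embedded server's actual code, and a production check would probably want a constant-time comparison.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64 encode_base64);

# Hypothetical helper: return 1 if an "Authorization: Basic ..." header
# value carries the expected username and password, else 0.
sub basic_auth_ok {
    my ($auth_header, $user, $pass) = @_;
    return 0 unless defined $auth_header
        && $auth_header =~ /\ABasic\s+([A-Za-z0-9+\/=]+)\s*\z/i;
    # The decoded value is "username:password"; only the first ":" splits.
    my ($got_user, $got_pass) = split /:/, decode_base64($1), 2;
    return 0 unless defined $got_user && defined $got_pass;
    return ($got_user eq $user && $got_pass eq $pass) ? 1 : 0;
}

# What a browser would send for the example credentials above.
my $header = 'Basic ' . encode_base64('free:speech', '');
print basic_auth_ok($header, 'free', 'speech') ? "authorized\n" : "denied\n";
```

Basic authentication sends the credentials only base64-encoded, so it relies on the embedded server's TLS connection for confidentiality.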

#---- End of embedded server configuration ------

#---- Database configuration --------------------

  1. Database use is optional, and if you don't use one you can ignore this
  2. section. But if you're getting "Bad Request" errors, you can fix it
  3. by using a database; also, see the $USE_DB_FOR_COOKIES option below.
  1. Database use is optional. It's most efficient when this script is running
  2. under mod_perl or FastCGI.
  3. The easiest database to use is SQLite. While normal database engines like
  4. MySQL/MariaDB or Oracle require a constantly running process and some
  5. configuration by the system administrator, SQLite requires none of this--
  6. it reads and writes directly to database files in your own directory, as
  7. protected by the operating system permissions. Because of its ease of
  8. configuration, SQLite is the default database here.
  9. If you're using a database other than SQLite, create a database user account
  10. for this program to use, or ask your database administrator to do it. Set
  11. $DB_USER and $DB_PASS to the username and password, below. This program
  12. will try to create the required database, named $DB_NAME as set below, but
  13. if your DBA isn't willing to grant the permission to create databases to
  14. the CGIProxy user, then you or the DBA will need to create the database.
  15. This can be done with the SQL command "CREATE DATABASE cgiproxy;" (or
  16. whatever you set $DB_NAME to below). #
  17. If you are using a database of any kind, it must be purged periodically. In
  18. Unix or Mac, do this with a cron job. In Windows, use the Task Scheduler.
  19. In Unix or Mac, the command to purge the database is
  20. "/path/to/script/nph-proxy.cgi purge-db". (Replace "/path/to/script/"
  21. with the actual path to the script.) Edit your crontab with "crontab -e",
  22. and add a line like:
  23. "0 * * * * /path/to/script/nph-proxy.cgi purge-db" (without quotes)
  24. to purge the database at the top of every hour, or:
  25. "0 2 * * * /path/to/script/nph-proxy.cgi purge-db" (without quotes)
  26. to purge it every night at 2:00am.
  1. This is the name of the "database driver" for the database software you're using.
  2. Currently supported values are "SQLite", "MySQL" and "Oracle".
  3. The default of "SQLite" is the easiest to use. SQLite lets you have database
  4. functionality by directly reading and writing a database file, without requiring
  5. a full database engine like MySQL/MariaDB or Oracle to run on your server.
  6. Note that it is potentially insecure to use a database if there are other
  7. untrusted people with accounts on the same server, especially if they can read
  8. this script file and the database password below. The easiest way to securely
  9. use a database is to have your own server with no untrusted user having shell
  10. access on it. If this isn't practical, then you need to set file permissions
  11. appropriately on both this script file and any SQLite database file: set
  12. permissions (and file ownership and group ownership) on both files to be
  13. accessible by the web server's userID, but not accessible by anyone else on
  14. the same server. Note that running this on a virtual private server isn't
  15. insecure in this way-- even though a VPS is a shared machine, other people
  16. can't see your files (except the sysadmin).
  17. Set this to "" or comment it out to not use a database. Note that you will
  18. probably see "Bad Request" errors when you accumulate too many cookies; using
  19. a database solves this problem, or you can periodically clear your cookies.
    $DB_DRIVER= 'SQLite' ;
  1. If your database (other than SQLite) is running on a remote server, or on a
  2. non-default port, set this to "dbserver:port", where dbserver is the name
  3. or IP address of your database server, and port is the port it is listening
  4. on. If dbserver is empty (as in ":port"), then it defaults to localhost;
  5. if port is empty (as in "dbserver:" or just "dbserver"), then it defaults
  6. to 3306 for MySQL, or 1521 for Oracle.
    #$DB_SERVER= "localhost:3306" ;
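The defaulting rules for "dbserver:port" can be written out as a short sketch. parse_db_server() is a hypothetical helper; the defaults (localhost, 3306 for MySQL, 1521 for Oracle) are the ones stated in the comments above. It does not try to handle IPv6 literals.

```perl
use strict;
use warnings;

# Hypothetical helper: split a $DB_SERVER-style value into (host, port),
# filling in the documented defaults for whichever part is empty.
sub parse_db_server {
    my ($db_server, $driver) = @_;
    my %default_port = (MySQL => 3306, Oracle => 1521);
    my ($host, $port) = split /:/, (defined $db_server ? $db_server : ''), 2;
    $host = 'localhost'             unless defined $host && length $host;
    $port = $default_port{$driver}  unless defined $port && length $port;
    return ($host, $port);
}

for my $value (':3307', 'db.example.com', 'db.example.com:', 'db.example.com:3310') {
    my ($host, $port) = parse_db_server($value, 'MySQL');
    print "'$value' -> $host:$port\n";
}
```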
  1. CGIProxy creates (if possible) and uses its own database. If you want to name
  2. the database something else, change this value. If you need a database
  3. administrator to create the database, tell him or her this database name.
  4. This value must only contain letters, numbers, and the "_" character.
    $DB_NAME= 'cgiproxy' ;
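The restriction on $DB_NAME makes sense given that the name ends up inside a statement like "CREATE DATABASE cgiproxy;", as mentioned earlier. A check of the stated rule might look like this; valid_db_name() is a hypothetical helper, not necessarily how the script enforces it.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for names made of letters, digits,
# and "_", per the rule in the comment above.
sub valid_db_name {
    my ($name) = @_;
    return (defined($name) && $name =~ /\A[A-Za-z0-9_]+\z/) ? 1 : 0;
}

for my $name ('cgiproxy', 'cgi_proxy2', 'cgi-proxy', 'x; DROP DATABASE y') {
    printf "%-20s %s\n", $name, valid_db_name($name) ? 'ok' : 'rejected';
}
```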
  1. These are the username and password of the database account, as described above.
  2. If you're using SQLite, you don't need to set these-- access to the SQLite
  3. database files is controlled by the permissions of the filesystem.
    $DB_USER= 'proxy' ;
    $DB_PASS= '' ;
  1. If set, then use the server-side database to store cookies. This gets around
  2. the problem of too many total cookies causing "Bad Request" errors.
  3. Set this to 1 to use the database (if it's configured), or to 0 to NOT use
  4. the database.
    $USE_DB_FOR_COOKIES= 1 ;
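To see why many cookies can lead to "Bad Request" errors, it helps to estimate the size of the single Cookie header the browser sends. The sketch below does that. cookie_header_size() is a hypothetical helper, and the 8190-byte limit is an assumption based on Apache's default LimitRequestFieldSize; other servers use other limits.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the "Cookie:" request header line that
# would carry the given name/value pairs.
sub cookie_header_size {
    my (%cookies) = @_;
    my $value = join '; ', map { "$_=$cookies{$_}" } sort keys %cookies;
    return length("Cookie: $value");
}

my $limit = 8190;    # ASSUMED per-header limit; check your own server

# 100 made-up cookies of 100 bytes each, standing in for a long session.
my %cookies;
$cookies{"c$_"} = 'x' x 100 for 1 .. 100;

my $size = cookie_header_size(%cookies);
printf "Cookie header is %d bytes, %s the assumed %d-byte limit\n",
    $size, ($size > $limit ? 'over' : 'within'), $limit;
```

Storing cookies in the server-side database avoids this because the browser no longer has to send them all with every request.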

#---- End of database configuration -------------

  1. This is the default language to use for all CGIProxy messages, until the user
  2. clicks on a flag in the start form.
    $DEFAULT_LANG= 'en' ;
  1. If set, then proxy traffic will be restricted to text data only, to save
  2. bandwidth (though it can still be circumvented with uuencode, etc.).
  3. To replace images with a 1x1 transparent GIF, set $RETURN_EMPTY_GIF below.
    $TEXT_ONLY= 0 ; # set to 1 to allow only text data, 0 to allow all
  1. If set, then prevent all cookies from passing through the proxy. To allow
  2. cookies from some servers, set this to 0 and see @ALLOWED_COOKIE_SERVERS
  3. and @BANNED_COOKIE_SERVERS below. You can also prevent cookies with
  4. images by setting $NO_COOKIE_WITH_IMAGE below.
  5. Note that this only affects cookies from the target server. The proxy
  6. script sends its own cookies for other reasons too, like to support
  7. authentication. This flag does not stop these cookies from being sent.
    $REMOVE_COOKIES= 0 ;
  1. If set, then remove as much scripting as possible. If anonymity is
  2. important, this is strongly recommended! Better yet, turn off script
  3. support in your browser.
  4. On the HTTP level:
  5. . prevent transmission of script MIME types (which only works if the server
  6. marks them as such, so a malicious server could get around this, but
  7. then the browser probably wouldn't execute the script).
  8. . remove Link: headers that link to a resource of a script MIME type.
  9. Within HTML resources:
  10. . remove <script>...</script> .
  11. . remove intrinsic event attributes from tags, i.e. attributes whose names
  12. begin with "on".
  13. . remove <style>...</style> where "type" attribute is a script MIME type.
  14. . remove various HTML tags that appear to link to a script MIME type.
  15. . remove script macros (aka Netscape-specific "JavaScript entities"),
  16. i.e. any attributes containing the string "&{" .
  17. . remove "JavaScript conditional comments".
  18. . remove MSIE-specific "dynamic properties".
  19. To allow scripts from some sites but not from others, set this to 0 and
  20. see @ALLOWED_SCRIPT_SERVERS and @BANNED_SCRIPT_SERVERS below.
  21. See @SCRIPT_MIME_TYPES below for a list of which MIME types are filtered out.
  22. I do NOT know for certain that this removes all script content! It removes
  23. all that I know of, but I don't have a definitive list of places scripts
  24. can exist. If you do, please send it to me. EVEN RUNNING A SINGLE
  25. JAVASCRIPT STATEMENT CAN COMPROMISE YOUR ANONYMITY! Just so you know.
  26. Richard Smith has a good test site for anonymizing proxies, at
  27. http://users.rcn.com/rms2000/anon/test.htm
  28. Note that turning this on removes most popup ads! :)
    $REMOVE_SCRIPTS= 0 ;
  1. If set, then filter out images that match one of @BANNED_IMAGE_URL_PATTERNS,
  2. below. Also removes cookies attached to images, as if $NO_COOKIE_WITH_IMAGE
  3. is set.
  4. To remove most popup advertisements, also set $REMOVE_SCRIPTS=1 above.
    $FILTER_ADS= 0 ;
  1. If set, then don't send a Referer: [sic] header with each request
  2. (i.e. something that tells the server which page you're coming from
  3. that linked to it). This is a minor privacy issue, but a few sites
  4. won't send you pages or images if the Referer: is not what they're
  5. expecting. If a page is loading without images or a link seems to be
  6. refused, then try turning this off, and a correct Referer: header will
  7. be sent.
  8. This is only a problem in a VERY small percentage of sites, so few that
  9. I'm kinda hesitant to put this in the entry form. Other arrangements
  10. have their own problems, though.
    $HIDE_REFERER= 0 ;
  1. If set, insert a compact version of the URL entry form at the top of each
  2. page. This will also display the URL currently being viewed.
  3. When viewing a page with frames, then a new top frame is created and the
  4. insertion goes there.
  5. If you want to customize the appearance of the form, modify the routine
  6. mini_start_form() near the end of the script.
  7. If you want to insert something other than this form, see $INSERT_HTML and
  8. $INSERT_FILE below.
  9. Users should realize that options changed via the form only take effect when
  10. the form is submitted by entering a new URL or pressing the "Go" button.
  11. Selecting an option, then following a link on the page, will not cause
  12. the option to take effect.
  13. Users should also realize that anything inserted into a page may throw
  14. off any precise layout. The insertion will also be subject to
  15. background colors and images, and any other page-wide settings.
    $INSERT_ENTRY_FORM= 1 ;
  1. If set, then allow the user to control $REMOVE_COOKIES, $REMOVE_SCRIPTS,
  2. $FILTER_ADS, $HIDE_REFERER, and $INSERT_ENTRY_FORM. Note that they
  3. can't fine-tune any related options, such as the various @ALLOWED... and
  4. @BANNED... lists.
    $ALLOW_USER_CONFIG= 1 ;
  1. If you want to encode the URLs of visited pages so that they don't show
  2. up within the full URL in your browser bar, then use proxy_encode() and
  3. proxy_decode(). These are Perl routines that transform the way the
  4. destination URL is included in the full URL. You can either use
  5. some combination of the example encodings below, or you can program your
  6. own routines. The encoded form of URLs should only contain characters
  7. that are legal in PATH_INFO. This varies by server, but using only
  8. printable chars and no "?" or "#" works on most servers. Don't let
  9. PATH_INFO contain the strings "./", "/.", "../", or "/..", or else it
  10. may get compressed like a pathname somewhere. Try not to make the
  11. resulting string too long, either.
  12. Of course, proxy_decode() must exactly undo whatever proxy_encode() does.
  13. Make proxy_encode() as fast as possible-- it's a bottleneck for the whole
  14. program. The speed of proxy_decode() is not as important.
  15. If you're not a Perl programmer, you can use the example encodings that are
  16. commented out, i.e. the lines beginning with "#". To use them, merely
  17. uncomment them, i.e. remove the "#" at the start of the line. If you
  18. uncomment a line in proxy_encode(), you MUST uncomment the corresponding
  19. line in proxy_decode() (note that "corresponding lines" in
  20. proxy_decode() are in reverse order of those in proxy_encode()). You
  21. can use one, two, or all three encodings at the same time, as long as
  22. the correct lines are uncommented.
  23. Starting in version 2.1beta9, don't call these functions directly. Rather,
  24. call wrap_proxy_encode() and wrap_proxy_decode() instead, which handle
  25. certain details that you shouldn't have to worry about in these functions.
  26. IMPORTANT: If you modify these routines, and if $PROXIFY_SCRIPTS is set
  27. below (on by default), then you MUST modify $ENCODE_DECODE_BLOCK_IN_JS
  28. below!! (You'll need to write corresponding routines in JavaScript to do
  29. the same as these routines in Perl, used when proxifying JavaScript.)
  30. Because of the simplified absolute URL resolution in full_url(), there may
  31. be ".." segments in the default encoding here, notably in the first path
  32. segment. Normally, that's just an HTML mistake, but please tell me if
  33. you see any privacy exploit with it.
  34. Note that a few sites have embedded applications (like applets or Shockwave)
  35. that expect to access URLs relative to the page's URL. This means they
  36. may not work if the encoded target URL can't be treated like a base URL,
  37. e.g. that it can't be appended with something like "../data/foo.data"
  38. to get that expected data file. In such cases, the default encoding below
  39. should let these sites work fine, as should any other encoding that can
  40. support URLs relative to it.
sub proxy_encode {
my($URL)= @_ ;
$URL=~ s#^([\w+.-]+)://#$1/# ; # http://xxx -> http/xxx
# $URL=~ s/(.)/ sprintf('%02x',ord($1)) /ge ; # each char -> 2-hex
# $URL=~ tr/a-zA-Z/n-za-mN-ZA-M/ ; # rot-13
return $URL ;
}

sub proxy_decode {
my($enc_URL)= @_ ;

    # $enc_URL=~ tr/a-zA-Z/n-za-mN-ZA-M/ ; # rot-13
    # $enc_URL=~ s/([\da-fA-F]{2})/ sprintf("%c",hex($1)) /ge ;
    $enc_URL=~ s#^([\w+.-]+)/#$1://# ; # http/xxx -> http://xxx
    return $enc_URL ;
    }
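To check that a custom encoding really round-trips, a small stand-alone test along these lines may help. It is a sketch only: demo_encode() and demo_decode() are made-up names that copy the routines above with both optional encodings switched on, and the sample URL is arbitrary.

```perl
use strict ;
use warnings ;

# Illustrative copies of the routines above with the optional hex and rot-13
#   steps enabled. The names demo_encode()/demo_decode() are hypothetical.
sub demo_encode {
    my($URL)= @_ ;
    $URL=~ s#^([\w+.-]+)://#$1/# ;                  # http://xxx -> http/xxx
    $URL=~ s/(.)/ sprintf('%02x',ord($1)) /ge ;     # each char -> 2-hex
    $URL=~ tr/a-zA-Z/n-za-mN-ZA-M/ ;                # rot-13
    return $URL ;
}

# The decode steps run in the reverse order of the encode steps.
sub demo_decode {
    my($enc_URL)= @_ ;
    $enc_URL=~ tr/a-zA-Z/n-za-mN-ZA-M/ ;                        # rot-13
    $enc_URL=~ s/([\da-fA-F]{2})/ sprintf("%c",hex($1)) /ge ;   # 2-hex -> char
    $enc_URL=~ s#^([\w+.-]+)/#$1://# ;              # http/xxx -> http://xxx
    return $enc_URL ;
}

my $url= 'http://www.example.com/dir/page.html' ;
my $round_trip= demo_decode(demo_encode($url)) ;
print $round_trip eq $url  ? "round trip OK\n"  : "round trip FAILED\n" ;
```

Note that the hex step doubles the length of the URL, which works against the advice above to keep the encoded string short.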
  1. Encode cookies before they're sent back to the user.
  2. The return value must only contain characters that are legal in cookie
  3. names and values, i.e. only printable characters, and no ";", ",", "=",
  4. or white space.
  5. cookie_encode() is called twice for each cookie: once to encode the cookie
  6. name, and once to encode the cookie value. The two are then joined with
  7. "=" and sent to the user.
  8. cookie_decode() must exactly undo whatever cookie_encode() does.
  9. Also, cookie_encode() must always encode a given input string into the
  10. same output string. This is because browsers need the cookie name to
  11. identify and manage a cookie, so the name must be consistent.
  12. This is not a bottleneck like proxy_encode() is, so speed is not critical.
  13. IMPORTANT: If you modify these routines, and if $PROXIFY_SCRIPTS is set
  14. below (on by default), then you MUST modify $ENCODE_DECODE_BLOCK_IN_JS
  15. below!! (You'll need to write corresponding routines in JavaScript to do
  16. the same as these routines in Perl, used when proxifying JavaScript.)
sub cookie_encode {
my($cookie)= @_ ;
    # $cookie=~ s/(.)/ sprintf('%02x',ord($1)) /ge ; # each char -> 2-hex
    # $cookie=~ tr/a-zA-Z/n-za-mN-ZA-M/ ; # rot-13
    $cookie=~ s/(\W)/ '%' . sprintf('%02x',ord($1)) /ge ; # simple URL-encoding
    return $cookie ;
    }
sub cookie_decode {
my($enc_cookie)= @_ ;
$enc_cookie=~ s/%([\da-fA-F]{2})/ pack('C', hex($1)) /ge ; # URL-decode
    # $enc_cookie=~ tr/a-zA-Z/n-za-mN-ZA-M/ ; # rot-13
    # $enc_cookie=~ s/([\da-fA-F]{2})/ sprintf("%c",hex($1)) /ge ;
    return $enc_cookie ;
    }
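As a quick illustration of what the default cookie encoding produces, here is a stand-alone sketch. The routines are copies of the defaults above under the made-up names demo_cookie_encode() and demo_cookie_decode(), and the sample value is arbitrary.

```perl
use strict ;
use warnings ;

# Stand-alone copies of the default routines above, renamed for this sketch.
sub demo_cookie_encode {
    my($cookie)= @_ ;
    $cookie=~ s/(\W)/ '%' . sprintf('%02x',ord($1)) /ge ;       # simple URL-encoding
    return $cookie ;
}

sub demo_cookie_decode {
    my($enc_cookie)= @_ ;
    $enc_cookie=~ s/%([\da-fA-F]{2})/ pack('C', hex($1)) /ge ;  # URL-decode
    return $enc_cookie ;
}

my $value= 'a=b; c,d' ;
my $enc= demo_cookie_encode($value) ;
print "$enc\n" ;                                    # a%3db%3b%20c%2cd
print demo_cookie_decode($enc) eq $value  ? "round trip OK\n"  : "round trip FAILED\n" ;
```

The encoded form contains no ";", ",", "=", or white space, and the same input always gives the same output, which is what the requirements above ask for.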
  1. If $PROXIFY_SCRIPTS is true, and if you modify the routines above that
  2. encode cookies and URLs, then you need to modify $ENCODE_DECODE_BLOCK_IN_JS
  3. here. Explanation: When proxifying JavaScript, a library of JavaScript
  4. functions is used. In that library are a few JavaScript routines that do
  5. the same as their Perl counterparts in this script. Four of those routines
  6. are proxy_encode(), proxy_decode(), cookie_encode(), and cookie_decode().
  7. Thus, unfortunately, when you write your own versions of those Perl routines
  8. (or modify what's already there), you also need to write (or modify) these
  9. corresponding JavaScript routines to do the same thing. Put the routines in
  10. this long variable $ENCODE_DECODE_BLOCK_IN_JS, and it will be included in
  11. the JavaScript library when needed. Prefix the function names with
  12. "_proxy_jslib_", as below.
  13. The commented examples in the JavaScript routines below correspond exactly to
  14. the commented examples in the Perl routines above. Thus, if you modify the
  15. Perl routines by merely uncommenting the examples, you can do the same in
  16. these JavaScript routines. (JavaScript comments begin with "//".)
  17. [If you don't know Perl: Note that everything up until the line "EOB" is one
  18. long string value, called a "here document". $ENCODE_DECODE_BLOCK_IN_JS is
  19. set to the whole thing.]

$ENCODE_DECODE_BLOCK_IN_JS= <<'EOB' ;

function _proxy_jslib_proxy_encode(URL) {
URL= URL.replace(/^([\w\+\.\-]+)\:\/\//, '$1/') ;
// URL= URL.replace(/(.)/g, function (s,p1) { return p1.charCodeAt(0).toString(16) } ) ;
// URL= URL.replace(/([a-mA-M])|[n-zN-Z]/g, function (s,p1) { return String.fromCharCode(s.charCodeAt(0)+(p1?13:-13)) }) ;

return URL ;
}

function _proxy_jslib_proxy_decode(enc_URL) {
// enc_URL= enc_URL.replace(/([a-mA-M])|[n-zN-Z]/g, function (s,p1) { return String.fromCharCode(s.charCodeAt(0)+(p1?13:-13)) }) ;
// enc_URL= enc_URL.replace(/([\da-fA-F]{2})/g, function (s,p1) { return String.fromCharCode(eval('0x'+p1)) } ) ;
enc_URL= enc_URL.replace(/^([\w\+\.\-]+)\//, '$1://') ;
return enc_URL ;
}

function _proxy_jslib_cookie_encode(cookie) {
// cookie= cookie.replace(/(.)/g, function (s,p1) { return p1.charCodeAt(0).toString(16) } ) ;
// cookie= cookie.replace(/([a-mA-M])|[n-zN-Z]/g, function (s,p1) { return String.fromCharCode(s.charCodeAt(0)+(p1?13:-13)) }) ;
cookie= cookie.replace(/(\W)/g, function (s,p1) { return '%'+p1.charCodeAt(0).toString(16) } ) ;
return cookie ;
}

function _proxy_jslib_cookie_decode(enc_cookie) {
enc_cookie= enc_cookie.replace(/%([\da-fA-F]{2})/g, function (s,p1) { return String.fromCharCode(eval('0x'+p1)) } ) ;
// enc_cookie= enc_cookie.replace(/([a-mA-M])|[n-zN-Z]/g, function (s,p1) { return String.fromCharCode(s.charCodeAt(0)+(p1?13:-13)) }) ;
// enc_cookie= enc_cookie.replace(/([\da-fA-F]{2})/g, function (s,p1) { return String.fromCharCode(eval('0x'+p1)) } ) ;
return enc_cookie ;
}

EOB

  1. Use @ALLOWED_SERVERS and @BANNED_SERVERS to restrict which servers a user
  2. can visit through this proxy. Any URL at a host matching a pattern in
  3. @BANNED_SERVERS will be forbidden. In addition, if @ALLOWED_SERVERS is
  4. not empty, then access is allowed only to servers that match a pattern
  5. in it. In other words, @BANNED_SERVERS means "ban these servers", and
  6. @ALLOWED_SERVERS (if not empty) means "allow only these servers". If a
  7. server matches both lists, it is banned.
  8. These are each a list of Perl 5 regular expressions (aka patterns or
  9. regexes), not literal host names. To turn a hostname into a pattern,
  10. replace every "." with "\.", add "^" to the beginning, and add "$" to the
  11. end. For example, 'www.example.com' becomes '^www\.example\.com$'. To
  12. match every host ending in something, leave out the "^". For example,
  13. '\.example\.com$' matches every host ending in ".example.com". For more
  14. details about Perl regular expressions, see the Perl documentation. (They
  15. may seem cryptic at first, but they're very powerful once you know how to
  16. use them.)
  17. Note: Use single quotes around each pattern, not double quotes, unless you
  18. understand the difference between the two in Perl. Otherwise, characters
  19. like "$" and "\" may not be handled the way you expect.
    @ALLOWED_SERVERS= () ;
    @BANNED_SERVERS= () ;
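The hostname-to-pattern recipe and the precedence rule can be sketched as follows. None of these helpers exist in CGIProxy; host_to_pattern(), suffix_to_pattern(), and host_is_allowed() are hypothetical names, and the matching here is case-sensitive, which is an assumption of this sketch.

```perl
use strict ;
use warnings ;

# Hypothetical helpers, not part of CGIProxy.
# Exact-host pattern: 'www.example.com' -> '^www\.example\.com$'
sub host_to_pattern   { my($host)= @_ ;    return '^' . quotemeta($host) . '$' }
# Suffix pattern: '.example.com' -> '\.example\.com$'
sub suffix_to_pattern { my($suffix)= @_ ;  return quotemeta($suffix) . '$' }

# Precedence as described above: a banned match always wins, and a non-empty
#   allowed list means "only these servers".
sub host_is_allowed {
    my($host, $allowed, $banned)= @_ ;
    return 0 if grep { $host=~ /$_/ } @$banned ;
    return 1 unless @$allowed ;
    my $matched= grep { $host=~ /$_/ } @$allowed ;
    return $matched ? 1 : 0 ;
}

my @allowed= () ;
my @banned= (suffix_to_pattern('.example.com')) ;
print host_to_pattern('www.example.com'), "\n" ;                        # ^www\.example\.com$
print host_is_allowed('ads.example.com', \@allowed, \@banned), "\n" ;   # 0
print host_is_allowed('www.example.org', \@allowed, \@banned), "\n" ;   # 1
```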
  1. If @BANNED_NETWORKS is set, then forbid access to these hosts or networks.
  2. This is done by IP address, not name, so it provides more certain security
  3. than @BANNED_SERVERS above.
  4. Specify each element as a decimal IP address-- all four integers for a host,
  5. or one to three integers for a network. For example, '127.0.0.1' bans
  6. access to the local host, and '192.168' bans access to all IP addresses
  7. in the 192.168 network. Sorry, no banning yet for subnets other than
  8. 8, 16, or 24 bits.
  9. IF YOU'RE RUNNING THIS ON OR INSIDE A FIREWALL, THIS SETTING IS STRONGLY
  10. RECOMMENDED!! In particular, you should ban access to other machines
  11. inside the firewall that the firewall machine itself may have access to.
  12. Otherwise, external users will be able to access any internal hosts that
  13. the firewall can access. Even if that's what you intend, you should ban
  14. access to any hosts that you don't explicitly want to expose to outside
  15. users.
  16. In addition to the recommended defaults below, add all IP addresses of your
  17. server machine if you want to protect it like this.
  18. If you're using this with another proxy on the same machine (like a SOCKS
  19. proxy), you'll need to remove the '127' item below. But see the comments
  20. above $SOCKS_PROXY, below, for a warning.
  21. After you set this, YOU SHOULD TEST to verify that the proxy can't access
  22. the IP addresses you're banning!
  23. NOTE: According to RFC 1918, network address ranges reserved for private
  24. networks are 10.x.x.x, 192.168.x.x, and 172.16.x.x-172.31.x.x, i.e. with
  25. respective subnet masks of 8, 16, and 12 bits. Since we can't currently
  26. do a 12-bit mask, we'll exclude the entire 172 network here. If this
  27. causes a problem, let me know and I'll add subnet masks down to 1-bit
  28. resolution.
  29. Also included are 169.254.x.x (per RFC 3927) and 244.0.0.x (used for
  30. routing), as recommended by Waldo Jaquith.
  31. On some systems, 127.x.x.x all point to localhost, so disallow all of "127".
  32. This feature is simple now but may be more complete in future releases.
  33. How would you like this to be extended? What would be useful to you?
    @BANNED_NETWORKS= ('127', '192.168', '172', '10', '169.254', '244.0.0') ;
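One way to picture the whole-octet matching described above is the sketch below. It is not CGIProxy's own check; ip_is_banned() is a hypothetical name, and the list used is only a subset of the defaults.

```perl
use strict ;
use warnings ;

# Hypothetical check, not CGIProxy's own code: an address is banned if it
#   equals a list element, or starts with that element followed by ".".
sub ip_is_banned {
    my($ip, @banned_networks)= @_ ;
    foreach my $net (@banned_networks) {
        return 1 if $ip eq $net or index($ip, "$net.") == 0 ;
    }
    return 0 ;
}

my @banned= ('127', '192.168', '172', '10', '169.254') ;
print ip_is_banned('192.168.1.20', @banned), "\n" ;   # 1
print ip_is_banned('101.2.3.4', @banned), "\n" ;      # 0, since '10' must match a whole octet
```

Appending the "." before comparing is what keeps '10' from also banning 101.x.x.x or 100.x.x.x.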
  1. Settings to fine-tune cookie filtering, if cookies are not banned altogether
  2. (by user checkbox or $REMOVE_COOKIES above).
  3. Use @ALLOWED_COOKIE_SERVERS and @BANNED_COOKIE_SERVERS to restrict which
  4. servers can send cookies through this proxy. They work like
  5. @ALLOWED_SERVERS and @BANNED_SERVERS above, both in how their precedence
  6. works, and that they're lists of Perl 5 regular expressions. See the
  7. comments there for details.
  1. If non-empty, only allow cookies from servers matching one of these patterns.
  2. Comment this out to allow all cookies (subject to @BANNED_COOKIE_SERVERS).
    #@ALLOWED_COOKIE_SERVERS= ('\bslashdot\.org$') ;
  1. Reject cookies from servers matching these patterns.
    @BANNED_COOKIE_SERVERS= (
    '\.doubleclick\.net$',
    '\.preferences\.com$',
    '\.imgis\.com$',
    '\.adforce\.com$',
    '\.focalink\.com$',
    '\.flycast\.com$',
    '\.avenuea\.com$',
    '\.linkexchange\.com$',
    '\.pathfinder\.com$',
    '\.burstnet\.com$',
    '\btripod\.com$',
    '\bgeocities\.yahoo\.com$',
    '\.mediaplex\.com$',
    ) ;
  1. Set this to reject cookies returned with images. This actually prevents
  2. cookies returned with any non-text resource.
  3. This helps prevent tracking by ad networks, but there are also some
  4. legitimate uses of attaching cookies to images, such as captcha, so
  5. by default this is off.
    $NO_COOKIE_WITH_IMAGE= 0 ;
  1. Settings to fine-tune script filtering, if scripts are not banned altogether
  2. (by user checkbox or $REMOVE_SCRIPTS above).
  3. Use @ALLOWED_SCRIPT_SERVERS and @BANNED_SCRIPT_SERVERS to restrict which
  4. servers you'll allow scripts from. They work like @ALLOWED_SERVERS and
  5. @BANNED_SERVERS above, both in how their precedence works, and that
  6. they're lists of Perl 5 regular expressions. See the comments there for
  7. details.
    @ALLOWED_SCRIPT_SERVERS= () ;
    @BANNED_SCRIPT_SERVERS= () ;
  1. Various options to help filter ads and stop cookie-based privacy invasion.
  2. These are only effective if $FILTER_ADS is set above.
  3. @BANNED_IMAGE_URL_PATTERNS uses Perl patterns. If an image's URL
  4. matches one of the patterns, it will not be downloaded (typically for
  5. ad-filtering). For more information on Perl regular expressions, see
  6. the Perl documentation.
  7. Note that most popup ads will be removed if scripts are removed (see
  8. $REMOVE_SCRIPTS above).
  9. If ad-filtering is your primary motive, consider using one of the many
  10. proxies that specialize in that. The classic is from JunkBusters, at
  11. http://www.junkbusters.com .
  1. Reject images whose URL matches any of these patterns. This is just a
  2. sample list; add more depending on which sites you visit.
    @BANNED_IMAGE_URL_PATTERNS= (
    'ad\.doubleclick\.net/ad/',
    '\b[a-z](\d+)?\.doubleclick\.net(:\d*)?/',
    '\.imgis\.com\b',
    '\.adforce\.com\b',
    '\.avenuea\.com\b',
    '\.go\.com(:\d*)?/ad/',
    '\.eimg\.com\b',
    '\bexcite\.netscape\.com(:\d*)?/.*/promo/',
    '/excitenetscapepromos/',
    '\.yimg\.com(:\d*)?.*/promo/',
    '\bus\.yimg\.com/[a-z]/(\w\w)/\1',
    '\bus\.yimg\.com/[a-z]/\d-/',
    '\bpromotions\.yahoo\.com(:\d*)?/promotions/',
    '\bcnn\.com(:\d*)?/ads/',
    'ads\.msn\.com\b',
    '\blinkexchange\.com\b',
    '\badknowledge\.com\b',
    '/SmartBanner/',
    '\bdeja\.com/ads/',
    '\bimage\.pathfinder\.com/sponsors',
    'ads\.tripod\.com',
    'ar\.atwola\.com/image/',
    '\brealcities\.com/ads/',
    '\bnytimes\.com/ad[sx]/',
    '\busatoday\.com/sponsors/',
    '\busatoday\.com/RealMedia/ads/',
    '\bmsads\.net/ads/',
    '\bmediaplex\.com/ads/',
    '\batdmt\.com/[a-z]/',
    '\bview\.atdmt\.com/',
    '\bADSAdClient31\.dll\b',
    ) ;
  1. If set, replace banned images with 1x1 transparent GIF. This also replaces
  2. all images with the same if $TEXT_ONLY is set.
  3. Note that setting this makes the response a little slower, since the browser
  4. must still retrieve the empty GIF.
    $RETURN_EMPTY_GIF= 0 ;
  1. To use an external program to decide whether or not a user at a given IP
  2. address may use this proxy (as opposed to using server configuration), set
  3. $USER_IP_ADDRESS_TEST to either the name of a command-line program that
  4. performs this test, or a queryable URL that performs this test (e.g. a CGI
  5. script).
  6. For a command-line program: The program should take a single argument, the
  7. IP address of the user. The output of the program is evaluated as a
  8. number, and if the number is non-zero then the IP address of the user is
  9. allowed; thus, the output is typically either "1" or "0". Note that
  10. depending on $ENV{PATH}, you may need to enter the path here explicitly.
  11. For a queryable URL: Specify the start of the URL here (must begin with
  12. "http://"), and the user's IP address will be appended. For example, the
  13. value here may contain a "?", thus putting the IP address in the
  14. QUERY_STRING; it could also be in PATH_INFO. The response body from the
  15. URL should be a number like for a command line program, above.
    $USER_IP_ADDRESS_TEST= '' ;
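A command-line test program of the kind described above could look something like this. It is a minimal sketch: the prefixes are documentation addresses chosen for illustration, and ip_allowed() is a made-up name.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Hypothetical allow-list of address prefixes; replace with your own.
my @allowed_prefixes= ('203.0.113.', '198.51.100.') ;

sub ip_allowed {
    my($ip)= @_ ;
    my $matched= grep { index($ip, $_) == 0 } @allowed_prefixes ;
    return $matched ? 1 : 0 ;
}

# Print "1" (allowed) or "0" (refused) for the address given as the only argument.
print ip_allowed($ARGV[0]), "\n" if @ARGV ;
```

If this were saved as an executable file, $USER_IP_ADDRESS_TEST would be set to its full path.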
  1. To use an external program to decide whether or not a destination server is
  2. allowed (as opposed to using @ALLOWED_SERVERS and @BANNED_SERVERS above),
  3. set $DESTINATION_SERVER_TEST to either the name of a command-line program
  4. that performs this test, or a queryable URL that performs this test (e.g. a
  5. CGI script).
  6. For a command-line program: The program should take a single argument, the
  7. destination server's name or IP address (depending on how the user enters
  8. it). The output of the program is evaluated as a number, and if the number
  9. is non-zero then the destination server is allowed; thus, the output is
  10. typically either "1" or "0". Note that depending on $ENV{PATH}, you may
  11. need to enter the path here explicitly.
  12. For a queryable URL: Specify the start of the URL here (must begin with
  13. "http://"), and the destination server's name or IP address will be
  14. appended. For example, the value here may contain a "?", thus putting the
  15. name or address in the QUERY_STRING; it could also be in PATH_INFO. The
  16. response body from the URL should be a number like for a command line
  17. program, above.
    $DESTINATION_SERVER_TEST= '' ;
  1. If either $INSERT_HTML or $INSERT_FILE is set, then that HTML text or the
  2. contents of that named file (respectively) will be inserted into any HTML
  3. page retrieved through this proxy. $INSERT_HTML takes precedence over
  4. $INSERT_FILE. $INSERT_FILE is assumed to have contents in UTF-8.
  5. When viewing a page with frames, a new top frame is created and the
  6. insertions go there.
  7. NOTE: Any HTML you insert should not have relative URLs in it! The problem
  8. is that there is no appropriate base URL to resolve them with. So only use
  9. absolute URLs in your insertion. (If you use relative URLs anyway, then
  10. a) if $ANONYMIZE_INSERTION is set, they'll be resolved relative to this
  11. script's URL, which isn't great, or b) if $ANONYMIZE_INSERTION==0,
  12. they'll be unchanged and the browser will simply resolve them relative
  13. to the current page, which is usually worse.)
  14. The frame handling means that it's fairly easy for a surfer to bypass this
  15. insertion, by pretending in effect to be in a frame. There's not much we
  16. can do about that, since a page is retrieved the same way regardless of
  17. whether it's in a frame. This script uses a parameter in the URL to
  18. communicate to itself between calls, but the user can merely change that
  19. URL to make the script think it's retrieving a page for a frame. Also,
  20. many browsers let the user expand a frame's contents into a full window.
  21. [The warning in earlier versions about setting $INSERT_HTML to '' when using
  22. mod_perl and $INSERT_FILE no longer applies. It's all handled elsewhere.]
  23. As with $INSERT_ENTRY_FORM, note that any insertion may throw off any
  24. precise layout, and the insertion is subject to background colors and
  25. other page-wide settings.

#$INSERT_HTML= "<h1>This is an inserted header</h1><hr>" ;
#$INSERT_FILE= 'insert_file_name' ;

  1. If your insertion has links that you don't want anonymized along with the rest
  2. of the downloaded HTML, then set this to 0. Otherwise leave it at 1.
    $ANONYMIZE_INSERTION= 1 ;
  1. If there's both a URL entry form and an insertion via $INSERT_HTML or
  2. $INSERT_FILE on the same page, the entry form normally goes at the top.
  3. Set this to put it after the other insertion.
    $FORM_AFTER_INSERTION= 0 ;
  1. If the insertion is put in a top frame, then this is how many pixels high
  2. the frame is. If the default of 80 or 50 pixels is too big or too small
  3. for your insertion, change this. You can use percentage of screen height
  4. if you prefer, e.g. "20%". (Unfortunately, you can't just tell the
  5. browser to "make it as high as it needs to be", but at least the frame
  6. will be resizable by the user.)
  7. This affects insertions by $INSERT_ENTRY_FORM, $INSERT_HTML, and $INSERT_FILE.
  8. The default here usually works for the inserted entry form, which varies in
  9. size depending on $ALLOW_USER_CONFIG. It also varies by browser.
    $INSERTION_FRAME_HEIGHT= $ALLOW_USER_CONFIG ? 80 : 50 ;
  1. NOTE THAT YOU SHOULD BE RUNNING CGIPROXY ON A SECURE SERVER!
  2. Note also that the meaning of '' has changed-- now, all ports except 80
  3. are assumed to be using SSL.
  4. Set this to 1 if the script is running on an SSL server, i.e. it is
  5. accessed through a URL starting with "https:"; set this to 0 if it's not
  6. running on an SSL server. This is needed to know how to route URLs back
  7. through the proxy. Regrettably, standard CGI does not yet provide a way
  8. for scripts to determine this without help.
  9. If this variable is set to '' or left undefined, then the program will
  10. guess: SSL is assumed if SERVER_PORT is not 80. This fails when using
  11. an insecure server on a port other than 80, or (less commonly) when an SSL server
  12. uses port 80, but usually it works. Besides being a good default, it lets
  13. you install the script where both a secure server and a non-secure server
  14. will serve it, and it will work correctly through either server.
  15. This has nothing to do with retrieving pages that are on SSL servers.
    $RUNNING_ON_SSL_SERVER= '' ;
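The guessing rule described above amounts to something like the following sketch. guess_running_on_ssl() is a hypothetical name; in a real CGI environment the port would come from $ENV{SERVER_PORT}.

```perl
use strict ;
use warnings ;

# Hypothetical helper, not CGIProxy's own code: an explicit 0 or 1 is used
#   as-is, while '' or undef falls back to "SSL unless the port is 80".
sub guess_running_on_ssl {
    my($setting, $server_port)= @_ ;
    return $setting ? 1 : 0  if defined $setting && $setting ne '' ;
    return $server_port != 80 ? 1 : 0 ;
}

print guess_running_on_ssl('', 443), "\n" ;    # 1
print guess_running_on_ssl('', 80), "\n" ;     # 0
print guess_running_on_ssl(0, 8080), "\n" ;    # 0, explicit setting wins
```

The last call shows why an insecure server on a port other than 80 needs the variable set explicitly to 0.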
  1. If your server doesn't support NPH scripts, then set this variable to true
  2. and try running the script as a normal non-NPH script. HOWEVER, this
  3. won't work as well as running it as NPH; there may be bugs, maybe some
  4. privacy holes, and results may not be consistent. It's a hack.
  5. Try to install the script as NPH before you use this option, because
  6. this may not work. NPH is supported on almost all servers, and it's
  7. usually very easy to install a script as NPH (on Apache, for example,
  8. you just need to name the script something starting with "nph-").
  9. One example of a problem is that Location: headers may get messed up,
  10. because they mean different things in an NPH and a non-NPH script.
  11. You have been warned.
  12. For this to work, your server MUST support the "Status:" CGI response
  13. header.
    $NOT_RUNNING_AS_NPH= 0 ;
  1. Set HTTP and SSL proxies if needed. Also see $USE_PASSIVE_FTP_MODE below.
  2. The format of the first two variables is "host:port", with the port being
  3. optional. The format of $NO_PROXY is a comma-separated list of hostnames
  4. or domains: any request for a hostname that ends in one of the strings in
  5. $NO_PROXY will not use the HTTP or SSL proxy; e.g. use ".mycompany.com" to
  6. avoid using the proxies to access any host in the mycompany.com domain.
  7. The environment variables in the examples below are appropriate defaults,
  8. if they are available. Note that earlier versions of this script used
  9. the environment variables directly, instead of the $HTTP_PROXY and
  10. $NO_PROXY variables we use now.
  11. Sometimes you can use the same proxy (like Squid) for both SSL and normal
  12. HTTP, in which case $HTTP_PROXY and $SSL_PROXY will be the same.
  13. $NO_PROXY applies to both SSL and normal HTTP proxying, which is usually
  14. appropriate. If there's demand to differentiate those, it wouldn't be
  15. hard to make a separate $SSL_NO_PROXY option.
    #$HTTP_PROXY= $ENV{'http_proxy'} ;
    #$SSL_PROXY= 'firewall.example.com:3128' ;
    #$NO_PROXY= $ENV{'no_proxy'} ;
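The "ends in one of the strings" rule for $NO_PROXY can be sketched like this. host_bypasses_proxy() is a hypothetical name, and the case-insensitive comparison and the tolerance for spaces around commas are assumptions of this example rather than documented behavior.

```perl
use strict ;
use warnings ;

# Hypothetical helper, not CGIProxy's own code: true if $host ends in any
#   entry of the comma-separated $no_proxy list.
sub host_bypasses_proxy {
    my($host, $no_proxy)= @_ ;
    return 0 unless defined $no_proxy ;
    foreach my $suffix (split(/\s*,\s*/, $no_proxy)) {
        next unless length $suffix ;
        return 1 if $host=~ /\Q$suffix\E\z/i ;
    }
    return 0 ;
}

my $no_proxy= '.mycompany.com,localhost' ;
print host_bypasses_proxy('intranet.mycompany.com', $no_proxy), "\n" ;   # 1
print host_bypasses_proxy('www.example.com', $no_proxy), "\n" ;          # 0
```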
  1. If your HTTP and SSL proxies require authentication, this script supports
  2. that in a limited way: you can have a single username/password pair per
  3. proxy to authenticate with, regardless of realm. In other words, multiple
  4. realms aren't supported for proxy authentication (though they are for
  5. normal server authentication, elsewhere).
  6. Set $PROXY_AUTH and $SSL_PROXY_AUTH either in the form of "username:password",
  7. or to the actual base64 string that gets sent in the Proxy-Authorization:
  8. header. Often the two variables will be the same, when the same proxy is
  9. used for both SSL and normal HTTP.
    #$PROXY_AUTH= 'Aladdin:open sesame' ;
    #$SSL_PROXY_AUTH= $PROXY_AUTH ;
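If you prefer to store the base64 form rather than "username:password", it can be produced with the core MIME::Base64 module, roughly as below. The sketch assumes the proxy uses the Basic authentication scheme.

```perl
use strict ;
use warnings ;
use MIME::Base64 qw(encode_base64) ;

# The second argument '' suppresses the trailing newline that
#   encode_base64() appends by default.
my $credentials= 'Aladdin:open sesame' ;
my $encoded= encode_base64($credentials, '') ;
print "$encoded\n" ;    # QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Keep in mind that base64 is an encoding, not encryption, so the stored string is no more secret than the plain password.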
  1. Set SOCKS proxy if needed. The format of $SOCKS_PROXY is "host:port", with
  2. the port being optional (defaults to 1080).
  3. If your SOCKS proxy supports username/password authentication, then set
  4. the username and password below.
  5. Also see @BANNED_NETWORKS above-- you'll need to remove the '127' from the
  6. default list if you use a SOCKS proxy on the machine where this is running,
  7. such as with the example here.
  8. NOTE THAT THE CONNECTION BETWEEN THIS SCRIPT AND YOUR SOCKS PROXY MUST BE
  9. TRUSTED, BECAUSE CURRENTLY ALL DATA IS SENT IN THE CLEAR BETWEEN THEM!
  10. In particular, the username and password below will be sent in the clear.
  11. The solution would be to use the GSSAPI authentication method, which many
  12. SOCKS proxies do not support, and which CGIProxy doesn't support yet either.
    #$SOCKS_PROXY= 'localhost:1080' ;
    #$SOCKS_USERNAME= '' ;
    #$SOCKS_PASSWORD= '' ;
  1. This is one way to handle pages that don't work well, by redirecting to other working
  2. versions of the pages (for example, to a mobile version or another version that
  3. doesn't have much JavaScript). How it works: If the current domain matches one
  4. of the keys of %REDIRECTS, then s/// (string substitution) is done on the URL,
  5. using the match and replacement patterns in the 2-element value array.
  6. The sites handled this way are Facebook and Gmail, since they don't
  7. always work well, or are slow, through CGIProxy. If you want to access
  8. them normally, then comment out or remove the line(s) below for that site.
  9. If you want to redirect more sites, you can add records to the %REDIRECTS
  10. hash in the following way: Set the hash key to the name of the server you
  11. want to redirect, and the value to a reference to a 2-element array containing
  12. the left and right sides of an s/// string substitution. If that doesn't make
  13. sense, then try to emulate an example below.
  14. As of version 2.1.7, the full facebook.com site works pretty well, so the
  15. redirection below has been commented out.
  16. ... aaaand, as of version 2.1.8, the full Gmail site works pretty well, so the
  17. redirection below has been commented out.
  18. To improve performance with facebook or other JS-busy sites, users can:
  19. - close other browser windows
  20. - end other CPU-heavy processes on their browsing machine
  21. - reload the page or restart the browser when it gets too slow
  22. - use a browser other than MSIE (it has the most problems)
  23. If Gmail or facebook is still too slow or crashes a lot, you can remove the
  24. leading "#" on the appropriate lines below to automatically redirect to
  25. Gmail's HTML-only site or facebook's mobile site, which may work better.
    %REDIRECTS= (
    # 'www.facebook.com' => [qr#^https?://www\.facebook\.com#i, 'https://m.facebook.com'],
    # 'mail.google.com' => [qr#^https?://mail\.google\.com/.*shva=\w*1.*$#i, 'https://mail.google.com/?ui=html']
    ) ;
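How a %REDIRECTS record might be applied can be sketched as follows. This is an illustration only: the table uses a placeholder site, and apply_redirect() is a hypothetical helper rather than the routine CGIProxy actually uses.

```perl
use strict ;
use warnings ;

# Example table in the format described above, using a placeholder site.
my %redirects= (
    'www.example.com' => [qr#^https?://www\.example\.com#i, 'https://m.example.com'],
) ;

# Hypothetical helper: if the host has a record, apply its s/// to the URL.
sub apply_redirect {
    my($host, $URL)= @_ ;
    my $rule= $redirects{lc $host}  or return $URL ;
    my($match, $replacement)= @$rule ;
    $URL=~ s/$match/$replacement/ ;
    return $URL ;
}

print apply_redirect('www.example.com', 'http://www.example.com/profile'), "\n" ;
# https://m.example.com/profile
print apply_redirect('other.example.org', 'http://other.example.org/'), "\n" ;
# http://other.example.org/
```

In this sketch the replacement is treated as a plain string, so capture references such as $1 in the right-hand side would not be expanded.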
  1. Some JavaScript-busy sites crash when visiting them through CGIProxy. Increasing
  2. the delay times in Window.setTimeout() and Window.setInterval() m

Related issues

Copied from Checkey - Bug #7753: nothing works New 07/17/2016
