A WebCopier Project represents a download task. The program keeps all your Projects and their
settings between working sessions.
Each Project has a number of settings, which can be viewed or changed in the Project Settings dialog.
The settings are divided into four groups: General, Download, Contents and Advanced.
General settings
Project
- Title - the project title shown in the Contents Tree and the program window caption. The title helps you distinguish this project from others later.
- Save in Directory - the directory name where downloaded files will be saved.
It’s recommended to have a separate directory for each project.
You can type the directory name or press the ... button to select it. If the directory does not exist,
it will be created automatically when the download starts.
Website
- URL - the address of the web site you want to download.
You can type the address or copy it from the clipboard.
Authentication
- User ID, Password - the user ID and password for a secure web site.
Leave the fields blank if you do not need to log on to access the site.
Notes
- It’s highly recommended to specify a unique Save in Directory name for each project.
Otherwise, files downloaded for one project can be overwritten by another project’s files,
because they will be saved in the same directory.
Download settings
- Speed - the number of files that will be downloaded simultaneously.
You can specify a number from 1 to 100. The default value is 5.
You may need to experiment to find the optimum: a small number can make downloads slower,
while too high a value can make them unreliable because of possible timeouts.
If your downloads seem unreliable (you see too many 12002 errors in the
Log File Window) but you don’t want to decrease the number,
try increasing the Timeout value.
Restrictions
- Max. Total Disk Space - the maximum total size of all downloaded files.
- Max. Number of Files - the maximum number of downloaded files.
- Max. File Size - the maximum file size limit. Bigger files are not downloaded.
- Min. File Size - the minimum file size limit. Smaller files are not downloaded.
- Max. Total Time - the maximum download time.
Notes
- These settings can be changed during download.
- If files in the HTML category do not match the Min. or Max. File Size restriction,
they are still downloaded and parsed, and only then deleted.
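The size restrictions and the HTML exception described above can be sketched as a small decision function. This is only an illustration of the documented behavior, not WebCopier’s actual code: the function name `size_decision`, the result strings and the use of bytes as the unit are all assumptions.

```python
def size_decision(size, is_html, min_size=None, max_size=None):
    """Decide what happens to a file of `size` bytes (hypothetical helper).

    Returns "keep" when the file is within the limits, "skip" when it is
    outside them, and "parse_then_delete" for HTML files outside the limits,
    which are still downloaded and parsed for links before being deleted.
    """
    too_small = min_size is not None and size < min_size
    too_big = max_size is not None and size > max_size
    if not (too_small or too_big):
        return "keep"
    return "parse_then_delete" if is_html else "skip"


print(size_decision(500, is_html=False, min_size=1000))  # skip
print(size_decision(500, is_html=True, min_size=1000))   # parse_then_delete
```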
Contents settings
These settings control the appearance and behavior of the Contents Tree.
They don’t affect the download process.
During Download
- Update during Download - if this option is set, the Contents Tree
is updated during the download every time new files are found or a file’s status changes.
Show
- Rename non-standard HTML files - renames non-standard HTML files
(for example, "index.asp" becomes "index.asp.htm") so that you can view these files offline even when
the corresponding server (Microsoft IIS in the example above) is not running on your computer.
- Keep query text in the file name - if the link URL contains a query string (the text after the ‘?’ character)
and this option is enabled, the file name will also contain the query string (if its length doesn’t exceed
260 characters). Otherwise, the file name will contain encoded text like "Wcf45edcc35f02.htm".
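The renaming rule from the Rename non-standard HTML files option can be sketched as follows. The helper name `offline_name` and the set of extensions treated as "standard" are assumptions; the sketch also assumes the caller already knows the file contains HTML.

```python
import posixpath

# Assumption: only these extensions count as "standard" HTML extensions.
STANDARD_HTML_EXTENSIONS = {".htm", ".html"}


def offline_name(file_name):
    """Append ".htm" to an HTML file whose extension is non-standard."""
    extension = posixpath.splitext(file_name)[1].lower()
    if extension in STANDARD_HTML_EXTENSIONS:
        return file_name
    return file_name + ".htm"


print(offline_name("index.asp"))   # index.asp.htm
print(offline_name("index.html"))  # index.html
```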
Advanced / Download settings
Level Limits
- Main Links - sets the download Levels for all links
that have the same domain name as the starting URL.
- Offsite Links - sets the download Levels for all links
that have a different domain name from the starting URL.
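As a rough illustration, choosing between the two limits comes down to comparing the link’s host with the host of the starting URL. The function name `level_limit_for` is hypothetical, and comparing full host names is an assumption about what "same domain name" means here.

```python
from urllib.parse import urlsplit


def level_limit_for(link, start_url, main_levels, offsite_levels):
    """Return the Level limit that applies to `link` (hypothetical helper)."""
    # hostname is lower-cased by urlsplit, so the comparison ignores case.
    same_host = urlsplit(link).hostname == urlsplit(start_url).hostname
    return main_levels if same_host else offsite_levels


start = "http://www.server.com/dir/file.htm"
print(level_limit_for("http://www.server.com/a.htm", start, 5, 1))  # 5
print(level_limit_for("http://other.example/a.htm", start, 5, 1))   # 1
```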
Update Files
- When Changed - the file will be downloaded if it was not downloaded before or if its creation
time or size has changed.
- Never - the file will be downloaded only if it was not downloaded before.
- Always - the file will be downloaded regardless of whether it has changed.
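The three Update Files modes can be summarized in a few lines of code. This is a minimal sketch: the `FileInfo` type, the mode strings and the function name are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class FileInfo(NamedTuple):
    created: float  # creation time reported for the file
    size: int       # size in bytes


def should_download(mode: str, local: Optional[FileInfo], remote: FileInfo) -> bool:
    """Apply the Update Files mode; `local` is None if never downloaded."""
    if mode not in ("when_changed", "never", "always"):
        raise ValueError(f"unknown mode: {mode}")
    if local is None:
        return True            # every mode downloads a file that is missing
    if mode == "always":
        return True
    if mode == "never":
        return False
    return local != remote     # when_changed: time or size differs


old = FileInfo(created=1000.0, size=2048)
new = FileInfo(created=2000.0, size=2048)
print(should_download("when_changed", old, new))  # True
print(should_download("never", old, new))         # False
```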
Links Conversion
- Do NOT convert links - use this option to prevent link conversion after the download.
- Convert only DOWNLOADED links - use this option to allow the program to update all project
links that have been downloaded to make them relative to the project directory and ready for offline browsing.
- Convert ALL links - use this option to allow the program to update all project links
(including links that have not been downloaded) to make them relative to the project directory and
ready for offline browsing.
- Download images from any location - allows images to be downloaded from different servers even
when the "Load Files from" option ("Project Settings / Advanced / URL Filters" window) is set to
"The starting Server only", or other URL Filter settings don’t allow that.
- Delete outdated files - check this option if you want the program to automatically delete
outdated files (files that were previously downloaded by WebCopier and no longer exist on the
web site) from your disk.
Notes
- The Offsite Links parameter is available only when you have chosen to load files from
All Servers, or if you have at least one URL filter with the action Include.
- You can convert links later at any time by using the Project / Convert Links menu
command.
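The three Links Conversion options can be illustrated with a short sketch that rewrites one link relative to the page that contains it. All names here are hypothetical: `local_paths` maps each known URL to the path it has (or would have) inside the project directory, and `downloaded` is the set of URLs that were actually saved.

```python
import posixpath


def convert_link(href, page_path, local_paths, downloaded, mode="downloaded"):
    """Return the link text to store in the saved page (hypothetical helper).

    mode is "none", "downloaded" or "all", mirroring the three options.
    """
    if mode == "none" or href not in local_paths:
        return href
    if mode == "downloaded" and href not in downloaded:
        return href
    page_dir = posixpath.dirname(page_path) or "."
    return posixpath.relpath(local_paths[href], page_dir)


paths = {
    "http://www.server.com/img/a.png": "img/a.png",
    "http://www.server.com/b.htm": "b.htm",
}
saved = {"http://www.server.com/img/a.png"}
print(convert_link("http://www.server.com/img/a.png", "pages/index.htm", paths, saved))
# ../img/a.png
```

With mode "downloaded", the link to b.htm stays absolute because that file was not saved; with mode "all" it becomes "../b.htm".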
Advanced / File Filters settings
File filters allow you to selectively download files by file type.
For your convenience, all files are grouped into file type categories:
HTML, Image, Audio, Video, Java, Document, Archive and Other.
You can enable or disable loading of a whole file type category by setting or clearing its check box in the
File Types list.
- An unchecked category means that none of the files with the extensions listed in it will be downloaded.
- A checked category means that all files with the checked extensions listed in it will be downloaded.
Unchecked extensions will be skipped.
- You can edit a category by adding, changing or removing file extensions in the list.
You can set different size limits for different file types (for example, HTML, Image, Audio).
- Min. File Size - the minimum file size limit. Smaller files that belong to this File Group
are not downloaded.
- Max. File Size - the maximum file size limit. Bigger files that belong to this File Group
are not downloaded.
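The checks described above (category check box, extension check box, per-category size limits) can be sketched like this. The `FileCategory` type and the `category_allows` function are hypothetical, and sizes are assumed to be in bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileCategory:
    checked: bool = True
    # extension (without the dot, lower case) -> is its check box set?
    extensions: Dict[str, bool] = field(default_factory=dict)
    min_size: Optional[int] = None
    max_size: Optional[int] = None


def category_allows(category, extension, size):
    """Return True if a file with this extension and size may be downloaded."""
    if not category.checked:
        return False
    if not category.extensions.get(extension.lower(), False):
        return False
    if category.min_size is not None and size < category.min_size:
        return False
    if category.max_size is not None and size > category.max_size:
        return False
    return True


image = FileCategory(extensions={"gif": True, "bmp": False}, max_size=100_000)
print(category_allows(image, "gif", 20_000))  # True
print(category_allows(image, "bmp", 20_000))  # False
```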
Advanced / URL Filters settings
URL filters let you shape Project downloads by specifying which files can be loaded and
which should be skipped.
All URL Filters are divided into four sections:
- Protocol - restricts loading by protocol (the "http" part of http://www.server.com/dir/file.htm);
- Server - restricts loading by server name (the "www.server.com" part of http://www.server.com/dir/file.htm);
- Directory - restricts loading by path (the "/dir/" part of http://www.server.com/dir/file.htm);
- File - restricts loading by file name (the "file.htm" part of http://www.server.com/dir/file.htm).
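To make the four sections concrete, the sample URL can be split with Python’s standard library. The function name `url_sections` and the dictionary keys are chosen for illustration only.

```python
import posixpath
from urllib.parse import urlsplit


def url_sections(url):
    """Split a URL into the four sections used by URL filters."""
    parts = urlsplit(url)
    directory, file_name = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        "directory": directory,
        "file": file_name,
    }


print(url_sections("http://www.server.com/dir/file.htm"))
# {'protocol': 'http', 'server': 'www.server.com', 'directory': '/dir', 'file': 'file.htm'}
```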
Load files from Server
- The starting Server only - files are downloaded only if the server part of their URL is
the same as the server in the starting URL.
- All Servers - files are loaded without any server name restriction.
Load files from Directory
- Within the starting Dir. & below - files are downloaded only if the directory part of their URL
is equal to or begins with the directory in the starting URL.
- All Directories - files are loaded without any directory name restriction.
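The two restrictive options can be expressed as simple predicates. This is a sketch under the assumption that "server" means the URL host name and that the starting directory is the path of the starting URL without its file name; both function names are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def same_server(url, start_url):
    """The starting Server only: the host names must match."""
    return urlsplit(url).hostname == urlsplit(start_url).hostname


def within_start_dir(url, start_url):
    """Within the starting Dir. & below: the path must begin with the start directory."""
    start_dir = posixpath.dirname(urlsplit(start_url).path or "/")
    if not start_dir.endswith("/"):
        start_dir += "/"
    return (urlsplit(url).path or "/").startswith(start_dir)


start = "http://www.server.com/dir/file.htm"
print(within_start_dir("http://www.server.com/dir/sub/page.htm", start))  # True
print(within_start_dir("http://www.server.com/other/page.htm", start))    # False
```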
URL Filters
The Filters List allows you to add, change or remove URL filters.
To create a new filter, press the corresponding button, fill in the filter sections and select an action to include or exclude files based on the filter settings.
You can leave a section empty to skip checking that section.
You can also use the wildcard characters asterisk (*) and caret (^) to match patterns.
The asterisk (*) matches any number of characters. The caret (^) matches any single character.
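The wildcard rules above can be illustrated by translating a filter section into a regular expression. The function names are hypothetical, and case-sensitive matching is an assumption; the source does not say how WebCopier treats letter case.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a filter pattern: * = any run of characters, ^ = one character."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append(".*")
        elif ch == "^":
            pieces.append(".")
        else:
            pieces.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(pieces))


def section_matches(pattern, text):
    """An empty section is not checked, so it matches anything."""
    if not pattern:
        return True
    return wildcard_to_regex(pattern).fullmatch(text) is not None


print(section_matches("*.gif", "banner.gif"))      # True
print(section_matches("img^^.jpg", "img01.jpg"))   # True
print(section_matches("img^^.jpg", "img1.jpg"))    # False
```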
Advanced / Other settings
Download
- Timeout - specifies the amount of time (in seconds) the program will wait for a web server response. The valid range is from 10 to 600 seconds. The default value is 60.
- Retries - the number of times WebCopier will try to download a file again if an error occurs. The number can be from 0 to 100. The default value is 5.
- Pause - specifies a pause between file downloads to prevent web server overload.
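How the three settings interact can be sketched as a download loop. This is not WebCopier’s implementation: `fetch` is a hypothetical callable that takes a URL and a timeout and raises `OSError` on failure, and treating Retries as extra attempts after the first one is an assumption.

```python
import time


def fetch_with_retries(fetch, url, timeout=60, retries=5):
    """Call fetch(url, timeout), repeating up to `retries` times after an error."""
    if not 10 <= timeout <= 600:
        raise ValueError("timeout must be between 10 and 600 seconds")
    if not 0 <= retries <= 100:
        raise ValueError("retries must be between 0 and 100")
    for attempt in range(retries + 1):
        try:
            return fetch(url, timeout)
        except OSError:
            if attempt == retries:
                raise  # out of retries: report the last error


def download_files(urls, fetch, timeout=60, retries=5, pause=0.0):
    """Download files one by one, waiting `pause` seconds after each file."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch_with_retries(fetch, url, timeout, retries)
        except OSError as error:
            results[url] = error  # keep the error so it could be logged
        time.sleep(pause)
    return results
```

A real `fetch` could wrap `urllib.request.urlopen(url, timeout=timeout)`, whose errors derive from `OSError`.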
Directory structure
- Relative - the files will be saved with the names relative to the file name of the starting page keeping the original website directory structure.
- Based on URLs - the files will be saved with file names that include the full URL.
- One directory - all downloaded files will be saved in one directory ignoring the original website directory structure.
Agent Identification
Sometimes it’s necessary to change the way WebCopier identifies itself to web servers.
This is called Agent Identification, and the default setting is WebCopier. The program has several preset values you can choose from.
The Agent Identification value doesn’t affect most websites. However, you may sometimes have problems downloading some protected websites until the correct Agent Identification is set.
- The combo box lets you choose between the following Agent Identifications: WebCopier, Anonymous, Microsoft Internet Explorer, Netscape Communicator, Opera and User defined.
- version - sets the version number if the Microsoft Internet Explorer, Netscape Communicator or Opera Agent Identification is selected.
- Use this identification - lets you type your own identification text when the User defined Agent Identification is selected.
- Default HTML File - the default HTML file name. When an HTML link does not include a file name (for example "www.somesite.com"), WebCopier assigns the name "default.htm" to the starting page, because the web server doesn’t tell the actual name of the starting page. With this setting you can assign a different name to the starting page.
- Referrer URL - lets you specify the Referrer URL for the project’s starting URL. This may be helpful if you download part of a web site and the web server wants to know where the request for the first page came from. For all other site pages WebCopier automatically supplies the Referrer information based on the site structure.
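In HTTP terms, Agent Identification and Referrer URL correspond to the User-Agent and Referer request headers (the header name is spelled "Referer" in the HTTP standard). The sketch below builds such a request with Python’s standard library. The function name is hypothetical, and the exact identification strings WebCopier sends are not given in this document, so "WebCopier" is used only as a placeholder.

```python
import urllib.request


def build_request(url, agent="WebCopier", referrer=None):
    """Create a request that carries the chosen identification headers."""
    headers = {"User-Agent": agent}
    if referrer:
        headers["Referer"] = referrer
    return urllib.request.Request(url, headers=headers)


request = build_request(
    "http://www.server.com/dir/file.htm",
    referrer="http://www.server.com/",
)
print(request.get_header("User-agent"))  # WebCopier
print(request.get_header("Referer"))     # http://www.server.com/
```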