HTTrack Website Copier
Open Source offline browser

Option panel: MIME Types






  • MIME Types

  • An important new feature for some people. This panel tells the engine that whenever it encounters a link with a specific file type (.cgi, .asp, or .php3, for example), it MUST assume that the link always has the same MIME type, such as "text/html". This is VERY important for speeding up many mirrors: large HTML files that embed many links of unknown type, such as ".asp", force the engine to test every link, which slows down the parser.

    In this case, you can tell HTTrack that ".asp pages are in fact HTML pages" by using:

    File type: asp MIME identity: text/html

    You can declare multiple definitions, or declare multiple types separated by ",", as in:
    File type: asp,php,php3 MIME identity: text/html
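
To make the effect of such definitions concrete, here is a minimal Python sketch of an extension-to-MIME lookup table. It is an illustration only, not HTTrack's actual implementation; the function names `parse_mime_definitions` and `assumed_mime` and the example paths are hypothetical.

```python
def parse_mime_definitions(definitions):
    """Build an extension -> MIME type table from (file types, MIME identity) pairs.

    Each "file types" string may hold several extensions separated by ",".
    """
    table = {}
    for file_types, mime in definitions:
        for ext in file_types.split(","):
            ext = ext.strip().lstrip(".").lower()
            if ext:
                table[ext] = mime.strip()
    return table


def assumed_mime(url_path, table):
    """Return the declared MIME type for a link, or None if its type is unknown.

    None stands for the slow case, where the engine would have to test the link.
    """
    path = url_path.split("?", 1)[0]      # ignore any query string
    name = path.rsplit("/", 1)[-1]        # keep only the file name
    if "." not in name:
        return None
    ext = name.rsplit(".", 1)[-1].lower()
    return table.get(ext)


# Hypothetical example: the definition shown above.
table = parse_mime_definitions([("asp,php,php3", "text/html")])
print(assumed_mime("/shop/index.asp", table))  # declared type, no test needed
print(assumed_mime("/shop/logo.xyz", table))   # undeclared type, would be tested
```

A link whose extension is in the table can be treated as the declared type immediately; only links that fall through to `None` would need the slower per-link test.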

    The most important MIME types are:
    text/html                  HTML files, parsed by HTTrack
    image/gif                  GIF files
    image/jpeg                 JPEG files
    image/png                  PNG files
    application/x-zip          .zip files
    application/x-mp3          .mp3 files
    application/x-foo          .foo files
    application/octet-stream   Unknown files

    You can rename files on a mirror. If you KNOW that all "dat" files are in fact "zip" files renamed to "dat", you can tell HTTrack:
    File type: dat MIME identity: application/x-zip

    You can also "name" a file type with its original MIME type if HTTrack does not know this type. This avoids a test when the link is reached:
    File type: foo MIME identity: application/octet-stream

    In this case, HTTrack won't check the type, because it has learned that "foo" is a known type with the MIME type "application/octet-stream". It will therefore leave the "foo" type untouched.
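
The same definitions can apparently be passed on the command line through HTTrack's `--assume` (`-%A`) option, which takes a value such as `php3,cgi=text/html;dat,bin=application/x-zip`. The Python sketch below builds that value from the panel-style pairs used on this page. The helper name `build_assume_value`, the example URL, and the output directory are assumptions for illustration; check `httrack --help` for the exact syntax in your version.

```python
def build_assume_value(definitions):
    """Join (file types, MIME identity) pairs into one "ext,ext=mime;ext=mime" string."""
    parts = []
    for file_types, mime in definitions:
        exts = ",".join(
            e.strip().lstrip(".") for e in file_types.split(",") if e.strip()
        )
        parts.append(f"{exts}={mime.strip()}")
    return ";".join(parts)


# The three definitions discussed on this page.
definitions = [
    ("asp,php,php3", "text/html"),
    ("dat", "application/x-zip"),
    ("foo", "application/octet-stream"),
]

value = build_assume_value(definitions)

# Hypothetical invocation: the URL and the output directory are placeholders.
command = ["httrack", "http://www.example.com/", "-O", "./mirror", "--assume", value]

print(value)
print(" ".join(command))
```

Keeping the command as a list lets it be handed to `subprocess.run` unchanged, without shell quoting problems around the ";" separators.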





