PCD2HTML readme

pcd2html is a collection of scripts and Makefiles to convert Kodak Photo CD data into documented HTML pages.
The user has to supply several files called "rules" that control how the images are converted into JPG files. pcd2html depends on convert, which is part of the ImageMagick package available from
   ftp://ftp.wizards.dupont.com/pub/ImageMagick/
By supplying certain text files, each single image can be documented in English or German (or, with slight changes, in any other language).

Example
-------

For those clever people who never read any documentation: you can find a complete set of example files to convert a whole Photo-CD at
   http://www.physik.uni-halle.de/~e2od5/island/pcd_1998_data.html
The unchanged result of this set of files can be found at
   http://www.physik.uni-halle.de/~e2od5/island/pcd_1998
If you stop reading here you are on your own. Maybe the poor man page can help you a little, but this introduction is intended to give you all the help you need.

Motivation
----------

There are many reasons for offering Photo-CD images in a form readable by WWW browsers. Converting the images into JPG images is the first step of this work. There are three possibilities to do that:

  1. Convert blindly with a dumb batch file.
     --> This will end in poor quality and is absolutely not recommended.

  2. Load the PCD image into an image manipulation program, do some enhancements (such as cropping, shrinking, normalizing the histogram) and save it as JPG.
     --> The quality of the output will be OK, but it is a time-consuming process and (that's the point) do you really know what you have done with your image? Imagine you want to insert a step between the steps you have done and have to remember what you did some days or months ago.

  3. Use a batch method which is configurable for each image.
     --> You will need some iterations to get the wanted result, but you always have the chance to reproduce what you have done. Most basic image manipulation can be done by convert. It may take some time to run the batch for an image over and over until you are satisfied.
But why not use this time to write the descriptive text for the current image?

Converting images
-----------------

The third way is the way pcd2html goes. In a rules file you define all the options for `convert` you want to use for each image. In fact, you start with a really simple rule. The file rules has the line:

   001 first_image

That does:

   convert 001.pcd 001first_image.jpg

The rules file contains only the number of the PCD image and the name you want to give the converted image. Now view the image. Maybe you decide to crop the black border and shrink the image to fit on a normal browser page. The new rules file looks like:

   001 first_image [-crop 750x500+12+7 -geometry 600x450]
   ==> convert -crop 750x500+12+7 -geometry 600x450 001.pcd 001first_image.jpg

All characters between [] will be used as options for convert, so you can place any valid convert option here.

OK, what if the image file is too big? You can compress the image more strongly. By default pcd2html uses `-quality 75`. The `-quality` option is the only option you should avoid placing inside [], because I'm not sure which of the two `-quality` values convert would then use. Try the following instead:

   001 first_image [-crop 750x500+12+7 -geometry 600x450] Q60
   ==> convert -crop 750x500+12+7 -geometry 600x450 -quality 60 001.pcd \
               001first_image.jpg

Of course you can also specify lower compression if the image shows ugly compression artifacts, e.g. "Q85". Maybe you have to fiddle around a bit with all these options, but this is no problem. The only thing you have to do is change the rules file and type `make`. Everything else is done automatically.

You may also consider choosing an image magnification (resolution level) from the Photo CD other than the third. OK, what about

   001 first_image [-crop 1500x1000+24+14 -geometry 600x450] Q60 !4
   ==> convert -crop 1500x1000+24+14 -geometry 600x450 -quality 60 001.pcd[4] \
               001first_image.jpg

You see, there is full control over the conversion process.

It is possible to comment out a line by placing a `#` in the first column of the line.
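As an illustration only, the mapping from a rules line to a convert call described above can be sketched in Python. This is not how pcd2html itself works (it is built from bash, make and sed); the pattern `RULE` and the function `rule_to_command` are hypothetical names, and the line syntax is inferred from the examples in this section. The sketch adds `-quality` only when a `Q` value is given, mirroring the commands shown above, whereas pcd2html itself is described as falling back to `-quality 75`.

```python
import re

# Hypothetical pattern for one line of an image rules file, inferred from
# the README examples:  NUMBER NAME [convert options] Qnn !n
RULE = re.compile(
    r"^(?P<num>\d+)\s+(?P<name>\S+)"      # image number and output name
    r"(?:\s+\[(?P<opts>[^\]]*)\])?"       # optional [options for convert]
    r"(?:\s+Q(?P<quality>\d+))?"          # optional Qnn  -> -quality nn
    r"(?:\s+!(?P<res>\d+))?\s*$"          # optional !n   -> NUMBER.pcd[n]
)


def rule_to_command(line):
    """Return the convert argument list for one rules line.

    Lines with '#' in the first column and blank lines yield None.
    """
    if line.startswith("#") or not line.strip():
        return None
    m = RULE.match(line)
    if m is None:
        raise ValueError(f"unrecognised rule: {line!r}")
    cmd = ["convert"]
    if m["opts"]:
        cmd += m["opts"].split()
    if m["quality"]:
        cmd += ["-quality", m["quality"]]
    source = f"{m['num']}.pcd"
    if m["res"]:
        source += f"[{m['res']}]"
    cmd += [source, f"{m['num']}{m['name']}.jpg"]
    return cmd


if __name__ == "__main__":
    for rule in (
        "001 first_image",
        "001 first_image [-crop 750x500+12+7 -geometry 600x450] Q60",
        "#001 first_image [-crop 750x500+12+7]",
    ):
        print(rule, "==>", rule_to_command(rule))
```

The sketch only builds the argument list and never runs convert, so it can be tried without ImageMagick or a Photo CD at hand.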
Commenting out a line is useful if you want to keep your previous rule for an image while you are unsure about a new one.

Each line of a rules file describes what to do with a certain image on the Photo-CD. The user has to group the images into subdirectories and create a rules file in each subdirectory. After changing one of the rules files, type `make` in this directory and *all* images will be converted again. That means it is a good idea to convert a group of files, change their rules, and convert them again with the new rules.

Hint: To get the crop coordinates of an image, the program paul was very useful for me. You can obtain paul from
   http://packages.debian.org/paul
or
   http://www.physik.uni-halle.de/~e2od5/paul

Further hint: If you want to convert only a single image, you can do:
   make 001.stamp
or whatever image you want to convert. The .stamp file is built after converting (it contains some convert output from the -verbose option) and will be removed by
   make clean
It serves a more or less technical purpose and is not interesting for the user.

Creating HTML files
-------------------

Suppose all images are converted as you like them. Normally you would now start hacking HTML files to present them in a reasonable manner. Would it astonish you that these files are already there? OK, let me explain how it works.

After converting an image, an HTML file is created which simply loads the image. A style file called pcd.css is supported, which can be helpful for changing the look of all pages in one step.

Note: This pcd.css file is the worst thing in the whole bundle and I hope that someone will help me to improve it!

But that's not all! I mentioned above that you could use the time while the batch conversion runs for editing the HTML file. Now I have to say: Never edit the HTML file by hand! All your changes will be lost the next time you run pcd2html. But there is a more clever way. The user can supply three files for each image, with the extensions .eng, .deu and .tec, each named after the number of the image.
What are these beasts? In the .eng file you can write any HTML text in English. The first line has a special meaning: it is used as the title of the page and is NOT displayed on the page itself. Maybe it would make sense to use it as the headline as well. E-mail me your opinion about it and I will change it! The rest is printed below the image as an English description.

I think you can guess the meaning of the .deu file. It is for speakers of German. I think it would not be a problem for non-German speakers to change pcd2html so that it supports their favourite language. The important point is that pcd2html supports two languages ... if you like.

What about the .tec files? They are for information which should be printed under both the English and the German text. This is interesting for photographers who want to provide information like the film used, shutter speed, aperture and so on.

All these things are optional but quite useful. The point is that you get your HTML pages fast and can put them on the net immediately after getting the Photo CD. If you find the time to note some comments on your images, simply edit the .eng, .deu or .tec file, call pcd2html, and your HTML pages are up to date every time.

How are the HTML pages ordered? The pages in each subdirectory appear in the same order as the images are described in the rules file. Each page has a link to its previous and its following page, if such a page exists. So your slide show is ready to present. Each subdirectory has an index file with a small (really small!) thumbnail of each image.

What to do if there is some stuff which doesn't fit into that scheme? You can supply an executable (shell script) called "extra" in each directory and insert the line
   extra
in the rules file.

What about the main index file? In the main directory a rules file with a different syntax is necessary.
It is recommended to use the first line as a comment (starting with `#`) carrying the following information:

   #

These values are used if you create tar.gz archives of your pcd2html data or of the finished HTML output. This works in the following way:

   pcd2html data
   => stores all your user-supplied data in _data-.tgz
or
   pcd2html html
   => stores all necessary pcd2html output in _html-.tgz

The first archive is what you have to store safely in order to rebuild all necessary files. The second one is all you need to put on the Web, give to your friends, or whatever else you want to do with it.

After these optional names follow the names of the subdirectories with the titles of their index files, in this syntax:

   subdir {English title of subindex} [German title of subindex]

The main rules file controls which subdirectories are visited to search for the image rules files. The main index file has links to the indexes in the subdirectories. It has an additional link to the other language. I considered it useless to have a link to the other language on each page, but it would not be the faintest problem to implement such a feature.

All HTML pages contain meta tags with the date of creation, the creating software, and the name of the user who created the file (if available in the `finger` information). The main index file can furthermore carry a `keywords` meta tag. It is included if a file keywords.eng or keywords.deu exists. Likewise, the files contents.eng and contents.deu are responsible for the `contents` meta tag. Both tags are interesting for Web search engines.

Last but not least, you should not forget to write the index files index.eng and index.deu. These files contain the text of the main page. As usual, the first line of these files is the title of the page, and here it is also used as the headline. Two comments have special meanings:

   # back:

followed by the URL of the page one step deeper in your page hierarchy, and

   # home:

which marks your home page. It is good style to give visitors a chance to go back to a reasonable place, so you should not forget to put these comments into your index.eng and index.deu files.

Requirements
------------

ImageMagick:
  - Which version of convert you use depends on what you need for converting your images. I used
       ImageMagick 4.0.4 98/04/01 cristy@sympatico.org

pcd2html uses the following UNIX tools:
  - bash (GNU bash, version 2.01.1(1)-release)
    Some special bash features were used; I'm not sure whether other shells could be used. Older bash versions should work, too.
  - GNU make (GNU Make version 3.76.1)
    There are features in GNU make which aren't available in other make varieties. I expect no other make to work with pcd2html. Older versions could work, too.
  - grep (grep (GNU grep) 2.1)
    No very specific features of GNU grep are used. Every grep should work.
  - sed (GNU sed version 2.05)
    The sed programming in pcd2html is not very tricky, so it is expected to work with other sed versions, too.

I really hope that pcd2html is useful for you and look forward to any criticism, bug fixes or enhancements.

Andreas Tille