PCD2HTML readme
pcd2html is a collection of scripts and Makefiles that convert
Kodak Photo CD data into documented HTML pages.
The user has to supply several files called "rules" which control
how the images are converted into JPG files.
pcd2html depends on convert, which is part of the ImageMagick
package available from
ftp://ftp.wizards.dupont.com/pub/ImageMagick/
By supplying certain text files, each single image can be
documented in English or German (or, with slight changes, in any
other language).
Example
-------
For those clever people who'd never read any documentation:
You can find a complete set of example files for converting a whole
Photo-CD at
http://www.physik.uni-halle.de/~e2od5/island/pcd_1998_data.html
The unchanged result of this set of files can be found at
http://www.physik.uni-halle.de/~e2od5/island/pcd_1998
If you stop reading here you are on your own. Maybe the poor
man page can help you a little bit, but this introduction
is intended to give you all the help you need.
Motivation
----------
There are many reasons for offering Photo-CD images in a form that
WWW browsers can read. Converting the images into JPG images is the
first step of this work. There are three ways to do that:
1. convert blindly with a dumb batch file
--> This will end in poor quality and is absolutely not recommended.
2. load the PCD image into an image manipulation program, do some
enhancements (such as cropping, shrinking, normalizing the histogram)
and save it as JPG
--> The quality of the output will be OK, but it is a time-consuming
process and (that's the point) do you really know what you have
done with your image? Imagine you want to insert a step in between
the steps you have done and try to remember what you did
some days or months ago.
3. use a batch method which is configurable for each image
--> You will need some iterations to get the wanted result, but
you always have the chance to reproduce what you have done.
Most basic image manipulation can be done by convert.
It may take some time to run the batch for an image over
and over until you are satisfied. But why not write the
descriptive text for the current image in the meantime?
Converting images
-----------------
The third way is the way pcd2html goes. In a rules file you define
all options for `convert` that you want to use for each image.
You start with a really simple rule.
The file rules has the line:
001 first_image
That does:
convert 001.pcd 001first_image.jpg
The rules file contains only the number of the PCD image and the
name you want to give the converted image. Now look at the image.
Maybe you want to crop the black border and shrink the image
to fit on a normal browser page. The new rules file looks like:
001 first_image [-crop 750x500+12+7 -geometry 600x450]
==>
convert -crop 750x500+12+7 -geometry 600x450 001.pcd 001first_image.jpg
All characters between [] are used as options for convert,
so you can place any valid convert option here.
OK, what if the image file is too big? You can compress the
image more strongly. By default pcd2html uses `-quality 75`. The
`-quality` option is the only option you should avoid placing inside []
because I'm not sure which of the two values convert would then use.
Try the following:
001 first_image [-crop 750x500+12+7 -geometry 600x450] Q60
==>
convert -crop 750x500+12+7 -geometry 600x450 -quality 60 001.pcd \
001first_image.jpg
Of course you can also specify lower compression if the image
shows ugly compression artifacts, e.g. "Q85".
Maybe you have to fiddle around a bit with all these options.
But this is no problem. The only thing you have to do is to change
the rules file and type `make`. Everything else is done automatically.
Consider choosing an image magnification from the Photo CD other than
the third. OK, what about
001 first_image [-crop 1500x1000+24+14 -geometry 600x450] Q60 !4
==>
convert -crop 1500x1000+24+14 -geometry 600x450 -quality 60 001.pcd[4] \
001first_image.jpg
You see there is full control over the converting process.
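The way a rules line turns into a convert call in the examples above
could be sketched roughly as follows. This is only an illustration in
Python and not how pcd2html itself works (pcd2html is built from bash,
make and sed). The function name rule_to_command and the regular
expression are assumptions derived from the examples. The sketch adds
`-quality` only when a Q value is given, as the examples do, and it
does not model the default of 75.

```python
import re
import shlex

# Hypothetical pattern for one image rule, inferred from the examples:
#   NNN name [convert options] Qnn !n
RULE = re.compile(
    r"^(?P<nr>\d+)\s+(?P<name>\S+)"
    r"(?:\s+\[(?P<opts>[^\]]*)\])?"
    r"(?:\s+Q(?P<quality>\d+))?"
    r"(?:\s+!(?P<res>\d+))?\s*$"
)


def rule_to_command(line):
    """Return the convert argument list for one rules line, or None."""
    if line.startswith("#") or not line.strip():
        return None  # commented-out or empty line
    m = RULE.match(line)
    if m is None:
        raise ValueError(f"cannot parse rule: {line!r}")
    cmd = ["convert"]
    if m["opts"]:
        # everything between [] is passed on as convert options
        cmd += shlex.split(m["opts"])
    if m["quality"]:
        cmd += ["-quality", m["quality"]]
    source = f"{m['nr']}.pcd"
    if m["res"]:
        # "!4" selects another magnification: 001.pcd[4]
        source += f"[{m['res']}]"
    cmd += [source, f"{m['nr']}{m['name']}.jpg"]
    return cmd
```

For instance, `" ".join(rule_to_command("001 first_image Q60"))` gives
a command line that can be compared with the examples above.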
You can comment out a line by putting a `#` in the first column.
This is useful if you want to keep the previous rules for an image
while you are unsure about the new ones.
Each line of a rules file describes what to do with a certain image
on the Photo-CD. The user has to group the images into subdirectories and
create a rules file in each subdirectory. After changing one of the
rules files, type `make` in that directory and *all* images in it will
be converted again. That means it is a good idea to work on one group
of files at a time: convert them, change their rules and convert them
again with the new rules.
Hint: To get the crop coordinates of an image, the program paul
was very useful for me. You can obtain paul from
http://packages.debian.org/paul
or
http://www.physik.uni-halle.de/~e2od5/paul
Further hint: If you want to convert only a single image, you can
do:
make 001.stamp
or whatever image you want to convert. The .stamp file
is built after converting (it contains some convert output from
the -verbose option) and is removed by make clean. It serves a
purely technical purpose and is not interesting for the user.
Creating HTML files
-------------------
Suppose all images are converted the way you like them. Normally you
would now start hacking HTML files to present them in a reasonable
manner. Would it astonish you that these files are already there?
OK, let me explain how it works.
After an image is converted, an HTML file is created which simply loads
the image. A style file called pcd.css is supported, which can
be helpful for changing the look of all pages in one step.
Note: This pcd.css file is the worst thing in the whole bundle
and I hope that someone will help me to improve it!
But that's not all!
I mentioned above that you could use the time while the batch conversion
runs for writing the text. Now I have to say: Never edit the
HTML file by hand! All your changes will be lost the next time you run
pcd2html. But there is a cleverer way.
The user can supply three files for each image, with the extensions
.eng, .deu, .tec
where the part of the file name before the extension is the number of
the image.
What are these beasts?
In the .eng file you can write any HTML text in English.
The first line has a special meaning: it is used as the title of
the page and is NOT displayed on the page itself. Maybe it
would make sense to use it as the headline as well. E-mail me your
opinion about it and I will change it!
The rest is printed below the image as an English description.
I think you can guess the meaning of the .deu file: it is for
German speakers. I think it wouldn't be a problem
for non-German speakers to change pcd2html so that it
supports their favourite language. What is important is that
pcd2html supports two languages ... if you like.
What about the .tec files?
They hold information which should be printed under
both the English and the German text. This is interesting
for photographers who want to supply information
such as the film used, shutter speed, aperture and so on.
All these things are optional but quite useful.
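How the description files could end up on a page can be sketched like
this. The Python code is only an illustration under assumptions: the
function names split_description and build_page and the HTML markup
are hypothetical, and the real pcd2html output may look different.
What the sketch takes from the text is that the first line becomes
the page title and is not shown on the page, the rest is printed
below the image, and the technical text comes last.

```python
def split_description(text):
    """Split the contents of a description file into (title, body)."""
    title, _, body = text.partition("\n")
    return title.strip(), body.strip()


def build_page(image, description, technical=""):
    """Assemble a simple page: title, image, description, technical text."""
    title, body = split_description(description)
    parts = [
        f"<html><head><title>{title}</title>",
        '<link rel="stylesheet" href="pcd.css"></head><body>',
        f'<img src="{image}">',
    ]
    if body:
        parts.append(body)  # the body is already HTML, so it is not escaped
    if technical.strip():
        parts.append(technical.strip())  # printed under the description
    parts.append("</body></html>")
    return "\n".join(parts)
```

Both the description and the technical text are optional here, which
matches the statement that all these files are optional.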
The point is that you get your HTML pages fast and can put
them on the net immediately after getting the Photo CD.
When you find the time to write some comments on your images,
simply edit the .eng, .deu or .tec file and
call pcd2html, and your HTML pages are up to date again.
How are the HTML pages ordered?
The pages in each subdirectory are ordered in the same order
in which the images are listed in the rules file. Each page
has a link to its previous and its following page if such
a page exists. So your slide show is ready to present.
Each subdirectory has an index file with a small (really
small!) thumbnail.
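The ordering described above could be sketched as follows. This is a
Python illustration only, and the function name neighbours is
hypothetical. It assumes that every image rule starts with the image
number and skips comment lines and any line that does not start with
a number.

```python
def neighbours(rule_lines):
    """Map each image number to its (previous, next) image number."""
    numbers = [
        line.split()[0]
        for line in rule_lines
        if line.strip()
        and not line.startswith("#")
        and line.split()[0].isdigit()
    ]
    links = {}
    for i, nr in enumerate(numbers):
        # the order of the rules file, not the numeric order, decides
        prev = numbers[i - 1] if i > 0 else None
        nxt = numbers[i + 1] if i + 1 < len(numbers) else None
        links[nr] = (prev, nxt)
    return links
```

A value of None stands for "no such page", so the first page gets no
link back and the last page gets no link forward.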
What to do if there is some stuff which doesn't fit into that
scheme? You can supply an executable (shell script) called
"extra" in each directory and insert a line in the rules
file
extra
What about the main index file?
In the main directory a rules file with a different syntax is
necessary.
It's recommended to use the first line as a comment (starting with
`#`) with the following information:
#
These values are used if you create tar.gz archives of your
pcd2html data or of the finished HTML output. This works in the
following way:
pcd2html data
=> stores all your user supported data in
_data-.tgz
or
pcd2html html
=> stores all necessary pcd2html output in
_html-.tgz
The first archive is what you have to store safely to be able to
rebuild all necessary files. The second one is all you need to put on
the Web, give to your friends or whatever else you want to do.
After these optional names follow the names of the subdirectories
with the titles of their index files in this syntax:
subdir {English title of subindex} [German title of subindex]
The main rules file controls which subdirectories are visited
to search for the image rules files. The main index file has links
to the indexes in the subdirectories. It has an additional link to
the other language. I considered it to be useless to have a link
to the other language on each page, but it wouldn't be the faintest
problem to implement such a feature.
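Reading the subdirectory lines of the main rules file could be
sketched like this. It is a Python illustration only, and the names
SUBDIR and parse_main_rules are hypothetical. The sketch assumes that
every line which is neither empty nor a comment follows the syntax
shown above, with the English title in {} and the German title in [].

```python
import re

# Hypothetical pattern: subdir {English title} [German title]
SUBDIR = re.compile(
    r"^(?P<subdir>\S+)\s+\{(?P<eng>[^}]*)\}\s+\[(?P<deu>[^\]]*)\]\s*$"
)


def parse_main_rules(lines):
    """Return a list of (subdir, English title, German title) tuples."""
    entries = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the leading comment line and empty lines
        m = SUBDIR.match(line)
        if m is None:
            raise ValueError(f"cannot parse main rule: {line!r}")
        entries.append((m["subdir"], m["eng"], m["deu"]))
    return entries
```

The order of the returned list is the order of the lines, which would
also be a natural order for the links in the main index file.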
All HTML pages contain meta tags with the date of creation, the
creating software and the name of the user who created the file
(if available in the `finger` information). A `keywords` meta tag
can additionally be attached to the main index file. It is
included if a file keywords.eng or keywords.deu exists. Likewise,
the files contents.eng and contents.deu provide the text of
the `contents` meta tag. Both tags are of interest to Web search
engines.
Last but not least, you should not forget to write the index files
index.eng and index.deu. These files contain the text of the main page.
As usual, the first line of these files is the title of the page,
and here it is also used as the headline.
Two comments have special meanings:
# back:
followed by the URL of the page one step deeper in your
pages hierarchy, and
# home:
marks your home page. It is good style to give visitors a chance
to go back to a reasonable place, so you shouldn't forget to put these
comments into your index.eng and index.deu files.
Requirements
------------
ImageMagick:
- Which version of convert you need depends on what you want to do
with the images. I used
ImageMagick 4.0.4 98/04/01 cristy@sympatico.org
pcd2html uses the following UNIX tools:
- bash (GNU bash, version 2.01.1(1)-release)
Some special bash features were used; I'm not sure whether other
shells can be used. Older bash versions should work, too.
- GNU make (GNU Make version 3.76.1)
There are features in GNU make which aren't available in other
make varieties. I expect no other make to work with pcd2html.
Older versions may work, too.
- grep (grep (GNU grep) 2.1)
No very specific features of GNU grep are used. Any grep
should work.
- sed (GNU sed version 2.05)
The sed programming in pcd2html is not very tricky, so it is
expected to work with other sed versions, too.
I really hope that pcd2html is useful for you and look forward to any
criticism, bug fixes or enhancements.
Andreas Tille