.\"===============================================================
.\"
.\" webtex.man 0.96
.\"
.\"                                                 May  30 1997
.\"---------------------------------------------------------------
.TH WEBTEX 1 "30 MAY 1997" Linux "'naochan.com' presents"
.\"===============================================================

.\"------
.SH NAME
.\"------
webtex \- an implementation of a \fIreally practical and usable\fP html2latex

.\"----------
.SH SYNOPSIS
.\"----------
.B webtex
[ \-fgjdv ][ \-o 
.I file
][ \-b
.I base\-URL
]
.I URL

.\"-------------
.SH DESCRIPTION
.\"-------------

.B webtex
is an experimental attempt (still being written) to implement
a \fIreally practical and usable\fP html2latex.

I wrote the whole system from scratch in yacc, lex, and C.
I also wrote the LaTeX coding parts myself.

From
.B webtex
version 0.92 on, I have used the Berkeley version of \fIyacc\fP(1).

My aim is a program that reads the URL given as
its argument (like \fIlynx\fP(1)) and outputs the web page as a
LaTeX-formatted document.

At the current stage, automatic Kanji code detection is not available.
If you want to read a non-EUC Japanese HTML document,
please use \fBwebtex\fP together with a Kanji-code converter such as \fInkf\fP(1).

.\"---------
.SH OPTIONS
.\"---------

.TP
.B \-f
Put the URLs given in <A HREF="..."> into \\footnote.
.TP
.B \-g
Use \fIepsbox.sty\fP to paste GIF images into the document.
*NB* Within <PRE>...</PRE>, \fBwebtex\fP cannot paste images, because
\fIepsbox.sty\fP conflicts with the verbatim (alltt) environments.
.TP
.B \-j
Japanese EUC mode (use jreport.sty in LaTeX).
.TP
.BI "\-o " filename
Write the LaTeX-formatted document to the file
.IR filename .
.TP
.BI "\-b " base\-URL
When processing an HTML file that you already have locally (not fetched over HTTP),
you can specify the URL where that HTML file belongs as
.I base\-URL
so that relative URLs in <A HREF="..."> tags can be resolved.
.TP
.B \-d
Output debug messages to your standard error output (stderr).
.TP
.B \-v
Show the version number of this instance of
.B webtex
and quit.

.\"-----------------------------
.SH RECOGNIZED TAGS & ATTRIBUTES
.\"-----------------------------

.TP
.B Basic Set
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<HTML> 
.PD 0
.TP
<HEAD> <TITLE> <BODY>
.TP
<BR> <NOBR> <WBR>
.TP
<HR> ..................... WIDTH, SIZE, ALIGN
.TP
<IMG> .................... SRC, ALT
.PD
.RE
.TP
.B Paragraph, Anchor, etc.
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<H1> <H2> <H3> <H4> <H5> <H6>
.PD 0
.TP
<A> ...................... HREF, NAME
.TP
<CENTER> 
.TP
<P> </P>
.TP
<BLOCKQUOTE> <BQ>
.TP
<ADDRESS>
.PD
.RE
.TP
.B Character Attribute
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<FONT> ................... SIZE
.PD 0
.TP
<BIG> <SMALL>
.TP
<EM> <STRONG> <CODE> <SAMP> 
.TP
<KBD> <VAR> <CITE> <DFN> 
.TP
<B> <I> <U> <TT>
.TP
<PRE> 
.PD
.RE
.TP
.B Tabular Environment
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<TABLE> .................. BORDER
.PD 0
.TP
<CAPTION>
.TP
<TR>
.TP
<TH> <TD> ................ COLSPAN, ROWSPAN, ALIGN, (VALIGN)
.PD
.RE
.TP
.B Listings
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<OL> ..................... TYPE, START
.PD 0
.TP
<UL> ..................... TYPE
.TP
<LI> <DL> <DT> <DD>
.TP
<MENU> <DIR>
.PD
.RE
.TP
.B Forms
.PD
.RS
.ta 16
.ie n .TP 27
.el .TP 10
<FORM> 
.PD 0
.TP
<INPUT> .................. TYPE (text, radio, checkbox, submit, reset, hidden), VALUE, SIZE, CHECKED
.TP
<SELECT> 
.TP
<OPTION> 
.TP
<TEXTAREA> ............... (NAME), COLS, ROWS
.PD
.RE
.TP
.B Comments
.PD
.RS
.ta 16
.ie n .TP 15
.el .TP 10
<!-- -->
.PD 0
.TP
<!- ->
.PD
.RE
.TP
.B &xxx;-type Escape Sequences
.PD
.RS
.ta 16
.ie n .TP 15
.el .TP 10
Almost all, from &lt; &gt; &amp; to &ensp; &emsp; &nbsp; &endash; &emdash; in HTML 3.0. 
.PD 0
.TP
(In the current version, they are not recognized within <INPUT TYPE="TEXT">.)
.PD
.RE
.TP
.B Recognized by Name Only
.PD
.RS
.ta 16
.ie n .TP 15
.el .TP 10
<ISINDEX> <BASEFONT> <SCRIPT> <BLINK>
.PD
.RE

.\"---------------------
.SH TAGS NOT RECOGNIZED
.\"---------------------
.PD
.RS
.ta 7
.ie n .TP 15
.el .TP 10
<BASE> <NEXTID>
.PD 0
.TP
<P>...</P> with attributes
.TP
Any non-GIF images, such as JPEG, XBM, etc.
.TP
&xxx;-type escape sequences within <INPUT TYPE="TEXT">
.TP
Anything related to FRAMEs
.PD
.RE

.\"----------
.SH EXAMPLES
.\"----------

Basically,
.IP
% webtex http://www.naochan.com/
.LP
makes \fBwebtex\fP fetch the HTML from the specified URL
(in this case, my top page), convert it to LaTeX format, and
write the result to the standard output.
.IP
% webtex -o naochan.tex http://www.naochan.com/
.LP
writes the output to the file \fInaochan.tex\fP instead of stdout.

If you already have the HTML file (e.g. naochan.html),
.IP
% webtex -o naochan.tex < naochan.html
.LP
reads the HTML data from the standard input instead of from a URL.

If you want to handle non-EUC Japanese text, insert a
Kanji-code converter like this:
.IP
% nkf -e < naochan.html | webtex > naochan.tex
.LP
If you don't have the HTML file yet, fetch it with the bundled utility, \fIgethtml\fP,
.IP
% gethtml http://www.naochan.com/ | nkf -e \\
    | webtex >naochan.tex
.LP
and you will get what you wanted.

.\"----------------------------
.SH AND ALSO THE GIF IMAGES...
.\"----------------------------

When the HTML includes GIF images, you can use the \fI\-g\fP option like this:
.IP
% webtex -g http://www.naochan.com/
.LP
to paste those images using \fInetpbm\fP and \fIepsbox.sty\fP, as far as
possible. (In the current version, the image files are saved in the /tmp/ directory.)

If you want (again) to handle non-EUC Japanese text, insert a
Kanji-code converter like this:
.IP
% nkf -e < naochan.html \\
    | webtex -b http://www.naochan.com/ \\
    > naochan.tex
.LP
This sets the base URL (for resolving relative URLs in <A HREF="...">) with the
\fI\-b\fP option and converts \fInaochan.html\fP into \fInaochan.tex\fP.

The ultimate form for handling Japanese HTML is:
.IP
% gethtml http://www.naochan.com/ | nkf -e \\
    | webtex -b http://www.naochan.com/ -g \\
             -o naochan.tex
.LP
With these commands, the home page is fetched from the specified URL
(http://www.naochan.com/), any Kanji code in the HTML is converted into a
\fBwebtex\fP-processable code, and the result is converted into LaTeX format and
saved in the file \fInaochan.tex\fP.
Since I often use these commands, I made them into a script, \fIw2t\fP.
.IP
% w2t http://www.naochan.com/
.LP
creates the LaTeX file \fIw2t.tex\fP directly. If you give anything as
the second argument of \fIw2t\fP, like:
.IP
% w2t http://www.naochan.com/ -v
.LP
\fIw2t\fP also calls \fIjlatex\fP(1) and \fIdviout\fP(1) automatically
to preview the result.
(*NB* When the HTML contains GIF images, \fIdviout\fP runs slowly.)
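.PP
The following is not the actual \fIw2t\fP script, only a rough sketch of how
such a wrapper might be written. The function name \fIw2t_sketch\fP is made up
for this example, and the way \fIjlatex\fP and \fIdviout\fP are invoked is an
assumption.
.PP
.RS
.nf
# w2t_sketch URL [anything]: fetch URL, convert it to w2t.tex, and
# preview the result when a second argument is given.
# (Illustration only; the real w2t may work differently.)
w2t_sketch() {
    url=$1
    gethtml "$url" | nkf -e | webtex -b "$url" -g -o w2t.tex || return 1
    if [ $# -ge 2 ]; then
        # Assumed invocations: jlatex writes w2t.dvi, dviout displays it.
        jlatex w2t.tex && dviout w2t.dvi
    fi
}
.fi
.RE
.PP
The pipeline is the same as the "ultimate form" shown above; only the output
file name is fixed to \fIw2t.tex\fP.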

.\"-----------------------
.SH SOFTWARE REQUIREMENTS
.\"-----------------------
Of course, you must have the LaTeX typesetting system.
You also need the following style files:
.PD
.RS
.ta 16
.ie n .TP 15
.el .TP 15
- a4.sty     (a4j.sty in Japanese mode)
.PD 0
.TP
- epsbox.sty
.TP
- webtex.sty (bundled with \fBwebtex\fP)
.PD
.RE

In order to handle GIF images (-g),
\fIgiftopnm\fP(1) and \fIpnmtops\fP(1) (from the \fInetpbm\fP tools) are needed.

To preview a result that includes GIF images, you need the TeX (dvi) previewer
\fIdviout\fP(1), version 2.39 or later (in Linux JE, etc.).

To print, you need either \fIdviprt\fP(1) version 2.39 (ordinary printer)
or \fIjdvi2kps\fP(1) (PostScript printer). Both are in Linux JE, etc.

This version of
.B webtex
does not detect the Kanji code automatically, so it is convenient
to use it with a Kanji-code converter such as \fInkf\fP(1).

If you have the TeX (dvi) preview and printing programs,
\fIdviout\fP(1) and \fIdviprt\fP(1), you will be happy.
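.PP
As a quick check before running \fBwebtex\fP, a small shell function along
the following lines can report which of the helper programs named above are
missing from your PATH. This is only a sketch: the function name
\fImissing_tools\fP is made up for this example and is not part of \fBwebtex\fP.
.PP
.RS
.nf
# missing_tools: print each named command that is not found in PATH.
# (Hypothetical helper, not shipped with webtex.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report anything needed for GIF handling and Kanji conversion.
missing_tools latex giftopnm pnmtops nkf
.fi
.RE
.PP
If nothing is printed, all four commands were found.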

.\"----------------
.SH RELEASE POLICY
.\"----------------
From version 0.95,
.B webtex
is released and distributed as 'freeware'
under the GNU General Public License.

You may not use the whole or any part of the source of
.B webtex
for commercial purposes.

.\"-------------------------
.SH DEVELOPMENT ENVIRONMENT
.\"-------------------------
Developed in yacc, lex, and C.

Up to version 0.91, I built it with
\fIbison\fP (the GNU version of \fIyacc\fP(1)) and flex.

From version 0.92 on, I build it with the Berkeley version of \fIyacc\fP.

Developed on a DEC HiNote CT475 running Linux 2.0.27.

.\"-------------
.SH BUG REPORTS
.\"-------------
If you find a bug in
.BR webtex ,
please report it to me <\fInaochan@naochan.com\fP>.
But first, please make sure that it really is a bug, and that it appears
in the latest version of
.B webtex
that you have.
.PP
ALL bug reports should include:
.PP
.PD 0
.TP 20
- Version of \fBwebtex\fP (please run 'webtex -v')
.TP
- Hardware and OS
.TP
- The HTML document that causes the problem
.TP
- Debug information (written to stderr when the -d option is given)
.PD
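.PP
One way to gather these items is sketched below. It uses only the
\fB\-v\fP and \fB\-d\fP options described above; the function name
\fIcollect_report\fP and the output file names are made up for this example.
.PP
.RS
.nf
# collect_report FILE: gather the items listed above for a bug report.
# (Hypothetical helper; report.txt and debug.txt are arbitrary names.)
collect_report() {
    webtex -v > report.txt 2>&1
    uname -a >> report.txt
    webtex -d < "$1" > /dev/null 2> debug.txt
}
.fi
.RE
.PP
For example, 'collect_report trouble.html' would leave the version and system
details in \fIreport.txt\fP and the debug messages in \fIdebug.txt\fP.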
.PP
I cannot promise a quick response,
nor that your suggestions will be reflected in the next version of \fBwebtex\fP.

.\"------------
.SH DISCLAIMER
.\"------------
.B webtex
is distributed with *no warranty* whatsoever,
and there are *no* user-support services.
The author and any other contributors take no responsibility for 
the consequences of its use.

Although \fBwebtex\fP has been tested, it may have errors that will
prevent it from working correctly on your system.  Some of these
errors may cause serious problems, including loss of data.

The grammar of HTML (or its usage) may differ between browsers, and
there are many religions about LaTeX layout, so I cannot follow
everyone's preferences.

The current version pays little attention to robustness
when handling 'grammatically incorrect' HTML files.

.\"------
.SH BUGS
.\"------
This man page was translated (by me) from the Japanese manual (also by me).
I translated it quickly (in 2.5 hours),
so it may contain many mistakes.
If you find any, please e-mail me!

.\"-----------
.SH COPYRIGHT
.\"-----------
.B webtex
is Copyright (c) 1997 Naoya Tozuka. All rights reserved.

It may be used, copied and modified under the terms of the GNU
General Public License.

Commercial utilisation of whole or part of the source of
.B webtex
is prohibited.

.\"--------------------
.SH NEWEST INFORMATION
.\"--------------------
The latest information about webtex may be found at:
.IP
http://www.naochan.com/nc/it/soft/webtex/index-en.html
.LP
in English.
(Japanese information is in the same directory, index-jp.html.)

.\"------------
.SH "SEE ALSO"
.\"------------
.PD 0
.TP
\fIlatex\fP(1),
.TP
\fIgiftopnm\fP(1), \fIpnmtops\fP(1)
.PD

.\"--------
.SH AUTHOR
.\"--------
Naoya TOZUKA <\fInaochan@naochan.com\fP>

