XSolvo
Diff: About html2xml
 Ownership:  rw-rw-r-- stga wheel
 Modified:  today, 21:26
 Modified by:  Stefan Garpefält (stga)
Rev.:  13 (Old)
 
 Ownership:  rw-rw-r-- stga wheel
 Modified:  today, 21:29
 Modified by:  Stefan Garpefält (stga)
Rev.:  14 (Current)


+ %TITLE%
<plugin Embed src = "http://www.xsolvo.com/gallery2/d/158-2/html2xmlScreenShoot.png" alt = "screenshot of html2xml application" style="float:left; margin: 5px 10px 5px 10px;">
XSolvo html2xml is a free program that converts HTML data to XML.

HTML is a loosely structured format, which makes parsing and processing data from HTML files difficult. Working with the same data as XML is much simpler.

With the file in XML you can use common utilities and tools to process the data.

So, get rid of the HTML parsing problem: convert your HTML pages to XML with html2xml, then process them as you like with XML parsers, XSLT, and similar tools.
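As a rough illustration of what processing the converted data can look like, here is a minimal Python sketch that pulls all links out of an XML document using only the standard library. The sample markup in `converted` is a made-up stand-in for converter output; the actual element layout that html2xml produces is not documented on this page and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical converter output: the real html2xml output layout may differ.
converted = """<html>
  <body>
    <p>See <a href="http://www.xsolvo.com">XSolvo</a> and
       <a href="http://bugs.xsolvo.com">the bug tracker</a>.</p>
  </body>
</html>"""


def extract_links(xml_text):
    """Return (href, link text) pairs for every <a> element in the document."""
    root = ET.fromstring(xml_text)
    return [
        (a.get("href"), "".join(a.itertext()).strip())
        for a in root.iter("a")
    ]


links = extract_links(converted)
for href, text in links:
    print(href, text)
```

Because the input is well-formed XML, no HTML-specific error recovery is needed; any standard XML parser would work the same way.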

++ Current version

The current version is 1.2.0.4. You can download it from the ((download|XSolvo download page)(download)).

See the ((html2xml ChangeLog)) for the changes between releases.

A command-line version of the program has been added.

html2xml now supports converting HTML from the following code pages:

<pre>
ASMO-708 DOS-720 iso-8859-6 arabic csISOLatinArabic
ECMA-114 ISO_8859-6 ISO_8859-6:1987 iso-ir-127 x-mac-arabic
windows-1256 ibm775 CP500 iso-8859-4 csISOLatin4
ISO_8859-4 ISO_8859-4:1988 iso-ir-110 l4 latin4
windows-1257 ibm852 iso-8859-2 csISOLatin2 iso_8859-2
iso_8859-2:1987 iso8859-2 iso-ir-101 l2 latin2
x-mac-ce windows-1250 x-cp1250 cp866 ibm866
cp1251 iso-8859-5 csISOLatin5 csISOLatinCyrillic cyrillic
ISO_8859-5 ISO_8859-5:1988 iso-ir-144 iso8859-5 koi8-r
csKOI8R koi koi8 koi8r koi8-u
koi8-ru x-mac-cyrillic windows-1251 cp1251 x-Europa
x-IA5-German ibm737 cp737 iso-8859-7 csISOLatinGreek
ECMA-118 ELOT_928 greek greek8 ISO_8859-7
ISO_8859-7:1987 iso-ir-126 x-mac-greek windows-1253 ibm869
cp869 DOS-862 iso-8859-8-i logical iso-8859-8
csISOLatinHebrew hebrew ISO_8859-8 ISO_8859-8:1988 ISO-8859-8
iso-ir-138 visual x-mac-hebrew windows-1255 ISO_8859-8-I
ISO-8859-8 visual CP870 CP1026 ibm861
iso-8859-3 csISOLatin3 ISO_8859-3 ISO_8859-3:1988 iso-ir-109
l3 latin3 iso8859_3 iso-8859-15 csISOLatin9
ISO_8859-15 l9 latin9 x-IA5-Norwegian IBM437
437 cp437 csPC8 CodePage437 x-IA5-Swedish
windows-874 DOS-874 iso-8859-11 TIS-620 cp874
ibm857 cp857 iso-8859-9 csISOLatin5 ISO_8859-9
ISO_8859-9:1989 iso-ir-148 l5 latin5 x-mac-turkish
windows-1254 cp1254 ISO_8859-9 ISO_8859-9:1989 iso-8859-9
iso-ir-148 latin5 us-ascii ANSI_X3.4-1968 ANSI_X3.4-1986
ascii cp367 csASCII IBM367 ISO_646.irv:1991
ISO646-US iso-ir-6us windows-1258 ibm850 x-IA5
iso-8859-1 cp819 csISOLatin1 ibm819 iso_8859-1
iso_8859-1:1987 iso8859-1 iso-ir-100 l1 latin1
macintosh Windows-1252 ANSI_X3.4-1968 ANSI_X3.4-1986 ascii
cp367 cp819 csASCII IBM367 ibm819
ISO_646.irv:1991 iso_8859-1 iso_8859-1:1987 ISO646-US iso8859-1
iso-8859-1 iso-ir-100 iso-ir-6 latin1 us
us-ascii x-ansi ascii cp1250 cp1251
cp1252 cp1253 cp1254 cp1255 cp1256
cp1257 cp1258 cp437 cp737 cp775
cp850 cp852 cp855 cp856 cp857
cp860 cp861 cp863 cp864 cp865
cp866 cp869 cp874 iso8859_10 iso8859_13
iso8859_14 iso8859_15 iso8859_16 iso8859_1 iso8859_2
iso8859_3 iso8859_4 iso8859_5 iso8859_6 iso8859_7
iso8859_8 iso8859_9 koi8_r
</pre>
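To make the role of a code page concrete, the following Python sketch decodes the same word from two different Cyrillic code pages into one Unicode string. It only illustrates the general idea of code page conversion; it is not how html2xml itself is implemented, and the charset names used are Python's codec names, which happen to overlap with some names in the list but are not guaranteed to match all of them.

```python
import codecs


def to_unicode(raw, charset):
    """Decode raw bytes from the named code page into a Unicode string."""
    codecs.lookup(charset)  # raises LookupError if the name is unknown
    return raw.decode(charset)


# The same Russian word stored as different bytes in two code pages,
# plus a Latin-1 sample. These byte strings are illustrative test data.
samples = {
    "windows-1251": b"\xcf\xf0\xe8\xe2\xe5\xf2",
    "koi8-r": b"\xf0\xd2\xc9\xd7\xc5\xd4",
    "iso-8859-1": b"caf\xe9",
}

decoded = {name: to_unicode(raw, name) for name, raw in samples.items()}
for name, text in decoded.items():
    print(name, text)
```

Decoding with the wrong code page does not fail for most single-byte charsets; it silently yields the wrong characters, which is why the source charset has to be identified correctly before conversion.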

++ Support

You can report bugs and enhancement requests at http://bugs.xsolvo.com. Use the html2xml queue.

++ Comments

To handle every character set that can appear in HTML, the output file is written in Unicode format.
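The short Python sketch below shows why a Unicode output format is a reasonable choice: text that mixes Latin, Cyrillic, and Greek characters cannot be stored in a single legacy code page, but it round-trips cleanly through a Unicode encoding. UTF-8 is an assumption made for this example; this page only says "Unicode format" and does not state which encoding html2xml writes.

```python
import xml.etree.ElementTree as ET

# Text that no single legacy single-byte code page can represent at once.
mixed = "café Привет Ελληνικά"

root = ET.Element("p")
root.text = mixed

# Assumption: UTF-8 is used here; the actual html2xml encoding may differ.
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# Parsing the bytes back yields the original text unchanged.
parsed = ET.fromstring(xml_bytes)

# A legacy code page such as ISO-8859-1 cannot hold the same text.
try:
    mixed.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

print(parsed.text, fits_latin1)
```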

<table align="right" cellspacing="0" border="0" style="text-align:right">
| <sub>views: %VIEWS% </sub>
</table>