| [SOLVED] I want to write a C program to convert .doc files to .html files. For this I need to know the format of .doc files and of picture files (.bmp, .jpeg, .gif, ...). If anyone knows the formats of these files, please help me so that I can write the program myself. Actually, I don't have enough space on my drive for MS Office, and I am on holiday now, so I thought I could do some useful programming. Even if I am not successful, I will get to know the formats of some files. Thanks |
| |
| |||
| There are numerous existing tools that do this already, but if you want to go through the exercise... For DOC, you can contact Microsoft and request the specifications. You will need to sign an NDA, but they should be available free of charge. For the other file formats, look up each format's specification and SDK in a search engine. |
| |||
| Whether you know it or not, you are asking for a lot. The MS Word .doc format is binary, with embedded bitmaps for the graphics. MS Word embeds a bitmap for positioning the graphics on the computer monitor, and other code within MS Word then converts that bitmap, if need be, for high-quality output. On the other hand, .html files are ASCII or Unicode text that reference .gif, .jpg or .png graphics. Writing a program, especially in C, that reads an MS binary file stream until EOF and anticipates every possible page layout (two columns, one column, multiple pages, formatting or no formatting, headers, page numbers, footers, etc.) would, my friend, be a long-term project. If you are determined to do it in C, you might first see whether you can read the binary .doc file and print it to the terminal (stdout). Java would save you years of development on such a program compared with C. If you are willing to look at other possibilities, you might investigate SourceForge.org and OpenOffice.org for open source programs with source code you can look at. |