Plucker Electronic Books mini-HOWTO
Many people use Plucker to read electronic books, or "ebooks", but aren't always sure where to go to find them. There are many resources available that have free and open books you can fetch and read with Plucker. I'll try to go through some of the more-popular sites, and give you some examples describing how to Pluck them.
To begin with, you'll need to have Plucker installed (of course) and configured for your Palm device (colors, storage, and so on). Once that part is done, the rest is fairly simple.
Here are two of the more-popular "free" ebook and electronic text ("etext") sites you can use. I'll go through each of them and describe the best way to Pluck the content. They each have their own little quirks and options:
- Baen Free Ebook Library
Eric Flint started the Free Library to give people a free way to read electronic copies of books and articles online. More about the history of Baen can be found on the Baen site.
- Project Gutenberg
Project Gutenberg was started in 1971 by Michael Hart, at the Materials Research Lab at the University of Illinois.
The Project Gutenberg Philosophy is to make information, books and other materials available to the general public in forms a vast majority of the computers, programs and people can easily read, use, quote, and search.
You can read more about the history of Project Gutenberg on their site.
There are also quite a few sites that produce electronic books in Plucker format directly, so you don't have to fetch and convert them yourself. Here are five of the most popular ones:
- Byron's Emporium
- PluckerBooks
PluckerBooks is a website that offers stripped-down HTML versions of popular public domain literary works, suitable for viewing on handheld computing devices.
- MemoWare Plucker Ebooks
MemoWare, a division of Handmark, is a unique collection of thousands of documents (databases, literature, maps, technical references, lists, etc.) specially formatted to be easily added to your PalmOS device, Pocket PC, Windows CE, EPOC, Symbian or other handheld device. The documents available here come in a variety of formats and cover a wide range of topics.
- Bandersnatch Unpress
- Linux Documentation Project
The "Linux Documentation Project" (LDP) now releases all its HOWTOs and mini-HOWTOs in Plucker format.
To Pluck electronic books ("ebooks" in the parlance), you will need to know the URL of the HTML or text version of the book you want to read in Plucker, or you will have to download the HTML or text version to your local machine and process it there.
This is one of the most flexible features of Plucker: it can convert local files as well as remote ones. Why would you want to convert a local file when you can just fetch it from the URL? Because in many cases you might want to change the document or webpage before you convert it with Plucker, so that it looks better on a Palm than it does in a web browser.
Once you know the URL of the ebook you wish to read, simply point Plucker to it, and fetch away. Make sure that you have the right depth, stayonhost, stayondomain, and similar values for the site you're plucking. These vary widely from site to site, so you'll have to tinker and test a bit before you know what works well for you.
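To make the tinkering a little more repeatable, here is a minimal Python sketch that assembles a plucker-build command line from those settings. It only builds the argument list and does not run anything. The "-H" flag is the one mentioned later in this article; the long option spellings (--maxdepth, --stayonhost, --stayondomain, --staybelow) are written as I recall them and should be treated as assumptions, so check them against "plucker-build --help" on your own system. The function name and the example.org URL are made up for illustration.

```python
from typing import List, Optional


def build_pluck_command(home_url: str,
                        max_depth: int = 1,
                        stay_on_host: bool = False,
                        stay_on_domain: bool = False,
                        stay_below: Optional[str] = None) -> List[str]:
    """Assemble an argument list for plucker-build (nothing is executed)."""
    # -H is the home URL argument described in this article.
    # The long option names below are assumptions; verify with --help.
    cmd = ["plucker-build", "-H", home_url, f"--maxdepth={max_depth}"]
    if stay_on_host:
        cmd.append("--stayonhost")      # do not leave the starting host
    if stay_on_domain:
        cmd.append("--stayondomain")    # do not leave the starting domain
    if stay_below is not None:
        cmd.append(f"--staybelow={stay_below}")  # only follow URLs under this prefix
    return cmd


# Hypothetical example: a book whose chapters live one link below the index.
print(" ".join(build_pluck_command("http://www.example.org/book/index.html",
                                   max_depth=2, stay_on_host=True)))
```

Keeping the settings in one place like this makes it easy to try a depth of 1, then 2, and compare what ends up on the device.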
Baen Free Library with Plucker
If you wanted to fetch some books from the Baen Free Library, for example, you would point your web browser (not Plucker) to the following URL:
http://www.webscription.net/free/
From there, you can see a list of ebooks, by title, as well as a "download" link on the right. Clicking on a title will pop up another browser window with the first chapter in the main frame. Since Plucker does not support frames, you'll want to click on the "frameless" link on the left side of the page. Now look in your browser's URL field, and you will notice that the URL changed to add 3 underscore characters at the end of the ebook hash, before the chapter number. This means an ebook of 0671319752.htm in frames would be 0671319752___1.htm without frames. You won't really need to know this unless you're processing these ebooks with automated tools (as I do here, with Perl).
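If you do want to automate that renaming, the rule can be sketched in a few lines of Python. This takes the description above literally (three underscores, then the chapter number, before the .htm extension) and is not the Perl I actually use; the function name is made up for illustration, and the pattern assumes the book identifier contains only letters and digits.

```python
import re


def frameless_name(framed_name: str, chapter: int = 1) -> str:
    """Map a framed page name to its frameless chapter page name."""
    # Assumes the framed name is "<book id>.htm" with an alphanumeric id.
    match = re.fullmatch(r"(?P<book>[0-9A-Za-z]+)\.htm", framed_name)
    if match is None:
        raise ValueError(f"unexpected page name: {framed_name!r}")
    # Three underscores, then the chapter number, as described in the text.
    return f"{match.group('book')}___{chapter}.htm"


print(frameless_name("0671319752.htm"))
```

Raising an error on names that do not match is deliberate: if the site changes its naming scheme, it is better for a batch job to stop than to pluck the wrong pages.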
In that new page you're viewing, the one without frames, you can now grab the URL in the URL field of your browser, and begin spidering that with Plucker. If you use Windows, you can put this URL in as your starting page in Plucker Desktop or JPluck. If you're using Linux or BSD, you can just use plucker-build, and put the URL in as the "-H" argument to the Python distiller code.
Another option is to click on the "Cover" link on the right when viewing the main framed page, and pluck from there instead, drilling down with a proper stayondomain/staybelow declaration. You want to make sure you stay on webscription.net and don't traverse offsite from this point; otherwise you'll get much more content than you need.
Another option is to download the HTML directly to your local machine, using the "download" link on the main Baen webscription page. From there, you just unpack it onto your local drive, and spider it locally, making sure to use the proper stayonhost, so it doesn't follow links off of your system to the internet proper.
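The unpack-then-spider step can be sketched in Python as follows. I am assuming here that the download is a zip archive and that you know the name of the starting page inside it; neither is guaranteed, and the function and file names are made up for illustration. The function unpacks the archive and returns a file: URL that you can hand to your distiller as the starting page.

```python
import zipfile
from pathlib import Path


def unpack_and_locate(archive: str, dest: str, start_page: str) -> str:
    """Unpack a zip archive into dest and return a file: URL for start_page."""
    dest_dir = Path(dest).resolve()
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)          # creates dest_dir if it is missing
    page = dest_dir / start_page
    if not page.is_file():
        raise FileNotFoundError(f"{start_page} not found in {archive}")
    # as_uri() needs an absolute path, which resolve() above provides.
    return page.as_uri()
```

For example, unpack_and_locate("book.zip", "baen", "index.htm") would return something like file:///home/you/baen/index.htm, which is then used as the home URL together with the stayonhost setting.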
Project Gutenberg with Plucker
Project Gutenberg differs from Baen and other sites in that it provides its books in text-only format as well as HTML format. The benefit of text-only is that you can format them however you wish before parsing them with Plucker, or you can just pluck them directly, with the 'file:/' method to the distiller you use (plucker-build, Plucker Desktop, JPluck, etc.).
I personally prefer the text-only versions, which I reformat with some Perl using a module called Text::Wrap, but you don't need to do that. I just use it to reflow the text to suit my reading habits.
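If you would like to try the same kind of reflowing without Perl, here is a rough Python equivalent using the standard textwrap module. It is not my Text::Wrap script, just a sketch of the idea: it treats blank lines as paragraph breaks, joins each paragraph's hard-wrapped lines, and re-wraps to a width you choose. The function name and the default width of 60 are arbitrary.

```python
import re
import textwrap


def reflow(text: str, width: int = 60) -> str:
    """Re-wrap each blank-line-separated paragraph to the given width."""
    # Split on one or more blank lines to find paragraph boundaries.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Collapse the original line breaks and runs of spaces, then wrap again.
    wrapped = [textwrap.fill(" ".join(p.split()), width=width)
               for p in paragraphs]
    return "\n\n".join(wrapped)


sample = "It was the best\nof times,\nit was the worst of times.\n\n\nSecond   paragraph."
print(reflow(sample, width=40))
```

Note that this flattens anything that relies on line breaks, such as poetry or tables, so it suits plain prose best.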
To pluck these etexts, simply point to the .txt or .htm versions as they appear on the Gutenberg website, and you should be all set. They are entirely self-contained within one file, so a maxdepth of '1' should work in most cases.
There really is nothing magical about Plucking ebooks, etexts, or online articles. In most cases, you just have to make sure they are formatted to best suit your needs, and that you don't spider more than you need inside the document. A combination of the proper maxdepth, stayonhost, staybelow, and stayondomain can ensure that the final document contains just what it should, and no more.
If you have any other suggestions, comments, or corrections to this (or any other) article, please feel free to contact me directly, and I'll attend to it right away.
Good luck, and keep on Plucking!
If you find Plucker useful, and want to support the project, please visit our donate page to find out how to contribute to help support Plucker's development and maturity.
The Plucker Team
plucker-dev@plkr.org