pandoc - universal document converter

GoManutd
Forum Guide
Posts: 2952
Joined: Mon Jun 30, 2008 8:06 pm

#1 Postby GoManutd » Fri Jun 01, 2012 11:40 am

From time to time, questions have come up about the best way to batch-convert documents.

I happened across a solution called pandoc. I haven't tried it yet, but it looks promising.

http://johnmacfarlane.net/pandoc/

Jerry3904
Forum Veteran
Posts: 15026
Joined: Wed Jul 19, 2006 6:13 am

#2 Postby Jerry3904 » Fri Jun 01, 2012 11:43 am

Certainly looks interesting...thanks.
Production: 4.2.0-0.bpo.1-amd64, MX-15 RC1, AMD FX-4130 Quad-Core, GeForce GT 630/PCIe/SSE2, 8 GB, Kingston SSD 120 GB and WesternDigital 1TB
Testing: AAO 722: 3.16-0-4-686-pae. MX-15, AMD C-60 APU, 4 GB

fatjak
Forum Regular
Posts: 479
Joined: Fri Jul 28, 2006 8:19 pm

#3 Postby fatjak » Fri Jun 01, 2012 4:07 pm

Interesting, but I don't see anything about PDF files in the mix. I could use a simple PDF-to-text converter. If you know of anything that doesn't require a degree to use, I would be happy to try it. I played with Poppler, I think it's called, but it complained about the files being linear or something similar. I'm not up on things like that, but would like to break out a paragraph now and then instead of putzing for ever.

chippy52
Forum Regular
Posts: 372
Joined: Wed Jul 29, 2009 6:05 pm

#4 Postby chippy52 » Fri Jun 01, 2012 4:21 pm

I have used this site for file conversion, especially when MS changed their formats.

http://www.zamzar.com/
Mepis 11 64bit Linux 3.2.0-0.bpo.4-amd64, KDE 4.5.1
Intel i5 2400, Asus P8H67-M-EVO, G-Skill Ripjaws 2x4GB DDR3-1333, nVidia GeForce GT430, Seagate 500GB sata3 HDD

lucky9
Forum Veteran
Posts: 12271
Age: 70
Joined: Wed Jul 12, 2006 5:54 am

#5 Postby lucky9 » Fri Jun 01, 2012 5:36 pm

fatjak wrote:would like to break out a paragraph now and then instead of putzing for ever.


Wouldn't highlighting the text and Copy/Paste into an empty KWrite document do for a paragraph or two?
Yes, even I am dishonest. Not in many ways, but in some. Forty-one, I think it is.
--Mark Twain

GoManutd
Forum Guide
Posts: 2952
Joined: Mon Jun 30, 2008 8:06 pm

#6 Postby GoManutd » Fri Jun 01, 2012 5:42 pm

It can produce PDF, but you need to have LaTeX installed, too.

This tool isn't meant for a few docs, but rather for batch conversions of many docs.
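
For anyone wanting to try the batch idea, here is a minimal sketch in Python, not a tested recipe. It assumes pandoc is installed and on the PATH, and it relies on pandoc guessing the input and output formats from the file extensions. The function names (`pandoc_command`, `batch_convert`) and the folder names in the usage note are made up for illustration.

```python
import subprocess
from pathlib import Path


def pandoc_command(src: Path, out_dir: Path, to_ext: str) -> list[str]:
    """Build the pandoc command line for a single file.

    pandoc infers both formats from the file extensions, so the only
    option needed is -o with the output file name.
    """
    target = out_dir / (src.stem + "." + to_ext)
    return ["pandoc", str(src), "-o", str(target)]


def batch_convert(in_dir: Path, out_dir: Path, from_ext: str, to_ext: str,
                  dry_run: bool = False) -> list[list[str]]:
    """Convert every *.from_ext file in in_dir, writing results to out_dir.

    With dry_run=True the commands are only built and returned, so you can
    inspect them before anything is executed.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    sources = sorted(in_dir.glob("*." + from_ext))
    commands = [pandoc_command(src, out_dir, to_ext) for src in sources]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first file pandoc rejects
            subprocess.run(cmd, check=True)
    return commands
```

Something like `batch_convert(Path("docs"), Path("converted"), "md", "html")` would then convert a whole folder in one go. Asking for `"pdf"` as the target only works if LaTeX is installed, as noted above.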

fatjak
Forum Regular
Posts: 479
Joined: Fri Jul 28, 2006 8:19 pm

#7 Postby fatjak » Sat Jun 02, 2012 12:26 pm

lucky9 wrote:
fatjak wrote:would like to break out a paragraph now and then instead of putzing for ever.


Wouldn't highlighting the text and Copy/Paste into an empty KWrite document do for a paragraph or two?


You would think so, but I haven't managed to find the key to doing so. All I get is a hand sliding up and down the page; I can't figure out how to highlight the text, and there are no right-click options...?? Something unusual about these PDF files, I guess. Document handling is not my long suit; wrench in hand and I'm at home.

lucky9
Forum Veteran
Posts: 12271
Age: 70
Joined: Wed Jul 12, 2006 5:54 am

#8 Postby lucky9 » Sat Jun 02, 2012 12:54 pm

Might be a function of the PDF reader. I remember using Word for the first time and losing it when I couldn't use the _ to underline a few words. Computer nerds are just arcane.

Gaer Boy
Forum Guide
Posts: 1956
Age: 80
Joined: Sat Jun 06, 2009 6:06 am

#9 Postby Gaer Boy » Sat Jun 02, 2012 12:58 pm

fatjak wrote:
lucky9 wrote:
fatjak wrote:would like to break out a paragraph now and then instead of putzing for ever.


Wouldn't highlighting the text and Copy/Paste into an empty KWrite document do for a paragraph or two?


You would think so, but I haven't managed to find the key to doing so. All I get is a hand sliding up and down the page; I can't figure out how to highlight the text, and there are no right-click options...?? Something unusual about these PDF files, I guess. Document handling is not my long suit; wrench in hand and I'm at home.

There are two sorts of PDF files: those created using font details, and those created from an image. You can extract text from the former, but not the latter. Image PDFs tend to be much larger than font-based PDFs. It's also possible to create a hybrid PDF, where some elements are font-based and others are images. This is an example:
2012cert12-133.pdf

Open it in Okular, click the last icon on the toolbar (a sort of box with a pencil across it) which is the selection tool. You will find that you can extract text from the bold black areas (which are text-based and specific to the event) but can only extract an image from the other text areas (these are created as an image template).

Other PDF readers work similarly. If the text looks a bit fuzzy, you can be fairly certain it's an image-based PDF.
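
If you would rather check from a script than by eye, here is a rough sketch in Python. It assumes the `pdftotext` command from poppler-utils is installed. The helper names and the 20-character threshold are arbitrary choices for illustration: a page that yields almost no characters is probably image-based, but this is only a heuristic.

```python
import subprocess


def pdftotext_command(pdf_path: str, first_page: int = 1, last_page: int = 1) -> list[str]:
    """Build a pdftotext call for a page range.

    -f and -l select the first and last page; the trailing "-" sends the
    extracted text to standard output instead of a file.
    """
    return ["pdftotext", "-f", str(first_page), "-l", str(last_page), pdf_path, "-"]


def looks_text_based(extracted: str, min_chars: int = 20) -> bool:
    """Heuristic: count non-whitespace characters in the extracted text.

    Image-only pages usually come back empty or nearly so. The threshold
    is an assumption, not a rule.
    """
    return len("".join(extracted.split())) >= min_chars


def extract_text(pdf_path: str, first_page: int = 1, last_page: int = 1) -> str:
    """Run pdftotext and return whatever text it could pull out."""
    result = subprocess.run(
        pdftotext_command(pdf_path, first_page, last_page),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `looks_text_based(extract_text("some.pdf"))` gives a quick yes or no for the first page. A hybrid file like the example above can pass the check while still containing image-only areas.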

Phil

AsRock FM2A88X-ITX+, A8-6500, 8GB, 120GB Samsung SSD (GPT), 1TB HDD (MBR), MX-15, MX-14.4
Acer Aspire One 150, Atom N270, 120GB HDD, MX-14.3

fatjak
Forum Regular
Posts: 479
Joined: Fri Jul 28, 2006 8:19 pm

#10 Postby fatjak » Sat Jun 02, 2012 1:44 pm

Well, I found the key right under my nose, as usual. In Okular, under Tools, the selection tool lets you draw a box around the item; then the right-click context menu works to copy to the clipboard, etc. Couldn't see the forest for the trees. Hard to teach an old dog new tricks at times.

Thanks for the input, guys; it finally made the wheels go round.
