textract – extract text from any document.  Currently supports .doc, .docx, .eml, .json, .html, .pptx, .pdf, and .txt.
Tag: command line
cool old term – terminal emulator which mimics the old cathode display
Validating website HTML, CSS, and links from the command line
When working on a long-running project, it’s easy to lose track of HTML and CSS standards compliance. Link rot is also a common occurrence. Fortunately, there are command line tools that can be run on a regular basis (think weekly or monthly cron jobs) to check the site and report any issues with it. Here is one way to do it.
Installation on Fedora:
yum install linkchecker
yum install python-tidy
yum install python-cssutils
Example command line:
linkchecker -t20 --check-html --check-css https://mamchenkov.net
See the linkchecker manual for more options.
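To run this from cron, the command can be wrapped in a small shell function. The following is only a sketch: the function name check_site and the report path are made up for illustration, and it assumes that linkchecker exits with a non-zero status when it finds problems (check the manual for your version, since the available options have changed between releases).

```shell
#!/bin/sh
# Hypothetical helper, not part of linkchecker itself.
# Usage: check_site URL REPORT_FILE
check_site() {
    url="$1"
    report="$2"
    # Same options as the example command line above; all output,
    # including errors, is saved to the report file.
    if linkchecker -t20 --check-html --check-css "$url" > "$report" 2>&1; then
        echo "OK: $url"
    else
        # Assumption: a non-zero exit status means linkchecker found issues
        # (or failed to run), so point the reader at the saved report.
        echo "ISSUES: $url (see $report)"
        return 1
    fi
}
```

A weekly or monthly cron job could then source this file and call, for example, check_site https://mamchenkov.net /tmp/linkchecker-report.txt, leaving it to cron to mail the one-line summary.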
HTTPie – command line HTTP client, a user-friendly cURL replacement
Main features:
- Expressive and intuitive syntax
- Formatted and colorized terminal output
- Built-in JSON support
- Forms and file uploads
- HTTPS, proxies, and authentication
- Arbitrary request data
- Custom headers
- Persistent sessions
- Wget-like downloads
- Python 2.6, 2.7 and 3.x support
- Linux, Mac OS X and Windows support
- Documentation
- Test coverage
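To give a feel for the syntax, here is a rough sketch of two of the features listed above (custom headers with JSON data, and form submission). The function names put_json and post_form, the X-API-Token value, and the field names are placeholders made up for illustration; only the http command and its item syntax come from HTTPie.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the HTTPie "http" command.

# Send a JSON PUT request with a custom header.
# Header items use Name:Value, JSON string fields use key=value.
put_json() {
    http PUT "$1" X-API-Token:123 name="$2"
}

# Submit the same kind of key=value data as a form (-f) instead of JSON.
post_form() {
    http -f POST "$1" hello="$2"
}
```

For example, put_json example.org John would send {"name": "John"} to example.org together with the X-API-Token header, and print the formatted, colorized response in the terminal.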