https://mamchenkov.net/wordpress/2014/08/05/textract-extract-text-from-any-document/
textract - extract text from any document