Top "Extraction" questions

Questions related to retrieving specific information from a (typically minimally structured) data source, such as a web site, media file, source code collection or compressed archive (in which case the desired information is one or more original, uncompressed files).

How to extract the source code from a *.jar file on a Mac?

I'm very confused. I downloaded a *.jar file as a bit of software. So, I would like to extract the …

java jar extraction

Extract the LSB from a byte in Python

I have a byte in variable 'DATA'. I want to extract the LSB bit out of it and print it. …

python-3.x byte extraction lsb

What algorithm does Readability use for extracting text from URLs?

For a while, I've been trying to find a way of intelligently extracting the "relevant" text from a URL by …

javascript asp.net extraction

Extracting information from PDFs of research papers

I need a mechanism for extracting bibliographic metadata from PDF documents, to save people entering it by hand or cut-and-pasting …

pdf metadata extraction

PDF table extraction

I have the same data saved as a GIF image file and as a PDF file, and I want to parse …

pdf pdfbox extraction

Stroke Width Transform (SWT) implementation (Java, C#...)

I recently discovered the stroke width transform, as documented in the following research paper: Detecting Text in Natural Scenes with …

c# java image-processing ocr extraction

Extract text after a symbol in R

sample1 = read.csv("pirate.csv") sample1[,7] [1] >>xyz>>hello>>mate 1 [2] >>xyz>>…

regex r text-mining extraction

Function names extraction from static library

I have a static library, static_library.a. How can I list the functions and methods implemented there, or at least how …

unix extraction static-libraries

OpenCV: How to get inlier points using findHomography()/findFundamental() and RANSAC

OpenCV does not provide a RANSAC function per se, or at least not in a form that you can just call …

opencv mask extraction points ransac

How can we extract text, including spaces, from a PDF using iTextSharp?

I am using the method below to extract PDF text line by line. But the problem is that it is not reading spaces …

c# pdf extract extraction pdf-reader