Architecture guide
This document outlines the architecture that forms the basis of the CardScan framework.

Overview

The CardScan framework operates like a pipeline. Images from a data source (e.g. a camera) are processed by a pool of analyzers in a loop. The results from those analyzers are aggregated by a voting algorithm, and a final result is returned to the app.
CardScan pipeline
The pipeline consists of the following basic concepts:

Camera Adapters

First and foremost, CardScan processes images from a device camera. The scan-camera android module provides an abstraction over the Camera1 and Camera2 Android APIs. It exposes a simple interface for initializing the camera, attaching it to a preview View, and streaming images for processing.

Example

This code example uses a Camera2Adapter to stream preview images from the device camera and display the preview in a TextureView.
Kotlin

class MyCameraActivity : AppCompatActivity(), CameraErrorListener {
    private val cameraAdapter by lazy {
        Camera2Adapter(
            activity = this,

            // A TextureView where the preview will show. If null, no preview
            // will be shown.
            previewView = textureView,

            // The minimum image resolution that should be streamed.
            minimumResolution = MINIMUM_RESOLUTION,
            cameraErrorListener = this,
        )
    }

    /**
     * Call this method to start streaming images from the camera.
     */
    fun startProcessingCameraImages() {
        cameraAdapter.bindToLifecycle(this)

        // getImageStream() returns a flow, so collect it from a coroutine.
        lifecycleScope.launch {
            cameraAdapter.getImageStream().collect { processCameraImage(it) }
        }
    }

    private fun processCameraImage(previewFrame: Bitmap) {
        // Do something with the preview frame
    }

    override fun onCameraOpenError(cause: Throwable?) {
        // The camera could not be opened
    }

    override fun onCameraAccessError(cause: Throwable?) {
        // The camera could not be accessed
    }

    override fun onCameraUnsupportedError(cause: Throwable?) {
        // The camera is not supported on this device
    }

    companion object {
        // Most CardScan models require a minimum resolution of 1280x720
        private val MINIMUM_RESOLUTION = Size(1280, 720)
    }
}
Java

public class MyCameraActivity
        extends AppCompatActivity
        implements CameraErrorListener {

    // Most CardScan models require a minimum resolution of 1280x720
    private static final Size MINIMUM_RESOLUTION = new Size(1280, 720);

    private CameraAdapter<Bitmap> cameraAdapterInstance = null;

    private CameraAdapter<Bitmap> getCameraAdapter() {
        if (cameraAdapterInstance == null) {
            cameraAdapterInstance = new Camera2Adapter(
                /* activity */ this,

                // A TextureView where the preview will show. If null, no
                // preview will be shown.
                /* previewView */ (TextureView) findViewById(R.id.textureView),

                // The minimum image resolution that should be streamed.
                /* minimumResolution */ MINIMUM_RESOLUTION,
                /* cameraErrorListener */ this
            );
        }

        return cameraAdapterInstance;
    }

    /**
     * Call this method to start streaming images from the camera.
     */
    public void startProcessingCameraImages() {
        final CameraAdapter<Bitmap> cameraAdapter = getCameraAdapter();
        cameraAdapter.bindToLifecycle(this);
        cameraAdapter.getImageStream().collect(
            (bitmap, continuation) -> {
                processCameraImage(bitmap);
                Coroutine.resumeJava(continuation, Unit.INSTANCE);
                return Unit.INSTANCE;
            },
            new EmptyJavaContinuation<>()
        );
    }

    private void processCameraImage(@NotNull Bitmap previewFrame) {
        // Do something with the preview frame
    }

    @Override
    public void onCameraOpenError(@Nullable Throwable cause) {
        // The camera could not be opened
    }

    @Override
    public void onCameraAccessError(@Nullable Throwable cause) {
        // The camera could not be accessed
    }

    @Override
    public void onCameraUnsupportedError(@Nullable Throwable cause) {
        // The camera is not supported on this device
    }
}

Analyzers

Analyzers represent the smallest single unit of the pipeline: the ML models. An analyzer takes a single input, processes it, and returns a single output.
Analyzer
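As a sketch, an analyzer can be modeled as a single-method interface. The names below are illustrative rather than the SDK's exact API:

interface Analyzer<Input, State, Output> {
    suspend fun analyze(data: Input, state: State): Output
}

// Toy analyzer: "recognizes" the length of a string. A real analyzer would
// run a TensorFlow Lite model against a camera frame instead.
class LengthAnalyzer : Analyzer<String, Unit, Int> {
    override suspend fun analyze(data: String, state: Unit): Int = data.length
}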

Result Handlers

Result handlers take an input and a result (often from an analyzer) and act on that result. For example, a result handler might run a voting algorithm or update the UI.
Result Handler
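A result handler can be sketched the same way; it receives the original input alongside the result so it can, for example, display the frame that produced a match (again, illustrative names):

interface ResultHandler<Input, Output> {
    suspend fun onResult(result: Output, data: Input)
}

// Toy handler: log each result next to the input that produced it.
class LoggingResultHandler : ResultHandler<String, Int> {
    override suspend fun onResult(result: Int, data: String) {
        println("\"$data\" produced result $result")
    }
}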

Parallel Analyzers

With TensorFlow Lite, ML models perform better when multiple inference engines run in parallel. To process camera images as fast as possible, the framework runs multiple analyzers in parallel on the camera frames. This maximizes CPU usage and image throughput.
Parallel Analyzers
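Using the Analyzer interface sketched above, this pattern can be expressed by fanning several analyzer instances out over a shared channel of frames, with each worker owning its own inference engine. This is a hypothetical sketch, not the SDK's internal code:

import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch

// Each analyzer gets its own worker coroutine; all workers pull frames from
// the same channel, so inference runs in parallel across analyzers.
suspend fun <Input, Output> processInParallel(
    frames: Channel<Input>,
    analyzers: List<Analyzer<Input, Unit, Output>>,
    onResult: suspend (Input, Output) -> Unit,
) = coroutineScope {
    analyzers.forEach { analyzer ->
        launch {
            for (frame in frames) {
                onResult(frame, analyzer.analyze(frame, Unit))
            }
        }
    }
}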

Result Aggregation

Using OCR to extract a payment card number as an example: the analyzers produce a candidate number for every camera frame, but the app only needs a single final result. The CardScan SDK uses a ResultAggregator to run a voting algorithm over the per-frame results.
This is a specialized version of a ResultHandler that handles multiple results from multiple ML models running in parallel. Once aggregation is complete (e.g. voting has settled on a card number), the aggregator sends the final result to a listener.
Result Aggregator
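A voting aggregator can be sketched as follows, assuming a hypothetical onInterimResult callback and a simple "first value seen N times wins" rule; the SDK's actual voting logic is more involved:

// Tally card numbers across frames; finish once one value has been seen
// often enough to be trusted.
class VotingAggregator(
    private val requiredAgreement: Int,
    private val onFinalResult: (String) -> Unit,
) {
    private val votes = mutableMapOf<String, Int>()
    private var finished = false

    fun onInterimResult(cardNumber: String) {
        if (finished) return
        val count = (votes[cardNumber] ?: 0) + 1
        votes[cardNumber] = count
        if (count >= requiredAgreement) {
            finished = true
            onFinalResult(cardNumber)
        }
    }
}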

Analyzer Pools

It takes time to load an ML model into memory and create an inference engine. Analyzer pools work like thread pools: multiple analyzers are created up front and then used as needed. The CardScan SDK creates a pool of analyzers at the beginning of the scan process that can be reused to process images from the camera.
Analyzer Pool
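A pool can be sketched as a set of pre-built analyzers that callers borrow and return, here backed by a channel (hypothetical names again):

import kotlinx.coroutines.channels.Channel

// Analyzers are created once, up front; workers borrow one, use it, and
// return it, so the expensive engine creation never happens per frame.
class AnalyzerPool<Input, Output>(analyzers: List<Analyzer<Input, Unit, Output>>) {
    private val available = Channel<Analyzer<Input, Unit, Output>>(Channel.UNLIMITED)

    init {
        analyzers.forEach { available.trySend(it) }
    }

    suspend fun <R> withAnalyzer(block: suspend (Analyzer<Input, Unit, Output>) -> R): R {
        val analyzer = available.receive()
        try {
            return block(analyzer)
        } finally {
            available.send(analyzer)
        }
    }
}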

Loops

Loops tie analyzers and result aggregators together with a shared state. The loop provides an interface for accepting images from the camera or another data source, processes those images through a pool of analyzers, and finally collects the results in a result aggregator.
Analyzers can read the shared state from the loop, while the result aggregator can both read and update it. This allows coordination between the result aggregator and the analyzers without tight coupling.
Loop
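Putting the pieces above together, a loop might be wired up roughly like this, reusing the hypothetical AnalyzerPool and VotingAggregator sketches (the shared state is omitted for brevity):

import android.graphics.Bitmap
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch

// Pull frames from the source, run each through a pooled analyzer, and feed
// every interim result to the aggregator, which decides when to stop.
suspend fun runLoop(
    frames: Channel<Bitmap>,
    pool: AnalyzerPool<Bitmap, String>,
    aggregator: VotingAggregator,
    workerCount: Int = 4,
) = coroutineScope {
    repeat(workerCount) {
        launch {
            for (frame in frames) {
                val result = pool.withAnalyzer { it.analyze(frame, Unit) }
                aggregator.onInterimResult(result)
            }
        }
    }
}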

Fetchers

ML model analyzers can be created from multiple sources:
  • Resources packaged with the SDK
  • Fixed URLs
  • CDNs, using signed URLs
  • CDNs, using a server-driven configuration
Fetchers retrieve the ML model data (TensorFlow Lite files) and store it locally on disk for faster future retrieval.
Fetcher
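A fetcher for the fixed-URL case might look roughly like this (the class name and cache file name are illustrative):

import java.io.File
import java.net.URL

// Download the TensorFlow Lite model once, then serve it from the disk cache
// on every later call.
class UrlFetcher(private val modelUrl: URL, private val cacheDir: File) {
    fun fetch(): File {
        val cached = File(cacheDir, "model.tflite")
        if (!cached.exists()) {
            cacheDir.mkdirs()
            modelUrl.openStream().use { input ->
                cached.outputStream().use { output -> input.copyTo(output) }
            }
        }
        return cached
    }
}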

Loaders and Analyzer Factories

Loaders read model data from disk and prepare it for inference by memory-mapping it into a MappedByteBuffer.
Analyzer Factories use Loaders to create TensorFlow Lite inference engines from that byte buffer.
Loaders and Analyzer Factories
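A loader can be sketched as a memory-mapping helper, and an analyzer factory as a thin wrapper that builds interpreters from the mapped data. The names are illustrative; the Interpreter constructor shown is TensorFlow Lite's standard ByteBuffer constructor:

import java.io.File
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import org.tensorflow.lite.Interpreter

// Memory-map the model file so TensorFlow Lite can read it without copying
// the whole model onto the heap.
fun loadModel(modelFile: File): MappedByteBuffer =
    RandomAccessFile(modelFile, "r").use { file ->
        file.channel.map(FileChannel.MapMode.READ_ONLY, 0, file.length())
    }

// Each call creates a fresh inference engine from the same mapped model data.
class InterpreterFactory(private val modelData: MappedByteBuffer) {
    fun newInstance(): Interpreter = Interpreter(modelData)
}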
Analyzer Pools are created using an Analyzer Factory. An AnalyzerPoolFactory uses an analyzer factory to create multiple analyzer instances, which it adds to a pool.
Analyzer Pool Factory
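Following the hypothetical sketches above, a pool factory simply stamps out the desired number of analyzers:

// Build a pool by invoking the analyzer factory once per desired instance.
fun <Input, Output> createAnalyzerPool(
    desiredCount: Int,
    createAnalyzer: () -> Analyzer<Input, Unit, Output>,
): AnalyzerPool<Input, Output> =
    AnalyzerPool(List(desiredCount) { createAnalyzer() })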