blinkid-android
Everything you need to add AI-driven ID scanning into your native Android app.
Stars: 445
README:
The BlinkID Android SDK lets you build a fantastic onboarding experience in your Android app.
With one quick scan, your users will be able to extract information from their identity cards, passports, driver licenses and virtually any other government-issued ID there is.
BlinkID is:
- Fast. Real-time data extraction in less than 400ms. Way better than minutes-long form-filling.
- Secure. Privacy first, always. Scanning works even if the user’s phone is in airplane mode, meaning personal information never touches a third-party server.
- Intelligent. Machine learning models, optimized to read and parse identity documents from more than 180 countries worldwide, automatically, no need to preselect any of them.
- Lightweight. Designed to increase your app’s usability, not weight.
- What you make of it. Customize and rebrand the default UI or leave it as it is. It’s up to you.
- More than just a powerful ID scanner. Powerful data extraction, coupled with powerful perks. Get a cropped document image back, spot printed documents or data match both sides of the ID for parity.
To see all of these features at work download our free demo app:
Feeling ready to crack on with the integration? First make sure we support your document type ➡️ full list. And then follow the guidelines below.
- Quick Start
- Device requirements
- BlinkID SDK integration levels
- Available built-in activities and overlays
- Handling processing events with
RecognizerRunner
andRecognizerRunnerView
-
Recognizer
concept andRecognizerBundle
- List of available recognizers
- Embedding BlinkID inside another SDK
- Processor architecture considerations
- Troubleshooting
- FAQ and known issues
- Additional info
- Open Android Studio.
- In Quick Start dialog choose Import project (Eclipse ADT, Gradle, etc.).
- In File dialog select BlinkIDSample folder.
- Wait for the project to load. If Android studio asks you to reload project on startup, select
Yes
.
- BlinkID-aMinimalSample demonstrates quick and simple integration of BlinkID library
- BlinkID-ComposeMinimalSample demonstrates simple integration of BlinkID library while using Compose
- BlinkID-MinimalSampleAdvanced demonstrates simple integration of BlinkID library where both recognizer and UI settings can be customized
- BlinkID-AllRecognizersSample demonstrates integration of almost all available features. This sample application is best for performing a quick test of supported features
- BlinkID-CustomCombinedSample demonstrates advanced custom UI integration and usage of the combined recognizers within a custom scan activity
- BlinkID-ComposeFragmentSample demonstrates advanced custom UI integration and usage of the combined recognizers within a fragment while using Compose
- BlinkID-CustomUISample demonstrates advanced integration within custom scan activity
- BlinkID-DirectApiSample demonstrates how to perform scanning of Android Bitmaps
- BlinkID-ImagesSample demonstrates how to obtain document images
-
BlinkID-OverlaySample demonstrates how to use
RecognizerRunnerFragment
and built in camera overlay controller within your activity
In your build.gradle
, add BlinkID maven repository to repositories list
repositories {
maven { url 'https://maven.microblink.com' }
}
Add BlinkID as a dependency and make sure transitive
is set to true
dependencies {
implementation('com.microblink:blinkid:6.11.1@aar') {
transitive = true
}
}
Android studio should automatically import javadoc from maven dependency. If that doesn't happen, you can do that manually by following these steps:
- In Android Studio project sidebar, ensure project view is enabled
- Expand
External Libraries
entry (usually this is the last entry in project view) - Locate
blinkid-6.11.1
entry, right click on it and selectLibrary Properties...
- A
Library Properties
pop-up window will appear - Click the second
+
button in bottom left corner of the window (the one that contains+
with little globe) - Window for defining documentation URL will appear
- Enter following address:
https://blinkid.github.io/blinkid-android/
- Click
OK
-
A valid license key is required to initialize scanning. You can request a free trial license key, after you register, at Microblink Developer Hub. License is bound to package name of your app, so please make sure you enter the correct package name when asked.
Download your licence file and put it in your application's assets folder. Make sure to set the license key before using any other classes from the SDK, otherwise you will get a runtime exception.
We recommend that you extend Android Application class and set the license in onCreate callback like this:
public class MyApplication extends Application { @Override public void onCreate() { MicroblinkSDK.setLicenseFile("path/to/license/file/within/assets/dir", this); } }
public class MyApplication : Application() { override fun onCreate() { MicroblinkSDK.setLicenseFile("path/to/license/file/within/assets/dir", this) } }
-
In your main activity, define and create
ActivityResultLauncher
object by overriding theonActivityResult
method. BothOneSideDocumentScan
andTwoSideDocumentScan
can be used interchangably with no difference in implementation. The only functional difference is thatOneSideDocumentScan
is scanning only one side of the document and thatTwoSideDocumentScan
scans more than one side of the document.ActivityResultLauncher<Void> resultLauncher = registerForActivityResult( new TwoSideDocumentScan(), twoSideScanResult -> { ResultStatus resultScanStatus = twoSideScanResult.getResultStatus(); if (resultScanStatus == ResultStatus.FINISHED) { // code after a successful scan // use result.getResult() for fetching results, for example: String firstName = twoSideScanResult.getResult().getFirstName().value(); } else if (resultScanStatus == ResultStatus.CANCELLED) { // code after a cancelled scan } else if (resultScanStatus == ResultStatus.EXCEPTION) { // code after a failed scan } } );
private val resultLauncher = registerForActivityResult(TwoSideDocumentScan()) { twoSideScanResult: TwoSideScanResult -> when (twoSideScanResult.resultStatus) { ResultStatus.FINISHED -> { // code after a successful scan // use twoSideScanResult.result for fetching results, for example: val firstName = twoSideScanResult.result?.firstName?.value() } ResultStatus.CANCELLED -> { // code after a cancelled scan } ResultStatus.EXCEPTION -> { // code after a failed scan } else -> {} } }
@Composable fun createLauncher(): ActivityResultLauncher<Void?> { return rememberLauncherForActivityResult(TwoSideDocumentScan()) { twoSideScanResult: TwoSideScanResult -> when (twoSideScanResult.resultStatus) { ResultStatus.FINISHED -> { // code after a successful scan // use twoSideScanResult.result for fetching results, for example: val firstName = twoSideScanResult.result?.firstName?.value() } ResultStatus.CANCELLED -> { // code after a cancelled scan } ResultStatus.EXCEPTION -> { // code after a failed scan } else -> {} } } }
After a scan, the
result
, which is either aOneSideScanResult
orTwoSideScanResult
object instance, is going to be updated. You can define what happens with the data in the override of theonActivityResult
function (Kotlin code also overrides this function but it is implicit). Results are accessible in thetwoSideScanResult.getResult()
method (twoSideScanResult.result
in Kotlin). -
Start scanning process by calling
ActivityResultObject
and callingActivityResultLauncher.launch
:// method within MyActivity from previous step public void startScanning() { // Start scanning resultLauncher.launch(null); }
// method within MyActivity from previous step public fun startScanning() { // Start scanning resultLauncher.launch() }
// within @Composable function or setContent block val resultLauncher = createLauncher() resultLauncher.launch()
The results are going to be available in a callbacks, which are defined in the
ActivityResultObject
which was defined in the previous step.
BlinkID requires Android API level 21 or newer.
Camera video preview resolution also matters. In order to perform successful scans, camera preview resolution must be at least 720p. Note that camera preview resolution is not the same as video recording resolution.
BlinkID is distributed with ARMv7 and ARM64 native library binaries.
BlinkID is a native library, written in C++ and available for multiple platforms. Because of this, BlinkID cannot work on devices with obscure hardware architectures. We have compiled BlinkID native code only for the most popular Android ABIs.
Even before setting the license key, you should check if the BlinkID is supported on the current device (see next section: Compatibility check). Attempting to call any method from the SDK that relies on native code, such as license check, on a device with unsupported CPU architecture will crash your app.
If you are combining BlinkID library with other libraries that contain native code into your application, make sure you match the architectures of all native libraries.
For more information, see Processor architecture considerations section.
Here's how you can check whether the BlinkID is supported on the device:
// check if BlinkID is supported on the device,
RecognizerCompatibilityStatus status = RecognizerCompatibility.getRecognizerCompatibilityStatus(this);
if (status == RecognizerCompatibilityStatus.RECOGNIZER_SUPPORTED) {
Toast.makeText(this, "BlinkID is supported!", Toast.LENGTH_LONG).show();
} else if (status == RecognizerCompatibilityStatus.NO_CAMERA) {
Toast.makeText(this, "BlinkID is supported only via Direct API!", Toast.LENGTH_LONG).show();
} else if (status == RecognizerCompatibilityStatus.PROCESSOR_ARCHITECTURE_NOT_SUPPORTED) {
Toast.makeText(this, "BlinkID is not supported on current processor architecture!", Toast.LENGTH_LONG).show();
} else {
Toast.makeText(this, "BlinkID is not supported! Reason: " + status.name(), Toast.LENGTH_LONG).show();
}
// check if _BlinkID_ is supported on the device,
when (val status = RecognizerCompatibility.getRecognizerCompatibilityStatus(this)) {
RecognizerCompatibilityStatus.RECOGNIZER_SUPPORTED -> {
Toast.makeText(this, "BlinkID is supported!", Toast.LENGTH_LONG).show()
}
RecognizerCompatibilityStatus.NO_CAMERA -> {
Toast.makeText(this, "BlinkID is supported only via Direct API!", Toast.LENGTH_LONG).show()
}
RecognizerCompatibilityStatus.PROCESSOR_ARCHITECTURE_NOT_SUPPORTED -> {
Toast.makeText(this, "BlinkID is not supported on current processor architecture!", Toast.LENGTH_LONG).show()
}
else -> {
Toast.makeText(this, "BlinkID is not supported! Reason: " + status.name, Toast.LENGTH_LONG).show()
}
}
Some recognizers require camera with autofocus. If you try using them on a device that doesn't support autofocus, you will get an error. To prevent that, you can check whether a recognizer requires autofocus by calling its requiresAutofocus method.
If you already have an array of recognizers, you can easily filter out recognizers that require autofocus from array using the following code snippet:
Recognizer[] recArray = ...;
if(!RecognizerCompatibility.cameraHasAutofocus(CameraType.CAMERA_BACKFACE, this)) {
recArray = RecognizerUtils.filterOutRecognizersThatRequireAutofocus(recArray);
}
var recArray: Array<Recognizer> = ...
if(!RecognizerCompatibility.cameraHasAutofocus(CameraType.CAMERA_BACKFACE, this)) {
recArray = RecognizerUtils.filterOutRecognizersThatRequireAutofocus(recArray)
}
You can integrate BlinkID into your app in five different ways, depending on your use case and customisation needs:
- One line document scanning (
OneSideDocumentScan
andTwoSideDocumentScan
) - SDK handles everything and you just need to start our built-in activity and handle result, no customisation options - Built-in activities (
UISettings
) - SDK handles most of the work, you just need to define a recognizer, settings, start our built-in activity and handle result, customisation options are limited - Built-in fragment (
RecognizerRunnerFragment
) - reuse scanning UX from our built-in activities in your own activity - Custom UX (
RecognizerRunnerView
) - SDK handles camera management while you have to implement completely custom scanning UX - Direct Api (
RecognizerRunner
) - SKD only handles recognition while you have to provide it with the images, either from camera or from a file
OneSideDocumentScan
and TwoSideDocumentScan
are classes that contain all of the necessary setting definitions in order to quickly start SDK's built-in scan activities. It allows the user to skip all of the setup steps like UISettings
and RecognizerBundle
and go directly to scanning.
As shown in the Performing your first scan requires just the definition of a result listener, to define what is going to happen with the scan results, and calling the actual scanning function.
UISettings
is a class that contains all the necessary settings for SDK's built-in scan activities. It configures scanning activity behaviour, strings, icons and other UI elements.
You should use ActivityRunner
to start the scan activity configured by UISettings
, shown in the example below.
We provide multiple UISettings
classes specialised for different scanning scenarios. Each UISettings
object has properties which can be changed via appropriate setter methods. For example, you can customise camera settings with setCameraSettings
metod.
All available UISettings
classes are listed here.
-
In your main activity, create recognizer objects that will perform image recognition, configure them and put them into RecognizerBundle object. You can see more information about available recognizers and
RecognizerBundle
here.For example, to scan supported document, configure your recognizer like this:
public class MyActivity extends Activity { private BlinkIdMultiSideRecognizer mRecognizer; private RecognizerBundle mRecognizerBundle; @Override protected void onCreate(Bundle bundle) { super.onCreate(bundle); // setup views, as you would normally do in onCreate callback // create BlinkIdMultiSideRecognizer mRecognizer = new BlinkIdMultiSideRecognizer(); // bundle recognizers into RecognizerBundle mRecognizerBundle = new RecognizerBundle(mRecognizer); } }
public class MyActivity : Activity() { private lateinit var mRecognizer: BlinkIdMultiSideRecognizer private lateinit var mRecognizerBundle: RecognizerBundle override fun onCreate(bundle: Bundle) { // setup views, as you would normally do in onCreate callback // create BlinkIdMultiSideRecognizer mRecognizer = BlinkIdMultiSideRecognizer() // build recognizers into RecognizerBundle mRecognizerBundle = RecognizerBundle(mRecognizer) } }
-
Start recognition process by creating
BlinkIdUISettings
and callingActivityRunner.startActivityForResult
:// method within MyActivity from previous step public void startScanning() { // Settings for BlinkIdActivity BlinkIdUISettings settings = new BlinkIdUISettings(mRecognizerBundle); // tweak settings as you wish // Start activity ActivityRunner.startActivityForResult(this, MY_REQUEST_CODE, settings); }
// method within MyActivity from previous step public fun startScanning() { // Settings for BlinkIdActivity val settings = BlinkIdUISettings(mRecognizerBundle) // tweak settings as you wish // Start activity ActivityRunner.startActivityForResult(this, MY_REQUEST_CODE, settings) }
-
onActivityResult
will be called in your activity after scanning is finished, here you can get the scanning results.@Override protected void onActivityResult(int requestCode, int resultCode, Intent data) { super.onActivityResult(requestCode, resultCode, data); if (requestCode == MY_REQUEST_CODE) { if (resultCode == Activity.RESULT_OK && data != null) { // load the data into all recognizers bundled within your RecognizerBundle mRecognizerBundle.loadFromIntent(data); // now every recognizer object that was bundled within RecognizerBundle // has been updated with results obtained during scanning session // you can get the result by invoking getResult on recognizer BlinkIdMultiSideRecognizer.Result result = mRecognizer.getResult(); if (result.getResultState() == Recognizer.Result.State.Valid) { // result is valid, you can use it however you wish } } } }
override protected fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent) { super.onActivityResult(requestCode, resultCode, data); if (requestCode == MY_REQUEST_CODE) { if (resultCode == Activity.RESULT_OK && data != null) { // load the data into all recognizers bundled within your RecognizerBundle mRecognizerBundle.loadFromIntent(data) // now every recognizer object that was bundled within RecognizerBundle // has been updated with results obtained during scanning session // you can get the result by invoking getResult on recognizer val result = mRecognizer.result if (result.resultState == Recognizer.Result.State.Valid) { // result is valid, you can use it however you wish } } } }
For more information about available recognizers and
RecognizerBundle
, see RecognizerBundle and available recognizers.
If you want to reuse our built-in activity UX inside your own activity, use RecognizerRunnerFragment
. Activity that will host RecognizerRunnerFragment
must implement ScanningOverlayBinder
interface. Attempting to add RecognizerRunnerFragment
to activity that does not implement that interface will result in ClassCastException
.
The ScanningOverlayBinder
is responsible for returning non-null
implementation of ScanningOverlay
- class that will manage UI on top of RecognizerRunnerFragment
. It is not recommended to create your own ScanningOverlay
implementation, use one of our implementations listed here instead.
Here is the minimum example for activity that hosts the RecognizerRunnerFragment
:
public class MyActivity extends AppCompatActivity implements RecognizerRunnerFragment.ScanningOverlayBinder {
private BlinkIdMultiSideRecognizer mRecognizer;
private RecognizerBundle mRecognizerBundle;
private BlinkIdOverlayController mScanOverlay;
private RecognizerRunnerFragment mRecognizerRunnerFragment;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate();
setContentView(R.layout.activity_my_activity);
mScanOverlay = createOverlay();
if (null == savedInstanceState) {
// create fragment transaction to replace R.id.recognizer_runner_view_container with RecognizerRunnerFragment
mRecognizerRunnerFragment = new RecognizerRunnerFragment();
FragmentTransaction fragmentTransaction = getSupportFragmentManager().beginTransaction();
fragmentTransaction.replace(R.id.recognizer_runner_view_container, mRecognizerRunnerFragment);
fragmentTransaction.commit();
} else {
// obtain reference to fragment restored by Android within super.onCreate() call
mRecognizerRunnerFragment = (RecognizerRunnerFragment) getSupportFragmentManager().findFragmentById(R.id.recognizer_runner_view_container);
}
}
@Override
@NonNull
public ScanningOverlay getScanningOverlay() {
return mScanOverlay;
}
private BlinkIdOverlayController createOverlay() {
// create BlinkIdMultiSideRecognizer
mRecognizer = new BlinkIdMultiSideRecognizer();
// bundle recognizers into RecognizerBundle
mRecognizerBundle = new RecognizerBundle(mRecognizer);
BlinkIdUISettings settings = new BlinkIdUISettings(mRecognizerBundle);
return settings.createOverlayController(this, mScanResultListener);
}
private final ScanResultListener mScanResultListener = new ScanResultListener() {
@Override
public void onScanningDone(@NonNull RecognitionSuccessType recognitionSuccessType) {
// pause scanning to prevent new results while fragment is being removed
mRecognizerRunnerFragment.getRecognizerRunnerView().pauseScanning();
// now you can remove the RecognizerRunnerFragment with new fragment transaction
// and use result within mRecognizer safely without the need for making a copy of it
// if not paused, as soon as this method ends, RecognizerRunnerFragments continues
// scanning. Note that this can happen even if you created fragment transaction for
// removal of RecognizerRunnerFragment - in the time between end of this method
// and beginning of execution of the transaction. So to ensure result within mRecognizer
// does not get mutated, ensure calling pauseScanning() as shown above.
}
@Override
public void onUnrecoverableError(@NonNull Throwable throwable) {
}
};
}
package com.microblink.blinkid
class MainActivity : AppCompatActivity(), RecognizerRunnerFragment.ScanningOverlayBinder {
private lateinit var mRecognizer: BlinkIdMultiSideRecognizer
private lateinit var mRecognizerRunnerFragment: RecognizerRunnerFragment
private lateinit var mRecognizerBundle: RecognizerBundle
private lateinit var mScanOverlay: BlinkIdOverlayController
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
if (!::mScanOverlay.isInitialized) {
mScanOverlay = createOverlayController()
}
setContent {
this.run {
// viewBinding has to be set to 'true' in buildFeatures block of the build.gradle file
AndroidViewBinding(RecognizerRunnerLayoutBinding::inflate) {
mRecognizerRunnerFragment =
fragmentContainerView.getFragment<RecognizerRunnerFragment>()
}
}
}
}
override fun getScanningOverlay(): ScanningOverlay {
return mScanOverlay
}
private fun createOverlay(): BlinkIdOverlayController {
// create BlinkIdMultiSideRecognizer
val mRecognizer = BlinkIdMultiSideRecognizer()
// bundle recognizers into RecognizerBundle
mRecognizerBundle = RecognizerBundle(mRecognizer)
val settings = BlinkIdUISettings(mRecognizerBundle)
return settings.createOverlayController(this, mScanResultListener)
}
private val mScanResultListener: ScanResultListener = object : ScanResultListener {
override fun onScanningDone(p0: RecognitionSuccessType) {
// pause scanning to prevent new results while fragment is being removed
mRecognizerRunnerFragment!!.recognizerRunnerView!!.pauseScanning()
// now you can remove the RecognizerRunnerFragment with new fragment transaction
// and use result within mRecognizer safely without the need for making a copy of it
// if not paused, as soon as this method ends, RecognizerRunnerFragments continues
// scanning. Note that this can happen even if you created fragment transaction for
// removal of RecognizerRunnerFragment - in the time between end of this method
// and beginning of execution of the transaction. So to ensure result within mRecognizer
// does not get mutated, ensure calling pauseScanning() as shown above.
}
override fun onUnrecoverableError(p0: Throwable) {
}
}
}
Please refer to sample apps provided with the SDK for more detailed example and make sure your host activity's orientation is set to nosensor
or has configuration changing enabled (i.e. is not restarted when configuration change happens). For more information, check scan orientation section.
This section discusses how to embed RecognizerRunnerView into your scan activity and perform scan.
- First make sure that
RecognizerRunnerView
is a member field in your activity. This is required because you will need to pass all activity's lifecycle events toRecognizerRunnerView
. - It is recommended to keep your scan activity in one orientation, such as
portrait
orlandscape
. Settingsensor
as scan activity's orientation will trigger full restart of activity whenever device orientation changes. This will provide very poor user experience because both camera and BlinkID native library will have to be restarted every time. There are measures against this behaviour that are discussed later. - In your activity's
onCreate
method, create a newRecognizerRunnerView
, set RecognizerBundle containing recognizers that will be used by the view, define CameraEventsListener that will handle mandatory camera events, define ScanResultListener that will receive call when recognition has been completed and then call itscreate
method. After that, add your views that should be layouted on top of camera view. - Pass in your activity's lifecycle using
setLifecycle
method to enable automatic handling of lifeceycle events.
Here is the minimum example of integration of RecognizerRunnerView
as the only view in your activity:
public class MyScanActivity extends AppCompatActivity {
private static final int PERMISSION_CAMERA_REQUEST_CODE = 42;
private RecognizerRunnerView mRecognizerRunnerView;
private BlinkIdMultiSideRecognizer mRecognizer;
private RecognizerBundle mRecognizerBundle;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
// create BlinkIdMultiSideRecognizer
mRecognizer = new BlinkIdMultiSideRecognizer();
// bundle recognizers into RecognizerBundle
mRecognizerBundle = new RecognizerBundle(mRecognizer);
// create RecognizerRunnerView
mRecognizerRunnerView = new RecognizerRunnerView(this);
// set lifecycle to automatically call recognizer runner view lifecycle methods
mRecognizerRunnerView.setLifecycle(getLifecycle());
// associate RecognizerBundle with RecognizerRunnerView
mRecognizerRunnerView.setRecognizerBundle(mRecognizerBundle);
// scan result listener will be notified when scanning is complete
mRecognizerRunnerView.setScanResultListener(mScanResultListener);
// camera events listener will be notified about camera lifecycle and errors
mRecognizerRunnerView.setCameraEventsListener(mCameraEventsListener);
setContentView(mRecognizerRunnerView);
}
@Override
public void onConfigurationChanged(Configuration newConfig) {
super.onConfigurationChanged(newConfig);
// changeConfiguration is not handled by lifecycle events so call it manually
mRecognizerRunnerView.changeConfiguration(newConfig);
}
private final CameraEventsListener mCameraEventsListener = new CameraEventsListener() {
@Override
public void onCameraPreviewStarted() {
// this method is from CameraEventsListener and will be called when camera preview starts
}
@Override
public void onCameraPreviewStopped() {
// this method is from CameraEventsListener and will be called when camera preview stops
}
@Override
public void onError(Throwable exc) {
/**
* This method is from CameraEventsListener and will be called when
* opening of camera resulted in exception or recognition process
* encountered an error. The error details will be given in exc
* parameter.
*/
}
@Override
@TargetApi(23)
public void onCameraPermissionDenied() {
/**
* Called in Android 6.0 and newer if camera permission is not given
* by user. You should request permission from user to access camera.
*/
requestPermissions(new String[]{Manifest.permission.CAMERA}, PERMISSION_CAMERA_REQUEST_CODE);
/**
* Please note that user might have not given permission to use
* camera. In that case, you have to explain to user that without
* camera permissions scanning will not work.
* For more information about requesting permissions at runtime, check
* this article:
* https://developer.android.com/training/permissions/requesting.html
*/
}
@Override
public void onAutofocusFailed() {
/**
* This method is from CameraEventsListener will be called when camera focusing has failed.
* Camera manager usually tries different focusing strategies and this method is called when all
* those strategies fail to indicate that either object on which camera is being focused is too
* close or ambient light conditions are poor.
*/
}
@Override
public void onAutofocusStarted(Rect[] areas) {
/**
* This method is from CameraEventsListener and will be called when camera focusing has started.
* You can utilize this method to draw focusing animation on UI.
* Areas parameter is array of rectangles where focus is being measured.
* It can be null on devices that do not support fine-grained camera control.
*/
}
@Override
public void onAutofocusStopped(Rect[] areas) {
/**
* This method is from CameraEventsListener and will be called when camera focusing has stopped.
* You can utilize this method to remove focusing animation on UI.
* Areas parameter is array of rectangles where focus is being measured.
* It can be null on devices that do not support fine-grained camera control.
*/
}
};
private final ScanResultListener mScanResultListener = new ScanResultListener() {
@Override
public void onScanningDone(@NonNull RecognitionSuccessType recognitionSuccessType) {
// this method is from ScanResultListener and will be called when scanning completes
// you can obtain scanning result by calling getResult on each
// recognizer that you bundled into RecognizerBundle.
// for example:
BlinkIdMultiSideRecognizer.Result result = mRecognizer.getResult();
if (result.getResultState() == Recognizer.Result.State.Valid) {
// result is valid, you can use it however you wish
}
// Note that mRecognizer is stateful object and that as soon as
// scanning either resumes or its state is reset
// the result object within mRecognizer will be changed. If you
// need to create a immutable copy of the result, you can do that
// by calling clone() on it, for example:
BlinkIdMultiSideRecognizer.Result immutableCopy = result.clone();
// After this method ends, scanning will be resumed and recognition
// state will be retained. If you want to prevent that, then
// you should call:
mRecognizerRunnerView.resetRecognitionState();
// Note that reseting recognition state will clear internal result
// objects of all recognizers that are bundled in RecognizerBundle
// associated with RecognizerRunnerView.
// If you want to pause scanning to prevent receiving recognition
// results or mutating result, you should call:
mRecognizerRunnerView.pauseScanning();
// if scanning is paused at the end of this method, it is guaranteed
// that result within mRecognizer will not be mutated, therefore you
// can avoid creating a copy as described above
// After scanning is paused, you will have to resume it with:
mRecognizerRunnerView.resumeScanning(true);
// boolean in resumeScanning method indicates whether recognition
// state should be automatically reset when resuming scanning - this
// includes clearing result of mRecognizer
}
};
}
If activity's screenOrientation
property in AndroidManifest.xml
is set to sensor
, fullSensor
or similar, activity will be restarted every time device changes orientation from portrait to landscape and vice versa. While restarting activity, its onPause
, onStop
and onDestroy
methods will be called and then new activity will be created anew. This is a potential problem for scan activity because in its lifecycle it controls both camera and native library - restarting the activity will trigger both restart of the camera and native library. This is a problem because changing orientation from landscape to portrait and vice versa will be very slow, thus degrading a user experience. We do not recommend such setting.
For that matter, we recommend setting your scan activity to either portrait
or landscape
mode and handle device orientation changes manually. To help you with this, RecognizerRunnerView
supports adding child views to it that will be rotated regardless of activity's screenOrientation
. You add a view you wish to be rotated (such as view that contains buttons, status messages, etc.) to RecognizerRunnerView
with addChildView method. The second parameter of the method is a boolean that defines whether the view you are adding will be rotated with device. To define allowed orientations, implement OrientationAllowedListener interface and add it to RecognizerRunnerView
with method setOrientationAllowedListener
. This is the recommended way of rotating camera overlay.
However, if you really want to set screenOrientation
property to sensor
or similar and want Android to handle orientation changes of your scan activity, then we recommend to set configChanges
property of your activity to orientation|screenSize
. This will tell Android not to restart your activity when device orientation changes. Instead, activity's onConfigurationChanged
method will be called so that activity can be notified of the configuration change. In your implementation of this method, you should call changeConfiguration
method of RecognizerView
so it can adapt its camera surface and child views to new configuration.
This section will describe how to use direct API to recognize android Bitmaps without the need for camera. You can use direct API anywhere from your application, not just from activities.
Image recognition performance highly depends on the quality of the input images. When our camera management is used (scanning from a camera), we do our best to get camera frames with the best possible quality for the used device. On the other hand, when Direct API is used, you need to provide high-quality images without blur and glare for successful recognition.
- First, you need to obtain reference to RecognizerRunner singleton using getSingletonInstance.
- Second, you need to initialize the recognizer runner.
- After initialization, you can use singleton to process:
-
Still Android
Bitmaps
obtained, for example, from the gallery. Use recognizeBitmap or recognizeBitmapWithRecognizers. -
Video
Images
that are built from custom camera video frames, for example, when you use your own or third party camera management. Recognition will be optimized for speed and will rely on time-redundancy between consecutive video frames in order to yield best possible recognition result. Use recognizeVideoImage or recognizeVideoImageWithRecognizers. -
Still
Images
when you need thorough scanning of single or few images which are not part of the video stream and you want to get best possible results from the singleInputImage
. InputImage type comes from our SDK or it can be created by using ImageBuilder. Use recognizeStillImage or recognizeStillImageWithRecognizers.
- When you want to delete all cached data from multiple recognitions, for example when you want to scan other document and/or restart scanning, you need to reset the recognition state.
- Do not forget to terminate the recognizer runner singleton after usage (it is a shared resource).
Here is the minimum example of usage of direct API for recognizing android Bitmap:
public class DirectAPIActivity extends Activity {
private RecognizerRunner mRecognizerRunner;
private BlinkIdMultiSideRecognizer mRecognizer;
private RecognizerBundle mRecognizerBundle;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate();
// initialize your activity here
// create BlinkIdMultiSideRecognizer
mRecognizer = new BlinkIdMultiSideRecognizer();
// bundle recognizers into RecognizerBundle
mRecognizerBundle = new RecognizerBundle(mRecognizer);
try {
mRecognizerRunner = RecognizerRunner.getSingletonInstance();
} catch (FeatureNotSupportedException e) {
Toast.makeText(this, "Feature not supported! Reason: " + e.getReason().getDescription(), Toast.LENGTH_LONG).show();
finish();
return;
}
mRecognizerRunner.initialize(this, mRecognizerBundle, new DirectApiErrorListener() {
@Override
public void onRecognizerError(Throwable t) {
Toast.makeText(DirectAPIActivity.this, "There was an error in initialization of Recognizer: " + t.getMessage(), Toast.LENGTH_SHORT).show();
finish();
}
});
}
@Override
protected void onResume() {
super.onResume();
// start recognition
Bitmap bitmap = BitmapFactory.decodeFile("/path/to/some/file.jpg");
mRecognizerRunner.recognizeBitmap(bitmap, Orientation.ORIENTATION_LANDSCAPE_RIGHT, mScanResultListener);
}
@Override
protected void onDestroy() {
super.onDestroy();
mRecognizerRunner.terminate();
}
private final ScanResultListener mScanResultListener = new ScanResultListener() {
@Override
public void onScanningDone(@NonNull RecognitionSuccessType recognitionSuccessType) {
// this method is from ScanResultListener and will be called
// when scanning completes
// you can obtain scanning result by calling getResult on each
// recognizer that you bundled into RecognizerBundle.
// for example:
BlinkIdMultiSideRecognizer.Result result = mRecognizer.getResult();
if (result.getResultState() == Recognizer.Result.State.Valid) {
// result is valid, you can use it however you wish
}
}
};
}
ScanResultListener.onScanningDone method is called for each input image that you send to the recognition. You can call RecognizerRunner.recognize*
method multiple times with different images of the same document for better reading accuracy until you get a successful result in the listener's onScanningDone
method. This is useful when you are using your own or third-party camera management.
Some recognizers support recognition from String
. They can be used through Direct API to parse given String
and return data just like when they are used on an input image. When recognition is performed on String
, there is no need for the OCR. Input String
is used in the same way as the OCR output is used when image is being recognized.
Recognition from String
can be performed in the same way as recognition from image, described in the previous section.
The only difference is that one of the RecognizerRunner singleton methods for recognition from string should be called:
Direct API's RecognizerRunner
singleton is a state machine that can be in one of 3 states: OFFLINE
, READY
and WORKING
.
- When you obtain the reference to
RecognizerRunner
singleton, it will be inOFFLINE
state. - You can initialize
RecognizerRunner
by calling initialize method. If you callinitialize
method whileRecognizerRunner
is not inOFFLINE
state, you will getIllegalStateException
. - After successful initialization,
RecognizerRunner
will move toREADY
state. Now you can call any of therecognize*
methods. - When starting recognition with any of the
recognize*
methods,RecognizerRunner
will move toWORKING
state. If you attempt to call these methods whileRecognizerRunner
is not inREADY
state, you will getIllegalStateException
- Recognition is performed on background thread so it is safe to call all
RecognizerRunner's
methods from UI thread - When recognition is finished,
RecognizerRunner
first moves back toREADY
state and then calls the onScanningDone method of the providedScanResultListener
. - Please note that
ScanResultListener
'sonScanningDone
method will be called on background processing thread, so make sure you do not perform UI operations in this callback. Also note that until theonScanningDone
method completes,RecognizerRunner
will not perform recognition of another image or string, even if any of therecognize*
methods have been called just after transitioning toREADY
state. This is to ensure that results of the recognizers bundled withinRecognizerBundle
associated withRecognizerRunner
are not modified while possibly being used withinonScanningDone
method. - By calling
terminate
method,RecognizerRunner
singleton will release all its internal resources. Note that even after callingterminate
you might receiveonScanningDone
event if there was work in progress whenterminate
was called. -
terminate
method can be called from anyRecognizerRunner
singleton's state - You can observe
RecognizerRunner
singleton's state with methodgetCurrentState
Both RecognizerRunnerView and RecognizerRunner
use the same internal singleton that manages native code. This singleton handles initialization and termination of native library and propagating recognizers to native library. It is possible to use RecognizerRunnerView
and RecognizerRunner
together, as internal singleton will make sure correct synchronization and correct recognition settings are used. If you run into problems while using RecognizerRunner
in combination with RecognizerRunnerView
, let us know!
When you are using combined recognizer and images of both document sides are required, you need to call RecognizerRunner.recognize*
multiple times. Call it first with the images of the first side of the document, until it is read, and then with the images of the second side. The combined recognizer automatically switches to second side scanning, after it has successfully read the first side. To be notified when the first side scanning is completed, you have to set the FirstSideRecognitionCallback through MetadataCallbacks. If you don't need that information, e.g. when you have only one image for each document side, don't set the FirstSideRecognitionCallback
and check the RecognitionSuccessType in ScanResultListener.onScanningDone, after the second side image has been processed.
BlinkIdOverlayController
implements new UI for scanning identity documents, which is optimally designed to be used with new BlinkIdMultiSideRecognizer
and BlinkIdSingleSideRecognizer
. It implements several new features:
- clear indication for searching phase, when BlinkID is searching for an ID document
- clear progress indication, when BlinkID is busy with OCR and data extraction
- clear message when the document is not supported
- visual indications when the user needs to place the document closer to the camera
- when BlinkIdMultiSideRecognizer is used, visual indication that the data from the front side of the document doesn't match the data on the back side of the document.
The new UI allows the user to scan the document at an any angle, in any orientation. We recommend forcing landscape orientation if you scan barcodes on the back side, because in that orientation success rate will be higher.
To launch a built-in activity that uses BlinkIdOverlayController
use BlinkIdUISettings
.
To customise overlay, provide your custom style resource via BlinkIdUISettings.setOverlayViewStyle()
method or via ReticleOverlayView
constructor. You can customise elements labeled on screenshots above by providing the following attributes in your style:
exit
-
mb_exitScanDrawable
- icon drawable - note that you can disable this element by using
BlinkIdUISettings.setShowCancelButton(false)
torch
-
mb_torchOnDrawable
- icon drawable that is shown when the torch is enabled -
mb_torchOffDrawable
- icon drawable that is show when the torch is disabled - note that you can disable this element by using
BlinkIdUISettings.setShowTorchButton(false)
instructions
-
mb_instructionsTextAppearance
- style that will be used asandroid:textAppearance
-
mb_instructionsBackgroundDrawable
- drawable used for background -
mb_instructionsBackgroundColor
- color used for background
flashlight warning
-
mb_flashlightWarningTextAppearance
- style that will be used asandroid:textAppearance
-
mb_flashlightWarningBackgroundDrawable
- drawable used for background - note that you can disable this element by using
BlinkIdUISettings.setShowFlashlightWarning(false)
card icon
-
mb_cardFrontDrawable
- icon drawable shown during card flip animation, representing front side of the card -
mb_cardBackDrawable
- icon drawable shown during card flip animation, representing back side of the card
reticle
-
mb_reticleDefaultDrawable
- drawable shown when reticle is in neutral state -
mb_reticleSuccessDrawable
- drawable shown when reticle is in success state (scanning was successful) -
mb_reticleErrorDrawable
- drawable shown when reticle is in error state -
mb_reticleColor
- color used for rotating reticle element -
mb_reticleDefaultColor
- color used for reticle in neutral state -
mb_reticleErrorColor
- color used for reticle in error state -
mb_successFlashColor
- color used for flash effect on successful scan
To customize the visibility and style of these two dialogs, use methods provided in BlinkIdUISettings
.
The method for controlling the visibility of the introduction dialog is BlinkIdUISettings.setShowIntroductionDialog(boolean showIntroductionDialog)
and it is set to true by default, meaning the introduction dialog will be shown.
The method for controlling the visibility of the onboarding dialog is BlinkIdUISettings.setShowOnboardingInfo(boolean showOnboardingInfo)
and it is set to true by default, meaning the introduction dialog will be shown.
There is also a method for controlling the delay of the "Show help?" tooltip that is shown above the help button. The button itself will be shown if the previous method for showing onboarding is true.
The method for setting the delay length of the tooltip is BlinkIdUISettings.setShowTooltipTimeIntervalMs(long showTooltipTimeIntervalMs)
. Time parameter is set in milliseconds.
The default setting of the delay is 12 seconds (12000 milliseconds).
Customizing and theming these introduction and onboarding elements can be done in the same way as explained in the previous chapter, by providing the following attributes:
help button
-
mb_helpButtonDrawable
- drawable that is shown when help button is enabled -
mb_helpButtonBackgroundColor
- color used for help button background -
mb_helpButtonQuestionmarkColor
- color used for help button foreground - note that this element is disabled if onboarding screens are disabled
help tooltip
-
mb_helpTooltipBackground
- drawable that is shown as a background when help tooltip pops up -
mb_helpTooltipColor
- color used for help tooltip background -
mb_helpTooltipTextAppearance
- style that will be used asandroid:textAppearance
introduction dialog
-
mb_introductionBackgroundColor
- color used for introduction screen background -
mb_introductionTitleTextAppearance
- style that will be used asandroid:textAppearance
-
mb_introductionMessageTextAppearance
- style that will be used asandroid:textAppearance
-
mb_introductionButtonTextAppearance
- style that will be used asandroid:textAppearance
- note that you can disable this element by using
BlinkIdUISettings.setShowIntroductionDialog(false)
onboarding dialog
-
mb_onboardingBackgroundColor
- color used for onboarding screens background -
mb_onboardingPageIndicatorColor
- color used for circular page indicators in onboarding screens -
mb_onboardingTitleTextAppearance
- style that will be used asandroid:textAppearance
-
mb_onboardingMessageTextAppearance
- style that will be used asandroid:textAppearance
-
mb_onboardingButtonTextAppearance
- style that will be used asandroid:textAppearance
- note that you can disable this element by using
BlinkIdUISettings.setShowOnboardingInfo(false)
Alert dialogs called by the SDK have their own set of properties that can be modified in styles.xml
.
MB_alert_dialog
is a theme that extends Theme.AppCompat.Light.Dialog.Alert
theme and uses the application theme's default colors.
In order to change the attributes in these alert dialogs without changing other attributes in user's application, MB_alert_dialog
theme needs to be overwritten.
<style name="MB_alert_dialog" parent="Theme.AppCompat.Light.Dialog.Alert">
<item name="android:textSize">TEXT_SIZE</item>
<item name="android:background">COLOR</item>
<item name="android:textColorPrimary">COLOR</item>
<item name="colorAccent">COLOR</item>
</style>
The attributes that are not overwritten are going to use application theme's default colors and sizes.
colorAccent
attibute is used to change the color of the alert dialog button. If the application theme's colorAccent
attribute is changed someplace else, this alert dialog button color will also be changed.
However, overwriting MB_alert_dialog
theme and this attribute within it will assure that only the button color in Microblink SDK's alert dialog is changed.
If the application theme extends a theme from the MaterialComponents
set (e.g. Theme.MaterialComponents.Light.NoActionBar
), the aforementioned button color may only be changed by overwriting colorOnPrimary
attribute instead of colorAccent
attribute.
DocumentUISettings
launches activity that uses BlinkIdOverlayController
with alternative UI. It is best suited for scanning single document side of various card documents and it shouldn't be used with combined recognizers as it provides no user instructions on when to switch to the back side.
LegacyDocumentVerificationUISettings
launches activity that uses BlinkIdOverlayController
with alternative UI. It is best suited for combined recognizers because it manages scanning of multiple document sides in the single camera opening and guides the user through the scanning process. It can also be used for single side scanning of ID cards, passports, driver's licenses, etc.
Strings used within built-in activities and overlays can be localized to any language. If you are using RecognizerRunnerView
(see this chapter for more information) in your custom scan activity or fragment, you should handle localization as in any other Android app. RecognizerRunnerView
does not use strings nor drawables, it only uses assets from assets/microblink
folder. Those assets must not be touched as they are required for recognition to work correctly.
However, if you use our built-in activities or overlays, they will use resources packed within LibBlinkID.aar
to display strings and images on top of the camera view. We have already prepared strings for several languages which you can use out of the box. You can also modify those strings, or you can add your own language.
To use a language, you have to enable it from the code:
-
To use a certain language, on application startup, before opening any UI component from the SDK, you should call method
LanguageUtils.setLanguageAndCountry(language, country, context)
. For example, you can set language to Croatian like this:// define BlinkID language LanguageUtils.setLanguageAndCountry("hr", "", this);
BlinkID can easily be translated to other languages. The res
folder in LibBlinkID.aar
archive has folder values
which contains strings.xml
- this file contains english strings. In order to make e.g. croatian translation, create a folder values-hr
in your project and put the copy of strings.xml
inside it (you might need to extract LibBlinkID.aar
archive to access those files). Then, open that file and translate the strings from English into Croatian.
To modify an existing string, the best approach would be to:
- Choose a language you want to modify. For example Croatian ('hr').
- Find
strings.xml
in folderres/values-hr
of theLibBlinkID.aar
archive - Choose a string key which you want to change. For example:
<string name="MBBack">Back</string>
- In your project create a file
strings.xml
in the folderres/values-hr
, if it doesn't already exist - Create an entry in the file with the value for the string which you want. For example:
<string name="MBBack">Natrag</string>
- Repeat for all the string you wish to change
Processing events, also known as Metadata callbacks are purely intended for giving processing feedback on UI or to capture some debug information during development of your app using BlinkID SDK. For that reason, built-in activities and fragments handle those events internally. If you need to handle those events yourself, you need to use either RecognizerRunnerView or RecognizerRunner.
Callbacks for all events are bundled into the MetadataCallbacks object. Both RecognizerRunner and RecognizerRunnerView have methods which allow you to set all your callbacks.
We suggest that you check for more information about available callbacks and events to which you can handle in the javadoc for MetadataCallbacks class.
Please note that both those methods need to pass information about available callbacks to the native code and for efficiency reasons this is done at the time setMetadataCallbacks
method is called and not every time when change occurs within the MetadataCallbacks
object. This means that if you, for example, set QuadDetectionCallback
to MetadataCallbacks
after you already called setMetadataCallbacks
method, the QuadDetectionCallback
will not be registered with the native code and you will not receive its events.
Similarly, if you, for example, remove the QuadDetectionCallback
from MetadataCallbacks
object after you already called setMetadataCallbacks
method, your app will crash with NullPointerException
when our processing code attempts to invoke the method on removed callback (which is now set to null
). We deliberately do not perform null
check here because of two reasons:
- it is inefficient
- having
null
callback, while still being registered to native code is illegal state of your program and it should therefore crash
Remember, each time you make some changes to MetadataCallbacks
object, you need to apply those changes to to your RecognizerRunner
or RecognizerRunnerView
by calling its setMetadataCallbacks
method.
This section will first describe what is a Recognizer
and how it should be used to perform recognition of the images, videos and camera stream. Next, we will describe how RecognizerBundle
can be used to tweak the recognition procedure and to transfer Recognizer
objects between activities.
RecognizerBundle is an object which wraps the Recognizers and defines settings about how recognition should be performed. Besides that, RecognizerBundle
makes it possible to transfer Recognizer
objects between different activities, which is required when using built-in activities to perform scanning, as described in first scan section, but is also handy when you need to pass Recognizer
objects between your activities.
List of all available Recognizer
objects, with a brief description of each Recognizer
, its purpose and recommendations how it should be used to get best performance and user experience, can be found here .
The Recognizer is the basic unit of processing within the BlinkID SDK. Its main purpose is to process the image and extract meaningful information from it. As you will see later, the BlinkID SDK has lots of different Recognizer
objects that have various purposes.
Each Recognizer
has a Result
object, which contains the data that was extracted from the image. The Result
object is a member of corresponding Recognizer
object and its lifetime is bound to the lifetime of its parent Recognizer
object. If you need your Result
object to outlive its parent Recognizer
object, you must make a copy of it by calling its method clone()
.
Every Recognizer
is a stateful object, that can be in two states: idle state and working state. While in idle state, you can tweak Recognizer
object's properties via its getters and setters. After you bundle it into a RecognizerBundle
and use either RecognizerRunner or RecognizerRunnerView to run the processing with all Recognizer
objects bundled within RecognizerBundle
, it will change to working state where the Recognizer
object is being used for processing. While being in working state, you cannot tweak Recognizer
object's properties. If you need to, you have to create a copy of the Recognizer
object by calling its clone()
, then tweak that copy, bundle it into a new RecognizerBundle
and use reconfigureRecognizers
to ensure new bundle gets used on processing thread.
While Recognizer
object works, it changes its internal state and its result. The Recognizer
object's Result
always starts in Empty state. When corresponding Recognizer
object performs the recognition of given image, its Result
can either stay in Empty
state (in case Recognizer
failed to perform recognition), move to Uncertain state (in case Recognizer
performed the recognition, but not all mandatory information was extracted), move to StageValid state (in case Recognizer
successfully scanned one part/side of the document and there are more fields to extract) or move to Valid state (in case Recognizer
performed recognition and all mandatory information was successfully extracted from the image).
As soon as one Recognizer
object's Result
within RecognizerBundle
given to RecognizerRunner
or RecognizerRunnerView
changes to Valid
state, the onScanningDone
callback will be invoked on same thread that performs the background processing and you will have the opportunity to inspect each of your Recognizer
objects' Results
to see which one has moved to Valid
state.
As already stated in section about RecognizerRunnerView
, as soon as onScanningDone
method ends, the RecognizerRunnerView
will continue processing new camera frames with same Recognizer
objects, unless paused. Continuation of processing or resetting recognition will modify or reset all Recognizer
objects's Results
. When using built-in activities, as soon as onScanningDone
is invoked, built-in activity pauses the RecognizerRunnerView
and starts finishing the activity, while saving the RecognizerBundle
with active Recognizer
objects into Intent
so they can be transferred back to the calling activities.
The RecognizerBundle is wrapper around Recognizers objects that can be used to transfer Recognizer
objects between activities and to give Recognizer
objects to RecognizerRunner
or RecognizerRunnerView
for processing.
The RecognizerBundle
is always constructed with array of Recognizer
objects that need to be prepared for recognition (i.e. their properties must be tweaked already). The varargs constructor makes it easier to pass Recognizer
objects to it, without the need of creating a temporary array.
The RecognizerBundle
manages a chain of Recognizer
objects within the recognition process. When a new image arrives, it is processed by the first Recognizer
in chain, then by the second and so on, iterating until a Recognizer
object's Result
changes its state to Valid
or all of the Recognizer
objects in chain were invoked (none getting a Valid
result state). If you want to invoke all Recognizers
in the chain, regardless of whether some Recognizer
object's Result
in chain has changed its state to Valid
or not, you can allow returning of multiple results on a single image.
You cannot change the order of the Recognizer
objects within the chain - no matter the order in which you give Recognizer
objects to RecognizerBundle
, they are internally ordered in a way that provides best possible performance and accuracy. Also, in order for BlinkID SDK to be able to order Recognizer
objects in recognition chain in the best way possible, it is not allowed to have multiple instances of Recognizer
objects of the same type within the chain. Attempting to do so will crash your application.
Besides managing the chain of Recognizer
objects, RecognizerBundle
also manages transferring bundled Recognizer
objects between different activities within your app. Although each Recognizer
object, and each its Result
object implements Parcelable interface, it is not so straightforward to put those objects into Intent and pass them around between your activities and services for two main reasons:
-
Result
object is tied to itsRecognizer
object, which manages lifetime of the nativeResult
object. -
Result
object often contains large data blocks, such as images, which cannot be transferred viaIntent
because of Android's Intent transaction data limit.
Although the first problem can be easily worked around by making a copy of the Result
and transfer it independently, the second problem is much tougher to cope with. This is where, RecognizerBundle's
methods saveToIntent and loadFromIntent come to help, as they ensure the safe passing of Recognizer
objects bundled within RecognizerBundle
between activities according to policy defined with method setIntentDataTransferMode
:
- if set to
STANDARD
, theRecognizer
objects will be passed viaIntent
using normal Intent transaction mechanism, which is limited by Android's Intent transaction data limit. This is same as manually puttingRecognizer
objects intoIntent
and is OK as long as you do not useRecognizer
objects that produce images or other large objects in theirResults
. - if set to
OPTIMISED
, theRecognizer
objects will be passed via internal singleton object and no serialization will take place. This means that there is no limit to the size of data that is being passed. This is also the fastest transfer method, but it has a serious drawback - if Android kills your app to save memory for other apps and then later restarts it and redeliversIntent
that should containRecognizer
objects, the internal singleton that should contain savedRecognizer
objects will be empty and data that was being sent will be lost. You can easily provoke that condition by choosing No background processes under Limit background processes in your device's Developer options, and then switch from your app to another app and then back to your app. - if set to
PERSISTED_OPTIMISED
, theRecognizer
objects will be passed via internal singleton object (just like inOPTIMISED
mode) and will additionaly be serialized into a file in your application's private folder. In case Android restarts your app and internal singleton is empty after re-delivery of theIntent
, the data will be loaded from file and nothing will be lost. The files will be automatically cleaned up when data reading takes place. Just likeOPTIMISED
, this mode does not have limit to the size of data that is being passed and does not have a drawback thatOPTIMISED
mode has, but some users might be concerned about files to which data is being written.- These files will contain end-user's private data, such as image of the object that was scanned and the extracted data. Also these files may remain saved in your application's private folder until the next successful reading of data from the file.
- If your app gets restarted multiple times, only after first restart will reading succeed and will delete the file after reading. If multiple restarts take place, you must implement
onSaveInstanceState
and save bundle back to file by calling itssaveState
method. Also, after saving state, you should ensure that you clear saved state in youronResume
, asonCreate
may not be called if activity is not restarted, whileonSaveInstanceState
may be called as soon as your activity goes to background (beforeonStop
), even though activity may not be killed at later time. - If saving data to file in private storage is a concern to you, you should use either
OPTIMISED
mode to transfer large data and image between activities or create your own mechanism for data transfer. Note that your application's private folder is only accessible by your application and your application alone, unless the end-user's device is rooted.
This section will give a list of all Recognizer
objects that are available within BlinkID SDK, their purpose and recommendations how they should be used to get best performance and user experience.
The FrameGrabberRecognizer
is the simplest recognizer in BlinkID SDK, as it does not perform any processing on the given image, instead it just returns that image back to its FrameCallback
. Its Result never changes state from Empty.
This recognizer is best for easy capturing of camera frames with RecognizerRunnerView
. Note that Image
sent to onFrameAvailable
are temporary and their internal buffers all valid only until the onFrameAvailable
method is executing - as soon as method ends, all internal buffers of Image
object are disposed. If you need to store Image
object for later use, you must create a copy of it by calling clone
.
Also note that FrameCallback
interface extends Parcelable interface, which means that when implementing FrameCallback
interface, you must also implement Parcelable
interface.
This is especially important if you plan to transfer FrameGrabberRecognizer
between activities - in that case, keep in mind that the instance of your object may not be the same as the instance on which onFrameAvailable
method gets called - the instance that receives onFrameAvailable
calls is the one that is created within activity that is performing the scan.
The SuccessFrameGrabberRecognizer
is a special Recognizer
that wraps some other Recognizer
and impersonates it while processing the image. However, when the Recognizer
being impersonated changes its Result
into Valid
state, the SuccessFrameGrabberRecognizer
captures the image and saves it into its own Result
object.
Since SuccessFrameGrabberRecognizer
impersonates its slave Recognizer
object, it is not possible to give both concrete Recognizer
object and SuccessFrameGrabberRecognizer
that wraps it to same RecognizerBundle
- doing so will have the same result as if you have given two instances of same Recognizer
type to the RecognizerBundle
- it will crash your application.
This recognizer is best for use cases when you need to capture the exact image that was being processed by some other Recognizer
object at the time its Result
became Valid
. When that happens, SuccessFrameGrabber's
Result
will also become Valid
and will contain described image. That image can then be retrieved with getSuccessFrame()
method.
Unless stated otherwise for concrete recognizer, single side BlinkID recognizers from this list can be used in any context, but they work best with BlinkIdUISettings
and DocumentScanUISettings
, with UIs best suited for document scanning.
Combined recognizers should be used with BlinkIdUISettings
. They manage scanning of multiple document sides in the single camera opening and guide the user through the scanning process. Some combined recognizers support scanning of multiple document types, but only one document type can be scanned at a time.
The BlinkIdSingleSideRecognizer
scans and extracts data from the single side of the supported document.
You can find the list of the currently supported documents here.
We will continue expanding this recognizer by adding support for new document types in the future. Star this repo to stay updated.
The BlinkIdSingleSideRecognizer
works best with the BlinkIdUISettings
and BlinkIdOverlayController
.
Use BlinkIdMultiSideRecognizer
for scanning both sides of the supported document. First, it scans and extracts data from the front, then scans and extracts data from the back, and finally, combines results from both sides. The BlinkIdMultiSideRecognizer
also performs data matching and returns a flag if the extracted data captured from the front side matches the data from the back.
You can find the list of the currently supported documents here.
We will continue expanding this recognizer by adding support for new document types in the future. Star this repo to stay updated.
The BlinkIdMultiSideRecognizer
works best with the BlinkIdUISettings
and BlinkIdOverlayController
.
The MrtdRecognizer
is used for scanning and data extraction from the Machine Readable Zone (MRZ) of the various Machine Readable Travel Documents (MRTDs) like ID cards and passports. This recognizer is not bound to the specific country, but it can be configured to only return data that match some criteria defined by the MrzFilter
.
You can find information about usage context at the beginning of this section.
The MrtdCombinedRecognizer
scans Machine Readable Zone (MRZ) after scanning the full document image and face image (usually MRZ is on the back side and face image is on the front side of the document). Internally, it uses DocumentFaceRecognizer for obtaining full document image and face image as the first step and then MrtdRecognizer for scanning the MRZ.
You can find information about usage context at the beginning of this section.
The PassportRecognizer
is used for scanning and data extraction from the Machine Readable Zone (MRZ) of the various passport documents. This recognizer also returns face image from the passport.
You can find information about usage context at the beginning of this section.
The VisaRecognizer
is used for scanning and data extraction from the Machine Readable Zone (MRZ) of the various visa documents. This recognizer also returns face image from the visa document.
You can find information about usage context at the beginning of this section.
The IdBarcodeRecognizer
is used for scanning barcodes from various ID cards. Check this document to see the list of supported document types.
You can find information about usage context at the beginning of this section.
The DocumentFaceRecognizer
is a special type of recognizer that only returns face image and full document image of the scanned document. It does not extract document fields like first name, last name, etc. This generic recognizer can be used to obtain document images in cases when specific support for some document type is not available.
You can find information about usage context at the beginning of this section.
You need to ensure that the final app gets all resources required by BlinkID. At the time of writing this documentation, Android does not have support for combining multiple AAR libraries into single fat AAR. The problem is that resource merging is done while building application, not while building AAR, so application must be aware of all its dependencies. There is no official Android way of "hiding" third party AAR within your AAR.
This problem is usually solved with transitive Maven dependencies, i.e. when publishing your AAR to Maven you specify dependencies of your AAR so they are automatically referenced by app using your AAR. Besides this, there are also several other approaches you can try:
- you can ask your clients to reference BlinkID in their app when integrating your SDK
- since the problem lies in resource merging part you can try avoiding this step by ensuring your library will not use any component from BlinkID that uses resources (i.e. built-in activities, fragments and views, except
RecognizerRunnerView
). You can perform custom UI integration while taking care that all resources (strings, layouts, images, ...) used are solely from your AAR, not from BlinkID. Then, in your AAR you should not referenceLibBlinkID.aar
as gradle dependency, instead you should unzip it and copy its assets to your AAR’s assets folder, itsclasses.jar
to your AAR’s lib folder (which should be referenced by gradle as jar dependency) and contents of its jni folder to your AAR’s src/main/jniLibs folder. - Another approach is to use 3rd party unofficial gradle script that aim to combine multiple AARs into single fat AAR. Use this script at your own risk and report issues to its developers - we do not offer support for using that script.
- There is also a 3rd party unofficial gradle plugin which aims to do the same, but is more up to date with latest updates to Android gradle plugin. Use this plugin at your own risk and report all issues with using to its developers - we do not offer support for using that plugin.
BlinkID is distributed with ARMv7 and ARM64 native library binaries.
ARMv7 architecture gives the ability to take advantage of hardware accelerated floating point operations and SIMD processing with NEON. This gives BlinkID a huge performance boost on devices that have ARMv7 processors. Most new devices (all since 2012.) have ARMv7 processor so it makes little sense not to take advantage of performance boosts that those processors can give. Also note that some devices with ARMv7 processors do not support NEON and VFPv4 instruction sets, most popular being those based on NVIDIA Tegra 2, ARM Cortex A9 and older. Since these devices are old by today's standard, BlinkID does not support them. For the same reason, BlinkID does not support devices with ARMv5 (armeabi
) architecture.
ARM64 is the new processor architecture that most new devices use. ARM64 processors are very powerful and also have the possibility to take advantage of new NEON64 SIMD instruction set to quickly process multiple pixels with a single instruction.
There are some issues to be considered:
- ARMv7 build of the native library cannot be run on devices that do not have ARMv7 compatible processor
- ARMv7 processors do not understand x86 instruction set
- ARM64 processors understand ARMv7 instruction set, but ARMv7 processors do not understand ARM64 instructions.
- NOTE: as of the year 2018, some android devices that ship with ARM64 processors do not have full compatibility with ARMv7. This is mostly due to incorrect configuration of Android's 32-bit subsystem by the vendor, however Google decided that as of August 2019 all apps on PlayStore that contain native code need to have native support for 64-bit processors (this includes ARM64 and x86_64) - this is in anticipation of future Android devices that will support 64-bit code only, i.e. that will have ARM64 processors that do not understand ARMv7 instruction set.
- if ARM64 processor executes ARMv7 code, it does not take advantage of modern NEON64 SIMD operations and does not take advantage of 64-bit registers it has - it runs in emulation mode
LibBlinkID.aar
archive contains ARMv7 and ARM64 builds of the native library. By default, when you integrate BlinkID into your app, your app will contain native builds for all these processor architectures. Thus, BlinkID will work on ARMv7 and ARM64 devices and will use ARMv7 features on ARMv7 devices and ARM64 features on ARM64 devices. However, the size of your application will be rather large.
We recommend that you distribute your app using App Bundle. This will defer apk generation to Google Play, allowing it to generate minimal APK for each specific device that downloads your app, including only required processor architecture support.
If you are unable to use App Bundle, you can create multiple flavors of your app - one flavor for each architecture. With gradle and Android studio this is very easy - just add the following code to build.gradle
file of your app:
android {
...
splits {
abi {
enable true
reset()
include 'armeabi-v7a', 'arm64-v8a'
universalApk true
}
}
}
With that build instructions, gradle will build two different APK files for your app. Each APK will contain only native library for one processor architecture and one APK will contain all architectures. In order for Google Play to accept multiple APKs of the same app, you need to ensure that each APK has different version code. This can easily be done by defining a version code prefix that is dependent on architecture and adding real version code number to it in following gradle script:
// map for the version code
def abiVersionCodes = ['armeabi-v7a':1, 'arm64-v8a':2]
import com.android.build.OutputFile
android.applicationVariants.all { variant ->
// assign different version code for each output
variant.outputs.each { output ->
def filter = output.getFilter(OutputFile.ABI)
if(filter != null) {
output.versionCodeOverride = abiVersionCodes.get(output.getFilter(OutputFile.ABI)) * 1000000 + android.defaultConfig.versionCode
}
}
}
For more information about creating APK splits with gradle, check this article from Google.
After generating multiple APK's, you need to upload them to Google Play. For tutorial and rules about uploading multiple APK's to Google Play, please read the official Google article about multiple APKs.
If you won't be distributing your app via Google Play or for some other reasons want to have single APK of smaller size, you can completely remove support for certain CPU architecture from your APK. This is not recommended due to consequences.
To keep only some CPU architectures, for example armeabi-v7a
and arm64-v8a
, add the following statement to your android
block inside build.gradle
:
android {
...
ndk {
// Tells Gradle to package the following ABIs into your application
abiFilters 'armeabi-v7a', 'arm64-v8a'
}
}
This will remove other architecture builds for all native libraries used by the application.
To remove support for a certain CPU architecture only for BlinkID, add the following statement to your android
block inside build.gradle
:
android {
...
packagingOptions {
exclude 'lib/<ABI>/libBlinkID.so'
}
}
where <ABI>
represents the CPU architecture you want to remove:
- to remove ARMv7 support, use
exclude 'lib/armeabi-v7a/libBlinkID.so'
- to remove ARM64 support, use
exclude 'lib/arm64-v8a/libBlinkID.so'
- NOTE: this is not recommended. See this notice.
You can also remove multiple processor architectures by specifying exclude
directive multiple times. Just bear in mind that removing processor architecture will have side effects on performance and stability of your app. Please read this for more information.
-
Google decided that as of August 2019 all apps on Google Play that contain native code need to have native support for 64-bit processors (this includes ARM64 and x86_64). This means that you cannot upload application to Google Play Console that supports only 32-bit ABI and does not support corresponding 64-bit ABI.
-
By removing ARMv7 support, BlinkID will not work on devices that have ARMv7 processors.
-
By removing ARM64 support, BlinkID will not use ARM64 features on ARM64 device
- also, some future devices may ship with ARM64 processors that will not support ARMv7 instruction set. Please see this note for more information.
If you are combining BlinkID library with other libraries that contain native code into your application, make sure you match the architectures of all native libraries. For example, if third party library has got only ARMv7 version, you must use exactly ARMv7 version of BlinkID with that library, but not ARM64. Using this architectures will crash your app at initialization step because JVM will try to load all its native dependencies in same preferred architecture and will fail with UnsatisfiedLinkError
.
BlinkID contains native code that depends on the C++ runtime. This runtime is provided by the libc++_shared.so
, which needs to be available in your app that is using BlinkID. However, the same file is also used by various other libraries that contain native components. If you happen to integrate both such library together with BlinkID in your app, your build will fail with an error similar to this one:
* What went wrong:
Execution failed for task ':app:mergeDebugNativeLibs'.
> A failure occurred while executing com.android.build.gradle.internal.tasks.MergeJavaResWorkAction
> 2 files found with path 'lib/arm64-v8a/libc++_shared.so' from inputs:
- <path>/.gradle/caches/transforms-3/3d428f9141586beb8805ce57f97bedda/transformed/jetified-opencv-4.5.3.0/jni/arm64-v8a/libc++_shared.so
- <path>/.gradle/caches/transforms-3/609476a082a81bd7af00fd16a991ee43/transformed/jetified-blinkid-6.11.1/jni/arm64-v8a/libc++_shared.so
If you are using jniLibs and CMake IMPORTED targets, see
https://developer.android.com/r/tools/jniLibs-vs-imported-targets
The error states that multiple different dependencies provide the same file lib/arm64/libc++_shared.so
(in this case, OpenCV and BlinkID).
You can resolve this issue by making sure that the dependency that uses newer version of libc++_shared.so
is listed first in your dependency list, and then, simply add the following to your build.gradle
:
android {
packaging {
jniLibs {
pickFirsts.add("lib/armeabi-v7a/libc++_shared.so")
pickFirsts.add("lib/arm64-v8a/libc++_shared.so")
}
}
}
IMPORTANT NOTE
The code above will always select the first libc++_shared.so
from your dependency list, so make sure that the dependency that uses the latest version of libc++_shared.so
is listed first. This is because libc++_shared.so
is backward-compatible, but not forward-compatible. This means that, e.g. libBlinkID.so
built against libc++_shared.so
from NDK r24 will work without problems when you package it together with libc++_shared.so
from NDK r26, but will crash when you package it together with libc++_shared.so
from NDK r21. This is true for all your native dependencies.
In case of problems with SDK integration, first make sure that you have followed integration instructions. If you're still having problems, please contact us at help.microblink.com.
If you are getting "invalid license key" error or having other license-related problems (e.g. some feature is not enabled that should be or there is a watermark on top of camera), first check the ADB logcat. All license-related problems are logged to error log so it is easy to determine what went wrong.
When you have to determine what is the license-relate problem or you simply do not understand the log, you should contact us help.microblink.com. When contacting us, please make sure you provide following information:
- exact package name of your app (from your
AndroidManifest.xml
and/or yourbuild.gradle
file) - license that is causing problems
- please stress out that you are reporting problem related to Android version of BlinkID SDK
- if unsure about the problem, you should also provide excerpt from ADB logcat containing license error
Keep in mind: Versions 5.8.0 and above require an internet connection to work under our new License Management Program.
We’re only asking you to do this so we can validate your trial license key. Data extraction still happens offline, on the device itself. Once the validation is complete, you can continue using the SDK in offline mode (or over a private network) until the next check.
If you are having problems with scanning certain items, undesired behaviour on specific device(s), crashes inside BlinkID or anything unmentioned, please do as follows:
-
enable logging to get the ability to see what is library doing. To enable logging, put this line in your application:
com.microblink.blinkid.util.Log.setLogLevel(com.microblink.blinkid.util.Log.LogLevel.LOG_VERBOSE);
After this line, library will display as much information about its work as possible. Please save the entire log of scanning session to a file that you will send to us. It is important to send the entire log, not just the part where crash occurred, because crashes are sometimes caused by unexpected behaviour in the early stage of the library initialization.
-
Contact us at help.microblink.com describing your problem and provide following information:
- log file obtained in previous step
- high resolution scan/photo of the item that you are trying to scan
- information about device that you are using - we need exact model name of the device. You can obtain that information with any app like this one
- please stress out that you are reporting problem related to Android version of BlinkID SDK
After switching from trial to production license I get InvalidLicenseKeyException
when I construct specific Recognizer
object
Each license key contains information about which features are allowed to use and which are not. This exception indicates that your production license does not allow using of specific Recognizer
object. You should contact support to check if provided license is OK and that it really contains all features that you have purchased.
Whenever you construct any Recognizer
object or any other object that derives from Entity
, a check whether license allows using that object will be performed. If license is not set prior constructing that object, you will get InvalidLicenseKeyException
. We recommend setting license as early as possible in your app, ideally in onCreate
callback of your Application singleton.
When my app starts, I get exception telling me that some resource/class cannot be found or I get ClassNotFoundException
This usually happens when you perform integration into Eclipse project and you forget to add resources or native libraries into the project. You must alway take care that same versions of both resources, assets, java library and native libraries are used in combination. Combining different versions of resources, assets, java and native libraries will trigger crash in SDK. This problem can also occur when you have performed improper integration of BlinkID SDK into your SDK. Please read how to embed BlinkID inside another SDK.
This error happens when JVM fails to load some native method from native library If performing integration into Android studio and this error happens, make sure that you have correctly combined BlinkID SDK with third party SDKs that contain native code, especially if you need resolving conflict over libc++_shared.so
. If this error also happens in our integration sample apps, then it may indicate a bug in the SDK that is manifested on specific device. Please report that to our support team.
Please consult the section about resolving libc++_shared.so
conflict.
Make sure that after adding your callback to MetadataCallbacks
you have applied changes to RecognizerRunnerView
or RecognizerRunner
as described in this section.
I've removed my callback to MetadataCallbacks
object, and now app is crashing with NullPointerException
Make sure that after removing your callback from MetadataCallbacks
you have applied changes to RecognizerRunnerView
or RecognizerRunner
as described in this section.
In my onScanningDone
callback I have the result inside my Recognizer
, but when scanning activity finishes, the result is gone
This usually happens when using RecognizerRunnerView
and forgetting to pause the RecognizerRunnerView
in your onScanningDone
callback. Then, as soon as onScanningDone
happens, the result is mutated or reset by additional processing that Recognizer
performs in the time between end of your onScanningDone
callback and actual finishing of the scanning activity. For more information about statefulness of the Recognizer
objects, check this section.
I am using built-in activity to perform scanning and after scanning finishes, my app crashes with IllegalStateException
stating Data cannot be saved to intent because its size exceeds intent limit
.
This usually happens when you use Recognizer
that produces image or similar large object inside its Result
and that object exceeds the Android intent transaction limit. You should enable different intent data transfer mode. For more information about this, check this section. Also, instead of using built-in activity, you can use RecognizerRunnerFragment
with built-in scanning overlay.
This usually happens when you attempt to transfer standalone Result
that contains images or similar large objects via Intent and the size of the object exceeds Android intent transaction limit. Depending on the device, you will get either TransactionTooLargeException, a simple message BINDER TRANSACTION FAILED
in log and your app will freeze or your app will get into restart loop. We recommend that you use RecognizerBundle
and its API for sending Recognizer
objects via Intent in a more safe manner (check this section for more information). However, if you really need to transfer standalone Result
object (e.g. Result
object obtained by cloning Result
object owned by specific Recognizer
object), you need to do that using global variables or singletons within your application. Sending large objects via Intent is not supported by Android.
When automatic scanning of camera frames with our camera management is used (provided camera overlays or direct usage of RecognizerRunnerView
), we use a stream of video frames and send multiple images to the recognition to boost reading accuracy. Also, we perform frame quality analysis and combine scanning results from multiple camera frames. On the other hand, when you are using the Direct API with a single image per document side, we cannot combine multiple images. We do our best to extract as much information as possible from that image. In some cases, when the quality of the input image is not good enough, for example, when the image is blurred or when glare is present, we are not able to successfully read the document.
Online trial licenses require a public network access for validation purposes. See Licensing issues.
onOcrResult()
method in my OcrCallback
is never invoked and all Result
objects always return null
in their OCR result getters
In order to be able to obtain raw OCR result, which contains locations of each character, its value and its alternatives, you need to have a license that allows that. By default, licenses do not allow exposing raw OCR results in public API. If you really need that, please contact us and explain your use case.
You can find BlinkID SDK size report for all supported ABIs here.
Complete API reference can be found in Javadoc.
For any other questions, feel free to contact us at help.microblink.com.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for blinkid-android
Similar Open Source Tools
blinkid-ios
BlinkID iOS is a mobile SDK that enables developers to easily integrate ID scanning and data extraction capabilities into their iOS applications. The SDK supports scanning and processing various types of identity documents, such as passports, driver's licenses, and ID cards. It provides accurate and fast data extraction, including personal information and document details. With BlinkID iOS, developers can enhance their apps with secure and reliable ID verification functionality, improving user experience and streamlining identity verification processes.
SirChatalot
A Telegram bot that proves you don't need a body to have a personality. It can use various text and image generation APIs to generate responses to user messages. For text generation, the bot can use: * OpenAI's ChatGPT API (or other compatible API). Vision capabilities can be used with GPT-4 models. Function calling can be used with Function calling. * Anthropic's Claude API. Vision capabilities can be used with Claude 3 models. Function calling can be used with tool use. * YandexGPT API Bot can also generate images with: * OpenAI's DALL-E * Stability AI * Yandex ART This bot can also be used to generate responses to voice messages. Bot will convert the voice message to text and will then generate a response. Speech recognition can be done using the OpenAI's Whisper model. To use this feature, you need to install the ffmpeg library. This bot is also support working with files, see Files section for more details. If function calling is enabled, bot can generate images and search the web (limited).
qb
QANTA is a system and dataset for question answering tasks. It provides a script to download datasets, preprocesses questions, and matches them with Wikipedia pages. The system includes various datasets, training, dev, and test data in JSON and SQLite formats. Dependencies include Python 3.6, `click`, and NLTK models. Elastic Search 5.6 is needed for the Guesser component. Configuration is managed through environment variables and YAML files. QANTA supports multiple guesser implementations that can be enabled/disabled. Running QANTA involves using `cli.py` and Luigi pipelines. The system accesses raw Wikipedia dumps for data processing. The QANTA ID numbering scheme categorizes datasets based on events and competitions.
RAGMeUp
RAG Me Up is a generic framework that enables users to perform Retrieve and Generate (RAG) on their own dataset easily. It consists of a small server and UIs for communication. Best run on GPU with 16GB vRAM. Users can combine RAG with fine-tuning using LLaMa2Lang repository. The tool allows configuration for LLM, data, LLM parameters, prompt, and document splitting. Funding is sought to democratize AI and advance its applications.
oterm
Oterm is a text-based terminal client for Ollama, a large language model. It provides an intuitive and simple terminal UI, allowing users to interact with Ollama without running servers or frontends. Oterm supports multiple persistent chat sessions, which are stored along with context embeddings and system prompt customizations in a SQLite database. Users can easily customize the model's system prompt and parameters, and select from any of the models they have pulled in Ollama or their own custom models. Oterm also supports keyboard shortcuts for creating new chat sessions, editing existing sessions, renaming sessions, exporting sessions as markdown, deleting sessions, toggling between dark and light themes, quitting the application, switching to multiline input mode, selecting images to include with messages, and navigating through the history of previous prompts. Oterm is licensed under the MIT License.
PolyMind
PolyMind is a multimodal, function calling powered LLM webui designed for various tasks such as internet searching, image generation, port scanning, Wolfram Alpha integration, Python interpretation, and semantic search. It offers a plugin system for adding extra functions and supports different models and endpoints. The tool allows users to interact via function calling and provides features like image input, image generation, and text file search. The application's configuration is stored in a `config.json` file with options for backend selection, compatibility mode, IP address settings, API key, and enabled features.
ai-town
AI Town is a virtual town where AI characters live, chat, and socialize. This project provides a deployable starter kit for building and customizing your own version of AI Town. It features a game engine, database, vector search, auth, text model, deployment, pixel art generation, background music generation, and local inference. You can customize your own simulation by creating characters and stories, updating spritesheets, changing the background, and modifying the background music.
Open-LLM-VTuber
Open-LLM-VTuber is a project in early stages of development that allows users to interact with Large Language Models (LLM) using voice commands and receive responses through a Live2D talking face. The project aims to provide a minimum viable prototype for offline use on macOS, Linux, and Windows, with features like long-term memory using MemGPT, customizable LLM backends, speech recognition, and text-to-speech providers. Users can configure the project to chat with LLMs, choose different backend services, and utilize Live2D models for visual representation. The project supports perpetual chat, offline operation, and GPU acceleration on macOS, addressing limitations of existing solutions on macOS.
opencommit
OpenCommit is a tool that auto-generates meaningful commits using AI, allowing users to quickly create commit messages for their staged changes. It provides a CLI interface for easy usage and supports customization of commit descriptions, emojis, and AI models. Users can configure local and global settings, switch between different AI providers, and set up Git hooks for integration with IDE Source Control. Additionally, OpenCommit can be used as a GitHub Action to automatically improve commit messages on push events, ensuring all commits are meaningful and not generic. Payments for OpenAI API requests are handled by the user, with the tool storing API keys locally.
warc-gpt
WARC-GPT is an experimental retrieval augmented generation pipeline for web archive collections. It allows users to interact with WARC files, extract text, generate text embeddings, visualize embeddings, and interact with a web UI and API. The tool is highly customizable, supporting various LLMs, providers, and embedding models. Users can configure the application using environment variables, ingest WARC files, start the server, and interact with the web UI and API to search for content and generate text completions. WARC-GPT is designed for exploration and experimentation in exploring web archives using AI.
gpt-subtrans
GPT-Subtrans is an open-source subtitle translator that utilizes large language models (LLMs) as translation services. It supports translation between any language pairs that the language model supports. Note that GPT-Subtrans requires an active internet connection, as subtitles are sent to the provider's servers for translation, and their privacy policy applies.
redbox-copilot
Redbox Copilot is a retrieval augmented generation (RAG) app that uses GenAI to chat with and summarise civil service documents. It increases organisational memory by indexing documents and can summarise reports read months ago, supplement them with current work, and produce a first draft that lets civil servants focus on what they do best. The project uses a microservice architecture with each microservice running in its own container defined by a Dockerfile. Dependencies are managed using Python Poetry. Contributions are welcome, and the project is licensed under the MIT License.
unstructured
The `unstructured` library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of `unstructured` revolve around streamlining and optimizing the data processing workflow for LLMs. `unstructured` modular functions and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient in transforming unstructured data into structured outputs.
fabric
Fabric is an open-source framework for augmenting humans using AI. It provides a structured approach to breaking down problems into individual components and applying AI to them one at a time. Fabric includes a collection of pre-defined Patterns (prompts) that can be used for a variety of tasks, such as extracting the most interesting parts of YouTube videos and podcasts, writing essays, summarizing academic papers, creating AI art prompts, and more. Users can also create their own custom Patterns. Fabric is designed to be easy to use, with a command-line interface and a variety of helper apps. It is also extensible, allowing users to integrate it with their own AI applications and infrastructure.