API overview¶
This is a brief overview of the API design goals, the SDK's conceptual model, and the two supported audio processing modes.
Design goals¶
The TrulyNatural SDK API is a result of these design goals:
- Pure C implementation.
    - Lowest common denominator, widest toolchain availability.
    - No C++ runtime overhead.
    - Fast.
- Simple API.
    - Small footprint: limited number of functions and data types.
    - Generic, independent of the inference task.
    - Fundamental data types only: floating point, integer, strings, streams, and opaque object instance handles.
    - Make it easier to provide bindings for languages other than C.
- Flexible configuration.
    - Hide complexity, but still allow for fine-grained configuration if needed.
    - Settings indexed by string names, documented settings define public API.
- One self-contained model per task.
    - Model includes a flow graph that specifies how various low-level internal modules (feature extractors, acoustic models, etc.) connect and interact.
    - Includes all required module configurations.
- Run on a wide variety of platforms, including ones without file system support.
There is a significant downside to these design choices: Discoverability is very limited. You cannot determine model behavior from function or method names alone. You must refer to the model type documentation for expected task behavior and available settings.
Conceptual model¶
This library uses a dataflow approach to evaluate speech recognition tasks. It uses inversion of control: The SDK invokes event handlers to report results and control task flow.
The API contains two primary data types: a Session, used for model inference, and a Stream, an abstraction for input and output.
Sessions hold the entire state of a model instance, and use streams for all input and output. For example, a single load function loads a model into a session, and because it reads from a stream it can load from a named file, an open FILE * handle, a memory segment, the code segment, or compressed assets on Android.
Models (snsr files) define flow pipelines and session behavior. These contain the serialized content[^1] of the session flow graph, including all binary models and configurations. Think of these as hierarchical key-value databases. Once loaded into a Session, you can query or change the setting keys with generic getter and setter functions.
Processing modes¶
We support two modes for audio processing:
- Pull mode, where the run function reads audio from the configured input stream. This blocks on read until new data are available. The run function returns only when the stream runs out of data (for example the end of a file), an event handler tells it to stop, or an error occurs.
- Push mode, where the application repeatedly calls the push function with small chunks of the audio data. The push function returns once it has processed or buffered these data. The application eventually calls stop to flush and process any buffered data.
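The difference between the two modes is mostly a question of who owns the loop. The following is a toy sketch of that control flow, not SDK code: ToySession, its run, push, and stop methods, the string return codes, and the four-byte chunk size are all hypothetical stand-ins chosen for illustration.

```python
import io

OK, STOP, STREAM_END = "OK", "STOP", "STREAM_END"  # toy return codes


class ToySession:
    """Hypothetical stand-in for a session; it only mimics the control flow."""

    def __init__(self, handler, chunk_size=4):
        self.handler = handler        # called by the session (inversion of control)
        self.chunk_size = chunk_size
        self.source = None            # input stream, used by pull mode only
        self._buffer = b""            # partial chunk held back in push mode

    def run(self):
        """Pull mode: the session owns the loop and reads the source itself."""
        while True:
            chunk = self.source.read(self.chunk_size)  # would block on a live device
            if not chunk:
                return STREAM_END                      # stream ran out of data
            if self.handler(chunk) == STOP:
                return STOP                            # a handler asked to stop

    def push(self, data):
        """Push mode: process whole chunks now, buffer the remainder."""
        self._buffer += data
        while len(self._buffer) >= self.chunk_size:
            chunk = self._buffer[:self.chunk_size]
            self._buffer = self._buffer[self.chunk_size:]
            self.handler(chunk)

    def stop(self):
        """Flush whatever push() left in the buffer."""
        if self._buffer:
            self.handler(self._buffer)
            self._buffer = b""


pulled, pushed = [], []


def collect_into(target):
    """Build a handler that records each chunk and lets processing continue."""
    def handler(chunk):
        target.append(chunk)
        return OK
    return handler


# Pull mode: configure the source, then hand control to run().
pull_session = ToySession(collect_into(pulled))
pull_session.source = io.BytesIO(b"abcdefghi")
pull_rc = pull_session.run()

# Push mode: the application owns the loop and feeds arbitrary-sized pieces.
push_session = ToySession(collect_into(pushed))
for piece in (b"abcde", b"fgh", b"i"):
    push_session.push(piece)
push_session.stop()
```

In this toy, both sessions hand the same chunks to their handlers. The only difference is whether run or the application's own for loop drives the processing.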
Model evaluation typically follows this recipe:
- Create a new session instance with new.
- Load a task model into the instance.
- Set the input source stream.
- Register one or more event handlers.
- Enter the main loop by calling run. The library processes the input streams and invokes event handlers at appropriate times. The main loop continues until a terminating condition is reached, such as an event handler returning an error code.
- Release the session instance.
Examples that follow this recipe: Your first program, live-spot.c, evalUDT.java, and hello_world.py.
Push-mode examples: push-audio.c and stt_push.py.
Language bindings¶
This version of the TrulyNatural SDK supports three language bindings: C, Java, and Python. C is the native API; Java and Python are generated wrappers with idiomatic naming and error handling on top of the same session and stream model.
C¶
The C binding exposes the native API directly. Functions use the snsr prefix and pass opaque handles explicitly, for example snsrSetHandler(s, key, callback).
The C binding uses a latched-error model: every function returns SnsrRC, and once a session enters an error state, every subsequent call short-circuits with the same code until clearRC (or reset) is called. Read the latched code with rC and the human-readable detail with errorDetail.
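The latched-error behavior can be modeled in a few lines. The sketch below is a toy written in Python for brevity, not the C binding: LatchedSession, set_int, rc, error_detail, and clear_rc are hypothetical names that loosely mirror the roles of rC, errorDetail, and clearRC described above.

```python
OK, INVALID_ARG = "OK", "INVALID_ARG"  # toy return codes


class LatchedSession:
    """Hypothetical model of a session that latches its first error."""

    def __init__(self):
        self._rc = OK
        self._detail = ""
        self.settings = {}
        self.calls_executed = 0   # counts calls that were not short-circuited

    def set_int(self, key, value):
        if self._rc != OK:
            return self._rc       # latched: skip the work, repeat the same code
        self.calls_executed += 1
        if not isinstance(value, int):
            self._rc = INVALID_ARG
            self._detail = f"{key}: expected int, got {type(value).__name__}"
            return self._rc
        self.settings[key] = value
        return OK

    def rc(self):
        return self._rc

    def error_detail(self):
        return self._detail

    def clear_rc(self):
        self._rc, self._detail = OK, ""


s = LatchedSession()
s.set_int("a", 1)
s.set_int("b", "oops")            # fails and latches INVALID_ARG
skipped = s.set_int("c", 3)       # short-circuits with the same code
latched_rc, latched_detail = s.rc(), s.error_detail()

s.clear_rc()
after_clear = s.set_int("c", 3)   # works again once the latch is cleared
```

One consequence of this pattern, at least in the toy, is that a caller can issue a sequence of calls and check the return code once at the end instead of after every call.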
The C binding uses reference counting on every Session, Stream, and Callback handle (and on the C string returned by getString); manage lifetimes with retain and release.
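As a reminder of how retain and release interact, here is a minimal reference-counting sketch. It is a generic illustration, not the SDK's implementation: ToyHandle is hypothetical, and the assumption that a new handle starts with one reference held by its creator is mine, not something this page states.

```python
class ToyHandle:
    """Hypothetical reference-counted handle."""

    def __init__(self, name):
        self.name = name
        self.refs = 1          # assumption: the creator holds the first reference
        self.freed = False

    def retain(self):
        if self.freed:
            raise RuntimeError(f"{self.name}: retain after free")
        self.refs += 1
        return self

    def release(self):
        if self.freed:
            raise RuntimeError(f"{self.name}: release after free")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # last owner gone, resources can be reclaimed


stream = ToyHandle("stream")
stream.retain()                # a second owner keeps the handle alive
stream.release()               # the creator is done with it
alive_after_first_release = not stream.freed
stream.release()               # the second owner lets go, handle is freed
```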
Java¶
A C function snsrXxx(SnsrSession s, ...) becomes a Java method Session.xxx(...): the Snsr prefix and the SnsrSession first argument are absorbed into the receiver. For example, snsrSetHandler(s, key, c) ↔ s.setHandler(key, c).
The Java binding does not surface latched errors to callers. Each Java method either completes successfully or throws an exception describing the failure; subsequent method calls on the same session start fresh. Eight methods that perform I/O (Session.load, Session.run, Stream.copy, Stream.getDelim, Stream.open, Stream.read, Stream.skip, and Stream.write) declare throws java.io.IOException and so are checked. All other exceptions are unchecked subclasses of java.lang.RuntimeException. The mapping from SnsrRC to Java exception class is part of the binding's contract:
| Java exception class | Thrown for SnsrRC codes such as | Typical cause |
|---|---|---|
| java.io.IOException (checked) | EOF, STREAM, STREAM_END, NOT_OPEN, BUFFER_OVERRUN, BUFFER_UNDERRUN, DELIM_NOT_FOUND, LIBRARY_TOO_OLD | Stream I/O failed or reached end-of-data. Only thrown from the methods that declare throws IOException; identical conditions outside those methods raise RuntimeException. |
| java.lang.OutOfMemoryError | NO_MEMORY, NOT_ENOUGH_SPACE | Allocation failed. |
| java.lang.IllegalArgumentException | INVALID_ARG, INVALID_HANDLE, INCORRECT_SETTING_TYPE, SETTING_IS_READ_ONLY, FORMAT_NOT_SUPPORTED, VERSION_MISMATCH | The caller passed a value that is the wrong type, the wrong format, or otherwise unacceptable. |
| java.lang.IndexOutOfBoundsException | SETTING_NOT_FOUND, SETTING_NOT_AVAILABLE, VALUE_NOT_SET, ARG_OUT_OF_RANGE, NAME_NOT_UNIQUE, ITERATION_LIMIT | A lookup by name, index, or port missed; or a numeric / iteration limit was exceeded. |
| java.lang.RuntimeException | All other non-OK codes: ERROR, NOT_IMPLEMENTED, CONFIGURATION_*, ELEMENT_*, LICENSE_*, NO_MODEL, NOT_INITIALIZED, NOT_SUPPORTED, TIMED_OUT, and so on. | Misconfiguration, license issue, internal API violation, or the catch-all bucket. |
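The table reads naturally as a lookup with a fallback. The sketch below restates it as data, written in Python only to keep the example short. It is not part of any binding: java_exception_for and its declares_io_exception flag are hypothetical, and the code names are copied from the rows above without their SnsrRC prefix.

```python
# Rows of the table above, written as "exception class -> code names".
_ROWS = {
    "java.io.IOException": (
        "EOF STREAM STREAM_END NOT_OPEN BUFFER_OVERRUN BUFFER_UNDERRUN "
        "DELIM_NOT_FOUND LIBRARY_TOO_OLD"
    ),
    "java.lang.OutOfMemoryError": "NO_MEMORY NOT_ENOUGH_SPACE",
    "java.lang.IllegalArgumentException": (
        "INVALID_ARG INVALID_HANDLE INCORRECT_SETTING_TYPE SETTING_IS_READ_ONLY "
        "FORMAT_NOT_SUPPORTED VERSION_MISMATCH"
    ),
    "java.lang.IndexOutOfBoundsException": (
        "SETTING_NOT_FOUND SETTING_NOT_AVAILABLE VALUE_NOT_SET ARG_OUT_OF_RANGE "
        "NAME_NOT_UNIQUE ITERATION_LIMIT"
    ),
}
EXCEPTION_BY_CODE = {
    code: exc for exc, codes in _ROWS.items() for code in codes.split()
}
FALLBACK = "java.lang.RuntimeException"   # the catch-all row


def java_exception_for(code, declares_io_exception=False):
    """Return the exception class name the table assigns to a code.

    declares_io_exception models whether the calling method declares
    `throws IOException`; elsewhere the table says RuntimeException is raised.
    """
    if code == "OK":
        return None
    exc = EXCEPTION_BY_CODE.get(code, FALLBACK)
    if exc == "java.io.IOException" and not declares_io_exception:
        return FALLBACK
    return exc


missing_setting = java_exception_for("SETTING_NOT_FOUND")
eof_in_read = java_exception_for("EOF", declares_io_exception=True)
eof_elsewhere = java_exception_for("EOF")
no_model = java_exception_for("NO_MODEL")
```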
Callbacks that need to control the run loop without raising an exception may return any of OK, STREAM_END, STOP, SKIP, REPEAT, or TIMED_OUT; any other return value from a callback is translated into an exception by the same mapping.
Java methods that have no out-parameters in C return the Session instance instead, so callers chain freely:
    s.load(input).setHandler(KEY, listener).run();
When an exception is thrown, the underlying SnsrRC code remains available on the Session (or Stream) for the duration of the catch block: call rC to read it programmatically, or errorDetail for the human-readable message. (The exception's getMessage() is set to the same errorDetail text.) This is useful when the exception class alone is too coarse — for example, distinguishing a missing setting from an unsupported one when both surface as IndexOutOfBoundsException:
    try {
        s.set(KEY, value).run();
    } catch (IndexOutOfBoundsException e) {
        if (s.rC() == SnsrRC.SETTING_NOT_FOUND) {
            // Treat as a config-file typo, fall back to a default.
        } else {
            throw e;
        }
    }
The handler does not need to do anything to "reset" the session — as above, the next method call on the same session starts fresh.
The Java binding uses standard garbage collection. Explicit retain / release are not exposed in Java and are not needed; Session.release() is provided for callers who want to free native resources promptly without waiting for GC, but is otherwise optional.
Python 7.8.0¶
The Python binding follows the Java naming recipe with PEP 8 snake_case: drop the snsr/Snsr prefix and the session/stream receiver, then convert camelCase to snake_case (snsrSetInt → set_int, snsrGetString → get_string). Class methods use the type name instead of a prefix (snsrStreamFromAudioFile → Stream.from_audio_file). A few names differ on purpose: snsrSet(s, "+i+…") → apply("+i+…") (not set_*), snsrStreamFromFileName → Stream.from_filename, and stream status uses properties (s.rc, s.error_detail) instead of snsrStreamRC / snsrStreamErrorDetail. Look up signatures on the Python tabs throughout this reference (setting keys and enums include Python automatically).
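The mechanical part of this recipe is easy to express in code. The helper below is a hypothetical illustration, not something the binding ships: python_name and to_snake are invented names, only the Stream type prefix is handled, and the single special case is the one the paragraph above lists.

```python
import re

# A name the text lists as a deliberate exception to the mechanical rule.
SPECIAL_CASES = {"snsrStreamFromFileName": "Stream.from_filename"}


def to_snake(camel):
    """Convert camelCase or PascalCase to snake_case."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", camel).lower()


def python_name(c_name):
    """Guess the Python spelling of a C function name (illustrative only)."""
    if c_name in SPECIAL_CASES:
        return SPECIAL_CASES[c_name]
    # Drop the snsr/Snsr prefix.
    rest = c_name[4:] if c_name.lower().startswith("snsr") else c_name
    # Class methods use the type name instead of a prefix.
    if rest.startswith("Stream") and len(rest) > len("Stream"):
        return "Stream." + to_snake(rest[len("Stream"):])
    return to_snake(rest)


set_int = python_name("snsrSetInt")
get_string = python_name("snsrGetString")
from_audio_file = python_name("snsrStreamFromAudioFile")
from_filename = python_name("snsrStreamFromFileName")
mechanical_guess = to_snake("FromFileName")   # what the rule alone would give
```

The last line shows why from_filename has to be listed as an exception: the mechanical rule on its own would produce from_file_name.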
The Python binding also avoids a latched-error session state. Any call that would return a non-OK SnsrRC in C raises snsr.Error instead; read the detail string from Error.message (there is no clearRC or session errorDetail on the Python surface). run is special: it returns an RC on success (for example when a handler returns STOP) and raises snsr.Error only for true failures. Callbacks may return OK, STREAM_END, STOP, SKIP, REPEAT, or TIMED_OUT; any other code from a handler is raised as snsr.Error. Install the wheel from the SDK installer, not PyPI — see Integrate with your build § Python.
| Python outcome | SnsrRC codes | Typical cause |
|---|---|---|
| snsr.Error raised | All non-OK codes, including every code the Java binding maps to an exception class above; run is the exception (next row) | Any API or handler failure; Error.message matches C errorDetail. |
| Normal return from run | OK, STOP, STREAM_END, … | Handler stopped the loop or the input stream ended without an error. |
Context managers. with snsr.Stream(...) as s: calls open on enter and close on exit. with snsr.Session() as s: does not open the session (construction already initializes it); it only releases on exit. That differs from with open(...) on a file and from Java, which has no context-manager support.
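The asymmetry between the two context managers is easier to see with a pair of toy classes that log what happens on enter and exit. These are hypothetical stand-ins that only mimic the behavior described above; they are not snsr.Stream or snsr.Session, and they do no real I/O.

```python
events = []


class ToyStream:
    """Stand-in for a stream: entering opens it, leaving closes it."""

    def open(self):
        events.append("stream.open")

    def close(self):
        events.append("stream.close")

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False   # do not swallow exceptions


class ToySession:
    """Stand-in for a session: construction initializes, leaving releases."""

    def __init__(self):
        events.append("session.init")

    def release(self):
        events.append("session.release")

    def __enter__(self):
        return self    # nothing to open, construction already did the work

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False


with ToySession() as session, ToyStream() as stream:
    events.append("body")
```

After the with block, events records the order: the session is initialized at construction, the stream is opened on enter, and both are torn down in reverse order on exit.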
The Python binding does not expose retain or release to callers. Use with snsr.Session() as s or with snsr.Stream(...) as s to free native resources promptly; otherwise handles are released when Python finalizes the objects. Session getters that return a Stream (for example get_stream in Python) yield a retained handle (same ownership rules as the C API, documented on Memory management).
Kotlin on Android¶
Kotlin Android apps use the Java binding unchanged: the same com.sensory.speech.snsr @aar artifact, the same Session and Stream classes, and the same Listener interface, called from Kotlin source. There is no separate Kotlin SDK; the Java tabs on Inference and I/O are the reference for Kotlin callers too.
The same notes apply to desktop Kotlin built against the JAR coordinates in Integrate with your build § Java; only the items below that mention Android are platform-specific.
A few interop points are worth knowing because the call site looks slightly different from Java:
- Lambdas for Listener. Listener is a Java single-abstract-method interface, so Kotlin converts a lambda to it directly:

        session.setHandler(Snsr.RESULT_EVENT) { ses, _ ->
            println("Spotted \"${ses.getString(Snsr.RES_TEXT)}\".")
            SnsrRC.STOP
        }

    Kotlin's own interface types would require fun interface for the same conversion, but Kotlin → Java SAM interop is unconditional. Listener parameters arrive as Kotlin platform types (SnsrSession!, String!) because the Java SDK has no nullability annotations; treat them as non-null.

- Checked IOException. The I/O methods listed above declare throws java.io.IOException. The Kotlin compiler does not enforce Java checked exceptions: these methods still throw IOException at runtime, but callers receive no compile-time warning. Wrap Session.load, Session.run, and Stream I/O in try/catch (or runCatching) even though Kotlin lets you omit it.

- release() is not close(). Session and Stream expose release(); they do not implement java.lang.AutoCloseable. The Kotlin use { } helper does not apply. Call release() explicitly, or write an app-side extension function; there is no SDK-shipped Kotlin extension.

- Threading. run blocks for the lifetime of the recognition session. Use a dedicated single thread (e.g. newSingleThreadContext("snsr") or a plain Thread), the same worker-thread pattern as the Java Android samples. Do not call run on Dispatchers.IO; that pool is sized for short blocking I/O, not for an open-ended pull loop.
For end-to-end Kotlin code, see the Kotlin sub-tab in Your first program § The program.
See also: Android examples and Integrate with your build § Android.
[^1]: Similar in concept to protocol buffers, but with streamed unpacking into native data structures in RAM, no need for accessor functions, and additional features such as conversion to code for running from the text segment.