Last Updated: Sep '22

September 2022

Transcription Language Skill

  • The Transcription Skill converts audio to text, which can then be analyzed and transformed by other Language Skills.
  • Labels extracted from transcriptions carry the timestamps of the labeled words in the audio input.
  • Automated diarization: different speakers are identified, and exchanges are separated into different sections.
  • The pipeline API supports .wav & .mp3 file inputs. The Transcription Skill is required to process audio files.
  • Use the async endpoint (details below) to process large audio files asynchronously.

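```
// The Skill list is missing from the original snippet; the three Skills below
// are assumed from the output fields that are read afterwards.
const pipeline = new oneai.Pipeline(
  oneai.skills.transcribe(),
  oneai.skills.emotions(),
  oneai.skills.summarize(),
);

const output = await pipeline.runFile('my-conversation.mp3');
console.log(output.transcription.text);         // transcribed text
console.log(output.transcription.emotions);     // emotions that appear in the conversation
console.log(output.transcription.summary.text); // summary of the conversation
```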

New & Improved Language Skills

Headline Generation

  • Generate an appropriate headline for your input text.

Subheadings Generation

  • Generate a subheading for your input text.


  • The Summary Skill now generates `origin` labels, mapping words in the summary to their position in the source input (string indices for text inputs, timestamps for audio inputs).


  • The Skill that splits the input into sections is now configurable, giving control over how many sections are produced.
  • Combine with the Headline / Subheadings Skills to generate a headline per section.


  • Numbers and quantities that appear in text form are interpreted & provided as numeric data
  • Dates, times & time-durations in various formats are provided in a standardized date string
  • Added a `datetime_baseline` parameter that sets the point in time that dates & times are resolved against, in place of the current time

Figure: Convert text to numbers and dates
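To illustrate what a baseline means for date resolution, here is a minimal, self-contained sketch. It is a toy and not the Skill's implementation: the function `resolveRelativeDate` and its tiny vocabulary are hypothetical, and it only shows how the same expression maps to different standardized date strings depending on the baseline.

```javascript
// Toy illustration only; this is NOT how the One AI Skill is implemented.
// `resolveRelativeDate` and its three supported words are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;
const DAY_OFFSETS = { yesterday: -1, today: 0, tomorrow: 1 };

function resolveRelativeDate(expression, baseline) {
  const key = expression.trim().toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(DAY_OFFSETS, key)) {
    throw new Error(`unsupported expression: ${expression}`);
  }
  // Shift the baseline by whole days and emit an ISO 8601 string.
  const resolved = new Date(baseline.getTime() + DAY_OFFSETS[key] * DAY_MS);
  return resolved.toISOString();
}

// The same word resolves differently under different baselines.
console.log(resolveRelativeDate('tomorrow', new Date('2022-09-01T12:00:00Z')));
// 2022-09-02T12:00:00.000Z
console.log(resolveRelativeDate('tomorrow', new Date('2023-01-31T00:00:00Z')));
// 2023-02-01T00:00:00.000Z
```

An explicit baseline matters when processing older documents, where "tomorrow" should be read relative to the date the text was written rather than the date it is processed.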


  • The Names Skill detects when names that appear in text reference real-world entities (people, companies, locations, etc.) based on context.
  • `name` labels are enriched with entity data when known entities are recognized.

Figure: Detect names by context


  • Automatically identifies the dominant language used in the input, which helps with further processing and translation.

Analytics API (beta)

  • The analytics engine makes large amounts of text data digestible by clustering together texts with similar meaning and accumulating metadata generated by Language Skills (such as Sentiment & Topics).
  • The API accepts text items and organizes them in hierarchical clusters by meaning.
  • Generated clusters can be fetched from the API or reviewed in the Analytics section of the Studio (the UI is open-source).
  • Cluster collections can be queried with specific text items, fetching the cluster with the most similar meaning, making the analytics API an effective tool for intent-based classification and search.
  • Text items can be enriched with metadata generated by Language Skills. Clusters then aggregate item metadata to derive insights at scale.
  • Quickstart guide -
Figure: Reviews of Amazon Echo by subject, with aggregate Sentiment data (Analytics API TreeMap UI)
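The idea of querying a cluster collection by meaning and aggregating item metadata can be sketched in a few lines. This is a conceptual toy, not the Analytics API: the 3-dimensional vectors, the `centroid` and `sentiments` fields, and both function names are made up for illustration, and cosine similarity is assumed as the similarity measure.

```javascript
// Toy sketch: a real system would use learned text embeddings.
// All field names and vectors below are hypothetical.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the cluster whose centroid is most similar to the query vector.
function findClosestCluster(queryVector, clusters) {
  let best = null;
  let bestScore = -Infinity;
  for (const cluster of clusters) {
    const score = cosineSimilarity(queryVector, cluster.centroid);
    if (score > bestScore) {
      best = cluster;
      bestScore = score;
    }
  }
  return best;
}

// Aggregate per-item metadata (here: sentiment scores) at cluster level.
function averageSentiment(cluster) {
  const sum = cluster.sentiments.reduce((acc, s) => acc + s, 0);
  return sum / cluster.sentiments.length;
}

const clusters = [
  { label: 'sound quality', centroid: [0.9, 0.1, 0.0], sentiments: [1, 1, -1, 1] },
  { label: 'setup', centroid: [0.1, 0.9, 0.1], sentiments: [-1, -1, 1] },
];

const match = findClosestCluster([0.8, 0.2, 0.1], clusters);
console.log(match.label);             // sound quality
console.log(averageSentiment(match)); // 0.5
```

Matching a query to its nearest cluster is what makes the same structure usable for both search and intent classification: the cluster label acts as the predicted intent.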

Async API

  • The async API endpoints introduce asynchronous processing of large text inputs and audio files.
  • The upload request of the input and the response with the output are split into separate endpoints, so outputs can be retrieved without waiting for the entire processing time.
  • Use the `/async` endpoint to process raw text
  • Use the `/async/files` endpoint to process binary encoded files
  • Successful async requests return the `task_id` that was assigned to the input.
  • Task status and outputs can be fetched via polling from the `/async/tasks` endpoint.
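The submit-then-poll flow described above could look roughly like the sketch below. Only `task_id` and the `/async/tasks` path come from these notes; the `status` field, its `COMPLETED` / `FAILED` values, the `result` field, the `api-key` header, and the URL shape in `makeHttpFetchTask` are assumptions made for illustration and should be checked against the API reference. The transport is injected so the polling logic can be exercised without network access.

```javascript
// Poll a task until it finishes. `fetchTask` is any async function that
// takes a task id and resolves to the task's current state.
// ASSUMPTION: the state has a `status` field with 'COMPLETED' / 'FAILED'
// values and a `result` field; these names are not from the release notes.
async function pollTask(fetchTask, taskId, { intervalMs = 1000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const task = await fetchTask(taskId);
    if (task.status === 'COMPLETED') return task.result;
    if (task.status === 'FAILED') throw new Error(`task ${taskId} failed`);
    // Still running: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`task ${taskId} did not finish after ${maxAttempts} attempts`);
}

// HYPOTHETICAL HTTP transport (requires a runtime with global fetch).
// The base URL, the path layout under /async/tasks, and the header name
// are assumptions, not confirmed by the release notes.
function makeHttpFetchTask(baseUrl, apiKey) {
  return async (taskId) => {
    const response = await fetch(`${baseUrl}/async/tasks/${taskId}`, {
      headers: { 'api-key': apiKey },
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  };
}

// In-memory stand-in for the API: reports RUNNING twice, then COMPLETED.
function makeFakeFetchTask() {
  let calls = 0;
  return async (taskId) => {
    calls += 1;
    return calls < 3
      ? { status: 'RUNNING' }
      : { status: 'COMPLETED', result: { task_id: taskId, text: 'done' } };
  };
}

pollTask(makeFakeFetchTask(), 'demo-task', { intervalMs: 10 })
  .then((result) => console.log(result.text)); // done
```

Capping the number of attempts keeps a stuck task from blocking the caller indefinitely; for long audio files a larger interval is usually the better trade-off than a higher attempt count.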