License

Copyright 2012, 2013, 2014, 2015 transLectures-UPV Team / Machine Learning and Language Processing (MLLP) research group.

Licensed under the Apache License, Version 2.0.

The transLectures-UPV Platform (TLP) includes software developed at the Universitat Politècnica de València (UPV) by the MLLP research group as part of the transLectures EU project until TLP Version 1.0.1.

1. Introduction

The transLectures-UPV Platform (TLP) consists of a set of software tools for multilingual automatic subtitling of large video repositories, as well as the integration of these processes into existing workflows. It was developed up to Version 1.0.1 by the MLLP research group from the Universitat Politècnica de València (UPV) as part of the EU research project transLectures. After the end of transLectures, TLP is still being maintained by the MLLP research group, with the release of TLP versions 1.1, 1.2, 2.0 and 2.1.

2. Getting Started

In this section we give a brief overview of the transLectures-UPV Platform (TLP), describing the workflows involved when integrating Automatic Speech Recognition (ASR), Machine Translation (MT) and Text-To-Speech Synthesis (TTS) technologies into large media repositories with the aid of this platform.

2.1. TLP Overview

TLP is a self-contained piece of software that includes everything needed to integrate transcription, translation and speech synthesis technologies into large media repositories. Its main components are the Database, the Web Service, the Ingest Service and the Player, each of which is described in its corresponding section. TLP also offers several client tools, which are described in detail in the Client Tools Section.

The figure below shows the main components of TLP and a simplification of the interactions between them.

TLP Components

2.2. Use Cases

We have defined three use cases to illustrate the main ways a media repository and its users can interact with the transLectures-UPV Platform:

  1. A new recording from the media repository is uploaded to TLP for the generation of automatic subtitles and audio tracks.

  2. A user plays a media file with subtitles from the media repository’s website.

  3. A user corrects subtitle errors (transcription or translation).

Use Case 1: A new recording from the media repository is uploaded to TLP for the generation of automatic subtitles and audio tracks

Use Case 1

A lecturer/speaker records a new lecture/media in a recording studio, in a classroom, or during a conference. To get this video transcribed and translated into several languages, a Media Package File (MPF), made up of the recorded media file plus metadata, is created and sent to the TLP Web Service via the /ingest interface. The TLP Ingest Service unpacks the MPF and launches the required transcription, translation and/or speech synthesis processes. During this stage, the client (the remote media repository) can check the progress of the upload at any time using the /status endpoint of the Web Service. Finally, the Ingest Service creates a new media record in the Database and stores all media, subtitles, and synthesized audiotrack files.
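The progress check described above can be sketched as a small polling loop. This is only an illustration under stated assumptions, not code shipped with TLP: `get_status` stands for whatever function the client uses to call the /status endpoint, and the status strings "finished" and "failed" are hypothetical placeholders (the actual values are defined in the Web Service API documentation).

```python
import time
from typing import Callable

# Hypothetical terminal states; the real /status values may differ.
TERMINAL_STATES = ("finished", "failed")


def wait_for_upload(
    get_status: Callable[[str], str],
    upload_id: str,
    interval: float = 60.0,
    max_checks: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the upload status until it reaches a terminal state.

    get_status is supplied by the caller and is expected to wrap a call
    to the /status endpoint for the given upload ID.
    """
    for attempt in range(max_checks):
        status = get_status(upload_id)
        if status in TERMINAL_STATES:
            return status
        # Do not sleep after the last check.
        if attempt < max_checks - 1:
            sleep(interval)
    return "timeout"
```

The default interval of one minute is an arbitrary choice that matches how often the Ingest Service typically looks for new work; polling faster is unlikely to return fresher information.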

Use Case 2: A user plays a media file with subtitles from the media repository’s website

Use Case 2

A user browses the media repository’s catalogue and selects the media he or she wants to watch or listen to using the repository’s media player. The user can watch the selected media with subtitles in different languages, or even listen to it in another language using automatically synthesized audio tracks where available. To get the list of all available subtitle languages, the repository’s media player sends a request to the /langs interface of the TLP Web Service and displays the available languages to the user. When the user selects the desired subtitle language, the repository’s media player calls the /subs endpoint to download the corresponding subtitle file in the required format (srt, vtt, dfxp, etc.), which is immediately processed and displayed in the media player. A similar procedure applies when the user requests a synthesized audio track, but in this case the media player uses the /audiotrack interface instead of /subs.
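The /langs and /subs sequence described above can be sketched as follows. This is a minimal sketch under assumptions: `call` is a caller-supplied function that performs the HTTP request, the parameter names `id`, `lang` and `format` are hypothetical, and /langs is assumed to return something that supports membership tests on language codes. Check the API documentation for the real request and response formats.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# call(endpoint, params) is supplied by the caller and performs the
# actual HTTP request against the Web Service.
ApiCall = Callable[[str, dict], Any]


def load_subtitles(
    call: ApiCall,
    media_id: str,
    preferred_langs: Sequence[str],
    fmt: str = "srt",
) -> Tuple[Optional[str], Optional[str]]:
    """Return (language, subtitles) for the first preferred language
    that is available, or (None, None) if none of them is."""
    # Parameter names "id", "lang" and "format" are assumptions.
    available = call("/langs", {"id": media_id})
    for lang in preferred_langs:
        if lang in available:
            subs = call("/subs", {"id": media_id, "lang": lang, "format": fmt})
            return lang, subs
    return None, None
```

Passing the HTTP layer in as a function keeps the sketch independent of any particular client library and makes the flow easy to exercise with a stub.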

Use Case 3: A user corrects subtitle errors (transcription or translation)

Use Case 3

A user, while playing a media file with subtitles (as shown above in use case 2), notices that the displayed subtitles contain some errors and decides to correct them. To do this, the user presses an Edit Subtitles button (or similar) shown by the repository’s media player and is redirected to the TLP Player, which offers an ergonomic and efficient interface for subtitle editing. The TLP Player loads the main media file and the subtitles file by calling the /metadata and /subs interfaces of the Web Service, respectively. Any corrections made by the user are sent back to the Web Service via the /mod interface and appended to the original DFXP file. The updated DFXP file is committed to the Database, after which automatic translations and synthesized audio tracks are re-generated using the user corrections.

3. Database

The TLP Database is a SQL-based relational database which stores all the data required for the Web Service and the Ingest Service. The main categories of data stored in the Database are the following:

  • Media/Lecture: All the information related to a specific media/lecture is stored in the database, including language, duration, title, keywords and category. An external ID, provided by the client repository, is used to identify the media object in all transactions performed between the client and the Web Service API.

  • Speakers: Information about the speaker/lecturer can be used by the ASR system to adapt the underlying models to the unique characteristics of the given speaker and, therefore, improve the quality of the resulting subtitles.

  • Subtitles: All subtitles automatically generated by the Ingest Service are stored in the database in DFXP format and retrieved by the client via the Web Service.

  • Audiotracks: As in the case of subtitles, automatically synthesized audio tracks from translated subtitles are also stored in the database.

  • Uploads: Every time an /ingest operation is performed, a new upload entry is stored in the database to track its progress.

4. Web Service

The TLP Web Service is the API for exchanging information and data between the client’s media repository and the transLectures-UPV Platform. It also enables the subtitle display and editing capabilities of the TLP Player. The Web Service defines a wide set of HTTP API interfaces to allow for the full integration between TLP and the remote media repository:

Interfaces for media upload and management:
/ingest

Upload media (audio/video) files and any attachments and metadata to the TLP Server for automatic multilingual subtitling and speech synthesis.

/uploadslist

Get a list of all the user’s uploads.

/status

Check the current status of a specific upload ID.

/systems

Get a list of all available Speech Recognition, Machine Translation, and Text-To-Speech Systems that can be applied to transcribe, translate, and synthesize a media file.

Interfaces for downloading media and subtitle files:
/metadata

Get metadata and media file locations for a given media ID.

/langs

Get a list of all subtitle and audiotrack languages available for a given media ID.

/subs

Download the current subtitle file for a given media ID and language.

/audiotrack

Download an audiotrack file for a given media ID and language.

Interfaces for editing subtitles:
/start_session

Starts an edit session to send and commit modifications to a subtitles file.

/session_status

Returns the current status of the given session ID.

/mod

Send and commit subtitle corrections under an edit session.

/end_session

Ends an open edit session. Depending on the confidence level of the user, edits are either stored directly in the corresponding subtitles files or left pending revision.

Interfaces for managing subtitles' user revisions:
/lock_subs

Allow/disallow regular users to send subtitle modifications for a specific media ID.

/edit_history

Returns a list of all edit sessions that involved a specific media ID.

/revisions

Returns a list of all edit sessions pending revision across all of the API user’s media files.

/mark_revised

Mark/unmark a specific edit session ID as revised, typically from another session ID on the TLP Player.

/accept

Accept modifications of one or more pending edit sessions without having to revise them. Modifications are committed into the corresponding subtitles files.

/reject

Reject modifications of one or more pending edit sessions without having to revise them.

A detailed description of the Web Service API can be found in this Appendix. In addition, TLP offers several tools to interact with this API; you will find more information about them in the Client Tools Section.
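As a quick orientation, the endpoints listed above can be collected into a small lookup table with a URL builder. This is a hedged sketch rather than one of the client tools: the base URL `http://myserver.com/api` follows the installation example in this documentation, and the query parameter `id` used in the usage note is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint names as listed above, grouped the same way.
ENDPOINTS = {
    "upload": ("/ingest", "/uploadslist", "/status", "/systems"),
    "download": ("/metadata", "/langs", "/subs", "/audiotrack"),
    "editing": ("/start_session", "/session_status", "/mod", "/end_session"),
    "revisions": ("/lock_subs", "/edit_history", "/revisions",
                  "/mark_revised", "/accept", "/reject"),
}
KNOWN = {name for group in ENDPOINTS.values() for name in group}


def endpoint_url(base_url: str, endpoint: str, **params: str) -> str:
    """Build the URL of a Web Service endpoint, rejecting unknown names."""
    if endpoint not in KNOWN:
        raise ValueError(f"unknown endpoint: {endpoint}")
    url = base_url.rstrip("/") + endpoint
    if params:
        # Query parameter names are chosen by the caller; they are not
        # validated against the real API here.
        url += "?" + urlencode(params)
    return url
```

For instance, `endpoint_url("http://myserver.com/api", "/status", id="up-1")` builds a /status URL; rejecting unknown endpoint names catches typos before any request is sent.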

4.1. API User Authentication

The TLP Web Service comes with a custom API user authentication system based on authentication tokens. Every API call must include a valid authentication token in order to authenticate the API user. TLP offers two different authentication methods:

  • Secret Key: An API user authentication token, associated with the user account, is provided to the Web Service. This token is valid for user authentication on all API interfaces. This is the recommended authentication method for direct client-to-server API calls.

  • Request Key: A lifetime-limited, request-dependent authentication token is provided to the Web Service. This token is valid for user authentication only on a reduced set of API interfaces and for a limited period of time. This authentication method should be used whenever the secret key would otherwise be exposed to third parties, for instance when a user belonging to the API client organisation is using the TLP Player to edit a subtitle file (in this case, the authentication token is sent via URL parameters).

The figure above shows a typical integration scenario between TLP and the remote media repository, in which the Secret Key authentication method is used for all direct API calls between both parts, whilst the alternative Request Key method is used to generate TLP Player URLs that will be followed by the repository’s users to review media subtitles. In this latter case, the Request Key is the authentication token used in all API calls between the Player and the Web Service.
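To illustrate the general idea of a lifetime-limited, request-dependent token, the sketch below derives a key from a secret, a media ID and an expiry time using HMAC. This is NOT the scheme used by TLP: the function names, the choice of HMAC-SHA256 and the fields that go into the token are all assumptions made for illustration. The actual Request Key construction is described in the Preface of the Web Service’s API Documentation.

```python
import hashlib
import hmac
import time
from typing import Optional


def make_request_key(secret_key: str, media_id: str, expires: int) -> str:
    """Derive a token bound to one media ID and one expiry timestamp.

    Illustrative only: the real TLP Request Key is built differently.
    """
    message = f"{media_id}:{expires}".encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_request_key(
    secret_key: str,
    media_id: str,
    expires: int,
    token: str,
    now: Optional[float] = None,
) -> bool:
    """Accept the token only before expiry and only for the same request."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = make_request_key(secret_key, media_id, expires)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, token)
```

The point of such a construction is that the token can travel in a URL without revealing the secret, and that it stops working once the expiry time has passed or the request changes.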

For further information and technical details, please refer to the Preface of the Web Service’s API Documentation.

5. Player

The TLP Player is an HTML5 media player which allows users to review and modify media subtitles with ease. The Player provides a highly ergonomic editing interface, optimized to reduce user effort.

The TLP Player can be called externally using a valid URL. For further technical information, please refer to the Calling the TLP Player Annex.

5.1. User Guide

The TLP Player will automatically load the media file and subtitles. Manually edited subtitle segments will be shown in green, while automatic subtitle segments will appear in black.

Player Buttons:
  • TLP Player Previous Button : Jump to the beginning of the previous subtitle segment (Up arrow)

  • TLP Player Jump Backwards Button : Seek video -1.5 seconds (Alt + Left arrow)

  • TLP Player Play/Pause Button : Play/Pause video (Tab)

  • TLP Player Jump Forward Button : Seek video +1.5 seconds (Alt + Right arrow)

  • TLP Player Next Button : Jump to the beginning of the next subtitle segment (Down arrow)

  • TLP Player Help Button : Reveals a Help layer with descriptions and keyboard shortcuts.

  • TLP Player Save Button : Saves both Reference and Editing subtitle changes, if any.

  • TLP Player Closed Captions Button : Allows the user to select the Reference and Editing languages. The Editing language must be selected first, and it is the language the user wishes to edit. The Reference language can be optionally displayed to help the user in the translation process. In this mode, both Reference and Editing subtitles can be edited simultaneously.

  • TLP Player Layout Button : Allows the user to select different editing layout modes.

  • TLP Player Options Button : Shows different options such as download/import subtitle file or enable/disable the Advanced mode.

Transcription shortcuts:
  • Enter (or click): Edit/Confirm the current segment.

  • Shift + Tab: Replay the current segment from the beginning.

Transcription shortcuts (only Advanced mode):
  • Ctrl + Enter: Create a new segment starting on current media time.

  • Ctrl + S: Split the current segment.

  • Ctrl + Backspace: Join the current and the previous segments.

6. Ingest Service

The Ingest Service is the service devoted to handling and processing Media Package Files (MPF) uploaded via the /ingest interface of the Web Service. It checks periodically (typically every minute) whether new MPFs have been uploaded in order to start processing them, and whether ongoing uploads are progressing correctly or have failed. The uploads table of the Database is used to keep track of the status of every upload.

Media Package File specifications can be found in this Appendix.

The figure above shows the internal structure of the Ingest Service, which is split into two layers:

Upper Layer

The Upper Layer implements the main logic of the Ingest Service using a modular design. It has a central node, the Core, which implements the logic of all possible workflows that can be followed by an MPF, leaving data processing tasks to external modules. This means that the functionalities of the Ingest Service can be easily modified, replaced or extended by swapping these external modules with others, e.g., other Automatic Speech Recognition (ASR) and Machine Translation (MT) modules.

External modules can be divided into two categories:

  • Base Modules: Modules that implement APIs for basic operations used by the Core.

    • URL Downloader: Module that allows media files to be downloaded from a given URL address. It also offers the possibility of downloading obfuscated URLs such as YouTube or Vimeo using external plug-ins, called URL decoders.

    • Media Module: Module that offers several methods of media format conversion.

    • Mailer Module: Module with routines used to send e-mail notifications regarding upload status updates.

  • transLectures Modules: Modules that integrate transcription, translation and speech synthesis technologies into the Ingest Service.

    • ASR Modules: Automatic Speech Recognition Modules, used to generate transcription subtitle files.

    • MT Modules: Machine Translation Modules, used to generate translated subtitle files.

    • TTS Modules: Text-To-Speech Modules, used to generate synthesized audiotracks in a specific language.

    • Text Retrieval Module: Extracts plain text information from the different file resources included in the MPF. It also downloads related text documents from the web. This text data can be used by ASR Modules to enhance transcription quality by adapting the underlying ASR System to the topic of the media file.
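The modular design described above, in which the Core drives the workflow while swappable modules do the data processing, can be sketched as follows. All class and method names here are hypothetical and chosen for illustration; they do not reflect the actual module API of the Ingest Service.

```python
from typing import Dict, Optional, Protocol


class ASRModule(Protocol):
    """Anything with a transcribe() method can act as an ASR module."""

    def transcribe(self, media_path: str) -> str: ...


class Core:
    """Holds workflow logic only; processing is delegated to modules."""

    def __init__(self) -> None:
        self._asr: Dict[str, ASRModule] = {}

    def register_asr(self, language: str, module: ASRModule) -> None:
        # Registering again for the same language swaps the module.
        self._asr[language] = module

    def transcribe(self, media_path: str, language: str) -> Optional[str]:
        module = self._asr.get(language)
        if module is None:
            # No suitable module for this language: nothing is produced.
            return None
        return module.transcribe(media_path)


class DummyASR:
    """Stand-in module used only to exercise the sketch."""

    def __init__(self, tag: str) -> None:
        self.tag = tag

    def transcribe(self, media_path: str) -> str:
        return f"[{self.tag}] transcript of {media_path}"
```

Because the Core only depends on the `transcribe` method, replacing one ASR engine with another amounts to registering a different object for the same language.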

Lower layer

The Lower Layer satisfies all local installation dependencies related to data storage and job scheduling. It is split into two parallel sublayers:

  • Scheduler layer: Implements an API for launching and scheduling the transcription and translation processes, typically in a Grid Engine/Job Management System.

  • Storage layer: Implements an API that allows access to the data stored in the Database and in the TLP Server’s hard drive.

6.1. Uploads Workflow

In this section we explain the different steps an upload can follow from the moment it is ingested into the transLectures-UPV Platform until its processing finishes.

First we must distinguish between four types of operations:

  • New Media: This operation is requested when a newly-recorded, non-existing media is uploaded to TLP for the first time. In this operation, a new Media object is created in the Database.

  • Update Media: This operation is requested when updates are applied to an existing media. For instance, new text resources such as slides might be added to the Media Package File (MPF) to improve the automatic transcription and translations of the existing media, or the existing media file might be replaced with a re-recording.

  • Delete Media: This operation is requested when a media is deleted from the remote repository.

  • Cancel Upload: This operation is requested to cancel an ongoing upload for whatever reason.

Depending on the type of operation and the input data, the steps an upload follows in the Ingest Service may vary. The figure below illustrates the standard Ingest Service workflow:

Media Package Files are uploaded to the transLectures-UPV Platform via the Web Service's /ingest interface and stored in the Database. The Ingest Service reads the uploads table of the database and starts processing the uploaded MPF. An upload typically goes through the following sequential steps, although some of them might be skipped depending on the input data:

  1. Media Package Processing: The MPF is processed for the first time, performing several security, data integrity and data format checks, and, if all checks are correct, the upload status moves to the next processing step.

  2. Transcription Generation: In this step, a transcription file in DFXP format is generated from the main media file (video, audio) using an Automatic Speech Recognition (ASR) Module.

    This step is skipped in the following cases:

    1. The Ingest Service does not feature a suitable ASR Module for the source language of the main media file.

    2. Subtitles in the source language were provided in the MPF.

    3. The client has explicitly not requested this step.

    4. In update operations that do not involve re-transcribing the lecture.

    5. In delete or cancel operations.

  3. Translation(s) Generation: In this step, one or more translation files in DFXP format are generated from a transcription file (either automatically generated in the previous step, or provided in the MPF), using the appropriate Machine Translation Modules.

    This step is skipped in the following cases:

    1. The Ingest Service does not have suitable MT Modules for the source language of the main media file.

    2. Subtitles in all requested translation languages offered by the Ingest Service are already provided in the MPF.

    3. The client has explicitly not requested this step.

    4. In update operations that do not involve re-translating the lecture.

    5. In delete or cancel operations.

  4. Text-To-Speech Track Generation: In this step one or more synthesized audiotrack files are generated from a translation file (either automatically generated in the previous step or provided in the MPF), using the appropriate Text-To-Speech Modules. This step is skipped in the following cases:

    1. The Ingest Service does not have suitable TTS Modules for the target language of any translation files.

    2. Audiotracks in all requested languages offered by the Ingest Service are already provided in the MPF.

    3. The client has explicitly not requested this step.

    4. In update operations that do not involve re-translating the lecture.

    5. In delete or cancel operations.

  5. Media Conversion: In this step, the main media file is converted into the media formats required by the TLP Player in order to maximize browser compatibility. This step is skipped in the following cases:

    1. All required media files were attached in the MPF.

    2. In update operations where the main media file has not changed.

    3. In delete or cancel operations.

  6. Store Data: This is the final step. For New Media and Update Media operations, the data contained in the MPF and the data automatically generated by the Ingest Service are stored in the Database. For Delete Media operations, all previously stored media files and data are deleted.
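The skip rules listed for steps 2 to 5 can be condensed into a single decision function. The sketch below is a simplified reading of those rules, not the Core's actual implementation: the `Upload` fields and step names are hypothetical, and steps 1 and 6 are left out because the text does not attach skip conditions to them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Upload:
    operation: str = "new"            # "new", "update", "delete" or "cancel"
    # Transcription
    asr_module_available: bool = True
    source_subs_in_mpf: bool = False
    asr_requested: bool = True
    retranscribe: bool = True         # only meaningful for "update"
    # Translation
    mt_modules_available: bool = True
    all_translations_in_mpf: bool = False
    mt_requested: bool = True
    retranslate: bool = True          # only meaningful for "update"
    # Speech synthesis
    tts_modules_available: bool = True
    all_audiotracks_in_mpf: bool = False
    tts_requested: bool = True
    # Media conversion
    all_media_in_mpf: bool = False
    media_changed: bool = True        # only meaningful for "update"


def optional_steps(u: Upload) -> List[str]:
    """Return which of steps 2-5 would run for this upload."""
    if u.operation in ("delete", "cancel"):
        return []
    update = u.operation == "update"
    steps = []
    if (u.asr_module_available and not u.source_subs_in_mpf
            and u.asr_requested and not (update and not u.retranscribe)):
        steps.append("transcription")
    if (u.mt_modules_available and not u.all_translations_in_mpf
            and u.mt_requested and not (update and not u.retranslate)):
        steps.append("translation")
    # As in the list above, TTS is skipped on updates without re-translation.
    if (u.tts_modules_available and not u.all_audiotracks_in_mpf
            and u.tts_requested and not (update and not u.retranslate)):
        steps.append("text_to_speech")
    if not u.all_media_in_mpf and not (update and not u.media_changed):
        steps.append("media_conversion")
    return steps
```

For example, an `Upload` with `source_subs_in_mpf=True` skips transcription but still goes through translation, speech synthesis and media conversion.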

In every execution of the Ingest Service, the Core reviews which uploads are being processed, checking whether the related processes are:

  • Queued: Processes are queued when they are waiting to be executed. No action is performed.

  • Running: Processes are being executed. No action is performed.

  • Finished: All processes finished successfully. The Core changes the upload status to the next processing step.

  • Failed: Some processes failed. The Core changes the upload status to an error state.
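The review performed by the Core on each execution can be sketched as a small state-transition function. This is an illustrative sketch only: the step identifiers and the "error" and "completed" status names are hypothetical, and the sketch ignores the fact that some steps may be skipped.

```python
from typing import Iterable

# The six processing steps, in order (identifiers are illustrative).
STEPS = (
    "media_package_processing",
    "transcription_generation",
    "translation_generation",
    "tts_track_generation",
    "media_conversion",
    "store_data",
)


def review_upload(current_step: str, process_states: Iterable[str]) -> str:
    """Return the upload status after reviewing its processes.

    Each process state is one of "queued", "running", "finished"
    or "failed".
    """
    states = list(process_states)
    if "failed" in states:
        # Some process failed: the upload moves to an error state.
        return "error"
    if states and all(state == "finished" for state in states):
        index = STEPS.index(current_step)
        # Advance to the next step, or mark the upload as completed.
        return STEPS[index + 1] if index + 1 < len(STEPS) else "completed"
    # Queued or running processes: no action is performed.
    return current_step
```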

Detailed information about the Ingest Service workflows and behaviour can be found in this Appendix.

6.2. User Quota

Each TLP user / API client account has an upload quota. This quota represents the remaining number of videos and amount of media time that the user can upload. When a new media file is uploaded, the Ingest Service checks whether the client has enough quota to process it, and updates the user’s quota after processing the media file (the length of the uploaded media is subtracted from the total remaining time). Automatic re-transcriptions and re-translations do not decrease the user’s quota.

Tip: The Ingest Service features a Test Mode that allows the client to perform integration tests without consuming quota while obtaining fast responses. For more information, please refer to the Manifest JSON File Specification.
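The quota rules above can be sketched with a small data class. This is a hedged illustration, not the Ingest Service's accounting code: the field and method names are hypothetical, and decrementing the video counter by one per processed upload is an assumption (the text only states explicitly that the media length is subtracted from the remaining time).

```python
from dataclasses import dataclass


@dataclass
class Quota:
    videos_left: int
    seconds_left: float

    def allows(self, duration: float) -> bool:
        """Check whether a media file of the given length fits the quota."""
        return self.videos_left >= 1 and self.seconds_left >= duration

    def charge(
        self,
        duration: float,
        test_mode: bool = False,
        reprocessing: bool = False,
    ) -> None:
        """Update the quota after a media file has been processed."""
        # Test Mode uploads and automatic re-transcriptions or
        # re-translations do not consume quota.
        if test_mode or reprocessing:
            return
        if not self.allows(duration):
            raise ValueError("not enough quota")
        self.videos_left -= 1          # assumption: one video per upload
        self.seconds_left -= duration
```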

7. Client tools

The transLectures-UPV Platform offers several libraries and command-line utilities in order to facilitate the client’s interaction with the Web Service API and the TLP Player. These tools are located under the misc/client-tools folder.

Command-line utilities:
Libraries:

Appendices

Appendix A: Installation

In this section we provide installation instructions to properly set up TLP.

Although TLP is distributed by design, that is, each of its components can in theory be installed on a different machine, for the sake of simplicity we recommend installing them all on a single machine to create what we have called a TLP Server.

We have tested a TLP installation on Ubuntu 14.04 LTS Desktop and Server versions. It has not been tested on other versions or distributions. All installation notes contained in this documentation are based on Debian/Ubuntu-based distributions and may not, therefore, be applicable to other distributions.

Requirements

What follows are the minimum and recommended hardware requirements for installing TLP on a single machine.

Minimum hardware requirements

  • Intel Core i5 processor.

  • 4 GB RAM.

  • 500 GB of free hard disk space.

  • Linux-based operating system.

Recommended hardware requirements (TLP + in-house grid engine)

  • High-end Intel Core i7 processor.

  • 2 x high-end GPUs with Nvidia CUDA support.

  • 128 GB RAM.

  • 2 TB of free hard disk space.

  • Ubuntu Server 14.04 LTS operating system.

Manual Installation

First, we recommend the creation of a TLP user on your machine for running all tasks and processes related to TLP workflows. In our installation examples, we will assume that a tluser user and tlgroup group have been created, as well as a home directory for this user (/home/tluser) where all media, subtitles and uploaded files will be stored. In Debian/Ubuntu systems this can be done as follows:

sudo useradd -d /home/tluser -m -s /bin/bash tluser
sudo groupadd tlgroup
sudo usermod -a -G tlgroup tluser

Manual installation and configuration guides for each TLP component:

Database

Requirements

  • PostgreSQL server and client (version 9.1 or above).

Installation Steps

The following steps take you through the process of creating a transLectures Repository, including the creation of a system user, the Database and the required directory structure.

  1. Install the PostgreSQL server and client packages. On Ubuntu 14.04 LTS, this is easy to do using the following command line:

    sudo apt-get install postgresql
  2. Create a new database user, the transLectures user (tluser), which will be used when connecting to the TLP Database. Remember to set a password for this user (e.g. tlpass). Note that, in order to execute the following commands, we must be operating as the default database superuser, postgres.

    sudo -u postgres createuser -s tluser
    sudo -u postgres psql -c "ALTER USER tluser WITH PASSWORD 'tlpass'"
  3. Create a new database with the name tldb, and insert the Database schema and static data (located at db/sql/schema.sql and db/sql/static_data.sql, respectively). Note that, in order to execute the following commands, we must be operating as the transLectures user (tluser, see Manual Installation).

    sudo -u tluser createdb tldb
    sudo -u tluser psql -f "db/sql/schema.sql" tldb
    sudo -u tluser psql -f "db/sql/static_data.sql" tldb
  4. Create Media, Transcriptions and Uploads root directories, and set proper directory permissions. Create these directories in the system user’s home directory.

    sudo mkdir -p /home/tluser/tlp-repo/media
    sudo mkdir -p /home/tluser/tlp-repo/trans
    sudo mkdir -p /home/tluser/tlp-repo/uploads
    sudo chown tluser:tlgroup /home/tluser/tlp-repo/*
    sudo chmod 775 /home/tluser/tlp-repo/*
    sudo chmod g+s /home/tluser/tlp-repo/*
  5. Insert a new row in the machines table and add a machine named localhost, with IP 127.0.0.1 and ID 0:

    sudo -u tluser psql tldb -c "INSERT INTO machines (id, hostname, ip) VALUES (0, 'localhost', '127.0.0.1');"
  6. Insert the mount point for each of the three directories mentioned above for the machine ID 0 created in step 5.

    sudo -u tluser psql tldb -c "
     INSERT INTO mount_points (name, machine_id, path) VALUES
         ('media', 0, '/home/tluser/tlp-repo/media'),
         ('transcriptions', 0, '/home/tluser/tlp-repo/trans'),
         ('uploads', 0, '/home/tluser/tlp-repo/uploads');"
  7. (optional) Check the connection to the Database using the following command line:

    sudo -u tluser psql tldb

Web Service

Requirements

Installation Steps

  1. Install the required software dependencies. On Ubuntu 14.04 LTS, this is easy to do using the following command line:

    sudo apt-get install apache2 apache2-utils libapache2-mod-wsgi python-psycopg2 python-paste
  2. Copy the web-service and lib directories into the desired installation directory. In our example we use /home/tluser/tlp.

    sudo mkdir -p /home/tluser/tlp
    sudo cp -r web-service lib /home/tluser/tlp/
  3. Configure your Apache sites-enabled file so that a relative address of your HTTP Server points to the Web Service's WSGI script (/home/tluser/tlp/web-service/ws.py). Here we configure it so that the Web Service is accessible from the relative path /api (e.g. http://myserver.com/api). In Ubuntu 14.04 LTS, the Apache configuration file is located at /etc/apache2/sites-enabled/000-default.conf. You have to add the following lines to your <VirtualHost> directive(s):

    <VirtualHost *:80>
     ...
     WSGIScriptAlias /api /home/tluser/tlp/web-service/ws.py
     <Directory /home/tluser/tlp/web-service>
       Require all granted
     </Directory>
     ...
    </VirtualHost>
  4. Add www-data user to the tlgroup group.

    sudo usermod -a -G tlgroup www-data
  5. Restart Apache server. On Ubuntu 14.04 LTS:

    sudo service apache2 restart

Configuration

The Web Service comes with a configuration file that indicates its root directory and database connection parameters, among other information. This configuration file must be given the name config.ini and be located in the Web Service installation dir (i.e. /home/tluser/tlp/web-service/config.ini). The specification of the configuration file is as follows:

[authentication]
key_generator = <string>

Web Service’s Secret Key generator.

[storage]
db_name = <string>

Database name.

db_user = <string>

Database user name.

db_host = <string>

Database hostname or IP address.

db_passwd = <string>

Database user password. You can leave this field empty if Database SSL Auth is enabled.

[media_urls]
use_urls = <boolean>

Return URLs instead of absolute file paths when returning URIs of media files.

base_url = <string>

URL prefix used to create full media URLs.

[mailing]
enabled = <boolean>

Send e-mail alerts whenever the Web Service fails for whatever reason.

smtp_server = <string>

SMTP server to send e-mails.

from_address = <string>

From address.

to_address = <string>

Comma-separated recipient e-mails.

[misc]
html_msg = <string>

HTML code to be returned by the Web Service when a non-existent API endpoint is accessed.

Below is a real example of a Web Service configuration file.

Example of a Web Service’s configuration file
[authentication]
key_generator = 12345

[storage]
db_name = tldb
db_user = tluser
db_host = my-tlp-server.com
# You may leave this empty if SSL auth is enabled:
db_passwd =

[media_urls]
use_urls = yes
base_url = http://my-tlp-server.com/data

[mailing]
enabled = yes
smtp_server = smtp.my-tlp.server.com
from_address = noreply@my-tlp-server.com
to_address = admin@my-tlp-server.com

[misc]
html_msg = <html><head><meta charset="UTF-8"></head><body><p>Hi there!</p></body></html>
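A deployment script could sanity-check such a file before restarting Apache. The sketch below parses a subset of the example above with Python's standard configparser module. Whether the Web Service itself reads config.ini this way is an assumption, and `load_ws_config` is a hypothetical helper, not part of TLP.

```python
import configparser

SAMPLE = """\
[storage]
db_name = tldb
db_user = tluser
db_host = my-tlp-server.com
# You may leave this empty if SSL auth is enabled:
db_passwd =

[media_urls]
use_urls = yes
base_url = http://my-tlp-server.com/data

[mailing]
enabled = yes
smtp_server = smtp.my-tlp.server.com
from_address = noreply@my-tlp-server.com
to_address = admin@my-tlp-server.com
"""


def load_ws_config(text: str) -> dict:
    """Parse the configuration text and return a few typed settings."""
    # Interpolation is disabled so that '%' in values is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # to_address holds comma-separated recipient e-mails.
    recipients = [
        address.strip()
        for address in parser.get("mailing", "to_address").split(",")
        if address.strip()
    ]
    return {
        "db_name": parser.get("storage", "db_name"),
        "db_passwd": parser.get("storage", "db_passwd"),
        "use_urls": parser.getboolean("media_urls", "use_urls"),
        "base_url": parser.get("media_urls", "base_url"),
        "mail_enabled": parser.getboolean("mailing", "enabled"),
        "recipients": recipients,
    }
```

A missing section or option raises a configparser error, which is usually what a pre-deployment check wants.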

Player

Requirements

Installation steps

  1. Install the required external software dependencies. In our installation example, we use the open source Apache HTTP Server. On Ubuntu 14.04 LTS, this is easy to do using the following command line:

    sudo apt-get install apache2 php5 libapache2-mod-php5 php5-curl
  2. Move all of the files inside the player folder of the TLP package into any folder of your HTTP Server. In our example we will use the directory /home/tluser/tlp/player:

    sudo mkdir -p /home/tluser/tlp
    sudo cp -r player /home/tluser/tlp/

    You will need to add the following lines to the <VirtualHost> directive(s) of the Apache sites-enabled file (located at /etc/apache2/sites-enabled/000-default.conf).

    <VirtualHost>
     ...
       Alias /player /home/tluser/tlp/player
        <Directory /home/tluser/tlp/player>
          Options Indexes FollowSymLinks MultiViews
          Require all granted
        </Directory>
     ...
    </VirtualHost>
  3. Set ownership of the player/translectures/config.json file to the HTTP Server user (in Apache it is www-data), and disable its read and write permissions for groups and others:

    sudo chown www-data /home/tluser/tlp/player/translectures/config.json
    sudo chmod 600 /home/tluser/tlp/player/translectures/config.json
  4. (Optional) Create a symbolic link inside the player directory pointing to the media repository directory (/home/tluser/tlp-repo/media, see the section on installation of the Database), for instance player/data:

    sudo ln -s /home/tluser/tlp-repo/media /home/tluser/tlp/player/data

    Note that this symbolic link, when in the form of a URL (i.e. http://localhost/player/data), will become the tlbaseurl parameter when calling the Player (please see this Appendix).

  5. Restart your HTTP Server if necessary (in our example, Apache):

    sudo service apache2 restart
  6. (Optional) Check whether the Player can be accessed (note that it will show you an error message - this is normal):

    curl http://localhost/player/

Configuration

The configuration file player/translectures/config.json must be edited in order to grant the Player access to the Web Service.

JSON config file specification:
 {
  "ws_url" : <str> ,
  "data_url" : <str>
 }

Configuration parameters:

  • ws_url: Web Service URL.

  • data_url: Media storage data URL (to serve non-URL videos).

JSON config file example:
 {
  "ws_url" : "http://my.server.com/tl/",
  "data_url" : "http://my.server.com/player/data/"
 }
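Before deploying the Player, the edited file can be checked with a few lines of Python. This is a minimal sketch: `load_player_config` is a hypothetical helper and not part of TLP, and it only verifies that the two documented parameters are present and are strings.

```python
import json

REQUIRED_KEYS = ("ws_url", "data_url")


def load_player_config(text: str) -> dict:
    """Parse the Player's JSON configuration and check its two parameters."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    for key in REQUIRED_KEYS:
        if not isinstance(config[key], str):
            raise ValueError(f"{key} must be a string")
    return config
```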

Ingest Service

Requirements

Installation Steps

  1. Install the required software dependencies. On Ubuntu 14.04 LTS, this is easy to do using the following command line:

    sudo apt-get install python-psycopg2 zip

    The tldextract python library is not available in the official Ubuntu repositories. However, it can be easily installed via pip:

    sudo apt-get install python-pip
    sudo pip install tldextract

    The FFmpeg package stored in the Ubuntu repositories does not include H.264 codec support, so you will have to download all sources and compile them by yourself. You will find a useful guide for said compilation in Debian/Ubuntu distributions here.

    You can also set up a job scheduling/queue management system, required in order to execute and manage transcription, translation and media conversion processes. In our case we have tested TLP with the open source version of the Sun Grid Engine.

  2. Copy the ingest-service directory into the desired installation directory. In our example, we use /home/tluser/tlp.

    sudo mkdir -p /home/tluser/tlp
    sudo cp -r ingest-service /home/tluser/tlp/
  3. Specify the root directory where the ASR, MT and TTS modules will be placed in the systems mount point of the Database. Assuming that we will put all these modules under /home/tluser/tlp/ingest-service/modules/tl:

    sudo -u tluser psql tldb -c "
     INSERT INTO mount_points (name, machine_id, path) VALUES
         ('systems', 0, '/home/tluser/tlp/ingest-service/modules/tl');"

Configuration

The TLP Ingest Service comes with a configuration file in which several parameters and options are defined. The Ingest Service's Core will attempt to load a file named config.ini located in the same directory (i.e. /home/tluser/tlp/ingest-service/config.ini). If the configuration file does not exist or cannot be parsed, the execution will fail. You can manually specify another path to the configuration file using the option --config-file (see Execution).

The specification of the configuration file is as follows:

[general]

In this section, general settings can be configured.

hostname = <string>

Host name of the machine that will run the Ingest Service.

tl_user = <string>

System user that will run the Ingest Service, in order to set ownership of all stored files.

tl_group = <string>

System group to set ownership of all stored files.

rm_finished_up_days = <int>

Automatically delete temporary data from finished uploads after n days.

rm_error_up_days = <int>

Automatically delete temporary data from failed uploads after n days.

local_repository = <boolean>

Store all media files uploaded to TLP instead of accessing them via URL (when provided).

[storage]

In this section, TLP Database connection settings can be customised.

db_name = <string>

Database name to connect with.

db_user = <string>

Database user name.

db_passwd = <string>

Database user password. You can leave this field empty if database SSL auth is enabled.

db_host = <string>

Database hostname or IP address.

[scheduler]

In this section, settings relating to the job management system are defined.

localhost = <string>

Local machine hostname for the job management system. It is used to launch media conversion processes on the local machine, as these tasks are very network-intensive.

status_cmd = <string>

System call to get the status of all processes previously submitted to the job management system.

submit_scr = <string>

Path to the script or binary program to submit tasks to the job management system.

submit_opts = <string>

Options of the submit script (which will be appended to all submit calls).

job_name_prefix = <string>

Job name prefix for all tasks submitted to the job management system.

[sessions]
enabled = <boolean>

Makes the Ingest Service create edit sessions on update operations over the media ID that is being processed. As a result, users won’t be able to edit subtitles with the TLP Player until the update operation finishes.

author_id = <string>

Author ID of the Ingest Service.

author_name = <string>

Author Name of the Ingest Service.

author_conf = <int>

Confidence level of the Ingest Service, from 0 to 100.

[mailing]

In this section, you can customize several parameters of the Mailing module.

enabled = <boolean>

Enables or disables Mailing module.

smtp_server = <string>

SMTP Server hostname or IP address used to send e-mail notifications.

from_address = <string>

E-mail address that will be used as "From" address.

send_client_started_mail = <boolean>

Enables or disables e-mail notifications to the client to inform that an upload has started to be processed.

send_client_error_mail = <boolean>

Enables or disables e-mail notifications to the client to inform that an upload has failed.

send_client_finished_mail = <boolean>

Enables or disables e-mail notifications to the client to inform that an upload has successfully finished.

send_admin_started_mail = <boolean>

Enables or disables e-mail notifications to the system administrator to inform that an upload has started to be processed.

admin_address_started = <string>

E-mail address of the system administrator that will receive notifications about uploads that have started to be processed.

send_admin_error_mail = <boolean>

Enables or disables e-mail notifications to the system administrator to inform that an upload has failed.

admin_address_error = <string>

E-mail address of the system administrator that will receive notifications about uploads that have failed.

send_admin_finished_mail = <boolean>

Enables or disables e-mail notifications to the system administrator to inform that an upload has finished.

admin_address_finished = <string>

E-mail address of the system administrator that will receive notifications about uploads that have finished.

[file_formats]

In this section, the required/allowed file formats for every type of file that can be uploaded to the Ingest Service are defined. Please note that only file formats in the file_formats table of the TLP Database can be used.

max_audio_track_length = <int>

Defines the maximum length allowed, in seconds, of the audio track of the main media file.

generate_pcm_stream = <boolean>

Generate a PCM stream file to be retrieved by the TLP Player under the Advanced Mode.

required_video_formats = <string>

Defines which video formats are needed to maximise compatibility of the TLP Player with all browsers. If the uploaded media files are not in some of the required formats, then the Ingest Service will do the appropriate conversion.

allowed_video_formats = <string>

Comma-separated list of all video formats that will be allowed to be uploaded as main media.

allowed_audio_formats = <string>

Comma-separated list of all audio formats that will be allowed to be uploaded as main media.

allowed_slides_text_formats = <string>

Comma-separated list of all slides text formats that will be allowed to be uploaded.

allowed_slides_video_formats = <string>

Comma-separated list of all slides video formats (video-recorded slides) that will be allowed to be uploaded.

allowed_docs_formats = <string>

Comma-separated list of all document formats that will be allowed to be uploaded.

allowed_caption_formats = <string>

Comma-separated list of all subtitle formats that will be allowed to be uploaded.

allowed_thumbnail_formats = <string>

Comma-separated list of all thumbnail formats that will be allowed to be uploaded.

allowed_packages = <string>

Comma-separated list of all package formats that will be allowed to be uploaded.

[text_retrieval_module]

In this section, the location of the text retrieval module is defined.

module_path = <string>

Path to the text retrieval module. If the module is not available, leave the value empty.

[data]

In this section, paths to external data files are defined.

audio_background_img = <path>

Background image that will be used to encode videos for the TLP Player using an uploaded audio file as audio stream. If it is not provided, then the generated videos will show a black background.

test_thumbnail_img = <path>

Background image that will be used to encode a short video for the TLP Player when using the test mode of the Ingest Service.

[url_decoders]

In this section, URL decoders for non-public URLs can be registered to be used by the URL Downloader Module.

Example of an Ingest Service configuration file

[general]
hostname = my-tlp-server.com
tl_user = tluser
tl_group = users
rm_finished_up_days = 1
rm_error_up_days = 5
local_repository = no

[storage]
db_name = tldb
db_user = tluser
db_passwd =
db_host = my-tlp-server.com

[scheduler]
localhost = my-tlp-server
status_cmd = qstat -u tluser
submit_scr = qsubmit
submit_opts = ""
job_name_prefix = TLP-job

[sessions]
enabled = no
author_id = ingest-service
author_name = Ingest Service
author_conf = 100

[mailing]
enabled = yes
smtp_server = smtp.my-tlp.server.com
from_address = no-reply@my-tlp.server.com
send_client_started_mail = yes
send_client_error_mail = yes
send_client_finished_mail = yes
send_admin_started_mail = yes
admin_address_started = admin@my-tlp.server.com
send_admin_error_mail = yes
admin_address_error = admin@my-tlp.server.com
send_admin_finished_mail = yes
admin_address_finished = admin@my-tlp.server.com

[file_formats]
max_audio_track_length = 10800
generate_pcm_stream = yes
required_video_formats = mp4
allowed_video_formats = mp4, m4v, ogv, wmv, avi, mpg, flv, mov, 3gp, webm, mkv
allowed_audio_formats = wav, mp2, mp3, oga, flac, aac, ape, wma, m4a
allowed_slides_text_formats = txt, ppt, pptx, doc, docx, pdf
allowed_slides_video_formats = mp4, m4v, ogv, wmv, avi, mpg, flv
allowed_docs_formats = pdf, doc, docx, ppt, pptx, txt, html, xls, xlsx
allowed_caption_formats = dfxp, trs, srt
allowed_thumbnail_formats = jpg
allowed_packages = zip

[text_retrieval_module]
module_path = /path/to/my/text_retrieval_module.py

[data]
audio_background_img = /path/to/audio_background.png
test_thumbnail_img = /path/to/test_thumbnail.png

[url_decoders]
youtube = /path/to/url-decoder.youtube.py
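
To illustrate how the value types of the specification map onto this file, the following Python 3 sketch reads an excerpt of the example with the standard configparser module. It is an assumption that the Ingest Service parses its configuration in a comparable way; the helper split_list and the SAMPLE excerpt are introduced here only for illustration.

```python
import configparser

# Excerpt of the example configuration file shown above.
SAMPLE = """
[general]
local_repository = no

[storage]
db_name = tldb
db_passwd =

[file_formats]
max_audio_track_length = 10800
allowed_caption_formats = dfxp, trs, srt
"""


def split_list(value):
    """Split a comma-separated option into a list of trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# <boolean> options: getboolean() accepts yes/no, true/false, on/off and 1/0.
local_repository = parser.getboolean("general", "local_repository")
# <int> options.
max_length = parser.getint("file_formats", "max_audio_track_length")
# Comma-separated <string> lists.
caption_formats = split_list(parser.get("file_formats", "allowed_caption_formats"))
# An option left blank is read as an empty string.
db_passwd = parser.get("storage", "db_passwd")

if __name__ == "__main__":
    print(local_repository, max_length, caption_formats, repr(db_passwd))
```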

Execution

In order to run the Ingest Service, you have to execute the Ingest Service’s Core (ingest-service/core.py). All options of the Core python script are shown below.

Usage: core.py [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose power on!
  -d, --debug           Debug mode
  -C CONFIG_FILE, --config-file=CONFIG_FILE
                        Configuration file. Default: config.ini
  -D DB_NAME, --database=DB_NAME
                        Database with which to work. Default: specified in
                        config file

To run the Ingest Service:

python /home/tluser/tlp/ingest-service/core.py -v

However, you will probably want to schedule its execution periodically. On UNIX systems you can use crontab. For instance, to execute the Ingest Service every minute and append all output to a log file, put this line in tluser's crontab file:

*/1 * * * * /bin/bash -l -c -x '
 source /home/tluser/.bashrc;
 python /home/tluser/tlp/ingest-service/core.py -v >> /home/tluser/tlp/ingest-service/core.log 2>&1;'

To prevent multiple instances of the Ingest Service from running at the same time, you can simply use a custom lock file:

*/1 * * * * /bin/bash -l -c -x '
  if [ ! -e /home/tluser/.cron-ingest.lock ]; then
    touch /home/tluser/.cron-ingest.lock;
    source /home/tluser/.bashrc;
    python /home/tluser/tlp/ingest-service/core.py -v >> /home/tluser/tlp/ingest-service/core.log 2>&1;
    rm /home/tluser/.cron-ingest.lock;
  fi'
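
Note that the shell snippet above tests for the lock file and creates it in two separate steps, and that the lock is left behind if the run is interrupted before the rm command. As a possible alternative (not part of TLP), the lock file can be created atomically from a small Python 3 wrapper. The function names and the lock path in this sketch are hypothetical.

```python
import os
import tempfile


def try_acquire_lock(path):
    """Atomically create the lock file; return False if it already exists."""
    try:
        # O_EXCL makes the call fail if the file is already there, so the
        # existence check and the creation happen in a single step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for manual inspection
    return True


def release_lock(path):
    """Remove the lock file so that the next scheduled run can start."""
    os.remove(path)


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.gettempdir(), "cron-ingest-example.lock")
    if try_acquire_lock(lock_path):
        try:
            print("lock acquired; core.py would be launched here")
        finally:
            release_lock(lock_path)  # also runs if the launch raises an exception
    else:
        print("another run is still active; skipping")
```

The try/finally clause releases the lock when the wrapped call raises an exception, but not when the process is killed outright; in that case the stale lock file still has to be removed by hand.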

Appendix B: Calling the TLP Player

The TLP Player must be called using different input parameters depending on which subtitles are being edited, the language of these subtitles and what kind of user is doing the editing. These parameters are sent as a Base64-encoded JSON string via HTTP GET or POST methods. A full request key is sent for the authentication of the API client on all Web Service calls.

Warning API users should avoid sending their secret key in the Player input parameters, since these parameters are exposed to third parties (i.e. external Player users) as they travel inside the URL to the Player. Please see the Annex Generating a Request Key to learn how to produce valid full request keys in order to protect your private secret key.
Tip The transLectures-UPV Platform includes in its Client Tools Package the player-url-generator.py command-line script that generates valid Player URLs for the given input parameters. The usage of the --debug option might be very useful to check how these URLs are generated. Furthermore, you will find libraries for different platforms that include Player URL generation methods.

Input parameters

JSON Object Specification:
 {
  "id" : <str> ,
  "lang" : <str> ,
  "author_id" : <str> ,
  "author_conf" : <int> ,
  "author_name" : <str> ,
  "expire" : <int> ,
  "api_user" : <str> ,
  "request_key" : <str>
 }
id:<str>

Media ID.

lang:<str>

Language code of the subtitles being edited (e.g. en, es, ca). This parameter is optional: if it is not defined, the Player will load the source language transcriptions.

author_id:<str>

ID of the user that will edit the subtitles. It is typically the internal user ID that the API client’s organisation assigns to the user.

author_conf:<int>

Integer value (range 0-100) that indicates the confidence level that the API client’s organisation assigns to the user.

author_name:<str>

Full name of the user that will edit the subtitles (optional).

expire:<int>

Expiration date of the URL in UNIX timestamp format.

api_user:<str>

TLP username / API Client username (Please see Web Service user authentication).

request_key:<str>

Request key (see Generating a Request Key).

Example 1. TLP Player base64-JSON-encoded parameters example:
 {
  "id" : "id-001",
  "lang" : "en",
  "author_id" : "bobama",
  "author_conf" : 100,
  "author_name" : "Barack Obama",
  "expire" : 1400173491,
  "api_user" : "tluser",
  "request_key" : "5251982f3d00544e6e9a91962a2eec2f0b3df38c"
 }

Parameters are sent as a Base64-encoded JSON string. The JSON string for the above example would be as follows:

 {"id" : "id-001", "lang" : "en", "author_id" : "bobama", "author_conf" : 100, "author_name" : "Barack Obama", "expire" : 1400173491, "api_user" : "tluser", "request_key" : "5251982f3d00544e6e9a91962a2eec2f0b3df38c"}

Base64 encode of the above JSON string:

 eyJpZCIgOiAiaWQtMDAxIiwgImxhbmciIDogImVuIiwgImF1dGhvcl9pZCIgOiAiYm9iYW1hIiwgImF1dGhvcl9jb25mIiA6IDEwMCwgImF1dGhvcl9uYW1lIiA6ICJCYXJhY2sgT2JhbWEiLCAiZXhwaXJlIiA6IDE0MDAxNzM0OTEsICJhcGlfdXNlciIgOiAidGx1c2VyIiwgInJlcXVlc3Rfa2V5IiA6ICI1MjUxOTgyZjNkMDA1NDRlNmU5YTkxOTYyYTJlZWMyZjBiM2RmMzhjIn0=

HTTP call

The Base64-encoded JSON string is passed to the Player via the HTTP GET or POST method using the following parameters:

  • request → Base64 JSON string

  • t → Media start time in seconds (optional)

Example 2. TLP Player URL:
http://ttp.mllp.upv.es/player?request=eyJpZCIgOiAiaWQtMDAxIiwgImxhbmciIDogImVuIiwgImF1dGhvciIgOiAiYm9iYW1hIiwgImF1dGhvcl9uYW1lIiA6ICJCYXJhY2sgT2JhbWEiLA0KImF1dGhvcl9jb25mIiA6IDEwMCwgImludGVybmFsdXNlciIgOiAwLCAiZXhwaXJlIiA6IDE0MDAxNzM0OTEsICJhcGlfdXNlciIgOiAidGx1c2VyIiwgDQoicmVxdWVzdF9rZXkiIDogIjUyNTE5ODJmM2QwMDU0NGU2ZTlhOTE5NjJhMmVlYzJmMGIzZGYzOGMifQ==
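
The encoding steps described in this appendix can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not the implementation of player-url-generator.py: the function names are hypothetical, the request key is taken as given (its generation is covered in Generating a Request Key), and the Base64 string is percent-encoded so that characters such as = survive inside the query string.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_player_url(player_url, params, start_time=None):
    """Encode params as Base64 JSON and pass them in the 'request' parameter."""
    encoded = base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")
    query = {"request": encoded}
    if start_time is not None:
        query["t"] = start_time  # optional media start time in seconds
    return player_url + "?" + urlencode(query)


def decode_player_url(url):
    """Inverse operation, handy for checking what a generated URL contains."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(base64.b64decode(query["request"][0]).decode("utf-8"))


if __name__ == "__main__":
    params = {
        "id": "id-001",
        "lang": "en",
        "author_id": "bobama",
        "author_conf": 100,
        "author_name": "Barack Obama",
        "expire": 1400173491,
        "api_user": "tluser",
        "request_key": "5251982f3d00544e6e9a91962a2eec2f0b3df38c",
    }
    url = build_player_url("http://localhost/player", params, start_time=30)
    print(url)
    print(decode_player_url(url) == params)
```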

Appendix C: Web Service API Specification

In this section, a detailed description of the inputs and outputs of all Web Service interfaces is provided.

Tip The transLectures-UPV Platform includes in its Client Tools Package several libraries for different platforms that implement all the interfaces described below. Also, you will find in that package the ws-client.py command-line tool ready to be used for making all these API calls.

Preface

Please read the following considerations carefully before interacting with this API:

Allowed HTTP methods

The interfaces featured by the Web Service can be called using either GET or POST methods.

  • GET Method:

    • Using a single Base64-encoded JSON dict data GET parameter. Example:

      Parameters encoded into a JSON Dict:

      {"parameter1":"value1", "parameter2": "value2"}

      Base64-encoded JSON dict:

      eyJwYXJhbWV0ZXIxIjoidmFsdWUxIiwgInBhcmFtZXRlcjIiOiAidmFsdWUyIn0

      URL:

      http://ttp.mllp.upv.es/api/action?data=eyJwYXJhbWV0ZXIxIjoidmFsdWUxIiwgInBhcmFtZXRlcjIiOiAidmFsdWUyIn0
    • Using multiple GET parameters. Example:

      GET Parameters:

      parameter1=value1
      parameter2=value2

      URL:

      http://ttp.mllp.upv.es/api/action?parameter1=value1&parameter2=value2
  • POST Method:

    • All parameters must be sent as a Base64-encoded JSON dictionary stored in the body of the request. Example:

      Parameters encoded into a JSON Dict:

      {"parameter1":"value1", "parameter2": "value2"}

      Base64-encoded JSON dict:

      eyJwYXJhbWV0ZXIxIjoidmFsdWUxIiwgInBhcmFtZXRlcjIiOiAidmFsdWUyIn0

      URL:

      http://ttp.mllp.upv.es/api/action

      (+ POST data)

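The explanation above applies to all interfaces except /ingest, which requires a combined POST+GET query: the query parameters are sent in the URL as a GET query, and the Media Package file is sent in the body of the request as a POST query (see the /ingest interface).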

Note The Web Service will return an HTTP 400 Bad Request error code whenever required arguments are missing or are provided under an incorrect format.
Tip The ws-client.py command-line tool implements (via the libtlp python library) all possible ways to send query parameters to the API described above (see the options --use-get-query and --use-data-param). If you plan to implement your own API client, the --debug option of this script might be very useful for you to check how parameters and HTTP calls must be generated.
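
For readers writing their own client, the three ways of sending parameters described above can be sketched in Python 3 as follows. The function names are hypothetical and this is not the libtlp implementation. Note also that the Base64 examples above omit the trailing = padding, whereas base64.b64encode keeps it; which of the two forms the Web Service expects is not stated in this section.

```python
import base64
import json
from urllib.parse import urlencode


def encode_params(params):
    """Serialise params to JSON and Base64-encode the result."""
    return base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")


def get_url_with_data_param(api_url, action, params):
    """GET, first form: a single 'data' parameter with the Base64-encoded JSON dict."""
    return "%s/%s?%s" % (api_url, action, urlencode({"data": encode_params(params)}))


def get_url_with_plain_params(api_url, action, params):
    """GET, second form: one GET parameter per entry of the dict."""
    return "%s/%s?%s" % (api_url, action, urlencode(params))


def post_body(params):
    """POST: the Base64-encoded JSON dict is sent as the body of the request."""
    return encode_params(params).encode("ascii")


if __name__ == "__main__":
    params = {"parameter1": "value1", "parameter2": "value2"}
    api_url = "http://ttp.mllp.upv.es/api"
    print(get_url_with_data_param(api_url, "action", params))
    print(get_url_with_plain_params(api_url, "action", params))
    print(post_body(params))
```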

API Client Authentication

  • API clients have to send in every call to the Web Service the following parameters:

    • user → API Client username / TLP username,

    • auth_token → The authentication token for user authentication.

  • And additionally, if auth_token is a request_key:

    • expire → Expiration date UNIX timestamp for the request (seconds since 01-01-1970 in UTC to the expiration date).

  • To learn how to generate a valid request key, please refer to the Generating a Request Key Annex.

  • The request key authentication method is valid for all interfaces, except /ingest, /uploadslist, /status, /systems and /revisions.

  • The Web Service will return the following HTTP error codes:

    • 401 Unauthorized, if:

      • The API user does not exist,

      • the authentication token is invalid,

      • the API user does not have permissions to get information about the provided object ID.

    • 419 Authentication Timeout, if:

      • the authentication token has expired (only for Request Key).
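
As a rough sketch of how a client might assemble these parameters and react to the documented error codes, consider the following Python 3 helpers. All names are hypothetical. The expire value is taken as an argument because it must match the one used when the request key was generated (see the Generating a Request Key Annex).

```python
import time


def expiry_timestamp(ttl_seconds, now=None):
    """UNIX timestamp (UTC) that lies ttl_seconds in the future."""
    current = int(time.time()) if now is None else now
    return current + ttl_seconds


def auth_parameters(user, auth_token, expire=None):
    """Parameters to add to every call; 'expire' only when using a request key."""
    params = {"user": user, "auth_token": auth_token}
    if expire is not None:
        params["expire"] = expire
    return params


# HTTP error codes documented for authentication failures.
AUTH_ERRORS = {
    401: "Unauthorized: unknown API user, invalid token or no permission on the object ID",
    419: "Authentication Timeout: the request key has expired",
}


def describe_auth_error(http_status):
    """Return a description of a documented authentication error, else None."""
    return AUTH_ERRORS.get(http_status)


if __name__ == "__main__":
    print(auth_parameters("tluser", "edbab44c3a3f1ca8db4de8277a3b"))
    print(describe_auth_error(419))
```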

/ingest

This interface allows the client to upload media files to the platform so they can be automatically transcribed and translated into several languages by the Ingest Service. The uploaded data (media, slides, documents, etc.) are bundled into a non-compressed ZIP file called Media Package (MPF). The Web Service stores the Media Packages in the server and returns an Upload ID, which can be used afterwards to check the upload progress via the /status interface.

Tip If you are developing your own API client, it is recommended to enable the Test Mode of the Ingest Service when performing call tests to this interface. Please refer to the Manifest JSON File Specification.

Input

Input data is divided into two parts: the query parameters, which are sent in the URL as a GET query, and the Media Package file, which is sent in the body of the request as a POST query. The Media Package file must be in ZIP format: the Content-Type must be application/zip, or multipart/form-data with a field named file and Content-Type = application/zip.

| Parameter name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| id | str | Yes | Object ID (metadataexternal_id key in the Manifest file). | MEDIA-ID-1234 |
| opc | str | Yes | Operation code (operation_code key in the Manifest file). | 0 |
| email | str | No | Email address to send notifications about status updates. | jsnow21@got.com |
| user | str | Yes | TLP Username / API Username. | tluser |
| auth_token | str | Yes | Authentication token for the provided user. | edbab44c3a3f1ca8db4de8277a3b |

Example 3. [/ingest] Upload a Media Package File (GET with multiple parameters, MPF goes via POST)
http://ttp.mllp.upv.es/api/ingest?id=MEDIA-ID-1234&opc=0&email=jsnow21@got.com&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 {
  "rcode" : <int> ,
  "rcode_description" : <str> ,
  "id" : <str> ,
  "hash" : <str>
 }
rcode:<int>

Return code.

  • 0 → Upload completed.

rcode_description:<str>

Description of the return code (rcode).

id:<str>

Upload ID, which can be used afterwards to check the progress of the upload via the /status interface.

hash:<str>

Internal media hash ID.

Output Examples

Media Package File successfully uploaded.
 {
  "rcode" : 0,
  "rcode_description" : "Ingestion complete",
  "id" : "UPLOAD-ID-1234",
  "hash" : "433e11c295c51b94a074a"
 }
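
The combined GET+POST call can be sketched in Python 3 with the standard urllib module. The sketch below only builds the request object and parses a reply; it assumes the plain application/zip variant (not multipart/form-data), the function names are hypothetical, and treating any non-zero rcode as a failure is an assumption, since only rcode 0 is documented above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_ingest_request(api_url, media_id, opc, user, auth_token, mpf_bytes, email=None):
    """Build (without sending) the combined GET+POST request for /ingest."""
    query = {"id": media_id, "opc": opc, "user": user, "auth_token": auth_token}
    if email is not None:
        query["email"] = email
    return Request(
        api_url + "/ingest?" + urlencode(query),  # query parameters go in the URL
        data=mpf_bytes,  # the Media Package ZIP goes in the body
        headers={"Content-Type": "application/zip"},
        method="POST",
    )


def parse_ingest_response(body):
    """Return the Upload ID of a JSON reply, or raise if rcode is not 0."""
    reply = json.loads(body)
    if reply["rcode"] != 0:
        raise RuntimeError(reply.get("rcode_description", "ingest failed"))
    return reply["id"]


if __name__ == "__main__":
    request = build_ingest_request(
        "http://ttp.mllp.upv.es/api", "MEDIA-ID-1234", 0,
        "tluser", "edbab44c3a3f1ca8db4de8277a3b", b"...zip bytes...",
    )
    print(request.get_method(), request.full_url)
```

The request would then be sent with urllib.request.urlopen(request), and the returned Upload ID can be passed to the /status interface.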

/uploadslist

Returns a list of all the user’s uploads.

Input

| Parameter name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| object_id | str | No | Get list of uploads involving the provided object ID (could be an Upload ID or a Media ID). | MEDIA-ID-1234 |
| user | str | Yes | TLP Username / API Username. | tluser |
| auth_token | str | Yes | Authentication token for the provided user. | edbab44c3a3f1ca8db4de8277a3b |

Example 4. [/uploadslist] Get list of uploads.
http://ttp.mllp.upv.es/api/uploadslist?user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
[
    {
        "id": <str>,
        "object_id": <str>,
        "status_code": <int>,
        "uploaded_on": <str>,
        "last_update": <str>
    },
    ...
]
id:<str>

Upload ID.

object_id:<str>

Object ID involved (could be an Upload ID or a Media ID).

status_code:<int>

Status code of the Upload.

  • 0 → Video ingested, not processed yet.

  • 1 → Processing Media Package file.

  • 2 → Transcription in progress.

  • 3 → Translation(s) in progress.

  • 4 → Text-To-Speech audio track(s) generation in progress.

  • 5 → Preparing data for the TLP Player.

  • 6 → New media successfully processed.

  • 20 → Media successfully updated.

  • 30 → Media successfully deleted.

  • 40 → Cancel upload operation successfully processed.

  • 90 → Upload cancelled.

  • 100 → An unknown error occurred.

  • 101 → An error occurred while processing the Media Package.

  • 102 → An error occurred while transcribing the media.

  • 103 → An error occurred while translating the media.

  • 104 → An error occurred while generating Text-To-Speech audio tracks.

  • 105 → An error occurred while preparing data for the TLP Player.

  • 106 → An error occurred while updating internal database.

uploaded_on:<str>

Upload timestamp.

last_update:<str>

Last upload check timestamp.

Output Examples

Uploads List returned.
[
    {
        "id": "up-ac83be70-a01c-4c18-8cc4-dc0b2676cbb0",
        "object_id": "MEDIA-ID-1234",
        "status_code": 2,
        "uploaded_on": "2015-06-10 17:19:37.239458",
        "last_update": "2015-06-10 17:20:02.557135"
    },
    {
        "id": "up-60a70bbd-e111-4d0c-b41f-6e235c434330",
        "object_id": "MEDIA-ID-1234",
        "status_code": 101,
        "uploaded_on": "2015-06-09 11:21:07.735656",
        "last_update": "2015-06-09 11:22:02.549826"
    },
    {
        "id": "up-776ae4ec-6904-4da1-afa8-1b68017a524a",
        "object_id": "MEDIA-ID-5678",
        "status_code": 6,
        "uploaded_on": "2015-06-10 17:25:04.673541",
        "last_update": "2015-06-10 17:26:02.542902"
    }
]
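
A client will typically want to reduce such a list to one state per media file. The following Python 3 sketch shows one way to do so. The grouping of the status codes into coarse states is our own reading of the list above, and the function names are hypothetical.

```python
# Coarse grouping of the documented status codes (an interpretation, see above).
IN_PROGRESS = {0, 1, 2, 3, 4, 5}
COMPLETED = {6, 20, 30, 40}
CANCELLED = {90}


def classify_status(status_code):
    """Map a documented status code to a coarse state label."""
    if status_code in IN_PROGRESS:
        return "in progress"
    if status_code in COMPLETED:
        return "completed"
    if status_code in CANCELLED:
        return "cancelled"
    if 100 <= status_code <= 106:
        return "error"
    return "unknown"


def latest_upload_per_object(uploads):
    """Keep, for each object_id, the upload with the most recent uploaded_on."""
    latest = {}
    for upload in uploads:
        current = latest.get(upload["object_id"])
        # Timestamps in this fixed-width format sort correctly as strings.
        if current is None or upload["uploaded_on"] > current["uploaded_on"]:
            latest[upload["object_id"]] = upload
    return latest


if __name__ == "__main__":
    uploads = [
        {"id": "up-1", "object_id": "MEDIA-ID-1234", "status_code": 2,
         "uploaded_on": "2015-06-10 17:19:37.239458"},
        {"id": "up-2", "object_id": "MEDIA-ID-1234", "status_code": 101,
         "uploaded_on": "2015-06-09 11:21:07.735656"},
    ]
    for object_id, upload in latest_upload_per_object(uploads).items():
        print(object_id, upload["id"], classify_status(upload["status_code"]))
```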

/status

Returns information about the progress of an uploaded media given an Upload ID. It enables the remote repository to keep track of the automatic uploads and to notice possible processing errors.

Input

| Parameter name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| id | str | Yes | Upload ID, returned by the /ingest interface. | UPLOAD-ID-1234 |
| user | str | Yes | TLP Username / API Username. | tluser |
| auth_token | str | Yes | Authentication token for the provided user. | edbab44c3a3f1ca8db4de8277a3b |

Example 5. [/status] check status of an upload (GET with multiple parameters).
http://ttp.mllp.upv.es/api/status?id=UPLOAD-ID-1234&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
  "rcode" : <int> ,
  "rcode_description" : <str> ,
  "status_code" : <int> ,
  "info" : <str> ,
  "error_code" : <int> ,
  "uploaded_on" : <str> ,
  "last_update" : <str>
}
rcode:<int>

Return code.

  • 0 → Upload ID exists.

  • 1 → Upload ID does not exist.

rcode_description:<str>

Description of the return code (rcode).

status_code:<int>

Status code of the upload.

  • 0 → Video ingested, not processed yet.

  • 1 → Processing Media Package file.

  • 2 → Transcription in progress.

  • 3 → Translation(s) in progress.

  • 4 → Text-To-Speech audio track(s) generation in progress.

  • 5 → Preparing data for the TLP Player.

  • 6 → New media successfully processed.

  • 20 → Media successfully updated.

  • 30 → Media successfully deleted.

  • 40 → Cancel upload operation successfully processed.

  • 90 → Upload cancelled.

  • 100 → An unknown error occurred.

  • 101 → An error occurred while processing the Media Package.

  • 102 → An error occurred while transcribing the media.

  • 103 → An error occurred while translating the media.

  • 104 → An error occurred while generating Text-To-Speech audio tracks.

  • 105 → An error occurred while preparing data for the TLP Player.

  • 106 → An error occurred while updating internal database.

info:<str>

Detailed information about the status code.

error_code:<int>

Generic error code that identifies the operation that failed within the process, if any. Otherwise null.

uploaded_on:<str>

Upload timestamp.

last_update:<str>

Last status check timestamp.

Output Examples

The Upload ID provided does not exist.
 {
   "rcode": 1,
   "rcode_description" : "Upload ID [ up-1234 ] does not exist."
 }
The provided upload ID exists and it’s being processed.
 {
   "rcode": 0,
   "rcode_description" : "Upload ID exists.",
   "status_code": 2,
   "info": "Transcription in progress. It may take several hours for it to finish.",
   "uploaded_on": "2014-03-26 19:02:16.174944",
   "last_update": "2014-03-26 19:03:05.298861"
 }
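
Since processing may take hours, a repository will usually poll this interface at intervals. The Python 3 sketch below shows such a loop under stated assumptions: fetch_status is a caller-supplied function (hypothetical) that performs the HTTP call and returns the decoded JSON object, and status codes 0 to 5 are treated as "still in progress".

```python
import time

# Status codes that mean the upload is still being processed.
IN_PROGRESS_CODES = {0, 1, 2, 3, 4, 5}


def wait_for_upload(fetch_status, upload_id, interval=60.0, max_checks=100, sleep=time.sleep):
    """Poll /status until the upload leaves the in-progress codes.

    fetch_status: callable that takes an Upload ID and returns the decoded
    /status JSON object. Returns the last reply received.
    """
    for _ in range(max_checks):
        reply = fetch_status(upload_id)
        if reply["rcode"] != 0:
            # rcode 1: the Upload ID does not exist.
            raise LookupError(reply["rcode_description"])
        if reply["status_code"] not in IN_PROGRESS_CODES:
            return reply
        sleep(interval)
    raise TimeoutError("upload %s still in progress after %d checks" % (upload_id, max_checks))


if __name__ == "__main__":
    # Canned replies stand in for real HTTP calls in this demonstration.
    replies = iter([
        {"rcode": 0, "rcode_description": "Upload ID exists.", "status_code": 2},
        {"rcode": 0, "rcode_description": "Upload ID exists.", "status_code": 6},
    ])
    final = wait_for_upload(lambda upload_id: next(replies), "UPLOAD-ID-1234", interval=0)
    print(final["status_code"])
```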

/systems

Get a list of all available ASR/MT/TTS Systems that can be applied to transcribe/translate/synthesize an uploaded media file using the /ingest interface.

Input

| Parameter name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| user | str | Yes | TLP Username / API Username. | tluser |
| auth_token | str | Yes | Authentication token for the provided user. | edbab44c3a3f1ca8db4de8277a3b |

Example 6. [/systems] Get list of available systems.
http://ttp.mllp.upv.es/api/systems?user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
    "asr": [
        {
            "lang": <str>,
            "id": <int>,
            "name": <str>,
            "description": <str>
        },
        ...
    ],
    "mt": [
        {
            "source_lang": <str>,
            "target_lang": <str>,
            "id": <int>,
            "name": <str>,
            "description": <str>
        },
        ...
    ],
    "tts": [
        {
            "lang": <str>,
            "id": <int>,
            "name": <str>,
            "description": <str>,
            "voice_gender": <str>
        },
        ...
    ]
}
asr:<list:dict>

List of all available Automatic Speech Recognition Systems. <dict> keys:

  • lang:<str> → ASR System Language code (ISO-639-1).

  • id:<int> → System ID.

  • name:<str> → Name of the system.

  • description:<str> → Description of the system.

mt:<list:dict>

List of all available Machine Translation Systems. <dict> keys:

  • source_lang:<str> → MT System source language code (ISO-639-1).

  • target_lang:<str> → MT System target language code (ISO-639-1).

  • id:<int> → System ID.

  • name:<str> → Name of the system.

  • description:<str> → Description of the system.

tts:<list:dict>

List of all available Text-To-Speech Systems. <dict> keys:

  • lang:<str> → TTS System Language code (ISO-639-1).

  • id:<int> → System ID.

  • name:<str> → Name of the system.

  • description:<str> → Description of the system.

  • voice_gender:<str> → Gender of the voice that produces this system.

Note The system ID can be used to explicitly ask the Ingest Service to apply a particular ASR/MT/TTS system. For further information, please see Requesting Subtitle Languages.

Output Examples

Systems List returned.
{
    "asr": [
        {
            "lang": "en",
            "id": 43,
            "name": "English ASR System",
            "description": ""
        },
        {
            "lang": "es",
            "id": 22,
            "name": "Spanish ASR System",
            "description": ""
        },
        {
            "lang": "ca",
            "id": 64,
            "name": "Catalan ASR System",
            "description": ""
        }
    ],
    "mt": [
        {
            "source_lang": "ca",
            "target_lang": "es",
            "id": 14,
            "name": "Catalan-Spanish MT System",
            "description": ""
        },
        {
            "source_lang": "es",
            "target_lang": "ca",
            "id": 11,
            "name": "Spanish-Catalan MT System",
            "description": ""
        },
        {
            "source_lang": "en",
            "target_lang": "es",
            "id": 73,
            "name": "English-Spanish MT System",
            "description": ""
        },
        {
            "source_lang": "es",
            "target_lang": "en",
            "id": 24,
            "name": "Spanish-English MT System",
            "description": ""
        }
    ],
    "tts": [
        {
            "lang": "en",
            "id": 71,
            "name": "English TTS System (Female)",
            "description": "",
            "voice_gender": "f"
        }
    ]
}
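
Given a decoded reply like the one above, a client can look up the system IDs that apply to a particular language or language pair. The Python 3 sketch below is a minimal illustration; the function names are hypothetical and the SYSTEMS dictionary is an abridged copy of the example output.

```python
# Abridged copy of the example output above.
SYSTEMS = {
    "asr": [
        {"lang": "en", "id": 43, "name": "English ASR System", "description": ""},
        {"lang": "ca", "id": 64, "name": "Catalan ASR System", "description": ""},
    ],
    "mt": [
        {"source_lang": "es", "target_lang": "ca", "id": 11,
         "name": "Spanish-Catalan MT System", "description": ""},
        {"source_lang": "es", "target_lang": "en", "id": 24,
         "name": "Spanish-English MT System", "description": ""},
    ],
    "tts": [],
}


def find_asr_systems(systems, lang):
    """ASR systems that transcribe the given language."""
    return [s for s in systems.get("asr", []) if s["lang"] == lang]


def find_mt_systems(systems, source_lang, target_lang):
    """MT systems that translate source_lang into target_lang."""
    return [s for s in systems.get("mt", [])
            if s["source_lang"] == source_lang and s["target_lang"] == target_lang]


def supported_targets(systems, source_lang):
    """Sorted target languages reachable from source_lang via MT."""
    return sorted({s["target_lang"] for s in systems.get("mt", [])
                   if s["source_lang"] == source_lang})


if __name__ == "__main__":
    print([s["id"] for s in find_mt_systems(SYSTEMS, "es", "en")])
    print(supported_targets(SYSTEMS, "es"))
```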

/metadata

Returns metadata and media file locations of a given media ID. For example, this operation is called by the TLP Player to get the main media file location.

Input

| Parameter name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| id | str | Yes | Media ID. | MEDIA-ID-1234 |
| user | str | Yes | TLP Username / API Username. | tluser |
| auth_token | str | Yes | Authentication token for the provided user. | edbab44c3a3f1ca8db4de8277a3b |
| expire | int | No | Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key. | 1434132876 |

Example 7. [/metadata] Get media metadata (GET with multiple parameters).
http://ttp.mllp.upv.es/api/metadata?id=MEDIA-ID-1234&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 {
  "rcode": <int> ,
  "rcode_description": <str> ,
  "mediainfo": {
                   "language" : <str> ,
                   "title" : <str> ,
                   "category" : <str> ,
                   "duration" : <str> ,
                   "speakers" : [
                                 {
                                   "name" : <str>
                                 } ,
                                 ...
                                ]
                 } ,
  "media": [
             {
               "is_url" : <bool> ,
               "type_code" : <int> ,
               "media_format" : <str> ,
               "location" : <str>
             } ,
             ...
           ],
  "audiotracks": [
        {
            "lang": <str>,
            "voice_type": <str>,
            "id": <int>,
            "location": <str>,
            "media_format": <str>,
            "audio_type": <str>,
            "is_url": <bool>,
            "sub_type": <int>,
            "description": <str>
        } ,
        ...
    ],
  "attachments": [
             {
               "is_url" : <bool> ,
               "type_code" : <int> ,
               "media_format" : <str> ,
               "location" : <str>
             } ,
             ...
           ]
 }
rcode:<int>

Return code of the WS call.

  • 0 → Media ID exists.

  • 1 → Media ID does not exist.

rcode_description:<str>

Description of the return code (rcode).

mediainfo:<dict>

Media Metadata. <dict> keys:

  • language:<str> → Language of the media.

  • title:<str> → Title of the media.

  • duration:<int> → Media duration (in seconds).

  • speakers:<list:dict> → Info about the media’s speaker(s). <dict> keys:

    • name:<str> → Name of the speaker.

media:<list:dict>

List of media files available. <dict> keys:

  • is_url:<bool> → Defines whether location is a URL or not. If not, it is a relative physical path.

  • type_code:<int> → File type code.

    • 0 → Main media file (video or audio).

    • 3 → Small video thumbnail.

    • 6 → PCM stream used to draw the waveform.

  • media_format:<str> → Format of the media file (e.g. mp4, ogv, …).

  • location:<str> → Relative path or URL in which the file is located.

audiotracks:<list:dict>

List of available audiotracks. <dict> keys:

  • id:<int> → Audiotrack ID.

  • lang:<str> → Language of the audiotrack.

  • location:<str> → Relative path or URL in which the file is located.

  • is_url:<bool> → Defines whether location is a URL or not. If not, it is a relative physical path.

  • media_format:<str> → Audio format (e.g. mp3, wav, …).

  • voice_type:<str> → Voice type. Possible values:

    • tts → Text-To-Speech, synthesized audiotrack

    • rec → Human-recorded audiotrack.

  • voice_gender:<str> → Voice gender. Possible values:

    • m → Male voice.

    • f → Female voice.

  • sub_type:<int> → If voice_type is tts and the audiotrack was automatically generated from a subtitle file, it shows the corresponding subtitle type code. It defines the level of human supervision of the subtitles that generated the synthesized audiotrack.

    • 0 → Fully Automatic: The subtitles are entirely automatic.

    • 1 → Partially Human: The subtitles have been supervised, but not for the whole video.

    • 2 → Fully Human: The subtitles have been fully supervised by humans.

  • description:<str> → A brief text description (if any) of the audiotrack.

attachments:<list:dict>

List of attachments available. <dict> keys:

  • is_url:<bool> → Defines whether location is a URL or not. If not, it is a relative physical path.

  • type_code:<int> → File type code.

    • 1 → Slides file (video, ppt, pdf, txt)

    • 2 → Related documents (ppt, pdf, txt)

  • media_format:<str> → Format of the attachment (e.g. pdf, ppt, txt, …).

  • location:<str> → Relative path or URL in which the file is located.

Output Examples

Media ID does not exist.
 {
  "rcode" : 1,
  "rcode_description" : "Media ID [ 1234-abcd ] does not exist or has no media"
 }
Media ID exists.
  {
    "rcode": 0,
    "rcode_description": "Media list and info available.",
    "attachments": [
        {
            "is_url": true,
            "type_code": 1,
            "media_format": "txt",
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/b3f0bee253651191cdd1f1ee6c865074.txt"
        },
        {
            "is_url": true,
            "type_code": 2,
            "media_format": "txt",
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/d41d8cd98f00b204e9800998ecf8427e.txt"
        }
    ],
    "media": [
        {
            "is_url": true,
            "type_code": 0,
            "media_format": "mp4",
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/1a4edf93069b3.mp4"
        },
        {
            "is_url": true,
            "type_code": 3,
            "media_format": "jpg",
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/d7bbbe4210bd3.jpg"
        },
        {
            "is_url": true,
            "type_code": 6,
            "media_format": "pcm",
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/75bae29f33670.pcm"
        }
    ],
    "mediainfo": {
        "duration": 86,
        "speakers": [
            {
                "name": "John Snow"
            }
        ],
        "language": "en",
        "title": "I do know nothing"
    },
    "audiotracks": [
        {
            "lang": "es",
            "voice_type": "tts",
            "id": 68,
            "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/audiotrack.es.mp3",
            "media_format": "mp3",
            "audio_type": null,
            "is_url": true,
            "sub_type": 2,
            "description": null
        }
    ]
  }
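
As an illustration of how a client such as a player might use this response, the following minimal Python sketch picks out the main media file. The helper name main_media_location is hypothetical; the field names and the type code 0 are taken from the description above.

```python
from typing import Optional

MAIN_MEDIA = 0  # type_code of the main media file, as listed above


def main_media_location(response: dict) -> Optional[str]:
    """Return the location of the main media file, or None if unavailable."""
    if response.get("rcode") != 0:
        return None  # Media ID does not exist
    for entry in response.get("media", []):
        if entry.get("type_code") == MAIN_MEDIA:
            return entry.get("location")
    return None


# Trimmed copy of the "Media ID exists" example above.
sample = {
    "rcode": 0,
    "media": [
        {"is_url": True, "type_code": 3, "media_format": "jpg",
         "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/d7bbbe4210bd3.jpg"},
        {"is_url": True, "type_code": 0, "media_format": "mp4",
         "location": "http://ttp.mllp.upv.es/data/9/9bc70b33e49c2/1a4edf93069b3.mp4"},
    ],
}
print(main_media_location(sample))
```

When is_url is false, location is a relative path, so a real client would still need to resolve it against wherever the media files are stored.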

/langs

Returns list of subtitle languages available for a specific media ID.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 8. [/langs] Get list of subtitle languages available (GET with multiple parameters).
http://ttp.mllp.upv.es/api/langs?id=MEDIA-ID-1234&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 {
  "rcode" : <int> ,
  "rcode_description" : <str> ,
  "media_lang" : <str> ,
  "subs_locked" : <bool> ,
  "langs" : [
              {
                "lang_code" : <str> ,
                "lang_name" : <str> ,
                "sup_status" : <int> ,
                "audiotracks": [
                                 {
                                   "aid": <int>,
                                   "voice_gender": <str>,
                                   "voice_type": <str>
                                 },
                                 ...
                               ]

              },
              ...
            ]
 }
rcode:<int>

Return code of the WS call.

  • 0 → Media ID exists.

  • 1 → Media ID does not exist.

rcode_description:<str>

Description of the return code (rcode).

media_lang:<str>

Language code (ISO-639-1) of the media’s original audio language.

subs_locked:<bool>

Lock status of subtitles. If true, only authors are allowed to send subtitle modifications.

langs:<list:dict>

List of languages available. <dict> keys:

  • lang_code:<str> → Subtitle language code (ISO-639-1).

  • lang_name:<str> → Local language name.

  • sup_status:<int> → Subtitle supervision status code. Defines the level of human supervision of the subtitles.

    • 0 → Fully Automatic: The subtitles are entirely automatic.

    • 1 → Partially Human: The subtitles have been supervised, but not for the whole video.

    • 2 → Fully Human: The subtitles have been fully supervised by humans.

  • audiotracks:<list:dict> → List of audiotracks available in the corresponding language. <dict> keys:

    • voice_gender:<str> → Voice gender.

      • m → Male voice.

      • f → Female voice.

    • aid:<int> → Audiotrack ID.

    • voice_type:<str> → Voice type.

      • tts → Text-To-Speech, synthesized audiotrack

      • rec → Human-recorded audiotrack.

Output Examples

Media ID does not exist or has no subtitles.
 {
  "rcode" : 1 ,
  "rcode_description" : "ID 1234-abcd does not exist or has no subtitles"
 }
Media ID exists and has subtitles available.
{
    "rcode": 0,
    "rcode_description": "Language list available.",
    "media_lang": "es",
    "subs_locked": false,
    "langs": [
        {
            "lang_code": "es",
            "sup_status": 1,
            "lang_name": "Español",
            "audiotracks": []
        },
        {
            "lang_code": "ca",
            "sup_status": 0,
            "lang_name": "Català",
            "audiotracks": []
        },
        {
            "lang_code": "en",
            "sup_status": 2,
            "lang_name": "English",
            "audiotracks": [
                {
                    "aid": 12,
                    "voice_gender": "f",
                    "voice_type": "tts"
                }
            ]
        }
    ]
}
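
The supervision status codes can be turned into readable labels on the client side. The following is a minimal Python sketch under that assumption; the helper name describe_langs is hypothetical, and the labels are the ones listed for sup_status above.

```python
# Labels for sup_status, as documented above.
SUP_STATUS = {0: "Fully Automatic", 1: "Partially Human", 2: "Fully Human"}


def describe_langs(response: dict) -> list:
    """Return (lang_code, supervision label) pairs from a /langs response."""
    if response.get("rcode") != 0:
        return []  # Media ID does not exist or has no subtitles
    return [
        (lang["lang_code"], SUP_STATUS.get(lang["sup_status"], "Unknown"))
        for lang in response.get("langs", [])
    ]


# Trimmed copy of the successful example above.
sample = {
    "rcode": 0,
    "media_lang": "es",
    "langs": [
        {"lang_code": "es", "sup_status": 1, "lang_name": "Español", "audiotracks": []},
        {"lang_code": "ca", "sup_status": 0, "lang_name": "Català", "audiotracks": []},
        {"lang_code": "en", "sup_status": 2, "lang_name": "English", "audiotracks": []},
    ],
}
for code, label in describe_langs(sample):
    print(code, label)
```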

/subs

Returns subtitles for a specific media ID and language. By default, subtitles are sent in DFXP format, although they can be retrieved in many other formats.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

lang

str

Yes

Language code (ISO 639-1).

en

format

int

No

Subtitles format.

  • 0 → transLectures-extended DFXP format (default).

  • 1 → Non-extended DFXP format.

  • 2 → SRT format.

  • 3 → WebVTT format.

  • 4 → Plain Text.

2

session_id

int

No

Load subtitle modifications from the given session ID (if any). If format=0, modified segments will include the highlight attribute (h="1").

327

seg_filt_policy

int

No

Segment text filtering policy.

  • -1 → Empty all segments (removes text from all segments).

  • 0 → Filtering disabled.

  • 1 → Filters out special annotations (default).

1

sel_data_policy

int

No

Subtitle contents to be returned.

  • 0 → Return both former and current contents (only for DFXP format).

  • 1 → Return former contents (automatic transcription or translation).

  • 2 → Return current contents (current status of supervision of the subtitles) (default).

2

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 9. [/subs] Get English subtitles in SRT format for media ID "MEDIA-ID-1234" (GET with multiple parameters).
http://ttp.mllp.upv.es/api/subs?id=MEDIA-ID-1234&lang=en&format=2&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

Subtitles file:
  • DFXP file: Content-Type = application/ttml+xml

  • SRT file: Content-Type = application/x-subrip

  • VTT file: Content-Type = text/vtt

  • TXT file: Content-Type = text/plain

JSON Object (if media ID or subtitles file does not exist):
  • Content-Type = application/json

 {
  "rcode" : <int> ,
  "rcode_description" : <str>
 }
rcode:<int>

Return code.

  • 1 → Media ID does not exist or subtitles in the specified language do not exist.

  • 2 → The provided Session ID does not exist.

  • 3 → The provided Session ID does not involve the given Media ID.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Media ID does not exist.
 {
  "rcode" : 1 ,
  "rcode_description" : "Media ID [MEDIA-ID-1234] does not exist"
 }
Returning English subtitles in SRT format.
1
00:08:44,090 --> 00:08:48,680
Good morning, my name is Kit Harington, but people is used to call me John Snow.

2
00:08:48,680 --> 00:08:51,850
Winter is coming, isn't it?
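
Since /subs returns either a subtitles file or a JSON error object, a client has to inspect the Content-Type before using the body. The following minimal Python sketch shows one way to do this. The helper name parse_subs_response is hypothetical, and decoding the body as UTF-8 is an assumption that this section does not confirm.

```python
import json


def parse_subs_response(content_type: str, body: bytes):
    """Classify a /subs response by its Content-Type header (hypothetical helper).

    Returns ("error", rcode, description) for a JSON error object, or
    ("subtitles", text) for any of the subtitle formats.
    """
    # Drop header parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "application/json":
        error = json.loads(body.decode("utf-8"))
        return ("error", error["rcode"], error["rcode_description"])
    # Assumption: subtitle files are UTF-8 encoded text.
    return ("subtitles", body.decode("utf-8"))


error_body = b'{"rcode": 1, "rcode_description": "Media ID [MEDIA-ID-1234] does not exist"}'
print(parse_subs_response("application/json", error_body))
```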

/audiotrack

Sends an audiotrack file in binary format.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

lang

str

Yes

Language code (ISO 639-1).

en

aid

int

Yes

Audiotrack ID (from /metadata).

428

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 10. [/audiotrack] Download audiotrack ID 428 in English language from media ID MEDIA-ID-1234 (GET with multiple parameters).
http://ttp.mllp.upv.es/api/audiotrack?id=MEDIA-ID-1234&lang=en&aid=428&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

Audiotrack file:
  • A Content-Type other than application/json, typically audio/mpeg (mp3) or audio/wav (wav)

JSON Object (if media ID or audiotrack ID does not exist):
  • Content-Type = application/json

 {
  "rcode" : <int> ,
  "rcode_description" : <str>
 }
rcode:<int>

Return code.

  • 1 → Media ID does not exist or audiotrack ID does not exist.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Media ID does not exist.
 {
  "rcode" : 1 ,
  "rcode_description" : "Media ID [MEDIA-ID-1234] does not exist"
 }

/start_session

Starts an edition session to send and commit modifications of a subtitles file. Edition sessions are a mechanism designed to avoid race conditions between different users editing the same subtitles file.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

author_id

str

Yes

Author ID, authorised by the API client, that starts the edition session.

jsnow21

author_name

str

No

Author Name

John Snow

author_conf

int

Yes

Confidence level of the author, from 0 to 100.

90

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 11. [/start_session] Start an edition session for media ID "MEDIA-ID-1234" (GET with multiple parameters).
http://ttp.mllp.upv.es/api/start_session?id=MEDIA-ID-1234&author_id=jsnow21&author_conf=90&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 {
    "rcode": <int>,
    "rcode_description": <str>,
    "session_id": <int>,
    "author_id": <str>,
    "author_name": <str>,
    "author_conf": <int>,
    "author_type": <str>,
    "started_at": <str>,
    "last_update": <str>
 }
rcode:<int>

Return code.

  • 0 → Session successfully started, restarted, or continued. Returns information about the session.

  • 1 → Media ID does not exist.

  • 2 → Media ID has no subtitles.

  • 3 → Subtitles for the Media ID have been locked. Only authors are allowed to modify them.

  • 4 → Could not start the session because another author ID already has a session open for the same Media ID. Returns information about the existing session.

rcode_description:<str>

Description of the return code (rcode).

session_id:<int>

Session ID.

author_id:<str>

Author ID that started the session and is currently editing the Media ID.

author_conf:<int>

Confidence level of the author ID, from 0 to 100.

author_name:<str>

Author Name.

author_type:<str>

Author Type.

  • human → Session ID is being carried by a human.

  • computer → Session ID has been opened by the Ingest Service, in order to automatically regenerate subtitles and audiotracks.

started_at:<str>

Session start timestamp.

last_update:<str>

Timestamp of the last update (/mod call) made by the user on the session.

Output Examples

Session started.
 {
    "rcode": 0,
    "rcode_description": "Session started.",
    "session_id": 8,
    "author_id": "jsnow21",
    "author_name": "John Snow",
    "author_conf": 90,
    "author_type": "human",
    "started_at": "2015-07-04 20:44:55.042786",
    "last_update": "2015-07-04 20:44:55.042786"
 }
Another user is editing the same media ID.
 {
    "rcode": 4,
    "rcode_description": "Cannot start mod session: there exist an open mod session for the given media ID.",
    "session_id": 7,
    "author_id": "olly666",
    "author_name": "Olly",
    "author_conf": 100,
    "author_type": "human",
    "started_at": "2015-07-04 12:23:15.040186",
    "last_update": "2015-07-04 20:32:11.892843"
 }
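
A client will usually want to treat rcode 4 differently from the other failures, because the response then describes the session that is blocking it. The following is a minimal Python sketch of that distinction; the names SessionBusy and session_id_from are hypothetical and not part of TLP.

```python
class SessionBusy(Exception):
    """Raised when another author already holds a session (rcode 4)."""


def session_id_from(response: dict) -> int:
    """Return the session ID of a /start_session response, or raise."""
    rcode = response["rcode"]
    if rcode == 0:
        return response["session_id"]
    if rcode == 4:
        # The response carries information about the existing session.
        raise SessionBusy(
            f"session {response['session_id']} is held by "
            f"{response['author_id']} since {response['started_at']}"
        )
    raise RuntimeError(response.get("rcode_description", f"rcode {rcode}"))


started = {"rcode": 0, "rcode_description": "Session started.", "session_id": 8}
busy = {
    "rcode": 4,
    "session_id": 7,
    "author_id": "olly666",
    "started_at": "2015-07-04 12:23:15.040186",
}

print(session_id_from(started))
try:
    session_id_from(busy)
except SessionBusy as exc:
    print("busy:", exc)
```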

/session_status

Returns the current status of the given session ID. If the session is alive, the call updates its last alive timestamp (last_alive output key). This interface is commonly used to prevent a session from being ended automatically due to user inactivity.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

session_id

int

Yes

Session ID

8

author_id

str

Yes

Author ID, authorised by the API client, owner of the session_id.

jsnow21

author_conf

int

Yes

Confidence level of the author, from 0 to 100.

90

alive

int

No

Alive message type.

  • 0 → Alive and user update (default).

  • 1 → Only alive.

1

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 12. [/session_status] Check status of session ID 8 (GET with multiple parameters).
http://ttp.mllp.upv.es/api/session_status?id=MEDIA-ID-1234&session_id=8&author_id=jsnow21&author_conf=90&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
    "rcode": <int>,
    "rcode_description": <str>,
    "started_at": <str>,
    "last_update": <str>,
    "last_alive": <str>,
    "ended_at": <str>,
    "ended_by_id": <str>,
    "ended_by_name": <str>,
    "ended_by_type": <str>
}
rcode:<int>

Return code.

  • 0 → Session alive.

  • 1 → Session ID does not exist.

  • 2 → The provided Session ID does not involve the given Media ID.

  • 3 → Not allowed to check the status of the given session ID.

  • 4 → Session ID is not open (another user forced its closure, or was automatically closed after several hours of inactivity).

rcode_description:<str>

Description of the return code (rcode).

started_at:<str>

Session start timestamp.

last_update:<str>

Session last update timestamp.

last_alive:<str>

Session last alive timestamp.

ended_at:<str>

Session end timestamp. Will be null unless rcode=4.

ended_by_id:<str>

Author ID of the user that ended this session. Will be null unless rcode=4.

ended_by_name:<str>

Author name of the user that ended this session. Will be null unless rcode=4.

ended_by_type:<str>

Author type of the user that ended this session. Will be null unless rcode=4.

  • human → Session was closed by a human.

  • computer → Session was automatically closed after a long period of inactivity.

Output Examples

Session alive.
{
    "rcode": 0,
    "rcode_description": "Session alive.",
    "started_at": "2015-07-04 20:44:55.042786",
    "last_update": "2015-07-04 21:03:23.017192",
    "last_alive": "2015-07-04 21:03:51.932841",
    "ended_at": null,
    "ended_by_id": null,
    "ended_by_name": null,
    "ended_by_type": null
}
Session closed.
{
    "rcode": 4,
    "rcode_description": "Session ID 8 is closed.",
    "started_at": "2015-07-04 20:44:55.042786",
    "last_update": "2015-07-04 20:44:55.042786",
    "last_alive": "2015-07-04 20:45:03.016531",
    "ended_at": "2015-07-04 22:28:51.075672",
    "ended_by_id": "lord_stark1",
    "ended_by_name": "Eddard Stark",
    "ended_by_type": "human"
}
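
The timestamps in these responses can be compared on the client side, for example to see how long ago the user last saved changes. The following minimal Python sketch assumes the timestamp layout shown in the examples above; the format string and the helper name seconds_between are assumptions, and no time zone handling is attempted.

```python
from datetime import datetime

# Assumption: timestamps always follow the layout used in the examples above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(earlier: str, later: str) -> float:
    """Return the number of seconds elapsed between two timestamps."""
    start = datetime.strptime(earlier, TIMESTAMP_FORMAT)
    end = datetime.strptime(later, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Values copied from the "Session alive" example above.
status = {
    "rcode": 0,
    "last_update": "2015-07-04 21:03:23.017192",
    "last_alive": "2015-07-04 21:03:51.932841",
}
idle = seconds_between(status["last_update"], status["last_alive"])
print(f"{idle:.1f} s between the last update and the last alive message")
```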

/mod

Sends and commits modifications of subtitle files made by a user under a session ID returned by the /start_session interface.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

session_id

int

Yes

Session ID

8

author_id

str

Yes

Author ID, authorised by the API client, owner of the session_id.

jsnow21

author_conf

int

Yes

Confidence level of the author, from 0 to 100.

90

mods

json

Yes

JSON dictionary with one entry per modified subtitle language. Each key is the ISO 639-1 code of an edited subtitle language, and each value is a dictionary containing the following keys:

  • del:<list:int> → List of segment IDs that have been deleted.

  • txt:<list:dict> → List of segment IDs that have been modified. <dict> keys:

    • sI:<int> → Segment ID modified.

    • b:<float> → Segment begin/start time.

    • e:<float> → Segment end time.

    • t:<str> → Text of the specified segment ID.

Note: When using the GET method with multiple GET parameters, this JSON object must be encoded in Base64.

{ "en": {"txt":[ {"sI":16, "b":87.96, "e":91.37, "t":"She told me: You know nothing, John Snow"} ], "del":[3,7]}, "es": {"del":[9], "txt":[]} }

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 13. [/mod] Send modifications of the English and Spanish subtitles (see mods parameters example above) of media ID "MEDIA-ID-1234" under session ID 8 (GET with multiple parameters).
http://ttp.mllp.upv.es/api/mod?id=MEDIA-ID-1234&session_id=8&author_id=jsnow21&author_conf=90&mods=eyAiZW4iOiB7InR4dCI6WyB7InNJIjoxNiwgImIiOjg3Ljk2LCAiZSI6OTEuMzcsICJ0IjoiU2hlIHRvbGQgbWU6IFlvdSBrbm93IG5vdGhpbmcsIEpvaG4gU25vdyJ9IF0sICJkZWwiOlszLDddfSwgImVzIjogeyJkZWwiOls5XSwgInR4dCI6W119IH0=&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b
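
The Base64 value in the URL above is the mods JSON object from the parameter table. The following minimal Python sketch shows how such a value could be produced and checked. The helper names encode_mods and decode_mods are hypothetical, and using the standard Base64 alphabet is an assumption based on the padding character visible in the example.

```python
import base64
import json


def encode_mods(mods: dict) -> str:
    """Serialize the mods dictionary to JSON and encode it in Base64."""
    raw = json.dumps(mods, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_mods(encoded: str) -> dict:
    """Inverse of encode_mods, useful for checking what will be sent."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# Same modifications as in the mods example of the parameter table.
mods = {
    "en": {
        "txt": [{"sI": 16, "b": 87.96, "e": 91.37,
                 "t": "She told me: You know nothing, John Snow"}],
        "del": [3, 7],
    },
    "es": {"del": [9], "txt": []},
}
encoded = encode_mods(mods)
print(encoded)
```

The encoded text can differ from the string in the example because of JSON whitespace, but it decodes to the same object. Standard Base64 output may contain the characters +, / and =, so it should still be URL-escaped when placed in a query string.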

Output

JSON Object: Content-Type = application/json
{
  "rcode" : <int> ,
  "rcode_description" : <str> ,
  "details": [
        {
            "language": <str>,
            "rcode_description": <str>,
            "rcode": <int>
        },
        ...
  ]
}
rcode:<int>

Return code.

  • 0 → Changes successfully stored in the session.

  • 1 → Session ID does not exist.

  • 2 → The provided Session ID does not involve the given Media ID.

  • 3 → Not allowed to send modifications under the given session ID.

  • 4 → mods parameter is an empty dictionary.

  • 5 → Format errors were found in all subtitle language modifications (see details key).

  • 6 → Session ID is not open (another user forced its closure, or it was automatically closed after several hours of inactivity). In addition, format errors were found in some subtitle language modifications (see details key). Changes without errors are not committed to the edition session, but they are backed up.

  • 7 → Session ID is not open (another user forced its closure, or it was automatically closed after several hours of inactivity). Changes are not committed to the edition session, but they are backed up.

  • 8 → Some subtitle language modifications were successfully committed, but others failed (see details key).

rcode_description:<str>

Description of the return code (rcode).

details:<array:dict>

List of dicts, one per modified subtitle language, returned if format errors were found (rcode=5,6,8).

Output Examples

Changes successfully saved.
 {
  "rcode" : 0 ,
  "rcode_description" : "Changes successfully saved."
 }
Session ID is not open.
 {
  "rcode" : 7 ,
  "rcode_description" : "Session ID [ 8 ] is closed, changes have been backuped. Please contact system administrator."
 }
Not all subtitle modifications were saved.
 {
    "rcode_description": "Modifications for some subtitle languages were successfully saved, but other failed. Please see 'details' key.",
    "rcode": 8,
    "details": [
        {
            "rcode_description": "The given media ID does not have 'en' subtitles.",
            "rcode": 10,
            "language": "en"
        },
        {
            "rcode_description": "Changes successfully saved.",
            "rcode": 0,
            "language": "es"
        }
    ]
 }
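
When the details key is present, a client can use it to report exactly which languages were not saved. The following is a minimal Python sketch; the helper name failed_languages is hypothetical, and treating a per-language rcode of 0 as success is inferred from the example above.

```python
def failed_languages(response: dict) -> dict:
    """Map each language whose modifications failed to its error description."""
    return {
        detail["language"]: detail["rcode_description"]
        for detail in response.get("details", [])
        if detail.get("rcode") != 0  # assumption: 0 means saved, as in the example
    }


# Copy of the "Not all subtitle modifications were saved" example above.
response = {
    "rcode": 8,
    "details": [
        {"rcode_description": "The given media ID does not have 'en' subtitles.",
         "rcode": 10, "language": "en"},
        {"rcode_description": "Changes successfully saved.",
         "rcode": 0, "language": "es"},
    ],
}
for lang, reason in failed_languages(response).items():
    print(f"{lang}: {reason}")
```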

/end_session

Ends an open edition session. Depending on the confidence level of the author, edits are either stored directly in the corresponding DFXP files or left pending revision.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

session_id

int

Yes

Session ID

8

author_id

str

Yes

Author ID, authorised by the API client, that closes the session.

jsnow21

author_name

str

No

Author Name

John Snow

author_conf

int

Yes

Confidence level of the author, from 0 to 100.

90

force

int

No

Force end session when user and/or author_id are not the owners.

  • 0 → Do not force end session (default).

  • 1 → Force end session.

1

regenerate

int

No

Request regeneration of subtitles and/or synthesized audiotracks immediately after closing the session.

  • 0 → Do not request regeneration of subtitles and/or synthesized audiotracks.

  • 1 → Request regeneration of subtitles and/or synthesized audiotracks (default).

0

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 14. [/end_session] End the edition session ID 8 for media ID "MEDIA-ID-1234" (GET with multiple parameters).
http://ttp.mllp.upv.es/api/end_session?id=MEDIA-ID-1234&session_id=8&author_id=jsnow21&author_conf=90&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
  "rcode" : <int> ,
  "rcode_description" : <str>
}
rcode:<int>

Return code.

  • 0 → Session successfully closed or already closed before.

  • 1 → Session ID does not exist.

  • 2 → The provided Session ID does not involve the given Media ID.

  • 3 → Session ID is owned by the Ingest Service (re-generation of subtitles and audiotracks), and it cannot be ended manually.

  • 4 → End session request not allowed (the user, author_id and/or author_conf parameters differ from those of the session owner, and the force parameter, if provided, was not sufficient).

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Session closed.
 {
    "rcode": 0,
    "rcode_description": "Session succesfully closed."
 }

/lock_subs

Allows or disallows regular users to send subtitle modifications for a specific Media ID.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

lock

int

Yes

Lock action:

  • 1 → Lock.

  • 0 → Unlock.

1

author_id

str

Yes

Author ID, authorised by the API client, that locks or unlocks the subtitles.

jsnow21

author_conf

int

Yes

Confidence level of the author, from 0 to 100. Must be 100.

100

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 15. [/lock_subs] Disallow modifications for subtitles of media ID "MEDIA-ID-1234" (GET with multiple parameters).
http://ttp.mllp.upv.es/api/lock_subs?id=MEDIA-ID-1234&lock=1&author_id=lord_stark1&author_conf=100&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
  "rcode" : <int> ,
  "rcode_description" : <str>
}
rcode:<int>

Return code.

  • 0 → Subtitles successfully locked/unlocked.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Subtitles successfully locked.
 {
  "rcode" : 0 ,
  "rcode_description" : "Subtitles successfully locked."
 }

/edit_history

Returns a list of all edit sessions carried out on a specific Media ID. The edits of a session can be applied to a subtitles file by calling the /subs interface and passing the corresponding Session ID in the session_id parameter.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 16. [/edit_history] Get list of edit sessions of a Media ID.
http://ttp.mllp.upv.es/api/edit_history?id=MEDIA-ID-1234&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 {
    "rcode_description": <str>,
    "rcode": <int>,
    "edit_history": [
       {
           "session_id": <int>,
           "author_id": <str>,
           "author_conf": <int>,
           "author_name": <str>,
           "timestamp": <str>,
           "requires_revision": <bool>,
           "revised": <bool>,
           "revised_at": <str>,
           "revised_by_id": <str>,
           "revised_by_name": <str>,
           "revised_via": <str>,
           "revised_in_session_id": <int>,
           "edit_stats": <dict>
       },
       ...
    ]
 }
rcode:<int>

Return code.

  • 0 → Edit history available. Returns the history in edit_history.

  • 1 → Media ID does not exist.

edit_history:<array:dict>

List of dictionaries containing information about each edit session. <dict> keys:

  • session_id:<int>: Session ID.

  • author_id:<str>: Author ID owner of the session.

  • author_conf:<int>: Confidence level of the author, from 0 to 100.

  • author_name:<str>: Author name. Can be null.

  • timestamp:<str>: Session end timestamp.

  • requires_revision:<bool>: Indicates if session edits require revision to be approved.

  • revised:<bool>: Declares whether the session has been revised or not (see /mark_revised). Will be null if requires_revision = False.

  • revised_by_id:<str>: Author ID that revised this session's edits. Will be null if revised = False.

  • revised_by_name:<str>: Name of the author that revised this session's edits. Will be null if revised = False.

  • revised_via:<str>: Specifies through which Web Service interface the session was marked as revised. Can be mark_revised, accept or reject. Will be null if revised = False.

  • revised_in_session_id:<int>: ID of the session under which the author revised this session's edits. Can be null.

  • edit_stats:<dict>: Dictionary containing statistics of the session’s edits. Each dictionary key is a language code (ISO-639-1) whose value is another dictionary that contains edit statistics of the corresponding subtitles language. These dictionaries feature the following keys:

    • del_segs:<int>: Number of segments deleted.

    • edit_segs:<int>: Number of segments edited.

    • edit_time:<float>: Total amount of media time in seconds that has been covered in the edit.

    • edit_time_percent:<float>: Percentage of media time that has been covered in the edit with respect to the media length.

Output Examples

Edit history for a particular Media ID is returned.
{
    "rcode_description": "Edit history available.",
    "rcode": 0,
    "edit_history": [
        {
            "session_id": 9,
            "author_id": "lord_stark1",
            "author_conf": 100,
            "author_name": "Eddard Stark",
            "timestamp": "2015-07-05 12:14:15.824233",
            "requires_revision": false,
            "revised": null,
            "revised_at": null,
            "revised_by_id": null,
            "revised_by_name": null,
            "revised_via": null,
            "revised_in_session_id": null,
            "edit_stats": {
                "en": {
                    "del_segs": 2,
                    "edit_segs": 67,
                    "edit_time": 631.97,
                    "edit_time_percent": 100.00
                },
                "es": {
                    "del_segs": 3,
                    "edit_segs": 5,
                    "edit_time": 32.52,
                    "edit_time_percent": 5.12
                }
            }
        },
        {
            "session_id": 8,
            "author_id": "jsnow21",
            "author_conf": 90,
            "author_name": "John Snow",
            "timestamp": "2015-07-04 21:23:07.000031",
            "requires_revision": true,
            "revised": true,
            "revised_at": "2015-07-05 12:14:15.824233",
            "revised_by_id": "lord_stark1",
            "revised_by_name": "Eddard Stark",
            "revised_via": "mark_revised",
            "revised_in_session_id": 9,
            "edit_stats": {
                "en": {
                    "del_segs": 0,
                    "edit_segs": 21,
                    "edit_time": 197.23,
                    "edit_time_percent": 31.74
                }
            }
        },
        {
            "session_id": 7,
            "author_id": "olly666",
            "author_conf": 20,
            "author_name": "Olly",
            "timestamp": "2015-07-02 18:53:59.565720",
            "requires_revision": true,
            "revised": false,
            "revised_at": null,
            "revised_by_id": null,
            "revised_by_name": null,
            "revised_via": null,
            "revised_in_session_id": null,
            "edit_stats": {
                "en": {
                    "del_segs": 4,
                    "edit_segs": 3,
                    "edit_time": 13.22,
                    "edit_time_percent": 2.58
                }
            }
        }
    ]
}
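
As a rough illustration of how a client might consume this response, the sketch below picks out the sessions that still await revision. The function name pending_sessions is hypothetical and not part of TLP; it relies only on the fields shown in the example above.

```python
import json

def pending_sessions(response):
    """Return the IDs of sessions that require revision and are not yet revised."""
    return [
        entry["session_id"]
        for entry in response.get("edit_history", [])
        if entry["requires_revision"] and not entry["revised"]
    ]

# Trimmed-down version of the example response above.
sample = json.loads("""
{
  "rcode": 0,
  "edit_history": [
    {"session_id": 9, "requires_revision": false, "revised": null},
    {"session_id": 8, "requires_revision": true,  "revised": true},
    {"session_id": 7, "requires_revision": true,  "revised": false}
  ]
}
""")

print(pending_sessions(sample))  # [7]
```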

/revisions

Returns a list of all edit sessions pending revision across all of the API user’s media files.

Input

Parameter name Type Required Description Example

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

Example 17. [/revisions] Get list of all pending revisions.
http://ttp.mllp.upv.es/api/revisions?user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
 [
    {
        "media_id": <str>,
        "session_id": <int>,
        "author_id": <str>,
        "author_conf": <int>,
        "author_name": <str>,
        "timestamp": <str>,
        "edit_stats": <dict>
    },
    ...
 ]
media_id:<str>

Media ID.

session_id:<int>

Session ID.

author_id:<str>

ID of the author who owns the session.

author_conf:<int>

Confidence level of the author, from 0 to 100.

author_name:<str>

Author name. Can be null.

timestamp:<str>

Session end timestamp.

edit_stats:<dict>

Dictionary containing statistics of the session’s edits. Each dictionary key is a language code (ISO-639-1) whose value is another dictionary that contains edit statistics of the corresponding subtitles language. These dictionaries feature the following keys:

  • del_segs:<int>: Number of segments deleted.

  • edit_segs:<int>: Number of segments edited.

  • edit_time:<float>: Total amount of media time in seconds that has been covered in the edit.

  • edit_time_percent:<float>: Percentage of media time that has been covered in the edit with respect to the media length.

Output Examples

Pending revisions list returned.
 [
    {
        "media_id": "MEDIA-ID-1234",
        "session_id": 8,
        "author_id": "jsnow21",
        "author_conf": 90,
        "author_name": "John Snow",
        "timestamp": "2015-07-04 21:23:07.000031",
        "edit_stats": {
            "en": {
                "del_segs": 2,
                "edit_segs": 1,
                "edit_time": 3.41,
                "edit_time_percent": 1.97
            }
        }
    }
 ]
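
The following minimal sketch shows one way a client could build the request URL and summarise the returned list. The helper names revisions_url and edited_seconds are hypothetical; only the parameters and fields documented above are assumed.

```python
from urllib.parse import urlencode

def revisions_url(base_url, user, auth_token):
    """Build the GET URL for the /revisions interface."""
    query = urlencode({"user": user, "auth_token": auth_token})
    return f"{base_url}/revisions?{query}"

def edited_seconds(revisions):
    """Total edited media time in seconds per Media ID, summed over all languages."""
    totals = {}
    for rev in revisions:
        seconds = sum(stats["edit_time"] for stats in rev["edit_stats"].values())
        totals[rev["media_id"]] = totals.get(rev["media_id"], 0.0) + seconds
    return totals

# Same data as the output example above (fields not used here are omitted).
sample = [
    {
        "media_id": "MEDIA-ID-1234",
        "session_id": 8,
        "edit_stats": {"en": {"del_segs": 2, "edit_segs": 1,
                              "edit_time": 3.41, "edit_time_percent": 1.97}},
    }
]

print(revisions_url("http://ttp.mllp.upv.es/api", "tluser",
                    "edbab44c3a3f1ca8db4de8277a3b"))
print(edited_seconds(sample))
```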

/mark_revised

Mark or unmark an edit session as revised.

Input

Parameter name Type Required Description Example

id

str

Yes

Media ID.

MEDIA-ID-1234

session_id

int

Yes

Session ID

8

author_id

str

Yes

Author ID, authorised by the API client, who revised the changes made in the Session ID session_id.

lord_stark1

author_name

str

No

Author Name

Eddard Stark

author_conf

int

Yes

Confidence level of the author. Must be 100.

100

revision_session_id

int

No

Session ID under which the given Author ID revised the changes made in the Session ID session_id.

12

unmark

int

No

Perform the inverse operation: remove the revised mark.

  • 0 → Mark session as revised (default).

  • 1 → Remove revised mark from the session.

1

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 18. [/mark_revised] Mark edit session ID 8 for media ID "MEDIA-ID-1234" as revised (GET with multiple parameters).
http://ttp.mllp.upv.es/api/mark_revised?id=MEDIA-ID-1234&session_id=8&author_id=lord_stark1&author_conf=100&revision_session_id=12&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json
{
  "rcode" : <int> ,
  "rcode_description" : <str>
}
rcode:<int>

Return code.

  • 0 → Session ID successfully marked/unmarked as revised.

  • 1 → Media ID does not exist.

  • 2 → Session ID does not exist.

  • 3 → Session ID has not been closed yet.

  • 4 → The provided Session ID does not involve the given Media ID.

  • 5 → Session ID did not require revision.

  • 6 → Session ID was already revised before.

  • 7 → Revision Session ID does not exist.

  • 8 → Revision Session ID does not involve the given Media ID.

  • 9 → Revision Session ID is not owned by the author ID that is requesting this session ID to be marked as revised.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Edition session marked as revised.
 {
    "rcode": 0,
    "rcode_description": "Session ID marked as revised."
 }
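
A possible way to assemble the request from Python is sketched below. The function mark_revised_url is a hypothetical helper, not part of TLP or libtlp; it only encodes the parameters listed in the input table and omits the optional expire parameter.

```python
from urllib.parse import urlencode

def mark_revised_url(base_url, media_id, session_id, author_id, user, auth_token,
                     revision_session_id=None, unmark=False):
    """Build the GET URL for /mark_revised from the documented parameters."""
    params = {
        "id": media_id,
        "session_id": session_id,
        "author_id": author_id,
        "author_conf": 100,  # the interface requires a confidence level of 100
    }
    if revision_session_id is not None:
        params["revision_session_id"] = revision_session_id
    if unmark:
        params["unmark"] = 1  # 1 removes the revised mark; the default (0) sets it
    params["user"] = user
    params["auth_token"] = auth_token
    return f"{base_url}/mark_revised?{urlencode(params)}"

url = mark_revised_url("http://ttp.mllp.upv.es/api", "MEDIA-ID-1234", 8,
                       "lord_stark1", "tluser", "edbab44c3a3f1ca8db4de8277a3b",
                       revision_session_id=12)
print(url)
```

With these arguments the helper reproduces the URL of Example 18.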

/accept

Accept the modifications of one or more pending edit sessions without having to revise them. The modifications are committed to the corresponding subtitle files.

Input

Parameter name Type Required Description Example

id

str

Yes

Comma-separated list of Session IDs whose edits are meant to be accepted by the given Author ID.

27,43,82,121

author_id

str

Yes

Author ID, authorised by the API client, who accepts the changes made in the sessions listed in id.

lord_stark1

author_name

str

No

Author Name

Eddard Stark

author_conf

int

Yes

Confidence level of the author. Must be 100.

100

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 19. [/accept] Accept edits of session IDs 27, 43, 82, 121.
http://ttp.mllp.upv.es/api/accept?id=27,43,82,121&author_id=lord_stark1&author_conf=100&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json

Returns a list of dictionaries, one for each Session ID.

[
  {
    "session_id": <int>,
    "rcode" : <int> ,
    "rcode_description" : <str>
  },
  ...
]
session_id:<int>

Session ID.

rcode:<int>

Return code.

  • 0 → Changes from Session ID successfully accepted.

  • 1 → Session ID does not exist.

  • 2 → Unallowed access to the provided Session ID.

  • 3 → Session ID has not been closed yet.

  • 4 → Session ID did not require revision.

  • 5 → Session ID was already revised before.

  • 6 → Could not save Session ID changes to subtitles file.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Trying to accept edits of session IDs 27, 43, 82, 121.
[
    {
        "rcode_description": "Session ID does not exist.",
        "rcode": 1,
        "session_id": 27
    },
    {
        "rcode_description": "Session ID did not require revision.",
        "rcode": 4,
        "session_id": 43
    },
    {
        "rcode_description": "Session ID already revised.",
        "rcode": 5,
        "session_id": 82
    },
    {
        "rcode_description": "Changes from Session ID successfully accepted.",
        "rcode": 0,
        "session_id": 121
    }
]
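
Because the id parameter carries several Session IDs at once, a client has to join them into a comma-separated string before building the URL. The sketch below shows one way to do so; batch_url is a hypothetical helper name, and passing safe="," merely keeps the commas unescaped so the result looks like the example URL (whether the Web Service also accepts percent-encoded commas is not stated here).

```python
from urllib.parse import urlencode

def batch_url(base_url, interface, session_ids, author_id, user, auth_token):
    """Build a GET URL for an interface that takes a comma-separated ID list."""
    params = {
        "id": ",".join(str(s) for s in session_ids),
        "author_id": author_id,
        "author_conf": 100,  # must be 100 for this interface
        "user": user,
        "auth_token": auth_token,
    }
    # safe="," leaves the commas of the ID list unescaped.
    return f"{base_url}/{interface}?{urlencode(params, safe=',')}"

print(batch_url("http://ttp.mllp.upv.es/api", "accept", [27, 43, 82, 121],
                "lord_stark1", "tluser", "edbab44c3a3f1ca8db4de8277a3b"))
```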

/reject

Reject modifications of one or more pending edit sessions without having to revise them.

Input

Parameter name Type Required Description Example

id

str

Yes

Comma-separated list of Session IDs whose edits are meant to be rejected by the given Author ID.

27,43,82,121

author_id

str

Yes

Author ID, authorised by the API client, who rejects the changes made in the sessions listed in id.

lord_stark1

author_name

str

No

Author Name

Eddard Stark

author_conf

int

Yes

Confidence level of the author. Must be 100.

100

user

str

Yes

TLP Username / API Username.

tluser

auth_token

str

Yes

Authentication token for the provided user.

edbab44c3a3f1ca8db4de8277a3b

expire

int

No

Expiration date UNIX timestamp of the request (seconds since 01-01-1970 in UTC to the expiration date). Required only if auth_token is a request key.

1434132876

Example 20. [/reject] Reject edits of session IDs 27, 43, 82, 121.
http://ttp.mllp.upv.es/api/reject?id=27,43,82,121&author_id=lord_stark1&author_conf=100&user=tluser&auth_token=edbab44c3a3f1ca8db4de8277a3b

Output

JSON Object: Content-Type = application/json

Returns a list of dictionaries, one for each Session ID.

[
  {
    "session_id": <int>,
    "rcode" : <int> ,
    "rcode_description" : <str>
  },
  ...
]
session_id:<int>

Session ID.

rcode:<int>

Return code.

  • 0 → Changes from Session ID successfully rejected.

  • 1 → Session ID does not exist.

  • 2 → Unallowed access to the provided Session ID.

  • 3 → Session ID has not been closed yet.

  • 4 → Session ID did not require revision.

  • 5 → Session ID was already revised before.

rcode_description:<str>

Description of the return code (rcode).

Output Examples

Trying to reject edits of session IDs 27, 43, 82, 121.
[
    {
        "rcode_description": "Session ID does not exist.",
        "rcode": 1,
        "session_id": 27
    },
    {
        "rcode_description": "Session ID did not require revision.",
        "rcode": 4,
        "session_id": 43
    },
    {
        "rcode_description": "Session ID already revised.",
        "rcode": 5,
        "session_id": 82
    },
    {
        "rcode_description": "Changes from Session ID successfully rejected.",
        "rcode": 0,
        "session_id": 121
    }
]
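
Since the response holds one result per Session ID, a client will usually want to separate the sessions that were processed from those that were not. A minimal sketch, assuming only the three documented fields (split_results is a hypothetical name):

```python
def split_results(results):
    """Split per-session results into succeeded IDs and failed ID -> description."""
    succeeded = [r["session_id"] for r in results if r["rcode"] == 0]
    failed = {r["session_id"]: r["rcode_description"]
              for r in results if r["rcode"] != 0}
    return succeeded, failed

# Same data as the output example above.
results = [
    {"rcode_description": "Session ID does not exist.", "rcode": 1, "session_id": 27},
    {"rcode_description": "Session ID did not require revision.", "rcode": 4, "session_id": 43},
    {"rcode_description": "Session ID already revised.", "rcode": 5, "session_id": 82},
    {"rcode_description": "Changes from Session ID successfully rejected.", "rcode": 0, "session_id": 121},
]

succeeded, failed = split_results(results)
print(succeeded)       # [121]
print(sorted(failed))  # [27, 43, 82]
```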

Appendix D: Generating a Request Key

The request key is an alternative authentication method for the TLP Web Service that avoids revealing the API secret key of the client in the requests to the Web Service. This method can be used only with a reduced subset of Web Service interfaces (see Web Service API Specification Preface).

A request key depends on the values of some call parameters, and therefore it has to be explicitly generated for each API call.

The request key token is divided into two parts. The first part, the basic request key, is mandatory for all interfaces, while the second part is needed only when calling the /mod interface. Each part is the SHA-1 sum of a string formed by concatenating several call parameters and the user’s API secret key. The concatenation of both parts forms the full request key.

The first part (the basic request key) is the SHA-1 sum (40 hexadecimal characters) of the concatenation of:

  • id API call parameter

  • expire API call parameter

  • user API call parameter

  • User’s secret key.

 SHA-1 (id + expire + user + secret_key)

The second part is the SHA-1 sum (40 hexadecimal characters) of the concatenation of:

  • author_id API call parameter

  • author_conf API call parameter

  • expire API call parameter

  • user API call parameter

  • User’s secret key.

 SHA-1 (author_id + author_conf + expire + user + secret_key)

The full request key (80 hexadecimal characters) is the concatenation of both parts:

 SHA-1 (id + expire + user + secret_key) + SHA-1 (author_id + author_conf + expire + user + secret_key)

Note that all Web Service interfaces read only the first part of the request key (its first 40 characters, ignoring the rest), except /mod, which requires the full request key.

Example 21. Generation of a request key.

Input parameters:

id = media-1234
expire = 1433875891
user = tluser
author_id = player_user_1234
author_conf = 100
secret_key = mhes28gfj7vfdg7ylnpapom26ksjfyvjmsoe

First part (Basic request key):

SHA-1 (id + expire + user + secret_key)
SHA-1 (media-12341433875891tlusermhes28gfj7vfdg7ylnpapom26ksjfyvjmsoe)
0cce9cd1c2a0f8d4cd1486bc317c24cb137fbf58

Second part:

SHA-1 (author_id + author_conf + expire + user + secret_key)
SHA-1 (player_user_12341001433875891tlusermhes28gfj7vfdg7ylnpapom26ksjfyvjmsoe)
5ff4dec6d50b3aac13f3d0a7de4e3bb111bc107d

Full request key:

0cce9cd1c2a0f8d4cd1486bc317c24cb137fbf585ff4dec6d50b3aac13f3d0a7de4e3bb111bc107d
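
The procedure can be sketched in Python with the standard hashlib module. This is only an illustration under stated assumptions: the function names are hypothetical, the parameters are concatenated as plain strings with no separators (as in the example above), the string is encoded as UTF-8 before hashing, and the digest is rendered as lowercase hexadecimal.

```python
import hashlib

def sha1_hex(*parts):
    """SHA-1 hex digest of the plain concatenation of the given values."""
    text = "".join(str(p) for p in parts)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()  # UTF-8 is an assumption

def basic_request_key(media_id, expire, user, secret_key):
    """First part of the request key, accepted by every interface except /mod."""
    return sha1_hex(media_id, expire, user, secret_key)

def full_request_key(media_id, expire, user, secret_key, author_id, author_conf):
    """Basic request key followed by the second part required by /mod."""
    second = sha1_hex(author_id, author_conf, expire, user, secret_key)
    return basic_request_key(media_id, expire, user, secret_key) + second

key = full_request_key("media-1234", 1433875891, "tluser",
                       "mhes28gfj7vfdg7ylnpapom26ksjfyvjmsoe",
                       "player_user_1234", 100)
print(len(key))  # 80
```

Because expire is part of both hashed strings, the same expire value must also be sent as a call parameter alongside the key.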

Appendix E: Media Package Specification

A Media Package File is an uncompressed ZIP file that contains the media files and attachments plus a JSON file, named manifest.json, which declares in a single JSON object all the media files and attachments included in the Media Package, along with other metadata. All files must be placed in the root of the ZIP package (not inside folders or sub-folders).

Media Package Files (MPF) are uploaded to TLP via the /ingest interface of the Web Service.

Manifest file JSON Specification

{
  "operation_code" : <int>,
  "media" : {
              "url" : <str> ,
              "filename" : <str> ,
              "fileformat" : <str> ,
              "md5" : <str>
            } ,
  "attachments" : [
                    {
                      "filename" : <str> ,
                      "fileformat" : <str> ,
                      "md5" : <str> ,
                      "type_code" : <int> ,
                      "language" : <str> ,
                      "human" : <bool>
                    },
                    ...
                  ] ,
  "metadata" : {
                 "external_id" : <str> ,
                 "language" : <str> ,
                 "title" : <str> ,
                 "topic" : <str> ,
                 "keywords" : <str> ,
                 "date" : <str> ,
                 "speakers" : [
                                {
                                  "speaker_id" : <int> ,
                                  "speaker_name" : <str> ,
                                  "speaker_gender" : <str> ,
                                  "speaker_email" : <str>
                                } ,
                                ...
                              ]

               },
  "requested_langs": <dict> ,
  "transLecture" : <int> ,
  "tL-regenerate": [ <str>, ...] ,
  "tL-force": <int> ,
  "delete_mode" : <str>,
  "test_mode" : <bool>
}
  • operation_code:<int> → Operation type:

    • 0 → New media. A new media will be processed by the Ingest Service and inserted into database.

    • 1 → Update media. An existing media will be updated after processing the input data by the Ingest Service.

    • 2 → Delete media. An existing media will be deleted from the database.

    • 3 → Cancel upload. An ongoing upload will be cancelled.

  • media:<dict> → Main media file to be transcribed and/or translated. <dict> keys:

    • url:<str> → URL to the main media file. If a url field is given the other fields are ignored.

    • filename:<str> → File name of the main media file.

    • fileformat:<str> → Format of the main media file (see "Allowed attachments" below).

    • md5:<str> → MD5 checksum of the main media file.

  • metadata:<dict> → Media metadata. <dict> keys:

    • external_id:<str> → Media ID (typically an internal ID in the client’s media repository database) used to identify the video in further queries to the Web Service, or Upload ID in case of Cancel Upload operation.

    • title:<str> → Title of the media.

    • language:<str> → Media language code in ISO 639-1 format (e.g. "en", "es").

    • speakers:<list:dict> → Information about the speaker(s) of the media. <dict> keys:

      • speaker_id:<int> → Speaker ID (client).

      • speaker_name:<str> → Full name of the speaker.

      • speaker_email:<str> → E-mail of the speaker (optional).

      • speaker_gender:<str> → Gender of the speaker (optional).

        • M → Male.

        • F → Female.

    • topic:<str> → Topic of the media (optional).

    • keywords:<str> → Media keywords (optional).

    • date:<str> → Publication date of the media (optional).

  • attachments:<list:dict> → Additional files that have been attached to the media package, such as slides, related documents or subtitles. <dict> keys:

    • filename:<str> → File name of the attachment.

    • fileformat:<str> → Format of the attachment (see "Allowed attachments" below).

    • md5:<str> → MD5 checksum of the attachment.

    • type_code:<int> → Attachment type code (see "Allowed attachments" below).

      • 0 → Media file.

      • 1 → Slides file.

      • 2 → Related text document file.

      • 3 → Video Snapshot/Thumbnail file.

      • 4 → Subtitles file.

      • 5 → Audiotrack file.

    • language:<str> → Language of the attachment, in case it is a subtitles file, in ISO 639-1 format (e.g. "en", "es") (optional, default null).

    • human:<bool> → If the attachment is a subtitles file, determine if provided subtitles have been generated by humans (optional, default true).

  • requested_langs:<dict> → Explicit request of subtitles and audiotrack languages, along with some advanced options. Please see Requesting Subtitle Languages section (optional).

  • tL-regenerate:<list:str> → On update operation: request a regeneration of transcriptions, translations and/or synthesized audiotracks. Must be a list of keywords (optional). Allowed keywords:

    • tx → Request regeneration of the media transcription.

    • tl → Request regeneration of media translations.

    • tts → Request regeneration of synthesized audiotracks.

  • tL-force:<int> → On update operation: Force regeneration of automatic subtitles even if there exist human-supervised subtitles (optional).

    • 0 → Do not force regeneration of subtitles (default).

    • 1 → Force regeneration of subtitles.

  • transLecture:<int> → Enable/disable transcription and translation technologies (optional).

    • 0 → Disable automatic transcription and translation of the uploaded media. On new operations, an empty subtitles file for the spoken language is generated.

    • 1 → Enable automatic transcription and translation of the uploaded media (default).

  • delete_mode:<str> → On delete operation: Specify delete mode (optional).

    • soft → The media will be marked as "deleted" in the database and its files will be backed up.

    • hard → All media and subtitle files stored will be deleted, as well as all related entries in the Database (default).

  • test_mode:<bool> → Enable/Disable the Test Mode of the Ingest Service. When enabled, the uploaded media will be available immediately with an empty subtitles file. This feature is very useful when executing integration tests with the /ingest interface. By default it is disabled (optional).
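
To make the packaging rules concrete, here is a minimal sketch that builds a Media Package File with Python's standard library. The function names md5_of and build_media_package are hypothetical, and the sketch handles only a main media file with no attachments; it is not the packaging code used by the TLP client tools.

```python
import hashlib
import json
import os
import zipfile

def md5_of(path):
    """MD5 checksum of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_media_package(package_path, media_path, metadata, operation_code=0):
    """Write an uncompressed ZIP holding the media file and manifest.json."""
    name = os.path.basename(media_path)
    manifest = {
        "operation_code": operation_code,
        "media": {
            "filename": name,
            "fileformat": os.path.splitext(name)[1].lstrip(".").lower(),
            "md5": md5_of(media_path),
        },
        "metadata": metadata,
    }
    # ZIP_STORED means no compression; a bare arcname keeps files in the ZIP root.
    with zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.write(media_path, arcname=name)
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest
```

A call such as build_media_package("package.zip", "main_media.mp4", {...}) would then produce a file suitable for upload through the /ingest interface, provided the metadata dictionary carries the keys required by the chosen operation.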

Requesting Subtitle Languages

The requested_langs option, as stated before, is used to request additional subtitle languages, specifying advanced transcription, translation or text-to-speech options. requested_langs is a JSON dictionary in which keys are ISO 639-1 language codes (e.g. "en", "es"), and values are dictionaries in which keys are the objects or outputs requested to be generated for that particular language, and values are dictionaries with advanced options. Those keys (objects) might be:

  • sub:<dict> → Generate subtitles for that specific language.

  • tts:<dict> → Generate text-to-speech audiotracks for that specific language.

Note that sub and tts values can be empty dictionaries ({}) or null if no advanced options are specified.
 "requested_langs": {
    "es": {
       "sub": {}
     },
    "en": {
       "tts": {},
       "sub": {}
     }
 }

The example above means: "Generate Spanish and English subtitles and an English TTS audiotrack, using default options in all cases".

Advanced options:

  • sub options:

    • sid:<int> → Specify which System will be applied to generate the transcription, translation, or audiotrack file. If not specified, the default system is used.

    • lma:<bool> → Only for transcriptions (ASR): Enable or disable Language Model Adaptation. By default it is enabled.

    • tlpath:<list> → Only for translations (MT): Explicitly declare a Translation Path. This is useful to generate translations into a language for which no direct translation from the spoken language is available, by going through intermediate languages. It is declared as an ordered list of dictionaries, where each dictionary specifies the target language code l of the step and, optionally, the System ID sid to apply.

      The example below shows how to request Catalan (ca) subtitles from the spoken language (XX) using English (en) and Spanish (es) as intermediate languages, thus defining the following translation path: XX->En->Es->Ca. The intermediate En->Es translation is generated using System ID 3.

       "requested_langs": {
          "ca": {
             "sub": {
                      "tlpath": [
                         { "l":"en" },
                         { "l":"es", "sid":3},
                         { "l":"ca"}
                       ]
              }
           }
       }

      If this option is not specified, the Ingest Service will assume that a direct translation from the spoken language is requested.

  • tts options:

    • sid:<int> → Specify which System ID will be applied to generate the synthesized audiotrack. If not specified, the default system is used.

The following example of the requested_langs option requests Estonian (et) subtitles disabling the Language Model Adaptation feature and making use of the System ID 22, as well as English (en) subtitles with default options and a synthesized English audiotrack using System ID 54.

 "requested_langs": {
    "et": {
       "sub": { "lma":false, "sid":22 }
     },
    "en": {
       "sub": { },
       "tts": { "sid":54 }
     }
 }
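
When the manifest is assembled in Python, the same structure can be written as an ordinary dictionary and serialised with the json module, which emits the JSON literal false for Python's False. The tlpath helper below is a hypothetical convenience function, shown only to illustrate the shape of a translation path.

```python
import json

def tlpath(*steps):
    """Build a tlpath list from language codes or (language, system_id) pairs."""
    path = []
    for step in steps:
        if isinstance(step, tuple):
            lang, sid = step
            path.append({"l": lang, "sid": sid})
        else:
            path.append({"l": step})
    return path

requested_langs = {
    "et": {"sub": {"lma": False, "sid": 22}},
    "en": {"sub": {}, "tts": {"sid": 54}},
    # Catalan via English and Spanish, using System ID 3 for the Spanish step.
    "ca": {"sub": {"tlpath": tlpath("en", ("es", 3), "ca")}},
}

print(json.dumps({"requested_langs": requested_langs}, indent=2))
```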

Ingest Service Behavior

This section explains how the Ingest Service will behave in different manners depending on both the data declared in the manifest file and the requested operation type (new, update, delete or cancel).

New Media operation

Required inputs:

  • A media file to be transcribed and/or translated in the media section.

  • Required metadata keys:

    • external_id → Media ID in the remote (client) repository.

    • title → Title of the media. It must be as descriptive as possible, since it may be used to search for and download related documents from the Internet in order to adapt the ASR system to the topic of the video and improve the quality of the automatic subtitles.

    • language → Spoken language of the media file, in ISO 639-1 format (e.g. "en", "es").

    • speakers → Speaker(s) info.

Behavior changes with optional inputs

  • requested_langs option:

    • Not provided: By default, the media will only be transcribed, if possible and if needed (depending on the provided attachments). No translation or TTS processes are launched.

    • Provided: Actions that can be inferred from this option will be executed, unless the expected outputs are already provided in the attachments section.

  • attachments section:

    • Language Model Adaptation on transcription depending on provided textual attachments: If Language Model Adaptation is enabled for the ASR System that generates the transcription file, this ASR system will be adapted to the topic of the media using different textual resources:

      • No text attachments provided: The adaptation will be carried out using external resources automatically downloaded from the Internet based on a web search using the title of the media.

      • A slides file: The adaptation will be carried out using the text extracted from the slides file.

      • Related documents: The adaptation will be carried out using the text extracted from the attached documents.

      • A slides file + Related documents: The adaptation will be carried out using both the text extracted from the slides file and the text extracted from the attached documents.

    • Providing expected outputs:

      • Subtitles file (spoken language): Media won’t be transcribed. Subtitles will be translated and afterwards synthesized if explicitly requested in the requested_langs option.

      • Subtitles files (other languages): Media will first be transcribed and then translated into other destination languages, except for the provided subtitles language, if explicitly requested in the requested_langs option. Synthesized audiotracks will also be generated if explicitly requested.

      • Subtitles files (spoken language + other languages): The media will be translated into the remaining available destination languages, if any and if explicitly requested in the requested_langs option. Synthesized audiotracks will be generated if explicitly requested.

      • Audiotrack files: Synthesized audiotracks for the languages of the attached audiotracks won’t be generated even if they were explicitly requested.
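
The rules above can be read as a small decision procedure. The sketch below encodes two of them, purely as an interpretation of this section and not as the Ingest Service's actual code: which text resources feed Language Model Adaptation, and whether the media needs to be transcribed at all. The function names are hypothetical, and options such as transLecture and test_mode are ignored.

```python
SLIDES, DOCUMENT, SUBTITLES = 1, 2, 4  # type codes from the manifest specification

def adaptation_sources(attachments):
    """Text resources used for Language Model Adaptation, per the rules above."""
    codes = {a["type_code"] for a in attachments}
    sources = []
    if SLIDES in codes:
        sources.append("slides")
    if DOCUMENT in codes:
        sources.append("documents")
    # With no textual attachments, resources are searched on the web by title.
    return sources or ["web search on title"]

def needs_transcription(spoken_language, attachments):
    """Media is transcribed unless subtitles in the spoken language are attached."""
    return not any(
        a["type_code"] == SUBTITLES and a.get("language") == spoken_language
        for a in attachments
    )

attachments = [{"type_code": 1}, {"type_code": 4, "language": "es"}]
print(adaptation_sources(attachments))         # ['slides']
print(needs_transcription("en", attachments))  # True
```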

Update Media operation

Required inputs:

  • Required metadata keys:

    • external_id → Media ID in the remote (client) repository.

Behavior changes with optional inputs

  • media section:

    • Not Provided: the Ingest Service will re-generate transcriptions, translations and/or audiotracks depending on the provided attachments and other options (see below).

    • Provided: the uploaded media is assumed to be a re-recording of the existing media; therefore a new transcription file needs to be generated. The Ingest Service will behave as described in the New Media operation; the only difference is that the Media ID is kept. The old media file and subtitles are backed up.

  • requested_langs option:

    • Not provided: By default, the media will only be transcribed, if possible and if needed (depending on the provided attachments). No translation or TTS processes are launched.

    • Provided: Actions that can be inferred from this option will be executed, unless the expected outputs are already provided in the attachments section. Actions that involve the re-generation of subtitles that have already been edited or supervised by users won’t be executed, unless the tL-force option is provided and set to 1.

  • tL-regenerate option:

    • tx: The transcription file will be automatically regenerated only if it has not been supervised by a user; setting the tL-force option to 1 regenerates it regardless.

    • tl: Translation files will be automatically regenerated only if they have not been supervised by a user; setting the tL-force option to 1 regenerates them regardless.

    • tts: Synthesized audiotracks will be automatically regenerated.

  • attachments section:

    • Language Model Adaptation on transcription regeneration depending on provided textual attachments: If Language Model Adaptation is enabled for the ASR System that generates the transcription file, this ASR system will be adapted to the topic of the media using different textual resources:

      • No text attachments provided: The adaptation will be carried out using external resources automatically downloaded from the Internet based on a web search using the title of the media.

      • A slides file: The adaptation will be carried out using the text extracted from the slides file.

      • Related documents: The adaptation will be carried out using the text extracted from the attached documents.

      • A slides file + Related documents: The adaptation will be carried out using both the text extracted from the slides file and the text extracted from the attached documents.

    • Providing outputs:

      • Subtitles file (spoken language): Current transcription file will be overwritten by the supplied one. Subtitles will be translated and afterwards synthesized if explicitly requested in the requested_langs option, or the tL-regenerate option contains the token tl for translation and tts for synthesis.

      • Subtitles files (other languages): Current translation files (if existing) will be overwritten by the supplied ones. Other subtitles languages will be generated if explicitly requested in the requested_langs option, or the tL-regenerate option contains the token tl. Synthesized audiotracks will be also generated if explicitly requested either with the requested_langs option or the tL-regenerate option.

      • Audiotrack files: Synthesized audiotracks for the languages of the attached audiotracks won’t be generated even if they were explicitly requested.
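
The interplay between tL-regenerate and tL-force described above can be summarised in a few lines. This is a sketch of one reading of the documented rules, with a hypothetical function name; it does not model the requested_langs option or attached files.

```python
def will_regenerate(keyword, regenerate, supervised, force=0):
    """
    Decide whether an output is regenerated on an update operation.

    keyword:    "tx", "tl" or "tts"
    regenerate: the tL-regenerate list from the manifest
    supervised: True if a user has already supervised the existing subtitles
    force:      value of the tL-force option (0 or 1)
    """
    if keyword not in regenerate:
        return False
    if keyword == "tts":
        return True  # audiotracks are regenerated whenever requested
    # Supervised subtitles are protected unless regeneration is forced.
    return force == 1 or not supervised

print(will_regenerate("tl", ["tl", "tts"], supervised=True))           # False
print(will_regenerate("tl", ["tl", "tts"], supervised=True, force=1))  # True
```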

Delete Media operation

Required inputs:

  • Required metadata keys:

    • external_id:<str> → Media ID in the remote (client) repository.

Behavior changes with optional inputs

  • delete_mode option:

    • soft: media will be marked as "deleted" in the database and its files will be backed up.

    • hard: all media and subtitle files stored will be deleted, as well as all related entries in the Database.

Cancel Upload operation

Required inputs:

  • Required metadata keys:

    • external_id:<str> → Upload ID to be canceled.
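
Before uploading a package, a client may want to check that the manifest carries the inputs each operation requires. The following sketch covers only the required inputs listed in this section; missing_inputs and REQUIRED_METADATA are hypothetical names, and the Ingest Service may apply further checks that are not modelled here.

```python
# Required metadata keys per operation_code, as listed in this section.
REQUIRED_METADATA = {
    0: ("external_id", "title", "language", "speakers"),  # new media
    1: ("external_id",),                                   # update media
    2: ("external_id",),                                   # delete media
    3: ("external_id",),                                   # cancel upload (Upload ID)
}

def missing_inputs(manifest):
    """Return the required manifest entries that are absent."""
    op = manifest.get("operation_code")
    if op not in REQUIRED_METADATA:
        return [f"unknown operation_code: {op!r}"]
    metadata = manifest.get("metadata", {})
    missing = [f"metadata.{key}" for key in REQUIRED_METADATA[op] if key not in metadata]
    if op == 0 and "media" not in manifest:
        missing.insert(0, "media")  # a new media operation needs a media section
    return missing

print(missing_inputs({"operation_code": 2,
                      "metadata": {"external_id": "9abc7230fe36a18b885c"}}))  # []
```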

Allowed attachments

Table 1. Allowed attachments by File Type

Type Code ID   Type Code Name        Allowed File Format List
0              Media (video) [1]     mp4, m4v, ogv, wmv, avi, mpg, flv
0              Media (audio) [2]     wav, mp3, oga, flac, aac
1              Slides (text) [3]     txt, ppt, pptx, doc, docx, pdf
1              Slides (video) [4]    mp4, m4v, ogv, wmv, avi, mpg, flv
2              Documents [3]         txt, doc, docx, ppt, pptx, pdf
3              Video Thumbnail       jpg
4              Subtitles [5]         dfxp, srt, trs
5              Audiotracks           wav, mp3, oga, flac, aac

[1] The TLP Player requires mp4 and ogv versions of the uploaded media in order to maximise compatibility with different browsers. Thus, the uploaded media will be converted in a final step into mp4 and ogv, if necessary.
[2] In the case of audio files, mp4 and ogv videos will be generated as well from the audio signal.
[3] Sorted by descending text extraction quality.
[4] For the best quality, a video slides file must be a video in which only the slides appear. Text from the slides is obtained using an OCR system.
[5] Non-DFXP subtitle formats are always converted into DFXP format.
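
A client can use the table to reject unsupported files before building a package. The lookup below is a minimal sketch: the names ALLOWED_FORMATS and is_allowed are hypothetical, and treating file formats case-insensitively is an assumption that this specification does not confirm.

```python
VIDEO = {"mp4", "m4v", "ogv", "wmv", "avi", "mpg", "flv"}
AUDIO = {"wav", "mp3", "oga", "flac", "aac"}
TEXT = {"txt", "ppt", "pptx", "doc", "docx", "pdf"}

# Allowed file formats per attachment type code, taken from Table 1.
ALLOWED_FORMATS = {
    0: VIDEO | AUDIO,           # media (video or audio)
    1: TEXT | VIDEO,            # slides (text or video)
    2: TEXT,                    # related documents
    3: {"jpg"},                 # video thumbnail
    4: {"dfxp", "srt", "trs"},  # subtitles
    5: AUDIO,                   # audiotracks
}

def is_allowed(type_code, fileformat):
    """True if the file format is listed for the given type code."""
    # Lower-casing is an assumption; the table lists formats in lowercase only.
    return fileformat.lower() in ALLOWED_FORMATS.get(type_code, set())

print(is_allowed(1, "wmv"))  # True
print(is_allowed(4, "vtt"))  # False
```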

Manifest JSON Examples

New Media operation: Request English and Spanish subtitles, and a Spanish synthesized audiotrack. A video slides file is attached to improve transcription quality.
 {
    "operation_code": 0,

    "media": {
        "fileformat": "mp4",
        "md5": "05b59346bc3fe5d3eac7a0dcd0022fb6",
        "filename": "main_media.mp4"
    },

    "attachments": [
        { "filename":"awesome_slides_in_video_format.wmv",
           "fileformat":"wmv",
           "type_code":1,
           "md5":"c8722d0e8e27d4b5caaa7122a14676e3"
        }
    ],

    "requested_langs": {
        "es": {
            "sub": {},
            "tts": {}
        },
        "en": {
            "sub": {}
        }
    },

    "metadata": {
        "external_id": "9abc7230fe36a18b885c",
        "language": "en",
        "title": "To know something or not to know nothing: A brief essay about knowledge.",
        "speakers": [
            {
                "speaker_email": "kit@got.com",
                "speaker_name": "Kit Harington",
                "speaker_gender": "M",
                "speaker_id": "kit1234"
            }
        ]
    }
 }
Update media operation: Request re-generation of translations and synthesized audiotracks.
 {
  "operation_code": 1,
  "metadata": { "external_id":"9abc7230fe36a18b885c" },
  "tL-regenerate": [ "tl", "tts" ]
 }
Update media operation: Re-recording of the same media ID, with a new media file attached. English subtitles will be generated by default.
 {
  "operation_code": 1,

  "media": { "filename": "main_media_NEW_VERSION.mp4" ,
              "fileformat": "mp4" ,
              "md5": "a86fae8e7af6fd6786efa876fa0e4212"
            },

  "metadata":
     {
       "external_id": "9abc7230fe36a18b885c",
       "language": "en",
       "title": "To know something or not to know nothing: A brief essay about knowledge. (UPDATED)",
        "speakers": [
            {
                "speaker_email": "kit@got.com",
                "speaker_name": "Kit Harington",
                "speaker_gender": "M",
                "speaker_id": "kit1234"
            }
        ]
     }
 }
Delete media operation.
 {
  "operation_code": 2,
  "metadata": { "external_id":"9abc7230fe36a18b885c" }
 }
Cancel upload operation.
 {
  "operation_code": 3,
  "metadata": { "external_id":"up-7249dec8-2b38-4413-b182-3481675c550c" }
 }
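
The manifests above can also be generated programmatically. The following is a minimal sketch, assuming the field names shown in the New Media example. The helper names file_md5 and build_new_media_manifest are hypothetical and not part of TLP, and the sketch neither packages nor uploads anything.

```python
import hashlib
import json
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_new_media_manifest(media_path, external_id, language, title, sub_langs):
    """Build a 'New media' manifest (operation_code 0) as a Python dict."""
    filename = os.path.basename(media_path)
    return {
        "operation_code": 0,
        "media": {
            # File format is taken from the extension (assumption).
            "fileformat": os.path.splitext(filename)[1].lstrip(".").lower(),
            "md5": file_md5(media_path),
            "filename": filename,
        },
        # Request subtitles only, one entry per language.
        "requested_langs": {lang: {"sub": {}} for lang in sub_langs},
        "metadata": {
            "external_id": external_id,
            "language": language,
            "title": title,
        },
    }


# Demonstration with a throw-away file standing in for a real video.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "main_media.mp4")
    with open(demo_path, "wb") as demo:
        demo.write(b"not a real video")
    manifest = build_new_media_manifest(
        demo_path, "9abc7230fe36a18b885c", "en", "Demo title", ["en", "es"]
    )
    print(json.dumps(manifest, indent=4))
```

Computing the MD5 digest in chunks keeps memory usage constant for large media files.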

Appendix F: Python Commandline Utilities

TLP offers two Python command-line scripts for interacting with the TLP Server: ws-client.py and player-url-generator.py. Both use the libtlp Python module, and both need a configuration file.

In the client tools package, these three files are located at:

 client-tools/python/scr/ws-client.py
 client-tools/python/scr/player-url-generator.py
 client-tools/python/scr/config.ini

Configuration File

Both scripts require a properly set up configuration file (config.ini) in order to work. Each parameter of the configuration file is detailed below.

[general]
web_service_url = <str>

Base URL location of the TLP Web Service.

player_url = <str>

Base URL location of the TLP Player.

[http_auth]
enabled = <bool>

Enable or disable HTTP Authentication.

username = <str>

HTTP Auth username.

password = <str>

HTTP Auth Password.

[api_client_auth]
username = <str>

TLP username / API username.

secret_key = <str>

API Secret Key Authentication Token.

url_lifetime = <int>

Time slot, starting from the generation of the URL, in which the user will be allowed to access the Player (expire input parameter).

[player_user_info]
user_id = <user_id>

Client-side user ID of the user who will edit the subtitles (author_id input parameter).

user_full_name = <user_full_name>

Client-side user name of the user who will edit the subtitles (author_name input parameter). (optional)

user_confidence = <user_confidence>

Confidence level of the above user (author_conf input parameter).

Example of WSClient Config file
 [general]
 web_service_url = http://ttp.mllp.upv.es/api
 player_url = http://ttp.mllp.upv.es/player

 [http_auth]
 enabled = no
 username =
 password =

 [api_client_auth]
 username = tluser
 secret_key = akjsfd982323098qwjs209823id09321io3290d
 request_key_expire_lifetime = 1440

 [player_user_info]
 user_id = jsnow21
 user_full_name = John Snow
 user_confidence = 100

By default, both scripts attempt to load a configuration file named config.ini from the directory where the script is located. You can provide an alternative configuration file location with the --config-file option. The --print-sample-config-file option prints a sample configuration file to the standard output.
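
The configuration layout described above follows the INI conventions understood by the configparser module of the Python standard library. The following is a minimal sketch of how such a file could be read. The function name load_settings is hypothetical, the lifetime parameter is left out, and the way libtlp actually loads its configuration may differ.

```python
import configparser

SAMPLE_CONFIG = """\
[general]
web_service_url = http://ttp.mllp.upv.es/api
player_url = http://ttp.mllp.upv.es/player

[http_auth]
enabled = no
username =
password =

[api_client_auth]
username = tluser
secret_key = akjsfd982323098qwjs209823id09321io3290d

[player_user_info]
user_id = jsnow21
user_full_name = John Snow
user_confidence = 100
"""


def load_settings(text):
    """Parse the INI text and return the values as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = {
        "web_service_url": parser.get("general", "web_service_url"),
        "player_url": parser.get("general", "player_url"),
        "api_user": parser.get("api_client_auth", "username"),
        "secret_key": parser.get("api_client_auth", "secret_key"),
        "user_id": parser.get("player_user_info", "user_id"),
        "user_confidence": parser.getint("player_user_info", "user_confidence"),
        "http_auth": None,
    }
    # getboolean() accepts yes/no, true/false, on/off and 1/0.
    if parser.getboolean("http_auth", "enabled"):
        settings["http_auth"] = (
            parser.get("http_auth", "username"),
            parser.get("http_auth", "password"),
        )
    return settings


settings = load_settings(SAMPLE_CONFIG)
print(settings["web_service_url"], settings["http_auth"])
```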

ws-client.py

ws-client.py is a Python command-line utility that interacts with the TLP Web Service API. It can query all Web Service interfaces. The configuration file must be properly set up for the script to work.

Software requirements
ws-client.py help:
usage: ws-client.py [-h] [-d] [-g] [-D] [-c <file>] [-C] [-I <user_tuple>]
                    [-B <username>] [-f <dest>]
                    {systems,uploadslist,ingest,status,langs,metadata,subs,audiotrack,mod}
                    ...

ws-client.py: TLP Web Service API client tool.

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           Print debug information
  -g, --use-get-query   Use GET HTTP queries instead of POST when possible
  -D, --use-data-param  Use a single base64-encoded JSON 'data' GET parameter
                        instead of multiple GET parameters
  -c <file>, --config-file <file>
                        Config file. Default: config.ini
  -C, --print-sample-config-file
                        Print sample config file and exit
  -I <user_tuple>, --api-client-auth <user_tuple>
                        API client user name and authentication token, in the
                        following format: USERNAME:AUTH_TOKEN. Default: from
                        config file.
  -B <username>, --su <username>
                        su (substitute user) option: specify username. Only
                        for admin users.
  -f <dest>, --store-output-file <dest>
                        Store the Web Service response in a file.

Web Service Interfaces:
  Use a subcommand to call the corresponding Web Service interface

  {systems,uploadslist,ingest,status,langs,metadata,subs,audiotrack,mod}
    systems             Get a list of all available ASR/MT/TTS Systems that
                        can be applied to transcribe/translate/synthesize a
                        media file.
    uploadslist         Get a list of all user's uploads.
    ingest              Upload media (audio/video) files and many other
                        attachments along with other metadata bundled into a
                        Media Package File (MPF) to the TLP Server.
    status              Check the current status of an upload.
    langs               Get a list of subtitle and audiotrack languages
                        available for a given media ID.
    metadata            Get metadata and media file locations for a given
                        media ID.
    subs                Download subtitles for a given media ID and language.
    audiotrack          Download audiotrack file for a given media ID and
                        language.
    mod                 Send and commit subtitle corrections made by a user.

The ws-client.py tool includes several subcommands, one for each Web Service interface.

ws-client.py ingest

ingest help:
usage: ws-client.py ingest [-h] [-F MPF] [-D <dir>] [-n] [-l <language>]
                           [-t <title>] [-S <speaker_tuple>] [-m <email>]
                           [-k <keywords>] [-i <topic>] [-M <file/URL>]
                           [-A <file/URL>] [-s <file>] [-r <file>]
                           [-b <sub_tuple>] [-K <track_tuple>] [-a <file>]
                           [-o {0,1,2,3}] [-p <lang>] [-P <lang>] [-L <JSON>]
                           [-R {tx,tl,tts}] [-f {0,1}] [-T {0,1}]
                           [-x {soft,hard}]
                           object_id

positional arguments:
  object_id             Object ID (Media ID for New, Update and Delete
                        operations; Upload ID for Cancel operation).

optional arguments:
  -h, --help            show this help message and exit
  -F MPF, --media-package-file MPF
                        Media Package File. Ingest existing media package file
                        and exit.
  -D <dir>, --data-dir <dir>
                        Directory to store media package. By default, it is
                        stored in a temp dir
  -n, --no-ingest       Create media package only, do not ingest
  -X, --test-mode       Use Ingest Service test mode

Metadata options:
  -l <language>, --language <language>
                        Media Language (ISO 639-1 code)
  -t <title>, --title <title>
                        Title of the media
  -S <speaker_tuple>, --speaker-info <speaker_tuple>
                        Speaker Info, in the following format:
                        'SPEAKER_ID:[FULL_NAME]:[GENDER]:[EMAIL]', where
                        GENDER={M,F}. Example: 'id1234:Pepita
                        Greus::pgreus@mymail.com'
  -m <email>, --mail <email>
                        List of e-mails separated by commas
  -k <keywords>, --keywords <keywords>
                        Media keywords
  -i <topic>, --topic <topic>
                        Topic of the media

Media file options:
  -M <file/URL>, --media-file <file/URL>
                        Main media file
  -A <file/URL>, --extra-media-file <file/URL>
                        Additional media files (main media file encoded in
                        other formats)
  -s <file>, --slides-file <file>
                        Slides file
  -r <file>, --docs-file <file>
                        Document file
  -b <sub_tuple>, --subtitle-file <sub_tuple>
                        Subtitle files, in the following format:
                        'LANG:FILE:[HUMAN]', where HUMAN={0,1 (def)}. Example:
                        'es:sub_es.srt:'
  -K <track_tuple>, --audiotrack-file <track_tuple>
                        Audiotrack files, in the following format:
                        'LANG:FILE:[HUMAN]', where HUMAN={0,1 (def)}. Example:
                        'es:es.tts.mp3:0'
  -a <file>, --thumbnail-file <file>
                        Video thumbnail file

Workflow options:
  -o {0,1,2,3}, --operation-code {0,1,2,3}
                        Operation Code. 0 -> New (def), 1 -> Update, 2 ->
                        Delete, 3 -> Cancel Upload
  -p <lang>, --requested-languages-subs <lang>
                        Request subtitle languages, in the following format:
                        'LANG[:SYS_ID]'. Example: '-p es -p en:17'
  -P <lang>, --requested-languages-tts <lang>
                        Request TTS languages, in the following format:
                        'LANG[:SYS_ID]'. Example: '-P en:23'
  -L <JSON>, --requested-languages-json <JSON>
                        "requested_langs" JSON string. Ignores -p and -P
                        options. Example: '{ "es":{"sub":{}},
                        "en":{"sub":{"sid":3}} }'
  -R {tx,tl,tts}, --tL-regenerate-opt {tx,tl,tts}
                        Set 'tL-regenerate' option: request re-generation of
                        automatic subtitles and/or TTS tracks. Example: '-R tx
                        -R tl' requests regeneration of transcription and
                        translations
  -f {0,1}, --tL-force-opt {0,1}
                        Set 'tL-force' option: overwrite any pre-existing
                        human-supervised subtitles. 0 -> Disabled, 1 -> Force
                        re-translation of supervised translations
  -T {0,1}, --transLecture-opt {0,1}
                        Set 'transLecture' option: enables or disables
                        automatic transcription and translation of the
                        ingested media. 0 -> Disabled, 1 -> Enabled (def)
  -x {soft,hard}, --delete-mode {soft,hard}
                        Delete mode
Example 22. ws-client.py ingest call: Create and upload a Media Package.
--$ ./ws-client.py ingest \
     -M media.mp4 \
     -t "Introduction to Machine Learning" \
     -l en \
     -s "slides.pptx" \
     -r "related-document.pdf" \
     -r "lecture-notes.doc" \
     "MEDIA-ID-1234"

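When ingest calls have to be issued from another Python program, the argument list can be assembled before handing it to the subprocess module. The following is a minimal sketch that uses only options listed in the ingest help above. The function name build_ingest_command and its keyword arguments are hypothetical.

```python
import shlex


def build_ingest_command(object_id, media_file, title=None, language=None,
                         slides=None, docs=(), sub_langs=(),
                         script="./ws-client.py"):
    """Return the argument list for a 'ws-client.py ingest' call."""
    command = [script, "ingest", "-M", media_file]
    if title:
        command += ["-t", title]
    if language:
        command += ["-l", language]
    if slides:
        command += ["-s", slides]
    for doc in docs:          # -r may be repeated, once per document
        command += ["-r", doc]
    for lang in sub_langs:    # -p may be repeated, once per language
        command += ["-p", lang]
    command.append(object_id)  # positional argument goes last
    return command


command = build_ingest_command(
    "MEDIA-ID-1234", "media.mp4",
    title="Introduction to Machine Learning", language="en",
    slides="slides.pptx",
    docs=["related-document.pdf", "lecture-notes.doc"],
    sub_langs=["es", "en"],
)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with titles that contain spaces or quotes.
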
ws-client.py uploadslist

uploadslist help:
usage: ws-client.py uploadslist [-h] [-o OBJECT_ID]

optional arguments:
  -h, --help            show this help message and exit
  -o OBJECT_ID, --object_id OBJECT_ID
                        Get list of uploads involving the given Object ID.
Example 23. ws-client.py uploadslist call: Get list of uploads.
--$ ./ws-client.py uploadslist

ws-client.py status

status help:
usage: ws-client.py status [-h] upload_id

positional arguments:
  upload_id   Upload ID

optional arguments:
  -h, --help  show this help message and exit
Example 24. ws-client.py status call: Check upload status.
--$ ./ws-client.py status "UPLOAD-ID-1234"

ws-client.py systems

systems help:
usage: ws-client.py systems [-h]

optional arguments:
  -h, --help  show this help message and exit
Example 25. ws-client.py systems call: Get list of available ASR/MT/TTS systems.
--$ ./ws-client.py systems

ws-client.py langs

langs help:
usage: ws-client.py langs [-h] [-k] [-K REQUEST_KEY_LIFETIME] [-H HASH_ID]
                          media_id

positional arguments:
  media_id              Media ID

optional arguments:
  -h, --help            show this help message and exit
  -k, --use-request-key
                        Use request-key as auth_token instead of secret_key.
  -K REQUEST_KEY_LIFETIME, --request-key-lifetime REQUEST_KEY_LIFETIME
                        Set request key lifetime in minutes. Default: from
                        config file.
  -H HASH_ID, --hash-id HASH_ID
                        Media Hash ID
Example 26. ws-client.py langs call: Get available subtitle languages for a specific media ID.
--$ ./ws-client.py langs "MEDIA-ID-1234"

ws-client.py metadata

metadata help:
usage: ws-client.py metadata [-h] [-H HASH_ID] [-k] [-K REQUEST_KEY_LIFETIME]
                             media_id

positional arguments:
  media_id              Media ID

optional arguments:
  -h, --help            show this help message and exit
  -H HASH_ID, --hash-id HASH_ID
                        Media Hash ID
  -k, --use-request-key
                        Use request-key as auth_token instead of secret_key.
  -K REQUEST_KEY_LIFETIME, --request-key-lifetime REQUEST_KEY_LIFETIME
                        Set request key lifetime in minutes. Default: from
                        config file.
Example 27. ws-client.py metadata call: Get metadata and media file locations for a specific media ID.
--$ ./ws-client.py metadata "MEDIA-ID-1234"

ws-client.py subs

subs help:
usage: ws-client.py subs [-h] [-f {dfxp,ttml,srt,vtt}] [-D {0,1,2}]
                         [-s {-1,0,1}] [-H HASH_ID] [-k]
                         [-K REQUEST_KEY_LIFETIME]
                         media_id language

positional arguments:
  media_id              Media ID
  language              Language ISO 639-1 code (i.e. en, es, ca, ...)

optional arguments:
  -h, --help            show this help message and exit
  -f {dfxp,ttml,srt,vtt,text}, --format {dfxp,ttml,srt,vtt,text}
                        Subtitles format
  -D {0,1,2}, --select-data-policy {0,1,2}
                        sel_data_policy parameter
  -s {-1,0,1}, --segment-filtering-policy {-1,0,1}
                        seg_filt_policy parameter
  -H HASH_ID, --hash-id HASH_ID
                        Media Hash ID
  -k, --use-request-key
                        Use request-key as auth_token instead of secret_key.
  -K REQUEST_KEY_LIFETIME, --request-key-lifetime REQUEST_KEY_LIFETIME
                        Set request key lifetime in minutes. Default: from
                        config file.
Example 28. ws-client.py subs call: Download English subtitles in SRT format for a specific media ID.
--$ ./ws-client.py subs "MEDIA-ID-1234" en -f srt

ws-client.py audiotrack

audiotrack help:
usage: ws-client.py subs [-h] [-f {dfxp,ttml,srt,vtt}] [-D {0,1,2}]
                         [-s {-1,0,1}] [-H HASH_ID] [-k]
                         [-K REQUEST_KEY_LIFETIME]
                         media_id language

positional arguments:
  media_id              Media ID
  language              Language ISO 639-1 code (i.e. en, es, ca, ...)

optional arguments:
  -h, --help            show this help message and exit
  -f {dfxp,ttml,srt,vtt}, --format {dfxp,ttml,srt,vtt}
                        Subtitles format
  -D {0,1,2}, --select-data-policy {0,1,2}
                        sel_data_policy parameter
  -s {-1,0,1}, --segment-filtering-policy {-1,0,1}
                        seg_filt_policy parameter
  -H HASH_ID, --hash-id HASH_ID
                        Media Hash ID
  -k, --use-request-key
                        Use request-key as auth_token instead of secret_key.
  -K REQUEST_KEY_LIFETIME, --request-key-lifetime REQUEST_KEY_LIFETIME
                        Set request key lifetime in minutes. Default: from
                        config file.
Example 29. ws-client.py audiotrack call: Download the English audiotrack for a specific media ID and audiotrack ID.
--$ ./ws-client.py audiotrack "MEDIA-ID-1234" en 12

ws-client.py mod

mod help:
usage: ws-client.py mod [-h] [-D DATA] [-i MEDIA_ID] [-l LANGUAGE]
                        [-a AUTHOR_ID] [-c AUTHOR_CONF] [-n AUTHOR_NAME]
                        [-t TXT_JSON_DICT] [-r DEL_SEGM_ID] [-H HASH_ID] [-k]
                        [-K REQUEST_KEY_LIFETIME]

optional arguments:
  -h, --help            show this help message and exit
  -D DATA, --data DATA  Directly provide input base64-encoded JSON data string
  -i MEDIA_ID, --media-id MEDIA_ID
                        Media Id
  -l LANGUAGE, --language LANGUAGE
                        Subtitle language
  -a AUTHOR_ID, --author-id AUTHOR_ID
                        Author Id
  -c AUTHOR_CONF, --author-conf AUTHOR_CONF
                        Author Confidence level (0-100)
  -n AUTHOR_NAME, --author-name AUTHOR_NAME
                        Author full name
  -t TXT_JSON_DICT, --txt-json-dict TXT_JSON_DICT
                        Segment modification dictionary: {"sI":<int>,
                        "b":<float, "e":<float>, "t":<str>}
  -r DEL_SEGM_ID, --del-segm-id DEL_SEGM_ID
                        Segment IDs to delete.
  -H HASH_ID, --hash-id HASH_ID
                        Media Hash ID
  -k, --use-request-key
                        Use request-key as auth_token instead of secret_key.
  -K REQUEST_KEY_LIFETIME, --request-key-lifetime REQUEST_KEY_LIFETIME
                        Set request key lifetime in minutes. Default: from
                        config file.
Example 30. ws-client.py mod call: Send modifications for the English subtitles of a specific media ID.
--$ ./ws-client.py mod \
        --media-id MEDIA-ID-1234 \
        --language en \
        --author-id jsnow21 \
        --author-conf 100 \
        -t '{"sI":1, "b":0.0, "e":2.0, "t":"Winter is coming."}' \
        -t '{"sI":2, "b":3.0, "e":5.0, "t":"Valar Morghulis."}' \
        -n "John Snow"
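
The segment modification dictionaries passed with -t are JSON strings, so they are easier to produce with a JSON serializer than by hand. The following is a minimal sketch, assuming the dictionary layout shown in the mod help above. The function names segment_edit and build_mod_command are hypothetical.

```python
import json
import shlex


def segment_edit(segment_id, begin, end, text):
    """Serialize one segment modification as a JSON string for -t."""
    return json.dumps(
        {"sI": int(segment_id), "b": float(begin), "e": float(end), "t": text}
    )


def build_mod_command(media_id, language, author_id, author_conf, edits,
                      author_name=None, script="./ws-client.py"):
    """Return the argument list for a 'ws-client.py mod' call."""
    command = [
        script, "mod",
        "--media-id", media_id,
        "--language", language,
        "--author-id", author_id,
        "--author-conf", str(author_conf),
    ]
    for edit in edits:  # one -t option per modified segment
        command += ["-t", edit]
    if author_name:
        command += ["-n", author_name]
    return command


edits = [
    segment_edit(1, 0.0, 2.0, "Winter is coming."),
    segment_edit(2, 3.0, 5.0, "Valar Morghulis."),
]
command = build_mod_command("MEDIA-ID-1234", "en", "jsnow21", 100, edits,
                            author_name="John Snow")
print(shlex.join(command))
```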

player-url-generator.py

player-url-generator.py is a Python command-line utility that generates valid URLs for the TLP Player, following the Player URL specifications. The configuration file must be properly set up for the script to work.

Software requirements
player-url-generator.py help
usage: player-url-generator.py [-h] [-d] [-C] [-c <file>] [-s START_TIME]
                               [-l LANGUAGE] [-t TIME_SLOT] [-a AUTHOR_ID]
                               [-n AUTHOR_NAME] [-k AUTHOR_CONF]
                               [-I <user_tuple>]
                               media_id

player-url-generator.py: Generate valid URLs for calling the TLP Player.

positional arguments:
  media_id              Media ID

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           Print debug information
  -C, --print-sample-config-file
                        Print sample config file and exit
  -c <file>, --config-file <file>
                        Config file. Default: config.ini
  -s START_TIME, --start-time START_TIME
                        Start time in seconds.
  -l LANGUAGE, --language LANGUAGE
                        Subtitles language.
  -t TIME_SLOT, --time-slot TIME_SLOT
                        Time slot for editing in minutes. Default: from config
                        file.
  -a AUTHOR_ID, --author-id AUTHOR_ID
                        Author ID ('author_id'). Default: from config file.
  -n AUTHOR_NAME, --author-name AUTHOR_NAME
                        Author Name. Default: from config file.
  -k AUTHOR_CONF, --author-conf AUTHOR_CONF
                        Author confidence level [0-100]. Default: from config
                        file.
  -I <user_tuple>, --api_user <user_tuple>
                        User name and authentication token, in the following
                        format: USERNAME:AUTH_TOKEN. Default: from config
                        file.
Example 31. player-url-generator.py call: Get URL for editing English subtitles of a given media ID:
--$ player-url-generator.py -l en MEDIA-ID-1234

Appendix G: DFXP Format Specification

This Appendix describes an extension of the original DFXP format. This extension was made in order to meet the needs of the transLectures EU project:

  • Confidence measures for automatic transcription and translations have to be reflected in the DFXP document.

  • Track needs to be kept of all subtitle edits made by human users, starting from an automatic transcription/translation.

For this purpose, new XML tags have been proposed. These tags belong to a new namespace called tl. Therefore, these new XML tags will be something like <tl:XXX>, where tl is the namespace and XXX is the tag. The root <tt> element has been extended in this way:

 <tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">

With TLP 1.2, the MLLP research group launched an updated version of the DFXP format, namely DFXP v1.1, which allows the speech segmentation to be modified. The drawback is that the history of user edits cannot be tracked inside the DFXP file.

DFXP Tags

Tags are defined at four levels: document, segment, group and word. Document tags are located at the head section, while segment, group and word tags are located at the body section. An additional tag to relate alternative transcriptions/translations is also included. A detailed explanation of tags follows:

  • <tl:document>: This tag defines the attributes of the transcription/translation at the top level. As the attributes are inherited, the values of the attributes defined here are the default values, unless otherwise redefined. It contains a specific attribute to associate the current file with a unique video ID. Abbreviation: <tl:d>.

  • <tl:current>: This tag defines the current status of the subtitle file, including the last modifications made by users. It contains an ordered sequence of text segments or captions. Abbreviation: <tl:c>.

  • <tl:origin>: This tag defines the former status of the subtitle file (typically the automatic transcription/translation). It contains an ordered sequence of text segments or captions. Abbreviation: <tl:o>.

  • <tl:segment>: This tag defines text segments or captions. Abbreviation: <tl:s>.

  • <tl:group>: This tag defines a group of words inside a segment. This tag will usually appear as a result of the interaction with the user. Abbreviation: <tl:g>.

  • <tl:word>: A simple tag used to specify single word properties, mostly used for time alignments and confidence measures. Other attributes are generally inherited. Abbreviation: <tl:w>.

Next, we define the set of attributes related to the tags just defined. Most of the attributes are applicable to all levels:

  • authorType: Type of author. Its value is either automatic or human: human for transcriptions/translations generated or completely supervised by human experts, and automatic for transcriptions/translations fully generated by an ASR/MT system. Abbreviation: aT.

  • authorId: Author identifier. For example: RWTH, XEROX, UPV, Maria Gialama, etc. Abbreviation: aI.

  • authorConf: Confidence measure of the author when the authorType is human. This attribute is coupled with an authorId. It could be useful for non-native users supervising a foreign language. Abbreviation: aC.

  • wordSegId: It identifies the system that performs the automatic segmentation at the word level. It could be different from the authorId, since groups of words supervised by the user may be segmented at the word level with a different system from that providing the automatic transcription. Abbreviation: wS.

  • timeStamp: Instant of creation or modification. The timestamp format is the combination of date and time of day defined in Chapter 5.4 of ISO 8601. The format is [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. Abbreviation: tS.

  • confMeasure: Confidence measure of the level. These values are generated by ASR and MT systems. Abbreviation: cM.

  • videoId: Attribute only defined at the document level. It links the current transcription or translation DFXP file to a unique video. Abbreviation: vI.

  • segmentId: It is used to uniquely identify a segment in a transcription or translation file. As mentioned above, alternative segments have the same segmentId. Abbreviation: sI.

  • begin: Instant of the beginning of an audio portion of the current tag in seconds. Abbreviation: b.

  • end: Instant of the end of an audio portion of the current tag in seconds. Abbreviation: e.

  • elapsedTime: Processing time. Abbreviation: eT.

  • modelID: Model used by decoder. Abbreviation: mI.

  • processingSteps: Processing steps of decoder. Abbreviation: pS.

  • audioLength: Complete length of the video. Abbreviation: aL.

  • status: Supervision status of the subtitles. Abbreviation: st. Possible values:

    • fully_automatic → All segments are automatic.

    • partially_human → Some segments have been supervised.

    • fully_human → All segments have been supervised.

Note: Special characters such as & " < > ' must be escaped in DFXP files, as required by the XML standard (see http://xml.silmaril.ie/specials.html).
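
The escaping rule and the timeStamp format described above can both be covered with the Python standard library. The following is a minimal sketch. The function names escape_dfxp_text and dfxp_timestamp are hypothetical and not part of TLP.

```python
from datetime import datetime
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are added explicitly.
_QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_dfxp_text(text):
    """Escape the five XML special characters in a text string."""
    return escape(text, _QUOTES)


def dfxp_timestamp(moment):
    """Format a datetime as CCYY-MM-DDThh:mm:ss.

    Timezone-aware values additionally get a (+|-)hh:mm offset suffix.
    """
    return moment.replace(microsecond=0).isoformat()


print(escape_dfxp_text("Tom & Jerry's <show>"))
print(dfxp_timestamp(datetime(2012, 10, 3, 21, 32, 52)))
```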

Examples of Extended DFXP Tags

Examples at <head>

DFXP Format: Document tags
 <tl:d aT="automatic" aI="UPV-v1.0" tS="2012-10-03T21:32:52" aC="0.6"
  cM="0.75" vI="1234-abcd" b="0.0" e="400.6"/>

 <tl:d aT="human" aI="John Doe" tS="2012-10-03T21:32:52" aC="1.0"
  cM="1.0" videoId="1234-abcd" b="1.0" e="400.6"/>

Examples at <body>

DFXP Format: Segment tags
 <tl:s sI="1" aT="automatic" aI="UPV" wS="UPV" tS="2012-10-03T21:32:52" cM="0.62" aC="0.5" b="0.0" e="15.6">
   i am very hungry with you
 </tl:s>
DFXP Format: Group tags
 <tl:g aT="human" aI="John Doe" aC="0.75" tS="2012-10-03T21:32:52" cM="1.0" b="2.7" e="3.5">
    the way we train in IBM
 </tl:g>
DFXP Format: Word tags
 <tl:w aT="automatic" aI="UPV" aC="0.5" cM="0.61" tS="2012-10-03T21:32:52" b="1.3" e="2.1">the</tl:w>
 <tl:w cM="1.3" b="1.6" e="2.1">way</tl:w>

Use cases

  • A transcription/translation is automatically generated by an automatic system creating a DFXP file from scratch.

  • A transcription/translation is manually generated by a human expert creating a new DFXP file.

  • A user supervises an automatic/manual transcription/translation.

Use case examples

A transcription/translation is generated by an automatic system creating a DFXP file from scratch.
 <?xml version="1.0" encoding="utf-8"?>
 <tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">
  <head>
    <tl:d aT="automatic" aI="UPV-v1.0" wS="UPV-v1.0" tS="2012-10-03T21:32:52" aC="0.56" cM="0.75"
     videoId="00505-Profesores_Alcoy.M03.B01" b="0.0" e="12.50"/>
  </head>
  <body>
    <tl:c>
      <tl:s sI="1" cM="0.75" b="0.00" e="3.20">
        <tl:w cM="0.85" b="0.00" e="0.75">most</tl:w>
        <tl:w cM="0.89" b="0.75" e="0.95">of</tl:w>
        <tl:w cM="0.63" b="0.95" e="1.15">you</tl:w>
        <tl:w cM="0.40" b="1.15" e="1.35">are</tl:w>
        <tl:w cM="0.90" b="1.35" e="1.50">probably</tl:w>
        <tl:w cM="0.85" b="1.50" e="1.75">ventured</tl:w>
        <tl:w cM="0.55" b="1.75" e="2.00">the </tl:w>
        <tl:w cM="0.98" b="2.00" e="2.75">problem</tl:w>
        <tl:w cM="0.60" b="2.75" e="3.20">that</tl:w>
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.50" e="12.50">
        <tl:w cM="0.1" b="8.50" e="9.00">To</tl:w>
        <tl:w cM="0.2" b="9.00" e="10.00">solve</tl:w>
        <tl:w cM="0.1" b="10.00" e="10.70">on</tl:w>
        <tl:w cM="0.1" b="10.70" e="12.50">this</tl:w>
      </tl:s>
    </tl:c>
    <tl:o>
      <tl:s sI="1" cM="0.75" b="0.00" e="3.20">
        <tl:w cM="0.85" b="0.00" e="0.75">most</tl:w>
        <tl:w cM="0.89" b="0.75" e="0.95">of</tl:w>
        <tl:w cM="0.63" b="0.95" e="1.15">you</tl:w>
        <tl:w cM="0.40" b="1.15" e="1.35">are</tl:w>
        <tl:w cM="0.90" b="1.35" e="1.50">probably</tl:w>
        <tl:w cM="0.85" b="1.50" e="1.75">ventured</tl:w>
        <tl:w cM="0.55" b="1.75" e="2.00">the </tl:w>
        <tl:w cM="0.98" b="2.00" e="2.75">problem</tl:w>
        <tl:w cM="0.60" b="2.75" e="3.20">that</tl:w>
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.50" e="12.50">
        <tl:w cM="0.1" b="8.50" e="9.00">To</tl:w>
        <tl:w cM="0.2" b="9.00" e="10.00">solve</tl:w>
        <tl:w cM="0.1" b="10.00" e="10.70">on</tl:w>
        <tl:w cM="0.1" b="10.70" e="12.50">this</tl:w>
      </tl:s>
    </tl:o>
  </body>
 </tt>
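
Word-level confidence measures such as those in the file above can be used to point a reviewer at the words most likely to be wrong. The following is a minimal sketch that parses a shortened DFXP document with xml.etree.ElementTree and lists the words of the <tl:c> section whose cM value falls below a threshold. The function name low_confidence_words and the 0.5 threshold are assumptions, and only abbreviated tag and attribute names are handled.

```python
import xml.etree.ElementTree as ET

# Namespaces declared on the <tt> root element.
TTAF = "{http://www.w3.org/2006/04/ttaf1}"
TL = "{translectures.eu}"

SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">
  <head>
    <tl:d aT="automatic" aI="UPV-v1.0" cM="0.75" videoId="demo" b="0.0" e="10.00"/>
  </head>
  <body>
    <tl:c>
      <tl:s sI="1" cM="0.75" b="0.00" e="0.95">
        <tl:w cM="0.85" b="0.00" e="0.75">most</tl:w>
        <tl:w cM="0.40" b="0.75" e="0.95">are</tl:w>
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.50" e="10.00">
        <tl:w cM="0.1" b="8.50" e="9.00">To</tl:w>
        <tl:w cM="0.2" b="9.00" e="10.00">solve</tl:w>
      </tl:s>
    </tl:c>
  </body>
</tt>
""".encode("utf-8")


def low_confidence_words(dfxp_bytes, threshold=0.5):
    """Return (segment ID, word, confidence) for words below the threshold."""
    root = ET.fromstring(dfxp_bytes)
    current = root.find(f"{TTAF}body/{TL}c")
    results = []
    for segment in current.iter(f"{TL}s"):
        for word in segment.iter(f"{TL}w"):
            confidence = word.get("cM")
            if confidence is None:
                continue  # no word-level confidence available
            if float(confidence) < threshold:
                results.append(
                    (int(segment.get("sI")), word.text.strip(), float(confidence))
                )
    return results


flagged = low_confidence_words(SAMPLE)
for segment_id, text, confidence in flagged:
    print(segment_id, text, confidence)
```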
A transcription/translation is manually generated by a human expert creating a new DFXP file.
 <?xml version="1.0" encoding="utf-8"?>
 <tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">
  <head>
    <tl:d aT="human" aI="Maria" aC="1.0" videoId="00505-Profesores_Alcoy.M03.B01"
     tS="2012-10-03T21:32:52" cM="1.0" b="0.0" e="12.50"/>
  </head>
  <body>
    <tl:c>
      <tl:s sI="1" b="0.00" e="3.20">
        most of you have probably ventured into the problem set.
      </tl:s>
      <tl:s sI="2" b="8.50" e="12.50">
        The solution is:
      </tl:s>
    </tl:c>
    <tl:o>
      <tl:s sI="1" b="0.00" e="3.20">
        most of you have probably ventured into the problem set.
      </tl:s>
      <tl:s sI="2" b="8.50" e="12.50">
        The solution is:
      </tl:s>
    </tl:o>
  </body>
 </tt>
A user supervises an automatic transcription/translation by editing a segment.
 <?xml version="1.0" encoding="utf-8"?>
 <tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">
  <head>
    <tl:d aT="automatic" aI="UPV-v1.0" wS="UPV-v1.0"
    tS="2012-10-03T21:32:52" aC="0.56" cM="0.75"
    videoId="00505-Profesores_Alcoy.M03.B01" b="0.0" e="12.50"/>
  </head>
  <body>
    <tl:c>
      <tl:s sI="1" aT="human" aC="0.81" cM="1.0" aI="John" b="0.17" e="3.32" tS="2012-10-04T13:31:45">
        most of you probably ventured into the problem set
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.5" e="12.50">
        <tl:w cM="0.1" b="8.5" e="9">To</tl:w>
        <tl:w cM="0.2" b="9" e="10">solve</tl:w>
        <tl:w cM="0.1" b="10" e="10.7">on</tl:w>
        <tl:w cM="0.1" b="10.7" e="12.5">this</tl:w>
      </tl:s>
    </tl:c>
    <tl:o>
      <tl:s sI="1" cM="0.75" b="0.00" e="3.20">
        <tl:w cM="0.85" b="0.00" e="0.75">most</tl:w>
        <tl:w cM="0.89" b="0.75" e="0.95">of</tl:w>
        <tl:w cM="0.63" b="0.95" e="1.15">you</tl:w>
        <tl:w cM="0.40" b="1.15" e="1.35">are</tl:w>
        <tl:w cM="0.90" b="1.35" e="1.50">probably</tl:w>
        <tl:w cM="0.85" b="1.50" e="1.75">ventured</tl:w>
        <tl:w cM="0.55" b="1.75" e="2.00">the </tl:w>
        <tl:w cM="0.98" b="2.00" e="2.75">problem</tl:w>
        <tl:w cM="0.60" b="2.75" e="3.20">that</tl:w>
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.50" e="12.50">
        <tl:w cM="0.1" b="8.50" e="9.00">To</tl:w>
        <tl:w cM="0.2" b="9.00" e="10.00">solve</tl:w>
        <tl:w cM="0.1" b="10.00" e="10.70">on</tl:w>
        <tl:w cM="0.1" b="10.70" e="12.50">this</tl:w>
      </tl:s>
    </tl:o>
  </body>
 </tt>
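
The supervision status of a file such as the one above can be estimated from the authorType of its segments. The following is a minimal sketch, assuming that a segment without its own aT attribute inherits the value set in <tl:d>, and that the status values fully_automatic, partially_human and fully_human follow from the share of human segments in <tl:c>. The function name supervision_status is hypothetical, only abbreviated names are handled, and the rule actually applied by TLP may differ.

```python
import xml.etree.ElementTree as ET

# Namespaces declared on the <tt> root element.
TTAF = "{http://www.w3.org/2006/04/ttaf1}"
TL = "{translectures.eu}"

SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="en" xmlns="http://www.w3.org/2006/04/ttaf1"
 xmlns:tts="http://www.w3.org/2006/10/ttaf1#style" xmlns:tl="translectures.eu">
  <head>
    <tl:d aT="automatic" aI="UPV-v1.0" cM="0.75" videoId="demo" b="0.0" e="12.50"/>
  </head>
  <body>
    <tl:c>
      <tl:s sI="1" aT="human" aI="John" aC="0.81" cM="1.0" b="0.17" e="3.32">
        most of you probably ventured into the problem set
      </tl:s>
      <tl:s sI="2" cM="0.19" b="8.5" e="12.50">
        <tl:w cM="0.1" b="8.5" e="9">To</tl:w>
        <tl:w cM="0.2" b="9" e="10">solve</tl:w>
      </tl:s>
    </tl:c>
  </body>
</tt>
""".encode("utf-8")


def supervision_status(dfxp_bytes):
    """Classify a DFXP file by the author type of its current segments."""
    root = ET.fromstring(dfxp_bytes)
    document = root.find(f"{TTAF}head/{TL}d")
    # Document-level aT is the default inherited by all segments.
    default_type = "automatic"
    if document is not None:
        default_type = document.get("aT", default_type)
    current = root.find(f"{TTAF}body/{TL}c")
    types = [seg.get("aT", default_type) for seg in current.findall(f"{TL}s")]
    human = sum(1 for author_type in types if author_type == "human")
    if human == 0:
        return "fully_automatic"
    if human == len(types):
        return "fully_human"
    return "partially_human"


status = supervision_status(SAMPLE)
print(status)
```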