Speech Technology


Speech Technology is also known as a new-generation technology because it allows users to access computers, IT applications, websites and other software through voice commands. In other words, it provides hands-free access. In a broader view, speech technology can be used in almost every part of human life.

Infynita Inc. Speech Technology is designed as a set of applications and protocols. We define these components as layers or sinks. The layers and sinks are integrated as needed to provide a solution that is optimized for the platform, the device and the customer's needs.

Using Infynita Speech Technology, it is possible to provide solutions across platforms (Windows, Linux, PalmOS and others) and across devices (desktops, PDAs, telephones, etc.), as an Internet-based or a client/server application. Each solution offers a consistent voice and graphical user interface and experience, using natural language as the main bridge for commands and responses. Infynita Speech Technology and protocol components are developed largely in C.

Infynita Speech Technology's basic architecture consists of the following building blocks:


1.	The Presentation Layer

2.	The Presentation Sink

3.	The Application Layer

4.	The System Sink

5.	2-Tier Speech Abstraction Layer

6.	The Speech Engine

Each block is designed to:

	be modular, allowing easy plug-in or removal

	use minimum system resources and CPU time

	manage its own memory independently of the other modules

1.	The Presentation Layer: -
	This layer manages the GUI of the application.

	1.	The look and feel of this layer is designed to be as uniform as possible across platforms.

	2.	The focus is on a consistent user experience.

	3.	This layer is platform dependent.

	4.	This layer interacts with the Presentation Sink for any information that needs to be transmitted to the Application Layer.


2.	The Presentation Sink: -
	1.	This layer is platform dependent.

	2.	It acts as a transmission medium between the Presentation Layer and the Application Layer.

	3.	All calls from the Sink to the Application Layer are ID specific and are handled by a separate "Event Delegation Model", which forms a part of the sink.
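
	The ID-specific calls described above could be pictured as a small dispatch table. The following is only a minimal sketch, not Infynita's actual Event Delegation Model: the names `ev_sink`, `ev_register`, `ev_dispatch`, `on_open` and the event IDs are all hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical event IDs the Presentation Layer might raise (assumption). */
enum { EV_OPEN = 1, EV_SAVE = 2, EV_MAX = 16 };

/* A handler lives in the Application Layer and receives an opaque payload. */
typedef int (*ev_handler)(void *payload);

/* The sink keeps one handler slot per event ID; NULL means "unregistered". */
typedef struct {
    ev_handler handlers[EV_MAX];
} ev_sink;

/* Register (or replace) the handler for one event ID. Returns 0 on success. */
int ev_register(ev_sink *s, int id, ev_handler h) {
    if (id < 0 || id >= EV_MAX) return -1;
    s->handlers[id] = h;
    return 0;
}

/* Forward an event to the Application Layer; -1 means no handler for this ID. */
int ev_dispatch(const ev_sink *s, int id, void *payload) {
    if (id < 0 || id >= EV_MAX || s->handlers[id] == NULL) return -1;
    return s->handlers[id](payload);
}

/* Example Application Layer handler: counts how often it was called. */
int on_open(void *payload) {
    int *calls = payload;
    (*calls)++;
    return 0;
}
```

	In this sketch the Presentation Layer only ever passes an ID and a payload, so it needs no knowledge of which Application Layer function ends up handling the event.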


3.	The Application Layer: -
	The Application Layer contains the core functionality: the logic and business rules of the application.

	1.	This layer is platform independent and can be easily ported to other platforms.

	2.	This layer is speech engine independent.

	3.	This layer is normally designed in 'C' with standards and conventions that will make it portable across multiple platforms.

	4.	Designed to provide a small footprint.

	5.	Application logic does not interact directly with the OS, the Presentation Layer or the speech engine.

	6.	This layer interacts with System Sink for allocation, de-allocation and management of system resources.

	7.	Application Layer is flexible enough to incorporate future enhancements.


4.	The System Sink Layer: -
	This layer is platform dependent.

	1.	This layer exposes a common API set that performs the critical job of system resource allocation.

	2.	Consists of an intelligent resource pooling mechanism that manages allocation of resources.

	3.	This layer interacts with the application layer.

	4.	If the application uses a multithreaded model, this layer also handles thread management.
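
	As a rough illustration of the resource pooling idea, the sketch below shows a fixed-size pool that the Application Layer could call through a common API. It is an assumption-laden example, not the System Sink's real interface: `res_pool`, `pool_acquire`, `pool_release` and the slot sizes are hypothetical.

```c
#include <stddef.h>

#define POOL_SLOTS     4    /* hypothetical number of pooled buffers */
#define POOL_SLOT_SIZE 256  /* hypothetical size of each buffer in bytes */

/* A fixed-size pool owned by the System Sink; zero-initialise before use. */
typedef struct {
    unsigned char storage[POOL_SLOTS][POOL_SLOT_SIZE];
    int in_use[POOL_SLOTS];
} res_pool;

/* Hand out the first free slot, or NULL when the pool is exhausted. */
void *pool_acquire(res_pool *p) {
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return p->storage[i];
        }
    }
    return NULL;
}

/* Return a slot to the pool. Returns 0 on success, or -1 if ptr does not
   belong to the pool or the slot is not currently in use. */
int pool_release(res_pool *p, void *ptr) {
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (ptr == (void *)p->storage[i] && p->in_use[i]) {
            p->in_use[i] = 0;
            return 0;
        }
    }
    return -1;
}
```

	Because the application only sees `pool_acquire` and `pool_release` in this sketch, a platform-specific implementation could replace the static array with OS-level allocation without changing the calling code.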


5.	2-Tier Speech Abstraction Layer
	I Speech Abstraction Layer: -

	This contains the basic API set that the application layer calls for Text To Speech (TTS) and Automatic Speech Recognition (ASR) functionalities.

	1.	Optimizes the grammar to make it natural.

	2.	Customizes and enhances Natural Language to make it speech engine compliant.

	3.	Has a built-in self-training mechanism in which the abstraction layer trains the speech engine (and creates profiles) based on consistent user input.
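
	To make the idea of an engine-independent API set more concrete, here is a minimal sketch in which the application calls two generic functions and a bound back end supplies the engine-specific work. Everything in it is hypothetical (`sal_engine`, `sal_bind`, `sal_speak`, `sal_listen` and the stub engine); it does not reproduce SAPI calls or Infynita's real API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical back end: one such struct would wrap each speech engine. */
typedef struct {
    int (*speak)(const char *text);               /* TTS entry point */
    int (*recognize)(char *out, size_t out_len);  /* ASR entry point */
} sal_engine;

static const sal_engine *g_engine = NULL;

/* Select the engine the abstraction layer forwards to. */
void sal_bind(const sal_engine *e) { g_engine = e; }

/* Called by the Application Layer; returns -1 if no engine is bound. */
int sal_speak(const char *text) {
    if (!g_engine || !g_engine->speak || !text) return -1;
    return g_engine->speak(text);
}

/* Called by the Application Layer; writes the recognised text to out. */
int sal_listen(char *out, size_t out_len) {
    if (!g_engine || !g_engine->recognize || !out || out_len == 0) return -1;
    return g_engine->recognize(out, out_len);
}

/* Stub engine for illustration only: "speaks" by returning the text length
   and always "recognises" the word "yes". */
static int stub_speak(const char *text) { return (int)strlen(text); }

static int stub_recognize(char *out, size_t out_len) {
    strncpy(out, "yes", out_len - 1);
    out[out_len - 1] = '\0';
    return 0;
}

const sal_engine stub_engine = { stub_speak, stub_recognize };
```

	Swapping the speech engine in this sketch means binding a different `sal_engine`; the Application Layer code that calls `sal_speak` and `sal_listen` stays unchanged.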

	II Telephonic Abstraction Layer

	All the APIs that deal with telephonic input or output, or with any other communication with the telephony hardware, are wrapped in this layer.

	1.	This layer is platform specific (Windows) and needs to implement TAPI interfaces.

	2.	Session tracking is handled: each telephone conversation is maintained under a unique session ID.

	3.	Has an intelligent mechanism to change the format of the speech output depending on the end terminal (device tracking).

	4.	Designed to accept input in two modes: the user can either press keys on the keypad or speak. The latter mode typically deals with Boolean questions, to which the answer is yes or no.

	5.	To incorporate voice input over the telephone, this layer is tightly coupled with Tier 1 (Speech Engine Layer), which works at the other end.
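
	The session tracking and the two input modes could be sketched as follows. This is an illustrative example only and does not use TAPI: the types `tel_session` and `tel_answer`, the function `tel_read_answer`, and the keypad mapping (1 for yes, 2 for no) are assumptions made for the example.

```c
#include <string.h>

typedef enum { ANS_UNKNOWN = -1, ANS_NO = 0, ANS_YES = 1 } tel_answer;
typedef enum { IN_KEYPAD, IN_VOICE } tel_input_mode;

/* One record per telephone conversation, identified by a unique session ID. */
typedef struct {
    unsigned long  session_id;
    tel_input_mode mode;
} tel_session;

/* Keypad mode: assumed mapping of '1' to yes and '2' to no. */
static tel_answer from_keypad(char digit) {
    if (digit == '1') return ANS_YES;
    if (digit == '2') return ANS_NO;
    return ANS_UNKNOWN;
}

/* Voice mode: expects the recognised word, as delivered by the speech tier. */
static tel_answer from_voice(const char *word) {
    if (strcmp(word, "yes") == 0) return ANS_YES;
    if (strcmp(word, "no") == 0)  return ANS_NO;
    return ANS_UNKNOWN;
}

/* Normalise raw caller input into a Boolean answer for the given session. */
tel_answer tel_read_answer(const tel_session *s, const char *raw) {
    if (!s || !raw || raw[0] == '\0') return ANS_UNKNOWN;
    return s->mode == IN_KEYPAD ? from_keypad(raw[0]) : from_voice(raw);
}
```

	Either input mode yields the same `tel_answer` value in this sketch, so the code that asks the question does not need to know how the caller replied.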


6.	The Speech Engine: -
	The Speech Engine is platform and client dependent. Present development is extensively based on MS SAPI 5.1 and Microsoft's proprietary Speech Engines.

	The criteria used in selecting the Speech Engine are:

	1.	The speech engine should support TTS and ASR functionalities simultaneously.

	2.	Whichever speech engine is selected, its efficiency and performance should be high on every platform.

	3.	Additional features that the speech engine should provide are:

	a.		The speech engine should support multi thread models.

	b.		Preferably the speech engine should be sharable across multiple applications in a user session.

	c.		The speech engine should support the streaming of audio I/O.

	d.		The speech engine should be capable of handling bookmarks and streaming of the output.
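
	The selection criteria above can be expressed as a simple capability check. The sketch below is one possible encoding, not a published Infynita procedure: the flag names and the functions `engine_acceptable` and `engine_optional_score` are hypothetical, and the performance criterion is left out because it cannot be captured by a flag.

```c
/* Hypothetical capability flags, one bit per feature named in the criteria. */
enum {
    CAP_TTS          = 1 << 0,  /* text to speech */
    CAP_ASR          = 1 << 1,  /* automatic speech recognition */
    CAP_MULTITHREAD  = 1 << 2,  /* supports multithreaded models */
    CAP_SHAREABLE    = 1 << 3,  /* shareable across applications in a session */
    CAP_AUDIO_STREAM = 1 << 4,  /* streaming of audio input and output */
    CAP_BOOKMARKS    = 1 << 5   /* bookmarks in the output stream */
};

#define CAP_REQUIRED (CAP_TTS | CAP_ASR)
#define CAP_OPTIONAL (CAP_MULTITHREAD | CAP_SHAREABLE | \
                      CAP_AUDIO_STREAM | CAP_BOOKMARKS)

/* Criterion 1: the engine must offer both TTS and ASR. Returns 1 or 0. */
int engine_acceptable(unsigned caps) {
    return (caps & CAP_REQUIRED) == CAP_REQUIRED;
}

/* Criterion 3: count how many of the additional features are present. */
int engine_optional_score(unsigned caps) {
    unsigned opt = caps & CAP_OPTIONAL;
    int n = 0;
    while (opt) {
        n += (int)(opt & 1u);
        opt >>= 1;
    }
    return n;
}
```

	Under these assumptions, candidate engines that pass `engine_acceptable` could then be ranked by `engine_optional_score` alongside measured performance on each platform.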


Infynita Advantage:

The Infynita Inc. Speech Research Center, which is scheduled to launch in the last two weeks of December 2005, has also been chartered to develop ASR and TTS technologies based on custom requirements.

Business Use of Speech Technology

Telephony applications for auto attendant/call center attendant:

	Customer order management system

	Inventory status

	Stock quotes

	Financial alerts

	Insurance quotes

	User surveys

	Sports scores and information

	Banking information

	Air/rail/bus schedules and arrival/departure information

	Sales support systems

	Customer service functions

	Fault reporting systems

	Directory information

Speech-enabled email client software

Speech-enabled banking software

Speech-enabled catalog/e-commerce software

Speech-enabled wrappers that allow voice command of any application, such as MS Word, Winamp or the Windows OS

	SpeechFirst Case Study

	SpeechFirst is probably the most sophisticated speech interface available on the market. SpeechFirst uses the Microsoft SAPI 5.1 TTS and ASR engines for speech output and voice recognition.
