SpeechTEK University
SpeechTEK University courses provide in-depth, 3-hour seminars on compelling topics for speech technology and information technology professionals. Experienced speech technology practitioners teach each seminar in an intimate classroom setting to foster a structured, educational learning experience. If you are considering deploying a speech application, looking to deepen your knowledge in one of the key areas below, or simply need a speech technology refresher, attend a SpeechTEK University course. These courses are priced separately or may be purchased as part of your conference registration.

STKU-1 – Introduction to Speech Technologies
1:30 p.m. – 4:30 p.m.

Designed for attendees new to speech technology, this tutorial provides an overview of today’s key speech technologies. What are the major types of speech recognition engines, and how do they work? What is the difference between statistical language models and grammars? Do speaker identification and verification really work? How do speech synthesis engines work, and how do you specify synthetic voice characteristics? What does a dialogue engine do, and how do you manage a user-computer dialogue? What makes intelligent agents intelligent? What are the dimensions of natural language? What is needed to build a personal assistant? How will speech technologies change our lives in the next three years?

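One of the questions above, how to specify synthetic voice characteristics, is typically answered with the W3C Speech Synthesis Markup Language (SSML). The fragment below is a minimal illustrative sketch; the spoken text is invented, and available voices and supported attribute values vary by synthesis engine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <!-- Request a voice by gender; specific voice names are engine-dependent -->
  <voice gender="female">
    Welcome to SpeechTEK University.
    <!-- Slow the rate and lower the pitch for this sentence -->
    <prosody rate="slow" pitch="low">Classes begin at one thirty.</prosody>
    <!-- Insert a 500 millisecond pause before the closing line -->
    <break time="500ms"/>
    We look forward to seeing you.
  </voice>
</speak>
```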
STKU-2 – Introduction to Voice Interaction Design
1:30 p.m. – 4:30 p.m.

Taught by the author of Practical Speech User Interface Design, this session covers current leading practices in speech user interface design for interactive voice response applications. Drawing on psychology, human-computer interaction, linguistics, and communication theory, the course provides a comprehensive yet concise survey of practical speech user interface (SUI) design, including practice-based and research-based guidance on designing effective, efficient, and pleasant speech applications. The techniques for designing usable SUIs are not obvious and, to be effective, must be informed by a combination of critically interpreted scientific research and leading design practices.

STKU-3 – Introduction to Natural Language Processing
1:30 p.m. – 4:30 p.m.

This session introduces natural language processing (NLP) and its role in speech applications. Learn what natural language is, how the statistical language model (SLM) approach to NLP works, when and how to use NLP techniques, and how to combine NLP techniques with grammars and directed dialogues to achieve optimal application performance. The session also covers commercially available natural language tools, active research areas, and newer technologies such as the IBM Watson Jeopardy system and Apple's Siri. This tutorial is aimed at an audience with a general technical background; experience developing speech applications would be helpful.

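As a small illustration of the grammar side of that combination, the sketch below shows a rule-based grammar in the XML form of the W3C Speech Recognition Grammar Specification (SRGS). The domain, rule names, and phrases are invented for this example; a real application would pair such grammars with SLMs and dialogue logic as the session describes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" root="transfer">
  <!-- Root rule: matches phrases like "transfer fifty dollars to savings" -->
  <rule id="transfer">
    transfer <ruleref uri="#amount"/> dollars to <ruleref uri="#account"/>
  </rule>
  <!-- Example alternatives; a real grammar would cover many more -->
  <rule id="amount">
    <one-of>
      <item>fifty</item>
      <item>one hundred</item>
    </one-of>
  </rule>
  <rule id="account">
    <one-of>
      <item>checking</item>
      <item>savings</item>
    </one-of>
  </rule>
</grammar>
```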
STKU-4 – Speech Testing Methodologies and Best Practices
9:00 a.m. – 12:00 p.m.
Nava A. Shaked, Head of Multidisciplinary Studies, HIT Holon Institute of Technology, Israel

Testing and quality assurance (QA) are crucial to the success of any speech system, but testing IVR and multimodal systems presents unique challenges. This session looks at the different layers of testing and QA, including engine quality tests, functional application testing, VUI testing, interfaces and redundancy infrastructure, load balancing, and backup and recovery. It also discusses how to set the goals of the testing process, what metrics to use, how to conduct the tests, who should be on the testing team, and tips for managing the testing process.

STKU-5 – Large-Scale, Open-Set Speaker Identification and Blacklisting (CANCELLED)
9:00 a.m. – 12:00 p.m.
Homayoon Beigi, President, Recognition Technologies, Inc.; Adjunct Professor, Columbia University, Departments of Computer Science and Mechanical Engineering

This workshop has been cancelled.

STKU-6 – Connecting Customers’ Browsers and Mobile Apps to Speech Systems and Customer Service Agents with WebRTC
9:00 a.m. – 12:00 p.m.

WebRTC is the newest of the communications standards being developed at the W3C. It gives web developers APIs to access local cameras and microphones and to send that audio and video directly to any other browser, without plug-ins. In many cases the media flows peer-to-peer, without requiring a media gateway or relay. This course gives attendees a solid introduction to a standard that could soon become the foundation for all live person-to-person communication. The tutorial covers the VoIP technologies underlying WebRTC, common deployment models, and the APIs for using the standard. See demos, walkthroughs of real code, and, time permitting, some live coding.

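As a taste of the APIs the course covers, the sketch below shows the basic browser-side calls for capturing local media and starting a peer connection. It assumes a browser environment; the signaling channel is application-defined (WebRTC does not standardize signaling), and the STUN server URL is only a placeholder.

```javascript
// Sketch of the browser-side WebRTC calls. navigator.mediaDevices and
// RTCPeerConnection are browser APIs and run only inside a browser.
const mediaConstraints = { audio: true, video: true };

async function startCall(signalingChannel) {
  // Prompt the user for microphone and camera access, with no plug-ins.
  const stream = await navigator.mediaDevices.getUserMedia(mediaConstraints);

  // The STUN server URL below is a placeholder, not a real deployment value.
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.example.org:3478" }],
  });

  // Attach each captured track so it is sent to the remote peer.
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // Create an SDP offer and hand it to an application-defined
  // signaling channel for delivery to the other browser.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signalingChannel.send(JSON.stringify({ sdp: pc.localDescription }));
  return pc;
}
```

In many cases the resulting media path is direct between the two browsers; a TURN relay is only needed when restrictive NATs or firewalls block peer-to-peer traffic.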
Break
12:00 p.m. – 1:00 p.m.

STKU-7 – Best Practices in Gap Analysis and Usability Methods
1:00 p.m. – 4:00 p.m.
Nancy Gardner, Principal Consultant, Professional Services, Verizon Business Group

Understand the business advantage of pairing best practices in gap analysis with usability methods. This often-underutilized approach provides a foundation for strategic speech assessments. These methods are simple and low-cost, cause little disruption to the day-to-day development process, and have a high impact on customer satisfaction. Participants review current industry best practices, create a speech usability checklist, and then conduct an in-class gap analysis. This hands-on tutorial also covers how to analyze and interpret the findings, and how to formulate and present recommendations that contribute to building a business case for each improvement.

STKU-8 – After the Vendor Leaves (CANCELLED)
1:00 p.m. – 4:00 p.m.

This workshop has been cancelled.

STKU-9 – Using W3C Standard Languages to Develop Multimodal Applications
1:00 p.m. – 4:00 p.m.

The W3C Multimodal Architecture (MMI) is a standard for integrating the components of a multimodal system into a smoothly coordinated application through standard life-cycle events. Within the Multimodal Architecture, EMMA is used to represent the semantics of user inputs in a modality-independent fashion. Learn how to use these standards to enhance existing applications with multimodality (e.g., typing, handwriting, speaking) using Openstream's Cue-me platform and the AT&T Speech Mashup. Attendees can explore using EMMA and the MMI Architecture in a browser client and, to illustrate the concepts, can apply these standards to build a form-filling application that fills multiple form slots from a single utterance.
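To make the "multiple form slots from a single utterance" idea concrete, the fragment below sketches how an EMMA document might represent one interpretation of the spoken input "fly from Boston to Denver". The application namespace, slot names, and confidence value are invented for illustration; only the EMMA elements and attributes come from the W3C specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma"
           xmlns="http://example.com/travel">
  <!-- One interpretation of the utterance "fly from Boston to Denver" -->
  <emma:interpretation id="int1"
                       emma:medium="acoustic"
                       emma:mode="voice"
                       emma:confidence="0.92"
                       emma:tokens="fly from boston to denver">
    <!-- Two form slots filled from a single utterance -->
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
</emma:emma>
```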