Test development

The Speechace test is fully automated through extensive use of natural language processing techniques. The test can rate students to within ±0.54 points of qualified IELTS examiners. This result was achieved through painstaking data gathering, manual rating, machine learning modeling, and user testing over an 18-month period.

The test was originally introduced as a practice activity in Speechace's IELTSAce app (https://play.google.com/store/apps/details?id=com.ielts.speechace.ieltsace&hl=en_US&gl=US). The app helps IELTS students prepare for the speaking section of the IELTS exam. It is available for both Android and iOS devices and has been downloaded by over 1 million students.

Audio samples collected from the app were transcribed using industry-leading speech recognition and then manually graded by 3 qualified IELTS examiners on a variety of parameters, including pronunciation, vocabulary, grammar, coherence, and relevance. If two IELTS raters differed by more than one point, the third rater was asked to arbitrate. Here are the key statistics observed with regard to inter-rater agreement:

• % of items on which raters gave exactly the same grade = 21.8%
• % of items on which raters were within 0.5 IELTS points = 72.7%
• % of items on which raters were within 1 IELTS point = 98.2%
• Cohen’s kappa = 0.794
• Pearson’s correlation between raters = 0.883
• RMSE = 0.674
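
For reference, agreement statistics of this kind can be computed with standard tools. The sketch below uses made-up rater scores (the `rater_a`/`rater_b` arrays are hypothetical examples, not Speechace data) and assumes scipy and scikit-learn are available; it is illustrative only, not Speechace's actual evaluation pipeline.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Hypothetical IELTS band scores (0-9 scale, 0.5-point steps) from two raters
rater_a = np.array([6.0, 6.5, 7.0, 5.5, 8.0, 6.0, 7.5, 5.0])
rater_b = np.array([6.0, 7.0, 7.0, 5.5, 7.5, 6.5, 7.5, 5.5])

exact = np.mean(rater_a == rater_b)                      # exact agreement
within_half = np.mean(np.abs(rater_a - rater_b) <= 0.5)  # within 0.5 points
within_one = np.mean(np.abs(rater_a - rater_b) <= 1.0)   # within 1 point

# cohen_kappa_score expects discrete labels, so encode half-bands as integers
kappa = cohen_kappa_score((rater_a * 2).astype(int), (rater_b * 2).astype(int))
r, _ = pearsonr(rater_a, rater_b)                        # Pearson correlation
rmse = float(np.sqrt(np.mean((rater_a - rater_b) ** 2))) # root-mean-square error

print(f"exact: {exact:.1%}, within 0.5: {within_half:.1%}, within 1.0: {within_one:.1%}")
print(f"kappa: {kappa:.3f}, pearson r: {r:.3f}, RMSE: {rmse:.3f}")
```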

Once the data was graded, we ran algorithms that evaluated thousands of English syntax and semantic rules to determine which rules mattered most for assessing pronunciation, fluency, vocabulary, grammar, cohesion, and relevance. We then built deep learning models on the filtered set of rules to accurately predict IELTS scores on a 9-point scale for any arbitrary audio sample. Note that the test-retest reliability of our models is 0.82.
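
A common way to estimate test-retest reliability is to correlate model scores from two administrations of the test to the same test takers. The documentation does not specify the exact method Speechace used, so the sketch below is an assumption using Pearson correlation on hypothetical scores.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical model scores for the same test takers on two sittings
first_attempt = np.array([6.0, 5.5, 7.0, 6.5, 8.0, 5.0])
second_attempt = np.array([6.5, 5.5, 7.0, 6.0, 7.5, 5.5])

# Test-retest reliability as the correlation between the two sittings
reliability, _ = pearsonr(first_attempt, second_attempt)
print(f"test-retest reliability: {reliability:.2f}")
```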
